read, store, write a file including nulls · Issue #130 · dylanaraps/pure-bash-bible · GitHub
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Open
bkw777 opened this issue Jul 22, 2022 · 0 comments

Read and write a file including nulls in pure bash: no externals, no sub-shell forks, and not limited to recent bash versions.
Slow and RAM-hungry: the file is read one byte at a time, and each byte is stored as a hex pair in an array.
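
For readers who don't want to open the gist, here is a minimal sketch of the byte-at-a-time approach described above. It is not copied from the gist: the function name `ftoh` is an assumption, the array name `h` follows the one-liner later in this post, and the `& 0xff` mask is only a precaution in case a libc reports a high byte as a wider code.

```bash
#!/usr/bin/env bash
# Sketch only: read a file one byte at a time into the global array h,
# one two-digit hex string per byte. "ftoh" is a hypothetical name.
ftoh() {
    local LC_ALL=C x n      # C locale: one character == one byte
    h=()
    # IFS=   : no whitespace trimming
    # -r     : backslashes are ordinary bytes
    # -d ''  : NUL is the only delimiter
    # -n 1   : read at most one character per call
    while IFS= read -r -d '' -n 1 x; do
        if [[ -n $x ]]; then
            printf -v n '%d' "'$x"                 # numeric value of the byte
            printf -v x '%02x' "$(( n & 0xff ))"   # keep the low 8 bits (precaution)
        else
            x=00    # read succeeded but x is empty: the byte was the NUL delimiter
        fi
        h+=("$x")
    done < "$1"
    # The loop ends when read fails with nothing left to read: end of file.
}
```

After `ftoh somefile`, `${#h[@]}` is the byte count and `${h[0]}` is the first byte as hex.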

https://gist.github.com/bkw777/c1413d0e3de6c54524ddae890fe8d705

Variations on this could store the data more compactly and read the file faster: drop "-n 1", read each run of contiguous non-null bytes in one go, and store it as an ordinary string in a variable rather than in an array, so that only the nulls need to be stored in some encoded form. A further simple enhancement would store a run of contiguous nulls as a single code meaning "N nulls" instead of one code per null. The loop would then tick over once per null instead of once per byte. The resulting data wouldn't be as convenient to work with, though, depending on why you wanted to read the file and what you wanted to do with the data. For reading binary data and operating on the binary values (reading them as numbers, counting bytes, editing specifically positioned bytes in place, etc.), an array of ints or hex pairs was more convenient for what I was working on. But if you are merely storing and reproducing the data without needing to parse or edit it, this other method would be more efficient.
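
A rough sketch of one such chunked variation follows. It is a simplification of what is described above, not the same thing: each run of non-null bytes goes into its own array element instead of one encoded string, and consecutive nulls just show up as empty elements, with no "N nulls" code. The names `ftoc`, `ctof`, `chunks` and `tail` are made up for this example.

```bash
#!/usr/bin/env bash
# Sketch only: store a file as NUL-separated chunks. Hypothetical names throughout.
ftoc() {
    local LC_ALL=C x
    chunks=() tail=
    # Without -n 1, each read consumes everything up to the next NUL,
    # so the loop runs once per NUL instead of once per byte.
    while IFS= read -r -d '' x; do
        chunks+=("$x")   # a (possibly empty) run of bytes that was followed by a NUL
    done < "$1"
    tail=$x              # read failed at EOF; x holds any bytes after the last NUL
}

# Reproduce the original bytes on stdout.
ctof() {
    local LC_ALL=C x
    for x in "${chunks[@]}"; do
        printf '%s\0' "$x"   # each stored chunk is followed by the NUL that ended it
    done
    printf '%s' "$tail"      # trailing bytes that had no NUL after them
}
```

Usage would be along the lines of `ftoc in.bin; ctof > out.bin`. Editing a byte at a given offset is awkward in this form, which is the trade-off mentioned above.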

But in both the simple and the fancy case, the essential trick is the same:

  • Combine LANG, IFS, and read option flags so that null is the only delimiter and no other byte has any special meaning.
  • On each read, be it a byte or a chunk, consult the return value from read to tell "got nothing because EOF" apart from "got nothing because delimiter".
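
The second point is easy to check in isolation. The helper below is a made-up `probe` function that does a single read and reports its exit status and how many bytes it got; the pipes are only there to feed it test input and are not part of the technique.

```bash
#!/usr/bin/env bash
# Sketch only: "probe" is a hypothetical helper. Extra arguments (e.g. -n 1)
# are passed through to read.
probe() {
    local LC_ALL=C x
    IFS= read -r -d '' "$@" x
    printf 'status=%d len=%d\n' "$?" "${#x}"   # $? is still read's exit status here
}

printf 'A'     | probe -n 1   # status=0 len=1  -> got a data byte
printf '\0'    | probe -n 1   # status=0 len=0  -> got nothing because delimiter (a NUL)
printf ''      | probe -n 1   # status=1 len=0  -> got nothing because EOF
printf 'abc\0' | probe        # status=0 len=3  -> a chunk that was followed by a NUL
printf 'abc'   | probe        # status=1 len=3  -> a final chunk cut off by EOF
```

The last case is why a chunked reader still has to look at the variable after read fails: the final read can return failure and data at the same time.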

htof() could run a lot faster if you are willing to abuse the command line to hold the entire file. It could be a single printf with a single parameter expansion doing a global replace, instead of a loop that does a printf for each byte.
x=" ${h[*]}" ;printf '%b' "${x// /\\x}"
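
To make the comparison concrete, here are both forms side by side as a sketch. The names `htof_loop` and `htof_fast` are invented for this example and the loop version is a guess at the general shape, not the gist's code. Note that the one-liner relies on IFS starting with a space, so that `${h[*]}` joins the elements with spaces; the sketch sets that explicitly.

```bash
#!/usr/bin/env bash
# Sketch only: write the bytes held in array h (hex pairs) to stdout.
h=(68 69 00 0a)   # sample data: "hi", a NUL, a newline

# One printf per byte.
htof_loop() {
    local x
    for x in "${h[@]}"; do
        printf '%b' "\\x$x"
    done
}

# One printf for the whole file: " 68 69 00 0a" becomes "\x68\x69\x00\x0a".
htof_fast() {
    (( ${#h[@]} )) || return 0   # nothing to write for an empty array
    local IFS=' ' x              # make sure ${h[*]} joins with spaces
    x=" ${h[*]}"
    printf '%b' "${x// /\\x}"
}
```

Both functions should produce identical bytes, so `htof_fast > out.bin` can stand in wherever the loop version was used. The cost is that the whole file exists as one escaped string, four characters per byte, at the moment of the printf.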
