Compressor
This module offers small utilities to compress and decompress data by writing it to a single byte sequence. One key advantage of this approach is that several discriminating booleans can be packed into a single byte; another is that multiple int32 or int64 values can be stored without the boxing cost.
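To make the boolean-packing idea concrete, here is a minimal sketch that uses only the standard library. The helpers pack_bools and unpack_bools are hypothetical names introduced for illustration; they are not part of this module, and the bit order (first boolean in the least significant bit) is an assumption, not a description of how Compressor lays out its flags.

```ocaml
(* Hypothetical helpers, not part of Compressor: pack up to 8 booleans
   into one byte, with the first boolean in the least significant bit. *)
let pack_bools (flags : bool list) : char =
  if List.length flags > 8 then invalid_arg "pack_bools: more than 8 flags";
  let byte, _ =
    List.fold_left
      (fun (acc, bit) flag ->
        ((if flag then acc lor (1 lsl bit) else acc), bit + 1))
      (0, 0) flags
  in
  Char.chr byte

(* Recover the first [n] booleans from a packed byte. *)
let unpack_bools (n : int) (c : char) : bool list =
  List.init n (fun bit -> Char.code c land (1 lsl bit) <> 0)

(* Three flags occupy three bits of a single byte. *)
let packed = pack_bools [ true; false; true ]
```

Three flags stored this way cost one byte in total, rather than one byte (or one machine word) each.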
The compress type accumulates data and then renders it to bytes when done. The decompress type does the reverse, extracting data from bytes. Data MUST be extracted in the SAME order that it was inserted:
# open Compressor;;
# let compress = make 12;;
val compress : compress = <abstr>
# write_int8 compress 'a';
write_bool compress true;
write_int32 compress 42l;
write_bytes compress (Bytes.of_string "hello");
write_bool compress false;
write_bool compress true;;
- : unit = ()
# let bytes : bytes = to_bytes compress;;
...
# let decompress = of_bytes bytes;;
val decompress : decompress = <abstr>
# read_int8 decompress;;
- : char = 'a'
# read_bool decompress;;
- : bool = true
# read_int32 decompress;;
- : int32 = 42l
# read_bytes decompress 5 |> String.of_bytes;;
- : string = "hello"
# read_bool decompress;;
- : bool = false
# read_bool decompress;;
- : bool = true
Exception raised when trying to compress a Z.t that does not fit int64.
make n creates a compress object with an internal buffer of n bytes. The buffer is resized as needed, but picking a large enough n avoids unnecessary copies.
These write the data exactly as is to the byte sequence.
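As an illustration of what writing data "exactly as is" can look like, the sketch below builds a byte sequence with the standard library's Buffer (OCaml 4.08 or later) instead of this module. The little-endian byte order is an assumption chosen to match the outputs shown elsewhere on this page; the name raw is hypothetical.

```ocaml
(* Standard-library sketch, not Compressor itself. *)
let raw : bytes =
  let buf = Buffer.create 12 in
  (* One character takes exactly one byte. *)
  Buffer.add_char buf 'a';
  (* An int32 takes exactly four bytes, least significant byte first. *)
  Buffer.add_int32_le buf 42l;
  (* A byte sequence is copied unchanged, with no length prefix. *)
  Buffer.add_bytes buf (Bytes.of_string "hello");
  Buffer.to_bytes buf
```

Because no length or type information is stored, the reader has to know that the sequence holds one character, then an int32, then five raw bytes.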
These combine boolean flags with the data in an attempt to keep the output small. For instance, write_int uses write_int32 if the value is small enough, and write_int64 otherwise:
# let compress = make 8 in
write_z ~signed:true compress (Z.of_int 3);
to_bytes compress;;
- : bytes = Bytes.of_string "\003\000\000\000\001"
# (* Bigger numbers lead to longer sequences *)
let compress = make 8 in
write_z ~signed:true compress (Z.of_int 5_000_000_000);
to_bytes compress;;
- : bytes = Bytes.of_string "\000\242\005*\001\000\000\000\000"
write_z ~signed z attempts to push z as a 32 or 64 bit value.
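The following sketch reproduces the two outputs above with the standard library only. It is not this module's implementation: the layout (little-endian payload followed by one flag byte whose low bit is set when the 32-bit form was used) is inferred from those two outputs, the names fits_int32, encode_int and decode_int are hypothetical, and it works on plain int on a 64-bit platform rather than on Z.t.

```ocaml
(* Hypothetical sketch, not Compressor's code. Assumes a 64-bit OCaml int
   and a layout inferred from the two example outputs. *)
let fits_int32 (v : int) : bool =
  v >= Int32.to_int Int32.min_int && v <= Int32.to_int Int32.max_int

let encode_int (v : int) : bytes =
  let buf = Buffer.create 9 in
  if fits_int32 v then begin
    (* Small value: 4-byte payload, flag byte 1. *)
    Buffer.add_int32_le buf (Int32.of_int v);
    Buffer.add_char buf '\001'
  end else begin
    (* Large value: 8-byte payload, flag byte 0. *)
    Buffer.add_int64_le buf (Int64.of_int v);
    Buffer.add_char buf '\000'
  end;
  Buffer.to_bytes buf

let decode_int (b : bytes) : int =
  (* The last byte is the flag; it tells how wide the payload is. *)
  let flag = Bytes.get b (Bytes.length b - 1) in
  if flag = '\001' then Int32.to_int (Bytes.get_int32_le b 0)
  else Int64.to_int (Bytes.get_int64_le b 0)
```

Under these assumptions a small value costs 5 bytes instead of 9, at the price of one extra flag per value.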
Decompression must be performed in the same order as compression. There is no way to check that the bytes being decompressed were originally of the given type.
Arbitrary decompression will not create invalid values (every bit pattern is a valid value of each supported type), but may fail with Invalid_argument "Index out of bounds" when decompressing more bytes than were compressed.
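Both failure modes can be seen with the standard library alone (OCaml 4.08 or later). This sketch does not use this module; data, misread and overrun are hypothetical names, and the little-endian read is an assumption.

```ocaml
(* Standard-library sketch, not Compressor itself. *)
let data = Bytes.of_string "abcd"

(* Four characters read back as an int32: no error, just a value that
   has nothing to do with what was originally written. *)
let misread : int32 = Bytes.get_int32_le data 0

(* Reading 8 bytes from a 4-byte sequence runs past the end and raises
   Invalid_argument, captured here as an Error. *)
let overrun : (int64, string) result =
  match Bytes.get_int64_le data 0 with
  | v -> Ok v
  | exception Invalid_argument msg -> Error msg
```

The first read succeeds silently, which is why the read order has to mirror the write order; only the second is detected at run time.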
May raise Z.Overflow if incorrectly called, as an arbitrary value may not fit in 31 or 63 bits.