package decompress
The type for input sources. With a `Manual source the client must provide input with src. With a `String or `Channel source the client can safely discard the `Await case (with assert false).
The type for output destinations. With a `Manual destination the client must provide output storage with dst. With a `String or `Channel destination the client can safely discard the `Flush case (with assert false).
val encoder :
src ->
dst ->
?ascii:bool ->
?hcrc:bool ->
?filename:string ->
?comment:string ->
mtime:int32 ->
os ->
q:De.Queue.t ->
w:De.Lz77.window ->
level:int ->
encoder
encoder src dst ~mtime os ~q ~w ~level is an encoder that inputs from src and that outputs to dst.
Internal queue.
encoder internally combines a compression algorithm with a DEFLATE encoder. To pass compressed values to the DEFLATE encoder, it needs a queue q. The length of q affects performance: a small queue can become a bottleneck, causing encode to emit many `Flush. We recommend a queue as large as the output buffer.
Window.
GZIP needs a sliding window to perform the LZ77 compression. The window must be a 32k window (De.make_window with bits = 15). The allocated window can be re-used by another inflation/deflation process, but it cannot be re-used concurrently or cooperatively with another inflation/deflation process.
Level.
The current implementation of GZIP does not handle any compression level. However, the client must give a level between 0 and 9 inclusive; otherwise, Invalid_argument is raised.
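As a minimal sketch of the range rule above, a client could validate the level itself before building an encoder. check_level is a hypothetical helper, not part of the library, and the error message is illustrative only.

```ocaml
(* Hypothetical helper: mirrors the documented rule that a level must lie
   between 0 and 9 inclusive, and raises Invalid_argument otherwise. *)
let check_level (level : int) : int =
  if level < 0 || level > 9 then
    invalid_arg (Printf.sprintf "level must be between 0 and 9, got %d" level);
  level
```

Checking the value at the call site gives a message that names the offending level, which can be easier to act on than an exception raised from inside the encoder.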
Metadata.
The client can store some metadata:
mtime: time of last modification of the input.
os: the operating system which did the compression.
filename: filename of the input (no limitation about length).
comment: an arbitrary payload (no limitation about length).
ascii: if encoding of contents is ASCII.
hcrc: if the client wants a checksum of the GZIP header.
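A minimal sketch of building an encoder with some metadata, assuming the decompress library is available through the modules De and Gz. The name make_encoder, the file name, the comment, the zero mtime and the queue size are illustrative placeholders. The sketch also assumes that Gz.Unix is a constructor of the os type and that De.Lz77.make_window builds the De.Lz77.window required by the signature.

```ocaml
(* Requires the decompress library (modules De and Gz). *)
let make_encoder () =
  (* Queue length; in practice, pick it as large as the output buffer. *)
  let q = De.Queue.create 0x1000 in
  (* 32k sliding window, as GZIP requires (bits = 15). *)
  let w = De.Lz77.make_window ~bits:15 in
  Gz.Def.encoder `Manual `Manual
    ~filename:"notes.txt"   (* placeholder file name *)
    ~comment:"example"      (* placeholder comment *)
    ~hcrc:true              (* ask for a checksum of the GZIP header *)
    ~mtime:0l               (* placeholder modification time *)
    Gz.Unix ~q ~w ~level:4
```

The optional arguments can be left out entirely, in which case only mtime and os have to be given.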
val src_rem : encoder -> int
src_rem e is how many bytes remain in the given input buffer.
val dst_rem : encoder -> int
dst_rem e is how many unused bytes remain in the output buffer of e.
src e s j l provides e with l bytes to read, starting at j in s. This byte range is read by calls to encode with e until `Await is returned. To signal the end of input, call the function with l = 0.
dst e s j l provides e with l bytes available to write, starting at j in s. This byte range is filled by calls to encode with e until `Flush is returned.
encode e0 is:
`Await e1 if e0 has a `Manual input source and awaits more input. The client must use src with e1 to provide it.
`Flush e1 if e0 has a `Manual destination and needs more output storage. The client must drain the buffer before resuming operation.
`End e1 if e0 encoded all input. The output buffer is possibly not empty (it can be checked with dst_rem).
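The three cases above can be sketched as a loop that compresses a string with a `Manual source and a `Manual destination. This is an illustration under assumptions, not the library's reference usage: it assumes the decompress library is available through the modules De and Gz, that De.bigstring_create returns a Bigarray of chars, that Gz.Unix is a constructor of the os type, and that De.Lz77.make_window builds the required window. gzip_string, the buffer size and the zero mtime are hypothetical choices.

```ocaml
(* Requires the decompress library (modules De and Gz). *)
let gzip_string ?(level = 4) (input : string) : string =
  let size = 0x1000 in
  let i = De.bigstring_create size in      (* input buffer *)
  let o = De.bigstring_create size in      (* output buffer *)
  let q = De.Queue.create size in          (* as large as the output buffer *)
  let w = De.Lz77.make_window ~bits:15 in  (* 32k window *)
  let out = Buffer.create size in
  let pos = ref 0 in
  (* Copy the next chunk of [input] into [i]; returns 0 at end of input. *)
  let refill () =
    let len = min (String.length input - !pos) size in
    for k = 0 to len - 1 do
      Bigarray.Array1.set i k input.[!pos + k]
    done;
    pos := !pos + len;
    len
  in
  (* Append the bytes written so far in [o] to [out]. *)
  let drain e =
    let len = size - Gz.Def.dst_rem e in
    for k = 0 to len - 1 do
      Buffer.add_char out (Bigarray.Array1.get o k)
    done
  in
  let rec go e =
    match Gz.Def.encode e with
    | `Await e -> go (Gz.Def.src e i 0 (refill ()))  (* l = 0 ends the input *)
    | `Flush e -> drain e; go (Gz.Def.dst e o 0 size)
    | `End e -> drain e; Buffer.contents out
  in
  let e = Gz.Def.encoder `Manual `Manual ~mtime:0l Gz.Unix ~q ~w ~level in
  (* Provide output storage before the first call to encode. *)
  go (Gz.Def.dst e o 0 size)
```

Calling gzip_string "hello world" should return a GZIP member as a string. The `End case drains the output buffer once more because, as noted above, it may not be empty.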
Limitation.
The encoder must manipulate an output buffer of at least 2 bytes. If that is not the case, encode does nothing, and it gives no indication beyond having done nothing. Depending on what you do, a loop can call encode indefinitely without any progress as long as the given output has fewer than 2 bytes.
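One way to avoid such a silent loop is to reject output ranges that are too small at the point where they are handed to the encoder. The wrapper below is a hypothetical sketch, assuming the decompress library is available through the module Gz; dst_checked is not part of the library.

```ocaml
(* Requires the decompress library (module Gz). *)
(* Hypothetical wrapper around Gz.Def.dst: fails loudly instead of letting
   encode silently do nothing on an output range shorter than 2 bytes. *)
let dst_checked e buf off len =
  if len < 2 then
    invalid_arg "dst_checked: output range must be at least 2 bytes";
  Gz.Def.dst e buf off len
```

Using dst_checked wherever the loop would otherwise call dst turns the no-progress situation into an immediate Invalid_argument.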