package decompress
Sources
sha256=d1669e07446d73dd5e16f020d4a1682abcbb1b7a1e3bf19b805429636c26a19b
sha512=808e278640ab84b8ead7c5b7d22b70e3809255e37cc80a595cc58dd4974e5240f70307f048041ab1d8678826ce041da4f186179aa7ebbba5e7cfacaaf054f3e6
Description
Decompress is an implementation of Zlib and GZip in OCaml.
It provides a pure, non-blocking interface to inflate and deflate a data flow.
Published: 22 Apr 2021
README
Decompress - Pure OCaml implementation of decompression algorithms
decompress is a library which implements the RFC1951, Zlib, Gzip and LZO formats.
The library
The library is available with:
$ opam install decompress
It provides four sub-packages:
- decompress.de to handle RFC1951 stream
- decompress.zl to handle Zlib stream
- decompress.gz to handle Gzip stream
- decompress.lzo to handle LZO contents
Each sub-package provides 3 sub-modules:
- Inf to inflate/decompress a stream
- Def to deflate/compress a stream
- Higher as an easy entry point to use the stream
How to use it
Link issue
decompress uses checkseum to compute the CRC of streams. checkseum provides 2 implementations:
- a C implementation to be fast
- an OCaml implementation to be usable with js_of_ocaml (or, at least, to require only the caml runtime)
When building an OCaml executable, the user must choose which implementation of checkseum to link. Compiling an executable with decompress.zl looks like this:
$ ocamlfind opt -linkpkg -package checkseum.c,decompress.zl main.ml
Otherwise, the end user will get a linking error (see #47).
With dune
checkseum uses a mechanism integrated into dune which solves the link issue. It provides a way to silently choose the default implementation of checkseum: checkseum.c.
This way (and only with dune), an executable with decompress.zl is:
(executable
 (name main)
 (libraries decompress.zl))
Of course, the user can still choose the implementation explicitly:
(executable
 (name main)
 (libraries checkseum.ocaml decompress.zl))
The API
decompress gives the user full control of:
- the input/output loop
- the allocation
Input / Output
The inflation/deflation process is non-blocking and does not require any syscalls (as is usual for a MirageOS project). The user can decide how to get the input and how to store the output.
A typical loop (which can fit into lwt or async) with decompress.zl is:
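(* [input] and [output] stand for the user's own I/O functions;
   [itmp] and [otmp] are the input and output bigstrings. *)
let rec go decoder = match Zl.Inf.decode decoder with
  | `Await decoder ->
    (* The decoder needs more input: refill [itmp] and hand it over. *)
    let len = input itmp 0 (Bigstringaf.length itmp) in
    go (Zl.Inf.src decoder itmp 0 len)
  | `Flush decoder ->
    (* The output buffer must be flushed: write it out, then resume. *)
    let len = Bigstringaf.length otmp - Zl.Inf.dst_rem decoder in
    output stdout otmp 0 len ;
    go (Zl.Inf.flush decoder)
  | `Malformed err -> invalid_arg err
  | `End decoder ->
    (* End of stream: write whatever is left in [otmp]. *)
    let len = Bigstringaf.length otmp - Zl.Inf.dst_rem decoder in
    output stdout otmp 0 len in
go decoder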
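To make the control flow of such a loop concrete, here is a minimal, self-contained sketch that uses only the standard library. It is a toy model, not the decompress API: the `decoder` record, `refill`, `flush`, `dst_rem`, `decode` and `run` are all hypothetical names, and the "codec" merely upper-cases bytes. What it shares with the loop above is the shape: the decoder never performs I/O itself and instead returns a signal telling the caller to provide input, drain output, or stop.

```ocaml
(* Toy model of a non-blocking decoder. Hypothetical names throughout;
   this only mimics the shape of the loop, it is not the decompress API. *)
type decoder = {
  src : string;   (* current input chunk *)
  pos : int;      (* next unread byte of [src] *)
  out : Bytes.t;  (* caller-owned output buffer *)
  out_len : int;  (* bytes written to [out] so far *)
  eof : bool;     (* the caller signalled the end of input *)
}

type signal = [ `Await of decoder | `Flush of decoder | `End of decoder ]

let decoder out = { src = ""; pos = 0; out; out_len = 0; eof = false }

(* Hand a new input chunk to the decoder; an empty chunk means end of input. *)
let refill d chunk = { d with src = chunk; pos = 0; eof = (chunk = "") }

(* Tell the decoder that the output buffer has been consumed. *)
let flush d = { d with out_len = 0 }

(* Bytes still free in the output buffer. *)
let dst_rem d = Bytes.length d.out - d.out_len

(* The "codec" upper-cases each byte. When it cannot make progress it
   returns a signal and lets the caller decide what to do next. *)
let rec decode d : signal =
  if dst_rem d = 0 then `Flush d
  else if d.pos < String.length d.src then begin
    Bytes.set d.out d.out_len (Char.uppercase_ascii d.src.[d.pos]);
    decode { d with pos = d.pos + 1; out_len = d.out_len + 1 }
  end
  else if d.eof then `End d
  else `Await d

(* Drive the decoder over a list of non-empty chunks and collect the output. *)
let run chunks =
  let out = Bytes.create 4 in
  let res = Buffer.create 16 in
  let pending = ref chunks in
  let emit d = Buffer.add_subbytes res d.out 0 (Bytes.length d.out - dst_rem d) in
  let rec go d =
    match decode d with
    | `Await d ->
      let chunk = match !pending with
        | [] -> ""
        | c :: rest -> pending := rest; c in
      go (refill d chunk)
    | `Flush d -> emit d; go (flush d)
    | `End d -> emit d; Buffer.contents res in
  go (decoder out)
```

Because the driver owns both the input source and the output sink, the same `go` function could read from a socket or a file, or be wrapped in a cooperative scheduler, without any change to `decode`.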
Allocation
The process does not allocate large objects itself, but it requires them at initialisation. Such objects can be reused by another inflation/deflation process; of course, two processes cannot use the same objects at the same time.
val decompress : window:De.window -> in_channel -> out_channel -> unit

let w0 = De.make_window ~bits:15

(* Safe use of decompress *)
let () =
  decompress ~window:w0 stdin stdout ;
  decompress ~window:w0 (open_in "file.z") (open_out "file")

(* Unsafe use of decompress,
   the second process must use another pre-allocated window. *)
let () =
  Lwt_main.run @@
  Lwt.join [ (decompress ~window:w0 stdin stdout |> Lwt.return)
           ; (decompress ~window:w0 (open_in "file.z") (open_out "file") |> Lwt.return) ]
The following objects can be reused this way:
- the input buffer given to the encoder/decoder with src
- the output buffer given to the encoder/decoder
- the window given to the encoder/decoder
- the shared-queue used by the compression algorithm and the encoder
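The reuse rule can be illustrated without the library. The sketch below is a stdlib-only stand-in: `copy_through`, `scratch` and `sequential` are hypothetical names, and a plain `Bytes.t` plays the role of the pre-allocated window or buffer. It shows the safe pattern only: allocate once, then let one process finish before the next one reuses the same object.

```ocaml
(* Hypothetical helper: copy [src] into [dst] through a caller-provided
   scratch buffer, so the helper itself never allocates that buffer. *)
let copy_through ~scratch src dst =
  let cap = Bytes.length scratch in
  let rec go off =
    let len = min cap (String.length src - off) in
    if len > 0 then begin
      Bytes.blit_string src off scratch 0 len;
      Buffer.add_subbytes dst scratch 0 len;
      go (off + len)
    end in
  go 0

(* Allocated once, up front. *)
let scratch = Bytes.create 8

(* Two processes run one after the other and share [scratch] safely. *)
let sequential a b =
  let out_a = Buffer.create 16 and out_b = Buffer.create 16 in
  copy_through ~scratch a out_a;  (* the first process finishes ... *)
  copy_through ~scratch b out_b;  (* ... before the second reuses [scratch] *)
  Buffer.contents out_a, Buffer.contents out_b
```

If the two calls were interleaved instead (for example by cooperative threads that yield between the blit and the append), each would overwrite bytes the other had not yet consumed, which is why concurrent processes need separate pre-allocated objects.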
Example
An example exists in bin/main.ml where you can see how to use decompress.zl and decompress.de.
Higher interface
However, decompress also provides a higher-level interface, close to what camlzip provides, to help newcomers use decompress:
val compress : refill:(bigstring -> int) -> flush:(bigstring -> int -> unit) -> unit
val uncompress : refill:(bigstring -> int) -> flush:(bigstring -> int -> unit) -> unit
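As a sketch of how callbacks with these shapes might be written, the following stdlib-only example builds a `refill` that reads from a string and a `flush` that appends to a `Buffer.t`. The helper names `refill_of_string`, `flush_to_buffer` and `transfer` are hypothetical, `transfer` is only a stand-in that moves bytes unchanged where a real codec would transform them, and the convention that `refill` returns 0 at the end of input is an assumption of this sketch.

```ocaml
type bigstring =
  (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t

(* [refill buf] copies the next slice of [str] into [buf] and returns the
   number of bytes written; 0 signals the end of input (assumed here). *)
let refill_of_string str =
  let pos = ref 0 in
  fun (buf : bigstring) ->
    let len = min (Bigarray.Array1.dim buf) (String.length str - !pos) in
    for i = 0 to len - 1 do
      Bigarray.Array1.set buf i str.[!pos + i]
    done;
    pos := !pos + len;
    len

(* [flush buf len] appends the first [len] bytes of [buf] to [res]. *)
let flush_to_buffer res =
  fun (buf : bigstring) len ->
    for i = 0 to len - 1 do
      Buffer.add_char res (Bigarray.Array1.get buf i)
    done

(* Hypothetical stand-in using the same callback convention: it moves bytes
   from [refill] to [flush] unchanged. *)
let transfer ~refill ~flush =
  let buf = Bigarray.Array1.create Bigarray.char Bigarray.c_layout 16 in
  let rec go () =
    match refill buf with
    | 0 -> ()
    | len -> flush buf len; go () in
  go ()
```

The closures keep their own state (the read position and the result buffer), so the function that consumes them only needs to know how to ask for more bytes and where to hand finished bytes.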
Benchmark
decompress has a benchmark about inflation to see if any update has a performance implication. The process tries to inflate a stream and stops after N second(s) (default is 30). The benchmark requires libzlib-dev, cmdliner and bos to be able to compile zpipe and the executable that produces the CSV file. To build the benchmark:
$ dune build bench/output.csv
On Linux machines, /dev/urandom will generate the random input for piping to zpipe. To run the benchmark:
$ cat /dev/urandom | ./_build/default/bench/zpipe | ./_build/default/bench/bench.exe
The output file is a CSV file which can be processed by plotting software. It records input bytes, output bytes and memory usage at each second.
Build Requirements
- OCaml >= 4.07.0
- dune to build the project
- base-bytes meta-package
- bigarray-compat
- checkseum
- optint
Dependencies (7)
- checkseum >= "0.2.0"
- optint >= "0.0.5"
- cmdliner >= "1.0.0"
- bigarray-compat
- base-bytes
- dune >= "2.8.0"
- ocaml >= "4.07.0"
Dev Dependencies (6)
Used by (10)
- albatross >= "1.1.1"
- carton >= "0.3.0" & < "0.4.3"
- carton-git >= "0.3.0" & < "0.4.4"
- carton-lwt >= "0.3.0" & < "0.4.4"
- clz
- doi2bib >= "0.4.0" & < "0.5.2"
- git >= "3.0.0" & < "3.1.1" | >= "3.3.1"
- git-unix = "3.1.0" | >= "3.3.1"
- imagelib >= "20210402"
- rfc1951 = "1.4.0"
Conflicts
None