The goal of encore is to provide combinators that can produce either an angstrom parser or a lavoisier encoder.
These combinators are more limited than what angstrom provides, but this limitation is what makes it possible to also produce a lavoisier encoder, so that a value is deserialized and serialized in the same way.
As a result, we can ensure the following isomorphism:
val p : 'v t
val angstrom : 'v Angstrom.t (* = to_angstrom p *)
val lavoisier : 'v Lavoisier.t (* = to_lavoisier p *)
assert (emit_string (parse_string str angstrom) lavoisier = str) ;
assert (parse_string (emit_string v lavoisier) angstrom = v) ;
To build both the serializer and the deserializer, the user must provide bijective elements. For example, to parse and encode an int64, you must provide a way to get the value from a string and a way to encode the value to a string:
let int64 : (string, int64) Bij.t =
  Bij.v
    ~fwd:Int64.of_string
    ~bwd:Int64.to_string
Then, you are able to play with combinators such as:
let p =
  let open Syntax in
  int64 <$> while1 is_digit
For some values, such as Git values, we must respect this isomorphism so that exactly the same representation is injected into and extracted from a store.
Consider the Git tree object. Its formal format is:
entry := permission ' ' name '\x00' hash
tree := entry *
We must describe bijective elements such as:
let permission = Bij.v ~fwd:perm_of_string ~bwd:perm_to_string
let hash =
  Bij.v ~fwd:Digestif.SHA1.of_raw_string ~bwd:Digestif.SHA1.to_raw_string
type entry = { perm : permission; hash : Digestif.SHA1.t; name : string }
let entry =
  Bij.v
    ~fwd:(fun ((perm, name), hash) -> { perm; hash; name })
    ~bwd:(fun { perm; hash; name } -> ((perm, name), hash))
Note that these functions should raise Bij.Bijection if they fail to parse the given string.
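The rule above can be illustrated with a minimal, stdlib-only sketch. The exception Bijection, the record type bij and the constructor v below are local stand-ins written for this example (assumptions, not encore's own Bij module). The sketch wraps Int64.of_string_opt so that a failed conversion surfaces as Bijection rather than as the Failure exception that Int64.of_string raises.

```ocaml
(* Local stand-ins for illustration only; encore's Bij module is not used. *)
exception Bijection

type ('a, 'b) bij = { fwd : 'a -> 'b; bwd : 'b -> 'a }

let v ~fwd ~bwd = { fwd; bwd }

(* Int64.of_string raises Failure on bad input, so the failure is
   translated into Bijection here. *)
let int64 : (string, int64) bij =
  v
    ~fwd:(fun s ->
      match Int64.of_string_opt s with
      | Some n -> n
      | None -> raise Bijection)
    ~bwd:Int64.to_string

(* True when going forward and then backward gives the input back. *)
let roundtrips s = try int64.bwd (int64.fwd s) = s with Bijection -> false
```

With these definitions, roundtrips "42" holds, while a string with leading zeros such as "007" converts successfully but does not round-trip, which is worth keeping in mind when choosing the pair of functions.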
Then, the format of the entry can be described like:
let entry =
  let open Encore.Syntax in
  let permission = permission <$> while1 is_not_space in
  let hash = hash <$> fixed 20 in
  let name = while1 is_not_null in
  entry
  <$> (permission
      <* (Bij.char ' ' <$> any)
      <*> (name <* (Bij.char '\x00' <$> any))
      <*> hash
      <* commit)
And the tree Git object can be described like:
let tree = rep0 entry
Finally, with tree and the design of encore, we can ensure:
let assert_roundtrip random_tree_value =
  let p = to_angstrom tree in
  let d = to_lavoisier tree in
  (* Angstrom.parse_string returns a result, so compare against Ok. *)
  assert (
    Angstrom.parse_string ~consume:All p
      (Lavoisier.emit_string random_tree_value d)
    = Ok random_tree_value)
The goal of this design is to describe a format, such as our Git tree object, only once, and to guarantee that serializing and deserializing values never corrupts them. For Git, this means an object keeps the same SHA1, which depends on its contents.
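The "describe once, derive both directions" idea can be sketched with a toy, stdlib-only model. Everything below (the fmt type, its constructors, parse, emit and the small entry format) is a hypothetical illustration and not how encore is implemented: a single description value is interpreted once as a parser and once as an encoder.

```ocaml
(* Toy model for illustration only; not encore's implementation. *)
exception Bijection

type ('a, 'b) bij = { fwd : 'a -> 'b; bwd : 'b -> 'a }

(* One description of a format, interpreted two ways below. *)
type _ fmt =
  | While1 : (char -> bool) -> string fmt
  | Char : char -> unit fmt
  | Map : ('a, 'b) bij * 'a fmt -> 'b fmt
  | Pair : 'a fmt * 'b fmt -> ('a * 'b) fmt
  | Rep0 : 'a fmt -> 'a list fmt

(* Interpretation 1: parse [s] from [pos]; return the value and new position. *)
let rec parse : type a. a fmt -> string -> int -> (a * int) option =
 fun fmt s pos ->
  match fmt with
  | While1 p ->
      let stop = ref pos in
      while !stop < String.length s && p s.[!stop] do incr stop done;
      if !stop = pos then None
      else Some (String.sub s pos (!stop - pos), !stop)
  | Char c ->
      if pos < String.length s && s.[pos] = c then Some ((), pos + 1) else None
  | Map (bij, f) -> (
      match parse f s pos with
      | Some (v, pos') -> (try Some (bij.fwd v, pos') with Bijection -> None)
      | None -> None)
  | Pair (a, b) -> (
      match parse a s pos with
      | None -> None
      | Some (x, pos') -> (
          match parse b s pos' with
          | None -> None
          | Some (y, pos'') -> Some ((x, y), pos'')))
  | Rep0 f ->
      let rec go acc pos =
        match parse f s pos with
        | Some (v, pos') when pos' > pos -> go (v :: acc) pos'
        | _ -> Some (List.rev acc, pos)
      in
      go [] pos

(* Interpretation 2: write the value [v] into [buf] in the same format. *)
let rec emit : type a. a fmt -> Buffer.t -> a -> unit =
 fun fmt buf v ->
  match fmt with
  | While1 p ->
      if v = "" then raise Bijection;
      String.iter (fun c -> if not (p c) then raise Bijection) v;
      Buffer.add_string buf v
  | Char c -> Buffer.add_char buf c
  | Map (bij, f) -> emit f buf (bij.bwd v)
  | Pair (a, b) ->
      let x, y = v in
      emit a buf x;
      emit b buf y
  | Rep0 f -> List.iter (emit f buf) v

let parse_string fmt s =
  match parse fmt s 0 with
  | Some (v, pos) when pos = String.length s -> Some v
  | _ -> None

let emit_string fmt v =
  let buf = Buffer.create 16 in
  emit fmt buf v;
  Buffer.contents buf

let is_digit = function '0' .. '9' -> true | _ -> false
let is_alpha = function 'a' .. 'z' -> true | _ -> false

let int =
  { fwd = (fun s ->
      match int_of_string_opt s with Some n -> n | None -> raise Bijection);
    bwd = string_of_int }

(* Drops the unit produced by a constant separator. *)
let fst_unit = { fwd = (fun (x, ()) -> x); bwd = (fun x -> (x, ())) }

(* entry := name ' ' number ';'      entries := entry * *)
let entry =
  Pair
    ( Map (fst_unit, Pair (While1 is_alpha, Char ' ')),
      Map (int, Map (fst_unit, Pair (While1 is_digit, Char ';'))) )

let entries = Rep0 entry

(* Both directions come from the single [entries] description. *)
let () =
  let s = "foo 1;bar 22;" in
  assert (parse_string entries s = Some [ ("foo", 1); ("bar", 22) ]);
  assert (emit_string entries [ ("foo", 1); ("bar", 22) ] = s)
```

Because the parser and the encoder are derived from the same value, a change to the format cannot update one direction and forget the other, which is the property the Git use case relies on.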
module Bij : sig ... end
module Lavoisier : sig ... end
module Either : sig ... end
val to_angstrom : 'a t -> 'a Angstrom.t
to_angstrom t is the parser of t.
val to_lavoisier : 'a t -> 'a Lavoisier.t
to_lavoisier t is the encoder/serializer of t.
module Syntax : sig ... end