A library to decode S-expressions into structured data

Documentation for the Sexp_decode library

The Sexp_decode library consists of a single module, Sexp_decode.

The purpose of the library is to help translate S-expressions into structured data. It uses the definition of S-expressions provided by the Csexp library.

For example, you may want to transform an address book encoded as an S-expression into structured data, which is easier to process.

Let's assume your address book looks like the following:

# open Sexp_decode;;

# let address_book : sexp =
    List
      [
        List
          [
            Atom "entry";
            List [ Atom "name"; Atom "John Doe" ];
            List [ Atom "country"; Atom "New Zealand" ];
          ];
        List
          [
            Atom "entry";
            List [ Atom "name"; Atom "Mary Poppins" ];
            List [ Atom "email"; Atom "" ];
          ];
        List
          [
            Atom "entry";
            List [ Atom "name"; Atom "Groot" ];
            List [ Atom "country"; Atom "Groot" ];
          ];
      ];;
val address_book : sexp =
  List
   [List
     [Atom "entry"; List [Atom "name"; Atom "John Doe"];
      List [Atom "country"; Atom "New Zealand"]];
    List
     [Atom "entry"; List [Atom "name"; Atom "Mary Poppins"];
      List [Atom "email"; Atom ""]];
    List
     [Atom "entry"; List [Atom "name"; Atom "Groot"];
      List [Atom "country"; Atom "Groot"]]]

A representation that is probably easier to work with in OCaml uses the following entry type:

# type entry =
  { name : string; country : string option; email : string option };;
type entry = { name : string; country : string option; email : string option; }

# type address_book = entry list;;
type address_book = entry list

It is easy to define decoders that produce values of types entry and address_book:

# let entry_decoder : entry decoder =
  field "entry"
  @@ let* name = field "name" atom in
     let* country = maybe @@ field "country" atom in
     let+ email = maybe @@ field "email" atom in
     { name; country; email };;
val entry_decoder : entry decoder = <abstr>

# let address_book_decoder : address_book decoder = list entry_decoder;;
val address_book_decoder : address_book decoder = <abstr>

You can then apply the run function, which has type 'a decoder -> sexp -> 'a option. On our address_book example, it produces the following result:

# run address_book_decoder address_book;;
- : address_book option =
Some
 [{name = "John Doe"; country = Some "New Zealand"; email = None};
  {name = "Mary Poppins"; country = None;
   email = Some ""};
  {name = "Groot"; country = Some "Groot"; email = None}]
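Because run returns an option, callers have to handle the None case explicitly. The following is a minimal sketch of one way to consume the result. The summary function is a hypothetical helper that is not part of Sexp_decode, and the types are repeated only so that the snippet stands on its own:

```ocaml
(* Same types as above, repeated so that this snippet is self-contained. *)
type entry =
  { name : string; country : string option; email : string option }

type address_book = entry list

(* Hypothetical helper (not part of Sexp_decode): summarise the outcome of a
   decoding attempt as a comma-separated list of names. *)
let summary (result : address_book option) : string =
  match result with
  | None -> "decoding failed"
  | Some entries ->
      entries |> List.map (fun e -> e.name) |> String.concat ", "
```

With the definitions above in scope, it could be called as summary (run address_book_decoder address_book).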

In addition to the field, maybe, atom and list decoders, the Sexp_decode library provides combinators to build compound decoders from basic ones, and compose them together. In particular, decoders for variants and records are provided.

For example, with the fields combinator, you could define entry_decoder as follows:

let entry_decoder_alt : entry decoder =
  field "entry"
  @@ fields
       ~default:{ name = ""; country = None; email = None }
       [
         ("name", atom >>| fun name entry -> { entry with name });
         ( "country",
           atom >>| fun country entry -> { entry with country = Some country }
         );
         ("email", atom >>| fun email entry -> { entry with email = Some email });
       ]

With this alternative decoder for entries, the fields "name", "country" and "email" may occur in any order, and any number of times.

Important point: Sexp_decode performs no backtracking. Instead, it implements a greedy search that always commits to the first success in a choice. For example, with the decoder:

# let d = group ((atom |+> (atom >>> atom)) >>> atom);;
val d : string decoder = <abstr>

The following decoding succeeds, as expected:

# run d (List [Atom "A"; Atom "B"]);;
- : string option = Some "B"

but the following decoding fails:

# run d (List [Atom "A"; Atom "B"; Atom "C"]);;
- : string option = None

This is because the first choice in atom |+> (atom >>> atom) succeeds, and thus the second choice is never selected.
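The behaviour can be reproduced with a toy model of greedy choice. The sketch below is not the Sexp_decode implementation: all toy_* names are hypothetical, decoders are modelled as plain functions over a list of atoms, and it assumes that the group step succeeds only when the whole input has been consumed.

```ocaml
(* Toy model, NOT the Sexp_decode implementation: a decoder consumes a
   prefix of a list of atoms and returns a value plus the remaining input. *)
type 'a toy_decoder = string list -> ('a * string list) option

(* Consume exactly one atom. *)
let toy_atom : string toy_decoder = function
  | x :: rest -> Some (x, rest)
  | [] -> None

(* Run two decoders in sequence and keep the result of the second one. *)
let toy_seq (a : 'a toy_decoder) (b : 'b toy_decoder) : 'b toy_decoder =
 fun input ->
  match a input with
  | Some (_, rest) -> b rest
  | None -> None

(* Greedy choice: commit to the first alternative that succeeds.
   The second alternative is tried only if the first one fails. *)
let toy_choice (a : 'a toy_decoder) (b : 'a toy_decoder) : 'a toy_decoder =
 fun input ->
  match a input with
  | Some _ as ok -> ok
  | None -> b input

(* Assumption: succeed only if the whole input has been consumed. *)
let toy_group (d : 'a toy_decoder) (input : string list) : 'a option =
  match d input with
  | Some (v, []) -> Some v
  | _ -> None

(* Same shape as d above: (atom or (atom then atom)) then atom. *)
let toy_d =
  toy_group
    (toy_seq (toy_choice toy_atom (toy_seq toy_atom toy_atom)) toy_atom)

let two = toy_d [ "A"; "B" ] (* Some "B" *)
let three = toy_d [ "A"; "B"; "C" ] (* None: "C" is left over *)
```

In this model, the choice commits to the single-atom alternative on "A", the final atom consumes "B", and "C" remains unconsumed. Since the choice is never revisited, the three-atom input is rejected.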