Unstrctrd (Unstructured) is a lexer/parser for the unstructured form defined by RFC822. It accepts any input which respects the ABNF described by RFC5322 (including the obsolete forms). To put its purpose in context: an email header, a part of the DEB format, and an HTTP 1.1 header all respect, at least, one form, the unstructured form, which allows a value to be split across lines with a folding-whitespace token.
This token makes it possible to limit any value to 80 characters per line:
To: Romain Calascibetta\r\n
 <romain@calascibetta.org>
Other forms, such as an email address or a subject, should then be, at least, a subset of this form. The goal of this library is to delegate the complexity of this form to a small, basic library.
Unstrctrd handles UTF-8 as well (RFC6532). Any input should always be terminated by CRLF.
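To make the folding-whitespace idea concrete, here is a minimal sketch that uses only the OCaml standard library. The function unfold is a hypothetical helper, not part of Unstrctrd, and it works on a plain string rather than on the library's t. It only illustrates the usual unfolding rule: a CRLF immediately followed by a space or a tab is dropped, while the final CRLF is kept.

```ocaml
(* Hypothetical helper, not part of Unstrctrd: unfolds a header value by
   dropping every CRLF that is immediately followed by a space or a tab.
   The whitespace character itself is kept. *)
let unfold (s : string) : string =
  let len = String.length s in
  let buf = Buffer.create len in
  let rec go i =
    if i >= len then Buffer.contents buf
    else if
      i + 2 < len
      && s.[i] = '\r'
      && s.[i + 1] = '\n'
      && (s.[i + 2] = ' ' || s.[i + 2] = '\t')
    then go (i + 2) (* skip the CRLF, keep the following whitespace *)
    else (
      Buffer.add_char buf s.[i];
      go (i + 1))
  in
  go 0

let () =
  let folded = "To: Romain Calascibetta\r\n <romain@calascibetta.org>\r\n" in
  print_string (unfold folded)
```

The terminating CRLF is left untouched because nothing follows it, which matches the requirement that any input ends with CRLF.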
type t = private elt list
val empty : t
val length : t -> int
of_string raw tries to parse raw and extract the unstructured form. raw should, at least, be terminated by CRLF.
of_list lst tries to coerce lst to t. It verifies that lst can not produce a CRLF terminating token (e.g. [`CR; `LF]).
val to_utf_8_string : t -> string
to_utf_8_string t returns a valid UTF-8 string of t.
val wsp : len:int -> elt
val tab : len:int -> elt
val fws : ?tab:bool -> int -> elt
without_comments t tries to delete any comment in t. A comment is a part which begins with '(' and ends with ')'. If an unmatched parenthesis is found, an error is returned.
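The comment-removal rule can be sketched on a plain string with the standard library alone. The function strip_comments below is a hypothetical illustration, not the implementation of without_comments: it assumes comments may nest, ignores escaped parentheses, and reports an unmatched parenthesis through an assumed `Msg error case.

```ocaml
(* Hypothetical helper, not part of Unstrctrd: removes parenthesised
   comments from a plain string. [depth] counts the currently open '('.
   Escaped parentheses are not handled in this sketch. *)
let strip_comments (s : string) : (string, [> `Msg of string ]) result =
  let len = String.length s in
  let buf = Buffer.create len in
  let rec go i depth =
    if i >= len then
      if depth = 0 then Ok (Buffer.contents buf)
      else Error (`Msg "unbalanced '('")
    else
      match s.[i] with
      | '(' -> go (i + 1) (depth + 1)
      | ')' when depth = 0 -> Error (`Msg "unbalanced ')'")
      | ')' -> go (i + 1) (depth - 1)
      | c ->
        (* characters are kept only outside of any comment *)
        if depth = 0 then Buffer.add_char buf c;
        go (i + 1) depth
  in
  go 0 0
```

Tracking a depth counter rather than a boolean is a design choice that lets nested comments such as "(a (b) c)" be removed as a whole.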
val split_on :
on:[ `WSP | `FWS | `Uchar of Uchar.t | `Char of char | `LF | `CR ] ->
t ->
(t * t) option
split_on ~on t is either the pair (t0, t1) of the two (possibly empty) subparts of t that are delimited by the first match of on, or None if on can't be matched in t.
The invariant t0 ^ sep ^ t1 = t holds.
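The splitting behaviour and its invariant can be illustrated on an ordinary list. The function split_on_first is a hypothetical analogue written for this sketch, not the library's split_on: it works on any 'a list and uses structural equality to find the separator.

```ocaml
(* Hypothetical list-level analogue of split_on, not the library's code:
   splits [lst] at the first element equal to [on]. *)
let split_on_first ~(on : 'a) (lst : 'a list) : ('a list * 'a list) option =
  let rec go acc = function
    | [] -> None (* separator never found *)
    | x :: rest when x = on -> Some (List.rev acc, rest)
    | x :: rest -> go (x :: acc) rest
  in
  go [] lst

let () =
  let input = [ 'T'; 'o'; ':'; ' '; 'x' ] in
  match split_on_first ~on:':' input with
  | Some (t0, t1) ->
    (* the invariant: prefix, separator and suffix rebuild the input *)
    assert (t0 @ [ ':' ] @ t1 = input)
  | None -> assert false
```

Either side of the pair may be empty, for instance when the separator is the first or the last element of the list.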
module type MONAD = sig ... end
module type BUFFER = sig ... end
val lexbuf_make : unit -> Lexing.lexbuf
val post_process :
(t -> 'a) ->
[ `FWS of string
| `OBS_UTEXT of int * int * string
| `VCHAR of string
| `WSP of string ]
list ->
'a