package orsetto

Simple octet-stream decoders.

Overview

This module provides for safely decoding structured data in octet streams according to schemes composed with functional combinators.

To decode an octet stream progressively, make the appropriate scanner class object sxr for the octet source, then repeatedly apply the sxr#scan method to decoding schemes to produce decoded values from the consumed octets.

A decoding scheme is represented internally as a 3-tuple comprising a) the number sz of octets required in the working slice to perform structural analysis, b) a structural analysis function ck described further below, and c) a validating decoder function rd that produces values when provided with the structural analysis returned by the ck function.
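
The three-part shape described above can be pictured with a small standalone model. This is only a sketch under assumptions: the record type toy_scheme, the exception Need_more, and the run function are hypothetical names invented for illustration, positions are plain integers, and none of it is the library's internal representation.

```ocaml
(* Toy model only; these names are not the library's definitions. *)
exception Need_more of int (* hypothetical: count of octets still missing *)

type ('x, 'v) toy_scheme = {
  sz : int;                        (* minimum octets needed before ck can run *)
  ck : string -> int -> int * 'x;  (* slice, start -> (octets used, analysis) *)
  rd : 'x -> string -> 'v;         (* analysis, slice -> decoded value *)
}

(* A one-octet length prefix followed by that many payload octets. *)
let prefixed = {
  sz = 1;
  ck = (fun s p ->
    let have = String.length s - p in
    let len = Char.code s.[p] in
    if have < 1 + len then raise (Need_more (1 + len - have));
    (1 + len, (p + 1, len)));
  rd = (fun (start, len) s -> String.sub s start len);
}

(* Apply a toy scheme at position p; return the value and the next position. *)
let run sch s p =
  let have = String.length s - p in
  if have < sch.sz then raise (Need_more (sch.sz - have));
  let used, x = sch.ck s p in
  (sch.rd x s, p + used)
```

In this model ck only measures the structure (how many octets the value occupies), while rd extracts the value once ck has succeeded.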

Combinators are provided for composing more useful complex decoding schemes from simpler schemes. For example, use pair sa sb to compose a scheme that decodes (va, vb) where sa decodes va and sb decodes vb.

The structural analysis function ck in a decoding scheme is applied to the working slice of octets w to analyze the structure in preparation for scanning. If the ck function finds w does not yet contain the structure of an encoded value, then an exception must be raised. An internal exception is used to signal to the scanner that a valid structure may yet be found if the working slice is extended further with more octets to analyze.

The validating decoder function rd in a decoding scheme is applied to the octet vector s in the working slice, and the structural analysis x returned by the ck function (described above) to decode and validate the indicated octets in s.

Either function in a decoding scheme may use invalid p msg (see below) to signal a fault to the scanner when a value cannot be decoded. This causes the scanner to raise the Invalid exception.

When scanning encounters the end of a finite octet stream, and the decoding scheme structural analysis result indicates that more octets are required, the scanning method raises the Incomplete exception.

Utility
type size = private
  | Size of int

Private representation of the size requirement of a decoded value.

type position = private
  | Position of int

Private representation of the stream position of a decoded value.

Validation
type exn += private
  | Incomplete of string Cf_slice.t * int

Scanning raises Incomplete (s, n) when a finite octet stream terminates and at least n more octets are required in the working slice s before a value can be decoded.

type exn += private
  | Invalid of position * string

Scanning raises Invalid (p, s) to indicate the stream position p of the first octet encountered where decoding is not valid, with a diagnostic message s to describe the validation error.

type 'x analysis = private
  | Analysis of int * 'x

Private representation of an analysis result, comprising a non-negative count of the encoded octets and the structural analysis value.

val analyze : size -> int -> 'x -> 'x analysis

Use analyze have need x to make a structural analysis result, where have is the size of the current slice of working octets, need is the total number of octets from the analyzed structure of octets in the working slice that decode to the next value, and x is the structural analysis of the octets, which is provided to the rd function in the scheme.

Using analyze have need, i.e. without applying x, raises Incomplete if need > have, otherwise it returns with a unary constructor for the structural analysis.

Raises Invalid_argument if either have or need is negative. A scanner raises Failure if an analysis result indicates that need is less than the minimum required octets for the decoding scheme.
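
The checks described for analyze can be modelled in a few lines. The sketch below is a standalone toy, not the library's code: toy_analyze, Toy_analysis, and Toy_incomplete are hypothetical names, and sizes are plain integers because the library's size type is private.

```ocaml
(* Toy model only; the library's size and analysis types are private. *)
exception Toy_incomplete of int (* hypothetical: count of octets still missing *)

type 'x toy_analysis = Toy_analysis of int * 'x

(* The size checks run as soon as have and need are supplied, before x. *)
let toy_analyze have need =
  if have < 0 || need < 0 then invalid_arg "toy_analyze: negative size";
  if need > have then raise (Toy_incomplete (need - have));
  fun x -> Toy_analysis (need, x)
```

Returning a function after the checks mirrors the documented behavior that the shortfall is detected without applying x.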

val invalid : position -> string -> 'a

Use invalid p m in a decoding scheme function with the position p in the octet vector where the first invalid octet is located, and a diagnostic message m, to signal a validation error in decoding by raising an internal exception caught by the scanner class (see below).

val advance : int -> position -> position

Use advance i pos in check functions to advance pos by i positions. Raises Invalid_argument if i < 0.

Schemes
type +'v scheme

The type of a decoding scheme for values of the associated type.

val scheme : int -> (string -> position -> size -> 'x analysis) -> ('x -> string -> 'v) -> 'v scheme

Use scheme sz ck rd to make a decoding scheme for values that require sz or more octets to encode. The values are validated by applying ck s p n to the n octets in s starting at p to produce an analysis x, then decoded by applying rd x s to the octets in s according to x, yielding the scanned value.

Raises Invalid_argument if sz < 0.

val nil : unit scheme

The nil scheme. Scans no octets.

val pos : position scheme

The position scheme. Scans no octets, produces the current position.

val any : char scheme

The any octet scheme. Scans exactly one octet.

val sat : (char -> bool) -> char scheme

Use sat f to make a decoding scheme that scans exactly one octet and produces it if it satisfies the predicate f, otherwise raises Invalid.
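
The behavior of sat, including the position reported on failure, can be sketched with a standalone toy. The names toy_sat, Toy_invalid, and is_digit are hypothetical, positions are plain integers, and the real scheme reports failures through the scanner's Invalid exception rather than this stand-in.

```ocaml
(* Toy model only; Toy_invalid stands in for the library's Invalid. *)
exception Toy_invalid of int * string (* hypothetical: (position, message) *)

let is_digit c = '0' <= c && c <= '9'

(* Scan one octet at p; s.[p] raises Invalid_argument past the end of s. *)
let toy_sat f s p =
  let c = s.[p] in
  if f c then (c, p + 1)
  else raise (Toy_invalid (p, "octet does not satisfy predicate"))
```

The position carried by the exception is that of the rejected octet, which matches the description of Invalid (p, s) given earlier.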

val lit : string -> unit scheme

Use lit s to make a decoding scheme that scans a sequence of octets required to be equal to s.

val opaque : int -> string scheme

Use opaque n to make a decoding scheme that scans n octets to produce a string value comprising those octets.

Raises Invalid_argument if n < 0.

val fixed_width : int -> (position -> string -> 'v) -> 'v scheme

Use fixed_width sz rd to make a decoding scheme for values that can be decoded from some sequences of exactly sz octets by applying rd to the start position and the working slice vector. The rd function may use invalid pos msg where no valid decoding of the sequence exists.

val valid_fixed_width : int -> (string -> int -> 'v) -> 'v scheme

Use valid_fixed_width sz rd to make a decoding scheme for values that always can be decoded from any sequence of exactly sz octets by applying rd to the working slice vector and its start index. Compare with fixed_width which provides an index of type position suitable for use with the invalid function.
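
The difference between the two reader signatures can be illustrated by two standalone reader functions. These are sketches under assumptions: Toy_invalid stands in for the effect of calling invalid, positions are plain integers rather than the private position type, and rd_u16_be and rd_bool are hypothetical names.

```ocaml
(* Toy model only; Toy_invalid stands in for calling the invalid function. *)
exception Toy_invalid of int * string (* hypothetical: (position, message) *)

(* Always decodable: any two octets form a big-endian unsigned 16-bit value.
   Argument order follows valid_fixed_width: slice first, then start index. *)
let rd_u16_be s i = (Char.code s.[i] lsl 8) lor Char.code s.[i + 1]

(* Sometimes undecodable: only octets 0 and 1 encode a boolean.
   Argument order follows fixed_width: position first, then slice. *)
let rd_bool pos s =
  match s.[pos] with
  | '\000' -> false
  | '\001' -> true
  | _ -> raise (Toy_invalid (pos, "not a boolean octet"))
```

A reader that can never fail has no use for a position to report, which is why the always-valid variant takes a plain index.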

Composers
val ign : 'v scheme -> unit scheme

Use ign s to make a decoding scheme that scans a value according to s and ignores the octets comprising it.

val opt : 'v scheme -> 'v option scheme

Use opt s to make a decoding scheme for optional values, i.e. if octets scanned are valid for s, then decodes Some v according to s, else if the octets are not valid, then decodes None.
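
The idea behind opt can be sketched with a simplified decoder type. Everything here is hypothetical: the dec type, toy_digit, and toy_opt are invented for illustration, a result of None models invalid octets, and the choice to consume no octets on failure is an assumption of this toy rather than a statement about the library.

```ocaml
(* Toy model only: a decoder maps input and position to a value and the
   next position, or None when the octets are not valid. *)
type 'v dec = string -> int -> ('v * int) option

let toy_digit : int dec = fun s p ->
  if p < String.length s && s.[p] >= '0' && s.[p] <= '9'
  then Some (Char.code s.[p] - Char.code '0', p + 1)
  else None

(* Never fails: an invalid value becomes None and the position is unchanged. *)
let toy_opt (d : 'v dec) : 'v option dec = fun s p ->
  match d s p with
  | Some (v, p') -> Some (Some v, p')
  | None -> Some (None, p)
```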

val pair : 'a scheme -> 'b scheme -> ('a * 'b) scheme

Use pair a b to make a decoding scheme for pairs of values, which decodes its first value according to a and its second according to b.
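
Sequential composition of two decoders can be sketched as follows. This is a standalone toy under assumptions: the dec type, toy_octet, and toy_pair are hypothetical names, and None models any decoding failure.

```ocaml
(* Toy model only: a decoder maps input and position to a value and the
   next position, or None on failure. *)
type 'v dec = string -> int -> ('v * int) option

let toy_octet : int dec = fun s p ->
  if p < String.length s then Some (Char.code s.[p], p + 1) else None

(* Run a, then run b where a stopped; fail if either one fails. *)
let toy_pair (a : 'a dec) (b : 'b dec) : ('a * 'b) dec = fun s p ->
  match a s p with
  | None -> None
  | Some (va, p1) ->
    (match b s p1 with
     | None -> None
     | Some (vb, p2) -> Some ((va, vb), p2))
```

The second decoder starts exactly where the first one finished, so the pair consumes the sum of the octets consumed by its parts.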

val triple : 'a scheme -> 'b scheme -> 'c scheme -> ('a * 'b * 'c) scheme

Use triple a b c to make a decoding scheme for triples of values, which decodes its first value according to a, its second according to b, and its third value according to c.

val vec : int -> 'v scheme -> 'v array scheme

Use vec n s to make a decoding scheme for fixed-length vectors of n values, with each element decoded according to s.

Raises Invalid_argument if n < 0.

val seq : ?a:int -> ?b:int -> 'v scheme -> 'v list scheme

Use seq s to make a decoding scheme for variable-length sequences of values decoded according to s. Use ~a to specify a minimum number of elements to decode. Use ~b to specify a maximum number of elements to decode. Decoding stops when it encounters an element that is invalid according to s. Raises Invalid_argument if a is less than zero or b is less than a.
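
The bounds and the stopping rule can be modelled with a standalone toy. The dec type, toy_digit, and toy_seq are hypothetical names, None models an invalid element, and treating fewer than ~a elements as a failure is an assumption of this sketch.

```ocaml
(* Toy model only: a decoder maps input and position to a value and the
   next position, or None when the octets are not valid. *)
type 'v dec = string -> int -> ('v * int) option

let toy_digit : int dec = fun s p ->
  if p < String.length s && s.[p] >= '0' && s.[p] <= '9'
  then Some (Char.code s.[p] - Char.code '0', p + 1)
  else None

(* Collect elements until one is invalid or the maximum b is reached.
   The element decoder must consume at least one octet per success. *)
let toy_seq ?(a = 0) ?b (d : 'v dec) : 'v list dec =
  if a < 0 then invalid_arg "toy_seq: a < 0";
  (match b with Some b when b < a -> invalid_arg "toy_seq: b < a" | _ -> ());
  fun s p ->
    let rec loop acc n p =
      let full = match b with Some b -> n >= b | None -> false in
      match (if full then None else d s p) with
      | Some (v, p') -> loop (v :: acc) (n + 1) p'
      | None -> if n >= a then Some (List.rev acc, p) else None
    in
    loop [] 0 p
```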

val map : (position -> 'a -> 'b) -> 'a scheme -> 'b scheme

Use map f s to make a decoding scheme that applies f to each value decoded according to s to map the value into its corresponding type. The function f may call invalid if the map is not injective.
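
A mapping function that rejects some inputs can be sketched as below. This is a standalone toy: the dec type, toy_map, toy_color, and Toy_invalid are hypothetical names, and passing the starting position of the value to f is an assumption, since the signature alone does not say which position is supplied.

```ocaml
(* Toy model only; Toy_invalid stands in for calling the invalid function. *)
exception Toy_invalid of int * string (* hypothetical: (position, message) *)

type 'v dec = string -> int -> ('v * int) option

type color = Red | Green

let toy_octet : int dec = fun s p ->
  if p < String.length s then Some (Char.code s.[p], p + 1) else None

(* f receives the position where the value started, then the value itself. *)
let toy_map (f : int -> 'a -> 'b) (d : 'a dec) : 'b dec = fun s p ->
  match d s p with
  | None -> None
  | Some (v, p') -> Some (f p v, p')

(* Only codes 0 and 1 have a corresponding color; others are rejected. *)
let toy_color : color dec =
  toy_map
    (fun p n ->
      match n with
      | 0 -> Red
      | 1 -> Green
      | _ -> raise (Toy_invalid (p, "unknown color code")))
    toy_octet
```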

val ntyp : 't Cf_type.nym -> 't scheme -> Cf_type.opaque scheme

Use ntyp n s to create a decoding scheme that encloses the value decoded by s in an untyped value with the type indicated by n. It is a convenient and efficient shortcut for map (fun _ -> Cf_type.witness n) s.

Monad
module Monad : Cf_monad.Unary.Profile with type 'r t = 'r scheme

Use this monad to compose decoding schemes where intermediate values scanned earlier in the octet stream are used to select decoding schemes for the values scanned later in the stream.
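
Monadic composition, where an earlier value selects a later decoder, can be illustrated with a standalone toy. The dec type, bind, the >>= operator, and the decoders are all hypothetical and defined here; they are not the operations exported by the Monad module.

```ocaml
(* Toy model only: a decoder maps input and position to a value and the
   next position, or None on failure. *)
type 'v dec = string -> int -> ('v * int) option

(* Run d, then let its value choose the decoder for what follows. *)
let bind (d : 'a dec) (f : 'a -> 'b dec) : 'b dec = fun s p ->
  match d s p with
  | None -> None
  | Some (v, p') -> f v s p'

let ( >>= ) = bind

let toy_octet : int dec = fun s p ->
  if p < String.length s then Some (Char.code s.[p], p + 1) else None

let toy_u16 : int dec = fun s p ->
  if p + 2 <= String.length s
  then Some ((Char.code s.[p] lsl 8) lor Char.code s.[p + 1], p + 2)
  else None

(* A tag octet selects the width of the integer that follows it. *)
let toy_tagged : int dec =
  toy_octet >>= function
  | 0 -> toy_octet
  | 1 -> toy_u16
  | _ -> fun _ _ -> None
```

This kind of dependency cannot be expressed with pair alone, because the second decoder is not known until the first value has been decoded.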

Scanners
val of_string : 'v scheme -> string -> 'v option

Use of_string scheme octets to decode the string octets according to scheme, returning Some v if octets comprises the complete encoding of v, or None otherwise.

val of_slice : 'v scheme -> string Cf_slice.t -> 'v option

Use of_slice scheme octets to decode the slice octets according to scheme, returning Some v if octets comprises the complete encoding of v, or None otherwise.

class scanner : ?start:position -> unit -> object ... end

The class of imperative octet stream scanners. Use new scanner () to make a basic scanner object that can progressively decode values from a working slice of octets. Use inherit scanner () to derive a subclass that implements more refined behavior by overriding private methods to manipulate the working slice and the cursor position. Use the ~start parameter to initialize the starting position counter to a number other than zero. (See documentation below for the various private members.)

val string_scanner : ?start:position -> string -> scanner

Use string_scanner ?start s to make a scanner that decodes values progressively from the string s and raises Incomplete when the remaining octets in the string are insufficient to decode a value. Use the ~start parameter to initialize the starting position to a number other than zero.

val slice_scanner : ?start:position -> string Cf_slice.t -> scanner

Use slice_scanner ?start sl to make a scanner that decodes values progressively from the string slice sl and raises Incomplete when the remaining octets in the slice are insufficient to decode a value. Use the ~start parameter to initialize the starting position to a number other than zero.

val chars_scanner : ?start:position -> ?limit:int -> char Seq.t -> scanner

Use chars_scanner ?limit s to make a scanner that decodes values progressively from the character sequence s. Use ~limit to make the scan method raise Failure if the size requirement for a scanned value is larger than limit. Raises Incomplete when the end of the sequence is reached and the remaining octets are insufficient to decode a value. Use the ~start parameter to initialize the starting position to a number other than zero.

val channel_scanner : ?start:position -> ?limit:int -> in_channel -> scanner

Use channel_scanner ?limit c to make a scanner that decodes values progressively from the input channel c. Use ~limit to make the scan method raise Failure if the size requirement for a scanned value is larger than limit. Raises Incomplete when the end of the file is reached and the remaining octets are insufficient to decode a value. Use the ~start parameter to initialize the starting position to a number other than zero.

Streaming Decode
val scanner_to_vals : 'v scheme -> scanner -> 'v Seq.t

Use scanner_to_vals s sxr to make a contextually limited volatile sequence enclosing sxr, which produces the result of each progressive application of sxr#scan s. The sequence terminates when the scan method raises Incomplete with an empty working slice.

val string_to_vals : ?start:position -> 'v scheme -> string -> 'v Seq.t

Use string_to_vals ?start sch str to make a concurrently persistent sequence of values decoded progressively from str according to sch. Consuming raises Invalid or Incomplete as necessary. The sequence terminates if no values can be decoded according to sch with an empty slice and the end of str is reached. If ~start is provided then it specifies the starting position of the first octet in str. Otherwise, the starting position is zero.
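
The lazy behavior described above can be sketched with the standard Seq module. This is a standalone toy, not the library's implementation: the dec type, toy_u16, and toy_to_vals are hypothetical names, and Failure stands in for the Invalid and Incomplete exceptions.

```ocaml
(* Toy model only: a decoder maps input and position to a value and the
   next position, or None on failure. *)
type 'v dec = string -> int -> ('v * int) option

let toy_u16 : int dec = fun s p ->
  if p + 2 <= String.length s
  then Some ((Char.code s.[p] lsl 8) lor Char.code s.[p + 1], p + 2)
  else None

(* Lazily decode values until the input is exhausted. The decoder must
   consume at least one octet per success. Errors surface only when the
   offending element is forced. *)
let toy_to_vals (d : 'v dec) (s : string) : 'v Seq.t =
  let rec from p () =
    if p >= String.length s then Seq.Nil
    else
      match d s p with
      | Some (v, p') -> Seq.Cons (v, from p')
      | None -> failwith (Printf.sprintf "toy_to_vals: cannot decode at %d" p)
  in
  from 0
```

Because nothing is decoded until the sequence is consumed, an error in the middle of the input does not prevent the earlier values from being produced.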

val slice_to_vals : ?start:position -> 'v scheme -> string Cf_slice.t -> 'v Seq.t

Use slice_to_vals ?start sch sl to make a concurrently persistent sequence of values decoded progressively from sl according to sch. Consuming raises Invalid or Incomplete as necessary. The sequence terminates if no values can be decoded according to sch with an empty slice and the end of sl is reached. If ~start is provided then it specifies the starting position of the first octet in sl. Otherwise, the starting position is zero.

val chars_to_vals : ?start:position -> ?limit:int -> 'v scheme -> char Seq.t -> 'v Seq.t

Use chars_to_vals ?start ?limit sch seq as an abbreviation for scanner_to_vals sch (chars_scanner ?start ?limit seq).