# package ocaml-base-compiler

Sequences.

A sequence of type `'a Seq.t` can be thought of as a **delayed list**, that is, a list whose elements are computed only when they are demanded by a consumer. This allows sequences to be produced and transformed lazily (one element at a time) rather than eagerly (all elements at once). This also allows constructing conceptually infinite sequences.

The type `'a Seq.t` is defined as a synonym for `unit -> 'a Seq.node`. This is a function type: therefore, it is opaque. The consumer can **query** a sequence in order to request the next element (if there is one), but cannot otherwise inspect the sequence in any way.

Because it is opaque, the type `'a Seq.t` does *not* reveal whether a sequence is:

- **persistent**, which means that the sequence can be used as many times as desired, producing the same elements every time, just like an immutable list; or
- **ephemeral**, which means that the sequence is not persistent. Querying an ephemeral sequence might have an observable side effect, such as incrementing a mutable counter. As a common special case, an ephemeral sequence can be **affine**, which means that it must be queried at most once.

It also does *not* reveal whether the elements of the sequence are:

- **pre-computed and stored** in memory, which means that querying the sequence is cheap;
- **computed when first demanded and then stored** in memory, which means that querying the sequence once can be expensive, but querying the same sequence again is cheap; or
- **re-computed every time they are demanded**, which may or may not be cheap.

It is up to the programmer to keep these distinctions in mind so as to understand the time and space requirements of sequences.

For the sake of simplicity, most of the documentation that follows is written under the implicit assumption that the sequences at hand are persistent. We normally do not point out *when* or *how many times* each function is invoked, because that would be too verbose. For instance, in the description of `map`, we write: "if `xs` is the sequence `x0; x1; ...` then `map f xs` is the sequence `f x0; f x1; ...`". If we wished to be more explicit, we could point out that the transformation takes place on demand: that is, the elements of `map f xs` are computed only when they are demanded. In other words, the definition `let ys = map f xs` terminates immediately and does not invoke `f`. The function call `f x0` takes place only when the first element of `ys` is demanded, via the function call `ys()`. Furthermore, calling `ys()` twice causes `f x0` to be called twice as well. If one wishes for `f` to be applied at most once to each element of `xs`, even in scenarios where `ys` is queried more than once, then one should use `let ys = memoize (map f xs)`.
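The on-demand behaviour described above can be observed with a small experiment. This is a minimal sketch, assuming OCaml 4.14 or later (for `Seq.memoize`); the counter `calls` and the function `f` are hypothetical names introduced only for illustration.

```ocaml
(* Count how many times [f] is invoked. *)
let calls = ref 0
let f x = incr calls; x * 10

let xs = List.to_seq [1; 2; 3]

(* Building [ys] does not invoke [f]. *)
let ys = Seq.map f xs
let () = assert (!calls = 0)

(* Each query of [ys] applies [f] to the first element again. *)
let () =
  ignore (ys ());
  ignore (ys ());
  assert (!calls = 2)

(* With [memoize], the second query reuses the stored node. *)
let () =
  calls := 0;
  let zs = Seq.memoize (Seq.map f xs) in
  ignore (zs ());
  ignore (zs ());
  assert (!calls = 1)
```

Because `xs` comes from an immutable list, it is persistent, which is why querying `ys` twice is harmless here apart from the repeated work.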

As a general rule, the functions that build sequences, such as `map`, `filter`, `scan`, `take`, etc., produce sequences whose elements are computed only on demand. The functions that eagerly consume sequences, such as `is_empty`, `find`, `length`, `iter`, `fold_left`, etc., are the functions that force computation to take place.

When possible, we recommend using sequences rather than dispensers (functions of type `unit -> 'a option` that produce elements upon demand). Whereas sequences can be persistent or ephemeral, dispensers are always ephemeral, and are typically more difficult to work with than sequences. Two conversion functions, `to_dispenser` and `of_dispenser`, are provided.

`type 'a t = unit -> 'a node`

A sequence `xs` of type `'a t` is a delayed list of elements of type `'a`. Such a sequence is queried by performing a function application `xs()`. This function application returns a node, allowing the caller to determine whether the sequence is empty or nonempty, and in the latter case, to obtain its head and tail.

A node is either `Nil`, which means that the sequence is empty, or `Cons (x, xs)`, which means that `x` is the first element of the sequence and that `xs` is the remainder of the sequence.
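To make the node structure concrete, here is a minimal sketch of a sequence written directly as nested closures, together with a hand-written consumer that queries it node by node. The names `one_two` and `sum` are hypothetical and exist only for this illustration.

```ocaml
(* The sequence 1; 2, written directly as nested closures. *)
let one_two : int Seq.t =
  fun () -> Seq.Cons (1, fun () -> Seq.Cons (2, fun () -> Seq.Nil))

(* A consumer: query the sequence, then inspect the node it returns. *)
let rec sum (xs : int Seq.t) : int =
  match xs () with
  | Seq.Nil -> 0
  | Seq.Cons (x, rest) -> x + sum rest

let () = assert (sum one_two = 3)
```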

## Consuming sequences

The functions in this section consume their argument, a sequence, either partially or completely:

- `is_empty` and `uncons` consume the sequence down to depth 1. That is, they demand the first element of the sequence, if there is one.
- `iter`, `fold_left`, `length`, etc., consume the sequence all the way to its end. They terminate only if the sequence is finite.
- `for_all`, `exists`, `find`, etc. consume the sequence down to a certain depth, which is a priori unpredictable.

Similarly, among the functions that consume two sequences, one can distinguish two groups:

- `iter2` and `fold_left2` consume both sequences all the way to the end, provided the sequences have the same length.
- `for_all2`, `exists2`, `equal`, `compare` consume the sequences down to a certain depth, which is a priori unpredictable.

The functions that consume two sequences can be applied to two sequences of distinct lengths: in that case, the excess elements in the longer sequence are ignored. (It may be the case that one excess element is demanded, even though this element is not used.)

None of the functions in this section is lazy. These functions are consumers: they force some computation to take place.

`val is_empty : 'a t -> bool`

`is_empty xs` determines whether the sequence `xs` is empty.

It is recommended that the sequence `xs` be persistent. Indeed, `is_empty xs` demands the head of the sequence `xs`, so, if `xs` is ephemeral, it may be the case that `xs` cannot be used any more after this call has taken place.

If `xs` is empty, then `uncons xs` is `None`.

If `xs` is nonempty, then `uncons xs` is `Some (head xs, tail xs)`, that is, a pair of the head and tail of the sequence `xs`.

This equivalence holds if `xs` is persistent. If `xs` is ephemeral, then `uncons` must be preferred over separate calls to `head` and `tail`, which would cause `xs` to be queried twice.

`val length : 'a t -> int`

`length xs` is the length of the sequence `xs`.

The sequence `xs` must be finite.

`val iter : ('a -> unit) -> 'a t -> unit`

`iter f xs` invokes `f x` successively for every element `x` of the sequence `xs`, from left to right.

It terminates only if the sequence `xs` is finite.

`val fold_left : ('a -> 'b -> 'a) -> 'a -> 'b t -> 'a`

`fold_left f _ xs` invokes `f _ x` successively for every element `x` of the sequence `xs`, from left to right.

An accumulator of type `'a` is threaded through the calls to `f`.

It terminates only if the sequence `xs` is finite.

`val iteri : (int -> 'a -> unit) -> 'a t -> unit`

`iteri f xs` invokes `f i x` successively for every element `x` located at index `i` in the sequence `xs`.

It terminates only if the sequence `xs` is finite.

`iteri f xs` is equivalent to `iter (fun (i, x) -> f i x) (zip (ints 0) xs)`.

`val fold_lefti : ('b -> int -> 'a -> 'b) -> 'b -> 'a t -> 'b`

`fold_lefti f _ xs` invokes `f _ i x` successively for every element `x` located at index `i` of the sequence `xs`.

An accumulator of type `'b` is threaded through the calls to `f`.

It terminates only if the sequence `xs` is finite.

`fold_lefti f accu xs` is equivalent to `fold_left (fun accu (i, x) -> f accu i x) accu (zip (ints 0) xs)`.

`val for_all : ('a -> bool) -> 'a t -> bool`

`for_all p xs` determines whether all elements `x` of the sequence `xs` satisfy `p x`.

The sequence `xs` must be finite.

`val exists : ('a -> bool) -> 'a t -> bool`

`exists p xs` determines whether at least one element `x` of the sequence `xs` satisfies `p x`.

The sequence `xs` must be finite.

`val find : ('a -> bool) -> 'a t -> 'a option`

`find p xs` returns `Some x`, where `x` is the first element of the sequence `xs` that satisfies `p x`, if there is such an element.

It returns `None` if there is no such element.

The sequence `xs` must be finite.

`val find_map : ('a -> 'b option) -> 'a t -> 'b option`

`find_map f xs` returns `Some y`, where `x` is the first element of the sequence `xs` such that `f x = Some _`, if there is such an element, and where `y` is defined by `f x = Some y`.

It returns `None` if there is no such element.

The sequence `xs` must be finite.

`iter2 f xs ys` invokes `f x y` successively for every pair `(x, y)` of elements drawn synchronously from the sequences `xs` and `ys`.

If the sequences `xs` and `ys` have different lengths, then iteration stops as soon as one sequence is exhausted; the excess elements in the other sequence are ignored.

Iteration terminates only if at least one of the sequences `xs` and `ys` is finite.

`iter2 f xs ys` is equivalent to `iter (fun (x, y) -> f x y) (zip xs ys)`.

`fold_left2 f _ xs ys` invokes `f _ x y` successively for every pair `(x, y)` of elements drawn synchronously from the sequences `xs` and `ys`.

An accumulator of type `'a` is threaded through the calls to `f`.

If the sequences `xs` and `ys` have different lengths, then iteration stops as soon as one sequence is exhausted; the excess elements in the other sequence are ignored.

Iteration terminates only if at least one of the sequences `xs` and `ys` is finite.

`fold_left2 f accu xs ys` is equivalent to `fold_left (fun accu (x, y) -> f accu x y) accu (zip xs ys)`.

`for_all2 p xs ys` determines whether all pairs `(x, y)` of elements drawn synchronously from the sequences `xs` and `ys` satisfy `p x y`.

If the sequences `xs` and `ys` have different lengths, then iteration stops as soon as one sequence is exhausted; the excess elements in the other sequence are ignored. In particular, if `xs` or `ys` is empty, then `for_all2 p xs ys` is true. This is where `for_all2` and `equal` differ: `equal eq xs ys` can be true only if `xs` and `ys` have the same length.

At least one of the sequences `xs` and `ys` must be finite.

`for_all2 p xs ys` is equivalent to `for_all (fun b -> b) (map2 p xs ys)`.
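The difference between `for_all2` and `equal` on sequences of different lengths can be checked directly. This is a small sketch, assuming OCaml 4.14 or later; the sample data is made up for illustration.

```ocaml
let xs = List.to_seq [1; 2; 3]
let ys = List.to_seq [1; 2]

(* for_all2 ignores the excess element 3 of xs. *)
let () = assert (Seq.for_all2 (=) xs ys)

(* equal also requires the lengths to agree. *)
let () = assert (not (Seq.equal (=) xs ys))

(* With an empty argument, for_all2 holds whatever the predicate is. *)
let () = assert (Seq.for_all2 (fun _ _ -> false) Seq.empty xs)
```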

`exists2 p xs ys` determines whether some pair `(x, y)` of elements drawn synchronously from the sequences `xs` and `ys` satisfies `p x y`.

If the sequences `xs` and `ys` have different lengths, then iteration stops as soon as one sequence is exhausted; the excess elements in the other sequence are ignored.

At least one of the sequences `xs` and `ys` must be finite.

`exists2 p xs ys` is equivalent to `exists (fun b -> b) (map2 p xs ys)`.

Provided the function `eq` defines an equality on elements, `equal eq xs ys` determines whether the sequences `xs` and `ys` are pointwise equal.

At least one of the sequences `xs` and `ys` must be finite.

Provided the function `cmp` defines a preorder on elements, `compare cmp xs ys` compares the sequences `xs` and `ys` according to the lexicographic preorder.

For more details on comparison functions, see `Array.sort`.

At least one of the sequences `xs` and `ys` must be finite.

## Constructing sequences

The functions in this section are lazy: that is, they return sequences whose elements are computed only when demanded.

`val empty : 'a t`

`empty` is the empty sequence. It has no elements. Its length is 0.

`val return : 'a -> 'a t`

`return x` is the sequence whose sole element is `x`. Its length is 1.

`cons x xs` is the sequence that begins with the element `x`, followed with the sequence `xs`.

Writing `cons (f()) xs` causes the function call `f()` to take place immediately. For this call to be delayed until the sequence is queried, one must instead write `(fun () -> Cons(f(), xs))`.
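The eager and delayed forms of `cons` can be told apart by counting calls. The following is a minimal sketch; `calls` and `f` are hypothetical names used only for this demonstration.

```ocaml
let calls = ref 0
let f () = incr calls; 42

(* Eager: f () runs while the sequence is being built. *)
let eager = Seq.cons (f ()) Seq.empty
let () = assert (!calls = 1)

(* Delayed: building the closure does not run f (). *)
let delayed = fun () -> Seq.Cons (f (), Seq.empty)
let () = assert (!calls = 1)

(* f () runs only once the delayed sequence is queried. *)
let () = assert (List.of_seq delayed = [42])
let () = assert (!calls = 2)

(* The eager sequence already holds its element: no further call. *)
let () = assert (List.of_seq eager = [42])
let () = assert (!calls = 2)
```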

`val init : int -> (int -> 'a) -> 'a t`

`init n f` is the sequence `f 0; f 1; ...; f (n-1)`.

`n` must be nonnegative.

If desired, the infinite sequence `f 0; f 1; ...` can be defined as `map f (ints 0)`.

`val unfold : ('b -> ('a * 'b) option) -> 'b -> 'a t`

`unfold` constructs a sequence out of a step function and an initial state.

If `f u` is `None` then `unfold f u` is the empty sequence. If `f u` is `Some (x, u')` then `unfold f u` is the nonempty sequence `cons x (unfold f u')`.

For example, `unfold (function [] -> None | h :: t -> Some (h, t)) l` is equivalent to `List.to_seq l`.
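As a further illustration of `unfold`, here is a sketch of a countdown in which the state is simply the next integer to emit. The function name `countdown` is hypothetical.

```ocaml
(* State k: emit k and continue from k - 1, or stop once k <= 0. *)
let countdown n =
  Seq.unfold (fun k -> if k <= 0 then None else Some (k, k - 1)) n

let () = assert (List.of_seq (countdown 3) = [3; 2; 1])
let () = assert (List.of_seq (countdown 0) = [])
```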

`val repeat : 'a -> 'a t`

`repeat x` is the infinite sequence where the element `x` is repeated indefinitely.

`repeat x` is equivalent to `cycle (return x)`.

`val forever : (unit -> 'a) -> 'a t`

`forever f` is an infinite sequence where every element is produced (on demand) by the function call `f()`.

For instance, `forever Random.bool` is an infinite sequence of random bits.

`forever f` is equivalent to `map f (repeat ())`.

`cycle xs` is the infinite sequence that consists of an infinite number of repetitions of the sequence `xs`.

If `xs` is an empty sequence, then `cycle xs` is empty as well.

Consuming (a prefix of) the sequence `cycle xs` once can cause the sequence `xs` to be consumed more than once. Therefore, `xs` must be persistent.
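Since `repeat` and `cycle` produce infinite sequences, they are usually combined with a function such as `take` before being consumed eagerly. A small sketch, assuming OCaml 4.14 or later:

```ocaml
let abc = List.to_seq ['a'; 'b'; 'c']

(* Only a finite prefix of the infinite sequence is demanded. *)
let () =
  assert (List.of_seq (Seq.take 7 (Seq.cycle abc))
          = ['a'; 'b'; 'c'; 'a'; 'b'; 'c'; 'a'])

let () = assert (List.of_seq (Seq.take 3 (Seq.repeat 0)) = [0; 0; 0])

(* Cycling the empty sequence yields the empty sequence. *)
let () = assert (Seq.is_empty (Seq.cycle Seq.empty))
```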

`val iterate : ('a -> 'a) -> 'a -> 'a t`

`iterate f x` is the infinite sequence whose elements are `x`, `f x`, `f (f x)`, and so on.

In other words, it is the orbit of the function `f`, starting at `x`.

## Transforming sequences

The functions in this section are lazy: that is, they return sequences whose elements are computed only when demanded.

`map f xs` is the image of the sequence `xs` through the transformation `f`.

If `xs` is the sequence `x0; x1; ...` then `map f xs` is the sequence `f x0; f x1; ...`.

`mapi` is analogous to `map`, but applies the function `f` to an index and an element.

`mapi f xs` is equivalent to `map2 f (ints 0) xs`.

`filter p xs` is the sequence of the elements `x` of `xs` that satisfy `p x`.

In other words, `filter p xs` is the sequence `xs`, deprived of the elements `x` such that `p x` is false.

`filter_map f xs` is the sequence of the elements `y` such that `f x = Some y`, where `x` ranges over `xs`.

`filter_map f xs` is equivalent to `map Option.get (filter Option.is_some (map f xs))`.

If `xs` is a sequence `[x0; x1; x2; ...]`, then `scan f a0 xs` is a sequence of accumulators `[a0; a1; a2; ...]` where `a1` is `f a0 x0`, `a2` is `f a1 x1`, and so on.

Thus, `scan f a0 xs` is conceptually related to `fold_left f a0 xs`. However, instead of performing an eager iteration and immediately returning the final accumulator, it returns a sequence of accumulators.

For instance, `scan (+) 0` transforms a sequence of integers into the sequence of its partial sums.

If `xs` has length `n` then `scan f a0 xs` has length `n+1`.
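The partial-sums example can be written out as follows. This is a minimal sketch, assuming OCaml 4.14 or later; note that the initial accumulator `0` is itself the first element of the result.

```ocaml
let xs = List.to_seq [1; 2; 3; 4]
let sums = Seq.scan (+) 0 xs

(* a0 = 0, a1 = 0 + 1, a2 = 1 + 2, a3 = 3 + 3, a4 = 6 + 4 *)
let () = assert (List.of_seq sums = [0; 1; 3; 6; 10])

(* The result has one more element than the input. *)
let () = assert (Seq.length sums = Seq.length xs + 1)

(* The last accumulator is what fold_left would return. *)
let () = assert (Seq.fold_left (+) 0 xs = 10)
```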

`take n xs` is the sequence of the first `n` elements of `xs`.

If `xs` has fewer than `n` elements, then `take n xs` is equivalent to `xs`.

`n` must be nonnegative.

`drop n xs` is the sequence `xs`, deprived of its first `n` elements.

If `xs` has fewer than `n` elements, then `drop n xs` is empty.

`n` must be nonnegative.

`drop` is lazy: the first `n+1` elements of the sequence `xs` are demanded only when the first element of `drop n xs` is demanded. For this reason, `drop 1 xs` is *not* equivalent to `tail xs`, which queries `xs` immediately.
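The laziness of `drop` can be observed with an instrumented sequence. The sketch below assumes OCaml 4.14 or later; `queries` and `counted` are hypothetical helpers that count how many nodes of the underlying sequence are forced.

```ocaml
let queries = ref 0

(* The infinite sequence i; i+1; ... that counts each query. *)
let rec counted i () =
  incr queries;
  Seq.Cons (i, counted (i + 1))

(* Building the dropped sequence demands nothing. *)
let ys = Seq.drop 2 (counted 0)
let () = assert (!queries = 0)

(* Demanding its first element forces the first n+1 = 3 nodes. *)
let () =
  match ys () with
  | Seq.Cons (x, _) ->
      assert (x = 2);
      assert (!queries = 3)
  | Seq.Nil -> assert false
```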

`take_while p xs` is the longest prefix of the sequence `xs` where every element `x` satisfies `p x`.

`drop_while p xs` is the sequence `xs`, deprived of the prefix `take_while p xs`.

Provided the function `eq` defines an equality on elements, `group eq xs` is the sequence of the maximal runs of adjacent duplicate elements of the sequence `xs`.

Every element of `group eq xs` is a nonempty sequence of equal elements.

The concatenation `concat (group eq xs)` is equal to `xs`.

Consuming `group eq xs`, and consuming the sequences that it contains, can cause `xs` to be consumed more than once. Therefore, `xs` must be persistent.
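A short sketch of `group`, assuming OCaml 4.14 or later. The input comes from a list and is therefore persistent, as required; the runs are converted back to lists only so that they can be compared.

```ocaml
let xs = List.to_seq [1; 1; 2; 3; 3; 3]

(* Each run of adjacent equal elements becomes one inner sequence. *)
let runs = List.map List.of_seq (List.of_seq (Seq.group (=) xs))
let () = assert (runs = [[1; 1]; [2]; [3; 3; 3]])

(* Concatenating the runs gives back the original elements. *)
let () =
  assert (List.of_seq (Seq.concat (Seq.group (=) xs)) = List.of_seq xs)
```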

The sequence `memoize xs` has the same elements as the sequence `xs`.

Regardless of whether `xs` is ephemeral or persistent, `memoize xs` is persistent: even if it is queried several times, `xs` is queried at most once.

The construction of the sequence `memoize xs` internally relies on suspensions provided by the module `Lazy`. These suspensions are *not* thread-safe. Therefore, the sequence `memoize xs` must *not* be queried by multiple threads concurrently.

The exception `Forced_twice` is raised when a sequence returned by `once` (or a suffix of it) is queried more than once.

The sequence `once xs` has the same elements as the sequence `xs`.

Regardless of whether `xs` is ephemeral or persistent, `once xs` is an ephemeral sequence: it can be queried at most once. If it (or a suffix of it) is queried more than once, then the exception `Forced_twice` is raised. This can be useful, while debugging or testing, to ensure that a sequence is consumed at most once.
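The contrast between `once` and `memoize` can be sketched as follows, assuming OCaml 4.14 or later. The name `second_query_fails` is hypothetical.

```ocaml
let xs = Seq.once (List.to_seq [1; 2; 3])

(* The first full traversal succeeds. *)
let () = assert (List.of_seq xs = [1; 2; 3])

(* Querying the same sequence a second time raises Forced_twice. *)
let second_query_fails =
  match xs () with
  | exception Seq.Forced_twice -> true
  | Seq.Nil | Seq.Cons _ -> false
let () = assert second_query_fails

(* By contrast, a memoized sequence can be traversed repeatedly. *)
let ys = Seq.memoize (List.to_seq [1; 2; 3])
let () = assert (List.of_seq ys = List.of_seq ys)
```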

If `xss` is a matrix (a sequence of rows), then `transpose xss` is the sequence of the columns of the matrix `xss`.

The rows of the matrix `xss` are not required to have the same length.

The matrix `xss` is not required to be finite (in either direction).

The matrix `xss` must be persistent.
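A small sketch of `transpose` on rows of different lengths, assuming OCaml 4.14 or later. The rows are deliberately listed from longest to shortest, so that each column is unambiguous; the matrix is built from lists and is therefore persistent.

```ocaml
let rows = List.to_seq (List.map List.to_seq [[1; 2; 3]; [4; 5]; [6]])

(* Convert the columns back to lists for comparison. *)
let cols = List.map List.of_seq (List.of_seq (Seq.transpose rows))

let () = assert (cols = [[1; 4; 6]; [2; 5]; [3]])
```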

## Combining sequences

`append xs ys` is the concatenation of the sequences `xs` and `ys`.

Its elements are the elements of `xs`, followed by the elements of `ys`.

If `xss` is a sequence of sequences, then `concat xss` is its concatenation.

If `xss` is the sequence `xs0; xs1; ...` then `concat xss` is the sequence `xs0 @ xs1 @ ...`.

`concat_map f xs` is equivalent to `concat (map f xs)`.

`concat_map` is an alias for `flat_map`.

`zip xs ys` is the sequence of pairs `(x, y)` drawn synchronously from the sequences `xs` and `ys`.

If the sequences `xs` and `ys` have different lengths, then the sequence ends as soon as one sequence is exhausted; the excess elements in the other sequence are ignored.

`zip xs ys` is equivalent to `map2 (fun a b -> (a, b)) xs ys`.

`map2 f xs ys` is the sequence of the elements `f x y`, where the pairs `(x, y)` are drawn synchronously from the sequences `xs` and `ys`.

If the sequences `xs` and `ys` have different lengths, then the sequence ends as soon as one sequence is exhausted; the excess elements in the other sequence are ignored.

`map2 f xs ys` is equivalent to `map (fun (x, y) -> f x y) (zip xs ys)`.
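The truncation behaviour of `zip` and `map2`, and the equivalence between them, can be checked on sequences of different lengths. A minimal sketch, assuming OCaml 4.14 or later, with made-up sample data:

```ocaml
let xs = List.to_seq [1; 2; 3]
let ys = List.to_seq ["a"; "b"]

(* The excess element 3 of xs is ignored. *)
let () = assert (List.of_seq (Seq.zip xs ys) = [(1, "a"); (2, "b")])

let label x y = string_of_int x ^ y
let () = assert (List.of_seq (Seq.map2 label xs ys) = ["1a"; "2b"])

(* map2 agrees with mapping over the zipped sequence. *)
let () =
  assert (List.of_seq (Seq.map2 label xs ys)
          = List.of_seq (Seq.map (fun (x, y) -> label x y) (Seq.zip xs ys)))
```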

`interleave xs ys` is the sequence that begins with the first element of `xs`, continues with the first element of `ys`, and so on.

When one of the sequences `xs` and `ys` is exhausted, `interleave xs ys` continues with the rest of the other sequence.

If the sequences `xs` and `ys` are sorted according to the total preorder `cmp`, then `sorted_merge cmp xs ys` is the sorted sequence obtained by merging the sequences `xs` and `ys`.

For more details on comparison functions, see `Array.sort`.
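The following sketch contrasts `interleave`, which alternates blindly, with `sorted_merge`, which relies on the order of its inputs. It assumes OCaml 4.14 or later; the sample data is invented for illustration.

```ocaml
let a = List.to_seq [1; 4; 6]
let b = List.to_seq [2; 3; 7]

(* interleave alternates between the two sequences. *)
let () = assert (List.of_seq (Seq.interleave a b) = [1; 2; 4; 3; 6; 7])

(* sorted_merge produces a sorted result from two sorted inputs. *)
let () =
  assert (List.of_seq (Seq.sorted_merge Int.compare a b)
          = [1; 2; 3; 4; 6; 7])

(* When one side runs out, interleave continues with the other. *)
let () =
  let long = List.to_seq [1; 3; 5; 7] and short = List.to_seq [2; 4] in
  assert (List.of_seq (Seq.interleave long short) = [1; 2; 3; 4; 5; 7])
```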

`product xs ys` is the Cartesian product of the sequences `xs` and `ys`.

For every element `x` of `xs` and for every element `y` of `ys`, the pair `(x, y)` appears once as an element of `product xs ys`.

The order in which the pairs appear is unspecified.

The sequences `xs` and `ys` are not required to be finite.

The sequences `xs` and `ys` must be persistent.

The sequence `map_product f xs ys` is the image through `f` of the Cartesian product of the sequences `xs` and `ys`.

For every element `x` of `xs` and for every element `y` of `ys`, the element `f x y` appears once as an element of `map_product f xs ys`.

The order in which these elements appear is unspecified.

The sequences `xs` and `ys` are not required to be finite.

The sequences `xs` and `ys` must be persistent.

`map_product f xs ys` is equivalent to `map (fun (x, y) -> f x y) (product xs ys)`.
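Because the order of the elements is unspecified, code that inspects a product should not depend on it. The sketch below, assuming OCaml 4.14 or later, sorts the results before comparing them.

```ocaml
let xs = List.to_seq [1; 2]
let ys = List.to_seq ['a'; 'b']

(* Sort before comparing, since the order of the pairs is unspecified. *)
let pairs = List.sort compare (List.of_seq (Seq.product xs ys))
let () = assert (pairs = [(1, 'a'); (1, 'b'); (2, 'a'); (2, 'b')])

(* Both arguments may be infinite: a finite prefix can still be taken. *)
let prefix =
  List.of_seq (Seq.take 10 (Seq.product (Seq.ints 0) (Seq.ints 0)))
let () = assert (List.length prefix = 10)

(* map_product applies a function to each pair. *)
let sums =
  List.sort compare
    (List.of_seq (Seq.map_product (+) xs (List.to_seq [10; 20])))
let () = assert (sums = [11; 12; 21; 22])
```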

## Splitting a sequence into two sequences

`unzip` transforms a sequence of pairs into a pair of sequences.

`unzip xs` is equivalent to `(map fst xs, map snd xs)`.

Querying either of the sequences returned by `unzip xs` causes `xs` to be queried. Therefore, querying both of them causes `xs` to be queried twice. Thus, `xs` must be persistent and cheap. If that is not the case, use `unzip (memoize xs)`.
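The need for `memoize` shows up as soon as the input is ephemeral. The following sketch, assuming OCaml 4.14 or later, builds an ephemeral sequence of pairs from a dispenser; the helper `make` is hypothetical.

```ocaml
(* An ephemeral sequence (1, 1); (2, 4); (3, 9) backed by a counter. *)
let make () =
  let r = ref 0 in
  Seq.of_dispenser (fun () ->
    if !r < 3 then (incr r; Some (!r, !r * !r)) else None)

(* With memoize, both components see every element. *)
let ns, squares = Seq.unzip (Seq.memoize (make ()))
let () = assert (List.of_seq ns = [1; 2; 3])
let () = assert (List.of_seq squares = [1; 4; 9])

(* Without it, consuming the first component exhausts the dispenser,
   so the second component finds nothing left. *)
let ns', squares' = Seq.unzip (make ())
let () = assert (List.of_seq ns' = [1; 2; 3])
let () = assert (List.of_seq squares' = [])
```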

`partition_map f xs` returns a pair of sequences `(ys, zs)`, where:

- `ys` is the sequence of the elements `y` such that `f x = Left y`, where `x` ranges over `xs`;
- `zs` is the sequence of the elements `z` such that `f x = Right z`, where `x` ranges over `xs`.

`partition_map f xs` is equivalent to a pair of `filter_map Either.find_left (map f xs)` and `filter_map Either.find_right (map f xs)`.

Querying either of the sequences returned by `partition_map f xs` causes `xs` to be queried. Therefore, querying both of them causes `xs` to be queried twice. Thus, `xs` must be persistent and cheap. If that is not the case, use `partition_map f (memoize xs)`.

`partition p xs` returns a pair of the subsequence of the elements of `xs` that satisfy `p` and the subsequence of the elements of `xs` that do not satisfy `p`.

`partition p xs` is equivalent to `filter p xs, filter (fun x -> not (p x)) xs`.

Consuming both of the sequences returned by `partition p xs` causes `xs` to be consumed twice and causes the function `p` to be applied twice to each element of the sequence. Therefore, `p` should be pure and cheap. Furthermore, `xs` should be persistent and cheap. If that is not the case, use `partition p (memoize xs)`.
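The double application of the predicate can be made visible by counting calls. This is a minimal sketch, assuming OCaml 4.14 or later; `calls` and `is_even` are hypothetical names.

```ocaml
let calls = ref 0
let is_even x = incr calls; x mod 2 = 0

let xs = List.to_seq [1; 2; 3; 4; 5]
let evens, odds = Seq.partition is_even xs

(* Both results are lazy: nothing has been tested yet. *)
let () = assert (!calls = 0)

let () = assert (List.of_seq evens = [2; 4])
let () = assert (List.of_seq odds = [1; 3; 5])

(* Each of the 5 elements was tested once per result sequence. *)
let () = assert (!calls = 10)
```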

## Converting between sequences and dispensers

A dispenser is a representation of a sequence as a function of type `unit -> 'a option`. Every time this function is invoked, it returns the next element of the sequence. When there are no more elements, it returns `None`. A dispenser has mutable internal state and is therefore ephemeral: the sequence that it represents can be consumed at most once.

`val of_dispenser : (unit -> 'a option) -> 'a t`

`of_dispenser it` is the sequence of the elements produced by the dispenser `it`. It is an ephemeral sequence: it can be consumed at most once. If a persistent sequence is needed, use `memoize (of_dispenser it)`.

`val to_dispenser : 'a t -> unit -> 'a option`

`to_dispenser xs` is a fresh dispenser on the sequence `xs`.

This dispenser has mutable internal state, which is not protected by a lock; so, it must not be used by several threads concurrently.
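Both conversion directions can be sketched in a few lines, assuming OCaml 4.14 or later. The dispenser `counter` is a hypothetical example that produces 1, 2, 3 and then stops.

```ocaml
(* From a sequence to a dispenser: each call yields the next element. *)
let next = Seq.to_dispenser (List.to_seq [1; 2])
let () = assert (next () = Some 1)
let () = assert (next () = Some 2)
let () = assert (next () = None)

(* From a dispenser to a sequence. *)
let counter =
  let r = ref 0 in
  fun () -> if !r < 3 then (incr r; Some !r) else None

(* memoize makes the resulting sequence persistent. *)
let ys = Seq.memoize (Seq.of_dispenser counter)
let () = assert (List.of_seq ys = [1; 2; 3])
let () = assert (List.of_seq ys = [1; 2; 3])
```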

## Sequences of integers

`val ints : int -> int t`

`ints i` is the infinite sequence of the integers beginning at `i` and counting up.