OCaml's byte sequence type, semantically similar to a
char array, but taking less space in memory.
A byte sequence is a mutable data structure that contains a fixed-length sequence of bytes (of type
char). Each byte can be indexed in constant time for reading or writing.
include Sexplib0.Sexpable.S with type t := t
val t_of_sexp : Sexplib0.Sexp.t -> t
val sexp_of_t : t -> Sexplib0.Sexp.t
val t_sexp_grammar : t Sexplib0.Sexp_grammar.t
include Blit.S with type t := t
include Comparable.S with type t := t
include Comparisons.S with type t := t
compare t1 t2 returns 0 if t1 is equal to t2, a negative integer if t1 is less than t2, and a positive integer if t1 is greater than t2.
ascending is identical to compare. descending x y = ascending y x. These are intended to be mnemonic when used like List.sort ~compare:ascending and List.sort ~compare:descending, since they cause the list to be sorted in ascending or descending order, respectively.
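The ordering functions above can be sketched with the OCaml standard library's Bytes module, used here as an assumption so the snippet runs without extra dependencies. The names ascending and descending are defined locally to mirror the ones described above, and the standard library's List.sort takes the comparison as a positional argument rather than as ~compare.

```ocaml
(* Local stand-ins for the [ascending] / [descending] functions described
   above, built on the standard library's [Bytes.compare]. *)
let ascending = Bytes.compare
let descending x y = ascending y x

let words = List.map Bytes.of_string [ "pear"; "apple"; "fig" ]

(* Sort the same list in both directions and convert back to strings. *)
let sorted_up = List.map Bytes.to_string (List.sort ascending words)
let sorted_down = List.map Bytes.to_string (List.sort descending words)
```

Here sorted_up is ordered "apple", "fig", "pear" and sorted_down is the reverse.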
clamp_exn t ~min ~max returns t', the closest value to t such that between t' ~low:min ~high:max is true.
Raises if not (min <= max).
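The clamping rule can be illustrated with a small re-implementation on top of the standard library's Bytes.compare. This is only a sketch of the behaviour described above, not the library's code, and the choice of Invalid_argument for the failure case is an assumption.

```ocaml
(* Hypothetical re-implementation of the clamping rule, for illustration. *)
let clamp_exn t ~min ~max =
  if Bytes.compare min max > 0 then invalid_arg "clamp_exn: min > max";
  if Bytes.compare t min < 0 then min
  else if Bytes.compare t max > 0 then max
  else t

let low = Bytes.of_string "b"
let high = Bytes.of_string "m"

let below = clamp_exn (Bytes.of_string "a") ~min:low ~max:high   (* clamps up to [low] *)
let inside = clamp_exn (Bytes.of_string "c") ~min:low ~max:high  (* left unchanged *)
let above = clamp_exn (Bytes.of_string "z") ~min:low ~max:high   (* clamps down to [high] *)
```

The non-raising clamp variant reports the same failure case through Or_error.t instead of an exception.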
val clamp : t -> min:t -> max:t -> t Or_error.t
include Comparator.S with type t := t
val comparator : (t, comparator_witness) Comparator.comparator
include Stringable.S with type t := t
val of_string : string -> t
val to_string : t -> string
pp allocates in order to preserve the state of the byte sequence it was initially called with.
include Pretty_printer.S with type t := t
val pp : Formatter.t -> t -> unit
include Invariant.S with type t := t
val invariant : t -> unit
module To_string : sig ... end
module From_string : Blit.S_distinct with type src := string and type dst := t
val create : int -> t
create len returns a newly-allocated and uninitialized byte sequence of length
len. No guarantees are made about the contents of the return value.
val make : int -> char -> t
make len c returns a newly-allocated byte sequence of length len filled with the byte c.
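To make the difference between create and make concrete, here is a minimal sketch. It assumes the OCaml standard library's Bytes module, where these two functions have the same signatures as above, so that it runs without extra dependencies.

```ocaml
(* [make] gives every byte a known value. *)
let dashes = Bytes.make 4 '-'

(* [create] leaves the contents unspecified, so write every byte before
   reading any of them. *)
let scratch = Bytes.create 4

let () =
  for i = 0 to Bytes.length scratch - 1 do
    Bytes.set scratch i (Char.chr (Char.code '0' + i))
  done
```

After the loop, scratch holds the bytes '0' through '3'; reading it before the loop would have yielded arbitrary data.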
map f t applies function f to every byte, in order, and builds the byte sequence with the results returned by f.
mapi is like map, but passes each character's index to f along with the char.
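A short sketch of both mapping functions follows. It assumes the standard library's Bytes.map and Bytes.mapi, which take the function as a positional first argument; with the module documented here the function would typically be passed as a labelled ~f argument instead.

```ocaml
let original = Bytes.of_string "abc"

(* Apply the same function to every byte. *)
let upper = Bytes.map Char.uppercase_ascii original

(* The indexed variant also receives each byte's position. *)
let alternating =
  Bytes.mapi
    (fun i c -> if i mod 2 = 0 then Char.uppercase_ascii c else c)
    original
```

Both calls build a new byte sequence; original is left untouched.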
val init : int -> f:(int -> char) -> t
init len ~f returns a newly-allocated byte sequence of length len with index i in the sequence being initialized with the result of f i.
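As an illustration of index-based initialization, the sketch below assumes the standard library's Bytes.init, which takes the function positionally rather than as ~f.

```ocaml
(* Byte i is computed from its own index: 'a' + i. *)
let alphabet_prefix = Bytes.init 5 (fun i -> Char.chr (Char.code 'a' + i))

(* A length of zero never calls the function and yields an empty sequence. *)
let empty = Bytes.init 0 (fun _ -> assert false)
```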
val of_char_list : char list -> t
of_char_list l returns a newly-allocated byte sequence where each byte in the sequence corresponds to the byte in
l at the same index.
val length : t -> int
length t returns the number of bytes in t.
val get : t -> int -> char
get t i returns the ith byte of t.
val unsafe_get : t -> int -> char
val set : t -> int -> char -> unit
set t i c sets the ith byte of t to c.
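Indexed reads and writes can be sketched as follows, assuming the standard library's Bytes.get and Bytes.set, whose signatures match the ones above. Indices start at 0, and in the standard library an out-of-range index raises Invalid_argument.

```ocaml
let word = Bytes.of_string "cat"

(* Read the first byte, then overwrite it in place. *)
let first = Bytes.get word 0
let () = Bytes.set word 0 'b'

(* Index 3 is one past the end of a 3-byte sequence. *)
let out_of_range_raises =
  match Bytes.get word 3 with
  | _ -> false
  | exception Invalid_argument _ -> true
```

The unsafe_get and unsafe_set variants skip this bounds check, so they are only appropriate when the index is already known to be valid.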
val unsafe_set : t -> int -> char -> unit
val unsafe_get_int64 : t -> int -> int64
val unsafe_set_int64 : t -> int -> int64 -> unit
val fill : t -> pos:int -> len:int -> char -> unit
fill t ~pos ~len c modifies t in place, replacing all the bytes from pos (inclusive) to pos + len (exclusive) with c.
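The range convention is easiest to see on a concrete sequence. This sketch assumes the standard library's Bytes.fill, which takes pos and len positionally instead of as ~pos and ~len.

```ocaml
let digits = Bytes.of_string "0123456789"

(* Overwrite 3 bytes starting at index 2, i.e. indices 2, 3 and 4. *)
let () = Bytes.fill digits 2 3 '*'
```

The byte at index 5 (pos + len) is not touched.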
val tr : target:char -> replacement:char -> t -> unit
tr ~target ~replacement t modifies t in place, replacing every instance of target in t with replacement.
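The standard library has no direct equivalent of tr, so the sketch below defines a hypothetical local version with the same signature to illustrate the in-place behaviour. It is not the library's implementation.

```ocaml
(* Hypothetical stand-in for [tr]: replace one character everywhere, in place. *)
let tr ~target ~replacement t =
  for i = 0 to Bytes.length t - 1 do
    if Char.equal (Bytes.get t i) target then Bytes.set t i replacement
  done

let path = Bytes.of_string "a.b.c"
let () = tr ~target:'.' ~replacement:'-' path
```

Because the function returns unit, the result is observed through the original value: path itself now contains the replaced characters.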
tr_multi ~target ~replacement returns an in-place function that replaces every instance of a character in target with the corresponding character in replacement.
If replacement is shorter than target, it is lengthened by repeating its last character. Empty replacement is illegal unless target also is.
If target contains multiple copies of the same character, the last corresponding replacement character is used. Note that character ranges are not supported, so ~target:"a-z" means the literal characters 'a', '-', and 'z'.
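The rules above (padding a short replacement, last duplicate wins, no ranges) can be captured with a 256-entry lookup table. The function below is a hypothetical re-implementation for illustration: it returns a plain closure, and it signals the illegal empty-replacement case with Invalid_argument, both of which are assumptions rather than the library's actual interface.

```ocaml
(* Hypothetical sketch of the [tr_multi] rules using a translation table. *)
let tr_multi ~target ~replacement =
  if String.length replacement = 0 && String.length target > 0 then
    invalid_arg "tr_multi: empty replacement with non-empty target";
  (* Start from the identity mapping: every byte maps to itself. *)
  let table = Bytes.init 256 Char.chr in
  String.iteri
    (fun i c ->
      (* Past the end of [replacement], reuse its last character. *)
      let j = min i (String.length replacement - 1) in
      (* Later entries overwrite earlier ones, so the last duplicate wins. *)
      Bytes.set table (Char.code c) replacement.[j])
    target;
  fun t ->
    for i = 0 to Bytes.length t - 1 do
      Bytes.set t i (Bytes.get table (Char.code (Bytes.get t i)))
    done

let padded = Bytes.of_string "aabbcc"
let () = tr_multi ~target:"abc" ~replacement:"xy" padded

let literal = Bytes.of_string "a-z m"
let () = tr_multi ~target:"a-z" ~replacement:"123" literal
```

In the first call 'c' maps to 'y' because the replacement is padded with its last character. In the second, only the three literal characters 'a', '-' and 'z' are translated, and 'm' is left alone.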
val to_list : t -> char list
to_list t returns the bytes in
t as a list of chars.
val to_array : t -> char array
to_array t returns the bytes in
t as an array of chars.
val fold : t -> init:'a -> f:('a -> char -> 'a) -> 'a
fold t ~init ~f is f (... (f (f init b1) b2) ...) bn, where b1, ..., bn are the bytes of t in order.
val foldi : t -> init:'a -> f:(int -> 'a -> char -> 'a) -> 'a
foldi works similarly to fold, but also passes the index of each character to f.
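To show the left-to-right evaluation order of fold and foldi, the sketch below defines local versions with the same labelled signatures on top of the standard library's Bytes.iter and Bytes.iteri. These definitions are illustrative stand-ins, not the library's code.

```ocaml
(* Local stand-ins with the signatures documented above. *)
let fold t ~init ~f =
  let acc = ref init in
  Bytes.iter (fun c -> acc := f !acc c) t;
  !acc

let foldi t ~init ~f =
  let acc = ref init in
  Bytes.iteri (fun i c -> acc := f i !acc c) t;
  !acc

let digits = Bytes.of_string "1234"

(* Sum the decimal digits. *)
let sum = fold digits ~init:0 ~f:(fun acc c -> acc + (Char.code c - Char.code '0'))

(* Consing onto the accumulator reverses the bytes, which shows that the
   first byte is processed first. *)
let reversed = fold digits ~init:[] ~f:(fun acc c -> c :: acc)

(* Collect the indices at which 'a' occurs. *)
let a_positions =
  foldi (Bytes.of_string "banana") ~init:[] ~f:(fun i acc c ->
      if Char.equal c 'a' then i :: acc else acc)
  |> List.rev
```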
val contains : ?pos:int -> ?len:int -> t -> char -> bool
contains ?pos ?len t c returns true iff c appears in t between pos and pos + len.
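The effect of the optional range can be sketched with a hypothetical local contains. The defaults used here (pos defaults to 0 and len to the remainder of the sequence) and the Invalid_argument raised on a bad range are assumptions made for the example.

```ocaml
(* Hypothetical stand-in for [contains], searching indices
   pos .. pos + len - 1. *)
let contains ?(pos = 0) ?len t c =
  let len = match len with Some l -> l | None -> Bytes.length t - pos in
  if pos < 0 || len < 0 || pos + len > Bytes.length t then
    invalid_arg "contains: invalid range";
  let rec loop i =
    i < pos + len && (Char.equal (Bytes.get t i) c || loop (i + 1))
  in
  loop pos

let greeting = Bytes.of_string "hello world"

let anywhere = contains greeting 'w'                     (* whole sequence *)
let in_first_word = contains ~pos:0 ~len:5 greeting 'w'  (* only "hello" *)
let in_second_word = contains ~pos:6 greeting 'w'        (* only "world" *)
```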
Maximum length of a byte sequence, which is architecture-dependent. Attempting to create a
Bytes larger than this will raise an exception.
Unsafe conversions (for advanced users)
This section describes unsafe, low-level conversion functions between bytes and string. They might not copy the internal data; used improperly, they can break the immutability invariant on strings provided by the -safe-string option. They are available for expert library authors, but for most purposes you should use the always-correct to_string and of_string instead.
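The safe conversions are always correct because they copy. The sketch below, which assumes the standard library's Bytes.to_string and Bytes.of_string (same signatures as the functions declared earlier), shows that later mutation of the byte sequence cannot be observed through a string obtained this way.

```ocaml
let buffer = Bytes.of_string "abc"

(* [to_string] copies, so [snapshot] is independent of [buffer]. *)
let snapshot = Bytes.to_string buffer
let () = Bytes.set buffer 0 'X'

(* [of_string] also copies, so mutating the result leaves the source intact. *)
let source = "lit"
let editable = Bytes.of_string source
let () = Bytes.set editable 0 'L'
```

After these steps snapshot still reads "abc" and source still reads "lit".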
val unsafe_to_string : no_mutation_while_string_reachable:t -> string
Unsafely convert a byte sequence into a string.
To reason about the use of
unsafe_to_string, it is convenient to consider an "ownership" discipline. A piece of code that manipulates some data "owns" it; there are several disjoint ownership modes, including:
- Unique ownership: the data may be accessed and mutated
- Shared ownership: the data has several owners, that may only access it, not mutate it.
Unique ownership is linear: passing the data to another piece of code means giving up ownership (we cannot access the data again). A unique owner may decide to make the data shared (giving up mutation rights on it), but shared data may not become uniquely-owned again.
unsafe_to_string s can only be used when the caller owns the byte sequence
s -- either uniquely or as shared immutable data. The caller gives up ownership of
s, and gains (the same mode of) ownership of the returned string. There are two valid use-cases that respect this ownership discipline:
The first is creating a string by initializing and mutating a byte sequence that is never changed after initialization is performed.
let string_init len f : string =
  let s = Bytes.create len in
  for i = 0 to len - 1 do
    Bytes.set s i (f i)
  done;
  Bytes.unsafe_to_string ~no_mutation_while_string_reachable:s
This function is safe because the byte sequence s will never be accessed or mutated after unsafe_to_string is called. The string_init code gives up ownership of s, and returns the ownership of the resulting string to its caller.
Note that it would be unsafe if s was passed as an additional parameter to the function f as it could escape this way and be mutated in the future -- string_init would give up ownership of s to pass it to f, and could not call unsafe_to_string safely.
We have provided the String.init, String.map and String.mapi functions to cover most cases of building new strings. You should prefer those over to_string or unsafe_to_string whenever applicable.
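As a sketch of the safe alternatives, the standard library's String.init, String.map and String.mapi build new strings directly, without exposing an intermediate mutable byte sequence. With the module family documented here, the function arguments would typically be passed as labelled ~f arguments instead.

```ocaml
(* Build a string from an index function: the last digit of i * i. *)
let square_digits =
  String.init 5 (fun i -> Char.chr (Char.code '0' + (i * i mod 10)))

(* Transform every character. *)
let shouted = String.map Char.uppercase_ascii "abc"

(* Transform with access to the index: capitalize only the first character. *)
let capitalized =
  String.mapi (fun i c -> if i = 0 then Char.uppercase_ascii c else c) "hello"
```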
The second is temporarily giving ownership of a byte sequence to a function that expects a uniquely owned string and returns ownership back, so that we can mutate the sequence again after the call ended.
let bytes_length (s : bytes) =
  String.length (Bytes.unsafe_to_string ~no_mutation_while_string_reachable:s)
In this use-case, we do not promise that s will never be mutated after the call to bytes_length s. The String.length function temporarily borrows unique ownership of the byte sequence (and sees it as a string), but returns this ownership back to the caller, which may assume that s is still a valid byte sequence after the call. Note that this is only correct because we know that String.length does not capture its argument -- it could escape by a side-channel such as a memoization combinator. The caller may not mutate s while the string is borrowed (it has temporarily given up ownership). This affects concurrent programs, but also higher-order functions: if String.length returned a closure to be called later, s should not be mutated until this closure is fully applied and returns ownership.
val unsafe_of_string_promise_no_mutation : string -> t
Unsafely convert a shared string to a byte sequence that should not be mutated.
The same ownership discipline that makes
unsafe_to_string correct applies to
unsafe_of_string_promise_no_mutation; however, unique ownership of string values is extremely difficult to reason about correctly in practice. As such, one should always assume strings are shared, never uniquely owned. (For example, string literals are implicitly shared by the compiler, so you never uniquely own them.)
The only case we have reasonable confidence is safe is if the produced
bytes is shared -- used as an immutable byte sequence. This is possibly useful for incremental migration of low-level programs that manipulate immutable sequences of bytes (for example
Marshal.from_bytes) and previously used the
string type for this purpose.
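The migration scenario mentioned above can be sketched as follows. The example assumes the standard library's Bytes.unsafe_of_string, which plays the role of unsafe_of_string_promise_no_mutation here, and uses the resulting bytes strictly as a read-only view passed to Marshal.from_bytes.

```ocaml
(* A string produced by older code that used [string] for raw bytes. *)
let payload : string = Marshal.to_string [ 1; 2; 3 ] []

(* View the same data as [bytes] without copying. This value must never be
   mutated, because [payload] is still reachable as an immutable string. *)
let view : bytes = Bytes.unsafe_of_string payload

(* Reading from the view is fine: [Marshal.from_bytes] only reads it. *)
let decoded : int list = Marshal.from_bytes view 0
```

If any code path might write to the view, the copying of_string is the right choice instead.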