Unsafely convert a byte sequence into a string.
To reason about the use of unsafe_to_string, it is convenient to consider an "ownership" discipline. A piece of code that manipulates some data "owns" it; there are several disjoint ownership modes, including:
- Unique ownership: the data may be accessed and mutated.
- Shared ownership: the data has several owners, that may only access it, not mutate it.
Unique ownership is linear: passing the data to another piece of code means giving up ownership (we cannot write the data again). A unique owner may decide to make the data shared (giving up mutation rights on it), but shared data may not become uniquely-owned again.
unsafe_to_string s can only be used when the caller owns the byte sequence s -- either uniquely or as shared immutable data. The caller gives up ownership of s, and gains ownership of the returned string.
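When the caller cannot give up ownership of the byte sequence, the copying conversion Bytes.to_string is the safe choice. The following is a minimal sketch of that situation; the names b and copy are illustrative only and do not come from the library.

```ocaml
(* Safe alternative: Bytes.to_string copies, so the caller keeps ownership of b. *)
let b = Bytes.of_string "hello"

(* A fresh string that shares no memory with b. *)
let copy = Bytes.to_string b

(* Still allowed: ownership of b was never given up. *)
let () = Bytes.set b 0 'j'

(* copy is unaffected by the mutation. Had copy been obtained with
   Bytes.unsafe_to_string instead, this Bytes.set would have violated
   the ownership discipline described above. *)
```

The price of the safe version is one allocation and one copy of the data, which is what unsafe_to_string avoids.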
There are two valid use-cases that respect this ownership discipline:
1. Creating a string by initializing and mutating a byte sequence that is never changed after initialization is performed.
let string_init len f : string =
  let s = Bytes.create len in
  for i = 0 to len - 1 do Bytes.set s i (f i) done;
  Bytes.unsafe_to_string s
This function is safe because the byte sequence s will never be accessed or mutated after unsafe_to_string is called. The string_init code gives up ownership of s, and returns the ownership of the resulting string to its caller.
Note that it would be unsafe if s were passed as an additional parameter to the function f, as it could escape this way and be mutated in the future -- string_init would give up ownership of s to pass it to f, and could not call unsafe_to_string safely.
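The unsafe variant just described might look like the sketch below. Here bad_string_init, leaked and f are hypothetical names introduced only for illustration; the point is that f can keep a reference to the buffer it receives.

```ocaml
(* Hypothetical side channel through which the buffer escapes. *)
let leaked : bytes option ref = ref None

(* UNSAFE variant: f also receives the buffer s, so it may capture it. *)
let bad_string_init len (f : bytes -> int -> char) : string =
  let s = Bytes.create len in
  for i = 0 to len - 1 do Bytes.set s i (f s i) done;
  (* Not justified: string_init no longer uniquely owns s at this point. *)
  Bytes.unsafe_to_string s

(* A callback that stashes the buffer before returning a character. *)
let f buf i =
  leaked := Some buf;
  Char.chr (Char.code 'a' + i)

let str = bad_string_init 3 f

(* The bytes stored in leaked alias str. Any later Bytes.set on them
   would mutate a value that the rest of the program treats as an
   immutable string. *)
```

No mutation is actually performed here; the sketch only shows how the alias comes to exist.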
We have provided the String.init, String.map and String.mapi functions to cover most cases of building new strings. You should prefer those over to_string or unsafe_to_string whenever applicable.
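As a brief illustration of these safe alternatives, the following sketch builds strings without touching Bytes at all. The bound names alphabet, upper and capitalized are illustrative only.

```ocaml
(* String.init: build a string from an index-to-character function. *)
let alphabet = String.init 5 (fun i -> Char.chr (Char.code 'a' + i))

(* String.map: apply a function to every character. *)
let upper = String.map Char.uppercase_ascii "hello"

(* String.mapi: like String.map, but the function also receives the index. *)
let capitalized =
  String.mapi (fun i c -> if i = 0 then Char.uppercase_ascii c else c) "hello"
```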
2. Temporarily giving ownership of a byte sequence to a function that expects a uniquely owned string and returns ownership back, so that we can mutate the sequence again after the call has ended.
let bytes_length (s : bytes) =
  String.length (Bytes.unsafe_to_string s)
In this use-case, we do not promise that s will never be mutated after the call to bytes_length s. The String.length function temporarily borrows unique ownership of the byte sequence (and sees it as a string), but returns this ownership back to the caller, which may assume that s is still a valid byte sequence after the call. Note that this is only correct because we know that String.length does not capture its argument -- otherwise the string could escape through a side channel such as a memoization combinator.
The caller may not mutate s while the string is borrowed (it has temporarily given up ownership). This affects concurrent programs, but also higher-order functions: if String.length returned a closure to be called later, s should not be mutated until this closure is fully applied and returns ownership.
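The borrowing pattern can be sketched with another string function that does not capture its argument. The helper bytes_has_char below is a hypothetical name used for illustration (the standard library already offers Bytes.contains for this purpose); it assumes, as is the case for String.contains, that the callee only reads the string and returns a plain boolean.

```ocaml
(* Borrow b as a string for the duration of the call only. *)
let bytes_has_char (b : bytes) (c : char) : bool =
  String.contains (Bytes.unsafe_to_string b) c

let buf = Bytes.of_string "abc"

(* First borrow: ownership returns to the caller when the call ends. *)
let before = bytes_has_char buf 'z'

(* Mutation is allowed again, since no borrowed string is still live. *)
let () = Bytes.set buf 0 'z'

(* Second borrow observes the updated contents. *)
let after = bytes_has_char buf 'z'
```

This relies on the result being a bool rather than a closure or a data structure that retains the borrowed string.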