package bap-std


Bitvector -- an integer with modular arithmetic.

Overview

A numeric value with a two's complement binary representation. It is well suited to representing addresses, offsets, and other arithmetic values.

Each value is characterized by a bitwidth and signedness. All arithmetic operations over values are done modulo their widths. It is an error to apply an arithmetic operation to values with different widths. The default implementations raise an exception; however, there is a family of modules that provide arithmetic operations lifted to the Or_error.t monad. It is suggested to use them if you know what kind of operands you are expecting.
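The modular, width-checked behaviour described above can be modelled in a few lines of plain OCaml. This is a minimal sketch that uses only the standard library, not the library's implementation: the bv record, the helper names, and the error message are hypothetical, and widths are assumed to be below 62 bits so that values fit in a native int.

```ocaml
(* A standalone model: a value together with its width in bits. *)
type bv = { value : int; width : int }

(* Keep only the low [width] bits; assumes 0 < width < 62. *)
let norm width v = v land ((1 lsl width) - 1)

let of_int ~width v = { value = norm width v; width }

(* Addition modulo 2^width; operands of different widths are rejected. *)
let add x y =
  if x.width <> y.width then Error "width mismatch"
  else Ok (of_int ~width:x.width (x.value + y.value))

let () =
  (* 250 + 10 wraps around modulo 2^8. *)
  assert (add (of_int ~width:8 250) (of_int ~width:8 10) = Ok (of_int ~width:8 4));
  (* Mixing an 8-bit and a 16-bit operand is an error. *)
  assert (add (of_int ~width:8 1) (of_int ~width:16 1) = Error "width mismatch")
```

Returning a result instead of raising mirrors the Or_error.t flavour of the interface; an exception-raising variant would simply fail in the mismatch branch.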

Clarification on signs

By default, all numbers represented with bitvectors are considered unsigned. This includes the ordering, e.g., of_int (-1) ~width:32 is greater than of_int 0 ~width:32. If you need to perform a signed operation, you can use the signed operator to create a signed word with the same value.

If any operand of a binary operation is signed, then the signed version of the operation is used, i.e., the other operand is upcast to the signed kind.

Remember to use explicit casts, whenever you really need a signed representation. Examples:

let x = of_int ~-6 ~width:8
let y = to_int_exn x          (* y = 250 *)
let z = to_int_exn (signed x) (* z = ~-6 *)
let zero = of_int 0 ~width:8
let p = x < zero              (* p = false *)
let q = signed x < zero       (* q = true *)
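The two readings of the same eight bits can be reproduced without the library. The following is a standalone sketch using only the OCaml standard library; unsigned_value and signed_value are hypothetical helper names, and the width is assumed to be below 62 bits.

```ocaml
(* Read the low [width] bits of [v] as an unsigned number. *)
let unsigned_value ~width v = v land ((1 lsl width) - 1)

(* Read the same bits as a two's complement signed number:
   if the top bit is set, subtract 2^width. *)
let signed_value ~width v =
  let u = unsigned_value ~width v in
  if u land (1 lsl (width - 1)) <> 0 then u - (1 lsl width) else u

let () =
  (* The bit pattern of -6 in 8 bits is 0xFA. *)
  assert (unsigned_value ~width:8 (-6) = 250);
  assert (signed_value ~width:8 (-6) = -6);
  (* Unsigned reading: 250 is not below zero; signed reading: -6 is. *)
  assert (not (unsigned_value ~width:8 (-6) < 0));
  assert (signed_value ~width:8 (-6) < 0)
```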

Clarification on size-morphism

Size-monomorphic operations (as opposed to size-polymorphic) expect operands of the same size. When applied to operands of different sizes they either raise exceptions or return an Error variant as the result. All arithmetic operations are size-monomorphic, and we provide interfaces that use either exceptions or Result.t to indicate the outcome.

The comparison operation is size-polymorphic by default and takes the size of the bitvector into account. Bitvectors with equal values but different sizes are unequal. The precise order matches the order of pairs, where the first constituent is the bitvector value and the second is its size. For example, the following sequence is in ascending order:

0x0:1, 0x0:32, 0x0:64, 0x1:1, 0x1:32, 0xD:4, 0xDEADBEEF:32
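The order of pairs can be checked with the standard polymorphic comparison, which orders tuples lexicographically. This is only a model of the documented order (each literal value:size becomes a (value, size) pair), not a call into the library, and it assumes a 64-bit OCaml so that 0xDEADBEEF fits in a native int.

```ocaml
(* The documented sequence, written as (value, size) pairs. *)
let ascending =
  [ (0x0, 1); (0x0, 32); (0x0, 64); (0x1, 1); (0x1, 32); (0xD, 4);
    (0xDEADBEEF, 32) ]

let () =
  (* compare on pairs looks at the value first and at the size second,
     so sorting the reversed list must give the sequence back. *)
  let sorted = List.sort compare (List.rev ascending) in
  assert (sorted = ascending);
  List.iter (fun (v, w) -> Printf.printf "0x%X:%d " v w) sorted;
  print_newline ()
```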

A size-monomorphic interface is exposed in the Mono submodule. So if you want a monomorphic map, just use the Mono.Map module. Note that the Mono submodule doesn't provide Table, since we cannot guarantee that all keys in a hash table have equal size. The order functions provided by the Mono module raise an exception when applied to bitvectors with different sizes.

In the default and Mono orders, if either of two values is signed (see Clarification on signs) then the values will be ordered as 2-complement signed integers.

The alternative orders are Signed_value_order, Unsigned_value_order, and Literal_order. They are briefly described below.

Signed_value_order is size-polymorphic: it ignores the sizes of bitvectors and orders them by value, e.g., the following bitvectors are ordered in the Signed_value_order order: FF:8; 0:1; 0F:8; FF:32, and 0:1 is equal to 0:32. See Clarification on signs for more details on the signedness of operations. Note that the size of a word still affects the order, since it defines the position of the most significant bit.

Unsigned_value_order ignores the sign and the size of words and compares them by the unsigned order of their values. The following numbers are ordered with the Unsigned_value_order order: 0:1, 1:32, 0F:8, FF:8, and FF:32 is equal to FF:8. Unsigned_value_order is faster than any previously described order and is useful when the size of the words should be ignored (or is known to be equal and therefore can be ignored).
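The two value orders can be modelled on (value, size) pairs in plain OCaml. This is a sketch under stated assumptions, not the library's code: signed_order and unsigned_order are hypothetical names, every word is read as signed in the signed model, and sizes are assumed to be below 62 bits.

```ocaml
(* Two's complement reading: the size fixes the position of the sign bit. *)
let signed_value (v, w) =
  if v land (1 lsl (w - 1)) <> 0 then v - (1 lsl w) else v

(* Compare by signed value; the size only matters through the sign bit. *)
let signed_order x y = compare (signed_value x) (signed_value y)

(* Compare by raw value; both the size and the sign are ignored. *)
let unsigned_order (x, _) (y, _) = compare x y

(* A list is ascending if sorting it leaves it unchanged. *)
let is_ascending cmp xs = List.stable_sort cmp xs = xs

let () =
  (* FF:8 reads as -1, so it comes first; FF:32 reads as 255. *)
  assert (is_ascending signed_order [ (0xFF, 8); (0x0, 1); (0x0F, 8); (0xFF, 32) ]);
  assert (signed_order (0x0, 1) (0x0, 32) = 0);
  assert (is_ascending unsigned_order [ (0x0, 1); (0x1, 32); (0x0F, 8); (0xFF, 8) ]);
  assert (unsigned_order (0xFF, 32) (0xFF, 8) = 0)
```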

Literal_order is the fastest order that takes into account all constituents of a bitvector: it treats a bitvector as a triple of its value, size, and sign and orders bitvectors lexicographically.

Clarification on string representation

As a part of the Identifiable interface, bitvector provides a pair of complementary functions, to_string and of_string, that provide facilities to store a bitvector as a human-readable string and to restore it from a string. The format of the representation is the following (in EBNF):

repr  = [sign], [base], digit, {digit}, ":", size, [kind];
sign  = "+" | "-";
base  = "0x" | "0b" | "0o";
size  = dec, {dec};
digit = dec | oct | hex;
dec   = ?decimal digit?;
oct   = ?octal digit?;
hex   = ?hexadecimal digit?;
kind  = "u" | "s";

Examples: 0x5D:32s, 0b0101:16u, 5:64, +5:8, +0x5D:16.

If the base is omitted, base 10 is assumed. If the kind is omitted, the unsigned kind is assumed. The output format is always a hex representation with a full prefix.
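To make the grammar concrete, here is a lenient standalone sketch that splits a literal into its syntactic parts. It is not the library's of_string: the repr record and the parse function are hypothetical, only the standard library is used, and the digits are handed to int_of_string_opt, which accepts slightly more than the grammar (for example underscores).

```ocaml
(* The syntactic parts of a literal; no modular reduction is applied. *)
type repr = { negative : bool; magnitude : int; size : int; signed : bool }

let parse s =
  match String.index_opt s ':' with
  | None -> None
  | Some i ->
    let num = String.sub s 0 i in
    let rest = String.sub s (i + 1) (String.length s - i - 1) in
    let n = String.length rest in
    (* An optional trailing kind; unsigned when omitted. *)
    let size_str, signed =
      if n > 0 && (rest.[n - 1] = 's' || rest.[n - 1] = 'u')
      then (String.sub rest 0 (n - 1), rest.[n - 1] = 's')
      else (rest, false) in
    (* An optional leading sign. *)
    let negative, digits =
      if num <> "" && (num.[0] = '-' || num.[0] = '+')
      then (num.[0] = '-', String.sub num 1 (String.length num - 1))
      else (false, num) in
    (* int_of_string_opt understands the 0x, 0b and 0o prefixes;
       base 10 is used when there is no prefix. *)
    (match int_of_string_opt digits, int_of_string_opt size_str with
     | Some magnitude, Some size when magnitude >= 0 && size > 0 ->
       Some { negative; magnitude; size; signed }
     | _ -> None)

let () =
  assert (parse "0x5D:32s"
          = Some { negative = false; magnitude = 0x5D; size = 32; signed = true });
  assert (parse "0b0101:16u"
          = Some { negative = false; magnitude = 5; size = 16; signed = false });
  assert (parse "5" = None)
```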

type t = word

word is an abbreviation for Bitvector.t

Common Interfaces

A bitvector is a value, first of all, so it supports the common value interface: it can be stored, compared, used as a key in a dictionary, etc. Moreover, being a number, it can be compared with zero and used with the common set of integer operations.

include Regular.Std.Regular.S with type t := t
val bin_size_t : t Bin_prot.Size.sizer
val bin_write_t : t Bin_prot.Write.writer
val bin_read_t : t Bin_prot.Read.reader
val __bin_read_t__ : (int -> t) Bin_prot.Read.reader
val bin_shape_t : Bin_prot.Shape.t
val bin_writer_t : t Bin_prot.Type_class.writer
val bin_reader_t : t Bin_prot.Type_class.reader
val bin_t : t Bin_prot.Type_class.t
val t_of_sexp : Sexplib0__.Sexp.t -> t
val sexp_of_t : t -> Sexplib0__.Sexp.t
val to_string : t -> string
val str : unit -> t -> string
val pps : unit -> t -> string
val ppo : Core_kernel.Out_channel.t -> t -> unit
val pp_seq : Stdlib.Format.formatter -> t Core_kernel.Sequence.t -> unit
val (>=) : t -> t -> bool
val (<=) : t -> t -> bool
val (=) : t -> t -> bool
val (>) : t -> t -> bool
val (<) : t -> t -> bool
val (<>) : t -> t -> bool
val equal : t -> t -> bool
val compare : t -> t -> int
val min : t -> t -> t
val max : t -> t -> t
val ascending : t -> t -> int
val descending : t -> t -> int
val between : t -> low:t -> high:t -> bool
val clamp_exn : t -> min:t -> max:t -> t
val clamp : t -> min:t -> max:t -> t Base__.Or_error.t
type comparator_witness
val validate_lbound : min:t Base__.Maybe_bound.t -> t Base__.Validate.check
val validate_ubound : max:t Base__.Maybe_bound.t -> t Base__.Validate.check
val validate_bound : min:t Base__.Maybe_bound.t -> max:t Base__.Maybe_bound.t -> t Base__.Validate.check
module Replace_polymorphic_compare : sig ... end
val comparator : (t, comparator_witness) Core_kernel__Comparator.comparator
module Map : sig ... end
module Set : sig ... end
val hash_fold_t : Ppx_hash_lib.Std.Hash.state -> t -> Ppx_hash_lib.Std.Hash.state
val hash : t -> Ppx_hash_lib.Std.Hash.hash_value
val hashable : t Core_kernel__.Hashtbl.Hashable.t
module Table : sig ... end
module Hash_set : sig ... end
module Hash_queue : sig ... end
type info = string * [ `Ver of string ] * string option
val version : string
val size_in_bytes : ?ver:string -> ?fmt:string -> t -> int
val of_bytes : ?ver:string -> ?fmt:string -> Regular.Std.bytes -> t
val to_bytes : ?ver:string -> ?fmt:string -> t -> Regular.Std.bytes
val blit_to_bytes : ?ver:string -> ?fmt:string -> Regular.Std.bytes -> t -> int -> unit
val of_bigstring : ?ver:string -> ?fmt:string -> Core_kernel.bigstring -> t
val to_bigstring : ?ver:string -> ?fmt:string -> t -> Core_kernel.bigstring
val blit_to_bigstring : ?ver:string -> ?fmt:string -> Core_kernel.bigstring -> t -> int -> unit
module Io : sig ... end
module Cache : sig ... end
val add_reader : ?desc:string -> ver:string -> string -> t Regular.Std.reader -> unit
val add_writer : ?desc:string -> ver:string -> string -> t Regular.Std.writer -> unit
val available_readers : unit -> info list
val default_reader : unit -> info
val set_default_reader : ?ver:string -> string -> unit
val with_reader : ?ver:string -> string -> (unit -> 'a) -> 'a
val available_writers : unit -> info list
val default_writer : unit -> info
val set_default_writer : ?ver:string -> string -> unit
val with_writer : ?ver:string -> string -> (unit -> 'a) -> 'a
val default_printer : unit -> info option
val set_default_printer : ?ver:string -> string -> unit
val with_printer : ?ver:string -> string -> (unit -> 'a) -> 'a
val find_reader : ?ver:string -> string -> t Regular.Std.reader option
val find_writer : ?ver:string -> string -> t Regular.Std.writer option

Bitvector implements a common set of operations that are expected from integral values.

include Integer.S with type t := t
include Integer.Base with type t := t
val abs : t -> t

abs x is the absolute value of x

val neg : t -> t

neg x = -x

val add : t -> t -> t

add x y is x + y

val sub : t -> t -> t

sub x y is x - y

val mul : t -> t -> t

mul x y is x * y

val div : t -> t -> t

div x y is x / y

val modulo : t -> t -> t

modulo x y is x mod y

val lnot : t -> t

lnot x is the bitwise negation of x (one's complement)

val logand : t -> t -> t

logand x y is a conjunction of x and y

val logor : t -> t -> t

logor x y is a disjunction of x and y

val logxor : t -> t -> t

logxor x y is exclusive or between x and y

val lshift : t -> t -> t

lshift x y shifts x left by y bits

val rshift : t -> t -> t

rshift x y shifts x right by y bits

val arshift : t -> t -> t

arshift x y shifts x right by y bits, filling with the sign bit.

A common set of infix operators

val (~-) : t -> t

~-x = neg x

val (+) : t -> t -> t

x + y = add x y

val (-) : t -> t -> t

x - y = sub x y

val (*) : t -> t -> t

x * y = mul x y

val (/) : t -> t -> t

x / y = div x y

val (mod) : t -> t -> t

x mod y = modulo x y

val (land) : t -> t -> t

x land y = logand x y

val (lor) : t -> t -> t

x lor y = logor x y

val (lxor) : t -> t -> t

x lxor y = logxor x y

val (lsl) : t -> t -> t

x lsl y = lshift x y

val (lsr) : t -> t -> t

x lsr y = rshift x y

val (asr) : t -> t -> t

x asr y = arshift x y

module Mono : Core_kernel.Comparable with type t := t

The comparable interface with size-monomorphic comparison.

module Signed_value_order : sig ... end

Compare by value, ignore size, but take into account the sign.

module Unsigned_value_order : sig ... end

Compare by value, ignore both the size and the sign.

module Literal_order : sig ... end

The lexicographical order of (value,size,sign) triples.

type endian =
  | LittleEndian  (* least significant byte comes first *)
  | BigEndian     (* most significant byte comes first *)

Specifies the order of bytes in a word.

val bin_shape_endian : Core_kernel.Bin_prot.Shape.t
val bin_size_endian : endian Core_kernel.Bin_prot.Size.sizer
val bin_write_endian : endian Core_kernel.Bin_prot.Write.writer
val bin_writer_endian : endian Core_kernel.Bin_prot.Type_class.writer
val bin_read_endian : endian Core_kernel.Bin_prot.Read.reader
val __bin_read_endian__ : (int -> endian) Core_kernel.Bin_prot.Read.reader
val bin_reader_endian : endian Core_kernel.Bin_prot.Type_class.reader
val bin_endian : endian Core_kernel.Bin_prot.Type_class.t
val compare_endian : endian -> endian -> int
val sexp_of_endian : endian -> Ppx_sexp_conv_lib.Sexp.t
val endian_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> endian

Constructors

val create : Bitvec.t -> int -> t

create v w creates a word from bitvector v of width w.

val code_addr : Bap_core_theory.Theory.Target.t -> Bitvec.t -> t

code_addr t x uses the target's code address size to create a word.

Same as create x (Theory.Target.code_addr_size t).

  • since 2.2.0
val data_addr : Bap_core_theory.Theory.Target.t -> Bitvec.t -> t

data_addr t x uses the target's data address size to create a word.

Same as create x (Theory.Target.data_addr_size t).

  • since 2.2.0
val data_word : Bap_core_theory.Theory.Target.t -> Bitvec.t -> t

data_word t x uses target's word size to create a word.

Same as create x (Theory.Target.bits t).

  • since 2.2.0
val of_string : string -> t

of_string s parses a bitvector from a string representation defined in section Clarification on string representation.

val of_bool : bool -> t

of_bool x is a bitvector with length 1 and value b0 if x is false and b1 otherwise.

val of_int : width:int -> int -> t

of_int ~width n creates a bitvector of the specified bit-width with the value equal to n. Bits of n that do not fit into width are ignored.

val of_int32 : ?width:int -> int32 -> t

of_int32 ?width n creates a bitvector of the specified bit-width with the value equal to n. Bits of n that do not fit into width are ignored. Parameter width defaults to 32.

val of_int64 : ?width:int -> int64 -> t

of_int64 ?width n creates a bitvector of the specified bit-width with the value equal to n. Bits of n that do not fit into width are ignored. Parameter width defaults to 64.

Some predefined constant constructors

val b0 : t

b0 = of_bool false

val b1 : t

b1 = of_bool true

Helpful shortcuts

val one : int -> t

one width is the number one with the specified width; it is a shortcut for of_int 1 ~width

val zero : int -> t

zero width is zero with the specified width; it is a shortcut for of_int 0 ~width

val ones : int -> t

ones width is a number with a specified width, and all bits set to 1. It is a shortcut for lnot (zero width)

val of_binary : ?width:int -> endian -> string -> t

of_binary ?width endian num creates a bitvector from a string interpreted as a sequence of bytes in a specified order.

The result is always positive and unsigned. The num argument is not shared. width defaults to the length of num in bits, i.e. 8 * String.length num.
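How a byte string turns into a number under each byte order can be sketched without the library. The function below is a standalone model, not the real of_binary: it returns a native int instead of a bitvector, redefines a local endian type, and assumes the string is at most 7 bytes long so that the result fits in a 63-bit int.

```ocaml
type endian = LittleEndian | BigEndian

(* Fold the bytes of [s] into one integer, most significant byte first. *)
let of_binary endian s =
  let bytes = List.init (String.length s) (fun i -> Char.code s.[i]) in
  let msb_first =
    match endian with
    | BigEndian -> bytes
    | LittleEndian -> List.rev bytes in
  List.fold_left (fun acc b -> (acc lsl 8) lor b) 0 msb_first

let () =
  (* The same number, stored in the two byte orders. *)
  assert (of_binary BigEndian "\xDE\xAD\xBE\xEF" = 0xDEADBEEF);
  assert (of_binary LittleEndian "\xEF\xBE\xAD\xDE" = 0xDEADBEEF);
  (* The default width is the length of the input in bits. *)
  assert (8 * String.length "\xDE\xAD\xBE\xEF" = 32)
```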

Conversions to OCaml built-in integer types

val to_bitvec : t -> Bitvec.t

to_bitvec x returns a Bitvec representation of x

val to_int : t -> int Core_kernel.Or_error.t

to_int x projects x into OCaml int.

val to_int32 : t -> int32 Core_kernel.Or_error.t

to_int32 x projects x into int32.

val to_int64 : t -> int64 Core_kernel.Or_error.t

to_int64 x projects x into int64.

val to_int_exn : t -> int

to_int_exn x projects x into OCaml int.

  • since 1.3
val to_int32_exn : t -> int32

to_int32_exn x projects x into int32.

  • since 1.3
val to_int64_exn : t -> int64

to_int64_exn x projects x into int64.

  • since 1.3
val pp : Stdlib.Format.formatter -> t -> unit

printf "%a" pp x prints x into a formatter. This is the default printer, controlled by set_default_printer. Multiple formats are available; see available_writers for the actual list of formats and their descriptions. Out of the box it defaults to pp_hex_full. Note that the printf function in the examples refers to Format.printf, so it is assumed that the Format module is open in scope.

val pp_hex : Stdlib.Format.formatter -> t -> unit

printf "%a" pp_hex x prints x in the hexadecimal format omitting suffixes, and the prefix if it is not necessary. Example,

# printf "%a\n" pp_hex (Word.of_int32 0xDEADBEEFl);;
0xDEADBEEF
# printf "%a\n" pp_hex (Word.of_int32 0x1l);;
1
val pp_dec : Stdlib.Format.formatter -> t -> unit

printf "%a" pp_dec x prints x in the decimal format omitting suffixes and prefixes. Example,

# printf "%a\n" pp_dec (Word.of_int32 0xDEADBEEFl);;
3735928559
# printf "%a\n" pp_dec (Word.of_int32 0x1l);;
1
  • since 1.3
val pp_oct : Stdlib.Format.formatter -> t -> unit

printf "%a" pp_oct x prints x in the octal format omitting suffixes, and the prefix if it is not necessary. Example,

# printf "%a\n" pp_oct (Word.of_int32 0xDEADBEEFl);;
0o33653337357
# printf "%a\n" pp_oct (Word.of_int32 0x1l);;
1
  • since 1.3
val pp_bin : Stdlib.Format.formatter -> t -> unit

printf "%a" pp_bin x prints x in the binary (0 and 1) format omitting suffixes, and the prefix if it is not necessary. Example,

# printf "%a\n" pp_bin (Word.of_int32 0xDEADBEEFl);;
0b11011110101011011011111011101111
# printf "%a\n" pp_bin (Word.of_int32 0x1l);;
1
  • since 1.3
val pp_hex_full : Stdlib.Format.formatter -> t -> unit

printf "%a" pp_hex_full x prints x in the hexadecimal format with suffixes, and the prefix if it is necessary. Example,

# printf "%a\n" pp_hex_full (Word.of_int32 0xDEADBEEFl);;
0xDEADBEEF:32u
# printf "%a\n" pp_hex_full (Word.of_int32 0x1l);;
1:32u
val pp_dec_full : Stdlib.Format.formatter -> t -> unit

printf "%a" pp_dec_full x prints x in the decimal format with suffixes and prefixes. Example,

# printf "%a\n" pp_dec_full (Word.of_int32 0xDEADBEEFl);;
3735928559:32u
# printf "%a\n" pp_dec_full (Word.of_int32 0x1l);;
1:32u
  • since 1.3
val pp_oct_full : Stdlib.Format.formatter -> t -> unit

printf "%a" pp_oct_full x prints x in the octal format with suffixes, and the prefix if it is necessary. Example,

# printf "%a\n" pp_oct_full (Word.of_int32 0xDEADBEEFl);;
0o33653337357:32u
# printf "%a\n" pp_oct_full (Word.of_int32 0x1l);;
1:32u
val pp_bin_full : Stdlib.Format.formatter -> t -> unit

printf "%a" pp_bin_full x prints x in the binary (0 and 1) format with suffixes, and the prefix if it is necessary. Example,

# printf "%a\n" pp_bin_full (Word.of_int32 0xDEADBEEFl);;
0b11011110101011011011111011101111:32u
# printf "%a\n" pp_bin_full (Word.of_int32 0x1l);;
1:32u
val pp_generic : ?case:[ `upper | `lower ] -> ?prefix:[ `auto | `base | `none | `this of string ] -> ?suffix:[ `none | `full | `size ] -> ?format:[ `hex | `dec | `oct | `bin ] -> Stdlib.Format.formatter -> t -> unit

pp_generic ?case ?prefix ?suffix ?format ppf x - a printer to suit all tastes.

Note: this is a generic printer factory that should be used if none of the nine preinstantiated printers suits you.

  • parameter prefix

    defines whether or not a number is prefixed:

    • `auto (default) - a prefix that corresponds to the chosen format is printed if it is necessary to disambiguate a number from a decimal representation;
    • `base - a corresponding prefix is always printed;
    • `none - the prefix is never printed;
    • `this p - the user specified prefix p is always printed;
  • parameter suffix

    defines how the suffix should be printed:

    • `none (default) - the suffix is never printed;
    • `full - a full suffix that denotes size and signedness is printed, e.g., 0xDE:32s is a signed integer modulo 32.
    • `size - only the modulo is printed, e.g., 0xDE:32s is printed as 0xDE:32
  • parameter format

    defines the textual representation format:

    • `hex (default) - hexadecimal
    • `dec - decimal
    • `oct - octal
    • `bin - binary (0 and 1).
  • parameter case

    defines the case of hexadecimal letters

val string_of_value : ?hex:bool -> t -> string

string_of_value ?hex x returns a textual representation of the x value, i.e., ignores size and signedness. If hex is true (default), then it is in the hexadecimal representation, otherwise the decimal representation is used. The returned value is not prefixed. No leading zeros are printed. If a value is signed and negative, then a leading negative sign is printed. Hexadecimal letter literals are printed in a lowercase format.

val signed : t -> t

signed t casts t to a signed type, so that any operations applied on t will be signed.

val unsigned : t -> t

unsigned t casts t to an unsigned type, so that any operations applied to it will interpret t as an unsigned word.

  • since 1.3
val is_zero : t -> bool

is_zero bv is true iff all bits are set to zero.

val is_one : t -> bool

is_one bv is true if the least significant bit is equal to one

val bitwidth : t -> int

bitwidth bv returns the bit-width, i.e., the number of bits

val extract : ?hi:int -> ?lo:int -> t -> t Core_kernel.Or_error.t

extract bv ~hi ~lo extracts a subvector from bv, starting from bit hi and ending with bit lo. Bits are enumerated from right to left (from least significant to most significant), starting from zero. hi may be greater than the size.

hi defaults to bitwidth bv - 1; lo defaults to 0.

Example:

extract (of_int 17 ~width:8) ~hi:4 ~lo:3 results in a two-bit vector consisting of the fourth and third bits, i.e., equal to the number 2.

lo and hi must be non-negative numbers. lo must be less than bitwidth bv, and hi >= lo.
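The arithmetic behind the example can be checked with shifts and masks on a native int. This is a standalone sketch rather than the library's extract: it returns the extracted value paired with the resulting width, assumes a non-negative input and a range narrower than 62 bits, and uses a made-up error message.

```ocaml
(* Extract bits hi..lo (inclusive); bit 0 is the least significant bit. *)
let extract ~hi ~lo v =
  if lo < 0 || hi < lo then Error "invalid bit range"
  else
    let width = hi - lo + 1 in
    (* Drop the bits below [lo], then keep [width] bits. *)
    Ok ((v lsr lo) land ((1 lsl width) - 1), width)

let () =
  (* 17 = 0b10001: bits 4 and 3 are 1 and 0, i.e. the two-bit value 2. *)
  assert (extract ~hi:4 ~lo:3 17 = Ok (2, 2));
  (* hi must not be below lo. *)
  assert (extract ~hi:1 ~lo:2 17 = Error "invalid bit range")
```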

val extract_exn : ?hi:int -> ?lo:int -> t -> t

extract_exn bv ~hi ~lo is the same as extract, but will raise an exception on error.

val concat : t -> t -> t

concat b1 b2 concatenates two bitvectors

val (@.) : t -> t -> t

b1 @. b2 is concat b1 b2

val succ : t -> t

succ n returns next value after n. It is not guaranteed that signed (succ n) > signed n

val pred : t -> t

pred n returns a value preceding n.

val nsucc : t -> int -> t

nsucc m n is Fn.apply_n_times ~n succ m, but more efficient.

val npred : t -> int -> t

npred m n is Fn.apply_n_times ~n pred m, but more efficient.

val (++) : t -> int -> t

a ++ n is nsucc a n

val (--) : t -> int -> t

a -- n is npred a n

val gcd : t -> t -> t Core_kernel.Or_error.t

gcd x y is the greatest common divisor of x and y in the integers. Note that this is not always the greatest common divisor in the bitvectors of fixed length. For example, in the 32-bit unsigned integers, 2 = 2 + 2^32 = 2(1 + 2^31). Thus, 1 + 2^31 is a divisor of 2, even though gcd 2 2 = 2. Two properties that still hold are: 1. Both x and y are multiples of gcd x y, and 2. gcd x y <= min (abs x) (abs y)
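The 32-bit example can be verified with native integers. The sketch below is standalone and assumes a 64-bit OCaml; mul32 and the local gcd are illustrative helpers, not the library's functions.

```ocaml
(* Multiplication modulo 2^32 on small non-negative operands. *)
let modulus = 1 lsl 32
let mul32 x y = (x * y) mod modulus

(* Euclid's algorithm over the ordinary integers. *)
let rec gcd a b = if b = 0 then a else gcd b (a mod b)

let () =
  let d = 1 + (1 lsl 31) in
  (* Over the integers, gcd 2 2 = 2. *)
  assert (gcd 2 2 = 2);
  (* Modulo 2^32, 2 * (1 + 2^31) = 2 + 2^32 = 2, so d divides 2 ... *)
  assert (mul32 2 d = 2);
  (* ... although d is far larger than the integer gcd. *)
  assert (d > gcd 2 2)
```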

val lcm : t -> t -> t Core_kernel.Or_error.t

lcm x y is the least common multiple of x and y in the integers. Note that, like gcd x y, this is not always the least common multiple of x and y in the fixed-length bitvectors. See the gcd documentation for an example. The result of this function will always be some common multiple of the inputs, even in the fixed-width bitvectors.

val gcdext : t -> t -> (t * t * t) Core_kernel.Or_error.t

gcdext x y returns (g, s, t) where g = gcd x y and g = s*x + t*y. See the documentation for gcd x y for why this function is tricky to use.

val gcd_exn : t -> t -> t

gcd_exn x y is the same as gcd, but will raise an exception on error.

val lcm_exn : t -> t -> t

lcm_exn x y is the same as lcm, but will raise an exception on error.

val gcdext_exn : t -> t -> t * t * t

gcdext_exn x y is the same as gcdext, but will raise an exception on error.

Iteration over bitvector components

val enum_bytes : t -> endian -> t seq

enum_bytes x order returns a sequence of bytes of x in a specified order. Each byte is represented as a bitvector itself.

val enum_chars : t -> endian -> char seq

enum_chars x order returns bytes of x in a specified order, with bytes represented by the char type

val enum_bits : t -> endian -> bool seq

enum_bits x order returns bits of x in a specified order. order defines only the ordering of words in a bitvector; bits will always be in MSB-first order. The length of the sequence is always a multiple of 8.
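The effect of the order argument on bytes and bits can be modelled on native integers. This is a standalone sketch, not the library's code: it works on lists instead of sequences, takes the width explicitly, assumes the width is a multiple of 8, and reads the reordered units as bytes.

```ocaml
type endian = LittleEndian | BigEndian

(* Bytes of the [width]-bit value [v] in the requested order. *)
let enum_bytes ~width v endian =
  let lsb_first = List.init (width / 8) (fun i -> (v lsr (8 * i)) land 0xFF) in
  match endian with
  | LittleEndian -> lsb_first
  | BigEndian -> List.rev lsb_first

(* Bits follow the byte order, but inside each byte they are MSB first. *)
let enum_bits ~width v endian =
  enum_bytes ~width v endian
  |> List.map (fun b -> List.init 8 (fun i -> (b lsr (7 - i)) land 1 = 1))
  |> List.concat

let () =
  assert (enum_bytes ~width:32 0xDEADBEEF LittleEndian = [ 0xEF; 0xBE; 0xAD; 0xDE ]);
  assert (enum_bytes ~width:32 0xDEADBEEF BigEndian = [ 0xDE; 0xAD; 0xBE; 0xEF ]);
  (* 0x80 has only its most significant bit set, and it is listed first. *)
  assert (enum_bits ~width:8 0x80 BigEndian
          = [ true; false; false; false; false; false; false; false ])
```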

Comparison with zero

Note that we do not include the With_zero interface, since it refers to the `Sign` module, which is available only in core_kernel >= 113.33.00.

val validate_positive : t Core_kernel.Validate.check

validate_positive validates that a value is positive.

val validate_non_negative : t Core_kernel.Validate.check

validate_non_negative validates that a value is non-negative.

val validate_negative : t Core_kernel.Validate.check

validate_negative validates that a value is negative.

val validate_non_positive : t Core_kernel.Validate.check

validate_non_positive validates that a value is not positive.

val is_positive : t -> bool

is_positive x is true if x is greater than zero. Always true if x is unsigned.

val is_non_negative : t -> bool

is_non_negative x is true if x is greater than or equal to zero. Tautology if x is unsigned.

val is_negative : t -> bool

is_negative x is true if x is strictly less than zero. It is a contradiction if x is not signed.

val is_non_positive : t -> bool

is_non_positive x is true if x is less than zero. It is a contradiction if x is not signed.

module Int_err : sig ... end
module Int_exn : Integer.S with type t = t

Arithmetic that raises exceptions.

module Unsafe : Integer.S with type t = t

Arithmetic operations that do not check the widths.

module Stable : sig ... end

Stable marshaling interface.

module Trie : sig ... end

Prefix trees for bitvectors.
