package irmin


Irmin public API.

Irmin is a library to design and use persistent stores with built-in snapshot, branching and reverting mechanisms. Irmin uses concepts similar to Git but it exposes them as a high level library instead of a complex command-line frontend. It features a bidirectional Git backend, fully-compatible with the usual Git tools and workflows.

Irmin is designed to use a large variety of backends. It is written in pure OCaml and does not depend on external C stubs; it is thus very portable and aims to run everywhere, from Linux to Xen unikernels.

Consult the basics and Examples of use for a quick start. See also the documentation for the unix backends.


val version : string

The version of the library.

Preliminaries

module Hum : sig ... end

Serializable data with reversible human-readable representations.

module Task : sig ... end

Tasks are used to keep track of the origin of reads and writes in the store. Every high-level operation is expected to have its own task which is passed to every low-level call.

module Merge : sig ... end

Merge provides functions to build custom 3-way merge operators for various user-defined contents.
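The idea of a 3-way merge can be sketched without Irmin at all. The snippet below is only an illustration using the standard library: the `merge_result` type and the `merge_counter` and `merge_default` functions are hypothetical names, not part of `Irmin.Merge`. Each function receives the common ancestor `old` and the two diverging values.

```ocaml
(* Outcome of a merge: a merged value or a conflict message.
   Hypothetical type, not Irmin's own merge result. *)
type 'a merge_result = Merged of 'a | Conflict of string

(* Counters: apply the increments of both branches to the ancestor. *)
let merge_counter ~old a b = Merged (a + b - old)

(* Opaque values: succeed only when at most one side changed. *)
let merge_default ~old a b =
  if a = b then Merged a
  else if a = old then Merged b
  else if b = old then Merged a
  else Conflict "both branches changed the value"
```

A counter that started at 10 and reached 12 on one branch and 15 on the other merges to 17, whereas two different edits of an opaque string yield a conflict.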

Stores

type task = Task.t

The type for user-defined tasks. See Task.

type config

The type for backend-specific configuration values.

Every backend has different configuration options, which are kept abstract to the user.

type 'a diff = [
  | `Updated of 'a * 'a
  | `Removed of 'a
  | `Added of 'a
]

The type for representing differences between values.
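As a minimal sketch of how such a value might be consumed, the snippet below redeclares the type locally so that it compiles without Irmin. The `describe` helper is hypothetical, and the reading of `Updated` as a (previous, new) pair is an assumption.

```ocaml
(* Local copy of the diff type, so the sketch is self-contained. *)
type 'a diff = [
  | `Updated of 'a * 'a
  | `Removed of 'a
  | `Added of 'a
]

(* [describe show d] renders a diff, using [show] to print values.
   Assumes [`Updated (before, after)]; hypothetical helper. *)
let describe (show : 'a -> string) (d : 'a diff) : string =
  match d with
  | `Updated (before, after) ->
    Printf.sprintf "updated: %s -> %s" (show before) (show after)
  | `Removed v -> Printf.sprintf "removed: %s" (show v)
  | `Added v -> Printf.sprintf "added: %s" (show v)
```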

An Irmin store is automatically built from a number of lower-level stores, implementing fewer operations, such as append-only and read-write stores. These low-level stores are provided by various backends.
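The difference between the two kinds of low-level stores can be sketched with in-memory hash tables. This is an illustration only: `AO_sketch` and `RW_sketch` are hypothetical modules, they are synchronous rather than Lwt-based, and MD5 from the standard `Digest` module stands in for a real hash implementation.

```ocaml
(* Append-only store: the key is derived from the value, so an
   existing entry is never overwritten. *)
module AO_sketch = struct
  type t = (string, string) Hashtbl.t
  let create () : t = Hashtbl.create 16
  let add (t : t) (value : string) : string =
    let key = Digest.to_hex (Digest.string value) in
    if not (Hashtbl.mem t key) then Hashtbl.add t key value;
    key
  let read (t : t) key = Hashtbl.find_opt t key
end

(* Read-write store: keys are chosen by the caller and may be
   updated or removed later. *)
module RW_sketch = struct
  type t = (string, string) Hashtbl.t
  let create () : t = Hashtbl.create 16
  let update (t : t) key value = Hashtbl.replace t key value
  let read (t : t) key = Hashtbl.find_opt t key
  let remove (t : t) key = Hashtbl.remove t key
end
```

In this sketch, a branch name stored in the read-write table could point to a key returned by the append-only table, which is roughly how mutable references sit on top of immutable objects.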

module type RO = sig ... end

Read-only stores.

module type AO = sig ... end

Append-only store.

Immutable Link store.

module type RW = sig ... end

Read-write stores.

module type HRW = sig ... end

Hierarchical read-write stores.

module type RRW = sig ... end

Reactive read-write stores.

module type BC = sig ... end

Branch-consistent stores.

User-Defined Contents

module Path : sig ... end

Store paths.

module Hash : sig ... end

Hashing functions.

module Contents : sig ... end

Contents specifies how user-defined contents need to be serializable and mergeable.

module Ref : sig ... end

User-defined references. A reference store associates a name (branch ID) with its head commit in an Irmin store.

High-level Stores

An Irmin store is a branch-consistent store where keys are lists of steps.

An example is a Git repository where keys are filenames, i.e. lists of '/'-separated strings. More complex examples are structured values, where steps might contain first-class field accessors and array offsets.
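For the filename case, the mapping between a '/'-separated string and a list of steps can be sketched with the standard library alone. The two helpers below are hypothetical and are not the functions exposed by `Irmin.Path`.

```ocaml
(* Split a '/'-separated filename into steps, dropping the empty
   segments produced by leading, trailing or doubled slashes. *)
let steps_of_filename (s : string) : string list =
  List.filter (fun step -> step <> "") (String.split_on_char '/' s)

(* Join a list of steps back into a '/'-separated filename. *)
let filename_of_steps (steps : string list) : string =
  String.concat "/" steps
```

With these helpers, the filename `local/debug` corresponds to the key `["local"; "debug"]`.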

Irmin provides the following features:

  • Support for fast clones, branches and merges, in a fashion very similar to Git.
  • Efficient staging areas for fast, transient, in-memory operations.
  • Fast synchronization primitives between remote stores, using native backend protocols (such as the Git protocol) when available.

module Private : sig ... end

Private defines functions only useful for creating new backends. If you are just using the library (and not developing a new backend), you should not use this module.

module type S = sig ... end

Signature for Irmin stores.

module type S_MAKER = functor (C : Contents.S) -> functor (R : Ref.S) -> functor (H : Hash.S) -> S with type key = C.Path.t and module Key = C.Path and type value = C.t and type branch_id = R.t and type commit_id = H.t

S_MAKER is the signature exposed by any backend providing S implementations. C is the implementation of user-defined contents, R is the implementation of store references and H is the implementation of store heads. It does not use any native synchronization primitives.

Synchronization

type remote

The type for remote stores.

val remote_uri : string -> remote

remote_uri s is the remote store located at the URI s. It uses the optimized native synchronization protocol when available for the given backend.

Examples

These examples are in the examples directory of the distribution.

Synchronization

A simple synchronization example, using the Git backend and the Sync helpers. The code clones a fresh repository if the repository does not exist locally; otherwise it performs a fetch, in which case only the missing contents are downloaded.

open Lwt
open Irmin_unix

module S = Irmin_git.FS(Irmin.Contents.String)(Irmin.Ref.String)(Irmin.Hash.SHA1)
module Sync = Irmin.Sync(S)
let config = Irmin_git.config ~root:"/tmp/test" ()

let upstream =
  if Array.length Sys.argv = 2 then (Irmin.remote_uri Sys.argv.(1))
  else (Printf.eprintf "Usage: sync [uri]\n%!"; exit 1)

let test () =
  S.Repo.create config >>= S.master task
  >>= fun t  -> Sync.pull_exn (t "Syncing with upstream store") upstream `Update
  >>= fun () -> S.read_exn (t "get the README") ["README.md"]
  >>= fun r  -> Printf.printf "%s\n%!" r; return_unit

let () =
  Lwt_main.run (test ())

Mergeable logs

We will demonstrate the use of custom merge operators by defining mergeable debug log files. We first define a log entry as a pair of a timestamp and a message, using the combinator exposed by mirage-tc:

module Entry = struct
  include Tc.Pair (Tc.Int)(Tc.String)
  let compare (x, _) (y, _) = Pervasives.compare x y
  let time = ref 0
  let create message = incr time; !time, message
end

A log file is a list of entries (one per line), in decreasing order of timestamps. The 3-way merge operator for log files concatenates and sorts the new entries and prepends them to the common ancestor's ones.

module Log: Irmin.Contents.S with type t = Entry.t list = struct
  module Path = Irmin.Path.String_list
  module S = Tc.List(Entry)
  include S

  (* Get the timestamp of the latest entry. *)
  let timestamp = function
    | [] -> 0
    | (timestamp, _ ) :: _ -> timestamp

  (* Compute the entries newer than the given timestamp. *)
  let newer_than timestamp entries =
    let rec aux acc = function
      | [] -> List.rev acc
      | (h, _) :: _ when h <= timestamp -> List.rev acc
      | h::t -> aux (h::acc) t
    in
    aux [] entries

  let merge_log _path ~old t1 t2 =
    let open Irmin.Merge.OP in
    old () >>| fun old ->
    let old = match old with None -> [] | Some o -> o in
    let ts = timestamp old in
    let t1 = newer_than ts t1 in
    let t2 = newer_than ts t2 in
    let t3 = List.sort Entry.compare (List.rev_append t1 t2) in
    ok (List.rev_append t3 old)

  let merge path = Irmin.Merge.option (module S) (merge_log path)

end

Note: The serialisation primitives provided by mirage-tc are not very efficient in this case as they parse the file every time. For real usage, you would write buffered versions of Log.read and Log.write.
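To see what the merge computes on concrete data, the core of the log merge can be restated over plain lists, without the Irmin.Merge monad or mirage-tc. This is a simplified sketch: here `old` is passed directly as a list rather than as a lazily computed option.

```ocaml
(* Timestamp of the latest entry; logs are kept newest first. *)
let timestamp = function
  | [] -> 0
  | (ts, _) :: _ -> ts

(* Entries strictly newer than [ts], still newest first. *)
let newer_than ts entries =
  let rec aux acc = function
    | [] -> List.rev acc
    | (h, _) :: _ when h <= ts -> List.rev acc
    | e :: rest -> aux (e :: acc) rest
  in
  aux [] entries

(* Merge two logs that share the common ancestor [old]. *)
let merge_log ~old t1 t2 =
  let ts = timestamp old in
  let n1 = newer_than ts t1 in
  let n2 = newer_than ts t2 in
  let by_time (x, _) (y, _) = compare x y in
  (* Sort the new entries oldest first, then reverse them onto [old]. *)
  let sorted = List.sort by_time (List.rev_append n1 n2) in
  List.rev_append sorted old
```

For example, with `old = [(1, "a")]`, one branch holding entries 3, 2 and 1, and the other holding entries 4 and 1, the merged log lists the entries with timestamps 4, 3, 2 and 1, in that order.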

To persist the log file on disk, we need to choose a backend. We show here how to use the on-disk Git backend on Unix.

(* Bring [Irmin_unix.task] and [Irmin_unix.Irmin_git] in scope. *)
open Irmin_unix

(* Build an Irmin store containing log files. *)
module S = Irmin_git.FS(Log)(Irmin.Ref.String)(Irmin.Hash.SHA1)

(* Set-up the local configuration of the Git repository. *)
let config = Irmin_git.config ~root:"/tmp/irmin/test" ~bare:true ()

We can now define a toy example to use our mergeable log files.

open Lwt

(* Name of the log file. *)
let file = [ "local"; "debug" ]

(* Read the entire log file. *)
let read_file t =
  S.read (t "Reading the log file") file >>= function
  | None   -> return_nil
  | Some l -> return l

(* Persist a new entry in the log. *)
let log t fmt =
  Printf.ksprintf (fun message ->
      read_file t >>= fun logs ->
      let logs = Entry.create message :: logs in
      S.update (t "Adding a new entry") file logs
    ) fmt

let () =
  Lwt_main.run begin
    S.Repo.create config >>= S.master task
    >>= fun t  -> log t "Adding a new log entry"
    >>= fun () -> S.clone_force task (t "Cloning the store") "x"
    >>= fun x  -> log x "Adding new stuff to x"
    >>= fun () -> log x "Adding more stuff to x"
    >>= fun () -> log x "More. Stuff. To x."
    >>= fun () -> log t "I can add stuff on t also"
    >>= fun () -> log t "Yes. On t!"
    >>= fun () -> S.merge_exn "Merging x into t" x ~into:t
    >>= fun () -> return_unit
  end

Helpers

val remote_store : (module S with type t = 'a) -> 'a -> remote

remote_store (module S) t is the remote corresponding to the local store t. Synchronization is done by importing and exporting store slices, so this is usually much slower than native synchronization using remote_uri, but it works for all backends.

module type SYNC = sig ... end

SYNC provides functions to synchronize an Irmin store with local and remote Irmin stores.

module Sync (S : S) : SYNC with type db = S.t and type commit_id = S.commit_id

The default Sync implementation.

module type VIEW = sig ... end

View provides an in-memory partial mirror of the store, with lazy reads and delayed writes.
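The combination of lazy reads and delayed writes can be sketched over a plain hash table. `View_sketch` below is a hypothetical, synchronous illustration: it does not follow the VIEW signature, and its `commit` simply overwrites the backing store instead of applying a merge strategy.

```ocaml
module View_sketch = struct
  type t = {
    store : (string, string) Hashtbl.t;          (* backing store *)
    cache : (string, string option) Hashtbl.t;   (* values read so far *)
    pending : (string, string option) Hashtbl.t; (* delayed writes; None = removal *)
  }

  let of_store store =
    { store; cache = Hashtbl.create 16; pending = Hashtbl.create 16 }

  (* Pending writes win, then the cache; the store is read on demand. *)
  let read v key =
    match Hashtbl.find_opt v.pending key with
    | Some w -> w
    | None ->
      match Hashtbl.find_opt v.cache key with
      | Some c -> c
      | None ->
        let r = Hashtbl.find_opt v.store key in
        Hashtbl.replace v.cache key r;
        r

  (* Writes and removals are only recorded in the view. *)
  let update v key value = Hashtbl.replace v.pending key (Some value)
  let remove v key = Hashtbl.replace v.pending key None

  (* Apply the delayed writes to the backing store. *)
  let commit v =
    Hashtbl.iter (fun key w ->
        match w with
        | Some value -> Hashtbl.replace v.store key value
        | None -> Hashtbl.remove v.store key)
      v.pending;
    Hashtbl.reset v.pending;
    Hashtbl.reset v.cache
end
```

Until `commit` is called, the backing store is left untouched, which is what makes the view usable as a cheap staging area.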

module View (S : S) : VIEW with type db = S.t and type key = S.Key.t and type value = S.Val.t and type commit_id = S.commit_id

Create views.

val with_hrw_view : (module VIEW with type db = 'store and type key = 'path and type t = 'view) -> 'store -> path:'path -> [< `Merge | `Rebase | `Update ] -> ('view -> unit Lwt.t) -> unit Merge.result Lwt.t

with_hrw_view (module View) t ~path strat ops applies ops to an in-memory, temporary and mutable view of the store t. All operations in the transaction are relative to path. The strat strategy decides which merging strategy to use: see VIEW.update_path, VIEW.rebase_path and VIEW.merge_path.

module Dot (S : S) : sig ... end

Dot provides functions to export a store to the Graphviz `dot` format.
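As an illustration of the kind of output involved, a commit history given as (commit, parents) pairs can be rendered to `dot` with the standard library. The `to_dot` function and the graph representation are hypothetical, and the real Dot module may emit a richer graph.

```ocaml
(* Render a commit graph, given as (commit, parents) pairs, as a
   Graphviz digraph with one edge from each commit to each parent. *)
let to_dot (graph : (string * string list) list) : string =
  let buf = Buffer.create 128 in
  Buffer.add_string buf "digraph history {\n";
  List.iter (fun (commit, parents) ->
      Buffer.add_string buf (Printf.sprintf "  %S;\n" commit);
      List.iter (fun parent ->
          Buffer.add_string buf
            (Printf.sprintf "  %S -> %S;\n" commit parent))
        parents)
    graph;
  Buffer.add_string buf "}\n";
  Buffer.contents buf
```

The resulting string can be written to a file and passed to the `dot` command-line tool to produce an image.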

Backends

API to create new Irmin backends. A backend is an implementation exposing either a concrete implementation of S or a functor providing S once applied.

There are two ways to create a concrete Irmin.S implementation:

  • Make creates a store where all the objects are stored in the same store, using the same internal key format and a custom binary format based on bin_prot, with no native synchronization primitives: it is usually what is needed to quickly create a new backend.
  • Make_ext creates a store with a deep embedding of each of the internal stores into separate stores, with total control over the binary format and using the native synchronization protocols when available. This is mainly used by the Git backend, but could be used for other similar backends as well in the future.

module type AO_MAKER = functor (K : Hash.S) -> functor (V : Tc.S0) -> sig ... end

AO_MAKER is the signature exposed by append-only store backends. K is the implementation of keys and V is the implementation of values.

module type RAW = Tc.S0 with type t = Cstruct.t

RAW is the signature for raw values.

module type AO_MAKER_RAW = functor (K : Hash.S) -> functor (V : RAW) -> AO with type key = K.t and type value = V.t

AO_MAKER_RAW is the signature exposed by any backend providing append-only stores with access to raw values. K is the implementation of keys and V is the implementation of raw values.

LINK_MAKER is the signature exposed by stores which enable adding relations between keys. This is used to decouple the way keys are manipulated by the Irmin runtime from the keys used for storage. This is useful when trying to optimize storage for random-access file operations or for encryption.

module type RW_MAKER = functor (K : Hum.S) -> functor (V : Tc.S0) -> sig ... end

RW_MAKER is the signature exposed by read-write store backends. K is the implementation of keys and V is the implementation of values.

module Make (AO : AO_MAKER) (RW : RW_MAKER) : S_MAKER

Simple store creator. Uses the same type for all of the internal keys and stores all the values in the same store.

module Make_ext (P : Private.S) : S with type key = P.Contents.Path.t and type value = P.Contents.value and type branch_id = P.Ref.key and type commit_id = P.Ref.value and type Key.step = P.Contents.Path.step and type Repo.t = P.Repo.t

Advanced store creator.
