package pacomb


A module providing efficient input buffers with preprocessing.

Type

type stream_infos =
  | File of {
      name : string;
      length : int;
      date : float;
    }
  | String of string
  | Stream

Information about the source of a buffer, kept so that a file can be reopened.
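As an illustration of how the three constructors can be consumed, here is a minimal sketch. The type is copied locally so that the snippet compiles without the library; with pacomb installed one would presumably match on the library's own constructors instead. The helper name describe is hypothetical and not part of the module.

```ocaml
(* Local copy of the documented type, so this sketch is self-contained. *)
type stream_infos =
  | File of { name : string; length : int; date : float }
  | String of string
  | Stream

(* [describe] is a hypothetical helper, not part of the library. *)
let describe (si : stream_infos) : string =
  match si with
  | File { name; length; date = _ } ->
      Printf.sprintf "file %s (length %d)" name length
  | String s -> Printf.sprintf "string of %d bytes" (String.length s)
  | Stream -> "stream"

let () =
  print_endline (describe (File { name = "a.txt"; length = 12; date = 0.0 }));
  print_endline (describe (String "hello"));
  print_endline (describe Stream)
```

The File case carries enough data (name, length, date) to check whether a file changed before reopening it, which is the use the description above hints at.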

type buffer

The abstract type for an input buffer.

type infos

Type of fixed data attached to the buffer (like file name)

val infos : buffer -> infos

returns the infos associated with a buffer

val phantom_infos : infos

dummy infos

val stream_infos : infos -> stream_infos

returns the stream_infos stored in infos

val filename : infos -> string

returns the filename if it exists, the empty string otherwise

val utf8 : infos -> Utf8.context

utf8 infos returns the Unicode context in use for this file

type idx

The abstract type of positions relative to the current buffer

val init_idx : idx

position at the beginning of a buffer

type byte_pos

The abstract type of positions relative to the beginning of the buffer

val int_of_byte_pos : byte_pos -> int

converts a byte_pos to a natural number

val init_byte_pos : byte_pos

zero

val phantom_byte_pos : byte_pos

dummy value, to initialize references for instance

type spos = infos * byte_pos

Short (and quick) type for positions

val phantom_spos : spos

dummy value, to initialize references for instance

Reading from a buffer

val read : buffer -> idx -> char * buffer * idx

read buf idx returns the character at position idx in the buffer buf, together with the new buffer and position. At the end of the buffer, read returns '\255' indefinitely.
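The following sketch shows one way to thread the buffer and position returned by read through a loop. It assumes the pacomb library is installed and exposes this module as Pacomb.Input; the helper name contents is hypothetical. It stops at the first '\255', so it would also stop early on input that genuinely contains that byte.

```ocaml
(* Assumption: pacomb is installed and this module is Pacomb.Input. *)
module Input = Pacomb.Input

(* [contents] is a hypothetical helper: it collects characters up to the
   first '\255', the value documented as returned at end of buffer. *)
let contents (buf : Input.buffer) : string =
  let acc = Buffer.create 16 in
  let rec loop buf idx =
    (* Each call yields the character plus the buffer and index to use next. *)
    let (c, buf', idx') = Input.read buf idx in
    if c = '\255' then Buffer.contents acc
    else begin
      Buffer.add_char acc c;
      loop buf' idx'
    end
  in
  loop buf Input.init_idx

let () =
  let buf = Input.from_string "abc" in
  print_endline (contents buf)
```

Always continuing from the returned pair, rather than reusing the original buf and idx, is what lets the loop advance through the input.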

val sub : buffer -> idx -> int -> string

sub b i len returns len characters starting from position i. If the end of the buffer is reached, the string is filled with the end-of-file character '\255'.

val get : buffer -> idx -> char

get buf idx returns the character at position idx in the buffer buf.
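A small sketch of get and sub on a string buffer, assuming the pacomb library is installed and exposes this module as Pacomb.Input. Requesting more characters than the input holds illustrates the '\255' padding described for sub.

```ocaml
(* Assumption: pacomb is installed and this module is Pacomb.Input. *)
module Input = Pacomb.Input

let () =
  let buf = Input.from_string "abc" in
  (* Peek at the first character without advancing. *)
  let first = Input.get buf Input.init_idx in
  (* Ask for 5 characters from a 3-character input; per the description
     of [sub], the remainder should be padded with '\255'. *)
  let padded = Input.sub buf Input.init_idx 5 in
  Printf.printf "first = %C, padded = %S (length %d)\n"
    first padded (String.length padded)
```

Unlike read, neither function returns a new buffer or position, so they suit lookahead rather than iteration.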

Creating a buffer

type context = Utf8.context
val from_file : ?utf8:context -> string -> buffer

from_file fn returns a buffer constructed using the file fn.

If utf8 is Utf8.UTF8 or Utf8.CJK_UTF8 (Utf8.ASCII is the default), positions are reported according to utf8. read is still reading bytes.

Getting line and column numbers requires rescanning the file; if the file is not a regular file, it is kept in memory. Setting rescan to false avoids this, but only the byte position and file name will then be available.
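A minimal sketch of opening a file with a UTF-8 context. It assumes the pacomb library is installed and that the modules are reachable as Pacomb.Input and Pacomb.Utf8; the helper first_byte_of_file is hypothetical. The demo writes a temporary file first so that it does not depend on any existing path.

```ocaml
(* Assumption: pacomb is installed, with modules Pacomb.Input and Pacomb.Utf8. *)
module Input = Pacomb.Input
module Utf8 = Pacomb.Utf8

(* [first_byte_of_file] is a hypothetical helper. Positions are reported
   according to UTF-8, but the value read is still a single byte. *)
let first_byte_of_file (path : string) : char =
  let buf = Input.from_file ~utf8:Utf8.UTF8 path in
  Input.get buf Input.init_idx

let () =
  let path = Filename.temp_file "pacomb_demo" ".txt" in
  let oc = open_out path in
  output_string oc "hello\n";
  close_out oc;
  let c = first_byte_of_file path in
  Sys.remove path;
  Printf.printf "first byte: %C\n" c
```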

val from_channel : ?utf8:context -> ?filename:string -> Stdlib.in_channel -> buffer

from_channel ~filename ch returns a buffer constructed using the channel ch. The optional filename is only used as a reference to the channel in error messages.

utf8 and rescan as in from_file.

val from_fd : ?utf8:context -> ?filename:string -> Unix.file_descr -> buffer

Same as from_channel, for a file descriptor.

val from_string : ?utf8:context -> string -> buffer

from_string str returns a buffer constructed using the string str.

Buffer manipulation functions

val is_empty : buffer -> int -> bool

is_empty buf pos tests whether the buffer buf is empty at position pos.

exception NoLineNorColumnNumber
val byte_pos : buffer -> idx -> byte_pos

position in bytes, regardless of utf8

val spos : buffer -> idx -> spos

get spos from buffer and idx, to get line_num and col_num if needed later.
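The sketch below shows two routes to the same byte offset: byte_pos directly, and the second component of spos. It assumes the pacomb library is installed and exposes this module as Pacomb.Input; the helper advance is hypothetical.

```ocaml
(* Assumption: pacomb is installed and this module is Pacomb.Input. *)
module Input = Pacomb.Input

(* [advance] is a hypothetical helper: it reads [n] characters and returns
   the resulting buffer and index. *)
let rec advance (buf : Input.buffer) (idx : Input.idx) (n : int) =
  if n <= 0 then (buf, idx)
  else
    let (_, buf', idx') = Input.read buf idx in
    advance buf' idx' (n - 1)

let () =
  let buf = Input.from_string "hello" in
  let (buf', idx') = advance buf Input.init_idx 2 in
  (* Route 1: ask for the byte position directly. *)
  let offset = Input.int_of_byte_pos (Input.byte_pos buf' idx') in
  (* Route 2: take the byte_pos component of the short position. *)
  let (_infos, bpos) = Input.spos buf' idx' in
  Printf.printf "offset = %d, via spos = %d\n"
    offset (Input.int_of_byte_pos bpos)
```

Storing an spos is the cheaper choice when line and column numbers may never be needed, since it only pairs the infos with a byte position.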

val buffer_uid : buffer -> int

buffer_uid buf returns a unique identifier. Input.read does not change the uid. The uid is created when creating the initial buffer.
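To illustrate the claim that reading does not change the uid, here is a short sketch. It assumes the pacomb library is installed and exposes this module as Pacomb.Input; the helper uid_preserved is hypothetical.

```ocaml
(* Assumption: pacomb is installed and this module is Pacomb.Input. *)
module Input = Pacomb.Input

(* [uid_preserved] is a hypothetical helper: it compares the uid of a fresh
   buffer with the uid of the buffer returned by one call to [read]. *)
let uid_preserved (s : string) : bool =
  let buf = Input.from_string s in
  let (_, buf', _) = Input.read buf Input.init_idx in
  Input.buffer_uid buf = Input.buffer_uid buf'

let () =
  Printf.printf "uid preserved by read: %b\n" (uid_preserved "ab")
```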

val buffer_equal : buffer -> buffer -> bool

buffer_equal b1 b2 tests the equality of b1 and b2.

val buffer_compare : buffer -> buffer -> int

buffer_compare b1 b2 compares b1 and b2.

val buffer_before : buffer -> int -> buffer -> int -> bool

buffer_before b1 i1 b2 i2 returns true if the position (b1, i1) is before (b2, i2). The result is meaningless if b1 and b2 do not refer to the same file, i.e. do not have the same uid.

module Tbl : sig ... end

Table associating values with positions in input buffers. The complexity of access in the table is O(ln(N)) where N is the number of tables.