A record type for FASTA files.
Suppose a FASTA file contains an entry like this:
>s1 apple pie
ACTG
actg
Then you would get a record like this:
(* "s1" *)
Fasta.Record.id record;;
(* Some "apple pie" *)
Fasta.Record.desc record;;
(* "ACTGactg" *)
Fasta.Record.seq record
If the header line has an ID but no description, like this:
>s1
ACTG
actg
Then you would get a record like this:
(* "s1" *)
Fasta.Record.id record;;
(* None *)
Fasta.Record.desc record;;
(* "ACTGactg" *)
Fasta.Record.seq record
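The mapping from file text to record fields shown above can be sketched with the standard library alone. This is a minimal illustration, not the library's parser: `split_header` and `join_seq` are hypothetical helpers, and splitting the header at the first space is an assumption inferred from the two examples.

```ocaml
(* Hypothetical helper, not part of Fasta.Record: split a FASTA header
   line into an id and an optional description at the first space. *)
let split_header (line : string) : string * string option =
  let n = String.length line in
  (* Drop the leading '>' if present. *)
  let body =
    if n > 0 && line.[0] = '>' then String.sub line 1 (n - 1) else line
  in
  match String.index_opt body ' ' with
  | None -> (body, None)
  | Some i ->
    let id = String.sub body 0 i in
    let desc = String.sub body (i + 1) (String.length body - i - 1) in
    (id, Some desc)

(* Hypothetical helper: sequence lines are joined with no separator. *)
let join_seq (lines : string list) : string = String.concat "" lines

let () =
  assert (split_header ">s1 apple pie" = ("s1", Some "apple pie"));
  assert (split_header ">s1" = ("s1", None));
  assert (join_seq [ "ACTG"; "actg" ] = "ACTGactg")
```

The sketch uses the standard library's polymorphic `=`; in code that opens Base, string comparisons would need `String.equal` or a local open instead.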
To change part of a Fasta.Record, use the with_* functions. E.g.,
Fasta.Record.with_id "apple" record
would give you a t with the id set to "apple".
val create :
id:Base.string ->
desc:Base.string Base.option ->
seq:Base.string ->
t
create ~id ~desc ~seq creates a new t. It should not raise, since any values of the correct types are accepted.
val to_string : t -> Base.string
to_string t returns a string representation of t ready to print to a FASTA output file.
val to_string_nl : ?nl:Base.string -> t -> Base.string
to_string_nl t ~nl returns a string representation of t ready to print to a FASTA output file, followed by the trailing newline string nl. nl defaults to "\n".
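The relationship between to_string and to_string_nl might look like the sketch below. It uses a stand-in `record` type rather than the library's abstract t, and the exact output layout (`>id desc`, a newline, then the sequence on one line) is an assumption, not something this page confirms.

```ocaml
(* Stand-in record type for illustration; the real Fasta.Record.t is abstract. *)
type record = { id : string; desc : string option; seq : string }

(* Assumed layout: header line, newline, then the whole sequence. *)
let to_string (r : record) : string =
  let header =
    match r.desc with
    | None -> ">" ^ r.id
    | Some d -> ">" ^ r.id ^ " " ^ d
  in
  header ^ "\n" ^ r.seq

(* Same as to_string, plus a trailing newline string (default "\n"). *)
let to_string_nl ?(nl = "\n") (r : record) : string = to_string r ^ nl

let () =
  let r = { id = "s1"; desc = Some "apple pie"; seq = "ACTGactg" } in
  assert (to_string r = ">s1 apple pie\nACTGactg");
  assert (to_string_nl r = ">s1 apple pie\nACTGactg\n");
  (* A different line ending can be passed through the optional argument. *)
  assert (to_string_nl ~nl:"\r\n" r = ">s1 apple pie\nACTGactg\r\n")
```

Because to_string_nl already appends the line ending, its result can be written with `print_string` or `Out_channel.output_string` without adding another newline.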
val id : t -> Base.string
id t returns the id of the t.
val desc : t -> Base.string Base.option
desc t returns the desc (description) of the t.
val seq : t -> Base.string
seq t returns the seq of the t.
val seq_length : t -> Base.int
seq_length t returns the length of the seq of t.
If you construct a record by hand (e.g., with create), and the sequence contains spaces or other non-sequence characters, they will be counted in the length. E.g.,
let r = Fasta.Record.create ~id:"apple" ~desc:None ~seq:"a a" in
assert (Int.(3 = Fasta.Record.seq_length r))
with_id new_id t returns a t with new_id instead of the original id.
with_seq new_seq t returns a t with new_seq instead of the original seq.
with_desc new_desc t returns a t with new_desc instead of the original desc.
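Since each with_* function takes the new value first and the record last, calls chain naturally with `|>`. The sketch below shows that pattern on a stand-in `record` type; the definitions are illustrative only, and the argument type of `with_desc` (a `string option` here) is an assumption.

```ocaml
(* Stand-in record type for illustration; the real Fasta.Record.t is abstract. *)
type record = { id : string; desc : string option; seq : string }

(* Each function returns an updated copy and leaves its input untouched. *)
let with_id (id : string) (r : record) : record = { r with id }
let with_seq (seq : string) (r : record) : record = { r with seq }
let with_desc (desc : string option) (r : record) : record = { r with desc }

let () =
  let r = { id = "s1"; desc = Some "apple pie"; seq = "ACTGactg" } in
  let r' = r |> with_id "apple" |> with_desc None in
  assert (r'.id = "apple");
  assert (r'.desc = None);
  (* Fields that were not updated carry over. *)
  assert (r'.seq = "ACTGactg");
  (* The original record is unchanged. *)
  assert (r.id = "s1")
```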
comp t returns the complement of t, i.e., a t whose seq is complemented. Uses IUPAC conventions. Any "base" (char) that isn't part of the IUPAC alphabet passes through unchanged. Note that comp does not round-trip.
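A sketch of IUPAC complementing on a plain string may help make the pass-through and round-trip remarks concrete. It is not the library's implementation: `comp_base` and `comp_seq` are hypothetical names, preserving letter case is an assumption, and mapping U to A is assumed here as one way a round trip can fail (A then complements to T, not back to U).

```ocaml
(* Hypothetical helper: complement one IUPAC nucleotide code.
   Case is preserved (an assumption); other chars pass through unchanged. *)
let comp_base (c : char) : char =
  let upper = Char.uppercase_ascii c in
  let comp_upper =
    match upper with
    | 'A' -> 'T' | 'T' -> 'A' | 'U' -> 'A'
    | 'C' -> 'G' | 'G' -> 'C'
    | 'R' -> 'Y' | 'Y' -> 'R'
    | 'K' -> 'M' | 'M' -> 'K'
    | 'B' -> 'V' | 'V' -> 'B'
    | 'D' -> 'H' | 'H' -> 'D'
    | 'S' -> 'S' | 'W' -> 'W' | 'N' -> 'N'
    | other -> other
  in
  if c = upper then comp_upper else Char.lowercase_ascii comp_upper

(* Hypothetical helper: complement every character of a sequence. *)
let comp_seq (s : string) : string = String.map comp_base s

let () =
  assert (comp_seq "ACTGactg" = "TGACtgac");
  (* Characters outside the IUPAC alphabet are left alone. *)
  assert (comp_seq "A-X*" = "T-X*");
  (* U -> A -> T: complementing twice does not give back the input. *)
  assert (comp_seq (comp_seq "AUG") = "ATG")
```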