Bio_io

The full API is browsable here.

Bio_io is an OCaml library that provides programmatic access to file formats commonly used in bioinformatics, such as FASTA.

If you have any problems or find any bugs, open an issue on the GitHub page.

License

Licensed under the Apache License, Version 2.0 or the MIT license, at your option. This program may not be copied, modified, or distributed except according to those terms.

Quick Start

Install

opam install bio_io

Example

Read a FASTA file and print the ID and sequence length for each record.

open! Base

let fasta_file = "sequences.fasta"

let () =
  (* This open gives you [In_channel] and [Record]. *)
  let open Bio_io.Fasta in
  In_channel.with_file_iter_records fasta_file ~f:(fun record ->
      (* Print the ID and the length of the sequence. *)
      Stdio.printf "%s => %d\n" (Record.id record) (Record.seq_length record))
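
As a variation on the example above, here is a minimal sketch that summarizes a whole file instead of printing each record. It uses only the calls already shown (`In_channel.with_file_iter_records` and `Record.seq_length`) and accumulates into mutable counters; the `summarize` helper is a hypothetical name, not part of Bio_io, and the sketch uses the standard library rather than Base.

```ocaml
(* Assumes the bio_io package is installed (opam install bio_io). *)

(* [summarize] is a hypothetical helper, not part of Bio_io. It returns the
   number of records in a FASTA file and the sum of their sequence lengths. *)
let summarize fasta_file =
  let open Bio_io.Fasta in
  let num_records = ref 0 in
  let total_length = ref 0 in
  In_channel.with_file_iter_records fasta_file ~f:(fun record ->
      incr num_records;
      total_length := !total_length + Record.seq_length record);
  (!num_records, !total_length)

let () =
  (* "sequences.fasta" is the same placeholder file name used above. *)
  if Sys.file_exists "sequences.fasta" then begin
    let num_records, total_length = summarize "sequences.fasta" in
    Printf.printf "%d records, %d residues\n" num_records total_length
  end
```

The library may well offer a more direct way to accumulate over records; check the Record_in_channel signature before settling on the mutable-counter approach.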

Overview

The Bio_io library provides input channels that return records.

For an overview of what an input channel offers, see the Record_in_channel module signature. Every In_channel in this library satisfies it.

The Fasta and Fastq modules each provide a Record module and an In_channel module for reading FASTA and FASTQ files, respectively.
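
Because the input channels share one signature, code written against one of them should carry over to the others. The sketch below illustrates this under the assumption that `Fastq.In_channel` exposes `with_file_iter_records` with the same shape as the FASTA version in the Quick Start; `count_records`, `count_fasta`, and `count_fastq` are hypothetical helper names, not part of Bio_io.

```ocaml
(* Assumes the bio_io package is installed (opam install bio_io). *)

(* [count_records] is a hypothetical helper. It works with any iterator that
   has the shape of [with_file_iter_records], whatever the record type. *)
let count_records (iter : string -> f:('r -> unit) -> unit) file =
  let n = ref 0 in
  iter file ~f:(fun _record -> incr n);
  !n

(* The same helper, specialized to FASTA and to FASTQ input. *)
let count_fasta = count_records Bio_io.Fasta.In_channel.with_file_iter_records
let count_fastq = count_records Bio_io.Fastq.In_channel.with_file_iter_records
```

Passing the iterator as an argument keeps the sketch independent of the exact type names inside the signature.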

The Btab and Btab_queries modules read "delimited" files.

Extending Bio_io

The Record_in_channel.Make functor can be used to create new specialized records and input channels. To use it, you need a module that satisfies the In_channel_input_record signature. The Private module contains a couple of In_channel types you can build on: pick one, then add an input_record function.

For examples, see the definitions of Fasta.In_channel, Fastq.In_channel, and Btab.In_channel.
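
To make the pattern concrete, here is a standard-library-only sketch of the general idea: a functor that takes a channel type plus a single `input_record` function and derives the folding and iterating functions from it. Everything in it (`INPUT_RECORD`, this `Make`, `Key_value`, the in-memory "channel") is hypothetical and simplified; it is not Bio_io's actual definition of Record_in_channel.Make or In_channel_input_record.

```ocaml
(* A hypothetical sketch of the pattern. These are NOT Bio_io's definitions. *)

(* What the functor needs: a channel type and a way to read one record. *)
module type INPUT_RECORD = sig
  type t
  type record

  (* Returns [None] when the input is exhausted. *)
  val input_record : t -> record option
end

(* What the functor derives from [input_record] alone. *)
module Make (M : INPUT_RECORD) = struct
  include M

  let rec fold_records chan ~init ~f =
    match input_record chan with
    | None -> init
    | Some record -> fold_records chan ~init:(f init record) ~f

  let iter_records chan ~f = fold_records chan ~init:() ~f:(fun () r -> f r)

  let records chan =
    List.rev (fold_records chan ~init:[] ~f:(fun acc r -> r :: acc))
end

(* A toy format: one "key=value" record per line. A mutable list of lines
   stands in for a real input channel. *)
module Key_value = Make (struct
  type t = string list ref
  type record = string * string

  let rec input_record chan =
    match !chan with
    | [] -> None
    | line :: rest -> (
        chan := rest;
        match String.index_opt line '=' with
        | None -> input_record chan (* skip lines without '=' *)
        | Some i ->
            let key = String.sub line 0 i in
            let value = String.sub line (i + 1) (String.length line - i - 1) in
            Some (key, value))
end)

let () =
  let chan = ref [ "id=seq1"; "bogus"; "length=42" ] in
  Key_value.iter_records chan ~f:(fun (k, v) -> Printf.printf "%s -> %s\n" k v)
```

The point of the design is that a new format only has to say how to read one record; the rest of the interface comes from the functor.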