package wu-manber-fuzzy-search

  1. Overview
  2. Docs

An module for fuzzy searching in strings

This module is intended to be used for searching as well as an example of properly using the mid-level functors in the FirstMatch module.

The search function in this module stop after the first match, and returns the error count together with the number of characters of the text read by the algorithm.

The following is a description of how to use the functions in Wu_Manber.FirstMatch.

We first collect all the relevant operations of strings and characters in a module called Pattern.

module Pattern : sig ... end

Module collecting operations for strings.

Next we create a Matcher using one of the functors in the Matcher module.

module ArrayMatcher = Matcher.MakeArrayMatcher (Pattern)
module ArrayMatcher : sig ... end

Next, we use the FirstMatch.Make functor to create functions that can search through sequences of characters.

module FirstMatch = FirstMatch.Make (Pattern) (ArrayMatcher)
module FirstMatch : sig ... end

Lastly, to search through strings instead of sequences, we wrap the functions from the FirstMatch module.

let search ~k ~pattern ~text =
  let seq = String.to_seq text in
  FirstMatch.first_match ~pattern ~k seq

let search_right_leaning ~k ~pattern ~text =
  let seq = String.to_seq text in
  FirstMatch.first_right_leaning_match ~pattern ~k seq

let report = FirstMatch.report

search ~k ~pattern ~text searches for the first match in a string using the basic Wu and Manber algorithm, allowing for ~k errors for a match. ~pattern must have length less than or equal 63.

val search_right_leaning : k:int -> pattern:Pattern.t -> text:Stdlib.String.t -> (int * int * Optint.Int63.t array) option

search_right_leaning ~k ~pattern ~text searches for the first match in a string using the right leaning variant of the Wu and Manber algorithm, allowing for ~k errors for a match. ~pattern must have length less than or equal 63.

val report : (int * int * Optint.Int63.t array) option -> string

report produces a texual description of the results of the above functions.