package SZXX
module DOM : sig ... end
Basic XML types and accessor functions
module SAX : sig ... end
Advanced parsing utilities: custom parser options and tools to stream huge documents
type document = {
  decl_attrs : DOM.attr_list;  (* The declaration attributes, e.g. version and encoding *)
  top : DOM.element;  (* The top element of the document *)
}
val sexp_of_document : document -> Sexplib0.Sexp.t
val parse_document :
?parser:SAX.node Angstrom.t ->
?strict:Base.bool ->
Feed.t ->
(document, Base.string) Base.Result.t
Progressively parse a fully formed, fully escaped XML document. Parsing begins before the whole input has been read.
parser: Override the default parser. Make your own parser with SZXX.Xml.SAX.make_parser or pass SZXX.Xml.html_parser.
strict: Default: true. When false, non-closed elements are treated as self-closing elements, HTML-style. For example, a <br> without a matching </br> will be treated as a self-closing <br />.
feed: A producer of raw input data. Create a feed by using the SZXX.Feed module.
val parse_document_from_string :
?parser:SAX.node Angstrom.t ->
?strict:Base.bool ->
Base.string ->
(document, Base.string) Base.Result.t
Same as parse_document, but from a string.
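As a minimal sketch of the call described above, the following parses a small in-memory document and prints it with sexp_of_document. It assumes the szxx and sexplib0 libraries are linked; the sample XML is made up for illustration, and the exact printed shape depends on the library version.

```ocaml
(* Hypothetical sample input, for illustration only. *)
let xml =
  {|<?xml version="1.0" encoding="UTF-8"?><greeting lang="en">Hello</greeting>|}

(* Both optional arguments are omitted: default parser, strict mode. *)
let result = SZXX.Xml.parse_document_from_string xml

let () =
  match result with
  | Ok doc ->
    (* sexp_of_document is declared on this page;
       Sexp.to_string_hum comes from Sexplib0. *)
    print_endline (Sexplib0.Sexp.to_string_hum (SZXX.Xml.sexp_of_document doc))
  | Error msg -> prerr_endline ("XML parse error: " ^ msg)
```

Errors are returned as a string rather than raised, so the caller decides whether a malformed document is fatal.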
val html_parser : SAX.node Angstrom.t
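A hedged sketch of how html_parser might be combined with strict:false to read HTML-style markup. The helper name parse_html and the sample markup are hypothetical; only parse_document_from_string, html_parser and sexp_of_document come from this page, and how lenient the result is depends on the library.

```ocaml
(* Hypothetical helper: parse HTML-like input leniently. *)
let parse_html (html : string) =
  SZXX.Xml.parse_document_from_string
    ~parser:SZXX.Xml.html_parser (* HTML-oriented parser from this module *)
    ~strict:false                (* treat unclosed elements such as <br> as self-closing *)
    html

let () =
  match parse_html "<html><body><p>first line<br>second line</p></body></html>" with
  | Ok doc ->
    print_endline (Sexplib0.Sexp.to_string_hum (SZXX.Xml.sexp_of_document doc))
  | Error msg -> prerr_endline ("HTML parse error: " ^ msg)
```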
val stream_matching_elements :
?parser:SAX.node Angstrom.t ->
?strict:Base.bool ->
filter_path:Base.string Base.list ->
on_match:(DOM.element -> Base.unit) ->
Feed.t ->
(document, Base.string) Base.Result.t
Progressively assemble an XML DOM, but every element that matches filter_path is passed to on_match instead of being added to the DOM. This "shallow DOM" is then returned. All text nodes are properly unescaped. Parsing begins before the whole input has been read.
parser: Override the default parser. Make your own parser with SZXX.Xml.SAX.make_parser or pass SZXX.Xml.html_parser.
strict: Default: true. When false, non-closed elements are treated as self-closing elements, HTML-style. For example, a <br> without a matching </br> will be treated as a self-closing <br />.
feed: A producer of raw input data. Create a feed by using the SZXX.Feed module.
filter_path: Indicates which part of the DOM should be streamed out instead of being stored in the DOM. For example, ["html"; "body"; "div"; "div"; "p"] will emit all the <p> tags nested inside exactly 2 levels of <div> tags in an HTML document.
on_match: Called on every element that matched filter_path.
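The streaming behaviour can be sketched as follows: elements at the path root/item are handed to on_match and counted, while the rest of the tree is returned as the shallow DOM. The sample document is made up, and SZXX.Feed.of_string is an assumption here, since this page only says that feeds are created with the SZXX.Feed module.

```ocaml
(* Hypothetical sample input, for illustration only. *)
let xml = "<root><item>a</item><item>b</item><other>c</other></root>"

(* Counts how many elements were streamed out through on_match. *)
let matched = ref 0

let result =
  SZXX.Xml.stream_matching_elements
    ~filter_path:[ "root"; "item" ]
    ~on_match:(fun (_el : SZXX.Xml.DOM.element) -> incr matched)
    (SZXX.Feed.of_string xml) (* assumption: Feed.of_string builds a feed from a string *)

let () =
  match result with
  | Ok _shallow_doc -> Printf.printf "streamed %d matching element(s)\n" !matched
  | Error msg -> prerr_endline ("XML parse error: " ^ msg)
```

Because matched elements are not kept in the returned document, memory use should stay bounded by the size of a single matched element plus the shallow DOM, which is what makes this approach suitable for very large inputs.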