Utf8_text
Text is text encoded in UTF-8.
Under the hood, this is just a String.t, but the type is abstract so that the compiler will remind us not to use String.length when we mean Text.width.
include Ppx_compare_lib.Comparable.S with type t := t
val compare : t Base__Ppx_compare_lib.compare
include Ppx_quickcheck_runtime.Quickcheckable.S with type t := t
val quickcheck_generator : t Base_quickcheck.Generator.t
val quickcheck_observer : t Base_quickcheck.Observer.t
val quickcheck_shrinker : t Base_quickcheck.Shrinker.t
val sexp_of_t : t -> Sexplib0.Sexp.t
The invariant is that t is a sequence of well-formed UTF-8 code points.
include Core.Invariant.S with type t := t
val invariant : t Base__.Invariant_intf.inv
include Core.Container.S0 with type t := t with type elt := Core.Uchar.t
val mem : t -> Core.Uchar.t -> bool
val length : t -> int
val is_empty : t -> bool
val iter : t -> f:(Core.Uchar.t -> unit) -> unit
val fold : t -> init:'accum -> f:('accum -> Core.Uchar.t -> 'accum) -> 'accum
val fold_result :
  t ->
  init:'accum ->
  f:('accum -> Core.Uchar.t -> ('accum, 'e) Base__.Result.t) ->
  ('accum, 'e) Base__.Result.t
val fold_until :
  t ->
  init:'accum ->
  f:
    ('accum ->
    Core.Uchar.t ->
    ('accum, 'final) Base__.Container_intf.Continue_or_stop.t) ->
  finish:('accum -> 'final) ->
  'final
val exists : t -> f:(Core.Uchar.t -> bool) -> bool
val for_all : t -> f:(Core.Uchar.t -> bool) -> bool
val count : t -> f:(Core.Uchar.t -> bool) -> int
val sum :
  (module Base__.Container_intf.Summable with type t = 'sum) ->
  t ->
  f:(Core.Uchar.t -> 'sum) ->
  'sum
val find : t -> f:(Core.Uchar.t -> bool) -> Core.Uchar.t option
val find_map : t -> f:(Core.Uchar.t -> 'a option) -> 'a option
val to_list : t -> Core.Uchar.t list
val to_array : t -> Core.Uchar.t array
val min_elt :
  t ->
  compare:(Core.Uchar.t -> Core.Uchar.t -> int) ->
  Core.Uchar.t option
val max_elt :
  t ->
  compare:(Core.Uchar.t -> Core.Uchar.t -> int) ->
  Core.Uchar.t option
val width : t -> int
width t approximates the displayed width of t.
We incorrectly assume that every code point has the same width. This is closer to the displayed width than String.length, which counts bytes, for text containing multi-byte code points, but it is still wrong for double-width characters and combining diacritics.
val bytes : t -> int
bytes t is the number of bytes in the UTF-8 encoding of t.
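
The difference between width and bytes can be illustrated with the OCaml standard library alone. The following is a minimal sketch, not this module's implementation: approx_width and byte_count are hypothetical stand-ins that operate on a plain string, and the sketch assumes OCaml 4.14 or later for String.get_utf_8_uchar.

```ocaml
(* Hypothetical stand-in for [width]: count code points by decoding one
   UTF-8 sequence at a time. Every code point counts as one column. *)
let approx_width (s : string) : int =
  let rec go pos count =
    if pos >= String.length s then count
    else
      let d = String.get_utf_8_uchar s pos in
      go (pos + Uchar.utf_decode_length d) (count + 1)
  in
  go 0 0

(* Hypothetical stand-in for [bytes]: the length of the underlying string. *)
let byte_count (s : string) : int = String.length s

let () =
  (* "naïve": the ï is U+00EF, encoded as the two bytes C3 AF. *)
  let naive = "na\xC3\xAFve" in
  Printf.printf "width = %d, bytes = %d\n" (approx_width naive) (byte_count naive);
  (* "e" followed by U+0301 COMBINING ACUTE ACCENT: displayed in one column,
     but counted as two code points by this approximation. *)
  let combining = "e\xCC\x81" in
  Printf.printf "width = %d, bytes = %d\n"
    (approx_width combining) (byte_count combining)
```

The second string shows the limitation described above: a combining diacritic is counted as its own column even though it occupies none on screen.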
chunks_of t ~width splits t into chunks no wider than width characters such that t = t |> chunks_of ~width |> concat. chunks_of always returns at least one chunk, which may be empty.
If prefer_split_on_spaces = true and such a space exists, t will be split on the last U+0020 SPACE before the chunk becomes too wide. Otherwise, the split happens exactly at width characters.
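
The chunking behaviour described above can be sketched with the standard library. This is an illustration under stated assumptions, not the module's actual code: chunks_of_sketch, decode and encode are hypothetical names, the input is assumed to be well-formed UTF-8, width is assumed to be positive, and the space at a split point is assumed to stay at the end of the preceding chunk (the documentation only guarantees that concatenating the chunks gives back t).

```ocaml
(* Decode a (well-formed) UTF-8 string into an array of code points. *)
let decode (s : string) : Uchar.t array =
  let rec go pos acc =
    if pos >= String.length s then List.rev acc
    else
      let d = String.get_utf_8_uchar s pos in
      go (pos + Uchar.utf_decode_length d) (Uchar.utf_decode_uchar d :: acc)
  in
  Array.of_list (go 0 [])

(* Encode code points a.(start) .. a.(start + len - 1) back to UTF-8. *)
let encode (a : Uchar.t array) ~start ~len : string =
  let b = Buffer.create len in
  for i = start to start + len - 1 do
    Buffer.add_utf_8_uchar b a.(i)
  done;
  Buffer.contents b

(* Hypothetical sketch of [chunks_of] on plain strings. *)
let chunks_of_sketch (s : string) ~width ~prefer_split_on_spaces : string list =
  if width <= 0 then invalid_arg "chunks_of_sketch: width must be positive";
  let a = decode s in
  let n = Array.length a in
  let space = Uchar.of_char ' ' in
  (* Index just past the last space among a.(start) .. a.(i - 1), if any. *)
  let rec last_space_end start i =
    if i <= start then None
    else if Uchar.equal a.(i - 1) space then Some i
    else last_space_end start (i - 1)
  in
  let rec go start acc =
    if n - start <= width then
      (* The remainder fits; this also yields [""] for empty input. *)
      List.rev (encode a ~start ~len:(n - start) :: acc)
    else
      let limit = start + width in
      let stop =
        if prefer_split_on_spaces then
          match last_space_end start limit with
          | Some i -> i      (* split right after the last space that fits *)
          | None -> limit    (* no space available: split at exactly [width] *)
        else limit
      in
      go stop (encode a ~start ~len:(stop - start) :: acc)
  in
  go 0 []

let () =
  chunks_of_sketch "hello world foo" ~width:7 ~prefer_split_on_spaces:true
  |> List.iter (fun chunk -> Printf.printf "[%s]\n" chunk)
```

Because every code point ends up in exactly one chunk, String.concat "" over the result reproduces the input, which mirrors the round-trip property stated for chunks_of.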
val of_uchar_list : Core.Uchar.t list -> t
val iteri : t -> f:(int -> Core.Uchar.t -> unit) -> unit
iteri t ~f calls f index uchar for every uchar in t. index counts characters, not bytes.
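
To make the distinction between character indices and byte offsets concrete, here is a small sketch using only the standard library. iteri_sketch is a hypothetical stand-in that works on a plain string assumed to hold well-formed UTF-8; it is not the module's implementation.

```ocaml
(* Hypothetical stand-in for [iteri]: [pos] is the byte offset used for
   decoding, while [index] is the code point index passed to [f]. *)
let iteri_sketch (s : string) ~(f : int -> Uchar.t -> unit) : unit =
  let rec go pos index =
    if pos < String.length s then begin
      let d = String.get_utf_8_uchar s pos in
      f index (Uchar.utf_decode_uchar d);
      go (pos + Uchar.utf_decode_length d) (index + 1)
    end
  in
  go 0 0

let () =
  (* "aéz": é is U+00E9, encoded as two bytes, so 'z' sits at byte
     offset 3 but is reported with index 2. *)
  iteri_sketch "a\xC3\xA9z" ~f:(fun i u ->
    Printf.printf "%d: U+%04X\n" i (Uchar.to_int u))
```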