text-1.2.4.1: An efficient packed Unicode text type.
An efficient packed, immutable Unicode text type (both strict and lazy), with a powerful loop fusion optimization framework.
The Text type represents Unicode character strings in a time- and space-efficient manner. This package provides text processing capabilities that are optimized for performance-critical use, both in terms of large data quantities and high speed.
The Text type provides character-encoding, type-safe case conversion via whole-string case conversion functions (see Data.Text).
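As a minimal sketch of whole-string case conversion, the example below uses T.toUpper and T.toCaseFold from Data.Text. The helper caseInsensitiveEq is a hypothetical name introduced for illustration and is not part of the package.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Text as T

-- Hypothetical helper (not part of the text package): compare two values
-- after Unicode case folding rather than after toLower or toUpper.
caseInsensitiveEq :: T.Text -> T.Text -> Bool
caseInsensitiveEq a b = T.toCaseFold a == T.toCaseFold b

main :: IO ()
main = do
  -- Conversion works on the whole string, so the result may change length:
  -- the German sharp s (U+00DF) upper-cases to the two letters "SS".
  print (T.toUpper "straße")
  print (T.length "straße", T.length (T.toUpper "straße"))
  print (caseInsensitiveEq "Hello" "hELLO")
```

Because the conversion is applied to the string as a whole rather than character by character, one-to-many mappings like the one above are handled without the caller having to special-case them.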
It also provides a range of functions for converting Text values to and from ByteStrings, using several standard encodings (see Data.Text.Encoding).
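A short sketch of a UTF-8 round trip through ByteString, using encodeUtf8 and decodeUtf8' from Data.Text.Encoding. The wrapper decodeSafely is a hypothetical name used only for this illustration, and the example assumes the bytestring package is available alongside text.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- Hypothetical wrapper (not part of the package): decode UTF-8 bytes and
-- report invalid input as a Left value instead of throwing an exception.
decodeSafely :: B.ByteString -> Either String T.Text
decodeSafely bytes = case TE.decodeUtf8' bytes of
  Left err   -> Left (show err)
  Right text -> Right text

main :: IO ()
main = do
  let original = T.pack "caf\233"        -- "café"; U+00E9 needs two bytes in UTF-8
      encoded  = TE.encodeUtf8 original
  print (B.length encoded)
  print (decodeSafely encoded == Right original)
  -- A lone 0xFF byte is never valid UTF-8, so this yields a Left.
  print (decodeSafely (B.pack [0xff]))
```

Using decodeUtf8' rather than decodeUtf8 is a design choice for untrusted input: malformed bytes surface as a value the caller can inspect.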
Efficient locale-sensitive text IO is also supported (see Data.Text.IO).
These modules are intended to be imported qualified, to avoid name clashes with Prelude functions, e.g.
import qualified Data.Text as T
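To illustrate why qualified imports help, the sketch below uses T.words, T.unwords, and T.length next to the Prelude's list-based length, and prints through Data.Text.IO. The function normalizeSpaces is a hypothetical helper, not something the package exports.

```haskell
import qualified Data.Text as T
import qualified Data.Text.IO as TIO

-- Hypothetical helper: collapse runs of whitespace into single spaces.
-- Without the T. prefix, words and unwords would refer to the Prelude's
-- String versions and the definition would not type-check for Text.
normalizeSpaces :: T.Text -> T.Text
normalizeSpaces = T.unwords . T.words

main :: IO ()
main = do
  let line = T.pack "  packed   Unicode  text "
  -- TIO.putStrLn writes a Text value using the handle's current encoding.
  TIO.putStrLn (normalizeSpaces line)
  -- T.length counts characters of a Text; length counts list elements.
  print (T.length (normalizeSpaces line), length "a Prelude String")
```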
ICU Support
To use an extended and very rich family of functions for working with Unicode text (including normalization, regular expressions, non-standard encodings, text breaking, and locales), see the text-icu package, based on the well-respected and liberally licensed ICU library.
Internal Representation: UTF-16 vs. UTF-8
Currently the text library uses UTF-16 as its internal representation, which is neither a fixed-width nor always the most dense representation for Unicode text. We're currently investigating the feasibility of changing Text's internal representation to UTF-8, and if you need such a Text type right now you might be interested in using the spin-off packages text-utf8 and text-short.
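The trade-off can be made concrete by comparing encoded sizes. The sketch below measures the output of encodeUtf16LE and encodeUtf8 from Data.Text.Encoding. It does not inspect the library's internal buffers, so treat it as an approximation of the storage cost. The function encodedSizes is a hypothetical helper.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- Hypothetical helper: (character count, UTF-16 byte count, UTF-8 byte count).
encodedSizes :: T.Text -> (Int, Int, Int)
encodedSizes t =
  ( T.length t
  , B.length (TE.encodeUtf16LE t)
  , B.length (TE.encodeUtf8 t)
  )

main :: IO ()
main = mapM_ (print . encodedSizes . T.pack)
  [ "text"          -- ASCII: UTF-8 is half the size of UTF-16
  , "\26085\26412"  -- U+65E5 U+672C (CJK): UTF-16 is the smaller encoding
  , "\128512"       -- U+1F600: outside the BMP, a surrogate pair in UTF-16
  ]
```

The three inputs cover the cases named above: UTF-16 is not fixed-width (the last character takes two code units), and which encoding is denser depends on the script.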