nulib.text.unicode

Unicode Parsing and Utilities.

Modules

utf16
module nulib.text.unicode.utf16

UTF-16 Utilities

utf32
module nulib.text.unicode.utf32

UTF-32 Utilities

utf8
module nulib.text.unicode.utf8

UTF-8 Utilities

Public Imports

nulib.text.unicode.utf8
import utf8 = nulib.text.unicode.utf8; via public import nulib.text.unicode.utf8;
Undocumented in source.
nulib.text.unicode.utf16
import utf16 = nulib.text.unicode.utf16; via public import nulib.text.unicode.utf16;
Undocumented in source.
nulib.text.unicode.utf32
import utf32 = nulib.text.unicode.utf32; via public import nulib.text.unicode.utf32;
Undocumented in source.

Members

Aliases

GraphemeSequence
alias GraphemeSequence = weak_vector!Grapheme

A sequence of graphemes

UnicodeSequence
alias UnicodeSequence = vector!codepoint

A sequence of Unicode codepoints, stored in a vector

UnicodeSlice
alias UnicodeSlice = codepoint[]

A slice of Unicode codepoints

codepoint
alias codepoint = uint

A Unicode codepoint

Functions

decode
UnicodeSequence decode(T str, bool stripBOM)

Decodes a string into a sequence of Unicode codepoints, optionally stripping a leading byte-order mark (BOM)
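
The effect of a flag like stripBOM can be illustrated on an already decoded codepoint slice. The following is a minimal sketch, not nulib's implementation: stripLeadingBOM is a hypothetical helper, and it assumes the BOM is the single codepoint U+FEFF at the start of the sequence.

```d
/// Hypothetical helper (not part of nulib): returns `seq` without one
/// leading U+FEFF codepoint. The input is not modified; the result is a
/// slice into the same memory.
const(uint)[] stripLeadingBOM(const(uint)[] seq) @safe pure nothrow @nogc {
    enum uint bom = 0xFEFF; // byte-order mark codepoint
    return (seq.length > 0 && seq[0] == bom) ? seq[1 .. $] : seq;
}

unittest {
    uint[] withBOM = [0xFEFF, 0x68, 0x69]; // BOM, 'h', 'i'
    uint[] expected = [0x68, 0x69];
    assert(stripLeadingBOM(withBOM) == expected);
    // A sequence without a BOM is returned unchanged.
    assert(stripLeadingBOM(expected) == expected);
}
```

Only one leading mark is removed; a U+FEFF that appears later in the text is left alone, since there it acts as a zero-width no-break space rather than a byte-order signature.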

encode
T encode(UnicodeSequence seq, bool addBOM)

Encodes a sequence of Unicode codepoints into a string of type T, optionally prepending a byte-order mark (BOM)
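
As an illustration of what encoding a codepoint involves, the sketch below writes one codepoint as UTF-8 bytes. It is not nulib's code: encodeUTF8 and its buffer-based signature are assumptions made for this example, and the real encode works on whole sequences.

```d
/// Hypothetical helper (not nulib's implementation): encodes one codepoint
/// as UTF-8 into `buf`. Returns the number of bytes written (1 to 4), or 0
/// if `cp` is a surrogate or lies above U+10FFFF.
size_t encodeUTF8(uint cp, ref ubyte[4] buf) @safe pure nothrow @nogc {
    if (cp < 0x80) {
        buf[0] = cast(ubyte) cp;
        return 1;
    }
    if (cp < 0x800) {
        buf[0] = cast(ubyte)(0xC0 | (cp >> 6));
        buf[1] = cast(ubyte)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        // Surrogate values are reserved for UTF-16 and are not encodable.
        if (cp >= 0xD800 && cp <= 0xDFFF) return 0;
        buf[0] = cast(ubyte)(0xE0 | (cp >> 12));
        buf[1] = cast(ubyte)(0x80 | ((cp >> 6) & 0x3F));
        buf[2] = cast(ubyte)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        buf[0] = cast(ubyte)(0xF0 | (cp >> 18));
        buf[1] = cast(ubyte)(0x80 | ((cp >> 12) & 0x3F));
        buf[2] = cast(ubyte)(0x80 | ((cp >> 6) & 0x3F));
        buf[3] = cast(ubyte)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

unittest {
    ubyte[4] buf;
    // U+20AC (euro sign) is E2 82 AC in UTF-8.
    assert(encodeUTF8(0x20AC, buf) == 3);
    assert(buf[0] == 0xE2 && buf[1] == 0x82 && buf[2] == 0xAC);
}
```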

getEndianFromBOM
Endianess getEndianFromBOM(codepoint c)

Gets the endianness indicated by a BOM

hasSurrogatePairs
bool hasSurrogatePairs(codepoint code)

Gets whether the codepoint mistakenly contains a surrogate value, which is only meaningful as part of a UTF-16 surrogate pair.
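
For context, the Unicode standard reserves U+D800 through U+DFFF for UTF-16 surrogates, so a decoded codepoint should never fall in that range. The sketch below shows such a range check; the function names are hypothetical, and whether nulib's hasSurrogatePairs uses exactly this test is an assumption.

```d
/// Hypothetical helpers (not part of nulib).

/// High (leading) surrogates occupy U+D800 through U+DBFF.
bool isHighSurrogate(uint cp) @safe pure nothrow @nogc {
    return cp >= 0xD800 && cp <= 0xDBFF;
}

/// Low (trailing) surrogates occupy U+DC00 through U+DFFF.
bool isLowSurrogate(uint cp) @safe pure nothrow @nogc {
    return cp >= 0xDC00 && cp <= 0xDFFF;
}

/// True for any value in the surrogate range.
bool isSurrogate(uint cp) @safe pure nothrow @nogc {
    return isHighSurrogate(cp) || isLowSurrogate(cp);
}

unittest {
    assert(isSurrogate(0xD83D));  // leading half of a pair
    assert(!isSurrogate(0x1F600)); // a decoded supplementary codepoint
}
```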

isBOM
bool isBOM(codepoint c)

Gets whether the codepoint is a byte-order mark (BOM)

isBigEndianBOM
bool isBigEndianBOM(codepoint c)

Gets whether the byte-order mark indicates big-endian byte order

isLittleEndianBOM
bool isLittleEndianBOM(codepoint c)

Gets whether the byte-order mark indicates little-endian byte order
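
The BOM functions above operate on a codepoint. The idea behind them can be sketched at the byte level for UTF-16, where U+FEFF is stored as FE FF in big-endian order and FF FE in little-endian order. ByteOrder and detectUTF16ByteOrder are hypothetical names for this example; nulib's own Endianess type and its checks may differ.

```d
/// Hypothetical enum for this sketch; nulib's `Endianess` type may differ.
enum ByteOrder { unknown, little, big }

/// Hypothetical helper (not part of nulib): inspects the first two bytes
/// of UTF-16 data and reports the byte order signalled by a BOM, if any.
ByteOrder detectUTF16ByteOrder(const(ubyte)[] data) @safe pure nothrow @nogc {
    if (data.length >= 2) {
        if (data[0] == 0xFE && data[1] == 0xFF) return ByteOrder.big;
        if (data[0] == 0xFF && data[1] == 0xFE) return ByteOrder.little;
    }
    return ByteOrder.unknown; // no BOM present
}

unittest {
    // "A" in UTF-16LE, preceded by a BOM.
    ubyte[] le = [0xFF, 0xFE, 0x41, 0x00];
    assert(detectUTF16ByteOrder(le) == ByteOrder.little);
}
```

Note that a UTF-32 little-endian BOM (FF FE 00 00) begins with the same two bytes, so this check alone is only reliable when the data is already known to be UTF-16.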

toUTF16
auto ref toUTF16(FromT from, bool addBOM)

Converts the given string to a UTF-16 string.
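
Conversion to UTF-16 has to split codepoints above U+FFFF into surrogate pairs. The sketch below shows that step for a single codepoint. encodeUTF16 is a hypothetical helper, not nulib's toUTF16, which works on whole strings and handles the BOM as well.

```d
/// Hypothetical helper (not nulib's implementation): encodes one codepoint
/// as UTF-16 code units in native order. Returns the number of units
/// written (1 or 2), or 0 if `cp` is a surrogate or lies above U+10FFFF.
size_t encodeUTF16(uint cp, ref ushort[2] buf) @safe pure nothrow @nogc {
    if (cp < 0x10000) {
        // Lone surrogate values cannot be encoded.
        if (cp >= 0xD800 && cp <= 0xDFFF) return 0;
        buf[0] = cast(ushort) cp;
        return 1;
    }
    if (cp <= 0x10FFFF) {
        const uint v = cp - 0x10000;                  // 20-bit offset
        buf[0] = cast(ushort)(0xD800 | (v >> 10));    // high surrogate: top 10 bits
        buf[1] = cast(ushort)(0xDC00 | (v & 0x3FF));  // low surrogate: bottom 10 bits
        return 2;
    }
    return 0;
}

unittest {
    ushort[2] buf;
    // U+1F600 becomes the surrogate pair D83D DE00.
    assert(encodeUTF16(0x1F600, buf) == 2);
    assert(buf[0] == 0xD83D && buf[1] == 0xDE00);
}
```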

toUTF32
auto ref toUTF32(FromT from, bool addBOM)

Converts the given string to a UTF-32 string.

toUTF8
auto ref toUTF8(FromT from)

Converts the given string to a UTF-8 string.

validate
bool validate(codepoint code)

Validates whether the codepoint is within the Unicode specification
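
What "within spec" covers is not stated here. One plausible reading, used in the sketch below, is that the value must be a Unicode scalar value: at most U+10FFFF and outside the surrogate range. isUnicodeScalar is a hypothetical name, and the exact rules nulib's validate applies are an assumption.

```d
/// Hypothetical helper (not part of nulib): true if `cp` is a Unicode
/// scalar value, i.e. no greater than U+10FFFF and not a surrogate.
bool isUnicodeScalar(uint cp) @safe pure nothrow @nogc {
    enum uint maxCodepoint = 0x10FFFF;
    const bool inSurrogateRange = cp >= 0xD800 && cp <= 0xDFFF;
    return cp <= maxCodepoint && !inSurrogateRange;
}

unittest {
    assert(isUnicodeScalar(0x1F600));   // valid supplementary codepoint
    assert(!isUnicodeScalar(0xD800));   // surrogate
    assert(!isUnicodeScalar(0x110000)); // beyond the Unicode range
}
```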

Static variables

unicodeReplacementCharacter
codepoint unicodeReplacementCharacter;

Codepoint of the Unicode replacement character (U+FFFD), typically substituted for invalid or undecodable input
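
A common use of the Unicode replacement character, U+FFFD, is to stand in for codepoints that fail validation. The sketch below shows that pattern on a plain codepoint slice. replaceInvalid is a hypothetical helper, and how nulib itself uses unicodeReplacementCharacter is not documented here.

```d
/// Hypothetical helper (not part of nulib): overwrites every surrogate or
/// out-of-range value in `seq` with U+FFFD and returns how many were replaced.
size_t replaceInvalid(uint[] seq) @safe pure nothrow @nogc {
    enum uint replacement = 0xFFFD; // Unicode replacement character
    size_t replaced = 0;
    foreach (ref cp; seq) {
        const bool surrogate = cp >= 0xD800 && cp <= 0xDFFF;
        if (cp > 0x10FFFF || surrogate) {
            cp = replacement;
            replaced++;
        }
    }
    return replaced;
}

unittest {
    uint[] seq = [0x41, 0xD800, 0x110000, 0x1F600];
    assert(replaceInvalid(seq) == 2);
    assert(seq[1] == 0xFFFD && seq[2] == 0xFFFD);
}
```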

Structs

Grapheme
struct Grapheme

A Unicode grapheme

Variables

UNICODE_BOM
enum codepoint UNICODE_BOM;

Codepoint for the Unicode byte-order mark (BOM)

Meta

Authors

Luna Nielsen