#ifndef __NORMALIZER2_H__
#define __NORMALIZER2_H__

#if !UCONFIG_NO_NORMALIZATION

getNFKCCasefoldInstance(
    UErrorCode &errorCode);

getInstance(
    const char *packageName,

normalize(src, result, errorCode);

getCombiningClass(
    UChar32 c)
    const;

norm2(n2), set(filterSet) {}

normalizeUTF8(uint32_t options,
    const char *src, int32_t length,

#endif // !UCONFIG_NO_NORMALIZATION
#endif // __NORMALIZER2_H__
virtual UBool getRawDecomposition(UChar32 c, UnicodeString &decomposition) const
Gets the raw decomposition mapping of c.
virtual UBool getDecomposition(UChar32 c, UnicodeString &decomposition) const =0
Gets the decomposition mapping of c.
virtual int32_t spanQuickCheckYes(const UnicodeString &s, UErrorCode &errorCode) const =0
Returns the end of the normalized substring of the input string.
UNormalization2Mode
Constants for normalization modes.
Basic definitions for ICU, for both C and C++ APIs.
A mutable set of Unicode characters and multicharacter strings.
int8_t UBool
The ICU boolean type.
virtual UBool hasBoundaryAfter(UChar32 c) const =0
Tests if the character always has a normalization boundary after it, regardless of context.
USetSpanCondition
Argument values for whether span() and similar functions continue while the current character is cont...
FilteredNormalizer2(const Normalizer2 &n2, const UnicodeSet &filterSet)
Constructs a filtered normalizer wrapping any Normalizer2 instance and a filter set.
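The idea of wrapping a normalizer together with a filter set can be modeled without ICU. The sketch below is a standard-library illustration only and is not ICU's implementation: `Transform`, `filteredApply`, and the use of `std::set<char32_t>` in place of `UnicodeSet` are all assumptions made for this example. It assumes the wrapped operation should apply only to runs of code points inside the filter set, with everything else copied through unchanged.

```cpp
#include <cstddef>
#include <functional>
#include <set>
#include <string>

// Hypothetical stand-in for a normalizer: any string-to-string transform.
using Transform = std::function<std::u32string(const std::u32string &)>;

// Applies `transform` only to maximal runs of code points contained in
// `filter`; runs outside the filter are copied through unchanged.
inline std::u32string filteredApply(const std::u32string &src,
                                    const std::set<char32_t> &filter,
                                    const Transform &transform) {
    std::u32string result;
    std::size_t i = 0;
    while (i < src.size()) {
        const bool inSet = filter.count(src[i]) != 0;
        std::size_t j = i;
        // Extend the run while set membership stays the same.
        while (j < src.size() && (filter.count(src[j]) != 0) == inSet) {
            ++j;
        }
        const std::u32string run = src.substr(i, j - i);
        result += inSet ? transform(run) : run;
        i = j;
    }
    return result;
}
```

With a transform that upper-cases ASCII letters and a filter containing only `a` and `b`, the input `abcd` would come back with its first two letters changed and the last two untouched.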
C++ API: StringPiece: Read-only byte string wrapper class.
UnicodeString is a string class that stores Unicode characters directly and provides similar function...
int32_t UChar32
Define UChar32 as a type for single Unicode code points.
UObject is the common ICU "boilerplate" class.
UErrorCode
Error code to replace exception handling, so that the code is compatible with all C++ compilers,...
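The error-code convention can be illustrated with a small standard-library model. This is a sketch of the general pattern, not ICU code: `Status`, `failed`, and `checkedLength` are hypothetical names, and the enum is a stand-in for `UErrorCode`. It assumes the usual shape of such APIs, in which the caller passes a status by reference, the function does nothing if that status already signals failure, and the function sets the status instead of throwing.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for an error-code enum; not ICU's UErrorCode.
enum class Status { Ok, IllegalArgument };

inline bool failed(Status s) { return s != Status::Ok; }

// Returns the length of `text`, reporting problems through `status`
// rather than through an exception.
inline std::size_t checkedLength(const char *text, Status &status) {
    if (failed(status)) {
        return 0;  // An earlier call already failed: do nothing.
    }
    if (text == nullptr) {
        status = Status::IllegalArgument;
        return 0;
    }
    return std::char_traits<char>::length(text);
}
```

Because a failed status short-circuits later calls, a caller can chain several operations and check the status once at the end.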
Records lengths of string edits but not replacement text.
virtual UBool isNormalizedUTF8(StringPiece s, UErrorCode &errorCode) const
Tests if the UTF-8 string is normalized.
A ByteSink can be filled with bytes.
Unicode normalization functionality for standard Unicode normalization or for using custom mapping ta...
virtual UNormalizationCheckResult quickCheck(const UnicodeString &s, UErrorCode &errorCode) const =0
Tests if the string is normalized.
virtual UnicodeString & append(UnicodeString &first, const UnicodeString &second, UErrorCode &errorCode) const =0
Appends the second string to the first string (merging them at the boundary) and returns the first st...
UNormalizationCheckResult
Result values for normalization quick check functions.
C API: New API for Unicode Normalization.
virtual UBool isInert(UChar32 c) const =0
Tests if the character is normalization-inert.
virtual UnicodeString & normalizeSecondAndAppend(UnicodeString &first, const UnicodeString &second, UErrorCode &errorCode) const =0
Appends the normalized form of the second string to the first string (merging them at the boundary) a...
virtual UBool hasBoundaryBefore(UChar32 c) const =0
Tests if the character always has a normalization boundary before it, regardless of context.
virtual uint8_t getCombiningClass(UChar32 c) const
Gets the combining class of c.
virtual UBool isNormalized(const UnicodeString &s, UErrorCode &errorCode) const =0
Tests if the string is normalized.
virtual UChar32 composePair(UChar32 a, UChar32 b) const
Performs pairwise composition of a & b and returns the composite if there is one.
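Pairwise composition amounts to a lookup keyed on two code points. The following is a minimal standard-library sketch of that idea, not ICU's data or code: `composePairSketch` is a hypothetical name, the table holds only three canonical pairs chosen for illustration, and returning `-1` when no composite exists is a choice made for this example.

```cpp
#include <cstdint>
#include <map>
#include <utility>

// Looks up the composite for the pair (a, b) in a tiny illustrative table.
// Returns the composite code point, or -1 if the pair is not in the table.
inline std::int32_t composePairSketch(char32_t a, char32_t b) {
    static const std::map<std::pair<char32_t, char32_t>, char32_t> kPairs = {
        {{U'e', U'\u0301'}, U'\u00E9'},  // e + combining acute accent
        {{U'a', U'\u0308'}, U'\u00E4'},  // a + combining diaeresis
        {{U'A', U'\u030A'}, U'\u00C5'},  // A + combining ring above
    };
    const auto it = kPairs.find({a, b});
    return it == kPairs.end() ? -1 : static_cast<std::int32_t>(it->second);
}
```

A signed return type leaves room for a sentinel that cannot collide with a valid code point.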
virtual void normalizeUTF8(uint32_t options, StringPiece src, ByteSink &sink, Edits *edits, UErrorCode &errorCode) const
Normalizes a UTF-8 string and optionally records how source substrings relate to changed and unchange...
#define U_NAMESPACE_BEGIN
A string-like object that points to a sized piece of memory.
UnicodeString normalize(const UnicodeString &src, UErrorCode &errorCode) const
Returns the normalized form of the source string.
Normalization filtered by a UnicodeSet.