UTF8 - Simple Library for Internationalization
Conversion between upper case and lower case letters.
Functions | |
void | utf8::make_lower (std::string &str) |
Converts a UTF-8 encoded string to lower case, in place. | |
void | utf8::make_upper (std::string &str) |
Converts a UTF-8 encoded string to upper case, in place. | |
std::string | utf8::tolower (const std::string &str) |
Convert a UTF-8 string to lower case. | |
std::string | utf8::toupper (const std::string &str) |
Convert a UTF-8 string to upper case. | |
int | utf8::icompare (const std::string &s1, const std::string &s2) |
Compare two strings in a case-insensitive way. | |
Conversion between upper case and lower case letters.
The toupper() and tolower() functions, and their in-place counterparts make_upper() and make_lower(), use the standard tables published by the Unicode Consortium to perform case folding. A further function, icompare(), compares two strings ignoring case.
If an input string is not a valid UTF-8 encoded string, these functions throw a utf8::exception.
A small ancillary program (gen_casetab) converts the original table into two tables of equal size, one with the upper case letters and the other with the matching lower case ones. The upper case table is sorted to allow for binary searching. If a code is found in the upper case table, it is replaced with the code at the same position in the lower case table.
Case folding tables take about 22k. Finding a code takes at most 11 comparisons.
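The lookup described above can be sketched as follows. This is only an illustration, not the library's source: the names upper_tab, lower_tab and lower_of are hypothetical, and the five-entry tables stand in for the much larger generated ones.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>

// Illustrative excerpt only; the real tables are generated from UnicodeData.txt.
// upper_tab is sorted; lower_tab holds the matching code at the same index.
static const char32_t upper_tab[] = { U'A', U'B', U'C', 0x00C9, 0x0416 };
static const char32_t lower_tab[] = { U'a', U'b', U'c', 0x00E9, 0x0436 };

// Return the lower case counterpart of cp, or cp itself if it has none.
char32_t lower_of (char32_t cp)
{
  const char32_t *first = std::begin (upper_tab);
  const char32_t *last = std::end (upper_tab);
  assert (std::is_sorted (first, last));  // binary search needs a sorted table

  const char32_t *p = std::lower_bound (first, last, cp);
  if (p == last || *p != cp)
    return cp;                            // not an upper case letter we know
  return lower_tab[p - first];            // same position in the other table
}
```

Calling lower_of (0x0416) yields 0x0436, while a code that is not in the upper case table, such as U'z', is returned unchanged.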
int utf8::icompare (const std::string &s1, const std::string &s2)
Compare two strings in a case-insensitive way.
s1 | first string |
s2 | second string |
Strings must be valid UTF-8 strings.
void utf8::make_lower (std::string &str)
Converts a UTF-8 encoded string to lower case, in place.
str | UTF-8 encoded string to be converted |
Note that the size of the resulting string may differ from that of the original string, because a letter and its lower case counterpart can have UTF-8 encodings of different lengths.
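The size change comes from the UTF-8 encoding itself. For instance, U+023A (Ⱥ) has the simple lower case mapping U+2C65 (ⱥ) in UnicodeData.txt; the first needs two bytes and the second three. The sketch below shows this with a stand-alone encoder; encode_utf8 is a hypothetical helper written for this example, not a function of the library.

```cpp
#include <cassert>
#include <string>

// Encode one code point as UTF-8 (hypothetical helper for illustration).
std::string encode_utf8 (char32_t cp)
{
  // Only Unicode scalar values can be encoded.
  assert (cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));

  std::string out;
  if (cp < 0x80)
    out += static_cast<char> (cp);
  else if (cp < 0x800) {
    out += static_cast<char> (0xC0 | (cp >> 6));
    out += static_cast<char> (0x80 | (cp & 0x3F));
  }
  else if (cp < 0x10000) {
    out += static_cast<char> (0xE0 | (cp >> 12));
    out += static_cast<char> (0x80 | ((cp >> 6) & 0x3F));
    out += static_cast<char> (0x80 | (cp & 0x3F));
  }
  else {
    out += static_cast<char> (0xF0 | (cp >> 18));
    out += static_cast<char> (0x80 | ((cp >> 12) & 0x3F));
    out += static_cast<char> (0x80 | ((cp >> 6) & 0x3F));
    out += static_cast<char> (0x80 | (cp & 0x3F));
  }
  return out;
}
```

Here encode_utf8 (0x023A).size () is 2 and encode_utf8 (0x2C65).size () is 3, so a string containing Ⱥ grows by one byte when converted to lower case.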
void utf8::make_upper (std::string &str)
Converts a UTF-8 encoded string to upper case, in place.
str | string to be converted |
Note that the size of the resulting string may differ from that of the original string, because a letter and its upper case counterpart can have UTF-8 encodings of different lengths.
std::string utf8::tolower (const std::string &str)
Convert a UTF-8 string to lower case.
str | UTF-8 string to convert to lowercase. |
Uses the case mapping table published by the Unicode Consortium (https://www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt).
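For orientation, each record of UnicodeData.txt is one line of semicolon-separated fields, with the simple upper case mapping in field 12 and the simple lower case mapping in field 13 (counting from 0). The sketch below extracts the lower case mapping from one record. It is an assumption about how such a table can be read, not a description of the library's own generator, and the names split_fields and simple_lower are hypothetical.

```cpp
#include <string>
#include <vector>

// Split one UnicodeData.txt record into its semicolon-separated fields.
std::vector<std::string> split_fields (const std::string &line)
{
  std::vector<std::string> fields;
  std::string::size_type start = 0;
  for (;;) {
    std::string::size_type pos = line.find (';', start);
    if (pos == std::string::npos) {
      fields.push_back (line.substr (start));   // last field
      break;
    }
    fields.push_back (line.substr (start, pos - start));
    start = pos + 1;
  }
  return fields;
}

// Simple lower case mapping of a record, or 0 if the record has none.
char32_t simple_lower (const std::string &line)
{
  std::vector<std::string> f = split_fields (line);
  if (f.size () < 15 || f[13].empty ())
    return 0;
  // The mapping is written as a hexadecimal code point.
  return static_cast<char32_t> (std::stoul (f[13], nullptr, 16));
}
```

For the record "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;" simple_lower returns 0x0061; for the record of the small letter a, whose field 13 is empty, it returns 0.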
std::string utf8::toupper (const std::string &str)
Convert a UTF-8 string to upper case.
str | UTF-8 string to convert to uppercase. |
Uses the case mapping table published by the Unicode Consortium (https://www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt).