UTF8 - Simple Library for Internationalization
Character Folding Functions

Conversion between upper case and lower case letters.

Functions

void utf8::make_lower (std::string &str)
 Converts a UTF-8 encoded string to lowercase in place.
 
void utf8::make_upper (std::string &str)
 Converts a UTF-8 encoded string to uppercase in place.
 
std::string utf8::tolower (const std::string &str)
 Convert a UTF-8 string to lower case.
 
std::string utf8::toupper (const std::string &str)
 Convert a UTF-8 string to upper case.
 
int utf8::icompare (const std::string &s1, const std::string &s2)
 Compare two strings in a case-insensitive way.
 

Detailed Description

Conversion between upper case and lower case letters.

The toupper() and tolower() functions, and their in-place counterparts make_upper() and make_lower(), use the standard tables published by the Unicode Consortium to perform case folding. There is also a function, icompare(), that compares strings ignoring case.

If an input string is not a valid UTF-8 encoded string, these functions throw a utf8::exception.

A small ancillary program (gen_casetab) converts the original table into two tables of equal size, one with the upper case letters and the other with the matching lower case ones. The upper case table is sorted to allow binary searching. If a code is found in the upper case table, it is replaced with the code at the same position in the lower case table.

Case folding tables take about 22k. Finding a code takes at most 11 comparisons.
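
The lookup described above can be sketched as follows. This is only an illustration, not the library's actual code: the names upper_tab, lower_tab and to_lower_cp are hypothetical, and the six-entry tables are a hand-picked excerpt standing in for the tables that gen_casetab generates.

```cpp
#include <algorithm>
#include <iterator>

// Hypothetical excerpt of the two parallel tables (assumption: the real
// tables are generated from UnicodeData.txt and are much larger).
// upper_tab is sorted in ascending order; lower_tab[i] is the lower case
// counterpart of upper_tab[i].
static const char32_t upper_tab[] = { 0x0041, 0x0042, 0x005A, 0x00C0, 0x0391, 0x0410 };
static const char32_t lower_tab[] = { 0x0061, 0x0062, 0x007A, 0x00E0, 0x03B1, 0x0430 };

// Return the lower case counterpart of cp, or cp itself when the code
// point does not appear in the upper case table.
char32_t to_lower_cp (char32_t cp)
{
  const char32_t* first = std::begin (upper_tab);
  const char32_t* last = std::end (upper_tab);

  // Binary search: the number of comparisons grows with log2 of the table size.
  const char32_t* it = std::lower_bound (first, last, cp);
  if (it != last && *it == cp)
    return lower_tab[it - first];   // same index in the parallel table
  return cp;                        // not an upper case letter in this table
}
```

With these sample tables, to_lower_cp (0x0391) returns 0x03B1 (Greek alpha), while a code point missing from the upper case table, such as a digit, is returned unchanged.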

Function Documentation

◆ icompare()

int utf8::icompare ( const std::string &  s1,
const std::string &  s2 
)

Compare two strings in a case-insensitive way.

Parameters
s1    first string
s2    second string
Returns
<0 if the first string is lexicographically before the second string
>0 if the first string is lexicographically after the second string
=0 if the two strings are equal

Strings must be valid UTF-8 strings.

◆ make_lower()

void utf8::make_lower ( std::string &  str)

Converts a UTF-8 encoded string to lowercase in place.

Parameters
str    UTF-8 encoded string to be converted

Note that, in general, the size of the resulting string may differ from that of the original string.
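
The size change comes from the UTF-8 encoding itself: a letter and its other-case counterpart can need a different number of bytes. As an illustration, UnicodeData.txt maps U+023A (LATIN CAPITAL LETTER A WITH STROKE) to the lower case U+2C65; the first takes two bytes in UTF-8 and the second takes three. The sketch below uses a hypothetical helper, encode_utf8, which is not part of the library, and performs no validation of its input.

```cpp
#include <string>

// Hypothetical helper: encode a single code point as UTF-8.
// Sketch only; it does not reject surrogates or out-of-range values.
std::string encode_utf8 (char32_t cp)
{
  std::string out;
  if (cp < 0x80) {
    out += static_cast<char> (cp);                          // 1 byte
  } else if (cp < 0x800) {
    out += static_cast<char> (0xC0 | (cp >> 6));            // 2 bytes
    out += static_cast<char> (0x80 | (cp & 0x3F));
  } else if (cp < 0x10000) {
    out += static_cast<char> (0xE0 | (cp >> 12));           // 3 bytes
    out += static_cast<char> (0x80 | ((cp >> 6) & 0x3F));
    out += static_cast<char> (0x80 | (cp & 0x3F));
  } else {
    out += static_cast<char> (0xF0 | (cp >> 18));           // 4 bytes
    out += static_cast<char> (0x80 | ((cp >> 12) & 0x3F));
    out += static_cast<char> (0x80 | ((cp >> 6) & 0x3F));
    out += static_cast<char> (0x80 | (cp & 0x3F));
  }
  return out;
}
```

Here encode_utf8 (0x023A).size () is 2 and encode_utf8 (0x2C65).size () is 3, so a string containing that letter grows by one byte when converted to lower case.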

◆ make_upper()

void utf8::make_upper ( std::string &  str)

Converts a UTF-8 encoded string to uppercase in place.

Parameters
str    UTF-8 encoded string to be converted

Note that, in general, the size of the resulting string may differ from that of the original string.

◆ tolower()

std::string utf8::tolower ( const std::string &  str)

Convert a UTF-8 string to lower case.

Parameters
str    UTF-8 string to convert to lowercase.
Returns
lower case UTF-8 string

Uses the case mapping table published by the Unicode Consortium (https://www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt).

◆ toupper()

std::string utf8::toupper ( const std::string &  str)

Convert a UTF-8 string to upper case.

Parameters
str    UTF-8 string to convert to uppercase.
Returns
upper case UTF-8 string

Uses the case mapping table published by the Unicode Consortium (https://www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt).