C++ - std::u32string conversion to/from std::string and std::u16string


I need to convert between UTF-8, UTF-16 and UTF-32 for different APIs/modules, and since I now have the option to use C++11 I am looking at the new string types.

It looks like I can use string, u16string and u32string for UTF-8, UTF-16 and UTF-32. I also found codecvt_utf8 and codecvt_utf16, which look able to do a conversion between char or char16_t and char32_t, and what looks like a higher-level wstring_convert, but that only appears to work with bytes/std::string, and there is not a great deal of documentation.

Am I meant to use wstring_convert somehow for the UTF-16 ↔ UTF-32 and UTF-8 ↔ UTF-32 cases? I only really found examples for UTF-8 to UTF-16, which I am not even sure will be correct on Linux, where wchar_t is normally considered UTF-32... Or should I do something more complex with those codecvt things directly?

Or is this still not really in a usable state, and should I stick with my own existing small routines using 8, 16 and 32-bit unsigned integers?

If you read the documentation at cppreference.com for wstring_convert, codecvt_utf8, codecvt_utf16, and codecvt_utf8_utf16, the pages include a table that tells you exactly what you can use for the various UTF conversions.


And yes, you would use std::wstring_convert to facilitate the conversion between the various UTFs. Despite its name, it is not limited to std::wstring; it operates with any std::basic_string type (which std::string, std::wstring, and std::uXXstring are all based on).
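To make the point about the name concrete, here is a minimal sketch, assuming a C++11 standard library that ships `<codecvt>` (note the header was deprecated in C++17). The alias `Utf8Utf16Converter` and the helper `utf8_to_utf16` are names made up for this illustration; the `wide_string` and `byte_string` member typedefs are part of std::wstring_convert.

```cpp
#include <cassert>
#include <codecvt>
#include <locale>
#include <string>
#include <type_traits>

// Illustrative alias (not from the original answer): a converter whose
// "wide" side uses char16_t rather than wchar_t.
using Utf8Utf16Converter =
    std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t>;

// The wide_string typedef follows the Elem parameter, so here it is
// std::u16string, not std::wstring. The byte side is always std::string.
static_assert(std::is_same<Utf8Utf16Converter::wide_string, std::u16string>::value,
              "wide_string is std::u16string for Elem = char16_t");
static_assert(std::is_same<Utf8Utf16Converter::byte_string, std::string>::value,
              "byte_string is std::string");

// Hypothetical helper: UTF-8 bytes in, UTF-16 code units out.
std::u16string utf8_to_utf16(const std::string &utf8)
{
    Utf8Utf16Converter conv;
    return conv.from_bytes(utf8);
}
```

The static_asserts are checked at compile time, so the sketch documents the typedefs without needing any runtime output.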

Class template std::wstring_convert performs conversions between a byte string std::string and a wide string std::basic_string<Elem>, using an individual code conversion facet Codecvt. std::wstring_convert assumes ownership of the conversion facet, and cannot use a facet managed by a locale. The standard facets suitable for use with std::wstring_convert are std::codecvt_utf8 for UTF-8/UCS-2 and UTF-8/UCS-4 conversions and std::codecvt_utf8_utf16 for UTF-8/UTF-16 conversions.

For example:

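    #include <codecvt>
    #include <locale>
    #include <string>

    typedef std::string u8string;

    // UTF-16 -> UTF-8
    u8string to_utf8(const std::u16string &s)
    {
        std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> conv;
        return conv.to_bytes(s);
    }

    // UTF-32 -> UTF-8
    u8string to_utf8(const std::u32string &s)
    {
        std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> conv;
        return conv.to_bytes(s);
    }

    // UTF-8 -> UTF-16
    std::u16string to_utf16(const u8string &s)
    {
        std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> conv;
        return conv.from_bytes(s);
    }

    // UTF-32 -> UTF-16
    std::u16string to_utf16(const std::u32string &s)
    {
        // codecvt_utf16 produces UTF-16 as a byte string, big-endian by default.
        // Assemble each code unit explicitly rather than reinterpret_cast-ing
        // the bytes, which would be wrong on a little-endian host.
        std::wstring_convert<std::codecvt_utf16<char32_t>, char32_t> conv;
        std::string bytes = conv.to_bytes(s);
        std::u16string result;
        result.reserve(bytes.size() / 2);
        for (std::size_t i = 0; i + 1 < bytes.size(); i += 2)
            result.push_back(static_cast<char16_t>(
                (static_cast<unsigned char>(bytes[i]) << 8) |
                 static_cast<unsigned char>(bytes[i + 1])));
        return result;
    }

    // UTF-8 -> UTF-32
    std::u32string to_utf32(const u8string &s)
    {
        std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> conv;
        return conv.from_bytes(s);
    }

    // UTF-16 -> UTF-32
    std::u32string to_utf32(const std::u16string &s)
    {
        // Serialize the code units as big-endian bytes, the byte order
        // codecvt_utf16 expects by default.
        std::string bytes;
        bytes.reserve(s.size() * 2);
        for (char16_t c : s) {
            bytes.push_back(static_cast<char>(c >> 8));
            bytes.push_back(static_cast<char>(c & 0xFF));
        }
        std::wstring_convert<std::codecvt_utf16<char32_t>, char32_t> conv;
        return conv.from_bytes(bytes);
    }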
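As an alternative for the UTF-16 ↔ UTF-32 pair, one could pivot through UTF-8 so that byte order never comes into play. This is a sketch of a different technique from the codecvt_utf16 approach in the answer, not a replacement for it; the function names `utf16_to_utf32_via_utf8` and `utf32_to_utf16_via_utf8` are hypothetical, and it again assumes `<codecvt>` is available (deprecated since C++17).

```cpp
#include <cassert>
#include <codecvt>
#include <locale>
#include <string>

// Hypothetical helper: UTF-16 -> UTF-8 -> UTF-32.
// Both steps go through std::string, so no byte-order handling is needed.
std::u32string utf16_to_utf32_via_utf8(const std::u16string &s)
{
    std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> to8;
    std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> to32;
    return to32.from_bytes(to8.to_bytes(s));
}

// Hypothetical helper: UTF-32 -> UTF-8 -> UTF-16.
std::u16string utf32_to_utf16_via_utf8(const std::u32string &s)
{
    std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> to8;
    std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> to16;
    return to16.from_bytes(to8.to_bytes(s));
}
```

The trade-off is an extra pass and a temporary std::string per call, which is probably acceptable for short strings at API boundaries. As with the functions above, a converter constructed without error strings throws std::range_error when a conversion fails.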
