From: Eric Biggers
Date: Sat, 23 Mar 2013 21:29:59 +0000 (-0500)
Subject: Update char encoding docs
X-Git-Tag: v1.3.2~20
X-Git-Url: https://wimlib.net/git/?p=wimlib;a=commitdiff_plain;h=34c830edf813567274416e84816947d3383a6ac5;hp=0ea2bc5fcf915a313935c00a76d64aa60b5ae3c9

Update char encoding docs
---

diff --git a/doc/imagex.1.in b/doc/imagex.1.in
index 6838f2a1..9662b0a0 100644
--- a/doc/imagex.1.in
+++ b/doc/imagex.1.in
@@ -151,18 +151,11 @@ UNIX builds.)
 
 .SH LOCALES AND CHARACTER ENCODINGS
 
-wimlib 1.3.0 has improved support for alternate character encodings.
-However, not everything has been well tested, and on UNIX you are strongly
-encouraged to use a UTF-8 locale so that you do not run into any problems.
-In particular, if your locale uses a character encoding that is
-not UTF-8, then you will not be able to open or capture WIM files containing
-files with paths not representable in the current locale's character encoding.
-
-Similar restrictions apply to the Windows-native build of wimlib, but
-unfortunately Windows does not support UTF-8 locales. So you will not be able
-to apply a WIM image containing files with names not representable in the
-current Windows code page, nor will you be able to capture a directory tree
-containing files with names not representable in the current Windows code page.
+On Windows, wimlib 1.3.2 and later works in UTF-16LE, and there should be no
+problems with character encodings.
+
+On UNIX, wimlib works primarily in the locale-dependent multibyte encoding,
+which you are strongly recommended to set to UTF-8 to avoid any problems.
 
 .SH WARNING
 
diff --git a/src/wimlib.h b/src/wimlib.h
index 403d3877..4b46f21c 100644
--- a/src/wimlib.h
+++ b/src/wimlib.h
@@ -190,16 +190,17 @@
  *
  * \section encodings Locales and character encodings
  *
- * wimlib 1.3.0 has improved handling of different character encodings compared
- * to previous versions. Functions are explictly noted as taking
- * ::wimlib_mbchar strings, which are encoded in the locale-dependent multibyte
- * encoding (e.g. ASCII, ISO-8859-1, or UTF-8), or ::wimlib_utf8char strings,
- * which are encoded in UTF-8. Generally, filenames and paths are in the
- * locale-dependent multibyte encoding, while other types of data must be
- * provided in UTF-8. Please see the man page for @b wimlib-imagex for more
- * information. However, I strongly recommend that you use UTF-8 for your
- * locale's encoding so that ::wimlib_mbchar strings will be encoded the same
- * way as ::wimlib_utf8char strings.
+ * To support Windows as well as UNIX, wimlib's API typically takes and returns
+ * strings of "tchars", which are in a platform-dependent encoding.
+ *
+ * On Windows, each "tchar" is 2 bytes and is the same as a "wchar_t", and the
+ * encoding is UTF-16LE.
+ *
+ * On UNIX, each "tchar" is 1 byte and is simply a "char", and the encoding is
+ * the locale-dependent multibyte encoding. I recommend you set your locale to a
+ * UTF-8 capable locale to avoid any issues. Also, by default, wimlib on UNIX
+ * will assume the locale is UTF-8 capable unless you call wimlib_global_init()
+ * after having set your desired locale.
  *
  * \section Limitations
  *