I'm afraid not. The WIM file format stores filenames as Windows-style wide character strings (UTF-16LE, with unpaired surrogates allowed), so there is no way to represent a UNIX-style filename made of arbitrary bytes unless those bytes are valid UTF-8 (with unpaired surrogates allowed).
Edit: in principle filenames with a well-defined encoding other than UTF-8, say ISO-8859-1, could be mapped to UTF-16 as well. Almost everyone uses UTF-8 now though, so there hasn't been a need to support this.
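To illustrate the constraint, here is a minimal Python sketch. It is not wimlib's actual code, and the function name `utf8_name_to_utf16le` is made up for this example. It treats a filename as representable only if its bytes decode as UTF-8 (tolerating encoded surrogates), and otherwise reports that no UTF-16LE form exists.

```python
from typing import Optional


def utf8_name_to_utf16le(raw: bytes) -> Optional[bytes]:
    """Return the UTF-16LE form of a UNIX filename, or None if it has none.

    Hypothetical helper for illustration only. "surrogatepass" lets
    unpaired surrogates through in both directions.
    """
    try:
        text = raw.decode("utf-8", errors="surrogatepass")
    except UnicodeDecodeError:
        # Arbitrary bytes that are not valid UTF-8 cannot be mapped.
        return None
    return text.encode("utf-16-le", errors="surrogatepass")


print(utf8_name_to_utf16le(b"caf\xc3\xa9"))  # valid UTF-8
print(utf8_name_to_utf16le(b"caf\xe9"))      # ISO-8859-1 bytes, not UTF-8
```

The second call returns `None`, which is the case the original question runs into.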
synchronicity wrote: ↑Thu Jul 08, 2021 9:37 pm
Edit: in principle filenames with a well-defined encoding other than UTF-8, say ISO-8859-1, could be mapped to UTF-16 as well. Almost everyone uses UTF-8 now though, so there hasn't been a need to support this.
It should even be possible to have a flag that interprets all file names as ISO-8859-1, which would capture every possible file name that might be seen, plus a corresponding flag on apply/extract. ISO-8859-1 has a particular advantage here: the first 256 Unicode code points are exactly its character set, so the conversion is pretty simple.
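A rough Python sketch of that mapping, assuming such a flag existed (it does not today, and the function names below are invented for this example). Each byte becomes the code point with the same value, so every byte string round-trips.

```python
def latin1_name_to_utf16le(raw: bytes) -> bytes:
    """Capture side (hypothetical): byte N becomes code point U+00NN."""
    return raw.decode("iso-8859-1").encode("utf-16-le")


def utf16le_to_latin1_name(stored: bytes) -> bytes:
    """Apply/extract side (hypothetical): reverse the mapping.

    Raises UnicodeEncodeError if the stored name contains a code point
    above U+00FF, i.e. it was not captured with the same flag.
    """
    return stored.decode("utf-16-le").encode("iso-8859-1")


# Every one of the 256 byte values survives the round trip.
every_byte = bytes(range(256))
assert utf16le_to_latin1_name(latin1_name_to_utf16le(every_byte)) == every_byte
```

The catch is that a name captured this way is only restored byte-for-byte if the same flag is used on extract. Without it, UTF-8 file names would come back as mojibake.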
I'd generally agree that assuming UTF-8 is a safe default (especially given how long it took for this issue to arise) and that it causes the fewest surprises, for example with old archives or when moving files to and from Windows.