Unicode filenames and paths cannot be used as parameters

Vulpix
Posts: 14
Joined: Fri Jan 25, 2019 7:01 am

Post by Vulpix »

Hello,

Environment: wimlib 1.14.1 (latest) on Windows 11 x64 with the latest updates.

Recently I found out that a bitflip has happened during compression of one of my wim archives, and I wanted to replace the offending file in the wim archive using wimupdate.

However, whether I use --command or a .txt command file (in any encoding), I have not been able to make wimlib replace only the file I wanted, because the path contains Unicode (Japanese) characters.

Is this something that could be supported?
synchronicity
Site Admin
Posts: 472
Joined: Sun Aug 02, 2015 10:31 pm

Re: Unicode filenames and paths cannot be used as parameters

Post by synchronicity »

It is already supported. Can you give a specific example?
Vulpix
Posts: 14
Joined: Fri Jan 25, 2019 7:01 am

Re: Unicode filenames and paths cannot be used as parameters

Post by Vulpix »

Well, I made a test for this.

I made this directory structure:

Code: Select all

F:\Temp\Test\123これはテストです 頑張って\テスト1\
Inside, there was one file called

Code: Select all

イル視点.txt
I created a wim from this the usual way.

Code: Select all

c:\Programs\Programs\wimlib>wimcapture F:\Temp\Test F:\example.wim
Scanning "F:\Temp\Test"
3 bytes scanned (1 files, 3 directories)
Using LZX compression with 1 thread
Archiving file data: 3 bytes of 3 bytes (100%) done
Then I made a command file called

Code: Select all

 example.txt 
and put into it an add command for a file that I wanted to add to this folder. EDIT: I also created the actual source file in that folder, so it does exist.

Code: Select all

add "f:\Temp\Test\123これはテストです 頑張って\テスト1\新しい視点.txt" \
But running this gives me an error.

Code: Select all

c:\Programs\Programs\wimlib>wimupdate F:\example.wim < F:\Temp\example.txt
Scanning "f:\Temp\Test\123ÒüÒÒü»ÒÒ╣ÒÒüºÒüÒÚáÕ╝ÁÒüúÒüª\ÒÒ╣Ò1\µ░ÒüÒüÞªþ╣.txt"
[ERROR] Can't open "f:\Temp\Test\123ÒüÒÒü»ÒÒ╣ÒÒüºÒüÒÚáÕ╝ÁÒüúÒüª\ÒÒ╣Ò1\µ░ÒüÒüÞªþ╣.txt" (status=c000003a): {Path Not Found}
The path %hs does not exist
ERROR: Exiting with error code 47:
       Failed to open a file.
The file is saved as UTF-8. I also tried saving it as UTF-8 with BOM and as each of the UTF-16 variants, but the result was always an error (although a slightly different one each time).

If I open the file in my text editor, it looks normal.
Last edited by Vulpix on Sun May 14, 2023 10:36 am, edited 1 time in total.
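
The garbled path in the output above has the typical look of UTF-8 bytes being decoded with a single-byte code page. Below is a minimal Python sketch of that effect. It assumes code page 850 purely for illustration; the thread does not say which code page was active on this machine, and the exact characters shown by wimlib may differ.

```python
# Illustration only: decode UTF-8 bytes with a single-byte code page.
# "cp850" is an assumption; the actual code page is not stated in the thread.
original = "新しい視点.txt"

utf8_bytes = original.encode("utf-8")

# Each Japanese character is 3 bytes in UTF-8, and a single-byte code page
# turns every one of those bytes into its own (unrelated) character.
garbled = utf8_bytes.decode("cp850")

# No bytes are lost, so reversing the two steps recovers the original name.
restored = garbled.encode("cp850").decode("utf-8")
assert restored == original
```

The file on disk is intact in this scenario; only the interpretation of its bytes is wrong, which is why the file looks normal in a text editor that detects UTF-8.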
synchronicity
Site Admin
Posts: 472
Joined: Sun Aug 02, 2015 10:31 pm

Re: Unicode filenames and paths cannot be used as parameters

Post by synchronicity »

Did you actually create the file "f:\Temp\Test\123これはテストです 頑張って\テスト1\新しい視点.txt"? I don't see that listed in your repro steps. If you didn't do that, it's expected that that file doesn't exist. Perhaps you intended to give a different path for the source?

Also keep in mind, the second argument to an "add" command must be the destination path in the WIM image. I would have expected the second argument to your "add" command to be "\123これはテストです 頑張って\テスト1\新しい視点.txt".
Vulpix
Posts: 14
Joined: Fri Jan 25, 2019 7:01 am

Re: Unicode filenames and paths cannot be used as parameters

Post by Vulpix »

Hi!
Yes, the files do exist in the source folder. I added them to the folder after creating the WIM image. Let me correct the post.

I get your point about the destination path, though here the complaint is about the source path and it is (at least to me) clearly caused by the unicode characters of the path and the file.

As a workaround, I created a folder structure whose top-level path uses only ASCII characters, put just the one replacement file inside it, and asked wimlib to update from that folder as the source. Because only that one file matched a filename at the destination (i.e. inside the WIM archive), only that file was overwritten.

Still, it would have been nicer to point directly at the actual file, but the Unicode path (and, I think, even the filename) does not allow this, at least for now.
synchronicity
Site Admin
Posts: 472
Joined: Sun Aug 02, 2015 10:31 pm

Re: Unicode filenames and paths cannot be used as parameters

Post by synchronicity »

Hi,

It turns out that on Windows, wimlib-imagex interprets the wimupdate command file as either the current Windows code page or UTF-16LE. UTF-16LE is assumed if the file begins with a UTF-16LE BOM or the UTF-16LE encoding of an ASCII character, otherwise the current Windows code page is assumed.

I am thinking of changing it to be autodetected UTF-8 or UTF-16LE, to be consistent with wimextract pathlist files and wimcapture config files which already are interpreted that way. The current Windows code page would no longer be used.

This might solve the problem you're having. However, you should already be able to use arbitrary Unicode characters in wimupdate commands if you use UTF-16LE encoding. Are you sure that doesn't work either?
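
To make the detection rule described above concrete, here is a minimal Python sketch of it. This is not wimlib's code; the function name and return values are hypothetical, and the "ASCII character" check is one reading of the rule (a non-zero 7-bit byte followed by a zero byte).

```python
UTF16LE_BOM = b"\xff\xfe"


def guess_command_file_encoding(data: bytes) -> str:
    """Hypothetical helper: return "utf-16-le" or "codepage" for raw file bytes."""
    # Case 1: the file starts with a UTF-16LE byte order mark.
    if data.startswith(UTF16LE_BOM):
        return "utf-16-le"
    # Case 2: the file starts with an ASCII character encoded as UTF-16LE,
    # i.e. the ASCII byte followed by 0x00 (for example b"a\x00").
    if len(data) >= 2 and 0 < data[0] < 0x80 and data[1] == 0:
        return "utf-16-le"
    # Anything else, including UTF-8 with or without a BOM, is treated as
    # text in the current Windows code page.
    return "codepage"


print(guess_command_file_encoding("add".encode("utf-16-le")))
print(guess_command_file_encoding("add".encode("utf-8")))
```

Under this reading, a UTF-8 file is never recognized as Unicode, with or without a BOM, so non-ASCII paths in it get decoded with the code page.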
Vulpix
Posts: 14
Joined: Fri Jan 25, 2019 7:01 am

Re: Unicode filenames and paths cannot be used as parameters

Post by Vulpix »

Hi!

That sounds promising. It did not actually work correctly using just the commands but maybe I'm missing something. Is there anything specific that I need to do (like set something up in my cmd session) to switch to the proper encoding? I'll try it out if you can help me verify that I'm doing it right.

Thank you for looking into this.
synchronicity
Site Admin
Posts: 472
Joined: Sun Aug 02, 2015 10:31 pm

Re: Unicode filenames and paths cannot be used as parameters

Post by synchronicity »

It should work if the command file that's passed into wimupdate (F:\Temp\example.txt in your example) is UTF-16LE encoded.
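
One way to be sure the command file really is UTF-16LE is to write it from a script rather than rely on an editor's "Save as" dialog. The following Python sketch writes the add command from this thread with an explicit UTF-16LE BOM. The helper name is hypothetical, the destination path is the one suggested earlier in the thread, and the demo writes to a temporary directory instead of F:\Temp.

```python
import tempfile
from pathlib import Path

UTF16LE_BOM = b"\xff\xfe"


def write_utf16le_command_file(path, commands):
    """Hypothetical helper: write one command per line as UTF-16LE with a BOM."""
    text = "".join(command + "\r\n" for command in commands)
    # write_bytes avoids any newline or encoding translation by Python.
    Path(path).write_bytes(UTF16LE_BOM + text.encode("utf-16-le"))


# Source path from the earlier post, destination path inside the WIM image.
COMMAND = (
    r'add "f:\Temp\Test\123これはテストです 頑張って\テスト1\新しい視点.txt" '
    r'"\123これはテストです 頑張って\テスト1\新しい視点.txt"'
)

with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "example.txt"
    write_utf16le_command_file(demo_path, [COMMAND])
    written = demo_path.read_bytes()
```

With the real path in place of the temporary one, the file can then be passed to wimupdate the same way as before: wimupdate F:\example.wim < F:\Temp\example.txt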
synchronicity
Site Admin
Posts: 472
Joined: Sun Aug 02, 2015 10:31 pm

Re: Unicode filenames and paths cannot be used as parameters

Post by synchronicity »

I've posted wimlib v1.14.2-BETA1, which makes the wimupdate command file accept UTF-8.
synchronicity
Site Admin
Posts: 472
Joined: Sun Aug 02, 2015 10:31 pm

Re: Unicode filenames and paths cannot be used as parameters

Post by synchronicity »

This change was released in wimlib v1.14.2.