
parser error : PCDATA invalid Char value 31 with cyrillic xml

Posted: Mon Feb 22, 2016 10:24 am
by steemandlinux

Code:

wimlib-imagex info boot.wim 
Entity: line 1: parser error : PCDATA invalid Char value 31
D58</HIGHPART><LOWPART>0x37EC5140</LOWPART></LASTMODIFICATIONTIME><DISPLAYNAME>@
                                                                               ^
Entity: line 1: parser error : PCDATA invalid Char value 1
</HIGHPART><LOWPART>0x37EC5140</LOWPART></LASTMODIFICATIONTIME><DISPLAYNAME>@И
                                                                               ^
Entity: line 1: parser error : PCDATA invalid Char value 1
HPART><LOWPART>0x37EC5140</LOWPART></LASTMODIFICATIONTIME><DISPLAYNAME>@ИP И
                                                                               ^
Entity: line 1: parser error : PCDATA invalid Char value 1
STMODIFICATIONTIME><DISPLAYNAME>@ИP И</DISPLAYNAME><DISPLAYDESCRIPTION>P И
                                                                               ^
Entity: line 1: parser error : PCDATA invalid Char value 24
ИP И</DISPLAYNAME><DISPLAYDESCRIPTION>P И</DISPLAYDESCRIPTION><FLAGS>Ђя
                                                                               ^
[ERROR] Unable to parse the WIM file's XML document!
ERROR: Exiting with error code 73:
       The XML data of the WIM is invalid.

Code:

WIM Information:
---------------------
GUID:		{52524113-3099-4213-925F-6265AEE4C28B}
Image Count:	1
Compression:	LZX
Part Number:	1/1
Boot Index:	1
Attributes:	0x8
		RP_FIX

Image Index: 1
-------------------
Name:		Microsoft Windows PE (x86)
Description:	Microsoft Windows PE (x86)
Flags:		Ђя
Files:		6629
Folders:		704
Expanded Size:	704 MB


WIM XML Information:
---------------------------
<WIM>
  <TOTALBYTES>282778488</TOTALBYTES>
  <IMAGE INDEX="1">
    <DESCRIPTION>Microsoft Windows PE (x86)</DESCRIPTION>
    <NAME>Microsoft Windows PE (x86)</NAME>
    <DIRCOUNT>704</DIRCOUNT>
    <FILECOUNT>6629</FILECOUNT>
    <TOTALBYTES>738837540</TOTALBYTES>
    <CREATIONTIME>
      <HIGHPART>0x01D06091</HIGHPART>
      <LOWPART>0x16F397AB</LOWPART>
    </CREATIONTIME>
    <LASTMODIFICATIONTIME>
      <HIGHPART>0x01D16D58</HIGHPART>
      <LOWPART>0x37EC5140</LOWPART>
    </LASTMODIFICATIONTIME>
    <DISPLAYNAME>@ИP И</DISPLAYNAME>
    <DISPLAYDESCRIPTION>P И</DISPLAYDESCRIPTION>
    <FLAGS>Ђя</FLAGS>
    <HARDLINKBYTES>0</HARDLINKBYTES>
    <WINDOWS>
      <ARCH>0</ARCH>
      <PRODUCTNAME>Операционная система Microsoft® Windows®</PRODUCTNAME>
      <EDITIONID>Professional</EDITIONID>
      <INSTALLATIONTYPE>Client</INSTALLATIONTYPE>
      <SERVICINGDATA>
        <GDRDUREVISION>0</GDRDUREVISION>
        <PKEYCONFIGVERSION>6.3.9600.16384;2013-08-21T23:45:30Z</PKEYCONFIGVERSION>
      </SERVICINGDATA>
      <HAL>acpiapic</HAL>
      <PRODUCTTYPE>WinNT</PRODUCTTYPE>
      <PRODUCTSUITE>Terminal Server</PRODUCTSUITE>
      <LANGUAGES>
        <LANGUAGE>ru-RU</LANGUAGE>
        <FALLBACK LANGUAGE="ru-RU">en-US</FALLBACK>
        <DEFAULT>ru-RU</DEFAULT>
      </LANGUAGES>
      <VERSION>
        <MAJOR>6</MAJOR>
        <MINOR>3</MINOR>
        <BUILD>9600</BUILD>
        <SPBUILD>16384</SPBUILD>
        <SPLEVEL>0</SPLEVEL>
      </VERSION>
      <SYSTEMROOT>WINDOWS</SYSTEMROOT>
    </WINDOWS>
    <WIMBOOT>0</WIMBOOT>
  </IMAGE>
</WIM>

Re: parser error : PCDATA invalid Char value 31 with cyrillic xml

Posted: Mon Feb 22, 2016 3:17 pm
by synchronicity
Sounds like the WIM file's XML document contains characters that are forbidden by the XML standard. Where did this file come from?
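For reference, the values in the error messages (31, 1, 24) are code points of C0 control characters, which the XML 1.0 `Char` production excludes. The sketch below is only an illustration of that rule, not anything wimlib itself does; the function name `is_xml10_char` is made up for this example.

```python
def is_xml10_char(cp: int) -> bool:
    """Return True if code point cp matches the XML 1.0 'Char' production.

    Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    """
    return (
        cp in (0x09, 0x0A, 0x0D)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


# The code points reported by the parser in this thread.
reported = [31, 1, 24]
results = {cp: is_xml10_char(cp) for cp in reported}
print(results)

# Cyrillic letters themselves are ordinary, permitted characters.
print(is_xml10_char(ord("И")))
```

So the Cyrillic text is not the problem in itself; the control characters mixed in with it are.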

Re: parser error : PCDATA invalid Char value 31 with cyrillic xml

Posted: Mon Feb 22, 2016 3:40 pm
by steemandlinux
magnet:?xt=urn:btih:3992baf08d904937733210a5c05a7da4b7454f2d&tr=http%3A%2F%2Fusbtor.ru%2Fbt%2Fannounce.php

Is it possible to remove the invalid characters using wimlib?

Re: parser error : PCDATA invalid Char value 31 with cyrillic xml

Posted: Mon Feb 22, 2016 4:11 pm
by synchronicity
I don't think it's realistic to expect wimlib to process incorrect XML documents. wimlib uses a standards-compliant XML parser. You should find out why these characters were inserted in the first place, so that you can solve the problem at its source. For its part, wimlib should always create well-formed XML documents.
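To help track down where the characters come from, it may be useful to list exactly which ones are present and where. The following is a minimal sketch, assuming the XML text has already been obtained and decoded into a Python string by some other means; `find_invalid_chars` and the sample string are hypothetical and are not taken from the WIM in question.

```python
import re

# Matches any single character outside the XML 1.0 'Char' production.
_INVALID_XML10 = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_invalid_chars(text: str) -> list[tuple[int, int]]:
    """Return (offset, code point) for each character XML 1.0 forbids."""
    return [(m.start(), ord(m.group())) for m in _INVALID_XML10.finditer(text)]


# Made-up sample containing two control characters (0x1F and 0x01).
sample = "<NAME>ab\x1fcd\x01</NAME>"
hits = find_invalid_chars(sample)
for offset, cp in hits:
    print(f"offset {offset}: invalid Char value {cp}")
```

This only reports the offending characters; it does not repair the document, and deciding what the affected fields should have contained still requires knowing which tool wrote them.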