X-Git-Url: https://wimlib.net/git/?p=wimlib;a=blobdiff_plain;f=README;h=c911c429916edbdec8b4b4b44e9754013a061ea1;hp=a5cacf67d3daadbcb85cb77f007371338f0dd166;hb=e3dc3c76cf0896eb98f455f2538999d23f95b61a;hpb=8e4a9a0129dabaddef33ee0a99f7f8b221bdf483
diff --git a/README b/README
index a5cacf67..c911c429 100644
--- a/README
+++ b/README
@@ -1,202 +1,393 @@
-                                     WIMLIB
-
-This is wimlib version 0.6.3 (May 2012). wimlib can be used to read, write,
-and mount files in the Windows Imaging Format (WIM files). These files are
-normally created by using the `imagex.exe' utility on Windows, but this library
-provides a free implementetion of imagex for UNIX-based systems.
-
-The main use of this library is to create customized images of Windows PE, the
-Windows Preinstallation Environment, without having to rely on Windows. Windows
-PE is a lightweight version of Windows that can run entirely from memory and can
-be used to install Windows from local media or a network drive or perform
-maintenance. Windows PE is the operating systems runs when you boot from the
-Windows DVD.
-
-You can find Windows PE on the ISO filesystem on the installation DVD for both
-Windows 7 and Windows 8. I don't have a DVD for Vista but it should be on there
-too. The Windows PE image a WIM file, `sources/boot.wim', on the ISO
-filesystem. Windows PE can also be found in the Windows Automated Installation
-Kit (WAIK), which is free to download from Microsoft, inside the `WinPE.cab'
-file, which you can extract if you install the `cabextract' program.
-
-wimlib provides a public API for other programs to use, but also comes with two
-programs: `imagex' and `mkwinpeimg'.
-
-`imagex' is intended to be like the imagex.exe program from Windows. `imagex'
-can be used to create, extract, and mount WIM files. Both read-only and
-read-write mounts are supported. See the man page `doc/imagex.1' for more
-details.
-
-`mkwinpeimg' is shell script that makes it easy to create a customized bootable
-image of Windows PE that can be put on a CD or USB drive, or published on a
-server for PXE booting. See the main page `doc/mkwinpeiso.1' for more details.
-
-Wimlib can also be used to handle larger WIM files such as the `install.wim'
-file that comes on the Windows DVD; however, this has not been well tested.
-
-An earlier version of Wimlib is being used to deploy Windows 7 from the Ultimate
-Deployment Appliance. For more information see
-http://www.ultimatedeployment.org/.
-
--------------------------------------------------------------------------------
-
-                                 CONFIGURATION
-
-Besides the various well-known options, the following options can be passed to
-wimlib's `configure' script:
+                                  INTRODUCTION
+
+This is wimlib version 1.6.3 (May 2014). wimlib is a C library for
+creating, modifying, extracting, and mounting files in the Windows Imaging
+Format (WIM files). These files are normally created using the ImageX
+(imagex.exe) or Dism (Dism.exe) utilities on Windows, but wimlib is distributed
+with a free implementation of ImageX called "wimlib-imagex" for both UNIX-like
+systems and Windows.
+
+                                  INSTALLATION
+
+To install wimlib and wimlib-imagex on Windows you simply need to download and
+extract the ZIP file containing the latest binaries from the SourceForge page
+(http://sourceforge.net/projects/wimlib/), which you may have already done.
+
+To install wimlib and wimlib-imagex on UNIX-like systems (with Linux being the
+primary supported and tested platform), you must compile the source code, which
+is also available at http://sourceforge.net/projects/wimlib/. Alternatively,
+check if a package has been prepared for your Linux distribution. Example files
+for Debian and RPM packaging are in the debian/ and rpm/ directories.
+
+                                   WIM FILES
+
+A Windows Imaging (WIM) file is an archive designed primarily for archiving
+Windows filesystems. However, it can be used on other platforms as well, with
+some limitations. Like some other archive formats such as ZIP, files in WIM
+archives may be compressed. WIM files support multiple compression formats,
+including LZX, XPRESS, and LZMS. All these formats are supported by wimlib.
+
+A WIM file consists of one or more "images". Each image is an independent
+top-level directory structure and is logically separate from all other images in
+the WIM. Each image has a name as well as a 1-based index in the WIM file. To
+save space, WIM archives automatically combine all duplicate files across all
+images.
+
+A WIM file may be either stand-alone or split into multiple parts. Split WIMs
+are read-only and cannot be modified.
+
+Since version 1.6.0, wimlib also supports ESD (.esd) files, except when
+encrypted. These are still WIM files but they use a newer version of the file
+format.
+
+                             IMAGEX IMPLEMENTATION
+
+wimlib itself is a C library, and it provides a documented public API (See:
+http://wimlib.sourceforge.net) for other programs to use. However, it is also
+distributed with a command-line program called "wimlib-imagex" that uses this
+library to implement an imaging tool similar to Microsoft's ImageX.
+wimlib-imagex supports almost all the capabilities of Microsoft's ImageX as well
+as additional capabilities. wimlib-imagex works on both UNIX-like systems and
+Windows, although some features differ between the platforms.
+
+Run `wimlib-imagex' with no arguments to see an overview of the available
+commands and their syntax. For additional documentation:
+
+  * If you have installed wimlib-imagex on a UNIX-like system, you will find
+    further documentation in the man pages; run `man wimlib-imagex' to get
+    started.
+
+  * If you have downloaded the Windows binary distribution, you will find the
+    documentation for wimlib-imagex in PDF format in the "doc" directory,
+    ready for viewing with any PDF viewer. Please note that although the PDF
+    files are converted from UNIX-style "man pages", they do document
+    Windows-specific behavior when appropriate.
+
+                               COMPRESSION RATIO
+
+wimlib (and wimlib-imagex) can create XPRESS, LZX, or LZMS compressed WIM
+archives. wimlib includes its own compression codecs and does not use the
+compression API available on some versions of Windows. The below table provides
+the results (file size, in bytes, and time to create, in seconds) of capturing a
+WIM containing an x86 Windows PE image, using various compression types and
+options. When applicable, the results with the equivalent Microsoft
+implementation in WIMGAPI, which is the library used by ImageX and Dism, are
+included.
+
+    ===========================================================================
+    | Compression type       || wimlib (v1.6.1)       | WIMGAPI (Windows 8)   |
+    ===========================================================================
+    | None [1]               || 531,979,435 in 18s    | 531,980,333 in 24s    |
+    | XPRESS [2]             || 207,369,912 in 22s    | 209,886,010 in 39s    |
+    | LZX (quick) [3]        || 194,876,901 in 29s    | N/A                   |
+    | LZX (normal) [4]       || 187,962,713 in 158s   | 188,163,523 in 125s   |
+    | LZX (slow) [5]         || 186,913,423 in 358s   | N/A                   |
+    | LZMS (non-solid) [6]   || 176,880,594 in 182s   | N/A                   |
+    | LZMS (solid) [7]       || 136,507,304 in 494s   | 126,735,608 in 623s   |
+    ===========================================================================
+
+Notes:
+   [1] '--compress=none' for wimlib-imagex;
+       '/compress none' or no option for ImageX.
+
+   [2] '--compress=fast' or '--compress=XPRESS' for wimlib-imagex;
+       '/compress fast' or no option for ImageX.
+       Compression chunk size is 32768 (the default for XPRESS).
+
+   [3] No compression option specified to wimlib-imagex; no known equivalent for
+       WIMGAPI (ImageX uses XPRESS compression if no option specified).
+       Compression chunk size is 32768 (the default for LZX).
+
+   [4] '--compress=maximum' or '--compress=LZX' for wimlib-imagex;
+       '/compress maximum' for ImageX.
+       Compression chunk size is 32768 (the default for LZX).
+
+   [5] '--compress=maximum --compress-slow' for wimlib-imagex;
+       no known equivalent for WIMGAPI.
+       Compression chunk size is 32768 (the default for LZX).
+
+   [6] '--compress=recovery' or '--compress=LZMS' for wimlib-imagex;
+       no known way to create the equivalent with WIMGAPI.
+       Compression chunk size is 131072 (the default for LZMS). Note: this
+       compression type is not generally recommended due to its limited
+       compatibility with the MS implementations.
+
+   [7] '--compress=recovery --solid' or '--compress=LZMS --solid' for
+       wimlib-imagex; WIMCreateFile with WIM_COMPRESSION_LZMS and flag
+       0x20000000 for WIMGAPI. Compression chunk size in packed resources is
+       33554432 for wimlib, 67108864 for WIMGAPI. Note: this compression type
+       is not generally recommended due to its limited compatibility with the MS
+       implementations. Also, due to the large chunk size, wimlib uses about
+       500MB of memory per thread when compressing in this format.
+
+The above timings were done on Windows 8 (x86) so that side-by-side comparisons
+with the Microsoft implementation would be possible; however, wimlib may have
+even better performance on other operating systems such as Linux. The system
+had 2 CPUs and 2 GiB of memory available. All times were done with the page
+cache warmed, so the times primarily measure the performance of the compression
+algorithms and not the time to read data from disk, which presumably is similar
+in each implementation.
+
+Below are results for compressing the Canterbury corpus using wimlib (v1.6.1),
+WIMGAPI (Windows 8), and some other formats/programs, including the archive size
+only. Note that the Canterbury corpus includes no duplicate files or hard
+links, which WIM handles better than most other formats by storing only distinct
+data streams.
+ + ================================================= + | Format | Size (bytes) | + ================================================= + | tar | 2,826,240 | + | WIM (WIMGAPI, None) | 2,814,278 | + | WIM (wimlib, None) | 2,813,856 | + | WIM (WIMGAPI, XPRESS) | 825,410 | + | WIM (wimlib, XPRESS) | 792,024 | + | tar.gz (gzip, default) | 738,796 | + | ZIP (Info-ZIP, default) | 735,334 | + | tar.gz (gzip, -9) | 733,971 | + | ZIP (Info-ZIP, -9) | 732,297 | + | WIM (wimlib, LZX quick) | 722,196 | + | WIM (WIMGAPI, LZX) | 651,766 | + | WIM (wimlib, LZX normal) | 639,464 | + | WIM (wimlib, LZX slow) | 633,144 | + | WIM (wimlib, LZMS non-solid) | 590,252 | + | tar.bz2 (bzip, default) | 565,008 | + | tar.bz2 (bzip, -9) | 565,008 | + | WIM (wimlib, LZMS solid) | 534,218 | + | WIM (wimlib, LZMS solid, slow) | 529,904 | + | WIM (WIMGAPI, LZMS solid) | 521,232 | + | tar.xz (xz, default) | 486,916 | + | tar.xz (xz, -9) | 486,904 | + | 7z (7-zip, default) | 484,700 | + | 7z (7-zip, -9) | 483,239 | + ================================================= + + NTFS SUPPORT + +WIM images may contain data, such as alternate data streams and +compression/encryption flags, that are best represented on the NTFS filesystem +used on Windows. Also, WIM images may contain security descriptors which are +specific to Windows and cannot be represented on other operating systems. +wimlib handles this NTFS-specific or Windows-specific data in a +platform-dependent way: + + * In the Windows version of wimlib and wimlib-imagex, NTFS-specific and + Windows-specific data are supported natively. + + * In the UNIX version of wimlib and wimlib-imagex, NTFS-specific and + Windows-specific data are ordinarily ignored; however, there is also special + support for capturing and extracting images directly to/from unmounted NTFS + volumes. This was made possible with the help of libntfs-3g from the + NTFS-3g project. 
+
+For both platforms the code for NTFS capture and extraction is complete enough
+that it is possible to apply an image from the "install.wim" contained in recent
+Windows installation media (Vista, Windows 7, or Windows 8) directly to a NTFS
+filesystem, and then boot Windows from it after preparing the Boot Configuration
+Data. In addition, a Windows installation can be captured (or backed up) into a
+WIM file, and then re-applied later.
+
+                                   WINDOWS PE
+
+A major use for wimlib and wimlib-imagex is to create customized images of
+Windows PE, the Windows Preinstallation Environment, on either UNIX-like systems
+or Windows without having to rely on Microsoft's software and its restrictions
+and limitations.
+
+Windows PE is a lightweight version of Windows that can run entirely from memory
+and can be used to install Windows from local media or a network drive or
+perform maintenance. It is the operating system that runs when you boot from
+the Windows installation media.
+
+You can find Windows PE on the installation DVD for Windows Vista, Windows 7, or
+Windows 8, in the file `sources/boot.wim'. Windows PE can also be found in the
+Windows Automated Installation Kit (WAIK), which is free to download from
+Microsoft, inside the `WinPE.cab' file, which you can extract natively on
+Windows, or on UNIX-like systems if you install either the `cabextract' or
+`p7zip' programs.
+
+In addition, Windows installations and recovery partitions frequently contain a
+WIM containing an image of the Windows Recovery Environment, which is similar to
+Windows PE.
+
+A shell script `mkwinpeimg' is distributed with wimlib on UNIX-like systems to
+ease the process of creating and customizing a bootable Windows PE image.
+
+                                  DEPENDENCIES
+
+This section documents the dependencies of wimlib and the programs distributed
+with it, when building for a UNIX-like system from source. If you have
+downloaded the Windows binary distribution of wimlib and wimlib-imagex then all
+dependencies were already included and this section is irrelevant.
+
+* libxml2 (required)
+        This is a commonly used free library to read and write XML files. You
+        likely already have it installed as a dependency for some other program.
+        For more information see http://xmlsoft.org/.
+
+* libfuse (optional but highly recommended)
+        Unless configured with --without-fuse, wimlib requires a non-ancient
+        version of libfuse to be installed. Most Linux distributions already
+        include this, but make sure you have the libfuse package installed, and
+        also libfuse-dev if your distribution distributes header files
+        separately. FUSE also requires a kernel module. If the kernel module
+        is available it will automatically be loaded if you try to mount a WIM
+        file. For more information see http://fuse.sourceforge.net/. FUSE is
+        also available for FreeBSD.
+
+* libntfs-3g (optional but highly recommended)
+        Unless configured with --without-ntfs-3g, wimlib requires the library
+        and headers for libntfs-3g version 2011-4-12 or later to be installed.
+        Versions dated 2010-3-6 and earlier do not work because they are missing
+        the header xattrs.h (and the file xattrs.c, which contains functions we
+        need). libntfs-3g version 2013-1-13 is compatible only with wimlib
+        1.2.4 and later.
+
+* OpenSSL / libcrypto (optional)
+        wimlib can use the SHA1 message digest code from OpenSSL instead of
+        compiling in yet another SHA1 implementation. (See LICENSE section.)
+
+* cdrkit (optional)
+* mtools (optional)
+* syslinux (optional)
+* cabextract (optional)
+        The `mkwinpeimg' shell script will look for several other programs
+        depending on what options are given to it. Depending on your Linux
+        distribution, you may already have these programs installed, or they may
+        be in the software repository. Making an ISO filesystem requires
+        `mkisofs' from `cdrkit' (http://www.cdrkit.org). Making a disk image
+        requires `mtools' (http://www.gnu.org/software/mtools) and `syslinux'
+        (http://www.syslinux.org). Retrieving files from the Windows Automated
+        Installation Kit requires `cabextract' (http://www.cabextract.org.uk).
+
+                                 CONFIGURATION
+
+This section documents the most important options that may be passed to the
+"configure" script when building from source:
+
+--without-ntfs-3g
+        If libntfs-3g is not available or is not version 2011-4-12 or later,
+        wimlib can be built without it, in which case it will not be possible to
+        apply or capture images directly to/from NTFS volumes.

 --without-fuse
         If libfuse or the FUSE kernel module is not available, wimlib can be
         compiled with --without-fuse. This will remove the ability to mount and
-        unmount WIM files. wimlib_mount() and wimlib_unmount() will fail with
-        WIMLIB_ERR_UNSUPPORTED.
+        unmount WIM files.

---disable-libcrypto
+--without-libcrypto
         Build in functions for SHA1 rather than using external SHA1 functions
         from libcrypto (part of OpenSSL). The default is to use libcrypto if it
         is found on the system.

+--enable-xattr, --disable-xattr
+        Enable or disable support for the extended-attributes interface to NTFS
+        alternate data streams in mounted WIMs. To support these, wimlib
+        requires that the setxattr() function and the attr/xattr.h header are
+        available. The default is to autodetect whether support is possible.
+
+--disable-multithreaded-compression
+        By default, data will be compressed using multiple threads when writing
+        a WIM, unless only 1 processor is detected. Specify this option to
+        disable support for this.
+
 --enable-ssse3-sha1
         Use a very fast assembly language implementation of SHA1 from Intel.
         Only use this if the build target supports the SSSE3 instructions.

---disable-custom-memory-allocator
-        If this option is given, MALLOC(), FREE(), CALLOC(), and STRDUP() will
-        directly call the appropriate functions in the C library.
-        wimlib_set_memory_allocator() will fail with WIMLIB_ERR_UNSUPPORTED.
-
---disable-verify-compression
-        Unless this option is given, every time wimlib compresses a data block
-        it will decompress it into a temporary buffer and abort() the program
-        with an error message if the decompressed data does not exactly match
-        the original data. This is to find bugs.
-
 --disable-error-messages
-        Removes all error messages from the library. If left in, they still
-        have to explicitly turned on with wimlib_set_print_errors() in order to
-        see them. Also, error codes will still be returned regardless of
-        whether error messages are printed or not.
-
-        If --disable-error-messages is given, wimlib_set_print_errors() will
-        fail with WIMLIB_ERR_UNSUPPORTED if the action is to turn error messages
-        on.
+        Save some space by removing all error messages from the library.

 --disable-assertions
-        Remove all assertions. Without this option, wimlib will abort() the
-        program if an assertion fails. An assertion failure should only occur
-        if there is a bug in wimlib.
-
---disable-security-data
-        Wimlib cannot create or modify WIM security data, but by default it will
-        copy existing security data when modifying a WIM or exporting an image.
-        Passing this flag will disable this support; then wimlib will always
-        write WIMs without security data.
-
---enable-debug
-        Include debugging messages. Only use this option if you have found a
-        bug in the library.
-
---enable-more-debug
-        Include more debugging messages. Only use this option if you have found
-        a bug in the library.
-
-
--------------------------------------------------------------------------------
-
-                                  DEPENDENCIES
-
-Wimlib requires libxml2 to build. This is a commonly used free library to read
-and write XML files. You likely already have it installed as a dependency for
-some other program. For more information see http://xmlsoft.org/.
-
-Wimlib also requires libfuse to build (unless configured with --without-fuse;
-see above). Most GNU/Linux distributions already include this, but make sure
-you have the libfuse package installed (libfuse-dev if your distribution
-distributes header files separately). FUSE also requires a kernel module. If
-the kernel module is available it will automatically be loaded if you try to
-mount a WIM file. Wimlib has only been tested with the Linux version of FUSE.
-For more information see http://fuse.sourceforge.net/.
-
-The `mkwinpeimg' shell script will look for several other programs depending on
-what options are given to it. Depending on your GNU/Linux distribution, you may
-already have these programs installed, or they may be in the software
-repository. Making an ISO filesystem requires `mkisofs' from `cdrkit'
-(http://www.cdrkit.org). Making a disk image requires `mtools'
-(http://www.gnu.org/software/mtools) and `syslinux' (http://www.syslinux.org).
-Retrieving files from the Windows Automated Installation Kit requires
-`cabextract' (http://www.cabextract.org.uk).
-
-------------------------------------------------------------------------------
-
-                                  PORTABILITY
-
-wimlib has mostly been developed and tested on x86_64 (64-bit) GNU/Linux.
+        Remove assertions included by default.

-It has been tested on x86 (32-bit) GNU/Linux occasionally.
+                                  PORTABILITY

-It can also be compiled and run on FreeBSD.
+wimlib has primarily been tested on Linux and Windows (primarily Windows 7, but
+also Windows XP and Windows 8).

-wimlib should work on big endian machines but it has not been tested.
+wimlib may work on FreeBSD and Mac OS X. However, this is not well tested. If
+you do not have libntfs-3g 2011-4-12 or later available, you must configure
+wimlib with --without-ntfs-3g. On FreeBSD, before mounting a WIM you need to
+load the POSIX message queue module (run `kldload mqueuefs').

-There are no plans to port wimlib to Windows since the programming interface on
-Windows is very different and Microsoft's imagex.exe is already available.
+The code has primarily been tested on x86 and x86_64 CPUs, but it's written to
+be portable to other architectures and I've also tested it on ARM. However,
+although the code is written to correctly deal with endianness, it has not yet
+actually been tested on a big-endian architecture.

-------------------------------------------------------------------------------
+                                   REFERENCES

-                                   REFERENCES
+The WIM file format is partially specified in a document that can be found in
+the Microsoft Download Center. However, this document really only provides an
+overview of the format and is not a formal specification.

-The WIM file format is specified in a document that can be found in the
-Microsoft Download Center. There is a similar document that specifies the LZX
-compression format, and a document that specifies the XPRESS compression format.
-However, some aspects of these formats are poorly documented. Some particularly
-poorly documented parts of the formats have had comments added in various places
-in the library.
+With regards to the supported compression formats:

-lzx-decomp.c, the code to decompress WIM file resources that are compressed
-using LZX compression, is originally based on code from the cabextract project
-(http://www.cabextract.org.uk).
+- Microsoft has official documentation for XPRESS that is of reasonable quality.
+- Microsoft has official documentation for LZX but it contains errors.
+- There does not seem to be any official documentation for LZMS, so my comments
+  and code in src/lzms-decompress.c may in fact be the best documentation
+  available for this particular compression format.

-lzx-comp.c, the code to compress WIM file resources using LZX compression, is
-originally based on code written by Matthew Russotto (www.russotto.net/chm/).
+The code in ntfs-3g_apply.c and ntfs-3g_capture.c uses the NTFS-3g library,
+which is a library for reading and writing to NTFS filesystems (the filesystem
+used by recent versions of Windows). See
+http://www.tuxera.com/community/ntfs-3g-download/ for more information.

-lz.c, the code to find LZ77 matches, is based on code from zlib.
+The LZX decompressor (lzx-decompress.c) was originally based on code from the
+cabextract project (http://www.cabextract.org.uk) but has been rewritten.

-sha1.c and sha1.h, the code to compute SHA1 message digests of WIM resources or
-of the WIM file itself in the case of integrity checks, are based on code from
-GNU coreutils.
+The LZX compressor (lzx-compress.c) was originally based on code written by
+Matthew Russotto (www.russotto.net/chm/) but has been rewritten. It now uses
+suffix array construction code from divsufsort
+(https://code.google.com/p/libdivsufsort/) and algorithms from 7-Zip as well as
+several published papers.

-A very limited number of other free programs can handle some parts of the WIM
-file format. 7-zip is able to extract and create WIMs and files in many other
-archive formats. However, WIMLIB is designed specifically to handle WIM files
-and provides features previously only available in Microsoft's imagex.exe, such
-as the ability to mount WIMs read-write.
+lz_hash.c contains a hash-table-based LZ77 matchfinder that is based on code
+from zlib but has been rewritten. This code is applicable to XPRESS, LZX, and
+LZMS, all of which are partly based on LZ77 compression.

-------------------------------------------------------------------------------
+A limited number of other free programs can handle some parts of the WIM
+file format:

-                                MORE INFORMATION
+  * 7-zip is able to extract and create WIMs (as well as files in many
+    other archive formats). However, wimlib is designed specifically to handle
+    WIM files and provides features previously only available in Microsoft's
+    implementation, such as the ability to mount WIMs read-write as well as
+    read-only, the ability to create LZX or XPRESS compressed WIMs, and the
+    correct handling of security descriptors and hard links.
+  * ImagePyX (https://github.com/maxpat78/ImagePyX) is a Python program that
+    provides similar capabilities to wimlib-imagex. One thing to note, though,
+    is that it does not support compression and decompression by itself, but
+    instead relies on external native code, such as the codecs from wimlib.

-See the manual pages for `imagex', the manual pages for the subcommands of
-`imagex', and the manual page for `mkwinpeimg'.
+A very early version of wimlib is being used to deploy Windows 7 from the
+Ultimate Deployment Appliance. For more information see
+http://www.ultimatedeployment.org/.

-As of version 0.5.0, Wimlib's public API is documented. Doxygen is required to
-build the documentation. To build the documentation, run `configure', then
-enter the directory `doc' and run `doxygen'. The HTML documentation will be
-created in a directory named `html'.
+If you are looking for a UNIX archive format that provides features similar to
+WIM, I recommend you take a look at SquashFS (http://squashfs.sourceforge.net/).

-------------------------------------------------------------------------------
+                                    LICENSE

-                                    LICENSE
+As of version 1.0.0, wimlib and all programs and scripts distributed with it are
+released under the GNU GPL version 3.0 or later. See COPYING for details.
+Some individual source files are also released under more permissive licenses.

-Wimlib is released under the GNU LGPL version 2.1 or later. The files in the
-`programs' directory are released under the GPL version 3.
+wimlib is independently developed and does not contain any code, data, or files
+copyrighted by Microsoft. It is not known to be affected by any patents.

-------------------------------------------------------------------------------
+On UNIX-like systems, if you do not want wimlib to be dynamically linked with
+libcrypto (OpenSSL), configure with --without-libcrypto. This replaces the SHA1
+implementation with built-in code and there will be no difference in
+functionality.
-                                   DISCLAIMER
+                                   DISCLAIMER

-Wimlib is experimental. Use Microsoft's `imagex.exe' if you want to make sure
-your WIM files are made correctly. Please submit a bug report (to
-ebiggers3@gmail.com) if you find a bug.
+wimlib comes with no warranty whatsoever. Please submit a bug report (to
+ebiggers3@gmail.com) if you find a bug in wimlib and/or wimlib-imagex.

-Some parts of the WIM file format are poorly documented or even completely
-undocumented, so these parts had to be reverse engineered.
+Be aware that some parts of the WIM file format are poorly documented or even
+completely undocumented, so I've just had to do the best I can to read and write
+WIMs that appear to be compatible with Microsoft's software.
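
For reference, the archive sizes in the Windows PE capture table of the new
COMPRESSION RATIO section can be restated as percentages of the uncompressed
WIM. The following is a minimal Python sketch and is not part of wimlib. The
byte counts are copied from the wimlib (v1.6.1) column of that table, and the
helper name percent_of_original is only illustrative.

```python
# Sizes in bytes, copied from the wimlib (v1.6.1) column of the
# Windows PE capture table in the COMPRESSION RATIO section.
UNCOMPRESSED = 531_979_435  # "None [1]" row

WIMLIB_SIZES = {
    "XPRESS": 207_369_912,
    "LZX (quick)": 194_876_901,
    "LZX (normal)": 187_962_713,
    "LZX (slow)": 186_913_423,
    "LZMS (non-solid)": 176_880_594,
    "LZMS (solid)": 136_507_304,
}


def percent_of_original(size, original=UNCOMPRESSED):
    """Return size as a percentage of the uncompressed archive size.

    Hypothetical helper for illustration; not a wimlib API.
    """
    return 100.0 * size / original


if __name__ == "__main__":
    for name, size in WIMLIB_SIZES.items():
        print(f"{name:18s} {percent_of_original(size):5.1f}% of uncompressed")
```

Smaller percentages mean better compression. The timings in the same table are
not modeled here, so this only restates the size column.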