X-Git-Url: https://wimlib.net/git/?p=wimlib;a=blobdiff_plain;f=README;h=9a3f26da22c75577aae60e23c9734957ff784441;hp=a5cacf67d3daadbcb85cb77f007371338f0dd166;hb=41fb4d5debd14f3a0e1717bf25e7e145cd42ae1f;hpb=8e4a9a0129dabaddef33ee0a99f7f8b221bdf483 diff --git a/README b/README index a5cacf67..9a3f26da 100644 --- a/README +++ b/README @@ -1,202 +1,383 @@ - WIMLIB - -This is wimlib version 0.6.3 (May 2012). wimlib can be used to read, write, -and mount files in the Windows Imaging Format (WIM files). These files are -normally created by using the `imagex.exe' utility on Windows, but this library -provides a free implementetion of imagex for UNIX-based systems. - -The main use of this library is to create customized images of Windows PE, the -Windows Preinstallation Environment, without having to rely on Windows. Windows -PE is a lightweight version of Windows that can run entirely from memory and can -be used to install Windows from local media or a network drive or perform -maintenance. Windows PE is the operating systems runs when you boot from the -Windows DVD. - -You can find Windows PE on the ISO filesystem on the installation DVD for both -Windows 7 and Windows 8. I don't have a DVD for Vista but it should be on there -too. The Windows PE image a WIM file, `sources/boot.wim', on the ISO -filesystem. Windows PE can also be found in the Windows Automated Installation -Kit (WAIK), which is free to download from Microsoft, inside the `WinPE.cab' -file, which you can extract if you install the `cabextract' program. - -wimlib provides a public API for other programs to use, but also comes with two -programs: `imagex' and `mkwinpeimg'. - -`imagex' is intended to be like the imagex.exe program from Windows. `imagex' -can be used to create, extract, and mount WIM files. Both read-only and -read-write mounts are supported. See the man page `doc/imagex.1' for more -details. - -`mkwinpeimg' is shell script that makes it easy to create a customized bootable -image of Windows PE that can be put on a CD or USB drive, or published on a -server for PXE booting. See the main page `doc/mkwinpeiso.1' for more details. - -Wimlib can also be used to handle larger WIM files such as the `install.wim' -file that comes on the Windows DVD; however, this has not been well tested. - -An earlier version of Wimlib is being used to deploy Windows 7 from the Ultimate -Deployment Appliance. For more information see -http://www.ultimatedeployment.org/. - -------------------------------------------------------------------------------- - - CONFIGURATION - -Besides the various well-known options, the following options can be passed to -wimlib's `configure' script: + INTRODUCTION + +This is wimlib version 1.7.1-BETA (June 2014). wimlib is a C library for +creating, modifying, extracting, and mounting files in the Windows Imaging +Format (WIM files). wimlib and its command-line frontend 'wimlib-imagex' +provide a free and cross-platform alternative to Microsoft's WIMGAPI, ImageX, +and DISM. + + INSTALLATION + +To install wimlib and wimlib-imagex on Windows, simply download and extract the +ZIP file containing the latest binaries from the SourceForge page +(http://sourceforge.net/projects/wimlib/). You probably have already done this! + +To install wimlib and wimlib-imagex on UNIX-like systems (with Linux being the +primary supported and tested platform), you must compile the source code, which +is also available at http://sourceforge.net/projects/wimlib/. Alternatively, +check if a package has been prepared for your Linux distribution. Example files +for Debian and RPM packaging are in the debian/ and rpm/ directories. + + WIM FILES + +A Windows Imaging (WIM) file is an archive designed primarily for archiving +Windows filesystems. However, it can be used on other platforms as well, with +some limitations. Like some other archive formats such as ZIP, files in WIM +archives may be compressed. WIM files support multiple compression formats, +including LZX, XPRESS, and LZMS. All these formats are supported by wimlib. + +A WIM file consists of one or more "images". Each image is an independent +top-level directory structure and is logically separate from all other images in +the WIM. Each image has a name as well as a 1-based index in the WIM file. To +save space, WIM archives automatically combine all duplicate files across all +images. + +A WIM file may be either stand-alone or split into multiple parts. Split WIMs +are read-only and cannot be modified. + +Since version 1.6.0, wimlib also supports ESD (.esd) files, except when +encrypted. These are still WIM files but they use a newer version of the file +format. + + IMAGEX IMPLEMENTATION + +wimlib itself is a C library, and it provides a documented public API (See: +http://wimlib.sourceforge.net) for other programs to use. However, it is also +distributed with a command-line program called "wimlib-imagex" that uses this +library to implement an imaging tool similar to Microsoft's ImageX. +wimlib-imagex supports almost all the capabilities of Microsoft's ImageX as well +as additional capabilities. wimlib-imagex works on both UNIX-like systems and +Windows, although some features differ between the platforms. + +Run `wimlib-imagex' with no arguments to see an overview of the available +commands and their syntax. For additional documentation: + + * If you have installed wimlib-imagex on a UNIX-like system, you will find + further documentation in the man pages; run `man wimlib-imagex' to get + started. + + * If you have downloaded the Windows binary distribution, you will find the + documentation for wimlib-imagex in PDF format in the "doc" directory, + ready for viewing with any PDF viewer. Please note that although the PDF + files are converted from UNIX-style "man pages", they do document + Windows-specific behavior when appropriate. + + COMPRESSION RATIO + +wimlib (and wimlib-imagex) can create XPRESS, LZX, or LZMS compressed WIM files. +wimlib includes its own compression codecs and does not use the compression API +available on some versions of Windows. + +I have gradually been improving the compression codecs in wimlib. For XPRESS +and LZX, they now usually outperform and outcompress the equivalent Microsoft +implementations. Although results will vary depending on the data being +compressed, in the table below I present the results for a common use case: +compressing an x86 Windows PE image. Each row displays the compression type, +the size of the resulting WIM file in bytes, and how many seconds it took to +create the file. When applicable, the results with the equivalent Microsoft +implementation in WIMGAPI is included. + + ============================================================================= + | Compression || wimlib (v1.7.1) | WIMGAPI (Windows 8.1) | + ============================================================================= + | None [1] || 361,182,560 in 3.7s | 361,183,674 in 4.4s | + | XPRESS [2] || 138,349,798 in 5.8s | 140,416,657 in 6.8s | + | XPRESS (slow) [3] || 135,234,072 in 19.5s | N/A | + | LZX (quick) [4] || 131,816,279 in 6.7s | N/A | + | LZX (normal) [5] || 126,808,347 in 28.3s | 127,259,566 in 31.4s | + | LZX (slow) [6] || 126,199,523 in 61.4s | N/A | + | LZMS (non-solid) [7] || 122,083,126 in 30.4s | N/A | + | LZMS (solid) [8] || 93,752,206 in 84.3s | 88,742,238 in 156.1s | + | "WIMBoot" [9] || 167,039,787 in 7.6s | 169,051,718 in 14.9s | + | "WIMBoot" (slow) [10] || 165,141,503 in 15.8s | N/A | + ============================================================================= + +Notes: + [1] '--compress=none' for wimlib-imagex; '/compress:none' for DISM. + + [2] '--compress=XPRESS' for wimlib-imagex; '/compress:fast' for DISM. + Compression chunk size defaults to 32768 bytes in both cases. + + [3] '--compress=XPRESS:80' for wimlib-imagex; no known equivalent for DISM. + Compression chunk size defaults to 32768 bytes. + + [4] '--compress=LZX:20' for wimlib-imagex; no known equivalent for DISM. + Compression chunk size defaults to 32768 bytes. + + [5] '--compress=LZX' or '--compress=LZX:50' or no option for wimlib-imagex; + '/compress:maximum' for DISM. + Compression chunk size defaults to 32768 bytes in both cases. + + [6] '--compress=LZX:100' for wimlib-imagex; no known equivalent for DISM. + Compression chunk size defaults to 32768 bytes. + + [7] '--compress=LZMS' for wimlib-imagex; no known equivalent for DISM. + Compression chunk size defaults to 131072 bytes. + + [8] '--solid' for wimlib-imagex. Should be '/compress:recovery' for DISM, + but only works for /Export-Image, not /Capture-Image. Compression chunk + size in solid blocks defaults to 33554432 for wimlib, 67108864 for DISM. + + [9] '--wimboot' for wimlib-imagex; '/wimboot' for DISM. + This is really XPRESS compression with 4096 byte chunks, so the same as + '--compress=XPRESS --chunk-size=4096'. + + [10] '--wimboot --compress=XPRESS:80' for wimlib-imagex; + no known equivalent for DISM. + Same format as [9], but trying harder to get a good compression ratio. + +Note: wimlib-imagex's --compress option also accepts the "fast", "maximum", and +"recovery" aliases for XPRESS, LZX, and LZMS, respectively. + +Testing environment: + + - 64 bit binaries + - Windows 8.1 virtual machine running on Linux with VT-x + - 2 CPUs and 2 GiB memory given to virtual machine + - SSD-backed virtual disk + - All tests done with page cache warmed + +The compression ratio provided by wimlib is also competitive with commonly used +archive formats. Below are file sizes that result when the Canterbury corpus is +compressed with wimlib (v1.7.0), WIMGAPI (Windows 8), and some other +formats/programs: + + ================================================= + | Format | Size (bytes) | + ================================================= + | tar | 2,826,240 | + | WIM (WIMGAPI, None) | 2,814,278 | + | WIM (wimlib, None) | 2,813,856 | + | WIM (WIMGAPI, XPRESS) | 825,410 | + | WIM (wimlib, XPRESS) | 792,024 | + | tar.gz (gzip, default) | 738,796 | + | ZIP (Info-ZIP, default) | 735,334 | + | tar.gz (gzip, -9) | 733,971 | + | ZIP (Info-ZIP, -9) | 732,297 | + | WIM (wimlib, LZX quick) | 722,196 | + | WIM (WIMGAPI, LZX) | 651,766 | + | WIM (wimlib, LZX normal) | 649,204 | + | WIM (wimlib, LZX slow) | 639,618 | + | WIM (wimlib, LZMS non-solid) | 592,136 | + | tar.bz2 (bzip, default) | 565,008 | + | tar.bz2 (bzip, -9) | 565,008 | + | WIM (wimlib, LZMS solid) | 525,270 | + | WIM (wimlib, LZMS solid, slow) | 521,700 | + | WIM (WIMGAPI, LZMS solid) | 521,232 | + | tar.xz (xz, default) | 486,916 | + | tar.xz (xz, -9) | 486,904 | + | 7z (7-zip, default) | 484,700 | + | 7z (7-zip, -9) | 483,239 | + ================================================= + +Note: WIM does even better on directory trees containing duplicate files, which +the Canterbury corpus doesn't have. + + NTFS SUPPORT + +WIM images may contain data, such as alternate data streams and +compression/encryption flags, that are best represented on the NTFS filesystem +used on Windows. Also, WIM images may contain security descriptors which are +specific to Windows and cannot be represented on other operating systems. +wimlib handles this NTFS-specific or Windows-specific data in a +platform-dependent way: + + * In the Windows version of wimlib and wimlib-imagex, NTFS-specific and + Windows-specific data are supported natively. + + * In the UNIX version of wimlib and wimlib-imagex, NTFS-specific and + Windows-specific data are ordinarily ignored; however, there is also special + support for capturing and extracting images directly to/from unmounted NTFS + volumes. This was made possible with the help of libntfs-3g from the + NTFS-3g project. + +For both platforms the code for NTFS capture and extraction is complete enough +that it is possible to apply an image from the "install.wim" contained in recent +Windows installation media (Vista, Windows 7, or Windows 8) directly to an NTFS +filesystem, and then boot Windows from it after preparing the Boot Configuration +Data. In addition, a Windows installation can be captured (or backed up) into a +WIM file, and then re-applied later. + + WINDOWS PE + +A major use for wimlib and wimlib-imagex is to create customized images of +Windows PE, the Windows Preinstallation Environment, on either UNIX-like systems +or Windows without having to rely on Microsoft's software and its restrictions +and limitations. + +Windows PE is a lightweight version of Windows that can run entirely from memory +and can be used to install Windows from local media or a network drive or +perform maintenance. It is the operating system that runs when you boot from +the Windows installation media. + +You can find Windows PE on the installation DVD for Windows Vista, Windows 7, or +Windows 8, in the file `sources/boot.wim'. Windows PE can also be found in the +Windows Automated Installation Kit (WAIK), which is free to download from +Microsoft, inside the `WinPE.cab' file, which you can extract natively on +Windows, or on UNIX-like systems if you install either the `cabextract' or +`p7zip' programs. + +In addition, Windows installations and recovery partitions frequently contain a +WIM containing an image of the Windows Recovery Environment, which is similar to +Windows PE. + +A shell script `mkwinpeimg' is distributed with wimlib on UNIX-like systems to +ease the process of creating and customizing a bootable Windows PE image. + + DEPENDENCIES + +This section documents the dependencies of wimlib and the programs distributed +with it, when building for a UNIX-like system from source. If you have +downloaded the Windows binary distribution of wimlib and wimlib-imagex then all +dependencies were already included and this section is irrelevant. + +* libxml2 (required) + This is a commonly used free library to read and write XML files. You + likely already have it installed as a dependency for some other program. + For more information see http://xmlsoft.org/. + +* libfuse (optional but highly recommended) + Unless configured with --without-fuse, wimlib requires a non-ancient + version of libfuse to be installed. Most Linux distributions already + include this, but make sure you have the libfuse package installed, and + also libfuse-dev if your distribution distributes header files + separately. FUSE also requires a kernel module. If the kernel module + is available it will automatically be loaded if you try to mount a WIM + file. For more information see http://fuse.sourceforge.net/. FUSE is + also available for FreeBSD. + +* libntfs-3g (optional but highly recommended) + Unless configured with --without-ntfs-3g, wimlib requires the library + and headers for libntfs-3g version 2011-4-12 or later to be installed. + Versions dated 2010-3-6 and earlier do not work because they are missing + the header xattrs.h (and the file xattrs.c, which contains functions we + need). libntfs-3g version 2013-1-13 is compatible only with wimlib + 1.2.4 and later. + +* OpenSSL / libcrypto (optional) + wimlib can use the SHA1 message digest code from OpenSSL instead of + compiling in yet another SHA1 implementation. (See LICENSE section.) + +* cdrkit (optional) +* mtools (optional) +* syslinux (optional) +* cabextract (optional) + The `mkwinpeimg' shell script will look for several other programs + depending on what options are given to it. Depending on your Linux + distribution, you may already have these programs installed, or they may + be in the software repository. Making an ISO filesystem requires + `mkisofs' from `cdrkit' (http://www.cdrkit.org). Making a disk image + requires `mtools' (http://www.gnu.org/software/mtools) and `syslinux' + (http://www.syslinux.org). Retrieving files from the Windows Automated + Installation Kit requires `cabextract' (http://www.cabextract.org.uk). + + CONFIGURATION + +This section documents the most important options that may be passed to the +"configure" script when building from source: + +--without-ntfs-3g + If libntfs-3g is not available or is not version 2011-4-12 or later, + wimlib can be built without it, in which case it will not be possible to + apply or capture images directly to/from NTFS volumes. --without-fuse If libfuse or the FUSE kernel module is not available, wimlib can be compiled with --without-fuse. This will remove the ability to mount and - unmount WIM files. wimlib_mount() and wimlib_unmount() will fail with - WIMLIB_ERR_UNSUPPORTED. + unmount WIM files. ---disable-libcrypto +--without-libcrypto Build in functions for SHA1 rather than using external SHA1 functions from libcrypto (part of OpenSSL). The default is to use libcrypto if it is found on the system. +--disable-multithreaded-compression + By default, data will be compressed using multiple threads when writing + a WIM, unless only 1 processor is detected. Specify this option to + disable support for this. + --enable-ssse3-sha1 Use a very fast assembly language implementation of SHA1 from Intel. Only use this if the build target supports the SSSE3 instructions. ---disable-custom-memory-allocator - If this option is given, MALLOC(), FREE(), CALLOC(), and STRDUP() will - directly call the appropriate functions in the C library. - wimlib_set_memory_allocator() will fail with WIMLIB_ERR_UNSUPPORTED. - ---disable-verify-compression - Unless this option is given, every time wimlib compresses a data block - it will decompress it into a temporary buffer and abort() the program - with an error message if the decompressed data does not exactly match - the original data. This is to find bugs. - --disable-error-messages - Removes all error messages from the library. If left in, they still - have to explicitly turned on with wimlib_set_print_errors() in order to - see them. Also, error codes will still be returned regardless of - whether error messages are printed or not. - - If --disable-error-messages is given, wimlib_set_print_errors() will - fail with WIMLIB_ERR_UNSUPPORTED if the action is to turn error messages - on. + Save some space by removing all error messages from the library. --disable-assertions - Remove all assertions. Without this option, wimlib will abort() the - program if an assertion fails. An assertion failure should only occur - if there is a bug in wimlib. - ---disable-security-data - Wimlib cannot create or modify WIM security data, but by default it will - copy existing security data when modifying a WIM or exporting an image. - Passing this flag will disable this support; then wimlib will always - write WIMs without security data. - ---enable-debug - Include debugging messages. Only use this option if you have found a - bug in the library. - ---enable-more-debug - Include more debugging messages. Only use this option if you have found - a bug in the library. - - -------------------------------------------------------------------------------- - - DEPENDENCIES - -Wimlib requires libxml2 to build. This is a commonly used free library to read -and write XML files. You likely already have it installed as a dependency for -some other program. For more information see http://xmlsoft.org/. - -Wimlib also requires libfuse to build (unless configured with --without-fuse; -see above). Most GNU/Linux distributions already include this, but make sure -you have the libfuse package installed (libfuse-dev if your distribution -distributes header files separately). FUSE also requires a kernel module. If -the kernel module is available it will automatically be loaded if you try to -mount a WIM file. Wimlib has only been tested with the Linux version of FUSE. -For more information see http://fuse.sourceforge.net/. - -The `mkwinpeimg' shell script will look for several other programs depending on -what options are given to it. Depending on your GNU/Linux distribution, you may -already have these programs installed, or they may be in the software -repository. Making an ISO filesystem requires `mkisofs' from `cdrkit' -(http://www.cdrkit.org). Making a disk image requires `mtools' -(http://www.gnu.org/software/mtools) and `syslinux' (http://www.syslinux.org). -Retrieving files from the Windows Automated Installation Kit requires -`cabextract' (http://www.cabextract.org.uk). - ------------------------------------------------------------------------------- - - PORTABILITY - -wimlib has mostly been developed and tested on x86_64 (64-bit) GNU/Linux. - -It has been tested on x86 (32-bit) GNU/Linux occasionally. - -It can also be compiled and run on FreeBSD. - -wimlib should work on big endian machines but it has not been tested. - -There are no plans to port wimlib to Windows since the programming interface on -Windows is very different and Microsoft's imagex.exe is already available. - ------------------------------------------------------------------------------- - - REFERENCES - -The WIM file format is specified in a document that can be found in the -Microsoft Download Center. There is a similar document that specifies the LZX -compression format, and a document that specifies the XPRESS compression format. -However, some aspects of these formats are poorly documented. Some particularly -poorly documented parts of the formats have had comments added in various places -in the library. - -lzx-decomp.c, the code to decompress WIM file resources that are compressed -using LZX compression, is originally based on code from the cabextract project -(http://www.cabextract.org.uk). - -lzx-comp.c, the code to compress WIM file resources using LZX compression, is -originally based on code written by Matthew Russotto (www.russotto.net/chm/). - -lz.c, the code to find LZ77 matches, is based on code from zlib. - -sha1.c and sha1.h, the code to compute SHA1 message digests of WIM resources or -of the WIM file itself in the case of integrity checks, are based on code from -GNU coreutils. - -A very limited number of other free programs can handle some parts of the WIM -file format. 7-zip is able to extract and create WIMs and files in many other -archive formats. However, WIMLIB is designed specifically to handle WIM files -and provides features previously only available in Microsoft's imagex.exe, such -as the ability to mount WIMs read-write. - ------------------------------------------------------------------------------- - - MORE INFORMATION - -See the manual pages for `imagex', the manual pages for the subcommands of -`imagex', and the manual page for `mkwinpeimg'. - -As of version 0.5.0, Wimlib's public API is documented. Doxygen is required to -build the documentation. To build the documentation, run `configure', then -enter the directory `doc' and run `doxygen'. The HTML documentation will be -created in a directory named `html'. - ------------------------------------------------------------------------------- - - LICENSE - -Wimlib is released under the GNU LGPL version 2.1 or later. The files in the -`programs' directory are released under the GPL version 3. - ------------------------------------------------------------------------------- - - DISCLAIMER - -Wimlib is experimental. Use Microsoft's `imagex.exe' if you want to make sure -your WIM files are made correctly. Please submit a bug report (to -ebiggers3@gmail.com) if you find a bug. - -Some parts of the WIM file format are poorly documented or even completely -undocumented, so these parts had to be reverse engineered. + Remove assertions included by default. + + PORTABILITY + +wimlib works on both UNIX-like systems (Linux, Mac OS X, FreeBSD, etc.) and +Windows (XP and later). + +On UNIX-like systems other than Linux, you must compile --without-fuse. +In addition, --without-ntfs-3g and --without-libcrypto are needed if the +corresponding libraries are not available on your system. + +wimlib works on x86 and x86_64, but it should work on any other GCC-supported +32-bit or 64-bit architecture. + + REFERENCES + +The WIM file format is partially specified in a document that can be found in +the Microsoft Download Center. However, this document really only provides an +overview of the format and is not a formal specification. It also does not +cover later extensions of the format, such as solid blocks. + +With regards to the supported compression formats: + +- Microsoft has official documentation for XPRESS that is of reasonable quality. +- Microsoft has official documentation for LZX, but in two different documents, + neither of which is completely applicable to its use in the WIM format, and + the first of which contains multiple errors. +- There does not seem to be any official documentation for LZMS, so my comments + and code in src/lzms-decompress.c may in fact be the best documentation + available for this particular compression format. + +The algorithms used by wimlib's compression and decompression codecs are +inspired by a variety of sources, including open source projects and computer +science papers. + +The code in ntfs-3g_apply.c and ntfs-3g_capture.c uses the NTFS-3g library, +which is a library for reading and writing to NTFS filesystems (the filesystem +used by recent versions of Windows). See +http://www.tuxera.com/community/ntfs-3g-download/ for more information. + +A limited number of other free programs can handle some parts of the WIM +file format: + + * 7-zip is able to extract and create WIMs (as well as files in many + other archive formats). However, wimlib is designed specifically to handle + WIM files and provides features previously only available in Microsoft's + implementation, such as the ability to mount WIMs read-write as well as + read-only, the ability to create compressed WIMs, the correct handling of + security descriptors and hard links, support for LZMS compression, and + support for solid archives. + * ImagePyX (https://github.com/maxpat78/ImagePyX) is a Python program that + provides similar capabilities to wimlib-imagex. One thing to note, though, + is that it does not support compression and decompression by itself, but + instead relies on external native code, such as the codecs from wimlib. + +If you are looking for an archive format that provides features similar to WIM +but was designed primarily for UNIX, you may want to consider SquashFS +(http://squashfs.sourceforge.net/). However, you may find that wimlib works +surprisingly well on UNIX. It will store hard links and symbolic links, and it +has optional support for storing UNIX owners, groups, modes, and special files +such as device nodes and FIFOs. Actually, I use it to back up my own files on +Linux! + + LICENSE AND DISCLAIMER + +See COPYING for information about the license. + +wimlib is independently developed and does not contain any code, data, or files +copyrighted by Microsoft. It is not known to be affected by any patents. + +On UNIX-like systems, if you do not want wimlib to be dynamically linked with +libcrypto (OpenSSL), configure with --without-libcrypto. This replaces the SHA1 +implementation with built-in code and there will be no difference in +functionality. + +wimlib comes with no warranty whatsoever. Please submit a bug report (to +ebiggers3@gmail.com) if you find a bug in wimlib and/or wimlib-imagex.