Eric Biggers [Wed, 19 Apr 2017 06:58:03 +0000 (23:58 -0700)]
Improved year 2038 safety
Make wimlib on 32-bit Windows year 2038 safe by doing the following:
- Build both the library and program with 64-bit time_t, being careful
to avoid changing the timespec struct exposed in the API.
- Update wimlib's API to include an extended seconds field in
wimlib_dir_entry for each timestamp, and set it when tv_sec is 32-bit.
- When needing the current time, call GetSystemTimeAsFileTime() instead
of MinGW's gettimeofday().
As a side benefit of switching to the 64-bit time_t functions, 32-bit
wimlib-imagex.exe now prints timestamps prior to year 1970 correctly.
Unfortunately, despite the API improvement, we cannot at this time make
wimlib fully Y2038-safe on 32-bit UNIX, due to lack of OS support.
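The idea of an extended seconds field can be sketched roughly as follows. This is only an illustration: the struct and function names (split_time, split_seconds, join_seconds) are hypothetical and are not wimlib's actual API. It shows a 64-bit seconds count being carried as a 32-bit low part plus a 32-bit extension.

```c
#include <stdint.h>

/* Hypothetical layout, not wimlib's real structures: the low 32 bits are
 * what a 32-bit tv_sec can carry, the high 32 bits are the extension. */
struct split_time {
	uint32_t low;
	uint32_t high;
};

/* Split a 64-bit seconds count into two 32-bit halves. */
static struct split_time split_seconds(int64_t seconds)
{
	uint64_t u = (uint64_t)seconds;
	struct split_time t;

	t.low = (uint32_t)u;
	t.high = (uint32_t)(u >> 32);
	return t;
}

/* Recombine the halves into the original 64-bit seconds count. */
static int64_t join_seconds(struct split_time t)
{
	return (int64_t)(((uint64_t)t.high << 32) | t.low);
}
```

A caller that only reads the low half sees the same value as before for times a 32-bit tv_sec can represent. A caller that also reads the high half recovers the full 64-bit value.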
Eric Biggers [Sun, 29 Jan 2017 05:18:21 +0000 (21:18 -0800)]
avl_tree.h: avoid bad function pointer cast
Casting the 'cmp' function to a different function pointer type normally
compiled to working code, but it was not technically correct and was not
compatible with some control flow integrity (CFI) implementations.
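The general pattern can be illustrated with qsort() standing in for the AVL tree code. This is a sketch, not the avl_tree.h change itself, and struct item and the function names are made up for the example. Instead of casting a typed comparison function to the generic pointer type, a small wrapper with exactly the expected signature is passed.

```c
#include <stdlib.h>

struct item {
	int key;
};

/* Typed comparison: convenient to write, but its type does not match the
 * int (*)(const void *, const void *) that qsort() expects. */
static int cmp_items(const struct item *a, const struct item *b)
{
	return (a->key > b->key) - (a->key < b->key);
}

/* Wrapper with exactly the expected signature, so no function pointer
 * cast is needed and the indirect call is made through a matching type. */
static int cmp_items_generic(const void *a, const void *b)
{
	return cmp_items(a, b);
}

static void sort_items(struct item *items, size_t n)
{
	/* Avoided: passing (int (*)(const void *, const void *))cmp_items */
	qsort(items, n, sizeof(items[0]), cmp_items_generic);
}
```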
Eric Biggers [Sun, 15 Jan 2017 21:34:36 +0000 (13:34 -0800)]
lzx_compress: optimize storing information in lzx_sequence
Pack the literal run length and match length ourselves instead of using
bitfields, and store the actual match length instead of the adjusted
match length. Also make matchlen=0 represent end-of-block, and store
the full main symbol, not just the match header.
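Packing two lengths into one integer by hand, rather than with bitfields, might look like the sketch below. The 9-bit width for the match length and all names here are assumptions chosen for illustration; the real lzx_sequence layout may differ.

```c
#include <stdint.h>

/* Assumed split: low 9 bits hold the match length, the remaining high
 * bits hold the literal run length. */
#define MATCHLEN_BITS 9
#define MATCHLEN_MASK ((UINT32_C(1) << MATCHLEN_BITS) - 1)

static uint32_t pack_seq(uint32_t litrunlen, uint32_t matchlen)
{
	return (litrunlen << MATCHLEN_BITS) | matchlen;
}

static uint32_t seq_litrunlen(uint32_t packed)
{
	return packed >> MATCHLEN_BITS;
}

static uint32_t seq_matchlen(uint32_t packed)
{
	return packed & MATCHLEN_MASK;
}

/* matchlen == 0 marks the end of the block. */
static int seq_is_end_of_block(uint32_t packed)
{
	return seq_matchlen(packed) == 0;
}
```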
Eric Biggers [Sun, 15 Jan 2017 01:00:13 +0000 (17:00 -0800)]
Don't generate GUID in wimlib_create_new_wim()
It's not necessary to generate a GUID in wimlib_create_new_wim() because
one is generated later by wimlib_write(), and nothing seems to assume
that a WIMStruct not yet backed by a file has a valid GUID. This saves
a call to get_random_bytes(). Also remove some unnecessary
initializations to 0.
Eric Biggers [Sat, 14 Jan 2017 08:56:39 +0000 (00:56 -0800)]
lzx_compress: fix corruption with long literal run
The last round of updates to the LZX compressor allowed it to use larger
blocks, up to ~100KB. Unfortunately it was overlooked that this allows
literal runs > 65535 bytes, while in one place the length of a literal
run was still being stored in a u16. Therefore, on incompressible input
data the length could wrap around, causing incorrect compression. Fix
this by enlarging the variable.
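The wraparound itself is easy to reproduce in isolation. The helper names below are hypothetical and only demonstrate what happens when a run length above 65535 is stored in a 16-bit variable versus a 32-bit one.

```c
#include <stdint.h>

/* Storing into 16 bits keeps only the low 16 bits of the length. */
static uint16_t store_litrunlen_u16(uint32_t len)
{
	return (uint16_t)len;
}

/* A 32-bit variable holds any run length that fits in a ~100KB block. */
static uint32_t store_litrunlen_u32(uint32_t len)
{
	return len;
}
```

For example, a run of 70000 literals stored in 16 bits comes back as 70000 - 65536 = 4464.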
Eric Biggers [Sun, 8 Jan 2017 06:34:32 +0000 (22:34 -0800)]
hc_matchfinder: use well-defined initialization of best_matchptr
The initial value of best_matchptr is not truly used, but since we do
always compute 'in_next - best_matchptr', assign an initial value which
avoids undefined behavior.
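A much-simplified sketch of the pattern follows. It is not the hc_matchfinder code; find_offset and its candidate parameter are hypothetical. The point is that the pointer gets an initial value inside the same buffer, so the subtraction is defined even when no match is found.

```c
#include <stddef.h>
#include <stdint.h>

/* 'candidate' is NULL when no match was found; otherwise it points into
 * the same buffer as 'in_next'. */
static ptrdiff_t find_offset(const uint8_t *in_next, const uint8_t *candidate)
{
	/* Initialized to in_next so that 'in_next - best_matchptr' is
	 * well-defined (and 0) when no match replaces it. */
	const uint8_t *best_matchptr = in_next;

	if (candidate != NULL)
		best_matchptr = candidate;

	return in_next - best_matchptr;
}
```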
Eric Biggers [Tue, 27 Dec 2016 23:24:55 +0000 (17:24 -0600)]
Add basic infrastructure for storing xattr items
Define a new tagged metadata item to hold a list of names and values of
Linux-style extended attributes, and prepare for supporting
capture/apply of extended attributes.
I considered making the xattrs a stream instead, referenced from the
tagged item which would just hold a hash. This would have allowed
xattrs to be deduplicated between files. However, I ultimately decided
against this because WIMGAPI and older versions of wimlib would discard
the streams on optimize/export, and extraction would be much more
complicated because xattr streams could come up for extraction before
other streams --- which would be especially problematic for symlinks.
Eric Biggers [Tue, 27 Dec 2016 23:24:55 +0000 (17:24 -0600)]
tagged_items updates
- Expose tagged_item functions in new header tagged_items.h
- Make object_id functions inline functions in object_id.h
- Make inode_get_tagged_item() return stored length, not aligned length
- Add a new function inode_set_tagged_data() which removes existing
items before setting the new one, and use it for inode_set_object_id()
- Make inode_add_tagged_item() append item rather than prepend
- Keep items 8-byte aligned in memory
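The difference between the stored length and the aligned length can be shown with a small rounding helper. The name align8 is made up for this sketch and is not necessarily what the tagged_items code uses.

```c
#include <stddef.h>

/* Round a length up to the next multiple of 8.  The stored length is what
 * the caller asked for; the aligned length is the space the item occupies. */
static size_t align8(size_t len)
{
	return (len + 7) & ~(size_t)7;
}
```

An item with a stored length of 13 bytes would occupy 16 bytes, while a lookup would still report 13.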
Eric Biggers [Tue, 27 Dec 2016 02:27:29 +0000 (20:27 -0600)]
Improve random number generation
wimlib used rand() to generate random numbers, e.g. for GUIDs. This was
neither cryptographically secure nor thread-safe. Use getrandom(),
/dev/urandom, or RtlGenRandom() instead.
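As a rough illustration of one of these sources, the sketch below fills a buffer from /dev/urandom only. The function name fill_random is hypothetical, and the getrandom() and RtlGenRandom() paths are not shown.

```c
#include <stddef.h>
#include <stdio.h>

/* Fill 'buf' with 'len' random bytes read from /dev/urandom.
 * Returns 0 on success, -1 if the device cannot be opened or read. */
static int fill_random(void *buf, size_t len)
{
	FILE *f = fopen("/dev/urandom", "rb");
	size_t n;

	if (f == NULL)
		return -1;
	n = fread(buf, 1, len, f);
	fclose(f);
	return n == len ? 0 : -1;
}
```

Unlike rand(), this keeps no shared generator state in the process, so concurrent callers do not interfere with each other.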
Eric Biggers [Sat, 17 Dec 2016 03:47:44 +0000 (19:47 -0800)]
join.c: clean up verify_swm_set()
UBSAN complained when the parts_to_swms array had 0 length. Clean this
up by sorting the parts first, making the verification simpler. Also
don't bother checking compression_type and chunk_size anymore; checking
guid should be sufficient, and it doesn't really matter if the
compression formats are different since now everything will be written
out correctly anyway.
Eric Biggers [Thu, 15 Dec 2016 04:49:55 +0000 (20:49 -0800)]
Extract sparse files as sparse
When extracting a stream belonging to an inode with
FILE_ATTRIBUTE_SPARSE_FILE set, before writing any data, mark the
extracted stream as sparse if needed and skip preallocating space.
Then, skip writing zero regions. As a result, sparse files remain sparse
after extraction.
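Skipping zero regions can be sketched with plain POSIX calls. This is an analogue under assumptions rather than wimlib's extraction code: write_sparse and the 4096-byte block size are made up, the file is assumed to be newly created with its offset at 0, and the Windows step of marking the stream as sparse is not shown.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define SPARSE_BLOCK 4096  /* assumed granularity for detecting zero runs */

static int is_zero_block(const unsigned char *p, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (p[i] != 0)
			return 0;
	return 1;
}

/* Write 'size' bytes to a new, empty file open on 'fd', seeking over
 * all-zero blocks instead of writing them.  Returns 0 on success. */
static int write_sparse(int fd, const unsigned char *data, size_t size)
{
	size_t pos = 0;

	while (pos < size) {
		size_t n = size - pos < SPARSE_BLOCK ? size - pos : SPARSE_BLOCK;

		if (is_zero_block(data + pos, n)) {
			/* Leave a hole rather than writing zeroes. */
			if (lseek(fd, (off_t)n, SEEK_CUR) == (off_t)-1)
				return -1;
		} else if (write(fd, data + pos, n) != (ssize_t)n) {
			return -1;
		}
		pos += n;
	}
	/* Set the final size in case the data ends with a hole. */
	return ftruncate(fd, (off_t)size);
}
```

Reading the file back returns zeroes for the skipped regions, and whether the holes actually save disk space depends on the file system.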
Eric Biggers [Sat, 8 Oct 2016 02:59:14 +0000 (19:59 -0700)]
mkwinpeimg: use case insensitive mode when updating boot.wim
It was reported that some Windows PE images have a system directory
called "windows" rather than "Windows". Use case insensitive mode to
ensure added files go to the right place.
Eric Biggers [Wed, 27 Jul 2016 00:10:05 +0000 (17:10 -0700)]
libattr is no longer needed
wimlib only uses the extended attributes interface on Linux, where it
appears it is now safe to assume the functions are present in libc (see:
http://lists.nongnu.org/archive/html/acl-devel/2012-04/msg00001.html).
Note: the setfattr program from the "attr" package is still required to
run the NTFS-3G test script.
Eric Biggers [Sat, 9 Jul 2016 17:13:23 +0000 (12:13 -0500)]
configure.ac: Do not check for <sys/param.h>
This header is conditionally included by <ntfs-3g/endians.h>. It defines
too much stuff on certain platforms, e.g. an ALIGN() macro on FreeBSD,
and it appears redundant with other methods of determining the
endianness.
Eric Biggers [Sat, 9 Jul 2016 17:12:14 +0000 (12:12 -0500)]
ntfs-3g_capture.c: include <ntfs-3g/compat.h> to get ENODATA definition
Some platforms, e.g. FreeBSD, do not define ENODATA. On such platforms,
libntfs-3g uses ENOENT instead, and <ntfs-3g/compat.h> defines ENODATA as
ENOENT.
Eric Biggers [Sat, 9 Jul 2016 15:01:25 +0000 (10:01 -0500)]
bitops: rename bit scan functions
Our bit scan functions use 0-based indices and do not allow zero inputs.
Rename them to 'bsr' and 'bsf' to match the x86 instructions and avoid
confusion with another common convention for 'fls' and 'ffs'.
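The 0-based convention can be illustrated with the GCC/Clang builtins. This is a sketch that assumes those compilers and a 32-bit unsigned int; the actual bitops implementation may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Bit Scan Reverse: 0-based index of the most significant set bit.
 * The input must be nonzero. */
static unsigned bsr32(uint32_t v)
{
	assert(v != 0);
	return 31 - (unsigned)__builtin_clz(v);
}

/* Bit Scan Forward: 0-based index of the least significant set bit.
 * The input must be nonzero. */
static unsigned bsf32(uint32_t v)
{
	assert(v != 0);
	return (unsigned)__builtin_ctz(v);
}
```

For comparison, POSIX ffs() uses 1-based indices and returns 0 for a zero input, so ffs(8) is 4 while bsf32(8) is 3.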