We found an issue in the implementation of avl-tree in the wimlib library. Please kindly check the following details.
#### Reproducing Instructions
* The test driver (native_run.c) is as follows:
Code: Select all
#include <stdlib.h>
#include <stdint.h>
#include <stdio.h>
#include "wimlib/avl_tree.h"
#pragma pack(push, 1)
struct WrapAligned {
char padding[3];
struct Wrap {
int key;
struct avl_tree_node node;
} wrap;
};
#pragma pack(pop)
static int cmp_nodes(const struct avl_tree_node *a, const struct avl_tree_node *b) {
const struct Wrap *wa = avl_tree_entry(a, struct Wrap, node);
const struct Wrap *wb = avl_tree_entry(b, struct Wrap, node);
if (wa->key < wb->key) return -1;
if (wa->key > wb->key) return 1;
return 0;
}
static struct Wrap *make(void) { return (struct Wrap*)malloc(sizeof(struct Wrap)); }
int main(void) {
// Heap manipulation code to ensure last two bits are 11
struct WrapAligned *A_aligned = (struct WrapAligned*)aligned_alloc(256, 256);
struct WrapAligned *B_aligned = (struct WrapAligned*)aligned_alloc(256, 256);
struct Wrap *A = &A_aligned->wrap;
struct Wrap *B = &B_aligned->wrap;
printf("Address A: %p (last byte: 0x%02x, last two bits: %d%d)\n",
(void*)A, (unsigned int)(((uintptr_t)A) & 0xFF),
(int)((((uintptr_t)A) & 2) >> 1), (int)(((uintptr_t)A) & 1));
printf("Address B: %p (last byte: 0x%02x, last two bits: %d%d)\n",
(void*)B, (unsigned int)(((uintptr_t)B) & 0xFF),
(int)((((uintptr_t)B) & 2) >> 1), (int)(((uintptr_t)B) & 1));
struct avl_tree_node *root = NULL;
A->key=10; B->key=20;
avl_tree_insert(&root,&A->node,cmp_nodes);
avl_tree_insert(&root,&B->node,cmp_nodes);
free(A_aligned);
free(B_aligned);
return 0;
}
Code: Select all
gcc -g native_run.c avl_tree.c -I./ -o native_run
gdb ./native_run
Code: Select all
…
Program received signal SIGSEGV, Segmentation fault.
0x0000555555555abb in avl_tree_rebalance_after_insert (root_ptr=0x7fffffffd6a0, inserted=0x55555555c307) at avl_tree.c:513
513 if (node == parent->left)
…
Code: Select all
$ uname -a
Linux symbolism 6.8.0-52-generic #53~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Wed Jan 15 19:18:46 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/11/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 11.4.0-1ubuntu1~22.04' --with-bugurl=file:///usr/share/doc/gcc-11/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-11 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=release --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --enable-cet --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-gcn/usr --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-build-config=bootstrap-lto-lean --enable-link-serialization=2
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04)
#### Justification of the real-world impact
This issue highlights a subtle and critical flaw in the implementation of the avl-tree that could result in undefined behavior, data corruption, or other security vulnerabilities if similar allocation patterns were to arise. A user (or developer) of wimlib might interact with the implementation of avl-tree in a way that creates such allocation patterns, leading to hard-to-diagnose crashes or data integrity issues. Therefore, to enhance the robustness and reliability of the implementation of the avl-tree, we believe that it is important to address this issue in all possible usage scenarios.
#### Root cause analysis
We believe that the root cause of the crash is in the function `avl_get_parent`. The parent of a node is stored in the `parent_balance` field of the `avl_tree_node` struct, with the last two bits used to store balance information. The issue occurs when the value stored in the `parent_balance` field is an address where the last two bits are both 1. In this case, when the `avl_get_parent` function is called, it clears the last two bits of the address to retrieve the parent pointer. However, since the last two bits are 1, clearing them results in an address that points to an incorrect location. In our case, this eventually leads to an out-of-bounds read when accessing the parent node, resulting in a segmentation fault.
#### Affected versions
We found that this issue can be reproduced from a very early released version (i.e., v1.6.2 released on 2014-03-31) to the current v4.14.4.
#### Attached files
We have attached the reproducing script and the test driver code to help you reproduce and investigate the issue. The code of avl-tree implementation is extracted from wimlib-v1.14.4.
Thank you for your attention to this matter. We look forward to your response and will be glad to provide any additional information or assistance needed to resolve this issue.
Best regards,
Haoxin and David