CPU Parallelism when Applying Image

Comments, questions, bug reports, etc.
zipmagic
Posts: 61
Joined: Thu Aug 06, 2015 7:09 am

CPU Parallelism when Applying Image

Post by zipmagic »

Is it possible to leverage CPU parallelism when extracting a WIM image onto an SSD target, especially in WIMBOOT mode, but in other modes as well?

This would be a nice way to max out the underlying HDD while dramatically accelerating processing. One extraction per core would be amazing on SSDs and provide a near-linear performance increase per CPU core. Currently only one CPU core is used.

Some suggestions based on my experience:

- 2 threads max on (S)HDD or RAID 0 array hosted on (S)HDDs.
- X div 2 threads max on SSD, where X is CPU core count - leaving one core per thread for decompression - in regular extraction onto SSD.
- X threads max on SSD, where X is CPU core count - in WIMBOOT extraction onto SSD.
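
As a rough sketch, the three suggestions above could be written as a small helper. The function name, the `target` strings, and the clamping for single-core machines are assumptions made for illustration; nothing like this exists in wimlib.

```python
def suggested_threads(target: str, cores: int, wimboot: bool = False) -> int:
    """Return the maximum extraction thread count suggested in the list above.

    `target` is "hdd" (also covering SSHDs and HDD-backed RAID 0) or "ssd".
    Hypothetical helper: the names are illustrative, not a wimlib API.
    """
    if cores < 1:
        raise ValueError("cores must be >= 1")
    if target == "hdd":
        # 2 threads max; capping at the core count is an added assumption.
        return min(2, cores)
    if target == "ssd":
        if wimboot:
            # WIMBOOT extraction: one thread per core.
            return cores
        # Regular extraction: X div 2, leaving one core per thread for
        # decompression. The floor of 1 for a single core is an assumption.
        return max(1, cores // 2)
    raise ValueError(f"unknown target: {target!r}")
```

For example, `suggested_threads("ssd", 8)` gives 4 threads for a regular extraction, and `suggested_threads("ssd", 8, wimboot=True)` gives 8.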

Would it be possible to implement this feature in wimlib? Does the current architecture permit it?

This idea would work only on those WIM files without solid compression, of course. Otherwise, it would be an exponential slowdown factor!

I would welcome any feedback...
synchronicity
Site Admin
Posts: 472
Joined: Sun Aug 02, 2015 10:31 pm

Re: CPU Parallelism when Applying Image

Post by synchronicity »

Parallel extraction is possible, but difficult. It would require a lot of refactoring of the extraction code. I've considered it before, but there hasn't been a great need for it yet, especially because wimlib's decompressors are so fast. That said, as you probably understand, the usefulness of parallel extraction has been growing over time due to changing hardware, so it's definitely worth continuing to consider.

Note that an alternative to doing the extraction itself in multiple threads is to split reading and decompressing the data into its own thread(s) that run in parallel with the extraction. I believe that would be an easier improvement to make. But unfortunately it wouldn't help much in the "WIMBoot extraction" scenario you're particularly interested in, since that doesn't need the data; the bottleneck in that scenario is really the extraction itself.
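
To illustrate the general shape of that alternative, here is a minimal producer/consumer sketch. It is not wimlib code: `zlib` stands in for the WIM decompressors, and `decompress_worker`, `extract`, `write`, and `max_pending` are hypothetical names chosen for the example.

```python
import queue
import threading
import zlib


def decompress_worker(chunks, out_q):
    """Decompress chunks in order and hand them to the extraction side."""
    for index, blob in enumerate(chunks):
        out_q.put((index, zlib.decompress(blob)))
    out_q.put(None)  # sentinel: no more data


def extract(chunks, write, max_pending=4):
    """Run decompression in its own thread while this thread writes data.

    `write(index, data)` is a caller-supplied callback standing in for the
    real extraction step. The bounded queue keeps the decompressor from
    running arbitrarily far ahead of the writer.
    """
    q = queue.Queue(maxsize=max_pending)
    worker = threading.Thread(
        target=decompress_worker, args=(chunks, q), daemon=True
    )
    worker.start()
    while True:
        item = q.get()
        if item is None:
            break
        write(*item)
    worker.join()
```

The extraction itself stays single-threaded here; only reading and decompression overlap with it, which is why this shape would not help when the data is never needed.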
zipmagic
Posts: 61
Joined: Thu Aug 06, 2015 7:09 am

Re: CPU Parallelism when Applying Image

Post by zipmagic »

Understood. Thank you!