Pipable - Alternatives to stdin/out

Posted: Tue Dec 13, 2016 6:05 pm
by don
I love the pipable WIM concept, and the ability to capture/apply via STDOUT/STDIN works for a number of use cases. Would you consider adding the ability to work with the pipable data via an API callback? This is a pattern I've seen in compression libraries, and it could be a way to allow preprocessing activities relevant to pipable WIMs:

Chunk
Encrypt
Split
etc.

Tools that work with STDIO (wget/wput) are great for generic HTTP storage mediums. However, some cloud storage providers may have unique constraints that require preprocessing data prior to passing it to their API/CLI. For example, they may have blob size limitations (or upload timeouts for a given chunk size). A callback could allow splitting/joining and other needs. In other words, sometimes a translator is needed between the two ends of the pipe, and as long as the consumer provides valid PWM data back to wimlib, it shouldn't matter that it may have been sitting on cloud storage encrypted in 100 MiB chunks.

Some implementation ideas:

Opt-in flag for performance reasons
Stream chunk size param (so the callback can receive consistent chunks until empty)
Callback position relative to existing STDIN/OUT? Before/After?

The U-Stream library seems relevant to cloning/redirection of STDIO to C-style callbacks.

Don

Re: Pipable - Alternatives to stdin/out

Posted: Wed Dec 14, 2016 1:27 am
by synchronicity
That would make sense for a general-purpose compression library which needs to expose more lower-level functionality, but I don't think it makes as much sense for wimlib. Have you considered just putting programs on the other side of the pipe from wimlib? For example, if you need to upload an archive to cloud storage in chunks, you could pipe the 'wimcapture' output to a process or thread which uploads one chunk at a time.
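A minimal sketch of such a process on the receiving end of the pipe, in Python. It reads a byte stream in fixed-size chunks and hands each chunk to a callback. The names `read_chunks`, `upload_stream`, and `save_chunk_to_disk` are hypothetical, the 100 MiB size is only the figure mentioned earlier in the thread, and writing numbered files to disk stands in for whatever upload call a real storage provider would need.

```python
from typing import BinaryIO, Callable, Iterator

CHUNK_SIZE = 100 * 1024 * 1024  # 100 MiB; an assumed value, adjust to the provider's limit


def read_chunks(stream: BinaryIO, chunk_size: int) -> Iterator[bytes]:
    """Yield consecutive chunk_size blocks from stream; only the last may be shorter."""
    while True:
        buf = bytearray()
        # A pipe may return fewer bytes than requested, so keep reading
        # until the chunk is full or the stream reaches end-of-file.
        while len(buf) < chunk_size:
            data = stream.read(chunk_size - len(buf))
            if not data:
                break
            buf.extend(data)
        if not buf:
            return
        yield bytes(buf)
        if len(buf) < chunk_size:
            return  # short chunk means end-of-file was reached


def upload_stream(stream: BinaryIO,
                  upload_chunk: Callable[[int, bytes], None],
                  chunk_size: int = CHUNK_SIZE) -> int:
    """Pass every chunk of stream to upload_chunk(index, data); return the chunk count."""
    count = 0
    for index, chunk in enumerate(read_chunks(stream, chunk_size)):
        upload_chunk(index, chunk)
        count += 1
    return count


def save_chunk_to_disk(index: int, chunk: bytes) -> None:
    """Hypothetical stand-in for a real upload: write each chunk to a numbered file."""
    with open(f"image.pwm.{index:05d}", "wb") as f:
        f.write(chunk)
```

A script built on this would call `upload_stream(sys.stdin.buffer, save_chunk_to_disk)` and sit on the right-hand side of the pipe, with wimcapture writing the pipable WIM to stdout by giving "-" as the WIM file. Concatenating the chunks in index order should reproduce the original stream, which could then be piped back for applying.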

Re: Pipable - Alternatives to stdin/out

Posted: Wed Dec 14, 2016 4:19 pm
by don
Agreed, it's probably beyond the scope. I thought there might be some additional flexibility around pipable given it's a wimlib-specific format. :) As you know, there are quite a few constraints around the WIM format, and you've given us some great potential freedoms by combining the pipable format with the ability to pipe bits to and from the wimlib API. Lots can be done to a PWM before writing it to, or after reading it from, its final storage location.

So in the case of wimcapture using pipable to STDOUT as it stands today, we lose progress information; I believe it's necessarily silenced because the channel is busy with stream data. What would be a good way to preserve the display of progress information while piping to STDIO (on Windows)? Use a different STDIO channel (STDERR)? Build a stub that relays progress via messages or named pipes?

Don

Re: Pipable - Alternatives to stdin/out

Posted: Thu Dec 15, 2016 2:59 am
by synchronicity
wimlib-imagex already writes its progress information to stderr instead of stdout when capturing or exporting a pipable WIM to stdout. So the two types of data are already separated.
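Because the two streams are separate, a wrapper can consume the archive bytes from stdout while leaving stderr attached to the console. A rough Python sketch of that idea follows. `pump_stdout` is a hypothetical helper name, and it works for any child process that follows this stdout/stderr split; nothing in it is specific to wimlib.

```python
import subprocess
from typing import Callable, Sequence


def pump_stdout(cmd: Sequence[str],
                handle_data: Callable[[bytes], None],
                block_size: int = 1024 * 1024) -> int:
    """Run cmd, pass its stdout to handle_data in blocks, and return its exit code."""
    # stderr is deliberately not redirected: the child inherits the parent's
    # stderr, so anything it prints there (such as progress) stays visible.
    with subprocess.Popen(cmd, stdout=subprocess.PIPE) as proc:
        assert proc.stdout is not None
        while True:
            block = proc.stdout.read(block_size)
            if not block:
                break  # child closed its stdout
            handle_data(block)
    # Leaving the with-block waits for the child, so returncode is set here.
    return proc.returncode
```

To route the progress text somewhere else instead, `stderr=subprocess.PIPE` could be passed to `Popen` and read on a separate thread, so that neither pipe fills up and stalls the child.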