This repository has been archived by the owner on Apr 9, 2024. It is now read-only.

Work in progress: 1810x increase in extraction speed for solid archives. Fixes issue #21 #22

Open: 2 commits to merge into base: dev


@fartwhif fartwhif commented Sep 9, 2018

Fixes issue #21
Exposed a direct stream getter delegate
Made the stream classes public
Changed the signature of the "extract all files with callback" function
Removed garbage collection bossiness

@fartwhif changed the title from "Work in progress: 1810x increase in extraction speed for solid archives. Fixes #21" to "Work in progress: 1810x increase in extraction speed for solid archives. Fixes issue #21" on Sep 9, 2018

fartwhif commented Sep 9, 2018

It's very hacky. I need to re-hide the stream classes and revert the extract API to its previous signature and method. To extract a single entry rather than everything, it may be necessary to infer the index of the stream 7z is asking for. Don't add an outer loop to do this, as doing so will reintroduce the problem.


fartwhif commented Sep 18, 2018

Large solid archives are incredibly slow, both when extracting the entire archive and when extracting a single file towards the end of the data (a relatively high index). This PR provides a massively faster way to do both. It's still not as good as using 7z normally, but it's orders of magnitude better than before. One thing to note is that the project lacks large (>100MB) archive files to test with. Needless to say, all the C# wrappers and pure .NET libraries seem to suffer from this problem in different ways.
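To illustrate why an outer per-entry loop hurts so much on solid archives, here is a toy cost model. It is only a sketch under stated assumptions, not SevenZipSharp code: it assumes that reaching entry i in a solid block means decoding entries 0..i first, and the class name, method names, and entry sizes are all hypothetical.

```csharp
using System;
using System.Linq;

// Toy cost model, not SevenZipSharp code. Assumption: in a solid archive the
// entries share one compressed stream, so reaching entry i means decoding
// entries 0..i first.
public static class SolidCostModel
{
    // Bytes decoded when every entry is requested separately, i.e. an outer
    // loop that restarts decoding from the start of the solid block each time.
    public static long PerEntryLoop(long[] sizes)
    {
        long total = 0;
        for (int i = 0; i < sizes.Length; i++)
        {
            // Everything up to and including entry i has to be decoded again.
            for (int j = 0; j <= i; j++)
            {
                total += sizes[j];
            }
        }
        return total;
    }

    // Bytes decoded when the block is decoded once and each entry is handed
    // to the caller as its data streams past.
    public static long SinglePass(long[] sizes)
    {
        return sizes.Sum();
    }
}
```

Under this model, with n equally sized entries, the per-entry loop decodes (n + 1) / 2 times as much data as a single pass, so the gap widens as the archive gains entries. For example, calling both methods on `Enumerable.Repeat(100000L, 1000).ToArray()` compares the two strategies for a hypothetical archive of 1,000 entries of 100 KB each.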

@squid-box (Owner) commented

So I'm still not happy about exposing so many internals to support this solution.

Given that the documentation for ExtractFiles (with callback) states that 7-Zip (and any other solid) archives are NOT supported, I'm more inclined to simply add something like:

if (IsSolid)
{
    throw new NotSupportedException("Solid archives are not supported.");
}

Do you actually need the per-file-callback, or could you use the other extraction methods that don't have this slowdown?


fartwhif commented Nov 20, 2018

Yes, my solution essentially needs a file-entry-specific callback for both solid and non-solid archives. If testing is to reveal jaw-dropping problems like this, then the test scenarios need to include a large solid archive containing many file entries, with one objective of extracting a file entry near the end of it and another of extracting all file entries from it. I may have more to add to this PR; I'll have to check tonight. Inferring things through nested loops and wrapping things too much is what causes this. Take it or leave it, I'm just saying THANKS VERY MUCH for getting the ball rolling for me!
