Commit
massive improvements in spc clone time, setup.py ver and dep bumps
A fresh pull of all 200 remote datasets now takes about 3 minutes.

NOTE: `spc pull` should NOT BE USED unless you know exactly what you are doing. In the future this functionality will be restored with better performance, but for now it is almost always faster to delete the contents of the dataset folder and express `ds.rchildren` (i.e. iterate over it to completion).

It only took me about 9 months to finally figure out that I had actually fixed many of the pulling performance bottlenecks and that we can almost entirely get rid of the current implementation of pull. As it turns out, I got almost everything sorted out so that it is possible to just call `list(dataset_cache.rchildren)` and the entire tree will populate itself. When we fix the cache constructor this becomes `[rc.materialize() for rc in d.rchildren]` or similar, depending on exactly what we name that method. Better yet, if we do it using a bare for loop then the memory overhead will be zero, because no list of results is accumulated.

The other piece that makes this faster is the completed sparse pull implementation. We now use the remote package count, with a default cutoff of 10k packages, to decide whether a dataset is sparse, meaning that only its metadata files and their parent directories are pulled. The implementation of that is a bit slow, but still about 2 orders of magnitude faster than the alternative. The approach for implementing is_sparse also points the way toward being able to mark folders with additional operational information, e.g. that they should not be exported or that they should not be pulled at all.

Some tweaks to how `spc rmeta` works were also made so that existing metadata will not be repulled in a bulk clone. This work also makes the BlackfynnCache aware of the dataset metadata pulled from rmeta, so we should be able to start comparing ttl file and bf:internal metadata in the near future.
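The two ideas above (populating the tree by iterating a lazy `rchildren`, and skipping non-metadata content for sparse datasets) can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in rather than the real cache code: `Node`, `pull`, `has_metadata_below`, and the `>=` comparison against the cutoff are assumptions made for illustration; only the names `rchildren` and `is_sparse` and the 10k default come from the message above.

```python
from dataclasses import dataclass, field
from typing import Iterator, List

SPARSE_CUTOFF = 10_000  # default package-count cutoff mentioned in the commit message


@dataclass
class Node:
    """Hypothetical stand-in for a cache path; NOT the real BlackfynnCache."""
    name: str
    children: List["Node"] = field(default_factory=list)
    is_metadata: bool = False

    @property
    def rchildren(self) -> Iterator["Node"]:
        """Lazily yield every descendant, depth first."""
        for child in self.children:
            yield child
            yield from child.rchildren


def is_sparse(package_count: int, cutoff: int = SPARSE_CUTOFF) -> bool:
    """Assumption: a dataset is sparse once its package count reaches the cutoff."""
    return package_count >= cutoff


def has_metadata_below(node: Node) -> bool:
    """True if any descendant is a metadata file, so this directory must be kept."""
    return any(d.is_metadata for d in node.rchildren)


def pull(dataset: Node, package_count: int) -> List[str]:
    """Walk the tree with a bare for loop and return the names that would be pulled."""
    sparse = is_sparse(package_count)
    pulled = []
    for node in dataset.rchildren:  # no intermediate list of nodes is built
        if sparse and not (node.is_metadata or has_metadata_below(node)):
            continue  # sparse: skip anything that is not metadata or a parent of it
        pulled.append(node.name)
    return pulled


# Small example tree with two metadata files and two data files.
example = Node("ds", [
    Node("sub-1", [Node("data.bin"), Node("subjects.xlsx", is_metadata=True)]),
    Node("raw", [Node("big.dat")]),
    Node("dataset_description.xlsx", is_metadata=True),
])
```

Calling `pull(example, 5)` walks and keeps every descendant, while `pull(example, 20_000)` keeps only the metadata files and the directories that contain them. The repeated descendant scan in `has_metadata_below` is deliberately naive; a real implementation would presumably decide what to fetch before touching the remote.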
Showing 8 changed files with 143 additions and 23 deletions.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -14,6 +14,7 @@ branches: |
| 14 | 14 | python: |
| 15 | 15 | - 3.6 |
| 16 | 16 | - 3.7 |
| | 17 | + - 3.8 |
| 17 | 18 | |
| 18 | 19 | install: |
| 19 | 20 | - pip install --upgrade pytest pytest-cov |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -1 +1 @@ |
| 1 | | - __version__ = '0.0.1.dev0' |
| | 1 | + __version__ = '0.0.1.dev1' |