The current and previous (MS thesis work) implementations of the phenologs pipeline are computationally intensive and take too long to run locally, even with parallel processing. Can the pipeline be sped up with DuckDB by expressing much of the logic as table joins instead of nested Python for-loops?
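To make the question concrete, here is a minimal sketch of the idea, not the pipeline's actual code. It counts shared orthologs between cross-species phenotype pairs twice: once with nested Python loops and once with a single self-join. The table `phenotype_ortholog`, its columns, the toy rows, and both function names are assumptions made up for illustration. The sketch uses DuckDB when it is installed and falls back to the standard-library `sqlite3` module otherwise, since the SQL involved is plain enough to run on both.

```python
try:
    import duckdb as db  # used when DuckDB is installed
except ImportError:
    import sqlite3 as db  # stdlib stand-in so the sketch still runs

# Hypothetical toy data: (species, phenotype, ortholog group).
ROWS = [
    ("human", "HP:1", "OG1"),
    ("human", "HP:1", "OG2"),
    ("human", "HP:2", "OG3"),
    ("mouse", "MP:1", "OG1"),
    ("mouse", "MP:1", "OG2"),
    ("mouse", "MP:2", "OG3"),
    ("mouse", "MP:2", "OG4"),
]


def shared_nested_loops(rows, species_a, species_b):
    """Baseline: compare every phenotype pair with nested Python loops."""
    genes = {}
    for species, phenotype, ortholog in rows:
        genes.setdefault((species, phenotype), set()).add(ortholog)
    result = []
    for (sp_a, ph_a), set_a in genes.items():
        if sp_a != species_a:
            continue
        for (sp_b, ph_b), set_b in genes.items():
            if sp_b != species_b:
                continue
            shared = len(set_a & set_b)
            if shared:
                result.append((ph_a, ph_b, shared))
    return sorted(result)


def shared_join(rows, species_a, species_b):
    """Same counts from one self-join on the ortholog column.

    Assumes each (species, phenotype, ortholog) row is unique;
    duplicates would inflate COUNT(*).
    """
    con = db.connect(":memory:")
    con.execute(
        "CREATE TABLE phenotype_ortholog "
        "(species VARCHAR, phenotype VARCHAR, ortholog VARCHAR)"
    )
    con.executemany(
        "INSERT INTO phenotype_ortholog VALUES (?, ?, ?)",
        [list(r) for r in rows],
    )
    query = """
        SELECT a.phenotype, b.phenotype, COUNT(*) AS shared
        FROM phenotype_ortholog AS a
        JOIN phenotype_ortholog AS b ON a.ortholog = b.ortholog
        WHERE a.species = ? AND b.species = ?
        GROUP BY a.phenotype, b.phenotype
        ORDER BY a.phenotype, b.phenotype
    """
    result = con.execute(query, [species_a, species_b]).fetchall()
    con.close()
    return [tuple(r) for r in result]


print(shared_join(ROWS, "human", "mouse"))
# [('HP:1', 'MP:1', 2), ('HP:2', 'MP:2', 1)]
```

Both functions return the same pairs on the toy data. The join version only produces pairs that share at least one ortholog, so it never visits the empty combinations that the nested loops have to check one by one.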
Noting my first memory/processing issue: it came up when running the 'star join' that generates the cross-product table of every cross-species phenotype combination. This might be a laptop limitation, so I need to test on a more powerful desktop machine for comparison.
Update: On a local machine with 128GB of RAM, the table creation script ran fine and finished within a minute or two. I still need to see how well the downstream operations run.
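One way to anticipate the memory issue described above is to compute the size of the cross-product before materializing it, since the row count is simply the product of the two per-species phenotype counts. The following is a rough sketch under assumed names: the `phenotype` table, its columns, the toy rows, and the helper functions are hypothetical, and `sqlite3` is used as a fallback when DuckDB is not installed.

```python
try:
    import duckdb as db  # used when DuckDB is installed
except ImportError:
    import sqlite3 as db  # stdlib stand-in so the sketch still runs

# Hypothetical toy data: one row per (species, phenotype).
PHENOTYPES = [
    ["human", "HP:1"],
    ["human", "HP:2"],
    ["human", "HP:3"],
    ["mouse", "MP:1"],
    ["mouse", "MP:2"],
]


def make_connection(rows):
    """Load the toy phenotype list into an in-memory database."""
    con = db.connect(":memory:")
    con.execute("CREATE TABLE phenotype (species VARCHAR, phenotype VARCHAR)")
    con.executemany("INSERT INTO phenotype VALUES (?, ?)", rows)
    return con


def estimated_pair_count(con, species_a, species_b):
    """Predict the cross-product size without building it."""
    count_sql = "SELECT COUNT(*) FROM phenotype WHERE species = ?"
    n_a = con.execute(count_sql, [species_a]).fetchone()[0]
    n_b = con.execute(count_sql, [species_b]).fetchone()[0]
    return n_a * n_b


def cross_species_pairs(con, species_a, species_b):
    """Materialize every phenotype pair across the two species."""
    query = """
        SELECT a.phenotype, b.phenotype
        FROM phenotype AS a
        CROSS JOIN phenotype AS b
        WHERE a.species = ? AND b.species = ?
        ORDER BY a.phenotype, b.phenotype
    """
    return con.execute(query, [species_a, species_b]).fetchall()


con = make_connection(PHENOTYPES)
print(estimated_pair_count(con, "human", "mouse"))      # 6
print(len(cross_species_pairs(con, "human", "mouse")))  # 6
```

Running the estimate first on the real tables gives a row count that can be compared against available RAM before committing to the full cross-product on a laptop.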