Multiple threads question #150

Open
alondhe opened this issue May 30, 2024 · 3 comments

Comments

alondhe commented May 30, 2024

Hello, I'm curious about the comment here on running multi-threaded cohort generation: https://github.com/OHDSI/CohortGenerator/blob/main/R/CohortConstruction.R#L126-L131

Understandably, a dependency tree would be needed to make sure subsets are handled correctly. But I'm wondering whether there are other challenges to implementing it. I think parallelizing would be a huge efficiency gain.

@anthonysena
Collaborator

Hey @alondhe - my intent with generateCohortSet was to support some mode of parallelization for cohort generation, which is why you see the reference to ParallelLogger in that function. I'm now unsure whether putting this into the package is the right approach, as opposed to letting the calling process work out the parallelization. On top of that, the subsetting functionality makes this more complex to parallelize, since all of a subset's dependencies must be generated before the subset itself.

If there is interest in exploring ways to parallelize this, we can discuss it here.
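
For illustration only, the ordering constraint described above (a subset can be generated only after the cohorts it depends on) can be sketched as a topological grouping into "waves". This is a Python sketch rather than CohortGenerator code: the cohort IDs and the `depends_on` mapping are hypothetical, and `generation_waves` is an invented helper name.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def generation_waves(depends_on):
    """Group cohort IDs into waves.

    `depends_on` maps each cohort ID to the set of cohort IDs that must be
    generated first. Every cohort in a wave depends only on cohorts from
    earlier waves, so the members of one wave could run concurrently.
    """
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # cohorts with no pending parents
        waves.append(ready)
        sorter.done(*ready)  # marks them finished, unblocking their subsets
    return waves


# Hypothetical example: 1 and 2 are base cohorts; 1001 and 1002 are subsets
# of cohort 1, and 2001 is a subset of cohort 2.
depends_on = {1: set(), 2: set(), 1001: {1}, 1002: {1}, 2001: {2}}
print(generation_waves(depends_on))  # [[1, 2], [1001, 1002, 2001]]
```

Grouping into waves is the simplest scheme; a finer-grained scheduler could start a subset as soon as its own parent finishes instead of waiting for the whole wave.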

@alondhe
Author

alondhe commented Jun 3, 2024

I am interested for sure. So at this point, all cohort generation via CohortGenerator is serial (1 thread)? Are there any projects in OHDSI you know of that have handled parallel calls to CohortGenerator?

@anthonysena
Collaborator

> I am interested for sure. So at this point, all cohort generation via CohortGenerator is serial (1 thread)?

Correct.

> Are there any projects in OHDSI you know of that have handled parallel calls to CohortGenerator?

Not that I am aware of. This is a bit tricky too since the parallelization depends on your CDM RDBMS utilization - you don't want to overburden your DB and slow down all of the work.
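
One way a calling process might respect that limit is to cap the number of cohorts generated at once. The sketch below is Python and purely illustrative: `fake_generate` is a stand-in for whatever actually runs a cohort's SQL, `run_waves` and `max_workers=2` are invented for the example, and the right cap would depend on what the database can tolerate. In a real setup each worker would presumably also need its own database connection.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_waves(waves, generate_one, max_workers=2):
    """Run waves in order; within a wave, run at most `max_workers` at once."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for wave in waves:
            # Consuming pool.map blocks until the whole wave has finished,
            # so no subset starts before its parents are done.
            for cohort_id, status in zip(wave, pool.map(generate_one, wave)):
                results[cohort_id] = status
    return results


# Stand-in worker that records how many calls overlap (assumption: real
# generation would execute SQL against the CDM instead of sleeping).
_lock = threading.Lock()
_active = 0
peak_active = 0


def fake_generate(cohort_id):
    global _active, peak_active
    with _lock:
        _active += 1
        peak_active = max(peak_active, _active)
    time.sleep(0.01)  # simulate database work
    with _lock:
        _active -= 1
    return "COMPLETE"


waves = [[1, 2], [1001, 1002, 2001]]  # hypothetical cohort IDs
results = run_waves(waves, fake_generate, max_workers=2)
print(results[2001], peak_active <= 2)  # COMPLETE True
```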
