Add gpu implementation #23

benjeffery · 2022-05-28T08:50:21Z

WIP: the code now produces the same mutations. The next step is to get it working on actual hardware; I'm currently hitting version issues with CUDA/cudatoolkit.

benjeffery · 2022-06-01T01:21:54Z

Got this working on actual hardware (RTX 3060)! It currently seems to run at the same speed as the vectorised CPU version. I'm pretty sure I'm not using the GPU correctly yet, though, and there are still a few ifs that can be removed.

benjeffery · 2022-06-02T01:10:05Z

Quick update here: it turns out that the number of sites in a chunk was limited to 10! It seems this was put in place to make the CPU versions finish in good time?
With the current VRAM usage we can fit 500 sites in parallel, which brings us to 10x the per-site speed of the numba vectorised implementation.
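
For anyone following along, here is a rough sketch of how a chunk size could be derived from a memory budget instead of being hard-coded. It is not the code in this PR: `sites_per_chunk`, `iter_chunks`, and the byte counts in the usage note are hypothetical names and made-up numbers for illustration only.

```python
def sites_per_chunk(free_bytes, bytes_per_site, max_sites=None):
    """Return the largest number of sites whose working set fits in free_bytes.

    Both arguments are assumptions supplied by the caller; nothing here
    queries a real device.
    """
    if bytes_per_site <= 0:
        raise ValueError("bytes_per_site must be positive")
    n = free_bytes // bytes_per_site
    if max_sites is not None:
        # Optional cap, e.g. to keep CPU runs short.
        n = min(n, max_sites)
    if n < 1:
        raise ValueError("not even one site fits in the memory budget")
    return n


def iter_chunks(num_sites, chunk_size):
    """Yield half-open (start, stop) ranges that cover range(num_sites)."""
    for start in range(0, num_sites, chunk_size):
        yield start, min(start + chunk_size, num_sites)
```

With an illustrative budget of `8 * 1024**3` bytes and an illustrative `16 * 1024**2` bytes per site, `sites_per_chunk` returns 512; these figures are not measurements from this PR. The optional `max_sites` cap keeps the old behaviour available for the CPU paths.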

bhaller · 2022-06-02T01:16:04Z

@benjeffery after you're done with this, could you make SLiM run on GPUs too? Kthxbye.

benjeffery · 2022-06-02T01:35:44Z

> @benjeffery after you're done with this, could you make SLiM run on GPUs too? Kthxbye.

I'll send over my (exorbitant) consultancy price list.

benjeffery added 2 commits May 28, 2022 02:01

Add gpu implementation

2ad9b1c

Can't use np.repeat?

37166e7

benjeffery and others added 3 commits June 1, 2022 16:59

Simplify to a single function

9144e3e

Can't use array creation in kernel

c56d1e5

Can't use argmax

0ccc3a5
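
The commit titles above suggest that `np.repeat`, array creation, and `argmax` were not usable inside the GPU kernel. A common workaround, sketched here in plain Python, is to use explicit loops over scalars and to write into buffers allocated by the caller. This is an assumption about the general approach, not the code from these commits; `argmax_loop` and `repeat_into` are hypothetical helper names.

```python
def argmax_loop(values, n):
    """Return the index of the first maximum among values[0:n].

    Uses only scalar reads and comparisons, so no temporary array is needed.
    """
    best_index = 0
    best_value = values[0]
    for i in range(1, n):
        if values[i] > best_value:
            best_value = values[i]
            best_index = i
    return best_index


def repeat_into(src, n_src, repeats, out):
    """Write each of the first n_src elements of src `repeats` times into out.

    `out` must be preallocated by the caller with at least n_src * repeats
    slots, which avoids creating an array inside the function.
    """
    for i in range(n_src):
        for j in range(repeats):
            out[i * repeats + j] = src[i]
```

Written this way the helpers only index into existing buffers, which is the style that restricted kernel environments tend to accept. Whether they compile unchanged as device functions would need checking against the numba version in use.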
