AMDGPU v0.2.4
Closed issues:
- Implement execution contexts (#16)
- Add/test broadcasting support to ROCArray (#12)
- Add queue/device/system sync functionality (#24)
- Support OpenCL.jl as device runtime (#23)
- Distribute ROCR/ROCT via artifacts (#6)
- Allow setting Private and Group segment sizes manually (#56)
- FATAL ERROR: Symbol "ccalllib_libhsa-runtime64445" not found on AMDGPU (#73)
- Test failures and crashes on 580 (#92)
- Tests allocate memory indefinitely (#106)
- Check for invalid workgroup sizes (#110)
- Add example for gridsize usage and workgroup sizing (#113)
Merged pull requests:
- Silence queue destroy finalizer errors (#98) (@jpsamaroo)
- KernelAbstractions support (#100) (@jpsamaroo)
- Fix kernel argument alignment (#101) (@jpsamaroo)
- Fix alloc_local (#102) (@jpsamaroo)
- Add mapreducedim! (#104) (@jpsamaroo)
- Added support for HSA artifacts (#105) (@0x0f0f0f)
- Update to GPUCompiler 0.10 (#107) (@jpsamaroo)
- Fix CompatHelper config (#108) (@jpsamaroo)
- CompatHelper: bump compat for "AbstractFFTs" to "1.0" (#109) (@github-actions[bot])
- Fix loading of AMDGPU in other packages (#114) (@jpsamaroo)
- Check groupsize, add group/grid docs (#115) (@jpsamaroo)
- Fix memory leak (#118) (@jpsamaroo)