Releases: JuliaGPU/AMDGPU.jl
v0.2.8
v0.2.7
v0.2.6
AMDGPU v0.2.6
Closed issues:
Merged pull requests:
- Add mark/wait synchronization system (#116) (@jpsamaroo)
- CompatHelper: bump compat for "GPUCompiler" to "0.11" (#122) (@github-actions[bot])
- Replace arrays with Refs in ccall. (#123) (@chriselrod)
- Fix packet launch (#125) (@jpsamaroo)
v0.2.6 for Zenodo
Merge pull request #116 from JuliaGPU/jps/mark-wait: Add mark/wait synchronization system
v0.2.5
v0.2.4
AMDGPU v0.2.4
Closed issues:
- Implement execution contexts (#16)
- Add/test broadcasting support to ROCArray (#12)
- Add queue/device/system sync functionality (#24)
- Support OpenCL.jl as device runtime (#23)
- Distribute ROCR/ROCT via artifacts (#6)
- Allow setting Private and Group segment sizes manually (#56)
- FATAL ERROR: Symbol "ccalllib_libhsa-runtime64445" not found on AMDGPU (#73)
- test failures and crashes on 580 (#92)
- Tests allocate memory indefinitely (#106)
- Check for invalid workgroup sizes (#110)
- Add example for gridsize usage and workgroup sizing (#113)
Merged pull requests:
- Silence queue destroy finalizer errors (#98) (@jpsamaroo)
- KernelAbstractions support (#100) (@jpsamaroo)
- Fix kernel argument alignment (#101) (@jpsamaroo)
- Fix alloc_local (#102) (@jpsamaroo)
- Add mapreducedim! (#104) (@jpsamaroo)
- Added support for HSA artifacts (#105) (@0x0f0f0f)
- Update to GPUCompiler 0.10 (#107) (@jpsamaroo)
- Fix CompatHelper config (#108) (@jpsamaroo)
- CompatHelper: bump compat for "AbstractFFTs" to "1.0" (#109) (@github-actions[bot])
- Fix loading of AMDGPU in other packages (#114) (@jpsamaroo)
- Check groupsize, add group/grid docs (#115) (@jpsamaroo)
- Fix memory leak (#118) (@jpsamaroo)
v0.2.3
AMDGPU v0.2.3
Closed issues:
- Add support for trap handlers (#8)
- Unreachable reached in SIISelLowering.cpp due to unhandled AS (#76)
- Ensure that CI tests all available external libraries (#85)
- Only load libhsa-runtime64 major version 1 (#93)
Merged pull requests:
- Disable unreliable pointerinfo tests (#86) (@jpsamaroo)
- Test extlibs are available under CI (#87) (@jpsamaroo)
- Enable device stacktraces (#88) (@jpsamaroo)
- Add ForwardDiff integrations (#90) (@jpsamaroo)
- Look for libhsa-runtime64.so.1 (#94) (@jpsamaroo)
- Remove annoying target-features for now (#95) (@jpsamaroo)
- Bump to 0.2.3, add MacroTools LB (#96) (@jpsamaroo)
v0.2.2
AMDGPU v0.2.2
Closed issues:
- Implement RNGs (#14)
- Allow 0-argument kernels (#10)
- Add tests to match CUDAnative (#22)
- Build fails on OSX (#19)
- Add options to at-roc to allow initializing globals (#36)
- Some errors during test. Are they cause for concern? (#70)
Merged pull requests:
- Implement OCKL wavefront ops (#49) (@jpsamaroo)
- Accept hooks for user-defined globals (#55) (@jpsamaroo)
- Add at-rocprintf support (#61) (@jpsamaroo)
- UB GPUArrays to 5.0/5.1 (#71) (@jpsamaroo)
- rocRAND (#72) (@antholzer)
- Update to Julia 1.6 (#75) (@jpsamaroo)
- Some fixes for Julia 1.6 (#78) (@jpsamaroo)
- Bump GPUArrays to 6 (#79) (@jpsamaroo)
- Add Buildkite CI (#80) (@jpsamaroo)
- Allow 0-argument kernels (#81) (@jpsamaroo)
- Skip build if in AutoMerge (#82) (@jpsamaroo)
v0.2.1
AMDGPU v0.2.1
Closed issues:
- Throw error in wait() call on queue error (#7)
- Test code doesn't work (#43)
- Broken build script? (#46)
- HSASignal and HSAKernelInstance references can be accidentally GC'd (#63)
Merged pull requests:
- Made build and initialization quieter (#47) (@jpsamaroo)
- Refcount HSA objects for safe shutdown (#52) (@jpsamaroo)
- Adapt to GPUCompiler disk cache changes (#53) (@jpsamaroo)
- Disable broken GPUArrays tests (#54) (@jpsamaroo)
- Fixed malloc return type (#59) (@jpsamaroo)
- Remove HSA C files from repo (#60) (@jpsamaroo)
- Don't do set_used on GVs (#62) (@jpsamaroo)
- Add barrier packet support (#65) (@jpsamaroo)
- Bump to 0.2.1, use GPUCompiler 0.8 (#66) (@jpsamaroo)