Summary: switch from using xxd to bin2c when generating the .ptx.c files so that the PTX data can be null-terminated.
With newer drivers or CUDA versions, VMAF now segfaults when trying to do anything on the GPU. The coredumps indicate that the crash happens somewhere inside the cuModuleLoadData calls in init_fex_cuda.
Documentation for cuModuleLoadData states that its `image` argument can be "obtained by mapping a cubin or PTX or fatbin file, [or] passing a cubin or PTX or fatbin file as a NULL-terminated text string...". It looks like VMAF is trying to do the latter, encoding PTX text files as an ASCII string using xxd, but there's no null terminator in the data because nothing asked for one.

I'm a CUDA noob and don't know how this ever worked on older driver versions, but I tried editing the .ptx.c files by hand to add 0x00 bytes at the end and it worked!
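To illustrate the difference, here is a minimal sketch in plain C (no CUDA required). The array names, their contents, and the `is_null_terminated` helper are hypothetical stand-ins, not the actual generated .ptx.c files. It only shows why an array without a trailing 0x00 is unsafe to hand to an API that expects a NULL-terminated string.

```c
#include <stddef.h>

/* Hypothetical stand-in for an array with no terminator, as described
 * above for the xxd output. Real PTX text is much longer. */
static const unsigned char ptx_unterminated[] = {
    '.', 'v', 'e', 'r', 's', 'i', 'o', 'n'
};

/* The same bytes with a 0x00 appended, as in the hand-edited files. */
static const unsigned char ptx_terminated[] = {
    '.', 'v', 'e', 'r', 's', 'i', 'o', 'n', 0x00
};

/* Returns 1 if the last byte of the array is NUL, 0 otherwise.
 * A consumer that scans for the terminator would read past the end
 * of an array for which this returns 0. */
int is_null_terminated(const unsigned char *data, size_t len)
{
    return len > 0 && data[len - 1] == 0x00;
}
```

A check like `is_null_terminated(ptx_terminated, sizeof ptx_terminated)` could guard the load call in a debug build, though fixing the generator is the more direct remedy.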
Switch from xxd to bin2c (distributed with the cuda-nvcc package), which supports a `--padd` option that can add a null byte to the PTX data, eliminating the segfaults. The arrays got renamed slightly to remove the src_ prefix, since bin2c doesn't do any automatic naming of the output array.

This should resolve #1357