Skip to content

Pytorch Bindings for warp-ctc Built with VS2019 PyTorch 1.5 CUDA 10.2 Python 3.8

License

Notifications You must be signed in to change notification settings

Alexgolshtein/warp-ctc

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

PyTorch bindings for Warp-ctc

Modified by isses in hzli-ucas/warp-ctc description

Sucessfully built with: Win10 cuda10.2 pip3 python3.8 pytorch 1.5 VS2019(with c++)

Follow all steps in Specific Modifications section if any issus occured

This is an extension onto the original repo found here. This is a branch for Visual Studio 2017/2019 building, with the errors fixed while installing from pytorch_bindings branch. The installation has been successfully performed in the following environments:

Win10 cuda10.2 pip3 python3.8 pytorch1.5 vs2019

Win10 cuda10.1 anaconda3 python3.7 pytorch1.0 vs2017

Win7 cuda9.0 anaconda3 pytorch3.7 pytorch1.0 vs2015

Most of our modifications follow amberblade's (It is a Windows compatible version of SeanNaren's work, but cannot work with pytorch1.0). In case you should modify it yourself, all the necessary modifications can be viewed in the commit fix errors in vs2017 building. You can also refer to "Specific Modifications" for all the modifications and the error messages that would have be encountered during the installation without the modifications.

Installation

Install PyTorch v1.0 or higher.

From within a new warp-ctc clone you could build WarpCTC like this:

You may clone the original repo (git clone https://github.com/SeanNaren/warp-ctc.git) and follow the next steps or clone thes repo.

git clone -b vs2017 https://github.com/Alexgolshtein/warp-ctc.git
cd warp-ctc
mkdir build
cd build
cmake ..

cmake -G "Visual Studio 16 2019" -A x64 ..

If using vs2015 instead of vs2017 for building, replace the cmake -G "Visual Studio 15 2017 Win64" .. with cmake -G "Visual Studio 14 2015 Win64" .. or cmake -G "Visual Studio 16 2019" -A x64 ...

Open the warp-ctc/build/ctc_release.sln with vs2019, and build the project ALL_BUILD (Release x64). Then install the bindings:

cd ../pytorch_binding
python setup.py install

You could possibly get errors in "Test" project - try to unload them.

Go to Specific Modifications if you got issues, then return here.

Set CUDA_HOME and CUDA_PATH in system env vars to your cuda toolkit installation dir in "Proigram files", for example: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2 You can set the WARP_CTC_PATH in system env vars to any directory and place in it the warpctc.dll (You can find it in \warp-ctc\build\Release). If you don't - you could get an exception "Could not find warpctc.dll in {}.\n". You can find another fix of this issue in Specific Modifications .

At last, copy warp-ctc/build/Release/warpctc.dll to the path where you install the warpctc. For example, for the installation performed with Anaconda3 within an environment named "pytorch", the path oughts to be: Anaconda3/envs/pytorch/Lib/site-packages/warpctc_pytorch-0.1-py3.8-win-amd64.egg/warpctc_pytorch/ .

Go to Specific Modifications if you got issues, then return here.

You could use the following commands to check that warpctc_pytorch has been successfully installed:

python tests/test_cpu.py
python tests/test_gpu.py

Example to use the bindings below.

import torch
from warpctc_pytorch import CTCLoss
ctc_loss = CTCLoss()
# expected shape of seqLength x batchSize x alphabet_size
probs = torch.FloatTensor([[[0.1, 0.6, 0.1, 0.1, 0.1], [0.1, 0.1, 0.6, 0.1, 0.1]]]).transpose(0, 1).contiguous()
labels = torch.IntTensor([1, 2])
label_sizes = torch.IntTensor([2])
probs_sizes = torch.IntTensor([2])
probs.requires_grad_(True)  # tells autograd to compute gradients for probs
cost = ctc_loss(probs, labels, probs_sizes, label_sizes)
cost.backward()

Documentation

CTCLoss(size_average=False, length_average=False)
    # size_average (bool): normalize the loss by the batch size (default: False)
    # length_average (bool): normalize the loss by the total number of frames in the batch. If True, supersedes size_average (default: False)

forward(acts, labels, act_lens, label_lens)
    # acts: Tensor of (seqLength x batch x outputDim) containing output activations from network (before softmax)
    # labels: 1 dimensional Tensor containing all the targets of the batch in one large sequence
    # act_lens: Tensor of size (batch) containing size of each output sequence from the network
    # label_lens: Tensor of (batch) containing label length of each example

Specific Modifications

Made by amberblade's and hzli-ucas

When build the project with vs2017 (and 2019), there might be


2>F:/warp-ctc/src/ctc_entrypoint.cu(1): error : this declaration has no storage class or type specifier
2>F:/warp-ctc/src/ctc_entrypoint.cu(1): error : expected a ";"

warp-ctc/src/ctc_entrypoint.cu: change ctc_entrypoint.cpp to #include "ctc_entrypoint.cpp"


2>f:\warp-ctc\include\contrib\moderngpu\include\device\intrinsics.cuh(115): error : identifier "__shfl_up" is undefined
2>f:\warp-ctc\include\contrib\moderngpu\include\device\intrinsics.cuh(125): error : identifier "__shfl_up" is undefined

warp-ctc/CMakeLists.txt: comment out set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -gencode arch=compute_70,code=sm_70")

Also check this list

// GTX 1080, GTX 1070, GTX 1060, GTX 1050, GTX 1030, Titan Xp, Tesla P40, Tesla P4
// -arch=compute_61 -code=sm_61

// Tesla V100
// -arch=compute_70 -code=[sm_70,compute_70]

// GP100/Tesla P100 – DGX-1
// -arch=compute_60 -code=sm_60

// Jetson Tx1
// -arch=compute_51 -code=[sm_51,compute_51]

// Jetson Tx2 or Drive-PX2
// -arch=compute_62 -code=[sm_62,compute_62]

3>f:\warp-ctc\tests\test.h(57): error : namespace "std" has no member "max"

warp-ctc\tests\test.h: add #include <algorithm>


3>test_gpu_generated_test_gpu.cu.obj : error LNK2019: 无法解析的外部符号 get_warpctc_version,该符号在函数 main 中被引用
3>test_gpu_generated_test_gpu.cu.obj : error LNK2019: 无法解析的外部符号 ctcGetStatusString,该符号在函数 "float __cdecl grad_check(int,int,class std::vector<float,class std::allocator<float> > &,class std::vector<class std::vector<int,class std::allocator<int> >,class std::allocator<class std::vector<int,class std::allocator<int> > > > const &,class std::vector<int,class std::allocator<int> > const &)" (?grad_check@@YAMHHAEAV?$vector@MV?$allocator@M@std@@@std@@AEBV?$vector@V?$vector@HV?$allocator@H@std@@@std@@V?$allocator@V?$vector@HV?$allocator@H@std@@@std@@@2@@2@AEBV?$vector@HV?$allocator@H@std@@@2@@Z) 中被引用
3>test_gpu_generated_test_gpu.cu.obj : error LNK2019: 无法解析的外部符号 compute_ctc_loss,该符号在函数 "float __cdecl grad_check(int,int,class std::vector<float,class std::allocator<float> > &,class std::vector<class std::vector<int,class std::allocator<int> >,class std::allocator<class std::vector<int,class std::allocator<int> > > > const &,class std::vector<int,class std::allocator<int> > const &)" (?grad_check@@YAMHHAEAV?$vector@MV?$allocator@M@std@@@std@@AEBV?$vector@V?$vector@HV?$allocator@H@std@@@std@@V?$allocator@V?$vector@HV?$allocator@H@std@@@std@@@2@@2@AEBV?$vector@HV?$allocator@H@std@@@2@@Z) 中被引用
3>test_gpu_generated_test_gpu.cu.obj : error LNK2019: 无法解析的外部符号 get_workspace_size,该符号在函数 "float __cdecl grad_check(int,int,class std::vector<float,class std::allocator<float> > &,class std::vector<class std::vector<int,class std::allocator<int> >,class std::allocator<class std::vector<int,class std::allocator<int> > > > const &,class std::vector<int,class std::allocator<int> > const &)" (?grad_check@@YAMHHAEAV?$vector@MV?$allocator@M@std@@@std@@AEBV?$vector@V?$vector@HV?$allocator@H@std@@@std@@V?$allocator@V?$vector@HV?$allocator@H@std@@@std@@@2@@2@AEBV?$vector@HV?$allocator@H@std@@@2@@Z) 中被引用
3>F:\warp-ctc\build\Release\test_gpu.exe : fatal error LNK1120: 4 个无法解析的外部命令

warp-ctc\include\ctc.h: put __declspec(dllexport) in front of the four variables/functions


When run the following command,

cd ../pytorch_binding
python setup.py install

there might be


(pytorch) F:\warp-ctc\pytorch_binding>python setup.py install
Could not find libwarpctc.so in ../build.
Build warp-ctc and set WARP_CTC_PATH to the location of libwarpctc.so (default is '../build')

warp-ctc/pytorch_binding/steup.py: change .so to .dll change ../build to ../build/Release change libwarpctc to warpctc


binding.obj : error LNK2001: 无法解析的外部符号 "struct THCState * state" (?state@@3PEAUTHCState@@EA)
build\lib.win-amd64-3.7\warpctc_pytorch\_warp_ctc.cp37-win_amd64.pyd : fatal error LNK1120: 1 个无法解析的外部命令
error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Community\\VC\\Tools\\MSVC\\14.16.27023\\bin\\HostX86\\x64\\link.exe' failed with exit status 1120

warp-ctc/pyrotch_binding/src/binding.cpp: change extern THCState* state; to THCState* state = at::globalContext().lazyInitCUDA();


The error might occur using the bindings


>>> import warpctc_pytorch
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\dell\Anaconda3\envs\pytorch\lib\site-packages\warpctc_pytorch-0.1-py3.7-win-amd64.egg\warpctc_pytorch\__init__.py", line 6, in <module>
    from ._warp_ctc import *
ImportError: DLL load failed: 找不到指定的模块。

copy warp-ctc/build/Release/warpctc.dll to where/you/install/the/warpctc_pytorch/

About

Pytorch Bindings for warp-ctc Built with VS2019 PyTorch 1.5 CUDA 10.2 Python 3.8

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages

  • Cuda 60.6%
  • C++ 31.9%
  • Python 4.0%
  • C 2.2%
  • CMake 1.3%