Bug with GroupConvolutions with Padding using R7 branch #386

clement-masson · 2017-10-30T11:56:44Z

Hi all.

I observed strange behavior for SpatialConvolution modules with groups > 1, padding > 0 and cudnn 7 (using the R7 branch).
With the R7 branch, the output feature map of a single grouped convolution appears to be translated (but only vertically!) compared to the output obtained with the master branch. The translation amplitude also seems to be related (equal?) to the padding used.

Here is a code snippet to reproduce the bug. You must change the torch folder location (first line) before running it. Try changing the padding or removing the groups.

I know Torch Lua maintenance may not be a priority, but I thought this might be an issue coming from somewhere deeper than Lua, so it may help for PyTorch too.

torch_folder='<path to your torch folder>'
here=$(pwd)

# Initialisation. Define the common input tensor and module used by both test runs
rm ./output-*.t7
th -e "
require 'cudnn'
torch.save('input.t7', torch.randn(1,32,200,200))
torch.save('layer.t7', cudnn.SpatialConvolution(32,32,3,3,1,1,2,2, 32))
"

# operations performed once for each version
run_test() # performing a simple forward pass and storing the result
{
th -e "
require 'cudnn'
print(cudnn.version)
input = torch.load('input.t7'):cuda()
layer = torch.load('layer.t7'):cuda()
torch.save(('output-%s.t7'):format(cudnn.version), layer:forward(input))
"
}

# run with cudnn-torch for cudnn 7
cd $torch_folder'/extra/cudnn'
git checkout R7
luarocks make cudnn-scm-1.rockspec
cd $here
run_test

# run with cudnn-torch for cudnn 5
cd $torch_folder'/extra/cudnn'
git checkout master
luarocks make cudnn-scm-1.rockspec
cd $here
run_test

# Observation of differences between the two outputs
qlua -e "
require 'image'
require 'cutorch'
o1 = torch.load('output-5110.t7')[1]
o2 = torch.load('output-7001.t7')[1]
diff = o1:ne(o2)
print('# non equal elements : '.. 100*diff:sum()/diff:nElement() ..'%' )
print(o1[{1,{101,110},{101,110}}])
print(o2[{1,{101,110},{101,110}}])
image.display(diff)
"

Output sample:

# non equal elements : 87.623762376238%
Columns 1 to 6
-2.4412e-03  2.3100e-02  5.0485e-02  1.0684e-02 -4.1076e-02 -1.8460e-01
 4.7165e-02 -1.5568e-01 -9.3024e-02  1.4469e-01 -7.4957e-02  6.9803e-02
 3.5772e-02  5.5186e-02 -3.9895e-02  5.8437e-02  1.2992e-01 -4.9678e-02
 5.6799e-02  3.3620e-02 -3.8907e-03 -2.2967e-01  4.9513e-03 -1.2442e-01
-1.3218e-01 -3.9624e-03 -9.8057e-02 -8.4634e-02 -1.5401e-01 -3.3796e-02
-4.2397e-02 -3.2773e-02  6.3796e-03  7.0577e-02 -4.6901e-02  3.2816e-02
 3.2370e-02 -4.2442e-02 -7.9321e-02  2.6837e-02  7.6467e-02 -8.2963e-02
-6.9260e-02 -1.7496e-02 -2.7724e-02 -1.2335e-01 -8.4141e-02 -1.8470e-01
 3.4064e-02 -4.9115e-02 -4.8077e-02  1.0062e-02  4.5234e-02  5.9185e-02
 1.0122e-01  3.9731e-02  1.1834e-01 -1.2328e-02  7.9523e-02  5.5892e-02

Columns 7 to 10
-1.3808e-02 -1.7866e-02  7.3171e-02  1.1211e-01
 7.1314e-02 -4.7732e-02 -6.0124e-02 -1.0452e-02
 7.2734e-02 -1.3572e-01 -8.4925e-03 -1.3552e-02
 1.2510e-02  1.0214e-02 -7.5096e-02 -2.0804e-02
-1.6500e-01  1.9954e-01 -1.8387e-01 -6.2257e-02
 1.3598e-01 -3.6790e-02  3.0715e-02 -1.3187e-01
-6.1633e-03 -5.1489e-02 -6.5662e-02 -2.0836e-02
-9.9093e-02 -1.0995e-01  5.5576e-02 -9.3252e-02
-5.0548e-02  1.1062e-01  1.6523e-02  9.5098e-03
-2.7006e-02 -6.7763e-06  6.4172e-02  7.3943e-03
[torch.CudaTensor of size 10x10]

Columns 1 to 6
 3.5772e-02  5.5186e-02 -3.9895e-02  5.8437e-02  1.2992e-01 -4.9678e-02
 5.6799e-02  3.3620e-02 -3.8907e-03 -2.2967e-01  4.9513e-03 -1.2442e-01
-1.3218e-01 -3.9624e-03 -9.8057e-02 -8.4634e-02 -1.5401e-01 -3.3796e-02
-4.2397e-02 -3.2773e-02  6.3796e-03  7.0577e-02 -4.6901e-02  3.2816e-02
 3.2370e-02 -4.2442e-02 -7.9321e-02  2.6837e-02  7.6467e-02 -8.2963e-02
-6.9260e-02 -1.7496e-02 -2.7724e-02 -1.2335e-01 -8.4141e-02 -1.8470e-01
 3.4064e-02 -4.9115e-02 -4.8077e-02  1.0062e-02  4.5234e-02  5.9185e-02
 1.0122e-01  3.9731e-02  1.1834e-01 -1.2328e-02  7.9523e-02  5.5892e-02
 1.0939e-01  4.0622e-02  9.8273e-02 -1.2600e-01 -4.8626e-02 -9.3333e-02
-6.8294e-02  3.5513e-02 -1.6634e-01 -1.3823e-01 -2.3921e-02 -3.3631e-02

Columns 7 to 10
 7.2734e-02 -1.3572e-01 -8.4925e-03 -1.3552e-02
 1.2510e-02  1.0214e-02 -7.5096e-02 -2.0804e-02
-1.6500e-01  1.9954e-01 -1.8387e-01 -6.2257e-02
 1.3598e-01 -3.6790e-02  3.0715e-02 -1.3187e-01
-6.1633e-03 -5.1489e-02 -6.5662e-02 -2.0836e-02
-9.9093e-02 -1.0995e-01  5.5576e-02 -9.3252e-02
-5.0548e-02  1.1062e-01  1.6523e-02  9.5098e-03
-2.7006e-02 -6.7763e-06  6.4172e-02  7.3943e-03
-9.5984e-02 -8.3623e-02 -1.1424e-01  2.7539e-03
-1.6859e-01 -1.5069e-01  8.9932e-02  8.5738e-02
[torch.CudaTensor of size 10x10]
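
Comparing the two printed patches row by row suggests that the second one (output-7001, R7 branch) is the first one (output-5110, master) shifted up by two rows, which matches the padding of 2 used in the layer. The following is a minimal sketch that checks this on the first column of each patch only, with the values copied by hand from the output sample above. The `find_shift` helper and the array names are hypothetical and introduced here for illustration; this is not a check on the full tensors.

```shell
#!/usr/bin/env bash
# First column of each printed 10x10 patch, copied from the output sample.
# o1: output-5110 (master branch), o2: output-7001 (R7 branch).
o1=(-2.4412e-03 4.7165e-02 3.5772e-02 5.6799e-02 -1.3218e-01
    -4.2397e-02 3.2370e-02 -6.9260e-02 3.4064e-02 1.0122e-01)
o2=(3.5772e-02 5.6799e-02 -1.3218e-01 -4.2397e-02 3.2370e-02
    -6.9260e-02 3.4064e-02 1.0122e-01 1.0939e-01 -6.8294e-02)

# find_shift (hypothetical helper): print the smallest s >= 0 such that
# o2[i] equals o1[i+s] for every row i where both indices are in range.
# Values are compared as strings, exactly as printed.
find_shift() {
  local n=${#o1[@]} s i ok
  for ((s = 0; s < n; s++)); do
    ok=1
    for ((i = 0; i + s < n; i++)); do
      if [[ "${o2[i]}" != "${o1[i+s]}" ]]; then
        ok=0
        break
      fi
    done
    if ((ok)); then
      echo "$s"
      return 0
    fi
  done
  return 1
}

echo "vertical shift (rows): $(find_shift)"
```

On this sample the sketch reports a shift of 2 rows. That is consistent with the guess that the translation equals the padding, but it only covers one column of one 10x10 patch, so it is a hint rather than a proof.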
