Hi - I trained an Implicit Sequence Model and loaded it into my Flask API for serving locally on my machine, but I cannot seem to get CPU inference working.
The model works correctly when a GPU is available.
Steps to recreate:

1. Run the Flask server locally and load the model on CPU, e.g. `model = torch.load('./my_model_v0.13.pt', map_location='cpu')`
2. POST a JSON payload with sequence values. I've already tested that the server correctly parses the request.
3. The server errors when the model attempts to predict via `preds = model.predict(arr)`:

```
RuntimeError: torch.cuda.LongTensor is not enabled.
```

Full trace below.
```
Traceback (most recent call last):
  File "/Users/aldrinclement/anaconda/lib/python2.7/site-packages/flask/app.py", line 1982, in wsgi_app
    response = self.full_dispatch_request()
  File "/Users/aldrinclement/anaconda/lib/python2.7/site-packages/flask/app.py", line 1614, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/Users/aldrinclement/anaconda/lib/python2.7/site-packages/flask/app.py", line 1517, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/Users/aldrinclement/anaconda/lib/python2.7/site-packages/flask/app.py", line 1612, in full_dispatch_request
    rv = self.dispatch_request()
  File "/Users/aldrinclement/anaconda/lib/python2.7/site-packages/flask/app.py", line 1598, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "main.py", line 77, in predict
    preds = model.predict(arr)
  File "/Users/aldrinclement/anaconda/lib/python2.7/site-packages/spotlight/sequence/implicit.py", line 323, in predict
    sequence_var = gpu(sequences, self._use_cuda)
  File "/Users/aldrinclement/anaconda/lib/python2.7/site-packages/spotlight/torch_utils.py", line 9, in gpu
    return tensor.cuda()
RuntimeError: torch.cuda.LongTensor is not enabled.
```
```python
import torch

model = None

def load_model():
    """Load the pre-trained model; you can use your model just as easily."""
    global model
    model = torch.load('./justlook_v0.13.pt', map_location='cpu')
```
You also need to turn the flag `model._use_cuda` off; otherwise the input will be converted to CUDA tensors: `sequence_var = gpu(sequences, self._use_cuda)`
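For reference, the helper named in the traceback behaves roughly like this (a sketch, not the exact Spotlight source; the keyword name `use_cuda` is mine), which is why setting `model._use_cuda = False` after loading avoids the `.cuda()` call:

```python
import torch

# Stand-in mirroring spotlight.torch_utils.gpu (line 9 of the traceback):
# the tensor is moved to CUDA only when the flag is truthy.
def gpu(tensor, use_cuda=False):
    return tensor.cuda() if use_cuda else tensor

seq = torch.LongTensor([1, 2, 3])

# While model._use_cuda is still True, gpu(seq, use_cuda=True) calls
# .cuda() and raises the RuntimeError on a CPU-only build.

# After `model._use_cuda = False`, the tensor is returned untouched:
cpu_seq = gpu(seq, use_cuda=False)
```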