
In dev mode, server is stuck at "Server started at unix:///tmp/text-generation-server-0" #2735

Open · mokeddembillel (Contributor) opened this issue on Nov 10, 2024 · 0 comments

System Info

Using prefix caching = True
Using Attention = flashinfer
WARNING 11-10 11:16:48 ray_utils.py:46] Failed to import Ray with ModuleNotFoundError("No module named 'ray'"). For distributed inference, please install Ray with pip install ray.
/usr/src/server/text_generation_server/layers/gptq/triton.py:242: FutureWarning: torch.cuda.amp.custom_fwd(args...) is deprecated. Please use torch.amp.custom_fwd(args..., device_type='cuda') instead.
@custom_fwd(cast_inputs=torch.float16)
/opt/conda/lib/python3.11/site-packages/mamba_ssm/ops/selective_scan_interface.py:158: FutureWarning: torch.cuda.amp.custom_fwd(args...) is deprecated. Please use torch.amp.custom_fwd(args..., device_type='cuda') instead.
@custom_fwd
/opt/conda/lib/python3.11/site-packages/mamba_ssm/ops/selective_scan_interface.py:231: FutureWarning: torch.cuda.amp.custom_bwd(args...) is deprecated. Please use torch.amp.custom_bwd(args..., device_type='cuda') instead.
@custom_bwd
/opt/conda/lib/python3.11/site-packages/mamba_ssm/ops/triton/layernorm.py:507: FutureWarning: torch.cuda.amp.custom_fwd(args...) is deprecated. Please use torch.amp.custom_fwd(args..., device_type='cuda') instead.
@custom_fwd
/opt/conda/lib/python3.11/site-packages/mamba_ssm/ops/triton/layernorm.py:566: FutureWarning: torch.cuda.amp.custom_bwd(args...) is deprecated. Please use torch.amp.custom_bwd(args..., device_type='cuda') instead.
@custom_bwd
/opt/conda/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:79: FutureWarning: You are using a Backend <class 'text_generation_server.utils.dist.FakeGroup'> as a ProcessGroup. This usage is deprecated since PyTorch 2.0. Please use a public API of PyTorch Distributed instead.
return func(*args, **kwargs)
Using experimental prefill chunking = False

/usr/src/server/text_generation_server/server.py(278)serve_inner()
-> server = aio.server(
(Pdb) c
Server started at unix:///tmp/text-generation-server-0
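For reference, the pdb frame above stops in serve_inner() at the aio.server(...) call. The sketch below is a minimal illustration of that pattern, assuming the standard grpcio asyncio API; it is not TGI's actual code, and the service registration is elided. Once the unix socket is bound and "Server started" is logged, the coroutine parks in wait_for_termination() waiting for RPCs, so from the outside the process looks stuck even though it is simply idle.

```python
# Minimal sketch (not TGI's actual code) of the general shape of
# serve_inner(), assuming the standard grpcio asyncio API.
import asyncio

from grpc import aio

SOCKET = "unix:///tmp/text-generation-server-0"  # path taken from the log above


async def serve_inner():
    server = aio.server()
    # TGI would register its generated text-generation servicer here (elided).
    server.add_insecure_port(SOCKET)
    await server.start()
    print(f"Server started at {SOCKET}")
    # Blocks until shutdown is requested -- the process is idle waiting
    # for RPCs on the socket, not hung.
    await server.wait_for_termination()


if __name__ == "__main__":
    asyncio.run(serve_inner())
```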

Information

  • Docker
  • The CLI directly

Tasks

  • An officially supported command
  • My own modifications

Reproduction

SAFETENSORS_FAST_GPU=1 python text_generation_server/cli.py serve state-spaces/mamba-130
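To check whether the shard is actually serving on the socket rather than truly hung, one can probe it with the public grpc.aio API. This is a hypothetical sketch, not part of TGI; the socket path is copied from the log above.

```python
# Hypothetical probe (not part of TGI): confirm something is listening on
# the shard's unix socket using only the public grpc.aio API.
import asyncio

import grpc


async def probe() -> None:
    channel = grpc.aio.insecure_channel("unix:///tmp/text-generation-server-0")
    try:
        # Raises asyncio.TimeoutError if nothing accepts the connection.
        await asyncio.wait_for(channel.channel_ready(), timeout=5)
        print("shard is listening on the socket")
    finally:
        await channel.close()


asyncio.run(probe())
```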

Expected behavior

The webserver launches after the shard server starts, instead of the process staying stuck at the "Server started" log line.
