Add auto-gptq integration #175
base: main
Conversation
Domestic (Chinese) mirror indexes may not have synced auto-gptq yet, so you need to point pip at the official index when installing dependencies.
Thanks for your PR. I took a look at auto-gptq's installation: by default it reinstalls torch and the CUDA extension, which doesn't feel friendly for most users. Could you design a minimal pip-install dependency set for MOSS that can be installed conveniently on top of an existing environment?
@PanQiWei With auto-gptq installed, does quantization no longer require setting up the CUDA environment yourself and building the wheel and PyTorch extension from the GPTQ source? Does auto-gptq require a matching PyTorch/CUDA version, or a particular transformers version?
@Hzfinfdu Regarding ...
Added the use of ...
Has the code not been merged into the main repo yet because there are problems?
I haven't done full end-to-end testing yet. auto-gptq has also released a new version, so compatibility needs to be tested as well; I'll try to get to it over the weekend.
Use auto-gptq to simplify the code and the quantization workflow. With this change, users can run inference with the quantized model whether or not triton is installed, and can even run it on CPU.
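As a rough sketch of what the description above implies, the snippet below loads a GPTQ-quantized checkpoint via auto-gptq's `AutoGPTQForCausalLM.from_quantized` API. The model path is a placeholder (not the actual MOSS checkpoint name), and the kwargs helper is illustrative, not part of this PR.

```python
# Hedged sketch: loading a GPTQ-quantized model with auto-gptq.
# Assumes the classic auto_gptq API; model path below is a placeholder.

def quantized_load_kwargs(device: str = "cpu", use_triton: bool = False) -> dict:
    """Options reflecting the PR description: inference works with or
    without triton installed, and can even run on CPU (device="cpu")."""
    return {"device": device, "use_triton": use_triton}

def load_quantized(model_path: str, **overrides):
    """Load a quantized checkpoint. Not executed here: it requires
    auto-gptq to be installed and the quantized weights on disk."""
    from auto_gptq import AutoGPTQForCausalLM  # deferred import on purpose
    kwargs = quantized_load_kwargs()
    kwargs.update(overrides)
    return AutoGPTQForCausalLM.from_quantized(
        model_path, trust_remote_code=True, **kwargs
    )
```

To use a GPU with the triton kernels instead, one would pass something like `load_quantized("path/to/checkpoint", device="cuda:0", use_triton=True)`; the defaults above show the CPU fallback path the description mentions.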