Commit

Merge branch 'powerinfer' into ibaldoall
ibaldonl committed Jan 31, 2024
2 parents 045ce96 + f81e384 commit db86809
Showing 2 changed files with 5 additions and 3 deletions.
4 changes: 2 additions & 2 deletions powerinfer/Dockerfile
@@ -43,10 +43,10 @@ FROM nvidia/cuda:12.3.1-devel-rockylinux9
ARG USERID=1000
RUN yum install -y python3-pip cmake libcudnn8 git && yum clean all && rm -rf /var/cache/yum/*
RUN git clone https://github.com/SJTU-IPADS/PowerInfer
-WORKDIR PowerInfer
+WORKDIR /PowerInfer
RUN pip install --no-cache-dir -r requirements.txt
RUN cmake -S . -B build -DLLAMA_CUBLAS=ON
-RUN cmake --build build --config Release -j $(nproc)
+RUN cmake --build build --config Release -j "$(nproc)"
RUN pip install --no-cache-dir pandas #for the benchmark.
RUN adduser -u $USERID user
USER user
4 changes: 3 additions & 1 deletion powerinfer/README.md
@@ -1,4 +1,6 @@
-[PowerInfer](https://github.com/SJTU-IPADS/PowerInfer)
+#PowerInfer benchmark
+
+Benchmark for [PowerInfer](https://github.com/SJTU-IPADS/PowerInfer).

Note that the model loses some inference quality in exchange for speed as shown in https://huggingface.co/SparseLLM/ReluLLaMA-7B.
