# Efficient-LLMAgent-Survey

We are currently writing a survey on Efficient LLM Agent Serving and welcome comments on this list!

This repository maintains a curated list of papers related to Large Language Model Based Agents (LLM Agents), especially focusing on efficient serving methods for LLM Agents.

This paper list covers the main aspects of efficient serving methods for LLM Agents. Table of contents:

## What is LLM Agent

## Efficient Serving LLM Agent

### LLM Serving

### Planning

### Tool and Action

### Serverless

### Memory

### Component Collaboration and Agent Framework

Focus: improving the efficiency of data exchange and data transmission within AI agents.

### Device-Edge-Cloud Collaboration

## LLM and Agent Framework

### LLM Framework


**Efficient Training**

- DeepSpeed [Code]
- Megatron [Code]
- Alpa [Code]
- ColossalAI [Code]
- FairScale [Code]
- Pax [Code]
- Composer [Code]

**Efficient Inference**

- vLLM [Code]
- TensorRT-LLM [Code]
- LightLLM [Code]
- OpenLLM [Code]
- Ray-LLM [Code]
- MLC-LLM [Code]
- Sax [Code]
- Mosec [Code]

**Efficient Fine-Tuning**

- LLM-Foundry [Code]

### GenAI Develop Engine

### Agent Framework

## Benchmark, Trace, and Dataset

- BurstGPT: Towards Efficient and Reliable LLM Serving: A Real-World Workload Study

## LLM and Agent on Mobile Platform

## Survey Papers

## Others