We are currently writing a survey on Efficient LLM Agent Serving and welcome comments on this list!
This repository maintains a curated list of papers on Large Language Model based agents (LLM agents), with a particular focus on efficient serving methods for LLM agents.
The list covers the main aspects of efficient serving for LLM agents. Table of contents:
- Efficient-LLMAgent-Survey
- RelayAttention for Efficient Large Language Model Serving with Long System Prompts
- ALISA: Accelerating Large Language Model Inference via Sparsity-Aware KV Caching
- Splitwise: Efficient generative LLM inference using phase splitting | ISCA'24
- MECLA: Memory-Compute-Efficient LLM Accelerator with Scaling Sub-matrix Partition
- A Hardware Evaluation Framework for Large Language Model Inference | ISCA'24
- Efficient LLM Inference with KCache
- HeteGen: Heterogeneous Parallel Inference for Large Language Models on Resource-Constrained Devices | MLSys'24
- CacheGen: KV Cache Compression and Streaming for Fast Language Model Serving | SIGCOMM'24
- Chatterbox: Robust Transport for LLM Token Streaming under Unstable Network
- dLoRA: Dynamically Orchestrating Requests and Adapters for LoRA LLM Serving | OSDI'24
- NetLLM: Adapting Large Language Models for Networking | SIGCOMM'24
- DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving | OSDI'24
- Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs
- Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems
- POLCA: Characterizing Power Management Opportunities for LLMs in the Cloud | ASPLOS'24
- ScaleLLM: Unlocking Llama2-13B LLM Inference on Consumer GPU RTX 4090, powered by FEDML Nexus AI
- Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction
- SpotServe: Serving Generative Large Language Models on Preemptible Instances | ASPLOS'24
- PreAct: Predicting Future in ReAct Enhances Agent's Planning Ability
- An LLM Compiler for Parallel Function Calling (see the toy parallel tool-call sketch after this list)
- Dynamic Planning with a LLM
- Automatic and Efficient Customization of Neural Networks for ML Applications | OSDI'24
- ToolChain*: Efficient Action Space Navigation in Large Language Models with A* Search | ICLR'24
- Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments
- Efficient Tool Use with Chain-of-Abstraction Reasoning
- ToolNet: Connecting Large Language Models with Massive Tools via Tool Graph
- Budget-Constrained Tool Learning with Planning
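The LLMCompiler entry above motivates issuing independent tool calls concurrently rather than one at a time. Below is a toy Python sketch of that idea using `asyncio`; it is our own illustration, not code from the paper, and the tool names and latencies are made up.

```python
# Toy sketch of parallel function calling: independent tool calls are
# awaited together, so end-to-end latency approaches the slowest call
# rather than the sum of all calls. (Illustration only, not LLMCompiler.)
import asyncio

async def call_tool(name: str, delay: float) -> str:
    await asyncio.sleep(delay)  # stands in for tool/network latency
    return f"{name}: done"

async def main() -> None:
    results = await asyncio.gather(
        call_tool("search", 1.0),
        call_tool("calculator", 0.5),
        call_tool("weather", 0.8),
    )
    print(results)  # finishes in ~1.0s instead of ~2.3s

asyncio.run(main())
```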
- ServerlessLLM: Locality-Enhanced Serverless Inference for Large Language Models | OSDI'24
- FaaSMem: Improving Memory Efficiency of Serverless Computing with Memory Pool Architecture | ASPLOS'24
- Retrieval-Augmented Generation with Knowledge Graphs for Customer Service Question Answering
- RET-LLM: Towards a General Read-Write Memory for Large Language Models
- Controlling the Extraction of Memorized Data from Large Language Models via Prompt-Tuning | ACL'23
- Memory Sandbox: Transparent and Interactive Memory Management for Conversational Agents
- RAGCache: Efficient Knowledge Caching for Retrieval-Augmented Generation
The following papers focus on improving the efficiency of data exchange and data transmission within AI agents (a toy result-sharing sketch follows this list): — hongqiu
- LLM Multi-Agent Systems: Challenges and Open Problems
- Multi-Agent Collaboration: Harnessing the Power of Intelligent LLM Agents
- Dynamic LLM-Agent Network: An LLM-agent Collaboration Framework with Agent Team Optimization
- AIOS: LLM Agent Operating System
- AgentLite: A Lightweight Library for Building and Advancing Task-Oriented LLM Agent System
- Enabling Intelligent Interactions between an Agent and an LLM: A Reinforcement Learning Approach
- Gorilla: Large Language Model Connected with Massive APIs
- Small LLMs Are Weak Tool Learners: A Multi-LLM Agent
- A Unified Debugging Approach via LLM-Based Multi-Agent Synergy
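As a concrete, deliberately simplified illustration of the data-exchange theme flagged above: cooperating agents can avoid redundant LLM calls and message traffic by sharing results through a common store. The `SharedBlackboard` below is a hypothetical sketch of ours, not a mechanism from any listed paper.

```python
# Hypothetical sketch: a shared result cache lets cooperating agents
# reuse each other's answers instead of recomputing or re-sending them.
import hashlib

class SharedBlackboard:
    """In-memory store that deduplicates results exchanged between agents."""
    def __init__(self) -> None:
        self._store: dict[str, str] = {}

    @staticmethod
    def _key(task: str) -> str:
        return hashlib.sha256(task.encode()).hexdigest()

    def publish(self, task: str, result: str) -> None:
        self._store[self._key(task)] = result

    def lookup(self, task: str) -> str | None:
        return self._store.get(self._key(task))

board = SharedBlackboard()
board.publish("summarize log", "3 errors, 1 warning")
# A second agent checks the board before issuing a duplicate LLM call.
print(board.lookup("summarize log"))
```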
- Hybrid LLM: Cost-Efficient and Quality-Aware Query Routing | ICLR'24
- Mamba: Linear-Time Sequence Modeling with Selective State Spaces
- Optimal Caching and Model Multiplexing for Large Model Inference | NeurIPS'23
- Optimising Calls to Large Language Models with Uncertainty Based Two-Tier Selection
- Octopus: On-device language model for function calling of software APIs
- Octopus v2: On-device language model for super agent | Stanford
- Octopus v3: Technical Report for On-device Sub-billion Multimodal AI Agent| Stanford
- Octopus v4: Graph of language models
The table below compares popular LLM training and serving frameworks along three axes; a minimal vLLM usage sketch follows the table.
Framework | Efficient Training | Efficient Inference | Efficient Fine-Tuning |
---|---|---|---|
DeepSpeed [Code] | ✅ | ✅ | ✅ |
Megatron [Code] | ✅ | ✅ | ✅ |
Alpa [Code] | ✅ | ✅ | ✅ |
ColossalAI [Code] | ✅ | ✅ | ✅ |
FairScale [Code] | ✅ | ✅ | ✅ |
Pax [Code] | ✅ | ✅ | ✅ |
Composer [Code] | ✅ | ✅ | ✅ |
vLLM [Code] | ❌ | ✅ | ❌ |
TensorRT-LLM [Code] | ❌ | ✅ | ❌ |
LightLLM [Code] | ❌ | ✅ | ❌ |
OpenLLM [Code] | ❌ | ✅ | ✅ |
Ray-LLM [Code] | ❌ | ✅ | ❌ |
MLC-LLM [Code] | ❌ | ✅ | ❌ |
Sax [Code] | ❌ | ✅ | ❌ |
Mosec [Code] | ❌ | ✅ | ❌ |
LLM-Foundry [Code] | ✅ | ✅ | ❌ |
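As a quick orientation for the table above, here is a minimal offline-inference sketch with vLLM, one of the inference-only frameworks listed; the model name and sampling settings are illustrative, not recommendations.

```python
# Minimal vLLM offline batch inference (model name is illustrative).
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-2-7b-chat-hf")  # any HF causal LM
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=128)

outputs = llm.generate(
    ["Why is serving LLM agents harder than serving single prompts?"],
    params,
)
for out in outputs:
    print(out.outputs[0].text)
```

vLLM's continuous batching and PagedAttention are what earn it the "Efficient Inference" check in the table above.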
- Auto-GPT
- LangChain
- AutoGen (see the minimal two-agent sketch after this list)
- Camel
- HuggingGPT
- GPT Engineer
- BabyAGI
- AI Town
- GPTeam
- ChatArena
- AgentVerse
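Of the frameworks listed above, AutoGen is a representative example of how agent loops are typically wired up. The sketch below uses the classic `pyautogen`-style API; the model name and API key are placeholders, and exact signatures may differ across AutoGen versions.

```python
# Hedged sketch of a two-agent AutoGen loop (pyautogen-style API;
# model and key are placeholders, and versions may differ).
from autogen import AssistantAgent, UserProxyAgent

llm_config = {"config_list": [{"model": "gpt-4", "api_key": "sk-..."}]}

assistant = AssistantAgent("assistant", llm_config=llm_config)
user = UserProxyAgent(
    "user",
    human_input_mode="NEVER",      # fully automated loop
    code_execution_config=False,   # disable local code execution
    max_consecutive_auto_reply=2,
)

user.initiate_chat(
    assistant,
    message="Summarize why KV-cache reuse matters for agent serving.",
)
```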
- BurstGPT: Towards Efficient and Reliable LLM Serving: A Real-World Workload Study
- MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases | Meta
- BBox-Adapter: Lightweight Adapting for Black-Box Large Language Models
- LLM as a System Service on Mobile Devices | https://arxiv.org/abs/2403.11805
- A Survey on Effective Invocation Methods of Massive LLM Services
- Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security
- LLM-Based Multi-Agent Systems for Software Engineering: Vision and the Road Ahead
- CASIT: Collective Intelligent Agent System for Internet of Things
- Understanding the Weakness of Large Language Model Agents within a Complex Android Environment
- The Landscape of Emerging AI Agent Architectures for Reasoning, Planning, and Tool Calling: A Survey
- Awesome MobileLLM
- Understanding the Planning of LLM Agents: A Survey
- Awesome-LLM-Inference