Commit

Update README.md
YuanDaoze authored Dec 20, 2024
1 parent 44af414 commit 37b8ea1
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion README.md
@@ -43,7 +43,7 @@ This repo covers a variety of papers related to GUI Agents, such as:
- 📑 Publisher: https://os-agent-survey.github.io/
- 💻 Env: [GUI]
- 🔑 Key: [survey]
- 📖 TLDR: This paper conducts a comprehensive survey on OS Agents, which are (M)LLM-based agents that operate on computing devices (e.g., computers and mobile phones) by interacting with the environments and interfaces (e.g., Graphical User Interfaces (GUI)) provided by operating systems (OS) to automate tasks and processes. The survey begins by elucidating the fundamentals of OS Agents, exploring their key components including the environment, observation space, and action space, and outlining essential capabilities such as understanding, planning, and grounding. Methodologies for constructing OS Agents are examined, with a focus on domain-specific foundation models and agent frameworks. A detailed review of evaluation protocols and benchmarks highlights how OS Agents are assessed across diverse tasks. Finally, current challenges are discussed, and promising directions for future research are identified, including safety and privacy, personalization, and self-evolution.
- 📖 TLDR: This paper conducts a comprehensive survey on OS Agents, which are (M)LLM-based agents that operate on computing devices (e.g., computers and mobile phones) by interacting with the environments and interfaces (e.g., Graphical User Interfaces (GUI)) provided by operating systems (OS) to automate tasks and processes. The survey begins by elucidating the fundamentals of OS Agents, exploring their key components including the environment, observation space, and action space, and outlining essential capabilities such as understanding, planning, and grounding. Methodologies for constructing OS Agents are examined, with a focus on domain-specific foundation models and agent frameworks. A detailed review of evaluation protocols and benchmarks highlights how OS Agents are assessed across diverse tasks. Finally, current challenges are discussed, and promising directions for future research are identified, including safety and privacy, personalization and self-evolution.


- [GUI Agents: A Survey](https://arxiv.org/abs/2412.13501)
