Abstract
This project aims to advance the development of embodied AI agents by leveraging video games as a testing ground, specifically building upon the Voyager framework in Minecraft. We propose enhancing the existing architecture through a combination of state-of-the-art LLM integration, reinforcement learning techniques, and optimized prompt engineering. Our approach focuses on improving long-horizon task completion and skill acquisition, using Minecraft's open-ended environment as a low-risk platform for developing foundational AI capabilities. By optimizing agent behavior through direct gameplay feedback and sophisticated skill library management, we aim to demonstrate significantly improved performance over baseline metrics, particularly in exploration and task completion efficiency.
We build upon the Voyager paper, which introduced large language model (LLM) agents capable of playing the video game Minecraft. A central focus of the original work was improving agent behavior through iterative gameplay feedback and the strategic management of a comprehensive skill library, which yielded substantial gains in both exploration efficiency and task completion effectiveness over baseline models. To take this idea further, our project extends Voyager with multi-agent capabilities and vision-based agents, enriching the interactions and collaborative potential within the Minecraft ecosystem.