Posters
- WebArena: A Realistic Web Environment for Building Autonomous Agents
  Shuyan Zhou, Frank F. Xu, Hao Zhu, Xuhui Zhou, Robert Lo, Abishek Sridhar, Xianyi Cheng, Tianyue Ou, Yonatan Bisk, Daniel Fried, Uri Alon, Graham Neubig
- VisualWebArena: Evaluating Multimodal Agents on Realistic Visual Web Tasks
  Jing Yu Koh, Robert Lo, Lawrence Jang, Vikram Duvvur, Ming Chong Lim, Po-Yu Huang, Graham Neubig, Shuyan Zhou, Ruslan Salakhutdinov, Daniel Fried
- TROVE: Inducing Verifiable and Efficient Toolboxes for Solving Programmatic Tasks
  Zhiruo Wang, Daniel Fried, Graham Neubig
- What Are Tools Anyway? A Survey from the Language Model Perspective
  Zhiruo Wang, Zhoujun Cheng, Hao Zhu, Daniel Fried, Graham Neubig
- SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents
  Xuhui Zhou, Hao Zhu, Leena Mathur, Ruohong Zhang, Haofei Yu, Zhengyang Qi, Louis-Philippe Morency, Yonatan Bisk, Daniel Fried, Graham Neubig, Maarten Sap
- Autonomous Evaluation and Refinement of Digital Agents
  Jiayi Pan, Yichi Zhang, Nicholas Tomlin, Yifei Zhou, Sergey Levine, Alane Suhr
- ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL
  Yifei Zhou, Andrea Zanette, Jiayi Pan, Sergey Levine, Aviral Kumar
- CodeBenchGen: Creating Scalable Execution-based Code Generation Benchmarks
  Yiqing Xie, Alex Xie, Divyanshu Sheth, Pengfei Liu, Daniel Fried, Carolyn Rose
- VHABench: Benchmarking Underspecified User Intents for Tool Augmented LLMs
  Eduardo Trevino, Rohit Malhotra, Hugo Contant
- InCharacter: Evaluating Personality Fidelity in Role-Playing Agents through Psychological Interviews
  Xintao Wang, Yunze Xiao, Jen-tse Huang, Siyu Yuan, Rui Xu, Haoran Guo, Quan Tu, Yaying Fei, Ziang Leng, Wei Wang, Jiangjie Chen, Cheng Li, Yanghua Xiao
- Can large language models explore in-context?
  Akshay Krishnamurthy, Keegan Harris, Dylan J Foster, Cyril Zhang, Aleksandrs Slivkins
- Building a Financial Chatbot Using Langroid's Multi-Agent Framework
  Karthik Talluri, Vu Nguyen, Saanika Shahi, Antony Liao
- What Is Missing in Multilingual Visual Reasoning and How to Fix It
  Yueqi Song, Simran Khanuja, Graham Neubig
- LLM-Accelerated Tool Generation for Code Analysis
  Shen Zhang, Ryan Karl, Yash Hindka
- Open-Ended Instructable Embodied Agents with Memory-Augmented Large Language Models
  Gabriel Sarch, Yue Wu, Sahil Somani, Raghav Kapoor, Michael Tarr, Katerina Fragkiadaki
- LLM-SR: Scientific Equation Discovery via Programming with Large Language Models
  Parshin Shojaee, Kazem Meidani, Shashank Gupta, Amir Barati Farimani, Chandan K. Reddy
- How well do LLM Web Agents work for non-English Language?
  Abir Muhtasim, Arnab Bhattacharjee, Abhik Bhattacharjee, Faria Huq and Rifat Shahriyar
- Large Language Models for Collective Problem-Solving: Insights into Group Consensus Decision-Making
  Yinuo Du, Prashanth Rajivan, Cleotilde Gonzalez
- Influence Maximization in Dynamical Social Networks via Reinforcement Learning with LLM-simulated User Behaviors and Graph Embeddings
  Yurun Tian
- MemAgent: A cache-inspired framework for aligning LLM Web Agent with User Goal
  Faria Huq, Nazmus Sakib, Protoy Barai, Sifat Ishmam Parisa, Abhik Bhattacharjee, Anindya Iqbal
- Advancing Social Intelligence in AI Agents: Technical Challenges and Open Questions
  Leena Mathur, Paul Pu Liang, Louis-Philippe Morency
- The Collaborative Caring Simulation Framework: Using LLM agents to simulate care-teams for prototyping and evaluating collective intelligence interventions
  Andrew Kuznetsov, Ping-Ya Chao, Christopher Dishop, Allen Brown, and Anita Woolley
- WebBug: Evaluating MultiModal Agents in Real-World Web Quality Analysis Testing
  Sneha Sivakumar, Anushka Nijhawan, Arman Cohan
- AutoTester: An LMM Web-Agent for Automated Web Quality Assurance Testing
  Anushka Nijhawan, Sneha Sivakumar, Arman Cohan
- ... and more to come!
Submit your poster at https://forms.gle/sWgMhXrjX4cnoiGA8 and present in person!
Call for Posters
We are excited to announce our LLM Agent Workshop on May 2-3.
Our workshop is a CMU-wide event for students to show their recent and/or ongoing research and projects to the community. Any work related to LLMs that are situated in, perceive, and act upon an external environment is welcome. Submissions from students will be presented at the poster session in our workshop. This is a great opportunity to publicize your projects and gain valuable feedback from the community!
Important Dates
- Submission Deadline: Apr 30, 2024
- Presentation Time/Date: May 2nd, 2024, 3:00-5:30 pm. In-person at Cohon University Center, Rangos Ballroom 1
Submission Format
Please prepare materials in the following two formats:
- A poster in PDF format, of any reasonable size, landscape or portrait (you can print out your slides if you are feeling lazy :)).
- A short abstract of your work
You’re welcome to submit projects at any stage, including side projects, submissions to other conferences, course projects, and any ongoing projects.
Please submit your project via this Google Form: https://forms.gle/sWgMhXrjX4cnoiGA8, and bring your poster. It can be printed via Tartan Ink, or via SCS Poster Printing if you are part of the SCS community. If for some reason you are unable to print the poster, please email us to let us know!
Topics of Interest
Topics include, but are not limited to:
- LLM agents, environments, and tool use
- LLM reasoning
- LLM agent learning: expert supervision, reinforcement learning, synthetic data augmentation, etc.
- Open-source software for LLM tool use and agents: browser plugins, AI assistants, toolboxes, environments, frameworks, etc.
- Multi-agent interaction: human agents and model-based agents
- Specific interesting tasks and domains: e.g., knowledge exploration, software engineering, scientific discovery, social intelligence
- Safety of LLM agents
- Ethics and societal impact of LLM agents: job market, economic impact, accessibility, etc.
Organizing Committee
- Frank Xu: fangzhex@cs.cmu.edu
- Zora Wang: zhiruow@cs.cmu.edu
- Shuyan Zhou: shuyanzh@cs.cmu.edu
Feel free to contact the LLM Agent Workshop organizers if you have any questions!