Posters

Deadline: April 7, 2025

List of Accepted Papers

🥇 People’s Choice Best Paper (1st Place): An Analyst-Inspector Framework for Evaluating Reproducibility of LLMs in Data Science
Qiuhai Zeng*, Claire Jin*, Xinyue Wang, Yuhan Zheng, Qunhua Li
🥈 People’s Choice Best Paper (2nd Place): Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents
Yu Gu, Kai Zhang, Yuting Ning, Boyuan Zheng, Boyu Gou, Tianci Xue, Cheng Chang, Sanjari Srivastava, Yanan Xie, Peng Qi, Huan Sun, Yu Su
Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Web Agents
Vardaan Pahuja, Yadong Lu, Corby Rosset, Boyu Gou, Arindam Mitra, Spencer Whitehead, Yu Su, Ahmed Awadallah
CowPilot: A Framework for Autonomous and Human-Agent Collaborative Web Navigation
Faria Huq, Zora Zhiruo Wang, Frank F. Xu, Tianyue Ou, Shuyan Zhou, Jeffrey P. Bigham*, Graham Neubig*
From RAG to Memory: Non-Parametric Continual Learning for Large Language Models
Bernal Jimenez Gutierrez, Yiheng Shu, Weijian Qi, Sizhe Zhou, Yu Su
Refusal-trained LLMs Are Easily Jailbroken as Browser Agents
Priyanshu Kumar, Elaine Lau, Saranya Vijayakumar, Tu (Alina) Trinh, Scale Red Team, Elaine Chang, Vaughn Robinson, Sean Hendryx, Shuyan Zhou, Matt Fredrikson, Summer Yue, Zifan (Sail) Wang
AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents
Harsh Trivedi, Tushar Khot, Mareike Hartmann, Ruskin Manku, Vinty Dong, Edward Li, Shashank Gupta, Ashish Sabharwal, Niranjan Balasubramanian
SimuRA: Towards General Goal-Oriented Agent via Simulative Reasoning Architecture with LLM-Based World Model
Mingkai Deng*, Jinyu Hou*, Zhiting Hu, Graham Neubig, Hongxia Jin, Yilin Shen, Eric P. Xing
Should You Use Your Large Language Model to Explore or Exploit?
Keegan Harris, Alex Slivkins
Direct Multi-Turn Preference Optimization for Language Agents
Wentao Shi, Mengqi Yuan, Junkang Wu, Qifan Wang, Fuli Feng
An Illusion of Progress? Assessing the Current State of Web Agents
Tianci Xue, Weijian Qi, Tianneng Shi, Chan Hee Song, Boyu Gou, Dawn Song, Huan Sun, Yu Su
Steering Dialogue Dynamics for Robustness against Multi-turn Jailbreaking Attacks
Hanjiang Hu, Alexander Robey, Changliu Liu
AI Agents for Personal Finance: Enhancing Credit Behavior, Budgeting Precision, and Spending Decisions
Siddarth Pai
ScribeAgent: Towards Specialized Web Agents Using Production-Scale Workflow Data
Junhong Shen, Atishay Jain, Zedian Xiao, Ishan Amlekar, Mouad Hadji, Aaron Podolny, Ameet Talwalkar
Investigating Explainability with a Privacy-Preserving Multi-agent Systems
Roshni Kaushik, Hyeonsu Kang, Koichi Onoue
Interactive Debugging and Steering of Multi-Agent AI Systems
Will Epperson, Gagan Bansal, Victor Dibia, Adam Fourney, Jack Gerrits, Erkang Zhu, Saleema Amershi
Langroid Research Agent
Kepler Lang, Kojo Dokyi, Gillis Wang, Yanqi Chen
VLM-MPC: Model Predictive Controller Augmented Vision Language Model for Autonomous Driving
Keke Long, Haotian Shi, Jiaxi Liu, Chaowei Xiao, Xiaopeng Li
HypoVeil: A Privacy-Utility Trade-off Aware Framework for Collaborative Multi-Agent Reasoning
Hyeonsu B. Kang, Roshni Kaushik, Koichi Onoue
Agentic Harmonization across Different Frameworks
Kota Miyake, Miho Tanaka, Koichi Onoue, Graham Neubig
ToM-based Message Anonymization for Stepwise Confidential-Information Disclosure in Conversation between Agents
Miho Tanaka, Koichi Onoue, Graham Neubig
On the Fine-Grained Planning Abilities of VLM Web Agents
Surgan Jandial, Yinong (Oliver) Wang, Andrea Bajcsy, Fernando De La Torre

Feel free to contact the CMU Agent Workshop organizers (cmu-agent-workshop@andrew.cmu.edu) if you have any questions!
