Posters
Deadline: April 7, 2025
List of Accepted Papers
- 🥇 People’s Choice Best Paper (1st Place): An Analyst-Inspector Framework for Evaluating Reproducibility of LLMs in Data Science
  Qiuhai Zeng*, Claire Jin*, Xinyue Wang, Yuhan Zheng, Qunhua Li
- 🥈 People’s Choice Best Paper (2nd Place): Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents
  Yu Gu, Kai Zhang, Yuting Ning, Boyuan Zheng, Boyu Gou, Tianci Xue, Cheng Chang, Sanjari Srivastava, Yanan Xie, Peng Qi, Huan Sun, Yu Su
- Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Web Agents
  Vardaan Pahuja, Yadong Lu, Corby Rosset, Boyu Gou, Arindam Mitra, Spencer Whitehead, Yu Su, Ahmed Awadallah
- CowPilot: A Framework for Autonomous and Human-Agent Collaborative Web Navigation
  Faria Huq, Zora Zhiruo Wang, Frank F. Xu, Tianyue Ou, Shuyan Zhou, Jeffrey P. Bigham*, Graham Neubig*
- From RAG to Memory: Non-Parametric Continual Learning for Large Language Models
  Bernal Jimenez Gutierrez, Yiheng Shu, Weijian Qi, Sizhe Zhou, Yu Su
- Refusal-trained LLMs Are Easily Jailbroken as Browser Agents
  Priyanshu Kumar, Elaine Lau, Saranya Vijayakumar, Tu (Alina) Trinh, Scale Red Team, Elaine Chang, Vaughn Robinson, Sean Hendryx, Shuyan Zhou, Matt Fredrikson, Summer Yue, Zifan (Sail) Wang
- AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents
  Harsh Trivedi, Tushar Khot, Mareike Hartmann, Ruskin Manku, Vinty Dong, Edward Li, Shashank Gupta, Ashish Sabharwal, Niranjan Balasubramanian
- SimuRA: Towards General Goal-Oriented Agent via Simulative Reasoning Architecture with LLM-Based World Model
  Mingkai Deng*, Jinyu Hou*, Zhiting Hu, Graham Neubig, Hongxia Jin, Yilin Shen, Eric P. Xing
- Should You Use Your Large Language Model to Explore or Exploit?
  Keegan Harris, Alex Slivkins
- Direct Multi-Turn Preference Optimization for Language Agents
  Wentao Shi, Mengqi Yuan, Junkang Wu, Qifan Wang, Fuli Feng
- An Illusion of Progress? Assessing the Current State of Web Agents
  Tianci Xue, Weijian Qi, Tianneng Shi, Chan Hee Song, Boyu Gou, Dawn Song, Huan Sun, Yu Su
- Steering Dialogue Dynamics for Robustness against Multi-turn Jailbreaking Attacks
  Hanjiang Hu, Alexander Robey, Changliu Liu
- ScribeAgent: Towards Specialized Web Agents Using Production-Scale Workflow Data
  Junhong Shen, Atishay Jain, Zedian Xiao, Ishan Amlekar, Mouad Hadji, Aaron Podolny, Ameet Talwalkar
- Investigating Explainability with a Privacy-Preserving Multi-agent Systems
  Roshni Kaushik, Hyeonsu Kang, Koichi Onoue
- Interactive Debugging and Steering of Multi-Agent AI Systems
  Will Epperson, Gagan Bansal, Victor Dibia, Adam Fourney, Jack Gerrits, Erkang Zhu, Saleema Amershi
- Langroid Research Agent
  Kepler Lang, Kojo Dokyi, Gillis Wang, Yanqi Chen
- VLM-MPC: Model Predictive Controller Augmented Vision Language Model for Autonomous Driving
  Keke Long, Haotian Shi, Jiaxi Liu, Chaowei Xiao, Xiaopeng Li
- HypoVeil: A Privacy-Utility Trade-off Aware Framework for Collaborative Multi-Agent Reasoning
  Hyeonsu B. Kang, Roshni Kaushik, Koichi Onoue
- Agentic Harmonization across Different Frameworks
  Kota Miyake, Miho Tanaka, Koichi Onoue, Graham Neubig
- ToM-based Message Anonymization for Stepwise Confidential-Information Disclosure in Conversation between Agents
  Miho Tanaka, Koichi Onoue, Graham Neubig
- On the Fine-Grained Planning Abilities of VLM Web Agents
  Surgan Jandial, Yinong (Oliver) Wang, Andrea Bajcsy, Fernando De La Torre
Feel free to contact the CMU Agent Workshop organizers (cmu-agent-workshop@andrew.cmu.edu) if you have any questions!