UVa AIML Seminar
The AI and Machine Learning Seminar @ UVa

Trustworthy AI in the Era of Frontier Models


Chirag Agarwal
UVA Data Science & Computer Science

Time: 2025-11-12, 12:00 - 13:00 ET
Location: Rice 540 and Zoom

Abstract: Machine learning models have become ubiquitous in the last decade, and with their increasing use in critical applications (e.g., healthcare, financial systems, and crime forecasting), it is vital to ensure that ML developers and practitioners understand and trust their decisions. This problem has become paramount in the era of frontier models, which are trained with billions of parameters on broad, uncurated datasets using extensive compute. In this talk, we will first explore the (un)reliability of reasoning in Large Language Models (LLMs), covering the uncertainty, unfaithfulness, and hallucination properties of Chain-of-Thought reasoning. Next, we will delve into the intriguing world of multilingual LLMs and discuss why state-of-the-art multilingual LLMs lack reasoning capabilities and are more vulnerable to safety and alignment attacks. Finally, we will discuss how multimodal explainability has not kept pace with the surge in multimodal AI and debate the path forward.

Bio: Chirag Agarwal is an assistant professor of data science and leads the Aikyam lab, which focuses on developing trustworthy machine learning frameworks that go beyond training models for specific downstream tasks and satisfy trustworthiness properties such as explainability, fairness, and robustness.

Before joining UVA, he was a postdoctoral research fellow at Harvard University. He completed his Ph.D. in electrical and computer engineering at the University of Illinois at Chicago and his bachelor’s degree in electronics and communication. His Ph.D. thesis was on the “Robustness and Explainability of Deep Neural Networks,” and his research spans trustworthy ML topics such as explainability, fairness, robustness, privacy, and transferability estimation, as well as their intersection in the age of large-scale models. He developed a first-of-its-kind, large-scale, in-depth study to support systematic, reproducible, and efficient evaluations of post hoc explanation methods for (un)structured data, aimed at understanding algorithmic decision-making on tasks ranging from bail decisions to loan credit recommendations.

Agarwal has published in top-tier machine learning and computer vision conferences (NeurIPS, ICML, ICLR, UAI, AISTATS, CVPR, SIGIR, ACCV) as well as in leading journals in datasets (Nature Scientific Data) and health care (Journal of Clinical Sleep Medicine and Cardiovascular Digital Health Journal). His work has been selected for Spotlight and Oral presentations at NeurIPS, ICML, CVPR, and ICIP, and he has received industry grants from Adobe, Microsoft, and Google to support his research on trustworthy machine learning.