|
News
[Sep 2025] Honored to receive the Best Paper Honorable Mention Award @ NeurIPS LAW Workshop for our Personality Illusion paper.
[Dec 2025] I am attending NeurIPS 2025 in San Diego, CA, from Dec 2 to Dec 7. Excited to catch up with old and new friends!
[Sep 2025] 🔥 We released The Personality Illusion, our work showing that LLMs do not have personalities in the way humans do.
[Jul 2025] Our paper on LLM Reasoning Failures has been accepted to the ICML AI for Math Workshop. Stay tuned for the full release!
[Apr 2025] I am joining Prof. Evelina Fedorenko's group at MIT starting April 2025, working on language and thought in LLMs.
|
Research
My research aims to advance the scientific understanding of AI (especially neural models like LLMs) and, more broadly, of the general principles of intelligence and intelligent behavior. I approach this goal across three interconnected levels:
- Behavioral level — analyzing how models and humans reason, generalize, and solve problems, including studies of alignment, limitations, and trustworthy reasoning.
- Mechanistic level — interpreting model internals to understand the circuits, representations, and algorithms that give rise to intelligent behavior and drive observable performance and failures.
- Social level — investigating how intelligence emerges and interacts in multi-agent systems and human–AI collaborations.
These three levels parallel psychology, neuroscience, and social science in their study of human intelligence and behavior.
If any of this resonates with your interests, feel free to reach out so we can connect and collaborate!
|
Selected Publications
|
The Personality Illusion: Revealing Dissociation Between Self-Reports & Behavior in LLMs
Pengrui Han*, Rafal D. Kocielnik*, Peiyang Song, Ramit Debnath, Dean Mobbs, Anima Anandkumar, and R. Michael Alvarez (* Equal Contribution)
NeurIPS LAW Workshop: Bridging Language, Agent, and World Models, 2025, Oral Presentation + Best Paper Honorable Mention
NeurIPS Workshop on LLM Persona Modeling (PersonaNLP), 2025, Oral Presentation
arXiv / project / code
LLMs say they have personalities, but they don’t act like it. Alignment today shapes language, not behavior. This linguistic–behavioral dissociation cautions against equating coherent self-reports with cognitive depth.
|
A Survey on Large Language Model Reasoning Failures
Peiyang Song*, Pengrui Han*, and Noah Goodman (* Equal Contribution)
ICML AI for Math Workshop, 2025
preprint / full release coming soon
We present the first comprehensive survey dedicated to reasoning failures in LLMs. By unifying fragmented research efforts, our survey provides a structured perspective on systemic weaknesses in LLM reasoning, offering insights that guide future research toward stronger, more reliable, and more robust reasoning capabilities.
|
In-Context Learning May Not Elicit Trustworthy Reasoning: A-Not-B Errors in Pretrained Language Models
Pengrui Han*, Peiyang Song*, Haofei Yu, and Jiaxuan You (* Equal Contribution)
Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
code
Motivated by the crucial cognitive phenomenon of A-not-B errors, we present the first systematic evaluation of the surprisingly vulnerable inhibitory control abilities of LLMs. We reveal that this weakness undermines LLMs' trustworthy reasoning capabilities across diverse domains, and introduce various mitigations.
|
ChatGPT Based Data Augmentation for Improved Parameter-Efficient Debiasing of LLMs
Pengrui Han*, Rafal Kocielnik*, Adhithya Saravanan, Roy Jiang, Or Sharir, and Anima Anandkumar (* Equal Contribution)
Conference on Language Modeling (COLM), 2024
code
We propose a light and efficient pipeline that enables both domain and non-domain experts to quickly generate synthetic debiasing data to mitigate specific or general bias in their models with parameter-efficient fine-tuning.
|
Paper Copilot: A Self-Evolving and Efficient LLM System for Personalized Academic Assistance
Guanyu Lin*, Tao Feng*, Pengrui Han*, Ge Liu, and Jiaxuan You (* Equal Contribution)
System Demonstrations Track of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Hugging Face Live Demo: Link
We present Paper Copilot, a self-evolving and efficient LLM system that provides personalized academic assistance; a live demo is available on Hugging Face.
|
Thought-Retriever: Don’t Just Retrieve Raw Data, Retrieve Thoughts
Tao Feng*, Pengrui Han*, Guanyu Lin*, Ge Liu, and Jiaxuan You (* Equal Contribution)
ICLR Workshop on How Far Are We From AGI, 2024
We introduce Thought-Retriever, a novel model-agnostic algorithm that enables LLMs to effectively utilize external data without being limited by context length.
|
Exploring Social Bias in Downstream Applications of Text-to-Image Foundation Models
Adhithya Saravanan, Rafal Kocielnik, Roy Jiang, Pengrui Han, and Anima Anandkumar
NeurIPS Workshop on Failure Modes in the Age of Foundation Models, 2023
Proceedings of Machine Learning Research (PMLR)
We explore social biases in text-to-image diffusion models used in commercial applications such as image editing. By analyzing models like Stable Diffusion, we uncover significant biases, emphasizing the need for careful consideration when adopting these technologies for broader use.
|
Selected Awards
- NeurIPS LAW Workshop Best Paper Honorable Mention Award (2025)
- Phi Beta Kappa Honor Society (2025)
- Carleton College Chang-Lan Award (2024)
- Caltech SURF Award (2023)
- Carleton College Dean's List (2023)
|
Academic Services
- Reviewer for conferences: ICLR, ICML, COLM, COLING.
- Reviewer for workshops: Re-Align, LLM-Cognition, BehaviorML, LTEDI, INTERPLAY, AI4Math, LatinX, Assessing World Models.
|