Pengrui Han (Barry)

I am currently in the MSCS program at UIUC, advised by Prof. Jiaxuan You. I am also a researcher in the MIT Brain and Cognitive Sciences department, working with Prof. Evelina Fedorenko and Dr. Andrea de Varda in the EvLab. During my undergrad, I was fortunate to work with Prof. Anima Anandkumar (Caltech) and Dr. Rafał Kocielnik (Caltech). I also previously interned at NVIDIA.

韩芃睿  /  Email  /  Google Scholar  /  GitHub  /  LinkedIn  /  Twitter

News

[Apr. 2025] I am joining Prof. Evelina Fedorenko's group at MIT, working on language and thought in LLMs.
[Apr. 2025] I will begin the MSCS program at UIUC in Fall 2025, where I will continue working with Prof. Jiaxuan You.
[Oct. 2024] I am attending COLM 2024 to present our paper on synthetic data for debiasing.
[Oct. 2024] Two papers accepted to EMNLP 2024: one in Demo Track (Paper Copilot), and one in Findings (LLM cognitive errors).

Research

I’m interested in language and intelligence in both humans and machines, particularly the role language plays in shaping intelligent behavior, which I study through comparisons between LLMs and humans. I analyze their behavioral capabilities and failures, examine internal mechanisms, and develop strategies to enhance intelligent machine systems.

Selected Publications
In-Context Learning May Not Elicit Trustworthy Reasoning: A-Not-B Errors in Pretrained Language Models
Pengrui Han*, Peiyang Song*, Haofei Yu, and Jiaxuan You (* Equal Contribution)
Findings of Empirical Methods in Natural Language Processing (EMNLP), 2024
code

Motivated by the crucial cognitive phenomenon of A-not-B errors, we present the first systematic evaluation of the surprisingly vulnerable inhibitory control abilities of LLMs. We reveal that this weakness undermines LLMs' trustworthy reasoning capabilities across diverse domains, and we introduce various mitigations.

ChatGPT Based Data Augmentation for Improved Parameter-Efficient Debiasing of LLMs
Pengrui Han*, Rafal Kocielnik*, Adhithya Saravanan, Roy Jiang, Or Sharir, and Anima Anandkumar (* Equal Contribution)
Conference on Language Modeling (COLM), 2024
code

We propose a light and efficient pipeline that enables both domain and non-domain experts to quickly generate synthetic debiasing data to mitigate specific or general bias in their models with parameter-efficient fine-tuning.

Paper Copilot: A Self-Evolving and Efficient LLM System for Personalized Academic Assistance
Guanyu Lin*, Tao Feng*, Pengrui Han*, Ge Liu, Jiaxuan You (* Equal Contribution)
System Demonstration Track of Empirical Methods in Natural Language Processing (EMNLP), 2024
Huggingface Live Demo: Link

We present Paper Copilot, a self-evolving and efficient LLM system that provides personalized academic assistance.

Thought-Retriever: Don’t Just Retrieve Raw Data, Retrieve Thoughts
Tao Feng*, Pengrui Han*, Guanyu Lin*, Ge Liu, Jiaxuan You (* Equal Contribution)
ICLR Workshop on How Far Are We From AGI, 2024

We introduce Thought-Retriever, a novel model-agnostic algorithm that enables LLMs to effectively utilize external data without being limited by context length.

Exploring Social Bias in Downstream Applications of Text-to-Image Foundation Models
Adhithya Saravanan, Rafal Kocielnik, Roy Jiang, Pengrui Han, and Anima Anandkumar
NeurIPS Workshop on Failure Modes in the Age of Foundation Models, 2023
Proceedings of Machine Learning Research (PMLR)

We explore the social biases in text-to-image diffusion models used in commercial applications like image editing. By analyzing models like Stable Diffusion, we uncover significant biases, emphasizing the need for careful consideration when adopting these technologies for broader use.

Awards
  • Carleton College Chang-Lan Award (2024)
  • Caltech SURF Award (2023)
  • Carleton College Dean's List (2023)
Academic Services
  • Reviewer for conferences: ICLR, ICML, COLM, COLING.
  • Reviewer for workshops: Re-Align, LLM-Cognition, BehaviorML, LTEDI.

Site source