Yongjin Yang | Research Scientist in AI Alignment, Safety and Multi-agent Systems

I am a PhD student at the University of Toronto, supported by the Connaught International Scholarship and advised by Professor Zhijing Jin. I am also affiliated with the Vector Institute. Previously, I completed my MS degree at KAIST AI, advised by Se-Young Yun, and collaborated closely with Professor Kimin Lee. I completed my undergraduate degree at Seoul National University, collaborating with Taesup Kim in my last semester.

My research focuses on enabling AI models to perform real-world tasks while being safe and reliable. Currently, I am primarily interested in LLM reasoning, open-endedness, and multi-agent systems, but I am open to exploring any interesting problem in the field.

Publications

^* denotes equal contribution, ^ denotes corresponding author

2025

Agent-to-Agent Theory of Mind: Testing Interlocutor Awareness among Large Language Models

Younwoo Choi, Changling Li, Yongjin Yang, Zhijing Jin

EMNLP 2025

[paper] [code]

Automated Skill Discovery for Language Agents through Exploration and Iterative Feedback

Yongjin Yang^*, Sinjae Kang^*, Juyong Lee, Dongjun Lee, Se-Young Yun^, Kimin Lee^

Preprint

[paper]

Revisiting Multi-Agent Debate as Test-Time Scaling: A Systematic Study of Conditional Effectiveness

Yongjin Yang^*, Euiin Yi^*, Jongwoo Ko, Kimin Lee^, Zhijing Jin^, Se-Young Yun^

Preprint, ICML 2025 MAS Workshop

[paper] [code]

Corrupted by Reasoning: Reasoning Language Models Become Free-Riders in Public Goods Games

David Guzman Piedrahita, Yongjin Yang, Mrinmaya Sachan, Giorgia Ramponi, Bernhard Schölkopf, Zhijing Jin

COLM 2025, ACL 2025 REALM Workshop (Oral & Spotlight)

[paper] [code]

Self-Training Elicits Concise Reasoning in Large Language Models

Tergel Munkhbat^*, Namgyu Ho^*, Seo Hyun Kim^*, Yongjin Yang, Yujin Kim, Se-Young Yun

ACL 2025 Findings

[paper] [code]

CSRT: Evaluation and Analysis of LLMs using Code-Switching Red-Teaming Dataset

Haneul Yoo, Yongjin Yang, Hwaran Lee

ACL 2025 Main

[paper] [code]

Automated Filtering of Human Feedback Data for Aligning Text-to-Image Diffusion Models

Yongjin Yang^*, Sihyeon Kim^*, Hojung Jung, Sangmin Bae, Sangmook Kim, Se-Young Yun^, Kimin Lee^

ICLR 2025

[paper] [code] [project]

MAQA: Evaluating Uncertainty Quantification in LLMs Regarding Data Uncertainty

Yongjin Yang, Haneul Yoo, Hwaran Lee

NAACL 2025 Findings

[paper] [code]

2024

Towards Difficulty-Agnostic Efficient Transfer Learning for Vision-Language Models

Yongjin Yang^*, Jongwoo Ko^*, Se-Young Yun

EMNLP 2024

[paper] [code]

Towards Unbiased Evaluation of Detecting Unanswerable Questions in EHRSQL

Yongjin Yang^*, Sihyeon Kim^*, Sangmook Kim^*, Gyubok Lee, Se-Young Yun, Edward Choi

DPFM Workshop @ ICLR 2024

[paper]

Leveraging Normalization Layer in Adapters With Progressive Learning and Adaptive Distillation for Cross-Domain Few-Shot Learning

Yongjin Yang, Taehyeon Kim, Se-Young Yun

AAAI 2024

[paper]

2023

HARE: Explainable Hate Speech Detection with Step-by-Step Reasoning

Yongjin Yang^*, Joonkee Kim^*, Yujin Kim^*, Namgyu Ho, James Thorne^, Se-Young Yun^

EMNLP 2023 Findings

[paper] [code]

Meta-Learning with Adaptive Weighted Loss for Imbalanced Cold-Start Recommendation

Minchang Kim^*, Yongjin Yang^*, Jung Hyun Ryu, Taesup Kim

CIKM 2023 (Oral)

[paper] [code]

Education

Sep 2025 - Present	University of Toronto, Ph.D. in Department of Computer Science
Mar 2023 - Feb 2025	KAIST, M.S. in Graduate School of AI
Mar 2017 - Feb 2023	Seoul National University, B.S. in Electrical and Computer Engineering

Academic Services

Reviewer

ACL ARR 2024 (Feb, June, Oct, Dec), ACL ARR 2025 (Feb, May, July), COLM 2025, ICLR 2025, NeurIPS 2025, AAAI 2026, ICLR 2026

Work Experience

Mar 2024 - June 2024	Naver AI Lab (Research Intern) Host: Hwaran Lee
Aug 2022 - Feb 2023	OSI Lab (Research Intern) Host: Se-Young Yun
Jul 2021 - Aug 2021	Samsung Electronics, MX (Summer Engineering Intern)

Mentoring Experience

Spring 2018	Principles of Physics
Fall 2021	Introduction to Algorithm
Fall 2025	Introduction to Programming Languages