Hanyang Zhao

I am a Ph.D. candidate in the Department of IEOR at Columbia University, advised by Professor Wenpin Tang and Professor David D. Yao. Before my Ph.D., I obtained a B.S. in Mathematics from Fudan University and an M.S. in Financial Engineering, also from Columbia.
I work on reinforcement learning (RL) and generative models (LLMs and diffusion models). I try to design algorithms from first mathematical principles while leveraging the structural properties of the underlying models.
Feel free to DM me (e.g. on Twitter) if you would like to chat about any research ideas!
News
May 01, 2025 | Our Scores as Actions paper has been accepted to ICML 2025! See everybody in Vancouver this summer! |
---|---|
Feb 05, 2025 | Our preference learning survey paper has been accepted by JAIR! |
Feb 04, 2025 | We propose a continuous-time RL method for RLHF of diffusion models. It outperforms discrete-time RL baselines in robustness and stability, and, thanks to its continuous-time nature, also adapts to diffusion models with high-order or black-box samplers. See our paper Scores as Actions! |