colored-dye's blog
baoyuntai [at] outlook [dot] com
I am Yuntai Bao, a third-year PhD candidate at the School of Software Technology, Zhejiang University, advised by Xuhong Zhang. I expect to graduate in 2028. My research interests include mechanistic interpretability (mech interp), AI safety, neural network learning dynamics, and general principles of ML systems. I have experience with steering vectors, model probes, and training data attribution.
Currently, I am committed to pragmatic interpretability: using theoretical and empirical insights from mech interp to enable model control that is both effective and efficient in compute and data. Beyond interpretability, I am working on LLM post-training, including RL, knowledge distillation, and LLM-based agents. I also have experience in cryptography and software/OS security.
Please feel free to reach out~