I often write unofficial manuscripts and slides to help collect my thoughts.
I maintain a resource list for GFlowNets here.
I maintain a (possibly outdated) paper list for out-of-distribution generalization here (thanks to Irina for her help!).
Blogs
Knowledge Flow: Scaling Reasoning Beyond the Context Limit
Yufan Zhuang, Liyuan Liu, Dinghuai Zhang, Chandan Singh, Yelong Shen, Jingbo Shang, Jianfeng Gao
FlashRL: 8Bit Rollouts, Full Power RL
Your Efficient RL Framework Secretly Brings You Off-Policy RL Training
Feng Yao*, Liyuan Liu*, Dinghuai Zhang, Chengyu Dong, Jingbo Shang, Jianfeng Gao
Blog posts on addressing off-policy mismatch (from vLLM serving and rollout quantization) in modern LLM+RL systems
Towards 131k-Context dLLMs
Albert Ge, Chandan Singh, Dinghuai Zhang, Letian Peng, Yufan Zhuang, Ning Shang, Li Lyna Zhang, Liyuan Liu, Jianfeng Gao
Talk Slides
- At the intersection of probabilistic inference and exploration methods: link
- Review of counterfactual representation learning: link
- Review of out-of-distribution generalization: link
- Introduction to causal inference: link
- Review of learning in traditional CV methods: link
- Review of reweighting methods: link
- Review of unlabeled data used in adversarial training: link
- Sharing some random papers: link
- One paper about off-policy evaluation: link
- Review of normalizing flows: link
- Review of capsule networks: link
Notes
- Bayesian optimal experimental design: link
- Kernelized Wasserstein gradient flow: link
- Conformal inference basics: link
- A way to unify different probabilistic inference approaches: link
- Connection between adversarial robustness and other learning fields: link
- Stochastic analysis: link
- A review on interpreting deep neural networks: link
- A short survey on image inpainting: link
