Zhiyang Teng
Contact: zhiyang.teng (at) bytedance.com
Google Scholar
TikTok Global Live — Research Intern Hiring! 🚀
Location: Singapore (preferred), Beijing, Shanghai, Shenzhen, Guangzhou, or Australia.
We build multimodal foundation models and large-scale RL for MoE to power real-world Live experiences:
understanding fast-changing live scenes, following multi-speaker conversations, recognizing events and intents,
and enabling unified understanding and generation for image/video (e.g., live highlights, interactive assistants,
safer and smarter live content, and better creator/viewer engagement).
Who we’re looking for: hands-on builders who are rigorous, reliable, and love turning ideas into working systems.
Bonus skills: Triton/CUDA, Linear Attention (FLA), Omni Models, Large-Scale RL for MoE, multimodal LLMs,
diffusion-based multimodal LLMs, and unified image/video generation + understanding models.
Interested? DM me with your CV + a short note on what you’ve built (GitHub / papers / projects welcome).