The XR Tech Research team is seeking highly motivated Research Interns to join us in advancing the next generation of video and image generation technologies. Our team is dedicated to conducting foundational, state-of-the-art research in video generation, with a strong focus on pushing the scientific and technical boundaries of generative AI to enable future immersive and interactive experiences. As a Research Intern, you will work alongside leading experts in computer vision, generative modeling, and multimodal learning, contributing to projects at the frontier of large-scale video and image generation models. Our work not only advances the academic field but also lays the groundwork for future XR systems and applications, shaping how people will interact, create, and connect in virtual and real-world environments.
About the Team
The XR Tech Research team has a long-standing history of groundbreaking contributions to generative AI research across modalities, from images and video to 3D and beyond. We are at the forefront of advancing foundational video generation models, collaborating with world-class research groups, and driving the next wave of innovation in immersive technologies. Our work is both academically impactful and tightly connected to real-world applications, enabling breakthroughs that redefine the future of XR.
Research Intern, Video & Image Generation (PhD) Responsibilities
Conduct research on advanced topics in video and image generation, including but not limited to generative diffusion and transformer architectures, spatio-temporal modeling, and multimodal integration.
Design, implement, and evaluate novel algorithms and model architectures.
Collaborate closely with researchers and engineers across the XR Tech Research team and the broader FAIR/GenAI teams.
Contribute to publications in top-tier conferences and journals in AI, computer vision, and machine learning.
Present findings and share insights that help shape both ongoing research and future directions.
Requirements: Minimum Qualifications
Currently pursuing a PhD in Computer Science, Electrical Engineering, or a related field, with a focus on machine learning, computer vision, or generative modeling
Strong research background with publications (or submissions) in top conferences such as CVPR, NeurIPS, ICLR, ICCV, ICML, or SIGGRAPH
Proficiency in deep learning frameworks such as PyTorch or TensorFlow
Solid understanding of generative models (diffusion models, GANs, VAEs, autoregressive transformers, etc.)
Strong programming skills and ability to work with large-scale datasets and compute infrastructure
Must obtain work authorization in the country of employment at the time of hire and maintain ongoing work authorization during employment
Preferred Qualifications
Research experience in video generation, temporal modeling, or multimodal learning
Strong track record of contributions to open-source projects, benchmarks, or shared research artifacts
Demonstrated ability to work collaboratively in interdisciplinary research environments
This position is open to all candidates.