Our company has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It's a unique legacy of innovation that's fueled by great technology and amazing people. Today, we're tapping into the unlimited potential of AI to define the next era of computing, an era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what's never been done before takes vision, innovation, and the world's best talent. As an employee, you'll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world.
We are seeking a highly skilled, forward-thinking software engineer to develop and prototype new advancements in distributed training and inference using our company's Spectrum-X AI fabric. This role offers a rare chance to pioneer AI and networking technology, contributing to ground-breaking projects that will define the landscape of large-scale AI systems. You will improve how AI applications interact with the network by refining communication, designing congestion control, writing NIC firmware, and extending switch SDK features to make AI factories more efficient. Your work will shape how large AI systems are developed, how they scale, and how fast they run.
What you'll be doing:
Prototype end-to-end solutions to improve distributed training and disaggregated inference performance.
Analyze and optimize communication flows across application, transport, and network layers.
Develop system software spanning communication libraries, drivers, and firmware integrations.
Collaborate with hardware, firmware, and SDK teams to co-design network features.
Validate and integrate prototypes into our company's AI infrastructure and products.
Requirements:
BSc/MSc/PhD in Computer Science or Electrical Engineering.
5+ years of relevant experience and/or knowledge.
Deep understanding of networking and communication internals (NCCL, RDMA/RoCE, congestion control).
Hands-on experience with HW/SW/FW integration and low-level programming (C/C++, kernel, drivers).
Some background in distributed training systems (such as PyTorch DDP, Megatron-LM, DeepSpeed).
Ways to stand out from the crowd:
Demonstrated innovation and leadership in turning prototypes into impactful product features.
Experience with programmable data planes (P4, eBPF, DOCA SDK, or switch SDKs).
Familiarity with NIC firmware scheduling, in-network compute, or congestion management.
Contributions to open-source projects, academic papers, or performance benchmarking tools.
Strong background in AI factory architectures, distributed inference, or network telemetry.
This position is open to all candidates.