The Ecosystems Engineering group is seeking a Principal Software Engineer to join our rapidly growing team to bridge the gap between cutting-edge hardware and world-class open-source software. This is a high-impact role at the intersection of Hybrid Cloud and AI. You will lead the design and productization of next-generation AI solutions, focusing on the deep integration of our software stack with Nvidia's hardware acceleration technologies.
This is a game-changing opportunity to join an open-source AI platform that harnesses the power of hybrid cloud to drive innovation. In this role, you will work with a diverse team of highly talented engineers, product teams, Nvidia engineers, and lighthouse customers.
You'll play a critical role in shaping the next generation of hybrid cloud platforms by directly contributing to our innovative AI and cloud offering. This is your chance to be at the forefront of AI's exciting evolution, joining an ecosystem that champions continuous learning, career growth, and professional development.
What You Will Do
Architect and lead the implementation of new features and joint solutions that combine our AI and cloud offerings with Nvidia's cutting-edge technologies: GPU, DPU, and more.
Explore deep code integration into various products, ensuring the portfolio works optimally with hardware accelerators and partner technologies.
Provide technical vision and leadership on critical and high-impact projects, ensuring non-functional requirements including security, resiliency, and maintainability are met.
Collaborate closely with UX, UI, QE, and cross-functional teams to deliver a great experience to our partners and customers.
Coordinate with team leads, architects, and other engineers on the design and architecture of our offerings.
Take responsibility for the quality of our offerings, participate in peer code reviews and continuous integration (CI), and respond to security threats.
Mentor, influence, and coach a distributed team of engineers, contributing to a culture of continuous improvement by sharing recommendations and technical knowledge.
Explore and experiment with emerging AI technologies relevant to software development, proactively identifying opportunities to incorporate new AI capabilities into existing workflows and tooling.
Requirements
7+ years of relevant technical experience in software development.
Advanced experience working in a Linux environment with at least one language like Golang, Python, Java, C, or C++.
Advanced experience with a container orchestration ecosystem like Kubernetes or OpenShift.
Strong experience with microservices architectures and concepts including APIs, versioning, monitoring, etc.
Experience with virtual networking or software-defined networking (SDN).
Experience with AI/ML technologies, including foundational frameworks, large language models (LLMs) and orchestration tools.
Ability to quickly learn and guide others on using new tools and technologies.
Proven ability to innovate and a passion for staying at the forefront of technology.
Excellent system understanding and troubleshooting capabilities.
Autonomous work ethic, thriving in a dynamic, fast-paced environment.
Proficient written and verbal communication skills in English.
The Following is Considered a Plus
Experience with cloud development for public cloud services (AWS, GCE, Azure).
Background in DevOps or site reliability engineering (SRE).
Experience with hardware accelerators (e.g., GPUs, CUDA, DOCA) for AI workloads.
Recent hands-on experience with distributed computation, either at the end-user or infrastructure provider level.
Experience with performance analysis tools.
Experience with Linux kernel development.
This position is open to all candidates.