We are looking for an outstanding Software Engineer to join the NSV (Network Solutions Validation) tools group. As a senior team member, you will help develop high-performance software automation systems for NVIDIA's Data Center environments. You will work with NICs, operating systems, switches, HCAs, and CPU and GPU compute, as well as with architects, network engineers, and developers. We drive the data growth of the world's biggest companies. With talented engineers around the globe, the work environment is dynamic, meaningful, and fast-paced. Are you ready for the challenge?
What you'll be doing:
Design and develop an automation platform used to provision, configure, and monitor HPC data centers.
Implement scalable, reliable, and maintainable services that enhance cluster visibility and improve operational efficiency.
Collaborate closely with internal and external stakeholders to understand requirements and deliver robust full-cycle solutions.
Improve stability and performance across the provisioning pipeline through architectural enhancements and code optimizations.
Troubleshoot issues in distributed environments and contribute to system observability and reliability improvements.
Work cross-functionally with architects, DevOps engineers, product managers, and other stakeholders to ensure high-quality releases.
Participate in code reviews, technical design discussions, and continuous improvement activities within the team.
What we need to see:
B.Sc. in Computer Science, Engineering, or a related field (or equivalent practical experience).
5+ years of strong hands-on experience on Linux-based platforms.
Proficiency in scripting and automation (Bash, Python, Ansible).
Background in DevOps and Network Engineering practices.
Hands-on experience with large-scale network architectures, switches/routers, OVS, SR-IOV, and network operating/management systems.
Networking expertise: Ethernet, VLANs, TCP/UDP/IP, QoS, L2/L3 protocols, BGP, EVPN/VXLAN, and common network topologies.
Practical experience with containers and cloud-native technologies (Docker, Kubernetes), and with networking performance.
Experience with version control systems (Git) and CI/CD pipelines.
Independent, fast learner with a strong ownership mindset, excellent debugging and problem-solving skills, and effective communication abilities.
Ways to stand out from the crowd:
Experience as a Team Lead, Scrum Master, or in a similar leadership role.
Experience in planning, tracking, and delivering projects.
Familiarity with DevOps methodologies and tools (e.g., Jenkins, Ansible).
Hands-on experience with Docker and containerized environments.
Experience with agentic AI development.
This position is open to all candidates.