We are seeking a hands-on Senior Network Performance Engineer to join our Networking Insights team. This role is for an investigative engineer who will thrive in our diagnostics lab while also solving the most complex performance challenges in AI data centers.
We're looking for an engineer to investigate hardware behaviors, particularly the subtle phenomena that emerge when running demanding AI workloads at scale. If your passion is deep investigative work to understand complex behaviors, this role will place you at the forefront of the AI revolution.
What You'll Be Doing
Experimental Root Cause Analysis: Design and build targeted experiments from the ground up to replicate complex hardware behaviors observed in AI data centers. You will then analyze the results to hunt down the root cause of these behaviors, whether in silicon, firmware, or software.
Hands-On Lab Investigation: Spend your time in the lab, working directly with the most advanced networking ASICs and systems to profile performance and characterize behavior under stress.
Test Automation and Development: Write and debug advanced automation scripts (Python) to programmatically control traffic generators (e.g., IXIA) and manipulate the test environment to expose corner-case issues.
Lab Environment Management: Maintain and support our lab environment, including equipment setup and racking, procurement, inventory, and coordinating maintenance and upgrades to support ongoing investigations.
What We Need to See:
B.Sc. in Engineering/Computer Science or equivalent experience, with a strong foundation in hardware-software interaction.
5+ years of deep hands-on lab experience focused on hardware validation, testing, and performance-tuning.
Strong proficiency in Python for test automation, hardware diagnostics, and data analysis.
Familiarity with basic networking concepts (Ethernet, Routing) and large-scale network design.
Proven ability to collaborate effectively with cross-functional teams, including hardware, software, and architecture groups.
Curiosity and a problem-solving approach, driven to understand how things work at the fundamental level.
Ways to Stand Out From the Crowd:
Expertise in Ethernet protocols, L2/L3 routing, and large-scale data center network topologies.
Proficiency in scripting traffic generators (e.g., IXIA, Spirent) to compose intricate traffic scenarios, rather than simply running pre-existing scripts.
Expertise in validating and stress-testing network systems at the component level (e.g., NICs, switches), with a focus on hardware diagnostics beyond standard protocol testing.
Familiarity with the unique network architectures and operational challenges of large-scale AI, HPC, or hyperscale data center environments (e.g., RDMA/RoCE, congestion control, high-radix fabrics).
Hands-on experience in configuring and managing datacenter network equipment.
This position is open to all candidates.