We are seeking a Technical Marketing Engineer to join our Ethernet Networking team and help extend our performance leadership in AI. In this pivotal role, you will be the hands-on expert for our Spectrum-X Ethernet platform, showcasing its strengths for emerging AI use cases. You will design and run rigorous benchmarks on various GPU clusters, analyzing everything from LLM training to cutting-edge inference workloads. Your primary mission is to translate these performance results into compelling technical content, including white papers, blogs, and presentations, that clearly articulates why our Spectrum-X Ethernet solutions are the definitive choice for modern AI infrastructure.
What you'll be doing:
Design and execute performance benchmarks using industry-standard tools (e.g., MLPerf, UCX, our Collective Communications Library (NCCL), and CloudAI) and customer-representative AI workloads on our state-of-the-art GPU clusters.
Translate your benchmark data and technical insights into compelling, high-impact marketing assets and performance-driven sales enablement materials.
Collaborate closely with Product Management, ASIC and Software Architecture, and Sales teams to provide feedback on product features and ensure our performance results are technically accurate and impactful.
What we need to see:
B.Sc. in Computer Science or Software Engineering, or equivalent experience.
5+ years of experience benchmarking and analyzing high‑performance networking solutions, including RDMA, MPI, and large‑scale collective communication frameworks.
Hands‑on expertise in testing and benchmarking deep learning workloads on our GPUs with CUDA, TensorFlow, and PyTorch, focused on validating and demonstrating distributed training and inference performance over NCCL, RoCE, and RDMA.
Demonstrated proficiency in performance analysis methodologies and techniques.
Understanding of Ethernet and high-performance networking.
Programming experience in Python, Bash, and C.
Experience with distributed job orchestration (Slurm, Kubernetes).
Experience with Linux distributions.
Fast, self-directed learner with strong analytical and problem-solving skills.
In-depth knowledge and experience with AI workloads and benchmarking for large-scale distributed training/inference systems.
Ways to stand out from the crowd:
Strong performance analysis skills and methodologies, with experience using modern analysis tools.
Deep knowledge of AI/data center Ethernet network protocols and best practices (Clos fabrics, BGP, VXLAN, etc.).
Hands-on experience with automation, CI/CD pipelines, and DevOps practices.
Expertise in AI fabric telemetry, including metrics capture and analysis, as well as telemetry tools such as Prometheus and Grafana.
In-depth system knowledge and understanding (Intel/AMD/Arm CPUs, NVIDIA GPUs, NICs, memory, PCIe).
This position is open to all candidates.