This position calls for a creative, adaptable engineer who is eager to learn new technologies and thrives in a dynamic, open source-oriented culture. You will be part of a centralized Red Hat performance engineering team that fosters innovation and collaboration to achieve world-class AI performance.
What you will do:
Design performance test plans for new MLPerf Training and Inference suites (datacenter and edge), primarily covering LLMs.
Execute accuracy, performance, and power submissions; triage regressions and drive improvements using LoadGen logs, traces, and other profiling tools.
Build and maintain highly optimized container-based harnesses (Podman/Kubernetes) that handle benchmark execution, dataset pre-processing, and compliance checks for reproducible CI runs.
Profile kernels, GPU runtimes, and distributed collectives; propose patches to the Red Hat AI software stack to remove bottlenecks revealed by benchmark results.
Represent Red Hat in working groups; upstream fixes and new benchmark proposals.
Present results and best practices at premier open source and industry conferences; author technical blogs and white papers that translate benchmark data into customer value.
Requirements:
5+ years of relevant industry experience in performance engineering or ML infrastructure
Hands-on experience with MLPerf (Training or Inference) harnesses
Strong proficiency in Python and Bash, plus one systems language (Go, Rust, or C++)
Expert Linux skills (cgroups, scheduler, perf, NUMA, GPU drivers, etc.)
Experience with container orchestration (Kubernetes, OpenShift)
Performance-profiling literacy (nvprof, Nsight Systems, perf, eBPF)
Clear written & spoken English
This position is open to all candidates.