We are building a real-time AI runtime platform for security algorithms running inline across our global cloud and physical PoPs.
We are looking for a hands-on AI Platform Team Lead to build and lead the team behind this platform: a high-throughput, low-latency engine that runs GPU-based models, from MMBERT-style models to LLMs, together with CPU-based heuristics and security logic.
This is a core infrastructure role for someone who wants to own the runtime layer of AI security at scale: performance, reliability, orchestration, GPU efficiency, and production-grade execution in the traffic path.
The team will also own the model lifecycle required to take AI security algorithms from research to large-scale production, working closely with research and algorithm teams.
Responsibilities
Build and lead our company's AI Platform team: hiring, mentoring, architecture, technical direction, and execution.
Own the AI security runtime platform for high-throughput, low-latency inline security decisions across our company's global cloud and PoPs.
Design the orchestration layer for running GPU models, CPU heuristics, and security logic as one production engine.
Own production readiness: observability, SLOs, autoscaling, reliability, rollout, rollback, and operational health.
Own the model lifecycle platform: registry, versioning, deployment, monitoring, and safe production rollout.
Work closely with research and algorithm teams to productionize AI security models and algorithms at scale.
Define the long-term platform strategy for AI runtime and model serving at our company.
Requirements
3+ years of leadership experience as a team lead, tech lead, or engineering manager.
3+ years of hands-on experience in AI inference, production ML infrastructure, model serving, or AI runtime platforms.
Strong experience with production inference technologies such as Triton, vLLM, CUDA, Kubernetes, Docker, PyTorch, ONNX, TensorRT, or similar.
3+ years of experience with Go, or strong experience with a similar high-performance backend language such as C++, Rust, or Java.
Experience with performance optimization, scalability, observability, and SLO-driven production ownership.
Strong system design skills, especially around distributed systems, performance, reliability, and production infrastructure.
Advantages
Experience with GPU optimization, GPU scheduling, GPU resource efficiency, quantization, runtime acceleration, or large-scale model serving.
This position is open to all candidates.