Required Senior Software Engineer, AI Platform
About the role
As a Senior Software Engineer - AI, you'll design, build, and own production-grade AI agents that operate at the core of our cloud security platform. You'll work on distributed, cloud-native services that embed agentic AI workflows into our existing microservices architecture.
This role goes beyond building AI logic: you'll be responsible for operating AI systems in production, ensuring they are observable, reliable, and continuously improving through systematic evaluation and data-driven iteration.
On a typical day you'll:
Design and implement cloud-native, distributed services that power our AI-driven security features
Build and maintain agentic AI systems that reason over large-scale cloud security data and interact with multiple internal services
Own AI agents in production, including deployment, monitoring, troubleshooting, and performance optimization
Implement observability for AI systems, including metrics, logging, tracing, and alerting for agent behavior, quality, latency, and cost
Develop continuous evaluation pipelines for agentic solutions, including offline testing, regression detection, and production feedback loops
Design and optimize RAG pipelines that operate reliably over high-volume, high-variance security data
Apply strong software engineering practices: clear APIs, clean abstractions, robust error handling, and scalable data flows
Lead services end to end, from design and implementation to deployment and long-term operation
Collaborate closely with Data Platform, Product, and Security Research teams to ensure AI behavior is correct, explainable, and trustworthy
Requirements
5+ years of professional software engineering experience building and operating production systems
Strong proficiency in Python and TypeScript, and experience designing backend services
Solid experience building cloud-native, distributed systems in a microservices architecture
Hands-on experience building, deploying, and maintaining AI systems in production
Proven hands-on experience building AI systems using LLM and agentic frameworks in production
Practical experience with agentic AI workflows, including tool use, multi-step reasoning, and orchestration
Experience implementing observability and monitoring for complex systems (metrics, logs, traces)
Experience designing or working with evaluation frameworks for AI systems (quality, drift, latency, cost)
Ability to reason about tradeoffs and continuously improve systems based on real-world data
Big advantage
Experience evaluating AI systems in high-stakes domains (security, reliability, correctness)
Background in cloud security, cybersecurity, or large-scale SaaS platforms
Familiarity with RAG evaluation techniques, prompt versioning, and regression testing
Experience operating AI-enabled services at scale in AWS or similar cloud environments
This position is open to all candidates.