We're looking for a Principal Software Engineer to own the RL Gym platform end-to-end: from architecting multi-site web environments that simulate real-world attack surfaces, to optimizing our in-house orchestration harness (AgenticVerse) for high-performance delivery into customer training pipelines.
This is a builder role. You'll lead a small team (including a dedicated web environments engineer), operating with high autonomy and moving fast from concept to working prototype to production system. You'll interact directly with customer engineering teams to understand their infrastructure constraints and deliver environments that meet their scale and reliability requirements.
Why this role:
This is one of the few roles in the industry where your code directly influences how the next generation of AI models is trained. You'll be at the center of advancing AI safety, building systems that the world's top labs depend on to make their models more robust. The work is technically deep, the problem space is genuinely novel, and the field is moving faster than any team can keep up with alone. There's no playbook. You'll write it.
What you'll do:
Platform & performance:
Own and evolve AgenticVerse, our in-house orchestration harness that provisions and manages RL environments at scale, with a focus on performance: low-latency provisioning, high concurrency, minimal overhead per environment instance
Design and build isolated, reproducible web environments using Firecracker microVMs or Docker containers
Architect multi-site scenarios (3-4 interconnected web applications per task) with rich interactions: drag-and-drop, file uploads, authentication flows, LLM-in-the-loop components
Implement deterministic verifiers that evaluate agent behavior with zero ambiguity
Customer delivery:
Work directly with engineering teams at leading AI labs to integrate RL Gym environments into their training and evaluation pipelines
Translate customer specs into working environments, iterating rapidly on feedback
Own the technical relationship: SLAs, API contracts, integration architecture
Adapt environment delivery formats to customer infrastructure (real-time API calls vs. offline batch, managed vs. raw artifacts)
Build customer-facing UIs when needed (dashboards, environment configuration portals, monitoring interfaces)
Rapid prototyping:
Take ambiguous problem descriptions and produce working prototypes within days, not weeks
Validate new environment types, interaction patterns, and verifier approaches quickly
Build internal tooling that accelerates scenario authoring and testing
Requirements:
Must have:
8+ years of software engineering experience, with a track record of building production systems from zero
Deep expertise in infrastructure: Linux, containers (Docker), VMs (Firecracker or similar), networking, cloud platforms (AWS strongly preferred)
Strong Python skills and comfort with async/concurrent systems
Experience building platforms or developer tools (not just consuming them)
Full-stack capability: backend services, infrastructure-as-code, APIs, and frontend development (React or similar) for customer-facing interfaces
Demonstrated ability to work autonomously with minimal specification, making sound architectural decisions under ambiguity
Comfort working directly with external customers and translating technical constraints into engineering solutions
English fluency (written and verbal) for customer-facing communication
Nice to have:
Experience with reinforcement learning infrastructure, training pipelines, or evaluation frameworks
Background in security, adversarial testing, or trust & safety systems
Familiarity with browser automation, headless browsers, or web scraping at scale
Experience with Kubernetes operators or custom schedulers
Prior work in a 0-to-1 environment (startup, innovation lab, or R&D team building new products)
This position is open to all candidates.