At our company, we are redefining cyber defense by combining AI and human expertise to create products that protect nations and critical infrastructure. This is more than a job; it's a dream job. Here we tackle real-world challenges, redefine AI and security, and make the digital world safer. Let's build something extraordinary together.
Our company's AI cybersecurity platform takes a new, multi-layered approach to the evolving security challenges that span the entire infrastructure of the most critical and sensitive networks. Built as part of a broader sovereign AI platform, our technology is designed to operate in on-premise, private cloud, and air-gapped environments, enabling nations to maintain full control over their data, infrastructure, and AI capabilities. Central to our company's proprietary Cyber Language Models are innovative technologies that provide contextual intelligence for the future of cybersecurity.
At our company, our talented team, driven by passion, expertise, and innovation, inspires us daily. We are not just dreamers; we are dream-makers.
Responsibilities:
Build and operate ML training infrastructure - distributed training pipelines, compute scheduling, and reproducible experiment workflows that data scientists rely on daily.
Own model serving and inference systems - packaging, deployment, autoscaling, A/B testing, canary rollouts, and latency/cost optimization for production models.
Run feature stores, model registries, and dataset versioning - enabling self-serve feature engineering, model lineage, and reproducible experiments across teams.
Build experiment tracking and evaluation infrastructure - automated evals, comparison dashboards, drift detection, and monitoring that give teams visibility into model behavior and performance.
Build and maintain production pipelines for training, fine-tuning, and serving domain models - owning reliability, reproducibility, and scale.
Build and maintain the monitoring and observability layer - model performance tracking, data and prediction drift detection, data quality validation, and alerting.
Improve performance and cost across the ML stack - training throughput, inference latency, batch vs. real-time tradeoffs, and compute cost management.
Ship shared tooling - libraries, templates, CI/CD for models, IaC, and runbooks - while collaborating across Data Platform, AI, Data Science, Engineering, and DevOps. Own architecture, documentation, and operations end-to-end.
Requirements:
5+ years in software engineering, with 2+ years focused on ML infrastructure, MLOps, or data-intensive systems
Engineering craft - Strong Python, distributed systems design, testing, secure coding, API design, CI/CD discipline, and production ownership.
ML platform & serving - Model serving frameworks (e.g., Triton, TorchServe, vLLM, Ray Serve); model packaging, deployment pipelines, and inference optimization
Training infrastructure - Distributed training pipelines (e.g., with PyTorch or JAX); experiment orchestration and reproducibility
ML lifecycle tooling - Feature stores, model registries, experiment tracking (e.g., MLflow, Weights & Biases); dataset versioning and lineage
Data pipelines - Building training and inference data pipelines; familiarity with tools like Spark, Airflow/Dagster, and streaming ingestion
Comfortable with AI coding tools like Cursor, Claude Code, or Copilot
Nice to Have:
Experience operating in constrained environments - on-premise, private cloud, or air-gapped deployments
Hands-on experience with simulation environments, synthetic data generation, or reinforcement learning workflows
Platform & infra - Kubernetes, AWS, Terraform or similar IaC, CI/CD, observability, incident response
Hands-on data science or applied ML experience
This position is open to all candidates.