We are looking for an outstanding Senior DevOps Engineer to join the team behind our revolutionary, large-scale mobile content discovery platform used by millions of users worldwide. In this role, you won't just keep the lights on; you will take a leading role in shaping our data infrastructure, setting architectural standards for AI/ML workloads, and bridging the gap between DevOps and Data Engineering.
Why you'll love this team: We move fast, use cutting-edge technologies, and value absolute technical excellence over rigid bureaucracy. If you are passionate about solving complex, high-traffic infrastructure puzzles and want to see your work directly impact millions of daily users, this is the sandbox you've been looking for.
What you'll be doing
Design Data-Native Cloud Solutions: Design and implement scalable data and AI/ML infrastructure across multiple environments using Kubernetes, orchestration platforms, and IaC to power our AI, ML, and analytics ecosystem
Accelerate Data/ML Engineer Experience: Spearhead improvements to data pipeline deployment, monitoring tools, and self-service capabilities that empower data teams to deliver insights faster with higher reliability
Engineer Robust Data/ML Platforms: Build and optimize infrastructure that supports diverse data workloads from real-time streaming to batch processing, ensuring performance and cost-effectiveness for critical analytics systems
Drive DevOps Excellence: Collaborate with engineering leaders across backend and ML teams, champion modern infrastructure practices, and mentor team members to elevate how we build, deploy, and operate data systems at scale
Collaborate on high-level technical designs with ML and Backend engineers to build resilient systems.
Requirements
5+ years of hands-on DevOps experience building, shipping, and operating production systems
Infrastructure as Code: design and implement infrastructure automation using tools such as Terraform, Pulumi, or CloudFormation (modular code, reusable patterns, pipeline integration)
Cloud platforms: deep experience with AWS, GCP, or Azure (core services, networking, IAM)
Kubernetes: strong end-to-end understanding of Kubernetes as a system (routing/networking, scaling, security, observability, upgrades), with proven experience integrating data-centric components (e.g., Kafka, RDS, BigQuery, Aerospike).
GitOps & CI/CD: practical experience implementing pipelines and advanced delivery using tools such as Argo CD / Argo Rollouts, GitHub Actions, or similar
Observability: metrics, logs, and traces; actionable alerting and SLOs using tools such as Prometheus, Grafana, ELK/EFK, OpenTelemetry, or similar
Scalability & Performance: Proven experience managing production environments characterized by high traffic volumes and large amounts of data, with a focus on maintaining system reliability and cost-efficiency at scale.
You might also have
Coding proficiency in at least one language (e.g., Python or TypeScript); able to build production-grade automation and tools.
Data Pipeline Orchestration: Demonstrated success building and optimizing data pipeline deployment using modern tools (Airflow, Temporal, Kubernetes operators) and implementing GitOps practices for data workloads
Data Engineer Experience Focus: Track record of creating and improving self-service platforms, deployment tools, and monitoring solutions that measurably enhance data engineering team productivity
Data Infrastructure Deep Knowledge: Extensive experience designing infrastructure for data-intensive workloads including streaming platforms (Kafka, Kinesis), data processing frameworks (Spark, Flink), storage solutions, and comprehensive observability systems.
This position is open to all candidates.