We are seeking a Senior Infrastructure Engineer to join our Core SRE team and help build the next generation of our stream-based observability platform. We deliver real-time data analytics at scale to some of the world's leading tech companies.
The Core SRE team is responsible for the foundational infrastructure that powers our platform:
Kubernetes Infrastructure: Managing over 10,000 nodes across multiple cloud providers and regions. Production is 100% Kubernetes-based.
Kafka Clusters: Maintaining critical, large-scale Kafka clusters processing billions of events per second.
Automation & Operators: Building and maintaining both open-source and custom Kubernetes operators to manage complex stateful workloads like Kafka.
Our tech stack is constantly evolving. It includes Kubernetes, Go (Golang), AWS, GCP, Kafka, Istio, and more.
Responsibilities:
Act as a hands-on technical leader with deep expertise in modern cloud infrastructure.
Serve as a go-to person on the team, leading through influence, not hierarchy.
Collaborate cross-functionally to refine requirements and propose innovative, scalable solutions.
Drive long-term, high-impact infrastructure projects across multiple teams, from design to implementation, within defined timelines.
Contribute to improving system reliability, performance, and cost-efficiency at scale.
Requirements:
5+ years of experience in DevOps, SRE, platform engineering, or infrastructure roles.
Deep understanding of Kubernetes internals: the API, CNI, scheduling, and container runtimes.
Strong hands-on experience with Kafka and Istio (or similar technologies), and with core networking protocols (HTTP, gRPC, TLS).
Proven experience managing large-scale cloud infrastructure (AWS, GCP, etc.).
Experience in incident response and troubleshooting complex distributed systems.
Some software engineering experience, preferably in Go (Golang).
Passion for automation, performance tuning, and operational excellence.
This position is open to all candidates.