Tech Lead - Cloud & DevOps Engineer
We are looking for a Tech Lead - Cloud & DevOps Engineer to lead the Platform team in designing, building, and operating scalable cloud infrastructure and Kubernetes-based systems for production workloads, IoT services, and AI-driven applications. This includes supporting environments that may also interact with multi-agent AI systems. Experience with AI and LLM technologies is not required but would be a significant advantage. This is a hands-on technical leadership role that combines architecture, implementation, team leadership, and mentorship. You will guide engineers, define best practices, and drive the overall DevOps and cloud strategy.
Responsibilities
* Lead the design and operation of scalable cloud infrastructure on AWS
* Architect, deploy, and manage production Amazon EKS (Kubernetes) clusters
* Own and continuously improve CI/CD pipelines (Jenkins, GitHub, Bitbucket, Argo CD)
* Define and enforce Infrastructure as Code (IaC) standards using Terraform
* Drive GitOps-based deployment strategies
* Build automation tools and backend services using Python
* Lead the observability strategy using Grafana, Amazon CloudWatch, OpenSearch, Pixie, and other monitoring tools
* Ensure production reliability, scalability, security, and cost optimization
* Manage cloud networking, IAM, security, and connectivity
* Administer Linux and Windows servers
* Lead production incident response and root cause analysis
* Mentor and support DevOps engineers
* Conduct code reviews, technical design reviews, and knowledge-sharing sessions
* Collaborate closely with software, embedded, data, and AI teams
* Define engineering standards, best practices, and platform architecture direction
Requirements
* Bachelor's degree in Computer Science, Software Engineering, Computer Engineering, or a related technical field, or equivalent practical experience
* 5+ years of experience in Cloud & DevOps, including technical leadership experience
* Strong hands-on experience with AWS
* Kubernetes (Amazon EKS preferred)
* Docker and Terraform
* CI/CD pipelines (Jenkins, GitHub, Bitbucket, Argo CD)
* Strong Linux administration skills
* Hands-on Windows Server administration
* Python scripting and automation
* Experience with production troubleshooting
* Strong knowledge of cloud networking, IAM, VPC, and cloud security
* Experience with Grafana, Amazon CloudWatch, and observability tools such as the ELK stack
* Proven experience in technical leadership and mentoring engineers
Nice to Have
* AWS Certified Solutions Architect (Associate or Professional)
* Experience with IoT or embedded systems
* Experience designing scalable distributed systems
* Experience operating high-scale production systems (millions of devices)
* Experience with AI/LLM tooling (LangChain, LiteLLM, Langfuse)
* Exposure to multi-agent AI systems
This position is open to all candidates.