As a DevOps Engineer, you will be responsible for the reliability, scalability, and efficiency of our SaaS products. Your success will be measured by your ability to achieve the following:
First 3 Months: Master our GitOps-based deployment pipelines. You will be expected to independently manage and troubleshoot deployments using ArgoCD and Kargo, and contribute to the team's on-call rotation.
First 6 Months: Enhance our CI/CD processes and workflow efficiency. You will lead the project to reduce average build and deployment times by 20% by optimizing GitHub Actions workflows and Helm charts and by introducing initial AI-assisted automation.
First 12 Months: Improve system scalability and reliability. You will design and implement infrastructure enhancements using Terraform to support a 25% increase in customer workload while maintaining 99.9% uptime.
Core Responsibilities
Deployment Pipeline Management: Build and maintain our GitOps-based deployment pipelines to achieve a 99% deployment success rate and reduce manual intervention by 30% within the first year.
Infrastructure Management: Manage and scale our Kubernetes infrastructure on GCP, optimizing resource utilization to achieve a 15% reduction in GCP spending over the next 18 months.
Automation and CI/CD: Enhance and maintain our GitHub Actions CI/CD pipelines to decrease the lead time for changes to production by 25% within the first year.
AI-Assisted Workflow Integration: Integrate AI-assisted tooling into day-to-day DevOps and engineering workflows to improve productivity, scalability, and operational efficiency. You will use AI tools to generate initial configuration drafts, validate infrastructure code, and automate repetitive work, with a goal of reducing repetitive manual tasks by 20% within the first 6 months and accelerating engineering execution while maintaining high quality standards.
System Reliability: Proactively improve system reliability and availability, with the objective of reducing critical production incidents by 50% within 12 months through improved monitoring, logging, and alerting.
Requirements: What We're Looking For
3+ years in DevOps/SRE: You have proven experience in a high-growth SaaS environment and can hit the ground running to help us scale our platform.
Google Cloud Platform (GCP): You possess a deep understanding of GCP services, particularly GKE, which is essential as our entire infrastructure is on GCP.
ArgoCD and Kargo: You have hands-on experience with GitOps and progressive delivery, which is key to our goal of achieving faster, more reliable deployments.
Kubernetes and Helm: You bring strong experience in managing and deploying applications on Kubernetes, as you will be responsible for the container orchestration of our microservices.
Terraform: You have expertise in infrastructure as code, which will be crucial for our project to scale our infrastructure and reduce costs.
Forward-Thinking Automation: You have a strong interest in or experience with leveraging emerging technologies, including AI tools, to modernize workflows, validate code, and eliminate repetitive manual tasks.
This position is open to all candidates.