We are looking for an experienced DevOps Engineer to join our high-performance team. You'll work closely with development teams to design and implement smarter processes and tools, while embracing a GenAI-driven mindset. In this role, you will help build and scale infrastructure that not only keeps our company running smoothly, but also powers next-generation AI-driven applications with speed, resilience, and efficiency.
A sample of our company's technology stack:
AWS, Kubernetes, Terragrunt, Ansible, Jenkins, ArgoCD, Argo Workflows, Service Mesh, Nginx, Cloudflare, HashiCorp Vault/Consul, Kafka, RabbitMQ, Prometheus, Grafana, VictoriaMetrics, CircleCI
Programming languages: Python, Node.js, Go, Kotlin
What am I going to do?
Build and maintain a large-scale, highly available cloud infrastructure with a focus on Kubernetes.
Improve resiliency and cost efficiency of our cloud infrastructure.
Use GenAI tools to automate troubleshooting, speed up incident resolution, and improve production reliability.
Develop AI-driven self-service solutions to accelerate developer issue resolution and resource provisioning.
Develop and adopt new tools to make Development and Operations processes at our company more efficient.
Collaborate with developers to optimize system performance, reliability, and scale.
Evolve and maintain our company's AWS infrastructure by improving existing services and adopting new ones.
Support AI/ML/GenAI services with scalable infrastructure and monitoring.
Maintain our company's availability by participating in DevOps on-call shifts.
Mentor DevOps engineers.
Requirements:
4+ years of hands-on DevOps / Platform Engineering experience in production public cloud environments (AWS preferred)
Strong, production-grade Kubernetes experience (design, deployment, scaling, and troubleshooting) with solid AWS experience (VPC, IAM, EC2, EKS, Load Balancers, DNS)
Experience designing and operating highly available, scalable infrastructure systems
Experience with managed and distributed databases (AWS Aurora, RDS, MongoDB, Redis)
Hands-on experience with Infrastructure as Code and configuration management (Terraform required; Terragrunt and Ansible are an advantage)
Experience with Docker and containerized workloads
2+ years of experience building and maintaining CI/CD pipelines (Jenkins, GitHub Actions)
Proficiency in Python for automation and strong Linux administration skills
Experience with monitoring and observability tools (Prometheus, Grafana)
Development experience and familiarity with GenAI platforms (AWS Bedrock, Vertex AI, OpenAI) - advantage.
This position is open to all candidates.