We're growing, and we need experienced DevOps engineers to help us grow faster and bigger.
We expect this position to be filled by someone who is ready to tackle large production systems and wants to learn what it takes to scale a system, manage it, maintain it, and keep it operational at all times.
This includes, but is not limited to, suggesting, planning, and executing tasks that help achieve this goal.
Key Responsibilities:
Cloud Infrastructure Management: Provision, manage, and optimize resources across Azure (preferred), AWS, and GCP, ensuring high availability, cost-efficiency, and performance.
Kubernetes Orchestration: Architect and operate advanced Kubernetes environments, both managed (AKS, EKS, GKE) and on-prem (RKE, Rancher, OpenShift).
CI/CD Engineering: Develop and manage robust CI/CD pipelines using Bitbucket Pipelines, Argo Workflows, and Jenkins, automating build-test-deploy workflows.
Monitoring & Logging: Implement and manage monitoring with Prometheus and Grafana, and centralized logging with ELK/EFK stacks.
Infrastructure Automation: Leverage Terraform, Ansible, and scripting (Python/Bash) to build and manage infrastructure as code (IaC).
On-Premise and Air-Gapped Deployments: Architect and support isolated environments using local DNS (PowerDNS), registries (Harbor), GitOps, and secure deployment practices.
Security & Compliance: Implement IAM policies, secrets management (Vault), encryption, and secure software delivery pipelines.
Documentation & System Design: Author detailed technical documentation including architectural blueprints, SOPs, and disaster recovery plans.
Collaboration & Mentorship: Work cross-functionally with developers, product, and QA teams; mentor junior DevOps engineers; and drive a culture of excellence.
Customer Engagement: Participate in technical discussions and workshops with enterprise clients to support onboarding and production success.
Requirements:
Minimum Qualifications:
5+ years of experience in a DevOps, Site Reliability, or Platform Engineering role.
Strong command of Linux system administration, cloud networking, and container orchestration.
Proven experience with Azure, AWS, and GCP cloud services, with Azure being a strong preference.
Advanced skills in Kubernetes, with expertise in OpenShift, RKE, and Rancher.
Familiarity with GitOps, Bitbucket Pipelines, Helm, and ArgoCD.
Experience with observability using Prometheus, Grafana, Elasticsearch, Fluentd/Filebeat, and Kibana.
Hands-on expertise with Terraform, Ansible, and scripting languages like Bash or Python.
Knowledge of secure deployment practices, disaster recovery, and high availability designs.
Nice to Have:
Red Hat, Kubernetes, or cloud certifications (e.g., RHCA, CKA, Azure DevOps Expert).
Experience with GPU-enabled Kubernetes clusters.
Familiarity with DNS, image registry, or service mesh technologies.
Exposure to hybrid infrastructure environments.
Understanding of DevSecOps and compliance standards.
This position is open to all candidates.