We are adding a third DevOps engineer to increase delivery capacity, reduce single-person risk on critical systems, and grow our database infrastructure capability from within. If building resilient cloud and on-prem systems at scale sounds like the right challenge, we'd like to hear from you.
What you'll be doing:
Design, build, and operate Kubernetes infrastructure across Azure AKS and on-prem clusters, including ingress, autoscaling with KEDA, TLS management, and GPU-enabled workloads.
Extend and harden CI/CD pipelines in GitLab, manage runners across multiple environments, and evolve GitOps-based deployments through ArgoCD.
Maintain and improve the critical on-prem infrastructure - Linux servers, NGINX, container platforms, and networking - that several production workflows depend on.
Partner with development, data, and architecture teams to streamline delivery, improve observability with Datadog, and shorten time-to-recovery during incidents.
Contribute to flagship initiatives on the roadmap: per-site Kubernetes cluster rollouts, AKS upgrades and node pool reorganization, GPU cluster enablement, and secret management with Azure Key Vault and Sealed Secrets.
Automate provisioning and configuration across Azure resources and on-prem systems using infrastructure-as-code and scripting.
Troubleshoot across the full stack - from networking and certificates to container runtime and pipeline internals - turning incidents into durable improvements.
What we need to see:
Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience).
3+ years in a DevOps, SRE, or infrastructure engineering role.
Hands-on proficiency with Kubernetes and container tooling (for example, Docker) in production environments.
Track record of building and maintaining CI/CD pipelines, ideally in GitLab, including runner management and pipeline-as-code.
Fluency with AI-assisted development tools (such as Cursor, Codex, or Claude) as a regular part of daily engineering work.
Solid Linux administration skills and fluency in Bash.
Practical background with a major cloud platform, Azure preferred (or AWS/GCP).
Working knowledge of GitOps workflows and tooling such as ArgoCD or Flux.
A collaborative, ownership-driven mindset, with the accountability needed to operate business-critical systems.
Ways to stand out from the crowd:
Hands-on experience with on-prem Kubernetes at scale, including cluster bootstrap, MetalLB, and ingress configuration.
Familiarity with secret management via HashiCorp Vault, Azure Key Vault, or Sealed Secrets.
Operational background with SQL (PostgreSQL, MySQL) and/or MongoDB, including backups, replication, or performance tuning.
Contributions to observability improvements with Datadog.
This position is open to all candidates.