We're looking for a DevOps Team Leader to lead our DevOps efforts as we scale. This is a hands-on leadership role requiring deep technical expertise, strong project and people management skills, and the ability to navigate complex stakeholder environments.
The ideal candidate will lead the day-to-day operations of the DevOps team, ensure operational excellence, drive platform improvements, and manage cross-functional alignment with Development, Finance, Security, and senior management.
Responsibilities
Team Leadership & Execution
Lead, mentor, and grow a high-performing DevOps team.
Manage day-to-day operations including incident management and task prioritization.
Ensure SLAs and compliance requirements are met.
Balance proactive platform improvements with reactive issue resolution.
Platform Ownership
Oversee the design, implementation, and maintenance of a secure, scalable, multi-region AWS infrastructure.
Own CI/CD pipelines, infrastructure-as-code (IaC), observability (logs, metrics, tracing), and automation tooling.
Ensure robust disaster recovery (DR) and business continuity practices are in place and regularly tested.
Stakeholder Collaboration
Act as the main point of contact between DevOps and internal and external stakeholders, including banks, regulators, security teams, NOC/SOC, and development teams.
Communicate clearly on priorities, incidents, risks, timelines, and platform status.
Represent DevOps in cross-functional planning and reviews.
Process & Standards
Define and evolve the DevOps team's SDLC, deployment standards, and incident response processes.
Drive best practices in monitoring, alerting, and reliability engineering.
Champion a culture of ownership, transparency, and continuous improvement.
Requirements
7+ years of DevOps/SRE experience, including at least 2 years in a leadership or tech lead role.
Proven experience managing multi-region AWS production environments.
Strong skills in Terraform, Kubernetes, CI/CD (e.g., GitHub Actions), and observability tools (e.g., Datadog, Prometheus, OpenTelemetry/OTLP).
Hands-on experience with high-availability systems, disaster recovery, and compliance-driven environments.
Ability to balance short-term firefighting with long-term vision.
Excellent communication and stakeholder management skills.
This position is open to all candidates.