Site Reliability Team Leader

6 hours ago
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for an experienced SRE Team Lead to drive the reliability, observability, and automation practices across our private cloud infrastructure and operations. In this role, you will lead a team of site reliability engineers, own the engineering roadmap for monitoring and automation, and act as a key liaison between development, operations, and platform teams. You bring at least 3-4 years of hands-on people management experience and a deep technical background in SRE or DevOps disciplines.



What will you do?

Leadership & Team Management

Lead, mentor, and grow a team of SREs, providing technical direction, career development guidance, and day-to-day management.

Own the team roadmap for reliability, observability, and automation initiatives - prioritizing work, removing blockers, and driving delivery.

Conduct regular 1:1s, performance reviews, and hiring processes to build and sustain a high-performing team.

Foster a culture of operational excellence, blameless post-mortems, and continuous improvement.

Act as an escalation point for complex incidents and reliability issues, leading post-incident reviews and ensuring follow-through on action items.


Automation & Infrastructure

Design, develop, and maintain automation tools to support infrastructure and operations teams at scale.

Manage pipelines and infrastructure workflows using Jenkins, Ansible, Python, and Bash.

Drive the adoption of infrastructure-as-code practices across the organization.

Collaborate with system engineers to improve scalability, performance, and fault tolerance of critical systems.


Monitoring & Observability

Build and extend monitoring and alerting systems using Grafana, the ELK (Elastic) stack, Zabbix, and custom scripts.

Implement and enforce observability best practices to ensure full visibility into systems, applications, and infrastructure.

Define and track SLIs, SLOs, and error budgets across key services.

Partner with development teams to embed observability earlier in the software development lifecycle.


Database & Platform Support

Support monitoring and infrastructure integration for databases including MongoDB and PostgreSQL.

Maintain documentation and champion knowledge sharing around automation, monitoring, and reliability practices.
Requirements:
What you need:

Experience & Leadership

3-4+ years of experience in a people management or team lead capacity within SRE, DevOps, or infrastructure engineering.

5-8+ years of overall experience in SRE, DevOps, or infrastructure automation roles.

Proven track record of building, coaching, and retaining high-performing engineering teams.

Experience owning an engineering roadmap and driving cross-functional reliability initiatives.



Technical Skills

Strong scripting skills in Python and Bash; comfortable building and maintaining production-grade automation.

Hands-on experience with infrastructure automation tools, particularly Ansible.

Solid experience with monitoring and observability platforms - ELK stack, Grafana, and Zabbix.

Good understanding of CI/CD pipelines and related tooling, including Jenkins.

Familiarity with managing and monitoring MongoDB and PostgreSQL in a production environment.

Comfortable working in Linux-based environments.

Excellent problem-solving skills and strong written and verbal communication.



Ability to support the following:

Experience with cloud providers - AWS, GCP, or Azure.

Exposure to containerization technologies such as Docker and Kubernetes.

Familiarity with infrastructure provisioning using Terraform.

Experience introducing SRE practices (SLOs, error budgets, chaos engineering) at an organizational level.

Exposure to, and experience with, migrating or building AI tools to improve processes.
This position is open to all candidates.
 
Similar jobs that may interest you
4 hours ago
Jobs at CrowdStrike
Location: Tel Aviv-Yafo
Job Type: Full Time
CrowdStrike's Data Science Studio is seeking a pioneering Senior MLOps Engineer to establish and lead our MLOps function from the ground up. As the first MLOps engineer in the studio, you will play a foundational role in shaping how we build, deploy, and scale machine learning systems that protect thousands of organizations worldwide.

This is a unique opportunity to define the technical strategy, influence the technology stack, and architect the infrastructure that will power our AI/ML-driven security solutions for years to come.

This role combines strategic vision with hands-on execution. You'll work at the intersection of data science, engineering, and production operations - building production-grade systems that operate at immense scale while collaborating closely with highly technical data scientists and ML engineering teams across CrowdStrike.

What You'll Do:
- Architect MLOps infrastructure from the ground up: Design and implement the foundational MLOps platform, establishing best practices, tooling, and workflows that will scale with our growing data science initiatives
- Define technology strategy: Evaluate, select, and integrate MLOps technologies and platforms that best serve our needs - from experiment tracking and model versioning to deployment pipelines and monitoring systems
- Build production-grade ML pipelines: Develop robust, scalable pipelines for model training, validation, deployment, and monitoring that handle massive data volumes and ensure reliability in production
- Enable data scientist productivity: Create tools, frameworks, and automation that empower data scientists to move quickly from research to production while maintaining high quality and reliability standards
- Establish monitoring and observability: Implement comprehensive monitoring, logging, and alerting systems to ensure ML models perform optimally in production and issues are detected proactively
- Drive MLOps culture and practices: Champion best practices in ML engineering, CI/CD for ML, model governance, and reproducibility across the data science organization
- Collaborate cross-functionally: Partner closely with data scientists to understand their workflows and pain points, and work with ML engineering teams to ensure seamless integration with broader platform capabilities
- Scale for the future: Design systems with scalability, security, and maintainability in mind, anticipating the needs of a rapidly growing ML portfolio
Requirements:
- 6+ years of experience in MLOps, ML engineering, DevOps, or related infrastructure roles with focus on machine learning systems
- Production ML systems expertise: Proven track record of building and operating ML systems at scale in production environments
- Strong infrastructure and automation skills: Deep knowledge of cloud platforms (AWS, Azure, or GCP), containerization (Docker, Kubernetes), and infrastructure-as-code (Terraform, CloudFormation)
- ML pipeline proficiency: Hands-on experience with ML workflow orchestration tools (e.g., Airflow, Kubeflow, MLflow, Metaflow) and building end-to-end ML pipelines
- Programming excellence: Strong coding skills in Python; experience with additional languages is a plus
- CI/CD and DevOps practices: Expertise in building automated deployment pipelines, version control, and modern DevOps methodologies
- Strategic and hands-on balance: Ability to think architecturally about long-term solutions while rolling up your sleeves to implement them
- Collaborative mindset: Excellent communication skills and ability to work effectively with data scientists, engineers, and stakeholders with varying technical backgrounds
- Startup mentality: Comfort with ambiguity and ability to build from scratch in a fast-paced environment
This position is open to all candidates.
 
5 hours ago
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for an experienced SRE Engineer to drive the reliability, observability, and automation practices across our private cloud infrastructure and operations. In this role, you will be part of a team of site reliability engineers, own the engineering roadmap for monitoring and automation, and act as a key liaison between development, operations, and platform teams. You bring 4+ years of hands-on people management experience and a deep technical background in SRE or DevOps disciplines.

What will you do?

Automation & Infrastructure
- Design, develop, and maintain automation tools to support infrastructure and operations teams at scale.
- Manage pipelines and infrastructure workflows using Jenkins, Ansible, Python, and Bash.
- Drive the adoption of infrastructure-as-code practices across the organization.
- Collaborate with system engineers to improve scalability, performance, and fault tolerance of critical systems.

Monitoring & Observability
- Build and extend monitoring and alerting systems using Grafana, the ELK (Elastic) stack, Zabbix, and custom scripts.
- Implement and enforce observability best practices to ensure full visibility into systems, applications, and infrastructure.
- Define and track SLIs, SLOs, and error budgets across key services.
- Partner with development teams to embed observability earlier in the software development lifecycle.

Database & Platform Support
- Support monitoring and infrastructure integration for databases including MongoDB and PostgreSQL.
- Maintain documentation and champion knowledge sharing around automation, monitoring, and reliability practices.
Requirements:
What you need:

4+ years of overall experience in SRE, DevOps, or infrastructure automation roles.

Strong scripting skills in Python and Bash; comfortable building and maintaining production-grade automation.

Hands-on experience with infrastructure automation tools, particularly Ansible.

Solid experience with monitoring and observability platforms - ELK stack, Grafana, and Zabbix.

Good understanding of CI/CD pipelines and related tooling, including Jenkins.

Familiarity with managing and monitoring MongoDB and PostgreSQL in a production environment.

Comfortable working in Linux-based environments.

Excellent problem-solving skills and strong written and verbal communication.


Ability to support the following:
Experience with cloud providers - AWS, GCP, or Azure.
Exposure to containerization technologies such as Docker and Kubernetes.
Familiarity with infrastructure provisioning using Terraform.
Experience introducing SRE practices (SLOs, error budgets, chaos engineering) at an organizational level.
Exposure to, and experience with, migrating or building AI tools to improve processes.
This position is open to all candidates.
 
05/05/2026
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
As a Senior IT SRE Engineer, you will be a key player in ensuring the reliability, scalability, and performance of our critical IT infrastructure. You will leverage SRE principles and an automation-first mindset to build and maintain resilient hybrid cloud environments. This role is ideal for a candidate who thrives in a fast-paced, innovative setting and is passionate about solving complex challenges with cutting-edge technology.
Key Responsibilities
Provision, configure, and support resilient hybrid cloud deployment architectures using an Infrastructure-as-Code framework.
Proactively collaborate with development teams to ensure new applications are production-ready, scalable, and reliable from inception.
Develop and maintain tools and frameworks to automate operational tasks, including deployment, monitoring, and recovery.
Conduct thorough root cause analysis of production issues and implement preventative measures to improve system resilience, demonstrating strong problem-solving skills.
Manage CI/CD platforms, Linux infrastructure, and contribute to capacity planning and operational runbooks.
Design and implement proactive service monitoring, alerting, and trend analysis to maintain service availability and performance SLAs.
Participate in an on-call rotation to support critical applications and services, responding to and resolving incidents efficiently.
Contribute to comprehensive documentation related to infrastructure design, deployment, and operational procedures.
Requirements:
Bachelor's degree in Computer Science, Information Technology, or a related field, or equivalent practical experience.
6+ years of DevOps engineering experience on mission-critical, enterprise-level systems in a hybrid (both cloud and on-prem) environment.
3+ years of hands-on experience with cloud environments, preferably Google Cloud Platform (GCP).
Expertise in configuration management and Infrastructure-as-Code using frameworks such as Terraform and Ansible.
Strong programming/scripting knowledge in languages like Python, Bash, or Go for infrastructure automation.
Demonstrated experience with CI/CD pipelines (e.g., GitHub, Jenkins, Artifactory) and a strong foundation in Linux/Unix administration.
Preferred Qualifications
Experience with containerization and orchestration technologies, particularly Kubernetes.
Hands-on experience with monitoring and observability tools such as Datadog, Grafana, or Prometheus.
Understanding of networking principles including firewalls, load balancers, and complex network designs.
A curious and positive mindset with a passion for applied learning and challenging existing processes for continuous improvement.
This position is open to all candidates.
 
Location: Tel Aviv-Yafo
Job Type: Full Time and Hybrid work
Required: AI Infrastructure & Reliability Engineer
What this role is really about
You'll join a 3-person platform team within our Business Technology group, owning the internal infrastructure that our AI platform and its users depend on. This isn't a product engineering role, and it isn't ticket work or babysitting pipelines someone else built. You're building and operating the internal foundation that the company runs on. The work covers the full stack of platform engineering: core cloud infrastructure (AWS, Kubernetes, IaC), CI/CD pipelines, AI-driven infrastructure components, and the SRE and observability practice that keeps it all honest - metrics, alerting, incident response, and reliability standards. As our AI capabilities grow, so does the complexity underneath them, and staying ahead of that is central to the role. If you treat infrastructure as a product (reusable, automated, observable, and built to last), this is your kind of role.
Job responsibilities
DevOps & AI-Driven Infrastructure - own CI/CD, deployment processes, and release reliability. Build and operate cloud infrastructure that is automated, intelligent, and continuously self-improving - not just managed.
Design and build our Terraform repository and IaC pipeline from scratch - AI-assisted generation, drift detection, and policy enforcement built in.
Build AI-driven GitHub Actions pipelines - automated code review, risk assessment, and intelligent deployment decisions.
Manage Kubernetes workloads across AWS accounts - zero downtime, fully automated, nothing left behind.
Embed AI into the operational layer - proactive drift detection, automated remediation, and intelligent scaling toward a self-healing runtime.
Reliability & SRE - improve uptime, resilience, and incident response.
Define and enforce SLOs/SLIs, error budgets, and on-call practices.
Lead incident response, postmortems, and systemic reliability improvements.
Own AI-specific reliability: model latency SLOs, token quota monitoring, rate limit handling, fallback and retry strategies, and cost-per-request alerting.
Observability & Telemetry - increase visibility, reduce noise, improve troubleshooting.
Establish and continuously evolve the observability stack: metrics, logs, distributed tracing, and alerting tuned for both application and AI workloads.
AI / LLM Operations - bringing AI systems to production and operating them at scale, with a focus on reliability, performance, and trust.
Own the AI infrastructure layer: rate limits, quota management, latency SLOs, and fallback strategies (retries, circuit breakers).
Operate LLM APIs in production with resilience and cost attribution per team/model.
Requirements:
2-4 years of hands-on DevOps, SRE, or infrastructure engineering experience in production SaaS environments.
Strong AWS experience: multi-account architecture, cross-account IAM, serverless and event-driven services (Lambda, SQS, SNS, EventBridge), and EKS cluster management.
Proven Kubernetes experience in production, including cross-account migrations and stateful workload management.
Proficiency with Terraform - repository structure design, module architecture, and CI/CD pipeline implementation.
Hands-on experience building and maintaining GitHub Actions pipelines for end-to-end CI/CD workflows.
Working Python proficiency for scripting, internal tooling, and workflow automation.
Practical experience implementing observability stacks from scratch: metrics, logging, distributed tracing, and alerting.
Experience owning reliability practices: SLOs, incident response, and postmortem culture.
Nice to have
Hands-on experience operating LLM APIs in production: rate-limit and quota management, cost attribution per team/model, latency monitoring, and resilience patterns (retries, fallbacks, circuit breakers).
FinOps experience across cloud, AI, and observability spend.
Experience introducing self-healing or auto-remediation patterns in production.
This position is open to all candidates.
 
10/05/2026
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for a Software Developer Team Lead to join our community!
Key Responsibilities:
Lead and mentor a team of full-stack engineers, guiding them in design, development, and delivery of complex projects.
Drive the architecture and implementation of data-centric applications using modern technologies and cloud infrastructure.
Lead a fast-moving development team, ensuring strong execution and teamwork in a dynamic environment.
Collaborate with QA, Product, Design, and DevOps teams to ensure system scalability, reliability, and performance.
Conduct code reviews, enforce best practices, and foster a high-quality engineering culture.
Research and integrate new technologies to overcome technical challenges and improve productivity.
Own the full project lifecycle, from requirement gathering and design to deployment and monitoring in production.
We are looking for an experienced full-stack team lead to join our R&D organization and lead the design and development of large-scale, data-oriented applications.
In this role, you will manage a talented team of full-stack engineers while remaining hands-on and technically involved in building our core SaaS security platform.
You will play a key role in shaping system architecture, driving technological innovation, and ensuring the delivery of robust and scalable solutions in a fast-paced, dynamic environment.
Requirements:
7+ years of professional experience in Full Stack development, including at least 3 years as a Team Lead.
Proven hands-on experience building and maintaining large-scale, data-oriented platforms.
Strong understanding of cloud architecture and distributed systems (AWS preferred).
Experience developing and maintaining CI/CD pipelines and production-grade microservices.
Excellent communication and mentoring skills; ability to inspire and lead a team.
Problem-solving mindset and a proactive, can-do attitude.
Strong collaboration skills and the ability to work in a fast-paced environment.
Experience with the following tech stack: Python, TypeScript, React, Kubernetes, Kafka, Postgres.
Experience with NoSQL databases and data lake storage solutions - Advantage.
Experience working on SaaS-based security products - Advantage.
Experience with browser extensions or other endpoint clients - Advantage.
This position is open to all candidates.
 
05/05/2026
Location: Tel Aviv-Yafo
Job Type: Full Time
As a Senior DevOps Engineer supporting our Cortex Research Group, you will lead all DevOps and infrastructure initiatives that empower our researchers to move quickly, securely, and reliably. You will be responsible for designing, building, and maintaining the group's cloud environments, ensuring scalability, stability, and performance across a wide range of experimental and production workloads. You'll serve as the primary point of contact between the Research Group and other critical stakeholders, including Security, Networking, and Compliance teams, ensuring that research projects align with organizational standards while still enabling rapid innovation.
Key Responsibilities
Own and evolve the Research Group's cloud infrastructure and CI/CD pipelines to enable reproducible, automated, and scalable experimentation.
Define and implement standards for infrastructure-as-code, observability, monitoring, and resource optimization tailored to research use cases.
Proactively collaborate with security and compliance teams to enforce best practices for data governance, access controls, and regulatory requirements.
Partner with networking and platform engineers to integrate research workloads into the broader company ecosystem, ensuring seamless operation.
Serve as the primary technical liaison between the Research Group and stakeholders like Security, Networking, and Platform teams.
Mentor engineers and researchers on DevOps best practices, helping to instill a culture of operational excellence and applied learning.
Requirements:
Your Experience:
5+ years of demonstrated experience in a DevOps, Site Reliability Engineering (SRE), or cloud infrastructure role.
Strong proficiency with infrastructure-as-code (IaC) tools such as Terraform or Ansible.
Hands-on experience building and maintaining CI/CD pipelines using tools like Jenkins, GitLab CI, or GitHub Actions.
In-depth knowledge of at least one major cloud provider (GCP, AWS, Azure).
Preferred Qualifications
Experience with containerization and orchestration technologies, particularly Docker and Kubernetes.
Proficiency in a scripting or programming language such as Python or Go.
Familiarity with monitoring and observability tools like Prometheus, Grafana, or the ELK stack.
Experience supporting machine learning or research-focused environments.
This position is open to all candidates.
 
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for a Senior DevOps Engineer to join our R&D team in developing the next rising product in the health tech landscape. If you are looking for a challenging, influential position and are passionate about making an impact, this might be the role for you.

As a Senior DevOps Engineer, you'll play a key role in the design, development, testing, deployment, and monitoring of our infrastructure and products. In this position, you'll make significant contributions to our observability stack, helping build and maintain robust systems for logs, metrics, traces, and alerting.

Our ideal candidate is passionate about DevOps and observability, has strong communication skills, and thrives on constant improvement for both technology and processes. If you enjoy working on multiple projects in parallel and are a proactive team player, you'll fit right in.

This is a unique opportunity to join the core team of a fast-growing startup, where your contributions will have a direct impact on our product and success.

Responsibilities
Support and collaborate with cross-functional engineering teams using cutting-edge technologies.
Contribute to the design, implementation, and maintenance of monitoring, logging, and alerting systems (e.g., Prometheus, Grafana, Loki).
Secure, scale, and manage our cloud environments (AWS and GCP).
Design and implement automation solutions for both development and production.
Manage and improve our CI/CD pipelines for fast and safe delivery.
Lead best practices in infrastructure, observability, configuration management, and system hardening.
Continuously assess and improve existing infrastructure in line with industry standards.
Requirements:
5+ years of experience as a DevOps Engineer or similar software engineering role.
Proven experience with Docker and Kubernetes (EKS preferred).
Hands-on experience with monitoring and observability tools, including Prometheus, Grafana, Datadog, or similar.
Expertise in Terraform for AWS infrastructure-as-code deployments.
Strong collaboration and interpersonal communication skills.
Excellent analytical thinking and problem-solving mindset.
Proficiency with relational databases.
Solid knowledge of Python and Bash scripting.
Experience with test automation - an advantage.
This position is open to all candidates.
 
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for a hands-on DevOps Team Lead to take ownership of our infrastructure, DevOps practices, and automation pipelines.
You will be the technical and operational lead for a small but growing DevOps team, driving reliability, scalability, and security across our cloud environments.
In this role, you will split your time between leading and mentoring the team, designing and evolving infrastructure, and implementing solutions.
What You'll Do
Lead, mentor, and grow the DevOps team.
Define and enforce DevOps best practices across infrastructure, CI/CD, and security.
Manage the SeaPod Lab environment for developer and test usage.
Operate and evolve the SeaPod Server Linux infrastructure, deployed at scale worldwide, handling complex connectivity and security.
Maintain consistent baselines, update tools, and ensure fleet-wide monitoring and support.
Design, manage, and evolve AWS infrastructure (VPC, IAM, networking, RDS, EKS, etc.).
Operate and upgrade Kubernetes/EKS clusters, manage Helm charts, operators, and custom resources.
Define namespace policies, quotas, and resource allocations.
Drive security, compliance, and cost optimization.
Maintain and enhance GitLab CI pipelines for multiple workloads (Lambda, EKS, EC2, etc.).
Integrate testing, linting, and vulnerability scans into CI/CD workflows.
Build reusable pipeline components for microservices.
Own monitoring and alerting strategies (Grafana, CloudWatch, Coralogix, Prometheus).
Operate and tune PostgreSQL (RDS, Aurora) and manage backups/restores.
Manage distributed tracing. Lead the upgrade from Fluentd to OpenTelemetry.
Architect and deploy serverless solutions (Lambda, DynamoDB, API Gateway).
Integrate with event-driven services (SNS/SQS, Kinesis, RDS Proxy).
Manage IAM roles/policies, secrets, and security posture.
Requirements:
5+ years of hands-on DevOps, including 2+ years in a leadership or mentoring role.
Strong production experience with AWS services (VPC, RDS, EKS, IAM, Lambda).
Proven track record operating Kubernetes/EKS clusters at scale.
Expertise with Terraform (or similar IaC tools) and GitLab CI/CD (or equivalent).
Solid background in Linux systems administration, ideally managing large distributed fleets.
Practical experience with PostgreSQL in production (replication, tuning, backup/restore).
Hands-on with observability stacks (Prometheus, Grafana, CloudWatch, OpenTelemetry).
Experience designing and operating secure, compliant environments (SOC2/ISO27001 familiarity a plus).
This position is open to all candidates.
 
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for an Engineering Manager to lead the evolution of our engineering foundations - across both frontend and backend domains.

This role blends technical leadership, team management, and strategic impact. You'll guide a multidisciplinary team responsible for developer experience, infrastructure, and governance across the R&D organization.

The team's mission is to increase developer velocity, ensure product quality, and strengthen engineering consistency across all our products - from web applications to backend microservices.


What You'll Do


Technical & Strategic Leadership

Define and own the vision and roadmap for our development platform, including web, backend, and AI-enablement initiatives.

Partner with engineering leadership, architects, and product teams to remove friction and increase velocity across R&D.

Establish and govern engineering standards, testing practices, and secure development workflows.

Lead the evolution of our shared frameworks, build systems, and developer tools for scalability and security.


Frontend & Web Infrastructure

Own and evolve the UI development ecosystem, including build systems, testing frameworks, and local development experiences.

Govern our shared UI kit and design system, ensuring consistency, accessibility, and performance across all products.

Partner with UX and product engineers to create a cohesive, maintainable frontend foundation.


Backend & Microservices Infrastructure

Oversee and guide the microservices development framework, templates, and communication patterns.

Optimize monorepo builds, CI/CD pipelines, and runtime standards for efficiency, reliability, and observability.

Implement infrastructure-level protections that improve product quality and prevent regressions.


AI Enablement Across R&D

Champion and integrate AI-assisted development tools (code generation, code review, testing, documentation).

Define patterns and best practices for AI-driven workflows, enabling teams to move faster with higher quality.

Partner with data, DevOps, and product teams to explore how AI can optimize engineering processes and infrastructure.


People & Collaboration

Lead, mentor, and grow a cross-functional team of backend, frontend, and platform engineers.

Foster a culture of technical excellence, innovation, and continuous improvement.

Collaborate closely with Product, UX, Security, and DevOps leaders to align strategy and execution.
Requirements:
8+ years of engineering experience, with at least 3 years in a technical leadership or team lead role.
Strong technical foundation in frontend architecture and tooling (React, TypeScript, build pipelines, testing).
Solid understanding of backend infrastructure, CI/CD, microservices, and distributed systems.
Experience leading platform or developer productivity teams in large-scale environments.
Proven ability to establish engineering standards, governance, and automation across R&D.
Curiosity and practical understanding of AI-assisted development workflows and related tooling (e.g., code completion, documentation, or CI automation).
Excellent communication skills and a collaborative mindset that bridges frontend, backend, and AI initiatives.
This position is open to all candidates.
 
13/05/2026
Location: Tel Aviv-Yafo
Job Type: Full Time
Are you ready to kickstart your DevOps career and be part of the infrastructure powering the future of cybersecurity? Join Infinity Next, our company's next-generation, cloud-native platform delivering a suite of cutting-edge products such as WAF, SD-WAN, and more.
We are looking for a DevOps Engineer to join our dynamic and growing team. This is an opportunity to work alongside top engineers, gain hands-on experience in production environments, and help scale secure, high-performance services used by customers around the globe. You will be part of a fast-paced, startup-like environment, with the backing and stability of a global cybersecurity leader.
Key Responsibilities
Support the deployment and maintenance of scalable, multi-tenant environments across cloud platforms
Assist in automating infrastructure using Infrastructure-as-Code tools and CI/CD pipelines
Monitor and improve the reliability, performance, and security of platform services
Collaborate with development, product, and operations teams to deliver new features and improvements
Troubleshoot infrastructure and application issues in development and production environments
Implement custom user interfaces using the latest programming techniques and technologies
Design, develop, and maintain DevOps-related microservices that support platform automation and reliability
Design and integrate agentic AI capabilities into DevOps workflows to automate decision-making, incident response, and platform operations
Requirements:
Bachelor's degree in Computer Science or a related technical field
At least 3 years of experience as a DevOps Engineer
Strong interest in cloud technologies, DevOps methodologies, and automation
Knowledge of containerization and container orchestration technologies, such as Amazon EKS
Experience in the design, operation, and troubleshooting of Kubernetes core components and API extensions for cloud-native, distributed systems
Familiarity with Linux, basic networking, containers, and scripting
Understanding of CI/CD, cloud infrastructure, and monitoring concepts
Experience building maintainable and testable codebases, including API design and unit testing techniques
Hands-on experience applying GitOps principles to manage Kubernetes infrastructure and application deployments
Nice to Have
Exposure to Kubernetes in cloud environments such as Amazon EKS
Familiarity with Ingress Controllers, Kubernetes Gateway API, CloudFront, and Global Accelerator
Experience designing and developing Kubernetes operators and controllers
Experience or coursework with tools such as Terraform, Pulumi, Crossplane, and Helm
Hands-on experience with observability technologies such as Prometheus, Grafana, OpenTelemetry, and centralized logging systems
This position is open to all candidates.
 