Sr. DevOps Engineer - Cloud, XSPM (Hybrid, ISR)

01/06/2026
Location: Tel Aviv-Yafo
Job Type: Full Time
As a Senior DevOps Engineer within the XSPM team, you will be a critical, go-to technical expert responsible for the health, performance, and evolution of our database and infrastructure systems. When production databases degrade or behave unexpectedly, you are the person who dives deep, investigating root causes hands-on, understanding the underlying mechanics of the problem, and designing lasting solutions. Your mastery of database systems makes you the authority the team relies on to diagnose complex performance issues, architect better data solutions, and ensure our infrastructure scales with confidence.

Beyond databases, you will drive our DevOps practices end-to-end - CI/CD pipelines, infrastructure automation, and operational reliability across the XSPM platform. This is a high-impact, highly visible role at the intersection of database engineering and DevOps, where your expertise directly shapes how the team delivers and operates at scale.

We're a highly collaborative, friendly, inclusive and diverse group that prizes collaboration over competition. We provide opportunities to learn new skills, mentor fellow engineers, and contribute to the direction of both the team and the products for which we're responsible. We work in a distributed, high-trust environment where you manage your own time and have the flexibility to balance your work and personal life.

What You Will Do:

Serve as the team's database expert, the first person to investigate, diagnose, and resolve complex performance problems across our production database systems (MongoDB, OpenSearch, PostgreSQL, Cassandra).

Perform deep-dive root cause analysis on database performance issues, understanding query execution internals, resource consumption patterns, cluster behavior, and system-level interactions to identify the real source of problems, not just symptoms.

Design and propose better database architectures and solutions, recommending when to re-architect data models, migrate workloads, introduce new technologies, or redesign how services interact with their data layer.

You will put in every effort within the team to ensure the data architecture is well designed.

Own capacity planning, scaling strategies, and high-availability designs for database clusters, ensuring systems are built to handle the team's growth trajectory.

Act as the bridge between development and infrastructure, advising engineers on how their application patterns impact database performance and guiding them toward sustainable solutions.

Build and maintain CI/CD pipelines, infrastructure-as-code (Terraform, Helm, Kubernetes manifests), and automated deployment workflows for the XSPM team's services.

Design and manage observability stacks (dashboards, alerting rules, and SLOs) to maintain best-in-class availability for critical data pipelines and services.

Drive infrastructure automation to reduce operational toil, including automated scaling, self-healing systems, and configuration management.

Participate in on-call rotations, incident response, and post-incident reviews, driving root-cause analysis and long-term reliability improvements.

Evaluate and adopt new database technologies and infrastructure tooling that align with the team's evolving data architecture needs.
Requirements:
7+ years of experience in DevOps, SRE, DBA, or infrastructure engineering, with significant hands-on responsibility for production database systems at scale.

Expert-level knowledge of a common database such as MongoDB, OpenSearch, or PostgreSQL: a deep understanding of its internals, performance characteristics, replication, and sharding, and the ability to diagnose and solve complex issues from first principles.
This position is open to all candidates.
 
8675475
Similar jobs that may interest you
24/05/2026
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for an experienced SRE Team Lead to drive the reliability, observability, and automation practices across our private cloud infrastructure and operations. In this role, you will lead a team of site reliability engineers, own the engineering roadmap for monitoring and automation, and act as a key liaison between development, operations, and platform teams. You bring at least 3-4 years of hands-on people management experience and a deep technical background in SRE or DevOps disciplines.



What will you do?

Leadership & Team Management

Lead, mentor, and grow a team of SREs, providing technical direction, career development guidance, and day-to-day management.

Own the team roadmap for reliability, observability, and automation initiatives - prioritizing work, removing blockers, and driving delivery.

Conduct regular 1:1s, performance reviews, and hiring processes to build and sustain a high-performing team.

Foster a culture of operational excellence, blameless post-mortems, and continuous improvement.

Act as an escalation point for complex incidents and reliability issues, leading post-incident reviews and ensuring follow-through on action items.


Automation & Infrastructure

Design, develop, and maintain automation tools to support infrastructure and operations teams at scale.

Manage pipelines and infrastructure workflows using Jenkins, Ansible, Python, and Bash.

Drive the adoption of infrastructure-as-code practices across the organization.

Collaborate with system engineers to improve scalability, performance, and fault tolerance of critical systems.


Monitoring & Observability

Build and extend monitoring and alerting systems using Grafana, the ELK (Elastic) stack, Zabbix, and custom scripts.

Implement and enforce observability best practices to ensure full visibility into systems, applications, and infrastructure.

Define and track SLIs, SLOs, and error budgets across key services.

Partner with development teams to embed observability earlier in the software development lifecycle.


Database & Platform Support

Support monitoring and infrastructure integration for databases including MongoDB and PostgreSQL.

Maintain documentation and champion knowledge sharing around automation, monitoring, and reliability practices.
Requirements:
What you need:

Experience & Leadership

3-4+ years of experience in a people management or team lead capacity within SRE, DevOps, or infrastructure engineering.

5-8+ years of overall experience in SRE, DevOps, or infrastructure automation roles.

Proven track record of building, coaching, and retaining high-performing engineering teams.

Experience owning an engineering roadmap and driving cross-functional reliability initiatives.



Technical Skills

Strong scripting skills in Python and Bash; comfortable building and maintaining production-grade automation.

Hands-on experience with infrastructure automation tools, particularly Ansible.

Solid experience with monitoring and observability platforms - ELK stack, Grafana, and Zabbix.

Good understanding of CI/CD pipelines and related tooling, including Jenkins.

Familiarity with managing and monitoring MongoDB and PostgreSQL in a production environment.

Comfortable working in Linux-based environments.

Excellent problem-solving skills and strong written and verbal communication.



Ability to support the following:

Experience with cloud providers - AWS, GCP, or Azure.

Exposure to containerization technologies such as Docker and Kubernetes.

Familiarity with infrastructure provisioning using Terraform.

Experience introducing SRE practices (SLOs, error budgets, chaos engineering) at an organizational level.

Exposure to and experience with migrating or building AI tools to improve processes.
This position is open to all candidates.
 
8662300
24/05/2026
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for an experienced SRE Engineer to drive the reliability, observability, and automation practices across our private cloud infrastructure and operations. In this role, you will be part of a team of site reliability engineers, own the engineering roadmap for monitoring and automation, and act as a key liaison between development, operations, and platform teams. You bring at least 4 years of hands-on people management experience and a deep technical background in SRE or DevOps disciplines.

What will you do?

Automation & Infrastructure
- Design, develop, and maintain automation tools to support infrastructure and operations teams at scale.
- Manage pipelines and infrastructure workflows using Jenkins, Ansible, Python, and Bash.
- Drive the adoption of infrastructure-as-code practices across the organization.
- Collaborate with system engineers to improve scalability, performance, and fault tolerance of critical systems.

Monitoring & Observability
- Build and extend monitoring and alerting systems using Grafana, the ELK (Elastic) stack, Zabbix, and custom scripts.
- Implement and enforce observability best practices to ensure full visibility into systems, applications, and infrastructure.
- Define and track SLIs, SLOs, and error budgets across key services.
- Partner with development teams to embed observability earlier in the software development lifecycle.

Database & Platform Support
- Support monitoring and infrastructure integration for databases including MongoDB and PostgreSQL.
- Maintain documentation and champion knowledge sharing around automation, monitoring, and reliability practices.
Requirements:
What you need:

4+ years of overall experience in SRE, DevOps, or infrastructure automation roles.

Strong scripting skills in Python and Bash; comfortable building and maintaining production-grade automation.

Hands-on experience with infrastructure automation tools, particularly Ansible.

Solid experience with monitoring and observability platforms - ELK stack, Grafana, and Zabbix.

Good understanding of CI/CD pipelines and related tooling, including Jenkins.

Familiarity with managing and monitoring MongoDB and PostgreSQL in a production environment.

Comfortable working in Linux-based environments.

Excellent problem-solving skills and strong written and verbal communication.


Ability to support the following:
Experience with cloud providers - AWS, GCP, or Azure.
Exposure to containerization technologies such as Docker and Kubernetes.
Familiarity with infrastructure provisioning using Terraform.
Experience introducing SRE practices (SLOs, error budgets, chaos engineering) at an organizational level.
Exposure to and experience with migrating or building AI tools to improve processes.
This position is open to all candidates.
 
8662378
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We're looking for a DevOps Architect to help shape the infrastructure strategy behind our Revenue AI platform. This role sits at the center of our engineering ecosystem, driving architectural direction, improving operational excellence, and enabling teams to scale with confidence. You'll work across engineering groups to identify systemic gaps, define scalable standards, and accelerate execution without becoming a delivery bottleneck.
You'll Own:
Infrastructure Strategy & Standards: Define and evolve our cloud and infrastructure architecture across Kubernetes, networking, observability, security, and data platforms. Establish clear standards and scalable best practices that enable teams to move faster with consistency and reliability.
Technical Debt & System Health Visibility: Continuously identify, prioritize, and drive resolution of cross-team technical debt, architectural gaps, and operational inefficiencies. Create organizational visibility around the most critical infrastructure challenges and opportunities.
Cross-Org Technical Leadership: Partner closely with engineering leaders and teams to influence architectural decisions, challenge assumptions, and ensure solutions are scalable, maintainable, and secure. Lead through expertise and influence, not direct ownership.
Developer Enablement & Engineering Velocity: Provide frameworks, tooling direction, and lightweight prototypes or POCs that empower teams to execute independently with higher quality and efficiency.
Critical Infrastructure Initiatives: Drive major cross-functional initiatives around reliability, scalability, security, observability, and cost optimization from identification through execution and measurable impact.
You'll Solve:
Scaling Complexity: How do we maintain simplicity, reliability, and operational clarity while supporting rapid growth and increasingly complex distributed systems?
Cross-Team Alignment: How do we create architectural consistency across independent engineering groups without slowing down innovation and execution?
Operational Excellence at Scale: How do we proactively surface and resolve systemic weaknesses before they become production issues?
Balancing Speed & Sustainability: How do we enable fast delivery today while protecting the long-term health and scalability of the platform?
AI Infrastructure Evolution: How do we build infrastructure that supports modern AI/ML workloads, GPUs, large-scale data pipelines, and future platform requirements?
You'll Impact:
Platform Reliability & Scalability: Your work will directly improve the resilience, scalability, and operational maturity of our infrastructure platform.
Engineering Efficiency: By creating better standards, tooling, and architectural guidance, you'll act as a force multiplier for engineering teams across the company.
Long-Term System Health: You'll help reduce operational friction, minimize technical debt, and ensure our infrastructure can support long-term business growth.
Execution Quality Across Teams: Your influence will elevate engineering quality, decision-making, and operational discipline throughout the organization.
Requirements:
A Deep Technical Expert: Someone with 8+ years of hands-on experience with AWS and cloud-native infrastructure at scale, including strong Kubernetes expertise and distributed systems knowledge.
An Infrastructure Architect: Someone with deep experience in Infrastructure as Code and GitOps methodologies using tools like Terraform, Crossplane, or Pulumi.
A Pragmatic Builder: A strong engineer with programming experience in Python or Go who can build tools, prototypes, and automation when needed.
A Systems Thinker: Someone who can identify patterns, uncover systemic issues, and drive improvements across complex technical environments.
An Influential Technical Leader: Someone with proven experience leading cross-team initiatives and driving alignment without direct authority.
This position is open to all candidates.
 
8665155
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
Required DevOps Engineer
About the role:
Our DevOps team operates the infrastructure that powers our AI and Computer Vision platform across construction sites in 15+ countries. From data pipelines and ML workloads to backend services - you'll work with a diverse, modern, Kubernetes-based stack and have real influence on how we build, deploy, and operate.
What you'll do:
Own Multi-Cloud Infrastructure: Work alongside the team to design, scale, and operate our high-scale, multi-region production infrastructure across AWS and GCP, powering construction sites globally.
Drive Kubernetes at Scale: Manage and evolve our Kubernetes platform on EKS, leveraging GitOps practices with ArgoCD and Helm to enable safe, fast, and reliable deployments.
Build Robust CI/CD: Design and maintain CI/CD pipelines that empower dozens of engineers to ship confidently - with automation, testing, and progressive delivery built in.
Tackle Diverse Infrastructure Challenges: Work hands-on with a wide variety of workloads - from heavy data processing and Computer Vision pipelines to backend services and ML inference - each with unique scaling, performance, and reliability requirements.
Ensure Reliability & Observability: Build and maintain world-class observability (metrics, logs, tracing, alerting) so that issues are caught early and resolved fast. Performance, reliability, and scalability are at the core of what you do.
Security & Cost: Partner with the team to strengthen our security posture, identity and access management, compliance, and cloud cost optimization across both clouds.
Ownership from 0 to 1: You will have real influence over our architecture and tooling. We want engineers who care about shaping what we build and how we build it, ensuring performance, security, and observability are baked in from day one.
Requirements:
A seasoned DevOps / Infrastructure engineer (5+ years) with strong hands-on experience in production cloud environments.
Proven expertise operating large-scale, distributed systems - with deep understanding of Kubernetes, networking, and cloud-native architecture.
Strong experience with multi-cloud environments (AWS and/or GCP), Infrastructure-as-Code (Terraform), and GitOps workflows (ArgoCD, Flux, or similar).
Hands-on experience with CI/CD systems (Jenkins, GitHub Actions, etc.).
Solid scripting and automation skills (Python, Bash, or Go).
Proven track record of being a collaborative team player who partners closely with developers, ML engineers, and cross-functional stakeholders across the organization.
Experience with observability stacks (Prometheus, Grafana, OpenTelemetry, Logz.io, or similar).
Experience with databases (relational and/or NoSQL) - including operational aspects like backups, migrations, and performance tuning.
AI-Native Engineering: You are an AI-native engineer who leverages LLMs and agentic tools (like Cursor, Copilot, or Claude) not just for command completion, but as a core operational partner - automating diagnostics, runbooks, and infrastructure workflows so you can focus on the critical things.
This position is open to all candidates.
 
8670484
Location: Tel Aviv-Yafo
Job Type: Full Time and Hybrid work
Required AI Infrastructure & Reliability Engineer
What this role is really about
You'll join a 3-person platform team within our Business Technology group, owning the internal infrastructure that our AI platform and its users depend on. This isn't a product engineering role, and it isn't ticket work or babysitting pipelines someone else built. You're building and operating the internal foundation that the company runs on. The work covers the full stack of platform engineering: core cloud infrastructure (AWS, Kubernetes, IaC), CI/CD pipelines, AI-driven infrastructure components, and the SRE and observability practice that keeps it all honest - metrics, alerting, incident response, and reliability standards. As our AI capabilities grow, so does the complexity underneath them, and staying ahead of that is central to the role. If you treat infrastructure as a product - reusable, automated, observable, and built to last - this is your kind of role.
Job responsibilities
DevOps & AI-Driven Infrastructure - own CI/CD, deployment processes, and release reliability. Build and operate cloud infrastructure that is automated, intelligent, and continuously self-improving - not just managed.
Design and build our Terraform repository and IaC pipeline from scratch - AI-assisted generation, drift detection, and policy enforcement built in.
Build AI-driven GitHub Actions pipelines - automated code review, risk assessment, and intelligent deployment decisions.
Manage Kubernetes workloads across AWS accounts - zero downtime, fully automated, nothing left behind.
Embed AI into the operational layer - proactive drift detection, automated remediation, and intelligent scaling toward a self-healing runtime.
Reliability & SRE - improve uptime, resilience, and incident response.
Define and enforce SLOs/SLIs, error budgets, and on-call practices.
Lead incident response, postmortems, and systemic reliability improvements.
Own AI-specific reliability: model latency SLOs, token quota monitoring, rate limit handling, fallback and retry strategies, and cost-per-request alerting.
Observability & Telemetry - increase visibility, reduce noise, improve troubleshooting.
Establish and continuously evolve the observability stack: metrics, logs, distributed tracing, and alerting tuned for both application and AI workloads.
AI / LLM Operations - bringing AI systems to production and operating them at scale, with a focus on reliability, performance, and trust.
Own the AI infrastructure layer: rate limits, quota management, latency SLOs, and fallback strategies (retries, circuit breakers).
Operate LLM APIs in production with resilience and cost attribution per team/model.
Requirements:
2-4 years of hands-on DevOps, SRE, or infrastructure engineering experience in production SaaS environments.
Strong AWS experience: multi-account architecture, cross-account IAM, serverless and event-driven services (Lambda, SQS, SNS, EventBridge), and EKS cluster management.
Proven Kubernetes experience in production, including cross-account migrations and stateful workload management.
Proficiency with Terraform - repository structure design, module architecture, and CI/CD pipeline implementation.
Hands-on experience building and maintaining GitHub Actions pipelines for end-to-end CI/CD workflows.
Working Python proficiency for scripting, internal tooling, and workflow automation.
Practical experience implementing observability stacks from scratch: metrics, logging, distributed tracing, and alerting.
Experience owning reliability practices: SLOs, incident response, and postmortem culture.
Nice to have
Hands-on experience operating LLM APIs in production: rate-limit and quota management, cost attribution per team/model, latency monitoring, and resilience patterns (retries, fallbacks, circuit breakers).
FinOps experience across cloud, AI, and observability spend.
Experience introducing self-healing or auto-remediation patterns in production.
This position is open to all candidates.
 
8659781
13/05/2026
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
At UVeye, we're on a mission to redefine vehicle safety and reliability on a global scale. Founded in 2016, we have pioneered the world's first fully automated suite of vehicle inspection systems. At the heart of this innovation lies our advanced AI-centric technology, representing the pinnacle of computer vision, machine learning, and generative AI within the automotive sector. With over $380M in funding and strategic partnerships with industry giants such as Toyota, Amazon, General Motors, Volvo, and Hertz, our technology is utilized in manufacturing plants, dealerships, wholesale auctions, delivery fleets, seaports, and more. Our growing global team of over 200 employees is committed to creating a workplace that celebrates diversity, encourages teamwork, and strives for excellence.
We are looking for a driven, systems-minded Release Engineer to join our AI-Ops team. In this role, you will be the execution layer of our delivery pipeline—the technical gatekeeper who owns the safe, predictable deployment of software and AI models to global edge and cloud systems. You will balance high-speed deployment velocity with rock-solid operational stability. But you won't just be deploying code; you'll be acting as an internal project manager, driving our organizational roadmap by building Agentic AI tools and automating processes to scale our delivery capabilities continuously.
A day in the life and how you’ll make an impact:
* Act as the technical gatekeeper, validating and transitioning versions through strict release gates. Enforce rigorous governance and ensure strict "Definition of Done" criteria are met.
* Lead risk-mitigated rollouts across diverse global hardware environments.
* Monitor real-time deployment performance, Quality of Service (QoS), and algorithmic accuracy, making decisive, crisis-resilient calls to proceed, pause, or rollback to prevent regressions.
* Define and execute comprehensive test plans that verify cross-team dependencies. Validate that new versions meet detection accuracy requirements without degrading infrastructure.
* Triage complex production failures. Look beyond immediate issues to identify root causes using system metrics, logs, and container states, delivering actionable evidence to R&D.
* Build and integrate Agentic AI and LLM-based tools to accelerate log analysis, risk assessment, and deployment troubleshooting.
* Architect automated workflows to eliminate manual overhead and enhance system observability with robust monitors and dashboards.
Requirements:
* 2+ years of experience in Release Engineering, DevOps, QA, or a similar operations-centric role.
* Strong systems-level troubleshooting skills with the ability to analyze data, system metrics, logs, and container states.
* Demonstrated ability to maintain decisive control and make smart risk-management decisions during live, high-stakes deployments.
* Experience enforcing data integrity and process governance using Jira or similar issue-tracking tools.
Bonus if you have:
* Experience building or integrating AI, LLMs, or Agentic workflows into operational tooling.
* Familiarity with deploying software to both cloud environments and distributed edge hardware.
* Experience with performance benchmarking (throughput, bandwidth, algorithmic accuracy).
* Prior experience acting as a project manager for internal engineering initiatives or tools.
Why UVeye:
* Pioneer Advanced Solutions: Harness cutting-edge technologies in AI, machine learning, and computer vision to revolutionize vehicle inspections.
* Drive Global Impact: Your innovations will play a crucial role in enhancing automotive safety and reliability, impacting lives and businesses on an international scale.
* Career Growth Opportunities: Participate in a journey of rapid development, surrounded by groundbreaking advancements and strategic industry partnerships.
This position is open to all candidates.
 
8649166
Location: Tel Aviv-Yafo
Job Type: Full Time
We are seeking a skilled and motivated DevOps engineer with deep familiarity in the streaming ecosystem to join our elite infrastructure team. If you're excited by the challenge of operating mission-critical systems at scale and optimizing the developer experience through automation and tooling, we'd love to hear from you.

What you will do:

Automate Deployment and Operation
Oversee deployment of Kafka and RabbitMQ clusters (including Confluent Cloud & CFK). Build automation pipelines to ensure repeatability and resiliency across environments.

Monitor and Support Production Systems
Own production stability of global Kafka clusters. Handle on-call rotations, incident management, troubleshooting, and scaling challenges.

Improve Infrastructure Observability
Build and maintain observability systems: dashboards, alerting pipelines, metrics collection (Prometheus, Grafana, etc.).

Optimize System Performance
Collaborate with peers on benchmarking and optimization initiatives. Work on tuning Kafka brokers, cluster configurations, and runtime parameters.

Provide Developer Support and Training (Infra-focused)
Help developers configure topics, quotas, and consumers appropriately. Train service owners to interpret monitoring data and avoid pitfalls.

Develop and Maintain Infrastructure
Contribute to building infrastructure tools and scripts (IaC, Helm charts, etc.) that make provisioning and managing clusters reliable and efficient.

Secure Infrastructure Access
Configure and maintain secure access patterns across streaming infrastructure, ensuring proper authentication and role-based access controls are enforced for both developers and services.
Requirements:
What we expect:

8+ years of experience in DevOps, SRE, or Infrastructure Engineering roles.

Deep hands-on Kafka experience, including deploying, maintaining, scaling, and monitoring clusters.

Experience with RabbitMQ.

Extensive experience with Docker, Kubernetes, Helm, and GitOps-style deployments.

Infrastructure as Code experience (Terraform, Pulumi, etc.).

Strong skills in scripting and automation (Python, Bash, etc.).

Familiarity with Confluent Cloud, Confluent for Kubernetes, and similar tools.

Solid understanding of authentication and authorization mechanisms in distributed systems.

Production support mindset - with proven troubleshooting and incident resolution history.

Collaboration and communication skills - especially with dev teams depending on platform support.

Experience with Istio Service Mesh (bonus).

Experience with GovCloud (bonus).


Bonus Qualities:

Mentorship and leadership experience in infrastructure or SRE teams.

Contributions to automation or monitoring open-source tooling.

Active participant in SRE or DevOps communities.

Conference speaker or internal tech trainer.

Technical writing about infrastructure automation or reliability.
This position is open to all candidates.
 
8695015
13/05/2026
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
As the world's leading vendor of Cyber Security, facing the most sophisticated threats and attacks, we've assembled a global team of the most driven, creative, and innovative people. At our company, our employees are redefining the security landscape by meeting our customers' real-time needs and providing our cutting-edge technologies and services to an ever-growing customer base.
Our company Software Technologies has been honored by Time Magazine as one of the World's Best Companies, and Gartner recently rated our company email security as a market leader for product, detection, and innovation. We've also earned a spot on the Forbes list of the World's Best Places to Work for five consecutive years (2020-2024) and have been recognized as one of the World's Top Female-Friendly Companies. If you're passionate about making the world a safer place and want to be part of an award-winning company culture, we invite you to join us.
Our company Harmony Email Security and Collaboration (previously Avanan) is a unique email solution that fully secures cloud email and cloud platforms using AI.
We are seeking a promising and talented DevOps Cloud Engineer to join our DevOps group. If you thrive in a fast-paced, dynamic environment, can handle multiple requests simultaneously, and enjoy working independently as part of a cutting-edge DevOps team, this is your opportunity to help make the world a safer place!
Key Responsibilities
Act as a DevOps Engineer within a highly skilled team, responsible for large-scale operations from development to production
Design, develop, and maintain Avanan's CI/CD solutions, including operating systems, containers, cloud orchestration, and full end-to-end automation
Implement tools and procedures for monitoring, deployment, and alerting across our SaaS multi-tenant product family
Participate in the large-scale migration of a highly complex system into a secured, regulation-compliant environment
Continuously improve our cloud infrastructure to ensure fault tolerance, scalability, and security
Plan capacity, stabilize, and enhance the performance of application infrastructure with cost efficiency and scaling in mind
Design and shape our monitoring and logging solutions
Execute all tasks with top-notch cloud infrastructure security as a guiding principle.
Requirements:
Hands-on mindset - we all write code daily!
3+ years of relevant DevOps experience building CI/CD pipelines for both development and production - must
2+ years of AWS Cloud experience working with high-traffic systems and multiple services - must
Strong scripting skills, with fluency in Python - must
Experience with AI SRE agents
Experience with containers and orchestration tools (Docker, Kubernetes, or ECS) - must.
Experience with CI integration tools such as Jenkins
Familiarity with AWS CloudFormation - an advantage
Exposure to a wide range of open-source technologies (Redis, Nagios, Grafana, Prometheus, etc.)
Knowledge of best practices in security, performance, and monitoring.
Proven ability to research, evaluate, and implement new technologies, including running proof of concepts and cost analysis.
This position is open to all candidates.
 
8650178
31/05/2026
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
As a DevOps Engineer, you will be responsible for the reliability, scalability, and efficiency of our SaaS products. Your success will be measured by your ability to achieve the following:

First 3 Months: Master our GitOps-based deployment pipelines. You will be expected to independently manage and troubleshoot deployments using ArgoCD and Kargo, and contribute to the team's on-call rotation.

First 6 Months: Enhance our CI/CD processes and workflow efficiency. You will lead the project to reduce average build and deployment times by 20% by optimizing GitHub Actions, Helm charts, and introducing initial AI-assisted automation.

First 12 Months: Improve system scalability and reliability. You will design and implement infrastructure enhancements using Terraform to support a 25% increase in customer workload while maintaining a 99.9% uptime.

Core Responsibilities

Deployment Pipeline Management: Build and maintain our GitOps-based deployment pipelines to ensure a 99% success rate for all deployments and reduce manual intervention by 30% within the first year.

Infrastructure Management: Manage and scale our Kubernetes infrastructure on GCP, with a goal of optimizing resource utilization to achieve a 15% cost reduction in our GCP spending over the next 18 months.

Automation and CI/CD: Enhance and maintain our GitHub Actions CI/CD pipelines to decrease the lead time for changes to production by 25% within the first year.

AI-Assisted Workflow Integration: Integrate AI-assisted tooling into day-to-day DevOps and engineering workflows to improve productivity, scalability, and operational efficiency. You will leverage AI tools to generate initial configuration drafts, validate infrastructure code, and utilize AI-driven automation to reduce repetitive manual tasks by 20% within the first 6 months, accelerating engineering execution while maintaining high-quality standards.

System Reliability: Proactively improve system reliability and availability, with the objective of reducing the number of critical production incidents by 50% through improved monitoring, logging, and alerting within 12 months.
Requirements:
What We're Looking For

3+ years in DevOps/SRE: You have proven experience in a high-growth SaaS environment and can hit the ground running to help us scale our platform.

Google Cloud Platform (GCP): You possess a deep understanding of GCP services, particularly GKE, which is essential as our entire infrastructure is on GCP.

ArgoCD and Kargo: You have hands-on experience with GitOps and progressive delivery, which is key to our goal of achieving faster, more reliable deployments.

Kubernetes and Helm: You bring strong experience in managing and deploying applications on Kubernetes, as you will be responsible for the container orchestration of our microservices.

Terraform: You have expertise in infrastructure as code, which will be crucial for our project to scale our infrastructure and reduce costs.

Forward-Thinking Automation: You have a strong interest in or experience with leveraging emerging technologies, including AI tools, to modernize workflows, validate code, and eliminate repetitive manual tasks.
This position is open to all candidates.
 
8672407
5 days ago
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for a passionate Senior DevOps Engineer to join our DevOps Core Team. In this role, you will be responsible for the design, implementation, and maintenance of cloud-native infrastructure on AWS and Kubernetes. You will work closely with development, operations, and quality assurance teams to streamline processes, own our Infrastructure as Code practices, and help evolve our platform reliability at scale.

How Will You Make an Impact?

Design, implement, and maintain Kubernetes clusters in production environments, ensuring high availability and scalability.

Build and manage Infrastructure as Code using CloudFormation and Crossplane as our primary IaC tools.

Own and operate cloud infrastructure primarily on AWS, with working knowledge of GCP environments.

Identify and implement process improvements to increase the efficiency and reliability of the DevOps Core team.

Provide technical leadership and mentoring to team members, fostering a culture of engineering excellence.

Work closely with engineering teams to define infrastructure needs and provide DevOps support and guidance.

Research, evaluate, and integrate new technologies into our stack.

Manage, monitor, scale, and troubleshoot a distributed, highly available, customer-facing software platform.

Create and maintain technical documentation for infrastructure, processes, and runbooks.
Requirements:
Strong, hands-on Kubernetes experience: 5+ years running and operating clusters in production at scale is a must.

Deep expertise with Infrastructure as Code - primary experience with AWS CloudFormation and Crossplane.

Comprehensive knowledge of AWS cloud services (compute, networking, storage, IAM, observability) - with 5+ years of proven, hands-on AWS experience.

Working knowledge of GCP - ability to operate, troubleshoot, and deploy in GCP environments.

Hands-on experience with ArgoCD and GitOps workflows - managing application delivery through Git as the source of truth.

Experience with CI/CD pipelines and automation tooling (Jenkins, CircleCI, or similar).

4+ years of scripting or coding experience (Python, Bash, or GoLang) for automation and tooling.

Advanced knowledge of Linux OS and networking fundamentals.
This position is open to all candidates.
 
8695424