DevOps Engineer

8670484

Similar jobs that may interest you

Placement / staffing agency

20/05/2026

AI Infrastructure & Reliability Engineer

Confidential company

Location: Tel Aviv-Yafo

Job Type: Full Time and Hybrid work

Required: AI Infrastructure & Reliability Engineer
What this role is really about
You'll join a 3-person platform team within our Business Technology group, owning the internal infrastructure that our AI platform and its users depend on. This isn't a product engineering role, and it isn't ticket work or babysitting pipelines someone else built. You're building and operating the internal foundation that the company runs on. The work covers the full stack of platform engineering: core cloud infrastructure (AWS, Kubernetes, IaC), CI/CD pipelines, AI-driven infrastructure components, and the SRE and observability practice that keeps it all honest (metrics, alerting, incident response, and reliability standards). As our AI capabilities grow, so does the complexity underneath them, and staying ahead of that is central to the role. If you treat infrastructure as a product (reusable, automated, observable, and built to last), this is your kind of role.
Job responsibilities
DevOps & AI-Driven Infrastructure - own CI/CD, deployment processes, and release reliability. Build and operate cloud infrastructure that is automated, intelligent, and continuously self-improving, not just managed.
Design and build our Terraform repository and IaC pipeline from scratch, with AI-assisted generation, drift detection, and policy enforcement built in.
Build AI-driven GitHub Actions pipelines: automated code review, risk assessment, and intelligent deployment decisions.
Manage Kubernetes workloads across AWS accounts: zero downtime, fully automated, nothing left behind.
Embed AI into the operational layer: proactive drift detection, automated remediation, and intelligent scaling toward a self-healing runtime.
Reliability & SRE - improve uptime, resilience, and incident response.
Define and enforce SLOs/SLIs, error budgets, and on-call practices.
Lead incident response, postmortems, and systemic reliability improvements.
Own AI-specific reliability: model latency SLOs, token quota monitoring, rate limit handling, fallback and retry strategies, and cost-per-request alerting.
Observability & Telemetry - increase visibility, reduce noise, improve troubleshooting.
Establish and continuously evolve the observability stack: metrics, logs, distributed tracing, and alerting tuned for both application and AI workloads.
AI / LLM Operations - bring AI systems to production and operate them at scale, with a focus on reliability, performance, and trust.
Own the AI infrastructure layer: rate limits, quota management, latency SLOs, and fallback strategies (retries, circuit breakers).
Operate LLM APIs in production with resilience and cost attribution per team/model.

Requirements:
2-4 years of hands-on DevOps, SRE, or infrastructure engineering experience in production SaaS environments.
Strong AWS experience: multi-account architecture, cross-account IAM, serverless and event-driven services (Lambda, SQS, SNS, EventBridge), and EKS cluster management.
Proven Kubernetes experience in production, including cross-account migrations and stateful workload management.
Proficiency with Terraform - repository structure design, module architecture, and CI/CD pipeline implementation.
Hands-on experience building and maintaining GitHub Actions pipelines for end-to-end CI/CD workflows.
Working Python proficiency for scripting, internal tooling, and workflow automation.
Practical experience implementing observability stacks from scratch: metrics, logging, distributed tracing, and alerting.
Experience owning reliability practices: SLOs, incident response, and postmortem culture.
Nice to have
Hands-on experience operating LLM APIs in production: rate-limit and quota management, cost attribution per team/model, latency monitoring, and resilience patterns (retries, fallbacks, circuit breakers).
FinOps experience across cloud, AI, and observability spend.
Experience introducing self-healing or auto-remediation patterns in production.

This position is open to all candidates.

8659781

1 hour ago

Sr. DevOps Engineer - Cloud, XSPM (Hybrid, ISR)

Confidential company

Location: Tel Aviv-Yafo

Job Type: Full Time

As a Senior DevOps Engineer within the XSPM team, you will be a critical, go-to technical expert responsible for the health, performance, and evolution of our database and infrastructure systems. When production databases degrade or behave unexpectedly, you are the person who dives deep, investigating root causes hands-on, understanding the underlying mechanics of the problem, and designing lasting solutions. Your mastery of database systems makes you the authority the team relies on to diagnose complex performance issues, architect better data solutions, and ensure our infrastructure scales with confidence.

Beyond databases, you will drive our DevOps practices end-to-end - CI/CD pipelines, infrastructure automation, and operational reliability across the XSPM platform. This is a high-impact, highly visible role at the intersection of database engineering and DevOps, where your expertise directly shapes how the team delivers and operates at scale.

We're a highly collaborative, friendly, inclusive and diverse group that prizes collaboration over competition. We provide opportunities to learn new skills, mentor fellow engineers, and contribute to the direction of both the team and the products for which we're responsible. We work in a distributed, high-trust environment where you manage your own time and have the flexibility to balance your work and personal life.

What You Will Do:

Serve as the team's database expert, the first person to investigate, diagnose, and resolve complex performance problems across our production database systems (MongoDB, OpenSearch, PostgreSQL, Cassandra).

Perform deep-dive root cause analysis on database performance issues, understanding query execution internals, resource consumption patterns, cluster behavior, and system-level interactions to identify the real source of problems, not just symptoms.

Design and propose better database architectures and solutions, recommending when to re-architect data models, migrate workloads, introduce new technologies, or redesign how services interact with their data layer.

Make every effort within the team to ensure the data architecture is well designed.

Own capacity planning, scaling strategies, and high-availability designs for database clusters, ensuring systems are built to handle the team's growth trajectory.

Act as the bridge between development and infrastructure, advising engineers on how their application patterns impact database performance and guiding them toward sustainable solutions.

Build and maintain CI/CD pipelines, infrastructure-as-code (Terraform, Helm, Kubernetes manifests), and automated deployment workflows for the XSPM team's services.

Design and manage observability stacks, dashboards, alerting rules, and SLOs, to maintain best-in-class availability for critical data pipelines and services.

Drive infrastructure automation to reduce operational toil, including automated scaling, self-healing systems, and configuration management.

Participate in on-call rotations, incident response, and post-incident reviews, driving root-cause analysis and long-term reliability improvements.

Evaluate and adopt new database technologies and infrastructure tooling that align with the team's evolving data architecture needs.

Requirements:
7+ years of experience in DevOps, SRE, DBA, or infrastructure engineering, with significant hands-on responsibility for production database systems at scale.

Expert-level knowledge of a common database such as MongoDB, OpenSearch, or PostgreSQL, including a deep understanding of its internals, performance characteristics, replication, and sharding, and the ability to diagnose and solve complex issues from first principles.

This position is open to all candidates.

8675475

Placement / staffing agency

7 days

DevOps Architect

Confidential company

Location: Tel Aviv-Yafo

Job Type: Full Time

We're looking for a DevOps Architect to help shape the infrastructure strategy behind our Revenue AI platform. This role sits at the center of our engineering ecosystem, driving architectural direction, improving operational excellence, and enabling teams to scale with confidence. You'll work across engineering groups to identify systemic gaps, define scalable standards, and accelerate execution without becoming a delivery bottleneck.
You'll Own:
Infrastructure Strategy & Standards: Define and evolve our cloud and infrastructure architecture across Kubernetes, networking, observability, security, and data platforms. Establish clear standards and scalable best practices that enable teams to move faster with consistency and reliability.
Technical Debt & System Health Visibility: Continuously identify, prioritize, and drive resolution of cross-team technical debt, architectural gaps, and operational inefficiencies. Create organizational visibility around the most critical infrastructure challenges and opportunities.
Cross-Org Technical Leadership: Partner closely with engineering leaders and teams to influence architectural decisions, challenge assumptions, and ensure solutions are scalable, maintainable, and secure. Lead through expertise and influence, not direct ownership.
Developer Enablement & Engineering Velocity: Provide frameworks, tooling direction, and lightweight prototypes or POCs that empower teams to execute independently with higher quality and efficiency.
Critical Infrastructure Initiatives: Drive major cross-functional initiatives around reliability, scalability, security, observability, and cost optimization from identification through execution and measurable impact.
You'll Solve:
Scaling Complexity: How do we maintain simplicity, reliability, and operational clarity while supporting rapid growth and increasingly complex distributed systems?
Cross-Team Alignment: How do we create architectural consistency across independent engineering groups without slowing down innovation and execution?
Operational Excellence at Scale: How do we proactively surface and resolve systemic weaknesses before they become production issues?
Balancing Speed & Sustainability: How do we enable fast delivery today while protecting the long-term health and scalability of the platform?
AI Infrastructure Evolution: How do we build infrastructure that supports modern AI/ML workloads, GPUs, large-scale data pipelines, and future platform requirements?
You'll Impact:
Platform Reliability & Scalability: Your work will directly improve the resilience, scalability, and operational maturity of our infrastructure platform.
Engineering Efficiency: By creating better standards, tooling, and architectural guidance, you'll act as a force multiplier for engineering teams across the company.
Long-Term System Health: You'll help reduce operational friction, minimize technical debt, and ensure our infrastructure can support long-term business growth.
Execution Quality Across Teams: Your influence will elevate engineering quality, decision-making, and operational discipline throughout the organization.

Requirements:
A Deep Technical Expert: Someone with 8+ years of hands-on experience with AWS and cloud-native infrastructure at scale, including strong Kubernetes expertise and distributed systems knowledge.
An Infrastructure Architect: Someone with deep experience in Infrastructure as Code and GitOps methodologies using tools like Terraform, Crossplane, or Pulumi.
A Pragmatic Builder: A strong engineer with programming experience in Python or Go who can build tools, prototypes, and automation when needed.
A Systems Thinker: Someone who can identify patterns, uncover systemic issues, and drive improvements across complex technical environments.
An Influential Technical Leader: Someone with proven experience leading cross-team initiatives and driving alignment without direct authority.

This position is open to all candidates.

8665155

04/05/2026

DevOps Engineer

Confidential company

Location: Tel Aviv-Yafo

Job Type: Full Time

We are seeking a skilled and motivated DevOps Infrastructure Engineer to join our DevOps Infra team. Our team is responsible for managing and evolving the cloud-native infrastructure that powers our microservices architecture. Core responsibilities span our EKS-based Kubernetes platform, ArgoCD-driven GitOps pipelines, infrastructure observability, Helm-based deployments, and mission-critical web services running on AWS.
We are looking for a DevOps engineer who can hit the ground running, take ownership of critical infrastructure components, and contribute meaningfully from day one. The ideal candidate brings deep Kubernetes expertise, strong hands-on experience with observability tooling, and the maturity to work independently.
In this role, you will be responsible for:
Managing and evolving our EKS-based Kubernetes platform and Helm-based deployment pipelines
Owning and maintaining GitOps workflows using ArgoCD, including troubleshooting sync and rollout issues
Designing, building, and maintaining observability solutions using Prometheus, VictoriaMetrics, and Grafana
Writing and maintaining infrastructure as code using Terraform, including modules, remote state, and CI/CD automation
Taking full ownership of AWS infrastructure components - including networking, compute, IAM, and storage - ensuring reliability, security, and operational excellence across environments
Collaborating with developers and SREs to support reliable, scalable, and secure AWS infrastructure

Requirements:
1-3 years of hands-on experience in DevOps or infrastructure engineering roles.
Deep expertise in Kubernetes and Helm, including production-grade deployments and live incident troubleshooting.
Strong proficiency in Terraform or equivalent IaC tooling
Solid working knowledge of AWS core services (EC2, IAM, S3, VPC, CloudWatch, EKS).
Practical experience with Prometheus, VictoriaMetrics, Grafana, and alerting stack design.
Proven ability to work independently, take ownership end-to-end, and communicate effectively across engineering teams.
Agentic DevOps experience, including work with common AI assistant tools, MCPs, and agents.
Advantages:
Experience with cloud cost optimization strategies and tooling.
Background in cloud-native security practices (RBAC, policy enforcement, SSL, mTLS, etc.).
Prior involvement in designing or operating high-availability, fault-tolerant systems.
Experience with nginx and IIS web servers.

This position is open to all candidates.

8636122

13/05/2026

DevOps Engineer 25651

Confidential company

Location: Tel Aviv-Yafo

Job Type: Full Time

We're looking for a Senior DevOps Engineer who owns infrastructure end-to-end, ships with confidence, and raises the reliability bar without being asked. You will work closely with backend, full-stack, and security teams to build a highly available, reliable, and secure production environment. If you get energized by building systems that scale, pipelines that teams love, and platforms that never sleep - this role is for you.
Key Responsibilities
Own and evolve our cloud infrastructure across multi-region production environments, end-to-end.
Lead our GitOps deployment model - designing and maintaining declarative, automated deployment workflows with zero manual gates.
Build, maintain, and optimize CI/CD pipelines with a strong focus on developer experience, reliability, and speed.
Drive DevSecOps "shift-left" culture: integrate security scanning, SBOM generation, and supply chain hardening directly into every pipeline.
Develop automation frameworks for provisioning, scaling, observability, and incident response - increasingly leveraging AI-assisted tooling to reduce toil.
Operate and improve our observability platform: metrics, logs, alerting, dashboards, SLOs/SLIs, and on-call tooling.
Champion zero-trust secrets management and credential-less authentication patterns across the stack.
Partner with architects and engineering leadership on cloud cost optimization, availability, and performance.
Build internal tooling and automation that multiplies engineering velocity across the organization.

Requirements:
5+ years of hands-on DevOps experience in a SaaS product environment - Must.
Deep, hands-on AWS expertise; multi-cloud experience is a strong plus - Must.
Strong understanding of containers and orchestration - Docker, Kubernetes, including workloads, networking, service mesh (Istio), Helm/Kustomize, and autoscaling (KEDA, HPA, VPA).
Strong experience with:
Infrastructure-as-Code - Terraform, Crossplane, and/or cloud-native declarative tooling.
GitOps principles and tooling (ArgoCD or equivalent).
CI/CD platforms - building reusable, scalable, security-hardened pipeline templates (GitHub Actions or equivalent).
Secrets management - dynamic injection, IRSA/Workload Identity, zero long-lived credentials.
Experience embedding security into CI/CD: vulnerability scanning, SBOM generation, and supply chain security (Trivy, Grype, Syft, JFrog Xray).
Solid observability fluency - OpenTelemetry, Prometheus, Grafana, Datadog, ELK/OpenSearch, distributed tracing.
Exposure to AI/ML workloads or LLMOps infrastructure is a meaningful advantage - not required, but will set you apart.
FinOps mindset - you think about cloud spend as a product metric, not just a finance problem.
A clear communicator who can align engineers, security teams, and leadership around infrastructure decisions.
A builder and owner - you see the system, spot the gaps, and raise the bar without being asked.

This position is open to all candidates.

8650200

24/05/2026

Site Reliability Engineer

Confidential company

Location: Tel Aviv-Yafo

Job Type: Full Time

We are looking for an experienced SRE to drive the reliability, observability, and automation practices across our private cloud infrastructure and operations. In this role, you will be part of a team of site reliability engineers, own the engineering roadmap for monitoring and automation, and act as a key liaison between development, operations, and platform teams. You bring 4+ years of hands-on people management experience and a deep technical background in SRE or DevOps disciplines.

What will you do?

Automation & Infrastructure
- Design, develop, and maintain automation tools to support infrastructure and operations teams at scale.
- Manage pipelines and infrastructure workflows using Jenkins, Ansible, Python, and Bash.
- Drive the adoption of infrastructure-as-code practices across the organization.
- Collaborate with system engineers to improve scalability, performance, and fault tolerance of critical systems.

Monitoring & Observability
- Build and extend monitoring and alerting systems using Grafana, the ELK (Elastic) stack, Zabbix, and custom scripts.
- Implement and enforce observability best practices to ensure full visibility into systems, applications, and infrastructure.
- Define and track SLIs, SLOs, and error budgets across key services.
- Partner with development teams to embed observability earlier in the software development lifecycle.

Database & Platform Support
- Support monitoring and infrastructure integration for databases including MongoDB and PostgreSQL.
- Maintain documentation and champion knowledge sharing around automation, monitoring, and reliability practices.

Requirements:
What you need:

4+ years of overall experience in SRE, DevOps, or infrastructure automation roles.

Strong scripting skills in Python and Bash; comfortable building and maintaining production-grade automation.

Hands-on experience with infrastructure automation tools, particularly Ansible.

Solid experience with monitoring and observability platforms - ELK stack, Grafana, and Zabbix.

Good understanding of CI/CD pipelines and related tooling, including Jenkins.

Familiarity with managing and monitoring MongoDB and PostgreSQL in a production environment.

Comfortable working in Linux-based environments.

Excellent problem-solving skills and strong written and verbal communication.

Ability to support the following:
Experience with cloud providers - AWS, GCP, or Azure.
Exposure to containerization technologies such as Docker and Kubernetes.
Familiarity with infrastructure provisioning using Terraform.
Experience introducing SRE practices (SLOs, error budgets, chaos engineering) at an organizational level.
Exposure to and experience with migrating/building AI tools to improve processes.

This position is open to all candidates.

8662378

24/05/2026

Site Reliability Team Leader

Confidential company

Location: Tel Aviv-Yafo

Job Type: Full Time

We are looking for an experienced SRE Team Lead to drive the reliability, observability, and automation practices across our private cloud infrastructure and operations. In this role, you will lead a team of site reliability engineers, own the engineering roadmap for monitoring and automation, and act as a key liaison between development, operations, and platform teams. You bring at least 3-4 years of hands-on people management experience and a deep technical background in SRE or DevOps disciplines.

What will you do?

Leadership & Team Management

Lead, mentor, and grow a team of SREs, providing technical direction, career development guidance, and day-to-day management.

Own the team roadmap for reliability, observability, and automation initiatives - prioritizing work, removing blockers, and driving delivery.

Conduct regular 1:1s, performance reviews, and hiring processes to build and sustain a high-performing team.

Foster a culture of operational excellence, blameless post-mortems, and continuous improvement.

Act as an escalation point for complex incidents and reliability issues, leading post-incident reviews and ensuring follow-through on action items.

Automation & Infrastructure

Design, develop, and maintain automation tools to support infrastructure and operations teams at scale.

Manage pipelines and infrastructure workflows using Jenkins, Ansible, Python, and Bash.

Drive the adoption of infrastructure-as-code practices across the organization.

Collaborate with system engineers to improve scalability, performance, and fault tolerance of critical systems.

Monitoring & Observability

Build and extend monitoring and alerting systems using Grafana, the ELK (Elastic) stack, Zabbix, and custom scripts.

Implement and enforce observability best practices to ensure full visibility into systems, applications, and infrastructure.

Define and track SLIs, SLOs, and error budgets across key services.

Partner with development teams to embed observability earlier in the software development lifecycle.

Database & Platform Support

Support monitoring and infrastructure integration for databases including MongoDB and PostgreSQL.

Maintain documentation and champion knowledge sharing around automation, monitoring, and reliability practices.

Requirements:
What you need:

Experience & Leadership

3-4+ years of experience in a people management or team lead capacity within SRE, DevOps, or infrastructure engineering.

5-8+ years of overall experience in SRE, DevOps, or infrastructure automation roles.

Proven track record of building, coaching, and retaining high-performing engineering teams.

Experience owning an engineering roadmap and driving cross-functional reliability initiatives.

Technical Skills

Strong scripting skills in Python and Bash; comfortable building and maintaining production-grade automation.

Hands-on experience with infrastructure automation tools, particularly Ansible.

Solid experience with monitoring and observability platforms - ELK stack, Grafana, and Zabbix.

Good understanding of CI/CD pipelines and related tooling, including Jenkins.

Familiarity with managing and monitoring MongoDB and PostgreSQL in a production environment.

Comfortable working in Linux-based environments.

Excellent problem-solving skills and strong written and verbal communication.

Ability to support the following:

Experience with cloud providers - AWS, GCP, or Azure.

Exposure to containerization technologies such as Docker and Kubernetes.

Familiarity with infrastructure provisioning using Terraform.

Experience introducing SRE practices (SLOs, error budgets, chaos engineering) at an organizational level.

Exposure to and experience with migrating/building AI tools to improve processes.

This position is open to all candidates.

8662300

10/05/2026

Machine learning operations engineer

Confidential company

Location: Tel Aviv-Yafo

Job Type: Full Time and Hybrid work

We are always looking for exceptional talent to join us on the journey!

Your Mission

As an MLOps Engineer at Nuvei, your mission is to design, build, and operate the platforms that power our machine learning and generative AI products, spanning real-time use cases such as large-scale fraud scoring and support for MCP and agentic workflows. You'll create reliable CI/CD for models and agents, robust data/feature pipelines, secure model serving, and comprehensive observability. You will also support our agentic AI ecosystem and Model Context Protocol (MCP) services so that models can safely use tools, data, and actions across .
You will partner closely with Data Scientists, Data/Platform Engineers, Product, and SRE to ensure every model, from classic ML to LLM/RAG agents, moves from prototype to production with strong reliability, governance, cost efficiency, and measurable business impact.
Responsibilities:
Operate & Develop ML/LLM platforms on Kubernetes + cloud (Azure; AWS/GCP ok) with Docker, Terraform, and other relevant tools
Manage object storage, GPUs, and autoscaling for training & low-latency model serving
Manage cloud environment, networking, service mesh, secrets, and policies to meet PCI-DSS and data-residency requirements
Build end-to-end CI/CD for models/agents/MCP tooling (versioning, tests, approvals)
Deliver real-time fraud/risk scoring & agent signals under strict latency SLOs.
Maintain MCP servers/clients: tool/resource definitions, versioning, quotas, isolation, access controls
Integrate agents with microservices, event streams, and rule engines; provide SLAs, tracing, and on-call runbooks
Measure operational metrics of ML/LLM (latency, throughput, cost, tokens, tool success, safety events)
Enforce governance: RBAC/ABAC, row-level security, encryption, PII/secrets management, audit trails.
Partner with DS on packaging (wheels/conda/containers), feature contracts, and reproducible experiments.
Lead incident response and post-mortems.
Drive FinOps: right-sizing, GPU utilization, batching/caching, budget alerts.

Requirements:
4+ years in DevOps/MLOps/Platform roles building and operating production ML systems (batch and real-time)
Strong hands-on experience with Kubernetes, Docker, Terraform/IaC, and CI/CD
Practical experience with Spark/Databricks and scalable data processing
Proficiency in Python & Bash
Ability to operate DS code and optimize runtime performance.
Experience with model registries (MLflow or similar), experiment tracking, and artifact management.
Production model serving using FastAPI/Ray Serve/Triton/TorchServe, including autoscaling and rollout strategies
Monitoring and tracing with Prometheus/Grafana/OpenTelemetry; alerting tied to SLOs/SLAs
Solid understanding of PCI-DSS/GDPR considerations for data and ML systems
Experience with the Azure cloud environment is a big plus
Operating LLM/agent workloads in production (prompt/config versioning, tool execution reliability, fallback/retry policies)
Building/maintaining RAG stacks (indexing pipelines, vector DBs, retrieval evaluation, hybrid search)
Implementing guardrails (policy checks, content filters, allow/deny lists) and human-in-the-loop workflows
Experience with feature stores - Qwak Feature Store, Feast
A/B testing for models and agents, offline/online evaluation frameworks
Payments/fraud/risk domain experience; integrating ML outputs with rule engines and operational systems - Advantage
Familiarity with Databricks Unity Catalog, dbt, or similar tooling

This position is open to all candidates.

8644480

20/05/2026

DevOps Team Lead

Confidential company

Location: Tel Aviv-Yafo

Job Type: Full Time

We are looking for a seasoned, execution-driven DevOps Team Lead. In this role you will drive the evolution of our infrastructure and cloud operations, empowering our entire R&D organization. You will fully lead the DevOps aspects of the company, including strategic infrastructure management within a multi-cloud environment, and you will leverage cutting-edge AI technologies to accelerate important DevOps flows. As a technical lighthouse, you will set the standard for engineering excellence, cloud-native architecture, and operational scale. You will work in a global environment with highly skilled engineering teams, reporting to the VP R&D.

What You'll Do

Build and lead a strong DevOps team fostering a culture of ownership, collaboration, and technical excellence.
Take end-to-end leadership and project management of large-scale cloud infrastructures, driving technical decisions, execution, and outcomes.
Continuously modernize DevOps practices (CI/CD, GitOps, monitoring, etc.) to drive efficiency, agility, and performance across the organization.
Use AI-powered tools and agentic workflows to improve R&D efficiency.
Design and maintain cloud environments using Infrastructure as Code.
Our stack: mainly GCP and AWS, Kubernetes, Terraform, Terragrunt, Node.js, Golang, and more.

Requirements:
Proven track record of managing DevOps team(s) in hyper-growth tech companies.
Hands-on expertise in AWS or GCP production environments at scale.
Hands-on expertise in Kubernetes and in managing large-scale Prod and Dev environments.
Experience with FinOps and cloud cost optimization.
Experience in backend microservices / API development.
A proactive, can-do attitude and a drive to build something new.

This position is open to all candidates.

8659707

13/05/2026

DevOps Engineer Infinity Next Platform 25457

Confidential company

Location: Tel Aviv-Yafo

Job Type: Full Time

Are you ready to kickstart your DevOps career and be part of the infrastructure powering the future of cybersecurity? Join Infinity Next, our company's next-generation, cloud-native platform delivering a suite of cutting-edge products such as WAF, SD-WAN, and more.
We are looking for a DevOps Engineer to join our dynamic and growing team. This is an opportunity to work alongside top engineers, gain hands-on experience in production environments, and help scale secure, high-performance services used by customers around the globe. You will be part of a fast-paced, startup-like environment, with the backing and stability of a global cybersecurity leader.
Key Responsibilities
Support the deployment and maintenance of scalable, multi-tenant environments across cloud platforms
Assist in automating infrastructure using Infrastructure-as-Code tools and CI/CD pipelines
Monitor and improve the reliability, performance, and security of platform services
Collaborate with development, product, and operations teams to deliver new features and improvements
Troubleshoot infrastructure and application issues in development and production environments
Implement custom user interfaces using the latest programming techniques and technologies
Design, develop, and maintain DevOps-related microservices that support platform automation and reliability
Design and integrate agentic AI capabilities into DevOps workflows to automate decision-making, incident response, and platform operations

Requirements:
Bachelor's degree in Computer Science or a related technical field
At least 3 years of experience as a DevOps Engineer
Strong interest in cloud technologies, DevOps methodologies, and automation
Knowledge of containerization and container orchestration technologies, such as Amazon EKS
Experience in the design, operation, and troubleshooting of Kubernetes core components and API extensions for cloud-native, distributed systems
Familiarity with Linux, basic networking, containers, and scripting
Understanding of CI/CD, cloud infrastructure, and monitoring concepts
Experience building maintainable and testable codebases, including API design and unit testing techniques
Hands-on experience applying GitOps principles to manage Kubernetes infrastructure and application deployments
Nice to Have
Exposure to Kubernetes in cloud environments such as Amazon EKS
Familiarity with Ingress Controllers, Kubernetes Gateway API, CloudFront, and Global Accelerator
Experience designing and developing Kubernetes operators and controllers
Experience or coursework with tools such as Terraform, Pulumi, Crossplane, and Helm
Hands-on experience with observability technologies such as Prometheus, Grafana, OpenTelemetry, and centralized logging systems

This position is open to all candidates.

8650194
