
Posted 1 hour ago
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for a Senior DevOps Engineer.
What You'll Do:
Design, scale, and operate high-throughput, low-latency infrastructure supporting RTB bidders and ML prediction services at extreme scale (2 million+ QPS).
Own production GCP environments end-to-end, including deployment, monitoring, 99.999% uptime, incident response, and post-mortems.
Build and maintain infrastructure for massive data ingestion (up to 1TB per hour), continuous ML training pipelines, and real-time prediction systems.
Develop and evolve CI/CD pipelines, DevOps automation, and infrastructure-as-code practices.
Work closely with engineering and data science teams, taking full ownership from design through production and ongoing operations.
Continuously evaluate and introduce improvements to our infrastructure stack, tooling, and operational practices.
Our core stack includes GCP, Kubernetes, Prometheus, Grafana, Python, Go, BigQuery, Redis, Prefect, Aerospike and MySQL. We are looking for someone experienced, independent, and opinionated about production systems, but still curious and eager to improve how things are built and operated.
Requirements:
At least 4 years of experience in a DevOps, Infrastructure, or MLOps role, preferably in a startup or high-scale environment.
Strong understanding of systems, infrastructure, and how modern distributed applications are built and scaled.
Experience with cloud infrastructure, preferably GCP.
Familiarity with Kubernetes-based production systems and observability tools.
Experience working with Redis and relational or NoSQL databases or data warehouses.
Ability to think in terms of system architecture and long-term scalability, not just short-term fixes.
A strong sense of ownership, urgency, and responsibility for production systems.
Advantages:
Proven experience running ML systems in production.
Experience with high-throughput data pipelines and real-time systems.
Background in infrastructure supporting data science or ML teams.
This position is open to all candidates.
 
Job ID: 8692945
Similar jobs that may interest you
01/06/2026
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for an outstanding Senior DevOps Engineer to join our revolutionary, large-scale mobile content discovery platform used by millions of users worldwide. In this role, you won't just keep the lights on: you will take a leading role in shaping our data infrastructure, setting architectural standards for AI/ML workloads, and bridging the gap between DevOps and Data Engineering.
Why you'll love this team: We move fast, use cutting-edge technologies, and value absolute technical excellence over rigid bureaucracy. If you are passionate about solving complex, high-traffic infrastructure puzzles and want to see your work directly impact millions of daily users, this is the sandbox you've been looking for.
What you'll be doing
Design Data-Native Cloud Solutions: Design and implement scalable data and AI/ML infrastructure across multiple environments using Kubernetes, orchestration platforms, and IaC to power our AI, ML, and analytics ecosystem
Accelerate Data/ML Engineer Experience: Spearhead improvements to data pipeline deployment, monitoring tools, and self-service capabilities that empower data teams to deliver insights faster with higher reliability
Engineer Robust Data/ML Platforms: Build and optimize infrastructure that supports diverse data workloads from real-time streaming to batch processing, ensuring performance and cost-effectiveness for critical analytics systems
Drive DevOps Excellence: Collaborate with engineering leaders across backend and ML teams, champion modern infrastructure practices, and mentor team members to elevate how we build, deploy, and operate data systems at scale
Collaborate on high-level technical designs with ML and Backend engineers to build resilient systems.
Requirements:
5+ years of hands-on DevOps experience building, shipping, and operating production systems
Infrastructure as Code: design and implement infrastructure automation using tools such as Terraform, Pulumi, or CloudFormation (modular code, reusable patterns, pipeline integration)
Cloud platforms: deep experience with AWS, GCP, or Azure (core services, networking, IAM)
Kubernetes: strong end-to-end understanding of Kubernetes as a system (routing/networking, scaling, security, observability, upgrades), with proven experience integrating data-centric components (e.g., Kafka, RDS, BigQuery, Aerospike).
GitOps & CI/CD: practical experience implementing pipelines and advanced delivery using tools such as Argo CD / Argo Rollouts, GitHub Actions, or similar

Observability: metrics, logs, and traces; actionable alerting and SLOs using tools such as Prometheus, Grafana, ELK/EFK, OpenTelemetry, or similar
Scalability & Performance: Proven experience managing production environments characterized by high traffic volumes and large amounts of data, with a focus on maintaining system reliability and cost-efficiency at scale.
You might also have
Coding proficiency in at least one language (e.g., Python or TypeScript); able to build production-grade automation and tools.
Data Pipeline Orchestration: Demonstrated success building and optimizing data pipeline deployment using modern tools (Airflow, Temporal, Kubernetes operators) and implementing GitOps practices for data workloads
Data Engineer Experience Focus: Track record of creating and improving self-service platforms, deployment tools, and monitoring solutions that measurably enhance data engineering team productivity
Data Infrastructure Deep Knowledge: Extensive experience designing infrastructure for data-intensive workloads including streaming platforms (Kafka, Kinesis), data processing frameworks (Spark, Flink), storage solutions, and comprehensive observability systems.
This position is open to all candidates.
 
Job ID: 8675462
Location: Tel Aviv-Yafo
Job Type: Full Time and Hybrid work
We are seeking a skilled Site Reliability Engineer (SRE) to join our team and help build, maintain, and improve the reliability, scalability, and performance of our systems. As an SRE, you will be responsible for owning and evolving our observability tooling, using real-time insights to make data-driven decisions about system behavior and performance at runtime, and implementing automation to enhance our infrastructure. This role involves collaborating across teams to ensure a robust and efficient technology stack supporting mission-critical systems.

You will:
Proactively enhance system reliability, scalability, and performance through automation, monitoring, and capacity planning.

Develop and maintain observability systems, including distributed tracing, logging, and metrics platforms.

Establish and maintain organizational standards for monitoring, leveraging tools like Prometheus, Grafana, and OpenTelemetry.

Use observability tools to analyze runtime behavior and make data-driven decisions that improve system performance and reliability.

Partner with development teams to integrate reliability best practices into the software development lifecycle.

Manage infrastructure at scale in cloud services (AWS is an advantage) and platforms like Kubernetes.

Optimize resource utilization to reduce costs while maintaining service quality.

Contribute to the development and adoption of AI-driven tools and practices for engineering and observability.
What success looks like:

You are a trusted technical leader within the organization, mentoring others and helping shape the evolution of our SRE and observability practices.

You reduce the frequency and impact of production incidents by building resilient systems and using observability insights to address issues before they escalate.

You significantly improve observability: key metrics, logs, and traces are consistently available, well instrumented, and actionable across all critical services, enabling fast, informed decisions and rapid resolution of issues.

You are actively engaged in proactive problem solving: you identify and resolve systemic issues before they impact customers, and continuously refine SLOs and SLIs to reflect evolving business needs.
Requirements:
We are looking for:

At least 6 years of experience as an SRE or DevOps engineer.

Strong experience with Observability Tools such as OpenTelemetry, Grafana, Prometheus, and ELK stack (Elasticsearch, Logstash, Kibana).

In-depth experience with Cloud Platforms: AWS services, including EC2, S3, RDS, and CloudFormation/Terraform for infrastructure-as-code.

Strong experience working in Kubernetes environments, with a focus on Helm for deployment and configuration management.

Experience working with AI and LLM tools such as Cursor, Claude Code or similar.

Proficiency in scripting and/or development languages such as Bash or Python.

Thorough understanding of CI/CD pipelines and automation tools.

Strong experience with automation tools like Terraform and/or Ansible, and understanding of Infrastructure as Code.

Solid troubleshooting and debugging skills.

A team player with a strong can-do mentality.
This position is open to all candidates.
 
Job ID: 8656402
25/05/2026
Location: Tel Aviv-Yafo
Job Type: Full Time
At our company, we redefine the cyber defense vision by combining AI and human expertise to create products that protect nations and critical infrastructure. This is more than a job; it's a dream job. This is where we tackle real-world challenges, redefine AI and security, and make the digital world safer. Let's build something extraordinary together.
Our company's AI cybersecurity platform applies a new, out-of-the-ordinary, multi-layered approach, covering endless and evolving security challenges across the entire infrastructure of the most critical and sensitive networks. Built as part of a broader sovereign AI platform, our technology is designed to operate in on-premise, private cloud, and air-gapped environments, enabling nations to maintain full control over their data, infrastructure, and AI capabilities. Central to our company's proprietary Cyber Language Models are innovative technologies that provide contextual intelligence for the future of cybersecurity.
At our company, our talented team, driven by passion, expertise, and innovative minds, inspires us daily. We are not just dreamers, we are dream-makers.
Responsibilities
Architect and evolve the AI platform - agent orchestration, LLM gateways, context engineering pipelines, evaluation infrastructure, tool-calling systems, and retrieval pipelines - through RFCs, prototypes, and design reviews.
Lead and grow a small team of AI Engineers building the agent framework, production backend services, and AI platform infrastructure - hire, mentor, pair on hard problems, and raise the bar through hands-on code and design reviews.
Contribute to critical systems, debug production incidents, and maintain enough codebase context to make sound technical calls.
Own reliability across AI and agent services - set and enforce SLAs, build observability for non-deterministic systems, and harden tool execution environments for cost and security.
Set the standard for AI engineering practices - agent testing strategies, evaluation frameworks with human-in-the-loop oversight, retrieval quality benchmarks, and CI/CD for AI systems.
Work closely with ML Platform, Data Platform, DevOps, Data Science, and Product teams across the Applied AI Engineering group - ensure the AI platform evolves to serve teams building agentic workflows across the organization.
Measure and improve developer experience - deploy friction, onboarding time, CI turnaround - as seriously as system performance.
Requirements:
6+ years in backend software engineering, with 4+ years focused on production systems that integrate AI/ML models or LLMs.
2+ years leading an engineering team - hiring, mentoring, conducting design reviews, and shipping alongside your team.
Engineering craft - Strong Python, Go, or Java, system architecture, API design, testing, and secure coding practices.
Agentic systems & LLM integration - Deep understanding of agent orchestration, tool-use architectures, LLM integration patterns, context engineering, and frameworks like LangGraph or similar, or custom-built equivalents
Backend & platform engineering - Experience building and operating production APIs, services, and platform infrastructure at scale; comfortable working with relational databases, message queues, and event-driven architectures
RAG & retrieval - Experience with production RAG pipelines, vector databases, embedding systems, and retrieval quality
Evaluation & observability - Experience building LLM and agent eval infrastructure, monitoring AI quality, and observability for non-deterministic systems
Nice to Have:
Platform & infra - Kubernetes, AWS, Terraform or similar IaC, CI/CD, service architecture, incident management
Experience with MCP or similar tool-use protocols for agent-to-service communication
Hands-on ML experience - model training, fine-tuning, or working directly with ML pipelines.
This position is open to all candidates.
 
Job ID: 8664323
Location: Tel Aviv-Yafo
Job Type: Full Time
We're looking for a Solutions Engineer with deep experience in Big Data technologies, real-time data pipelines, and scalable infrastructure: someone who's been delivering critical systems under pressure and knows what it takes to bring complex data architectures to life. This isn't just about checking boxes on tech stacks; it's about solving real-world data problems, collaborating with smart people, and building robust, future-proof solutions.
In this role, you'll partner closely with engineering, product, and customers to design and deliver high-impact systems that move, transform, and serve data at scale. You'll help customers architect pipelines that are not only performant and cost-efficient but also easy to operate and evolve.
We want someone who's comfortable switching hats between low-level debugging, high-level architecture, and communicating clearly with stakeholders of all technical levels.
Key Responsibilities:
Build distributed data pipelines using technologies like Kafka, Spark (batch & streaming), Python, Trino, Airflow, and S3-compatible data lakes, designed for scale, modularity, and seamless integration across real-time and batch workloads.
Design, deploy, and troubleshoot hybrid cloud/on-prem environments using Terraform, Docker, Kubernetes, and CI/CD automation tools.
Implement event-driven and serverless workflows with precise control over latency, throughput, and fault tolerance trade-offs.
Create technical guides, architecture docs, and demo pipelines to support onboarding, evangelize best practices, and accelerate adoption across engineering, product, and customer-facing teams.
Integrate data validation, observability tools, and governance directly into the pipeline lifecycle.
Own end-to-end platform lifecycle: ingestion → transformation → storage (Parquet/ORC on S3) → compute layer (Trino/Spark).
Benchmark and tune storage backends (S3/NFS/SMB) and compute layers for throughput, latency, and scalability using production datasets.
Work cross-functionally with R&D to push performance limits across interactive, streaming, and ML-ready analytics workloads.
Operate and debug object store-backed data lake infrastructure, enabling schema-on-read access, high-throughput ingestion, advanced searching strategies, and performance tuning for large-scale workloads.
Requirements:
2-4 years in software / solution or infrastructure engineering, with 2-4 years focused on building / maintaining large-scale data pipelines / storage & database solutions.
Proficiency in Trino, Spark (Structured Streaming & batch) and solid working knowledge of Apache Kafka.
Coding background in Python (must-have); familiarity with Bash and scripting tools is a plus.
Deep understanding of data storage architectures including SQL, NoSQL, and HDFS.
Solid grasp of DevOps practices, including containerization (Docker), orchestration (Kubernetes), and infrastructure provisioning (Terraform).
Experience with distributed systems, stream processing, and event-driven architecture.
Hands-on familiarity with benchmarking and performance profiling for storage systems, databases, and analytics engines.
Excellent communication skills: you'll be expected to explain your thinking clearly, guide customer conversations, and collaborate across engineering and product teams.
This position is open to all candidates.
 
Show more...
Job ID: 8682670
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for a Senior AI Engineer to design and build production-grade, LLM-powered systems. You'll work at the intersection of software engineering and applied AI - shipping agents, RAG pipelines, and tool-using systems that solve real problems at scale. This is a hands-on, high-ownership role for someone who thrives at the frontier of what's possible with modern LLMs and isn't afraid to write the glue, the infrastructure, and the prompts that make it all work.
This is a cross-functional, company-wide role. You won't be embedded in a single product team - instead, you'll partner with every department to identify high-leverage opportunities and build AI-powered tools and workflows that boost productivity and efficiency across the entire organization.
This is a great opportunity to be part of one of the fastest-growing infrastructure companies in history, an organization that is in the center of the hurricane being created by the revolution in artificial intelligence.
"Our company's data management vision is the future of the market." - Forbes
We are the data platform company for the AI era. We are building the enterprise software infrastructure to capture, catalog, refine, enrich, and protect massive datasets and make them available for real-time data analysis and AI training and inference. Designed from the ground up to make AI simple to deploy and manage, our company takes the cost and complexity out of deploying enterprise and AI infrastructure across data center, edge, and cloud.
Our success has been built through intense innovation, a customer-first mentality, and a team of fearless workers who leverage their skills and experience to make real market impact. This is an opportunity to be a key contributor at a pivotal time in our company's growth and at a pivotal point in computing history.
What You'll Do:
- Design, build, and operate LLM-powered applications, agents, and workflows end-to-end - from prototype to production.
- Architect retrieval, context engineering, and tool-use strategies that make models reliable, accurate, and cost-efficient.
- Integrate LLMs with internal services, third-party APIs, and data stores to automate complex business and engineering workflows.
- Build, evaluate, and continuously improve evaluation harnesses for non-deterministic systems.
- Collaborate closely with product, research, and platform teams to translate ambiguous problems into shipped capabilities.
- Stay ahead of the rapidly evolving LLM ecosystem (models, frameworks, agentic patterns) and bring the best ideas into our stack.
Requirements:
Engineering Foundations:
- Strong Python skills: you write clean, idiomatic, well-tested code and understand the language deeply.
- Hands-on experience using coding agents (Cursor, Claude Code, GitHub Copilot, or similar) to build complex software systems. You know how to delegate effectively to AI assistants and review their output critically.
- Experience with multiple database paradigms: both SQL (PostgreSQL, MySQL) and NoSQL (MongoDB, Redis, DynamoDB, or similar). You can choose the right tool for the job.
- Experience designing and integrating with third-party APIs (REST and gRPC). Comfortable building robust clients, handling auth, retries, rate limits, and schema evolution.
- Production experience with Docker and Kubernetes: containerizing services, writing manifests, and debugging deployments.
- Strong Linux fundamentals: confident in bash and the terminal; you can navigate, script, and troubleshoot a server without reaching for a GUI.
- Experience building cloud-native tools on AWS, GCP, or Azure (compute, storage, queues, serverless, IAM).
AI / LLM Expertise:
- Solid understanding of what an LLM is and how it works: tokenization, attention, context windows, sampling, and the practical implications of each for system design.
This position is open to all candidates.
 
Job ID: 8682766
05/05/2026
Location: Tel Aviv-Yafo
Job Type: Full Time
Your Career:
Join a team of senior engineers operating in a large-scale, multi-cloud production environment supporting tens of thousands of enterprise customers worldwide. This is not a typical SRE role - you'll work at the core of a complex, high-impact system alongside experienced DevOps professionals in a fast-paced, cybersecurity-focused organization.
Your Impact:
Own and operate large-scale, global production environments across multiple cloud providers (GCP, AWS, Azure)
Actively monitor, investigate, and resolve incidents triggered by automated alerting systems (PagerDuty / Incident Response)
Drive end-to-end troubleshooting across complex, distributed systems with high context switching
Design, deploy, and improve monitoring and observability systems (e.g., Prometheus, Grafana) - not just react to alerts
Collaborate closely with internal teams (CX, CS, Engineering) to ensure system reliability and performance
Work hands-on with modern DevOps and infrastructure tools including Kubernetes, Terraform, CI/CD pipelines, and GitOps workflows
Develop and maintain automation and tooling (primarily in Python)
Gain deep understanding of system architecture and interconnected services
Contribute to a culture of operational excellence in a high-scale, high-availability environment
On-call responsibilities:
Daytime hours (12:00-20:00)
Occasional weekends and holidays (rotation-based).
Requirements:
Your experience:
5+ years of experience in SRE roles in production environments at scale
Strong hands-on experience with Kubernetes and Terraform
Strong hands-on experience with at least one major cloud platform (GCP or AWS required)
Experience building and configuring monitoring systems (e.g., Prometheus, Grafana)
Familiarity with CI/CD and GitOps tools (GitLab CI, GitHub Actions, Jenkins, Flux)
Proficiency in Python for scripting and automation
Strong troubleshooting and problem-solving skills with a passion for incident handling
Ability to work in fast-paced environments with high context switching
Highly responsive, proactive, and ownership-driven
Strong collaboration and communication skills
Curious mindset and eagerness to learn.
This position is open to all candidates.
 
Job ID: 8638182
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for a Senior DevOps Engineer to join our R&D team in developing the next rising product in the health tech landscape. If you are looking for a challenging, influential position and are passionate about making an impact, this might be the role for you.

As a Senior DevOps Engineer, you'll play a key role in the design, development, testing, deployment, and monitoring of our infrastructure and products. In this position, you'll make significant contributions to our observability stack, helping build and maintain robust systems for logs, metrics, traces, and alerting.

Our ideal candidate is passionate about DevOps and observability, has strong communication skills, and thrives on constant improvement of both technology and processes. If you enjoy working on multiple projects in parallel and are a proactive team player, you'll fit right in.

This is a unique opportunity to join the core team of a fast-growing startup, where your contributions will have a direct impact on our product and success.

Responsibilities

Support and collaborate with cross-functional engineering teams using cutting-edge technologies.
Contribute to the design, implementation, and maintenance of monitoring, logging, and alerting systems (e.g., Prometheus, Grafana, Loki)
Secure, scale, and manage our cloud environments (AWS and GCP)
Design and implement automation solutions for both development and production
Manage and improve our CI/CD pipelines for fast and safe delivery
Lead best practices in infrastructure, observability, configuration management, and system hardening
Continuously assess and improve existing infrastructure in line with industry standards
Requirements:
BSc in Computer Science, Engineering, or equivalent experience
5+ years of experience as a DevOps Engineer or similar software engineering role
Proven experience with Docker and Kubernetes (EKS preferred)
Hands-on experience with monitoring and observability tools, including Prometheus, Grafana, Datadog, or similar.
Expertise in Terraform for AWS infrastructure-as-code deployments
Strong collaboration and interpersonal communication skills
Excellent analytical thinking and problem-solving mindset
Proficiency with relational databases
Solid knowledge of Python and Bash scripting
Experience with test automation - an advantage
This position is open to all candidates.
 
Job ID: 8671069
25/05/2026
Location: Tel Aviv-Yafo
Job Type: Full Time
At our company, we redefine the cyber defense vision by combining AI and human expertise to create products that protect nations and critical infrastructure. This is more than a job; it's a dream job. This is where we tackle real-world challenges, redefine AI and security, and make the digital world safer. Let's build something extraordinary together.
Our company's AI cybersecurity platform applies a new, out-of-the-ordinary, multi-layered approach, covering endless and evolving security challenges across the entire infrastructure of the most critical and sensitive networks. Built as part of a broader sovereign AI platform, our technology is designed to operate in on-premise, private cloud, and air-gapped environments, enabling nations to maintain full control over their data, infrastructure, and AI capabilities. Central to our company's proprietary Cyber Language Models are innovative technologies that provide contextual intelligence for the future of cybersecurity.
At our company, our talented team, driven by passion, expertise, and innovative minds, inspires us daily. We are not just dreamers, we are dream-makers.
Responsibilities
Design and build agentic systems - single and multi-agent workflows with planning, memory, context engineering, and tool use - for both internal automation and product-facing autonomous capabilities operating over long time horizons.
Build and operate the AI platform layer - LLM gateways, prompt management, structured output handling, tool-calling infrastructure, and cost/latency optimization - deployed on Kubernetes, consumed by every team for their agentic work.
Own the agent framework layer - orchestration primitives, execution environments, state management, and sandboxed tool execution - giving every team at our company the building blocks to create and operate their own agents.
Build evaluation infrastructure that gives teams confidence in agent behavior - automated LLM and agent evals for quality, correctness, safety, latency, cost, and regressions, including human-in-the-loop oversight for mission-critical workflows.
Productionize and harden backend services (APIs, gRPC, async workers) that integrate LLMs - with proper error handling, retries, circuit breakers, and high-availability patterns.
Own RAG pipelines and retrieval systems - indexing, chunking, embedding, vector database management, filtering, and relevance tuning for production retrieval.
Optimize performance and cost across the AI stack - model routing, caching, batching, and inference cost management.
Ship shared tooling - libraries, SDKs, agent templates, and documentation - while working closely with ML Platform, Data Platform, DevOps, and other teams across the Applied AI Engineering group. Own architecture, documentation, and operations end-to-end.
Requirements:
5+ years in backend or distributed systems engineering, with 2+ years focused on production systems that integrate AI/ML models or LLMs.
Engineering craft - Strong Python, Go, or Java, system architecture, API design, testing, and secure coding practices.
Agentic systems - Experience designing and building agent orchestration, tool-use systems, and autonomous workflows; familiarity with frameworks like LangGraph or similar, or having built equivalent from scratch
Backend engineering - Experience building production APIs and services (FastAPI or similar); async programming, service architecture, high-availability, and reliability patterns (retries, circuit breakers, backpressure)
LLM integration - Hands-on experience integrating LLMs via SDKs and APIs; context engineering, structured outputs, tool calling, and model routing
RAG & retrieval - Experience with embedding pipelines, vector databases (e.g., Milvus, Qdrant, Pinecone), chunking strategies, and relevance tuning
Evaluation & observability - Experience designing LLM and agent evals, monitoring AI system quality, and building observability for non-deterministic systems.
This position is open to all candidates.
 
Job ID: 8664306
25/05/2026
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for an experienced Senior DevOps Engineer who is passionate about software design, development, and deployment to join our DevOps team in the Posture R&D Group. The role goes beyond traditional DevOps - it focuses on building the infrastructure and platforms that enable AI models and autonomous agents to run in production at scale, across both cloud and on-prem environments. The job involves writing production-grade, modern DevOps solutions that will be shipped to cloud and on-prem environments, while working with cutting-edge technologies and architectures that push the boundaries of AI-driven cybersecurity systems.
Responsibilities
Build the best solutions for our production platform, enabling high-scale, AI-driven systems and agents to operate reliably in production environments.
Everything-as-code approach (IaC): run our infrastructure with a wide range of technologies, including Terraform and Kubernetes.
Build and maintain tools for automation, deployment, monitoring, and operations, with a strong focus on the scalability, resilience, and observability of distributed systems.
Troubleshoot complex issues in our development, production, and test environments, including large-scale, distributed, and AI-integrated systems.
Excellent communication and people skills.
Requirements:
8+ years of experience with DevOps technologies.
Extensive background leading the design, build, and evolution of end-to-end DevOps platforms, including infrastructure, tooling, and operational frameworks across the software lifecycle.
Deep expertise with one of the major cloud providers: AWS (preferred), GCP, or Azure.
Extensive experience with modern deployment strategies (GitOps, blue/green, canary, Kubernetes-based deployments).
Strong experience designing and optimizing end-to-end CI/CD pipelines, enabling high velocity, reliable software delivery.
Experience bootstrapping projects, introducing new technologies, and building systems from scratch.
Background in working with AI components and understanding the challenges of bringing AI workloads into production.
Good coding skills (Python, Bash, etc.).
Experience mentoring engineers, leading cross-functional initiatives, and influencing technical direction.
Advantages:
Experience with on-prem environments and solutions.
Prior experience with endpoint security products (agents, sensors, collectors).
Tech Stack: AWS, Kubernetes, EKS, Jenkins, IaC, GitHub, Terraform, Python, Docker, ArgoCD, MongoDB, RabbitMQ, Redis, Go, Neo4j, AI, and more.
This position is open to all candidates.
 
8664565
05/05/2026
Location: Tel Aviv-Yafo
Job Type: Full Time
As a Site Reliability Engineer on the SASE Platform team, you will play a critical role in building and operating highly available, secure, and globally distributed services. Your mission is to ensure our cloud-native security and networking platform is reliable, scalable, and performant from day one, protecting the users, applications, and data for the world's largest enterprises as they adopt cloud, remote work, and AI.
Key Responsibilities
Proactively collaborate with development teams to embed reliability, scalability, and operability into services from the earliest design stages.
Design, review, and evolve cloud-native architectures to improve availability, performance, cost efficiency, and fault tolerance.
Build and operate automation for provisioning, deploying, and managing global infrastructure using Infrastructure as Code (IaC).
Improve CI/CD pipelines and release processes to enable safe, fast, and repeatable deployments.
Drive observability best practices, including metrics, logs, traces, and SLIs/SLOs to enable data-driven incident analysis.
Participate in on-call rotations, reducing mean time to resolution (MTTR) through automation and proactive reliability improvements.
Challenge existing processes by championing reliability, security, and operational maturity across the organization.
Requirements:
Your experience:
5+ years of experience working with Unix/Linux systems, including shell, tools, networking, and kernel concepts.
2+ years of hands-on experience with microservices architectures running on Kubernetes and container platforms.
Proven experience operating workloads in public cloud environments (e.g., AWS, GCP, Azure) at scale.
Proficiency in building automation and tools in at least one scripting or programming language (e.g., Python, Go, Java).
Strong experience with Infrastructure as Code (IaC) tools such as Terraform or Ansible.
Bachelor's degree in Engineering, Computer Science, or a related technical field, or equivalent practical experience.
Preferred Qualifications
Deep expertise in designing and operating monitoring, alerting, and observability systems (e.g., Prometheus, Grafana, ELK Stack).
Advanced networking expertise, including TCP/IP, DNS, BGP, routing, and cloud networking concepts relevant to SASE architectures.
Prior experience operating or supporting SASE, SD-WAN, Zero Trust, or network security platforms.
Familiarity with using AI/LLM technologies to improve operational workflows (e.g., incident analysis, automation).
This position is open to all candidates.
 
8638178