Senior Site Reliability Engineer (Cortex)

Posted 2 hours ago
Location: Tel Aviv-Yafo
Job Type: Full Time
Your Career:
Join a team of senior engineers operating in a large-scale, multi-cloud production environment supporting tens of thousands of enterprise customers worldwide. This is not a typical SRE role - you'll work at the core of a complex, high-impact system alongside experienced DevOps professionals in a fast-paced, cybersecurity-focused organization.
Your Impact:
Own and operate large-scale, global production environments across multiple cloud providers (GCP, AWS, Azure)
Actively monitor, investigate, and resolve incidents triggered by automated alerting systems (PagerDuty / Incident Response)
Drive end-to-end troubleshooting across complex, distributed systems with high context switching
Design, deploy, and improve monitoring and observability systems (e.g., Prometheus, Grafana) - not just react to alerts
Collaborate closely with internal teams (CX, CS, Engineering) to ensure system reliability and performance
Work hands-on with modern DevOps and infrastructure tools including Kubernetes, Terraform, CI/CD pipelines, and GitOps workflows
Develop and maintain automation and tooling (primarily in Python)
Gain deep understanding of system architecture and interconnected services
Contribute to a culture of operational excellence in a high-scale, high-availability environment
On call responsibilities:
Daytime hours (12:00-20:00)
Occasional weekends and holidays (rotation-based).
Requirements:
Your experience:
5+ years of experience in SRE roles in production environments at scale
Strong hands-on experience with Kubernetes and Terraform
Strong hands-on experience with at least one major cloud platform (GCP or AWS required)
Experience building and configuring monitoring systems (e.g., Prometheus, Grafana)
Familiarity with CI/CD and GitOps tools (GitLab CI, GitHub Actions, Jenkins, Flux)
Proficiency in Python for scripting and automation
Strong troubleshooting and problem-solving skills with a passion for incident handling
Ability to work in fast-paced environments with high context switching
Highly responsive, proactive, and ownership-driven
Strong collaboration and communication skills
Curious mindset and eagerness to learn.
This position is open to all candidates.
 
8638182
Similar jobs that may interest you
Posted 13 hours ago
Jobs at CrowdStrike
Location: Tel Aviv-Yafo
Job Type: Full Time
CrowdStrike's Data Science Studio is seeking a pioneering Senior MLOps Engineer to establish and lead our MLOps function from the ground up. As the first MLOps engineer in the studio, you will play a foundational role in shaping how we build, deploy, and scale machine learning systems that protect thousands of organizations worldwide.

This is a unique opportunity to define the technical strategy, influence the technology stack, and architect the infrastructure that will power our AI/ML-driven security solutions for years to come.

This role combines strategic vision with hands-on execution. You'll work at the intersection of data science, engineering, and production operations - building production-grade systems that operate at immense scale while collaborating closely with highly technical data scientists and ML engineering teams across CrowdStrike.

What You'll Do:
- Architect MLOps infrastructure from the ground up: Design and implement the foundational MLOps platform, establishing best practices, tooling, and workflows that will scale with our growing data science initiatives
- Define technology strategy: Evaluate, select, and integrate MLOps technologies and platforms that best serve our needs - from experiment tracking and model versioning to deployment pipelines and monitoring systems
- Build production-grade ML pipelines: Develop robust, scalable pipelines for model training, validation, deployment, and monitoring that handle massive data volumes and ensure reliability in production
- Enable data scientist productivity: Create tools, frameworks, and automation that empower data scientists to move quickly from research to production while maintaining high quality and reliability standards
- Establish monitoring and observability: Implement comprehensive monitoring, logging, and alerting systems to ensure ML models perform optimally in production and issues are detected proactively
- Drive MLOps culture and practices: Champion best practices in ML engineering, CI/CD for ML, model governance, and reproducibility across the data science organization
- Collaborate cross-functionally: Partner closely with data scientists to understand their workflows and pain points, and work with ML engineering teams to ensure seamless integration with broader platform capabilities
- Scale for the future: Design systems with scalability, security, and maintainability in mind, anticipating the needs of a rapidly growing ML portfolio
Requirements:
- 6+ years of experience in MLOps, ML engineering, DevOps, or related infrastructure roles with focus on machine learning systems
- Production ML systems expertise: Proven track record of building and operating ML systems at scale in production environments
- Strong infrastructure and automation skills: Deep knowledge of cloud platforms (AWS, Azure, or GCP), containerization (Docker, Kubernetes), and infrastructure-as-code (Terraform, CloudFormation)
- ML pipeline proficiency: Hands-on experience with ML workflow orchestration tools (e.g., Airflow, Kubeflow, MLflow, Metaflow) and building end-to-end ML pipelines
- Programming excellence: Strong coding skills in Python; experience with additional languages is a plus
- CI/CD and DevOps practices: Expertise in building automated deployment pipelines, version control, and modern DevOps methodologies
- Strategic and hands-on balance: Ability to think architecturally about long-term solutions while rolling up your sleeves to implement them
- Collaborative mindset: Excellent communication skills and ability to work effectively with data scientists, engineers, and stakeholders with varying technical backgrounds
- Startup mentality: Comfort with ambiguity and ability to build from scratch in a fast-paced environment
This position is open to all candidates.
 
8611396
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
At our company, we believe people are capable of more than a single job description. You're not hired just to fill a position - you're empowered to shape it, grow it, and make it your own.
We call this being Positionless.
And Positionless isn't just our culture. It's our product.
We are the creator of Positionless Marketing, an AI-powered platform that gives every marketer the power to analyze, create, launch, and optimize independently. The result is faster execution, deeper personalization, and 88% greater campaign efficiency.
Recognized as a Visionary in Gartner's Magic Quadrant, we partner with leading brands like Sephora, Staples, and Entain. Today, more than 550 of our people across NYC, London, Tel Aviv, Scotland, Brazil, Estonia, and beyond are building the future of marketing together, in an environment that actively encourages ownership and growth, with two out of every three managers promoted from within.
If you're looking for a place where you can do more, be more, come grow with us.
Are you passionate about ensuring system reliability, scalability, and performance? Do you thrive in a dynamic environment where automation and operational excellence are key?
We are looking for a Site Reliability Engineer (SRE) to join our team and play a crucial role in designing, implementing, and maintaining our cloud-based infrastructure. In this role, you will collaborate across teams to drive automation, improve system resilience, and optimize performance while fostering a culture of reliability.
Responsibilities:
System Reliability - Ensure high availability and performance of services through effective monitoring, incident management, and root cause analysis.
Automation & Tooling - Develop and maintain automation for infrastructure provisioning, configuration management, and application deployment.
Performance Optimization - Analyze and enhance system performance, including load balancing, caching, and database tuning. Conduct regular capacity planning.
Incident Response & Troubleshooting - Lead incident response efforts, participate in on-call rotations, and troubleshoot complex infrastructure issues.
Security & Compliance - Collaborate with security teams to implement best practices and ensure compliance with relevant standards (ISO 27001, SOC 2, etc.).
Collaboration & Mentorship - Work closely with developers, DevOps, Support, and product teams to enhance application reliability and implement SRE best practices.
Requirements:
4+ years in Site Reliability Engineering, DevOps, or related roles.
Proven experience managing large-scale, cloud-based infrastructure in GCP, AWS, or Azure.
Expertise in container orchestration (Kubernetes, Docker) and microservices architecture.
Strong proficiency in scripting and programming languages (Python, Go, Bash, etc.).
Experience with CI/CD pipelines, infrastructure as code (Terraform, CloudFormation), and configuration management (Ansible, Puppet, Chef).
Hands-on experience with monitoring and observability tools (Datadog, Prometheus, Grafana, ELK Stack).
Experience using AI tools to enhance SRE processes, such as intelligent monitoring, incident prediction, and automation of incident response.
Deep understanding of networking concepts, DNS, load balancing, and distributed systems.
Strong problem-solving skills, excellent communication, and a proactive mindset.
Advantages:
Certifications - AWS Certified Solutions Architect, GCP Professional Cloud Architect, or Kubernetes certifications (CKA, CKAD).
This position is open to all candidates.
 
8594736
Posted 2 hours ago
Location: Tel Aviv-Yafo
Job Type: Full Time
As a Site Reliability Engineer on the SASE Platform team, you will play a critical role in building and operating highly available, secure, and globally distributed services. Your mission is to ensure our cloud-native security and networking platform is reliable, scalable, and performant from day one, protecting the users, applications, and data for the world's largest enterprises as they adopt cloud, remote work, and AI.
Key Responsibilities
Proactively collaborate with development teams to embed reliability, scalability, and operability into services from the earliest design stages.
Design, review, and evolve cloud-native architectures to improve availability, performance, cost efficiency, and fault tolerance.
Build and operate automation for provisioning, deploying, and managing global infrastructure using Infrastructure as Code (IaC).
Improve CI/CD pipelines and release processes to enable safe, fast, and repeatable deployments.
Drive observability best practices, including metrics, logs, traces, and SLIs/SLOs to enable data-driven incident analysis.
Participate in on-call rotations, reducing mean time to resolution (MTTR) through automation and proactive reliability improvements.
Challenge existing processes by championing reliability, security, and operational maturity across the organization.
Requirements:
Your experience:
5+ years of experience working with Unix/Linux systems, including shell, tools, networking, and kernel concepts.
2+ years of hands-on experience with microservices architectures running on Kubernetes and container platforms.
Proven experience operating workloads in public cloud environments (e.g., AWS, GCP, Azure) at scale.
Proficiency in building automation and tools in at least one scripting or programming language (e.g., Python, Go, Java).
Strong experience with Infrastructure as Code (IaC) tools such as Terraform or Ansible.
Bachelor's degree in Engineering, Computer Science, or a related technical field, or equivalent practical experience.
Preferred Qualifications
Deep expertise in designing and operating monitoring, alerting, and observability systems (e.g., Prometheus, Grafana, ELK Stack).
Advanced networking expertise, including TCP/IP, DNS, BGP, routing, and cloud networking concepts relevant to SASE architectures.
Prior experience operating or supporting SASE, SD-WAN, Zero Trust, or network security platforms.
Familiarity with using AI/LLM technologies to improve operational workflows (e.g., incident analysis, automation).
This position is open to all candidates.
 
8638178
Posted 4 hours ago
Location: Tel Aviv-Yafo
Job Type: Full Time
As a Site Reliability Engineer on the SASE Platform team, you will play a critical role in building and operating highly available, secure, and globally distributed services. Your mission is to ensure our cloud-native security and networking platform is reliable, scalable, and performant from day one, protecting the users, applications, and data for the world's largest enterprises as they adopt cloud, remote work, and AI.
Your Impact:
Proactively collaborate with development teams to embed reliability, scalability, and operability into services from the earliest design stages.
Design, review, and evolve cloud-native architectures to improve availability, performance, cost efficiency, and fault tolerance.
Build and operate automation for provisioning, deploying, and managing global infrastructure using Infrastructure as Code (IaC).
Improve CI/CD pipelines and release processes to enable safe, fast, and repeatable deployments.
Drive observability best practices, including metrics, logs, traces, and SLIs/SLOs to enable data-driven incident analysis.
Participate in on-call rotations, reducing mean time to resolution (MTTR) through automation and proactive reliability improvements.
Challenge existing processes by championing reliability, security, and operational maturity across the organization.
Requirements:
Your Experience
5+ years of experience working with Unix/Linux systems, including shell, tools, networking, and kernel concepts.
2+ years of hands-on experience with microservices architectures running on Kubernetes and container platforms.
Proven experience operating workloads in public cloud environments (e.g., AWS, GCP, Azure) at scale.
Proficiency in building automation and tools in at least one scripting or programming language (e.g., Python, Go, Java).
Strong experience with Infrastructure as Code (IaC) tools such as Terraform or Ansible.
Bachelor's degree in Engineering, Computer Science, or a related technical field, or equivalent practical experience.
Nice to have:
Deep expertise in designing and operating monitoring, alerting, and observability systems (e.g., Prometheus, Grafana, ELK Stack).
Advanced networking expertise, including TCP/IP, DNS, BGP, routing, and cloud networking concepts relevant to SASE architectures.
Prior experience operating or supporting SASE, SD-WAN, Zero Trust, or network security platforms.
Familiarity with using AI/LLM technologies to improve operational workflows (e.g., incident analysis, automation).
This position is open to all candidates.
 
8638041
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for a Senior Software Engineer to join our CI Infrastructure team within the Platform Engineering organization.
The team designs, builds, and operates the internal engineering platform that powers our company's build, test, and security validation workflows. This includes large-scale orchestration of complex test environments, hybrid cloud/on-prem execution infrastructure, DevSecOps integrations (SCA, SAST, policy enforcement), and advanced optimization mechanisms.
Our mission is to provide a scalable, reliable, and intelligent CI platform that enables hundreds of engineers to build and validate a complex distributed networking product efficiently and safely.
This is a hands-on engineering role focused on scalability, observability, reliability, and intelligent automation, with direct impact on engineering velocity and release confidence.
What You'll Do
Design and evolve scalable CI infrastructure for build and large-scale test execution
Develop automation and orchestration systems across hybrid environments (AWS and on-prem)
Integrate and optimize security validation flows (SCA, SAST, quality gates) within CI
Improve reliability, performance, and observability across high-volume CI workloads
Develop AI-assisted tooling and agents to optimize test selection, failure analysis, and resource utilization
Analyze CI data to identify bottlenecks, flakiness patterns, and optimization opportunities
Collaborate with R&D, Automation, and Security teams to continuously improve CI architecture and best practices
Take ownership of critical platform components used daily by large engineering teams.
Requirements:
What We're Looking For
B.Sc. in Computer Science or equivalent practical experience
5+ years of hands-on software engineering or infrastructure development experience
Strong programming skills in Python (or similar high-level language)
Experience designing and building scalable systems or automation frameworks
Solid understanding of Linux, containers (Docker), and Git-based workflows
Experience working in cloud and hybrid infrastructure environments
Strong system-level thinking and troubleshooting skills
Ability to take ownership and drive solutions end-to-end
Nice to Have
Experience with CI/CD systems (Jenkins, GitHub Actions, or similar)
Background in large-scale test infrastructure or build systems
Experience integrating security tools (SCA, SAST, SBOM, vulnerability management)
Experience building AI-assisted developer productivity tools
Familiarity with observability stacks (metrics, logs, tracing)
Experience improving reliability and performance of large engineering platforms
Personal Qualities
Strong ownership mindset and sound engineering judgment
Passion for building scalable, reliable infrastructure
Data-driven approach to optimization and continuous improvement
Excellent communication and cross-team collaboration skills
Comfortable operating in complex, distributed environments.
This position is open to all candidates.
 
8595577
05/04/2026
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are on a mission to bring identity security everywhere - to every human, machine, and AI agent, both on-prem and in the cloud. Our unique technology secures identities & access at runtime, in ways that weren't possible before. With the broadest identity security platform in the market, trusted by more than 1,000 customers including many Fortune 100 companies, we are uniquely positioned to lead the fast-growing identity security category.
Joining our company means becoming part of a fast-moving team with a culture of innovation and collaboration, that goes above and beyond to help our customers and each other, on a journey to reshape the future of identity security.
As a DevOps Engineer, you'll design and implement a full CI/CD solution on AWS and Azure with Kubernetes, while supporting and scaling our cloud infrastructure using Infrastructure as Code.
Responsibilities
Design and implement robust, scalable, and highly available cloud solutions to ensure seamless service delivery and support rapid business growth
Focus on cross-service integrations and components within the SaaS stack, adhering to DevOps best practices
Develop, enhance, and maintain CI/CD pipelines using a GitOps-driven approach, ensuring efficient and secure deployment workflows
Streamline processes through automation, focusing on scalability, security, metric collection, and enhanced visibility across environments
Partner with developers to optimize service reliability, performance, failover strategies, and scalability
Work as part of an innovative and high-performing team, leveraging modern tools and technologies.
Requirements:
5+ years of experience in DevOps roles, with a proven track record of managing large-scale systems
3+ years of hands-on experience with cloud platforms, preferably AWS or Azure
Proficiency in Kubernetes for container orchestration (must)
Strong experience with Infrastructure as Code (IaC) tools, particularly Terraform (must)
Advanced expertise in enterprise Linux administration in production environments, including deployment, configuration, and lifecycle management
In-depth knowledge of Continuous Integration and Continuous Delivery (CI/CD) and GitOps methodologies, with tools such as Jenkins, GitHub Actions, and ArgoCD
Expertise in configuration management tools such as Ansible
Familiarity with message bus technologies (e.g., RabbitMQ, Kafka, or similar)
Hands-on experience with monitoring and logging solutions
Proficiency in automation scripting using Bash and at least one programming language such as Python or Go
Solid understanding of networking concepts and information security, including firewalls, VPNs, LDAP, identity management, and access control
Strong collaboration skills, with the ability to work across teams, communicate complex issues, and support technical decision-making
Analytical and proactive, with high ownership and accountability for end-to-end system reliability in dynamic environments.
This position is open to all candidates.
 
8600847
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for a hands-on Senior DevOps Engineer with a strong cloud-native mindset to build, maintain, and evolve our highly scalable, highly available cloud infrastructure. This role is pivotal in driving operational excellence, security, and automation across our entire engineering organization. You will promote communication, integration, and collaboration to significantly enhance our software development productivity and reliability. You'll work closely with engineering and product teams to streamline delivery, enforce platform standards, and enable a high-velocity development environment, all while keeping reliability and security top of mind.
Responsibilities:
Design, automate, and manage complex cloud infrastructure on AWS using best-in-class Infrastructure as Code (IaC) practices.
Lead the operation and enhancement of our production Kubernetes environments (EKS), focusing on automation, security, observability, and seamless CI/CD integration.
Drive continuous improvement across platform tooling, developer experience, and operational processes to meet our ambitious performance and uptime goals.
Implement and enforce security-first infrastructure patterns, including strong IAM, network segmentation, and secure secrets management.
Actively contribute to high-level technical design discussions and cross-functional architectural decision-making, ensuring solutions align with long-term platform strategy.
Requirements:
7+ years of experience as a DevOps Engineer, Platform Engineer, or in a similar infrastructure-focused role.
Strong hands-on expertise across the AWS Stack (e.g. EC2, EKS, RDS, VPC, IAM, S3, Lambda).
Mastery of Infrastructure as Code - Terraform or equivalent.
Deep operational knowledge of Kubernetes, including architecture, cluster management, networking, and advanced debugging in production environments.
Strong expertise in designing and managing CI/CD methodologies and platforms (e.g. Jenkins, GitHub Actions).
Experience with monitoring tools such as Prometheus, Datadog, Coralogix (OTEL), Grafana, etc.
Proven prior experience building and maintaining highly-available, production-grade, and service-oriented systems.
Strong scripting and automation background in languages such as Python or Bash.
Exceptional communication and collaboration skills with the ability to articulate complex technical needs and influence cross-functional teams.
Strong knowledge of AWS Networking - an advantage.
This position is open to all candidates.
 
8625782
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for a Senior AI Engineer to join our Cybersecurity team in Tel Aviv. You will design, build, and productionize LLM-powered applications, multi-agent systems, and MLOps infrastructure that power our company's next-generation cybersecurity capabilities. This is a high-impact, hands-on role at the intersection of applied AI, agentic systems, and network security.
What You'll Do
Design and develop LLM-powered security features and internal AI tools, including RAG pipelines, multi-agent workflows, and prompt-engineered systems tailored for cybersecurity use cases
Architect and operate multi-agent systems in production - including agent orchestration, inter-agent communication, task delegation, and failure handling at scale
Build robust agent monitoring and observability pipelines: tracing agent execution, detecting drift or failure, alerting on anomalous behavior, and maintaining agent reliability SLAs
Build and maintain scalable MLOps infrastructure: model serving, evaluation frameworks, experiment tracking, and CI/CD for ML models
Work with internal datasets (network telemetry, security logs, threat intelligence) to fine-tune and adapt foundation models for domain-specific detection and response tasks
Partner with the Cybersecurity, R&D, and infrastructure teams to define AI-driven security features and deliver them end-to-end
Establish best practices for model observability, safety, and responsible AI deployment within the organization
Stay current with the fast-moving LLM/GenAI and agentic AI ecosystem and evaluate emerging frameworks, models, and tools for adoption.
Requirements:
Must-Have
5-8 years of software engineering experience, with at least 2-3 years focused on AI/ML engineering
Hands-on experience building production-grade LLM applications - RAG, agents, tool use, or fine-tuning
Proven experience designing and running multi-agent systems in production: orchestration patterns, agent state management, retries, and graceful degradation
Experience monitoring and observing AI agents in production - execution tracing, latency tracking, failure detection, and alerting (e.g., LangSmith, Arize, custom observability stacks)
Proficiency with agentic frameworks: LangChain, LangGraph, and/or AWS Bedrock AgentCore
Strong Python skills and comfort working across the full AI application stack
Experience designing and operating MLOps pipelines (model versioning, deployment, monitoring)
Solid understanding of transformer-based models, embeddings, and vector databases (e.g., Pinecone, Weaviate, pgvector)
Comfortable working in cloud environments (AWS, GCP, or Azure) and containerized deployments (Docker, Kubernetes)
Strong problem-solving skills and ability to work autonomously in a fast-paced environment
Nice-to-Have
Background in cybersecurity - threat detection, SIEM, SOC automation, or security data analysis - a significant plus for this role
Familiarity with networking concepts (SDN, cloud-native networking, BGP, telemetry)
Experience with model evaluation and benchmarking (LLM-as-judge, RAGAS, or custom eval harnesses)
Exposure to MCP (Model Context Protocol) for tool-augmented agentic workflows
Prior experience in enterprise SaaS, networking, or telecom domains
Publications, open-source contributions, or projects in the LLM/GenAI or agentic AI space
Our Stack
Python, PyTorch, OpenAI / Anthropic APIs, LangChain, LangGraph, AWS Bedrock AgentCore, LangSmith, Kubernetes, Kafka, Elasticsearch, AWS, PostgreSQL, GitHub, Jira, Confluence.
This position is open to all candidates.
 
8595648
07/04/2026
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for a talented DevOps Engineer to join us on our mission to simplify decision-making for millions!

What You'll Do

Responsibilities

Design, own, and evolve DevOps tooling, CI/CD architecture, and infrastructure automation strategies across multi-cloud environments (AWS & GCP), supporting high-performance, resilient production systems.
Manage Kubernetes clusters (EKS, GKE) and containerized microservices at scale, leveraging Helm, IaC, and other cloud-native technologies.
Collaborate with engineering and data teams to optimize cloud-native architectures for performance, cost-efficiency, scalability, and high availability.
Automate infrastructure and pipeline workflows using Python, Bash, and Groovy, with IaC tools like Terraform and CloudFormation, and CI/CD platforms such as Jenkins and GitHub Actions.
Support data workflows and ML deployments using orchestration tools like Airflow and CI/CD for data pipelines.
Work with AI-native tooling (e.g., MCP, agent frameworks, Cursor, OpenAI, Gemini and Vertex).
Bring out-of-the-box thinking, excellent problem-solving skills, and the ability to debug complex systems.
Requirements:
3+ years of hands-on experience with AWS in production environments, with strong working knowledge of Linux-based systems for deployment, debugging, and automation.
3+ years of DevOps experience supporting production-grade systems with high availability, scalability, and operational reliability.
Strong expertise in Kubernetes-based orchestration (EKS, KOps, GKE).
Extensive experience with CI/CD tools such as Git, GitHub, Jenkins, GitHub Actions, and Nexus.
Proficiency in scripting/programming languages, including Bash, Python, or Groovy, for automating infrastructure and pipelines.
Experience with Infrastructure as Code (IaC) tools like Terraform and CloudFormation.
Experience with logging, metrics, and observability stacks, such as Datadog, Telegraf, Elasticsearch, Kibana, Prometheus, and Grafana.
Ability to troubleshoot and debug complex, distributed systems across multiple cloud environments.
Only candidates meeting the above requirements will be considered.
This position is open to all candidates.
 
8602273
Location: Tel Aviv-Yafo
Job Type: Full Time and Hybrid work
We are seeking an exceptional Senior Backend Engineer to join our Platform Group, where you will architect and develop shared infrastructure components that power enterprise security and identity products. This is a highly collaborative role requiring both technical excellence and strong cross-functional partnership skills.
What You'll Do:
As a Senior Backend Engineer on our Core Platform team, you will:
Design and build shared platform components used across multiple product teams, ensuring scalability, reliability, and maintainability
Architect distributed systems using microservices and event-driven patterns that support enterprise-scale workloads
Own critical backend services from conception through deployment, including code quality, performance optimization, and operational excellence
Collaborate extensively with product engineers, DevOps, and architects to define platform capabilities and technical standards
Mentor junior engineers through code reviews, pair programming, and technical guidance
Drive technical initiatives that improve developer productivity, system observability, and platform resilience
Contribute to architectural decisions and establish best practices for backend development across the organization
Analyze system performance and data patterns to identify optimization opportunities and inform future platform investments
Requirements:
7+ years of professional software engineering experience, with strong expertise in backend development
Deep proficiency in C# and .NET Core, including modern framework features and performance optimization
Production experience with Kubernetes and container orchestration in cloud environments (Azure/AWS/GCP preferred)
Proven track record designing and implementing microservices architectures and event-driven systems at scale
Experience developing shared libraries, frameworks, or platform components consumed by multiple teams
Demonstrated ability building enterprise SaaS applications serving high-volume, multi-tenant environments
Strong collaborative mindset with excellent communication skills and experience working across teams to drive consensus
Bachelor's degree in Computer Science or equivalent practical experience
Technical leadership experience, including mentoring engineers and leading technical initiatives
Fluency in English (written and verbal)
Preferred Qualifications
Experience in the Enterprise Identity and Access Management (IAM) domain
Hands-on expertise with Kafka, RabbitMQ, or similar message brokers
Knowledge of Elasticsearch or other distributed search/analytics platforms
Open-source contributions or experience maintaining shared component libraries
Experience with observability tools (Prometheus, Grafana, distributed tracing)
Background in API design and governance for platform services
Familiarity with CI/CD pipelines and infrastructure-as-code practices
This position is open to all candidates.
 
8636248