Site Reliability Engineer

2 days ago

Confidential company

Location: Tel Aviv-Yafo

Job Type: Full Time

Realize your potential by joining the leading performance-driven advertising company!
As a Site Reliability Engineer on the IT Production team in our TLV office, you'll play a vital role in building robust services and solving infrastructure challenges with automation, while working with cutting-edge technologies and pushing them to their limits on our mostly on-prem, cloud-like infrastructure.
How you'll make an impact:
As a Site Reliability Engineer, you'll bring value by:
Ensure Reliability & Scalability: Design, implement and manage highly reliable and scalable distributed systems across our on-premise, cloud and AI/ML environments. Proactively optimize performance, efficiency, resource utilization and cloud cost.
Drive Automation: Automate repetitive tasks, infrastructure provisioning, configuration and deployments using IaC and scripting languages (e.g., Python, Go, Rust).
Develop Observability & Capacity: Implement comprehensive monitoring and alerting systems to ensure system health. Collaborate on capacity planning to meet future growth.
Maintain Security & Compliance: Integrate security best practices and ensure compliance with industry standards.
Lead Incident Management: Participate in on-call rotations, lead incident responses and conduct root cause analysis to minimize downtime.
Foster Collaboration & Improvement: Work closely with development, operations and security teams to drive shared responsibility and continuous improvement in SRE practices.

Requirements:
7 years of experience as an SRE, DevOps Engineer, or System Administrator in a large distributed environment, with a focus on Linux operating systems.
Experience supporting, troubleshooting and scaling large distributed systems in production.
Deep understanding of HTTP protocol, including HTTP/1.1, HTTP/2, caching semantics, TLS and gRPC delivery.
Experience configuring and operating CDN services (e.g., Akamai, Fastly, Cloudflare, AWS CloudFront).
Deep understanding of Linux system internals and system performance tuning.
Experience with Configuration Management Tools (Puppet, Ansible, Chef, Terraform).
Experience programming in at least one of the following languages (Python, Golang, Rust, Ruby, C++, Java).
Experience with monitoring and metrics collection systems (Prometheus, Grafana, ELK).
Experience with cloud providers and platforms (AWS, Azure, GCP, Alibaba).
Experience with containerization technologies (Kubernetes, Docker).
Deep understanding of networking principles (TCP/IP, DNS, load balancing).

This position is open to all candidates.

8335946

Similar jobs that may interest you

28/08/2025

Senior DevOps Engineer

Confidential company

Location: Tel Aviv-Yafo and Yokne'am

Job Type: Full Time

We are at the forefront of the AI revolution, delivering cutting-edge accelerated compute platforms for global impact. Our Network Insights group is seeking a talented and motivated Sr. DevOps Engineer to architect, scale, and optimize the DevOps infrastructure supporting our advanced networking simulation services. In this high-impact role, you will lay the foundations to scale a key insight product to reach 10-100 times more users, design robust CI/CD pipelines, drive automation, and ensure the reliability, scalability, and security of our cloud-based and on-prem platforms. If you are passionate about solving complex infrastructure challenges and enabling world-class software delivery, we want to hear from you.
What You'll Be Doing:
Architect and optimize CI/CD pipelines for large-scale, high-availability simulation services, ensuring fast, reliable, and secure deployments.
Drive automation across infrastructure provisioning, configuration management, and monitoring to support rapid development cycles and minimize manual intervention.
Collaborate with software engineering and product teams to design and implement scalable, cloud-native solutions that meet evolving business needs.
Promote standard processes in infrastructure as code, containerization, and cloud security, ensuring compliance and resilience across environments.
Monitor, troubleshoot, and resolve infrastructure and deployment issues, maximizing uptime and ensuring efficient performance for internal and external customers.
Evaluate and integrate new tools and technologies to continually enhance the reliability, observability, and efficiency of our DevOps ecosystem.
Participate in incident response and post-mortem processes, driving root cause analysis and systemic improvements.

Requirements:
BSc or above in Computer Science, Computer Engineering, or a related field, or equivalent experience.
5+ years of overall hands-on experience in DevOps or Site Reliability Engineering roles.
Proven expertise in designing, building, and maintaining CI/CD pipelines (e.g., Jenkins, GitLab CI, GitHub Actions, or similar).
Deep knowledge of cloud platforms (AWS, preferably), On-Prem deployment, container orchestration (Kubernetes, Docker), and infrastructure as code.
Strong scripting and automation skills (Python, Bash, or similar).
Experience with monitoring, logging, and observability tools (Prometheus, Grafana, ELK, etc.).
Proven understanding of security standard methodologies in cloud & on-prem DevOps environments.
Excellent communication and interpersonal skills, with a track record of multi-functional collaboration.
Experience supporting large-scale, high-availability production systems.
Ways to Stand Out From the Crowd:
Prior background in networking or simulation environments.
Prior experience building a new team from the ground up.
Familiarity with performance tuning and cost optimization in cloud and on-prem environments.
Experience with building CI/CD pipelines from the ground up.

This position is open to all candidates.

8322880

05/08/2025

Sr Staff Site Reliability Engineer

Confidential company

Location: Tel Aviv-Yafo

Job Type: Full Time

The ideal candidate enjoys working in a fast-paced environment with highly innovative technologies.
Your Impact
Provision, configure, and support resilient hybrid cloud deployment architectures using the automation framework
Collaborate with development teams to ensure applications are production-ready, scalable, and reliable from the outset
Manage CI/CD platform, Linux infrastructure, and collaborate with other SREs to deploy and maintain the automation framework, perform capacity planning, and create and review operational runbooks.
Set up critical infrastructure and develop tools and frameworks to automate operational tasks, including the deployment of machines, services, and applications
Participate in Incident Command on-call rotation supporting critical applications and services.
Conduct root cause analysis of critical business and production issues and drive future preventive measures
Manage scalability, capacity planning, redundancy, and resiliency
Maintain service availability and performance SLAs based on business and product requirements.
Contribute to documentation related to design, deployment, validation, and operations
Design proactive service monitoring, alerting, and trend analysis of underlying infrastructure, and support the operations team in implementation
Establish end-to-end monitoring and alerting on all critical components of the application.

Requirements:
6+ years of system engineering experience on mission-critical, enterprise-level systems
6+ years of experience using Infrastructure-As-Code to build large-scale environments, mainly on Linux platform (Ubuntu, SUSE, CentOS).
3+ years of experience working with cloud environments, primarily Google Cloud Platform
Demonstrated Linux/Systems experience in a hybrid (cloud, on-prem) environment
Strong experience with CI/CD pipeline, GitHub, Jenkins, Artifactory
Must have a strong foundation in Linux operating systems, Troubleshooting, Design, and Implementation
Expertise in configuration management with a framework such as Terraform, Ansible, and Helm.
Experience using Infrastructure-As-Code to build large-scale environments
Experience with Linux vulnerability management process and patching
Must have programming knowledge in Python/Bash/Perl/Go languages to automate infrastructure workflow
Understanding of software development methodologies and practices, including agile development, continuous integration, and continuous delivery
Understanding of Network Firewalls, load balancers, and complex network designs
Experience in monitoring technologies like Datadog, Nagios, Graphite, Cacti, and Grafana.
Understanding Kubernetes, container lifecycle, and troubleshooting
Hands-on knowledge of high-availability approaches such as load balancing, failover, clustering, and disaster recovery
Excellent problem-solving, critical thinking, communication, and teamwork skills
Passion, drive, energy, a sense of humor, and a great attitude.

This position is open to all candidates.

8290765

31/08/2025

DevOps Engineer

Confidential company

Location: Tel Aviv-Yafo

Job Type: Full Time

We are looking for a DevOps Engineer to take ownership of our Cloud Infrastructure and Platform Engineering strategy, enabling high-scale, cutting-edge GenAI products running across 40+ Kubernetes clusters on GCP and AWS.
This is a hands-on engineering role, requiring deep expertise in cloud-native technologies, Kubernetes at scale, and modern DevOps principles. You will work closely with engineering teams to design and implement scalable infrastructure solutions, optimize developer workflows, and ensure reliability and efficiency across our platform.
Role and Responsibilities:
Cloud & Kubernetes Expertise: Design and implement highly scalable multi-cluster Kubernetes environments across GCP & AWS.
Developer Experience & Enablement: Lead the development of self-service tools and automation that improve efficiency for R&D teams.
Incident & Reliability Engineering: Work with engineering teams to optimize cost, performance, and reliability of production infrastructure through monitoring, capacity planning, and scaling strategies.
Security & Governance: Contribute to best practices for RBAC, IAM, cloud security, and compliance while ensuring infrastructure reliability.
Automation & Infrastructure as Code: Drive adoption of GitOps workflows and Infrastructure as Code (Terraform, Helm, Crossplane) to enhance automation and consistency.
Mentorship & Team Growth: Provide technical mentorship within the platform engineering team and contribute to knowledge-sharing across R&D.
Cross-Team Collaboration: Work closely with engineering teams to align cloud infrastructure goals with business needs and reliability requirements.

Requirements:
5+ years of DevOps or SRE experience
3+ years working with public cloud platforms (AWS, GCP) at scale
Deep Kubernetes expertise, including managing large-scale, multi-cluster enterprise-grade Kubernetes environments
Experience designing and managing Custom Resource Definitions (CRDs) and custom controllers
Strong background in Infrastructure as Code (Terraform, Helm) and GitOps principles (ArgoCD, Crossplane, FluxCD, etc.)
Hands-on experience in observability & monitoring (Prometheus, Grafana, Datadog, OpenTelemetry, etc.)
Proficiency in scripting & automation (Python, Go, Bash) for infrastructure automation
Expertise in cloud networking (VPC, load balancers, service meshes) and security best practices (RBAC, IAM, security groups, network policies, etc.)
Experience with CI/CD pipelines, optimizing for performance, security, and developer velocity
Nice-to-Have:
Experience with self-hosted on-prem deployments and managed private VPC deployments (Bring Your Own Cloud models)
Advanced expertise in Helm and Crossplane for Kubernetes resource management.
Other cloud provider experience
Experience in GenAI or large-scale SaaS platforms
Familiarity with SQL/NoSQL databases and distributed systems
DevSecOps experience, with a strong understanding of security automation and compliance frameworks

This position is open to all candidates.

8326421

05/08/2025

Principal DevOps Engineer (Cortex Cloud)

Confidential company

Location: Tel Aviv-Yafo

Job Type: Full Time

As a Principal DevOps Engineer in our Platform Engineering team, you will lead the design and implementation of cutting-edge CI/CD pipelines and cloud architecture that powers our development environment. You'll drive initiatives to enhance developer productivity through automation, tooling, and infrastructure improvements, working with a modern tech stack including Kubernetes, Python, cloud-native and high-scale technologies.
Your Impact
Architect and implement scalable, resilient CI/CD pipelines and cloud infrastructure that supports our engineering organization's evolving needs
Design and develop internal developer tools and platforms that significantly improve developer experience and productivity
Drive the evolution of our Kubernetes-based deployment infrastructure in Google Cloud Platform, ensuring security, reliability and performance
Optimize and scale our CI/CD infrastructure including Jenkins, GitLab, TeamCity, and artifact management systems
Mentor and guide other engineers on DevOps best practices, infrastructure design, and implementation strategies
Drive adoption of infrastructure-as-code, automated testing, and deployment methodologies
Collaborate with development teams to understand their needs and implement solutions that accelerate their workflow
Establish standards and best practices for infrastructure reliability, observability, and performance.

Requirements:
7+ years of experience in DevOps, Site Reliability Engineering, or Platform Engineering roles
Extensive experience with CI/CD pipeline design and implementation in complex environments
Advanced knowledge of Kubernetes administration, deployment patterns, and ecosystem tools
Strong programming skills in Python with solid understanding of OOP principles and design patterns
Deep understanding of cloud architecture, specifically with Google Cloud Platform services
Proven track record designing and implementing developer tooling and automation
Experience managing containerized applications and services in production environments
Strong system design skills with focus on scalability, reliability, and security
Knowledge of GitOps workflows and infrastructure-as-code using tools like Terraform, Pulumi, or equivalent
Familiarity with GitLab CI administration and pipeline development
Willingness to participate in an on-call rotation during working and non-working hours
Nice-to-Have
Knowledge of observability platforms and practices (Prometheus, Grafana, distributed tracing)
Familiarity with TeamCity administration and pipeline development
Experience implementing security best practices in CI/CD pipelines
Understanding of compliance requirements in software delivery pipelines
Experience with Infrastructure as Code testing frameworks
Knowledge of software architecture patterns and microservices design.

This position is open to all candidates.

8290390

10/08/2025

Staff DevOps Engineer

Confidential company

Location: Tel Aviv-Yafo

Job Type: Full Time

We are looking for a Staff DevOps Engineer.
As a Staff DevOps Engineer, you will not be assigned to a specific R&D group, but will serve as a focal point for the DevOps engineers, helping and supporting them with any issue.
You'll lead cross-DevOps projects, push technical discussions forward, and work with each DevOps engineer as needed to solve diverse, complex problems at high scale.
You'll support multi-region environments and build and maintain tools for automation, deployment, monitoring, and operations.
You'll troubleshoot and resolve issues in our various environments.
You'll play a key role in designing and enforcing infrastructure patterns that support zero-downtime deployments, high resilience, and compliance standards.
You'll collaborate with teams across the company to define and drive forward scalable, production-grade architecture.
You'll perform periodic on-call duties and emergency response.

Requirements:
10+ years of experience in the industry, including 6+ years of hands-on experience in high-scale SaaS companies or zero-downtime/disaster recovery enterprise environments (e.g., banking, cybersecurity, healthcare, or large-scale cloud platform providers).
5+ years of experience in DevOps roles across a minimum of 2 different companies, with strong hands-on experience in Kubernetes and AWS. Experience with hybrid or multi-cloud architectures is a strong plus.
Experience with on-call duties to manage critical infrastructure and application issues outside business hours, ensuring high availability and reliability.
3+ years of experience with CI/CD tools such as GitLab, GitHub Actions, CircleCI, or similar.
2+ years of experience with programming languages such as Python or TypeScript. Strong Linux administration skills, including debugging and Bash scripting.
2+ years of experience with Terraform (experience with Terragrunt is a plus), as well as GitOps systems such as ArgoCD.
2+ years of experience with configuration management tools such as Ansible, Chef, or Puppet, and monitoring and alerting systems such as Datadog, Splunk, New Relic, or Grafana.
Strong understanding of networking concepts, including VPC, service meshes, routing, DNS, TLS, and firewalls.
Production-oriented mindset with a strong sense of ownership over reliability, scalability, and incident response.

This position is open to all candidates.

8296098

10/08/2025

Staff Engineer, Platform (DX)

Confidential company

Location: Tel Aviv-Yafo

Job Type: Full Time

We are looking for a Staff Engineer.
What You'll Do:
Lead architecture and system design for critical components of the Developer Experience platform, ensuring scalability, resilience, and long-term maintainability.
Own end-to-end delivery of complex initiatives, from requirements gathering and design to implementation, rollout, and observability.
Design, implement, and maintain robust microservices supporting high-throughput and low-latency operations.
Define and uphold API design standards, including gateway configuration, versioning strategy, and long-term lifecycle management.
Build and optimize backend systems that enable developer-facing products such as SDKs, APIs, and webhooks.
Work with both relational and NoSQL databases to ensure data consistency, scalability, and performance.
Collaborate with cross-functional teams to design systems that meet operational and business requirements.
Research and implement cloud-native architectures to support growth and scalability.
Contribute to the creation of developer tools and standards that improve the usability of our APIs and SDKs.

Requirements:
10+ years of experience in backend development, with a strong focus on scalable infrastructure.
Proficiency in Node.js and TypeScript; additional experience with other backend languages is a plus.
Strong expertise in relational and NoSQL databases, including schema design, query optimization, and troubleshooting.
Experience designing and managing RESTful APIs, including versioning strategies, API gateway integration, and developer-first design.
Proven experience designing and deploying microservices-based architectures in production environments.
Hands-on experience with cloud providers (AWS, GCP, Azure) and container orchestration tools (e.g., Kubernetes, Docker).
Solid understanding of system design principles, distributed systems, and scalability.
Experience with monitoring and logging frameworks (e.g. Datadog, Prometheus, Grafana, ELK stack).
Deep understanding of REST APIs and event-driven architectures.
Advantage: familiarity with AWS and serverless technologies.
Strong problem-solving skills, with the ability to troubleshoot production issues effectively.
Ability to manage multiple priorities and thrive in a service-oriented, fast-paced environment.
Bonus Points:
Experience designing developer-centric SDKs, tools, or CLI utilities.
Track record of contributing to internal platform teams or DX-focused initiatives.
Knowledge of OpenAPI/Swagger specifications and API documentation best practices.
Passion for elevating developer experience and usability across engineering platforms.
Hands-on experience in designing developer-friendly SDKs and APIs.
Knowledge of CI/CD pipelines and best practices for automated testing and deployment.

This position is open to all candidates.

8296063

31/08/2025

Senior DevOps Engineer

Confidential company

Location: Tel Aviv-Yafo

Job Type: Full Time

This is a great opportunity to be part of one of the fastest-growing infrastructure companies in history, an organization that is in the center of the hurricane being created by the revolution in artificial intelligence.
"Our company's data management vision is the future of the market." - Forbes
We are the data platform company for the AI era. We are building the enterprise software infrastructure to capture, catalog, refine, enrich, and protect massive datasets and make them available for real-time data analysis and AI training and inference. Designed from the ground up to make AI simple to deploy and manage, our company takes the cost and complexity out of deploying enterprise and AI infrastructure across data center, edge, and cloud.
Our success has been built through intense innovation, a customer-first mentality and a team of fearless people who leverage their skills and experience to make a real market impact. This is an opportunity to be a key contributor at a pivotal time in our company's growth and at a pivotal point in computing history.
The DevOps Engineer position is an operational engineering role and an integral part of our development team. You will be responsible for improving the efficiency of our processes, software, and infrastructure, and will assist the R&D team with product development. If you are a DevOps Engineer who is passionate about automating and scaling everything, this job is for you.
Responsibilities
Monitor and optimize cloud infrastructure for performance, scalability, and cost-efficiency.
Manage and maintain CI infrastructure (GitLab CI and Jenkins).
Manage, maintain and improve our release and development environments.
Support critical production infrastructure deployed in multiple clouds (AWS, Azure, and GCP).
Develop and support the R&D toolchain and implement best practices for code deployment, testing, and maintenance.
Automate on-premises lab infrastructure by adopting IaC practices.
Lead and develop monitoring, telemetry, alerting, and logging services for production.

Requirements:
Desired Qualifications:
Proven hands-on experience with Docker and Kubernetes in production. Hands-on experience deploying and managing complex Kubernetes environments, including services, ingresses, load balancers, and Helm charts
Solid understanding of Linux/Unix Internals and experience with handling complex performance and configuration problems in Linux/Unix environment.
Multi-Cloud Expertise: Deep familiarity with both GCP and AWS for provisioning, networking, and cost-optimization strategies
Experience in DSL Configuration tools like Ansible, Chef, or Puppet.
Experienced with programming languages (Python is preferred).
Shell scripting experience.
Proficient in SRE/monitoring methodologies (monitoring stacks, with an emphasis on Prometheus).
Nice To Have Skills
Experienced with CI/CD tools and frameworks.
Experience with managing binary repositories (RPM, PyPI, npm, etc.).
Experience with developing Ansible collections, roles, and modules.
Experience with managing GitLab and GitLab CI.
Experience with HashiCorp products: Terraform, Packer, Consul, Vault, and Vagrant.
Experience with automating configuration and deployment of On-Premises Lab Hardware.

This position is open to all candidates.

8325791

2 days ago

Staff MLOps Engineer

Confidential company

Location: Tel Aviv-Yafo

Job Type: Full Time

Realize your potential by joining the leading performance-driven advertising company!
As a Staff MLOps Engineer in the Infra group, you'll play a vital role in developing, enhancing and maintaining highly scalable Machine-Learning infrastructures and tools.
About Algo platform:
The objective of the algo platform group is to own the existing algo platform (including health, stability, productivity and enablement), to facilitate and be involved in new platform experimentation within the algo craft and lead the platformization of the parts which should graduate into production scale. This includes support of ongoing ML projects while ensuring smooth operations and infrastructure reliability, owning a full set of capabilities, design and planning, implementation and production care.
The group has deep ties with both the algo craft as well as the infra group. The group reports to the infra department and has a dotted line reporting to the algo craft leadership.
The group serves as the professional authority when it comes to ML engineering and ML ops, serves as a focal point in a multidisciplinary team of algorithm researchers, product managers, and engineers and works with the most senior talent within the algo craft in order to achieve ML excellence.
How you'll make an impact:
As a Staff MLOps Engineer, you'll bring value by:
Develop, enhance and maintain highly scalable Machine-Learning infrastructures and tools, including CI/CD, monitoring and alerting and more
Have end to end ownership: Design, develop, deploy, measure and maintain our machine learning platform, ensuring high availability, high scalability and efficient resource utilization
Identify and evaluate new technologies to improve performance, maintainability, and reliability of our machine learning systems
Work in tandem with the engineering-focused and algorithm-focused teams in order to improve our platform and optimize performance
Optimize machine learning systems to scale and utilize modern compute environments (e.g. distributed clusters, CPU and GPU) and continuously seek potential optimization opportunities.
Build and maintain tools for automation, deployment, monitoring, and operations.
Troubleshoot issues in our development, production and test environments
Directly influence the way billions of people discover the internet.

Requirements:
Experience developing large-scale systems. Experience with filesystems, server architectures, distributed systems, SQL and NoSQL. Experience with Spark and Airflow or other orchestration platforms is a big plus.
Highly skilled in software engineering methods, with 5+ years of experience.
Passion for ML engineering and for creating and improving platforms
Experience with designing and supporting ML pipelines and models in production environment
Excellent coding skills in Java & Python
Experience with TensorFlow is a big plus
Possess strong problem solving and critical thinking skills
BSc in Computer Science or related field.
Proven ability to work effectively and independently across multiple teams and beyond organizational boundaries
Deep understanding of Computer Science fundamentals: object-oriented design, data structures, systems, application programming and multithreaded programming
Strong communication skills to be able to present insights and ideas, and excellent English, required to communicate with our global teams.
Bonus points if you have:
Experience in leading Algorithms projects or teams.
Experience in developing models using deep learning techniques and tools
Experience in developing software within a distributed computation framework.

This position is open to all candidates.

8335911

20/08/2025

Senior DevOps Engineer

Confidential company

Location: Tel Aviv-Yafo

Job Type: Full Time

We're looking for a Senior DevOps Engineer to join our newly formed Foundations Team, a small, high-impact group responsible for the infrastructure, tools, and shared services that power our entire R&D organization.
In this role, you'll design, build, and evolve internal platform infrastructure, CI/CD systems, and developer enablement tooling. Your mission is to empower developers across the company to work autonomously by creating self-service tools, automation, and clear standards that reduce friction and increase reliability.
You'll collaborate closely with engineers across disciplines and partner with the Foundations Team Lead to shape DevOps practices that scale. This is a hands-on role for someone who thrives in high-velocity, mission-critical environments and is passionate about building tools that make developers faster, more productive, and confident in running their own services.
What You'll Do
Design and maintain scalable, developer-friendly CI/CD pipelines and deployment workflows.
Build self-service tooling and automation that enables teams to manage deployments, environments, secrets, and observability independently
Be responsible for cloud infrastructure and operations foundations
Implement and promote best practices for monitoring, logging, and alerting across services.
Operate and optimize Kubernetes-based production environments, ensuring performance, security, and stability.
Manage infrastructure using Infrastructure as Code (IaC) and ensure repeatability and traceability through tools like Terraform.
Collaborate with R&D teams to support onboarding to internal tooling and promote a culture of enablement over dependency.
Monitor cloud cost, ensuring our cloud operates efficiently.

Requirements:
4+ years of hands-on experience in DevOps or infrastructure engineering, ideally in high-velocity, mission-critical production environments.
Deep expertise in Kubernetes and containerized infrastructure, with experience deploying and managing workloads at scale.
Strong understanding of cloud infrastructure and operations, including networking, storage, compute, and security; GCP experience preferred.
Proficiency with Infrastructure as Code tools, especially Terraform, with a focus on automation and operational excellence.
Experience developing and managing CI/CD processes and tools, with a passion for improving developer workflows and release quality.
Strong debugging and problem-solving skills, with the ability to troubleshoot complex systems across the stack.
Highly self-motivated and organized, able to work independently in a fast-paced, collaborative environment.

This position is open to all candidates.

8311657

21/08/2025

Senior DevOps Engineer

Confidential company

Location: Tel Aviv-Yafo

Job Type: Full Time and English Speakers

We are growing and are looking for a Senior DevOps Engineer who values personal and career growth, teamwork, and winning!
What your day will look like:
This role is critical in supporting our company's short- and long-term SaaS product expansion plans. You'll join an expanding group where your contributions can make a strong impact.
You will have the chance to design and build our customer and internal production systems using up-to-date technologies and tools.
As a DevOps Engineer, you will play a key role in all our mission-critical services, working with multiple teams and groups and gaining a wide view of how everything comes together: from the request and the design through the build, deployment, and full lifecycle of what we build and manage in our cloud.
You will collaborate with our R&D, Security, and Corporate IT teams to deliver safe, scalable, and high-performing solutions. Members of the SaaS DevOps team act as trusted technical architects and play a key role in determining how the company builds and delivers its cybersecurity asset management services.
Responsibilities:
Evaluate, design, and implement new cloud infrastructure technologies and architecture to support our ever-growing SaaS solution
Build a SaaS product to serve thousands of global customers in a modern and scalable way
Design, deploy, and operate cloud infrastructure and services for various internal (corporate) applications on behalf of other teams

Requirements:
5+ years of experience managing production environments
2+ years of experience managing cloud-based environments
Experience with AWS
At least 4 years of experience with Linux-based OS administration
Extensive experience with modern DevOps tooling, including configuration management and Infrastructure as Code
Professional software development experience with any language (e.g., Python, Ruby, Go, JavaScript)
Proven experience and understanding of architecture principles across infrastructure platforms, security, data, integration, and application layers
Experience deploying and operating services based on Linux containers and virtualization (Docker, etc.)
Monitoring and operational metrics gathering (e.g., CloudWatch, Prometheus, Grafana, Datadog)
Building and managing infrastructure that requires high availability and high security standards
Strong written and verbal communication skills in English and Hebrew

This position is open to all candidates.

8313495
