Site Reliability Engineer

Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
Realize your potential by joining the leading performance-driven advertising company!
As a Site Reliability Engineer on the IT Production team in our TLV Office, you'll play a vital role in building robust services and solving infrastructure challenges with automation, working with cutting-edge technologies and pushing them to their limits on our mostly on-prem, cloud-like infrastructure.
How you'll make an impact:
As a Site Reliability Engineer, you'll bring value by:
Ensure Reliability & Scalability: Design, implement and manage highly reliable and scalable distributed systems across our on-premise, cloud and AI/ML environments. Proactively optimize performance, efficiency, resource utilization and cloud cost.
Drive Automation: Automate repetitive tasks, infrastructure provisioning, configuration and deployments using IaC and scripting languages (e.g., Python, Go, Rust).
Develop Observability & Capacity: Implement comprehensive monitoring and alerting systems to ensure system health. Collaborate on capacity planning to meet future growth.
Maintain Security & Compliance: Integrate security best practices and ensure compliance with industry standards.
Lead Incident Management: Participate in on-call rotations, lead incident responses and conduct root cause analysis to minimize downtime.
Foster Collaboration & Improvement: Work closely with development, operations and security teams to drive shared responsibility and continuous improvement in SRE practices.
Requirements:
7 years of experience as an SRE, DevOps Engineer, or System Administrator in a large distributed environment, with a focus on Linux operating systems.
Experience supporting, troubleshooting and scaling large distributed systems in production.
Deep understanding of the HTTP protocol, including HTTP/1.1, HTTP/2, caching semantics, TLS and gRPC delivery.
Experience configuring and operating CDN services (e.g., Akamai, Fastly, Cloudflare, AWS CloudFront).
Deep understanding of Linux system internals and system performance tuning.
Experience with Configuration Management Tools (Puppet, Ansible, Chef, Terraform).
Experience programming in at least one of the following languages (Python, Golang, Rust, Ruby, C++, Java).
Experience with monitoring and metrics collection systems (Prometheus, Grafana, ELK).
Experience with cloud providers and platforms (AWS, Azure, GCP, Alibaba).
Experience with containerization technologies (Kubernetes, Docker).
Deep understanding of networking principles (TCP/IP, DNS, load balancing).
This position is open to all candidates.
 
8335946
Similar jobs that may interest you
3 days ago
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time and Hybrid work
We are looking for a Site Reliability Engineer (SRE) to join our Engineering team. Someone who has a passion for observability, monitoring, automation, and high-availability systems, and who has a desire to solve complex technological challenges with a proactive approach to continuous improvement.
We use an interesting and mixed technology stack: Kubernetes, Terraform, CI/CD pipelines, Datadog, Prometheus, and cloud-native architectures.
In this position, you will use your expertise in building and scaling SRE operations, and will design, implement, and operate a world-class reliability strategy.
About Us
We are a key player in the network security field, striving to provide the leading SASE platform in the market. Our innovative approach, merging cloud and on-device protection, redefines how businesses connect in the era of cloud and remote work.
Key Responsibilities
Develop and maintain our monitoring, alerting, and logging systems, ensuring high visibility into production environments.
Implement automation to improve system reliability, scalability, and efficiency.
Troubleshoot and resolve production incidents, leading root cause analyses and implementing permanent fixes.
Collaborate with software engineers and DevOps teams to enhance application performance and resilience.
Continuously improve operational processes, focusing on reducing toil and improving reliability.
Requirements:
3+ years of experience as an SRE, DevOps Engineer, or in a similar role.
Hands-on experience with monitoring and observability tools like Datadog, Prometheus, and Grafana.
Strong understanding of Linux systems, networking, and cloud-native architectures.
Experience with Kubernetes, Terraform, and CI/CD pipelines.
A problem solver, capable of finding creative solutions and getting things done.
Fluent in incident management, RCA processes, and operational best practices.
It would be great if you also have:
Experience in high-scale distributed systems.
Background in security and compliance for cloud infrastructure.
Familiarity with AWS (EKS, EC2, RDS, S3, networking configurations).
Proficiency in Python, Go, or Bash for automation and scripting.
Understanding of cost optimization and resource management in cloud environments.
Familiarity with machine learning or predictive analytics for proactive reliability management.
This position is open to all candidates.
 
8341627
28/08/2025
Confidential company
Location: Tel Aviv-Yafo and Yokne'am
Job Type: Full Time
We are at the forefront of the AI revolution, delivering cutting-edge accelerated compute platforms for global impact. Our Network Insights group is seeking a talented and motivated Sr. DevOps Engineer to architect, scale, and optimize the DevOps infrastructure supporting our advanced networking simulation services. In this high-impact role, you will lay the foundations to scale a key insight product to reach 10 to 100 times more users, design robust CI/CD pipelines, drive automation, and ensure the reliability, scalability, and security of our cloud-based and on-prem platforms. If you are passionate about solving complex infrastructure challenges and enabling world-class software delivery, we want to hear from you.
What You'll Be Doing:
Architect and optimize CI/CD pipelines for large-scale, high-availability simulation services, ensuring fast, reliable, and secure deployments.
Drive automation across infrastructure provisioning, configuration management, and monitoring to support rapid development cycles and minimize manual intervention.
Collaborate with software engineering and product teams to design and implement scalable, cloud-native solutions that meet evolving business needs.
Promote standard processes in infrastructure as code, containerization, and cloud security, ensuring compliance and resilience across environments.
Monitor, troubleshoot, and resolve infrastructure and deployment issues, maximizing uptime and ensuring efficient performance for internal and external customers.
Evaluate and integrate new tools and technologies to continually enhance the reliability, observability, and efficiency of our DevOps ecosystem.
Participate in incident response and post-mortem processes, driving root cause analysis and systemic improvements.
Requirements:
BSc or above in Computer Science, Computer Engineering, or a related field, or equivalent experience.
5+ overall years of hands-on experience in DevOps or Site Reliability Engineering roles.
Proven expertise in designing, building, and maintaining CI/CD pipelines (e.g., Jenkins, GitLab CI, GitHub Actions, or similar).
Deep knowledge of cloud platforms (AWS, preferably), On-Prem deployment, container orchestration (Kubernetes, Docker), and infrastructure as code.
Strong scripting and automation skills (Python, Bash, or similar).
Experience with monitoring, logging, and observability tools (Prometheus, Grafana, ELK, etc.).
Proven understanding of security standard methodologies in cloud & on-prem DevOps environments.
Excellent communication and interpersonal skills, with a track record of multi-functional collaboration.
Experience supporting large-scale, high-availability production systems.
Ways to Stand Out From the Crowd:
Prior background in networking or simulation environments.
Prior experience building a new team from the ground up.
Familiarity with performance tuning and cost optimization in cloud and on-prem environments.
Experience with building CI/CD pipelines from the ground up.
This position is open to all candidates.
 
8322880
05/08/2025
Location: Tel Aviv-Yafo
Job Type: Full Time
As a Principal DevOps Engineer in our Platform Engineering team, you will lead the design and implementation of cutting-edge CI/CD pipelines and cloud architecture that powers our development environment. You'll drive initiatives to enhance developer productivity through automation, tooling, and infrastructure improvements, working with a modern tech stack including Kubernetes, Python, cloud-native and high-scale technologies.
Your Impact
Architect and implement scalable, resilient CI/CD pipelines and cloud infrastructure that supports our engineering organization's evolving needs
Design and develop internal developer tools and platforms that significantly improve developer experience and productivity
Drive the evolution of our Kubernetes-based deployment infrastructure in Google Cloud Platform, ensuring security, reliability and performance
Optimize and scale our CI/CD infrastructure including Jenkins, GitLab, TeamCity, and artifact management systems
Mentor and guide other engineers on DevOps best practices, infrastructure design, and implementation strategies
Drive adoption of infrastructure-as-code, automated testing, and deployment methodologies
Collaborate with development teams to understand their needs and implement solutions that accelerate their workflow
Establish standards and best practices for infrastructure reliability, observability, and performance.
Requirements:
7+ years of experience in DevOps, Site Reliability Engineering, or Platform Engineering roles
Extensive experience with CI/CD pipeline design and implementation in complex environments
Advanced knowledge of Kubernetes administration, deployment patterns, and ecosystem tools
Strong programming skills in Python with solid understanding of OOP principles and design patterns
Deep understanding of cloud architecture, specifically with Google Cloud Platform services
Proven track record designing and implementing developer tooling and automation
Experience managing containerized applications and services in production environments
Strong system design skills with focus on scalability, reliability, and security
Knowledge of GitOps workflows and infrastructure-as-code using tools like Terraform, Pulumi, or equivalent
Familiarity with GitLab CI administration and pipeline development
Participate in an on-call rotation during working and non-working hours
Nice-to-Have
Knowledge of observability platforms and practices (Prometheus, Grafana, distributed tracing)
Familiarity with TeamCity administration and pipeline development
Experience implementing security best practices in CI/CD pipelines
Understanding of compliance requirements in software delivery pipelines
Experience with Infrastructure as Code testing frameworks
Knowledge of software architecture patterns and microservices design.
This position is open to all candidates.
 
8290390
31/08/2025
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for a DevOps Engineer to take ownership of our Cloud Infrastructure and Platform Engineering strategy, enabling high-scale, cutting-edge GenAI products running across 40+ Kubernetes clusters on GCP and AWS.
This is a hands-on engineering role, requiring deep expertise in cloud-native technologies, Kubernetes at scale, and modern DevOps principles. You will work closely with engineering teams to design and implement scalable infrastructure solutions, optimize developer workflows, and ensure reliability and efficiency across our platform.
Role and Responsibilities:
Cloud & Kubernetes Expertise: Design and implement highly scalable multi-cluster Kubernetes environments across GCP & AWS.
Developer Experience & Enablement: Lead the development of self-service tools and automation that improve efficiency for R&D teams.
Incident & Reliability Engineering: Work with engineering teams to optimize cost, performance, and reliability of production infrastructure through monitoring, capacity planning, and scaling strategies.
Security & Governance: Contribute to best practices for RBAC, IAM, cloud security, and compliance while ensuring infrastructure reliability.
Automation & Infrastructure as Code: Drive adoption of GitOps workflows and Infrastructure as Code (Terraform, Helm, Crossplane) to enhance automation and consistency.
Mentorship & Team Growth: Provide technical mentorship within the platform engineering team and contribute to knowledge-sharing across R&D.
Cross-Team Collaboration: Work closely with engineering teams to align cloud infrastructure goals with business needs and reliability requirements.
Requirements:
5+ years of DevOps or SRE experience
3+ years working with public cloud platforms (AWS, GCP) at scale
Deep Kubernetes expertise, including managing large-scale, multi-cluster enterprise-grade Kubernetes environments
Experience designing and managing Custom Resource Definitions (CRDs) and custom controllers
Strong background in Infrastructure as Code (Terraform, Helm) and GitOps principles (ArgoCD, Crossplane, FluxCD, etc.)
Hands-on experience in observability & monitoring (Prometheus, Grafana, Datadog, OpenTelemetry, etc.)
Proficiency in scripting & automation (Python, Go, Bash) for infrastructure automation
Expertise in cloud networking (VPC, load balancers, service meshes) and security best practices (RBAC, IAM, security groups, network policies, etc.)
Experience with CI/CD pipelines, optimizing for performance, security, and developer velocity
Nice-to-Have:
Experience with self-hosted on-prem deployments and managed private VPC deployments (Bring Your Own Cloud models)
Advanced expertise in Helm and Crossplane for Kubernetes resource management.
Other cloud provider experience
Experience in GenAI or large-scale SaaS platforms
Familiarity with SQL/NoSQL databases and distributed systems
DevSecOps experience, with a strong understanding of security automation and compliance frameworks
This position is open to all candidates.
 
8326421
05/08/2025
Location: Tel Aviv-Yafo
Job Type: Full Time
The ideal candidate enjoys working in a fast-paced environment with highly innovative technologies.
Your Impact
Provision, configure, and support resilient hybrid cloud deployment architectures using the automation framework
Collaborate with development teams to ensure applications are production-ready, scalable, and reliable from the outset
Manage CI/CD platform, Linux infrastructure, and collaborate with other SREs to deploy and maintain the automation framework, perform capacity planning, and create and review operational runbooks.
Set up critical infrastructure and develop tools and frameworks to automate operational tasks, including the deployment of machines, services, and applications
Participate in Incident Command on-call rotation supporting critical applications and services.
Conduct root cause analysis of critical business and production issues and drive future preventive measures
Manage scalability, capacity planning, redundancy, and resiliency
Maintain service availability and performance SLAs based on business and product requirements.
Contribute to documentation related to design, deployment, validation, and operations
Design proactive service monitoring, alerting, and trend analysis of underlying infrastructure, and support the operations team in implementation
Establish end-to-end monitoring and alerting on all critical components of the application.
Requirements:
6+ years of system engineering experience on mission-critical, enterprise-level systems
6+ years of experience using Infrastructure-as-Code to build large-scale environments, mainly on Linux platforms (Ubuntu, SUSE, CentOS).
3+ years of experience working with cloud environments, primarily Google Cloud Platform
Demonstrated Linux/Systems experience in a hybrid (cloud, on-prem) environment
Strong experience with CI/CD pipelines, GitHub, Jenkins, and Artifactory
Must have a strong foundation in Linux operating systems, troubleshooting, design, and implementation
Expertise in configuration management with a framework such as Terraform, Ansible, and Helm.
Experience using Infrastructure-As-Code to build large-scale environments
Experience with Linux vulnerability management process and patching
Must have programming knowledge in Python/Bash/Perl/Go languages to automate infrastructure workflow
Understanding of software development methodologies and practices, including agile development, continuous integration, and continuous delivery
Understanding of Network Firewalls, load balancers, and complex network designs
Experience in monitoring technologies like Datadog, Nagios, Graphite, Cacti, and Grafana.
Understanding Kubernetes, container lifecycle, and troubleshooting
Hands-on knowledge of high-availability approaches such as load balancing, failover, clustering, and disaster recovery
Excellent problem-solving, critical thinking, communication, and teamwork skills
Passion, drive, energy, a sense of humor, and a great attitude.
This position is open to all candidates.
 
8290765
10/08/2025
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for a Staff DevOps Engineer.
As a Staff DevOps Engineer, you will not be assigned to a specific R&D group, but will serve as a focal point for the DevOps engineers, helping and supporting them with any issue.
You'll lead cross-DevOps projects, push technical discussions forward, and work with each DevOps engineer as needed to solve diverse, complex, high-scale problems.
You'll support multi-region environments and build and maintain tools for automation, deployment, monitoring, and operations.
You'll troubleshoot and resolve issues in our various environments.
You'll play a key role in designing and enforcing infrastructure patterns that support zero-downtime deployments, high resilience, and compliance standards.
You'll collaborate with teams across the company to define and drive forward scalable, production-grade architecture.
You'll perform periodic on-call duties and emergency response.
Requirements:
10+ years of experience in the industry, including 6+ years of hands-on experience in high-scale SaaS companies or zero-downtime/disaster recovery enterprise environments (e.g., banking, cybersecurity, healthcare, or large-scale cloud platform providers).
5+ years of experience in DevOps roles across a minimum of 2 different companies, with strong hands-on experience in Kubernetes and AWS. Experience with hybrid or multi-cloud architectures is a strong plus.
Experience with on-call duties to manage critical infrastructure and application issues outside business hours, ensuring high availability and reliability.
3+ years of experience with CI/CD tools such as GitLab, GitHub Actions, CircleCI, or similar.
2+ years of experience with programming languages such as Python or TypeScript. Strong Linux administration skills, including debugging and Bash scripting.
2+ years of experience with Terraform (experience with Terragrunt is a plus), as well as GitOps systems such as ArgoCD.
2+ years of experience with configuration management tools such as Ansible, Chef, or Puppet, and monitoring and alerting systems such as Datadog, Splunk, New Relic, or Grafana.
Strong understanding of networking concepts, including VPC, service meshes, routing, DNS, TLS, and firewalls.
Production-oriented mindset with a strong sense of ownership over reliability, scalability, and incident response.
This position is open to all candidates.
 
8296098
10/08/2025
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for a Staff Engineer.
What You'll Do:
Lead architecture and system design for critical components of the Developer Experience platform, ensuring scalability, resilience, and long-term maintainability.
Own end-to-end delivery of complex initiatives, from requirements gathering and design to implementation, rollout, and observability.
Design, implement, and maintain robust microservices supporting high-throughput and low-latency operations.
Define and uphold API design standards, including gateway configuration, versioning strategy, and long-term lifecycle management.
Build and optimize backend systems that enable developer-facing products such as SDKs, APIs, and webhooks.
Work with both relational and NoSQL databases to ensure data consistency, scalability, and performance.
Collaborate with cross-functional teams to design systems that meet operational and business requirements.
Research and implement cloud-native architectures to support growth and scalability.
Contribute to the creation of developer tools and standards that improve the usability of our APIs and SDKs.
Requirements:
10+ years of experience in backend development, with a strong focus on scalable infrastructure.
Proficiency in Node.js and TypeScript; additional experience with other backend languages is a plus.
Strong expertise in relational and NoSQL databases, including schema design, query optimization, and troubleshooting.
Experience designing and managing RESTful APIs, including versioning strategies, API gateway integration, and developer-first design.
Proven experience designing and deploying microservices-based architectures in production environments.
Hands-on experience with cloud providers (AWS, GCP, Azure) and container orchestration tools (e.g., Kubernetes, Docker).
Solid understanding of system design principles, distributed systems, and scalability.
Experience with monitoring and logging frameworks (e.g. Datadog, Prometheus, Grafana, ELK stack).
Deep understanding of REST APIs and event-driven architectures.
Advantage: familiarity with AWS and serverless.
Strong problem-solving skills, with the ability to troubleshoot production issues effectively.
Ability to manage multiple priorities and thrive in a service-oriented, fast-paced environment.
Bonus Points:
Experience designing developer-centric SDKs, tools, or CLI utilities.
Track record of contributing to internal platform teams or DX-focused initiatives.
Knowledge of OpenAPI/Swagger specifications and API documentation best practices.
Passion for elevating developer experience and usability across engineering platforms.
Hands-on experience in designing developer-friendly SDKs and APIs.
Knowledge of CI/CD pipelines and best practices for automated testing and deployment.
This position is open to all candidates.
 
8296063
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
This is a great opportunity to be part of one of the fastest-growing infrastructure companies in history, an organization that is in the center of the hurricane being created by the revolution in artificial intelligence.
"our company's data management vision is the future of the market."- Forbes
We are the data platform company for the AI era. We are building the enterprise software infrastructure to capture, catalog, refine, enrich, and protect massive datasets and make them available for real-time data analysis and AI training and inference. Designed from the ground up to make AI simple to deploy and manage, our company takes the cost and complexity out of deploying enterprise and AI infrastructure across data center, edge, and cloud.
Our success has been built through intense innovation, a customer-first mentality and a team of fearless company ronauts who leverage their skills & experiences to make real market impact. This is an opportunity to be a key contributor at a pivotal time in our company's growth and at a pivotal point in computing history.
The DevOps Engineer position is an operational engineering role and is an integrated part of our development team. You will be responsible for improving the efficiency of our processes, software, and infrastructure, and will assist the R&D team with product development. If you are a DevOps Engineer who is passionate about automating and scaling everything, this job is for you.
Responsibilities
Monitor and optimize cloud infrastructure for performance, scalability, and cost-efficiency.
Manage and maintain CI infrastructure (GitLab CI and Jenkins).
Manage, maintain and improve our release and development environments.
Support critical production infrastructure deployed in multiple clouds (AWS, Azure, and GCP).
Develop and support the R&D toolchain and implement best practices for code deployment, testing, and maintenance.
Automate on-premises lab infrastructure by adopting IaC practices.
Lead and develop monitoring, telemetry, alerting, and logging production services.
Requirements:
Desired Qualifications:
Proven hands-on experience with Docker and Kubernetes in production. Hands-on experience deploying and managing complex Kubernetes environments, including services, ingresses, load balancers, and Helm charts
Solid understanding of Linux/Unix Internals and experience with handling complex performance and configuration problems in Linux/Unix environment.
Multi-Cloud Expertise: Deep familiarity with both GCP and AWS for provisioning, networking, and cost-optimization strategies
Experience in DSL Configuration tools like Ansible, Chef, or Puppet.
Experienced with programming languages (Python is preferred).
Shell scripting experience.
Proficient in SRE/monitoring methodologies (monitoring stacks, with emphasis on Prometheus)
Nice To Have Skills
Experienced with CI/CD tools and frameworks.
Experience managing binary repositories (RPM, PyPI, npm, etc.)
Experience with developing Ansible collections, roles, and modules.
Experience with managing GitLab and GitLab CI.
Experience with Hashicorp Products: Terraform, Packer, Consul, Vault, and Vagrant.
Experience with automating configuration and deployment of On-Premises Lab Hardware.
This position is open to all candidates.
 
8325791
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
Realize your potential by joining the leading performance-driven advertising company!
As a Staff MLOps Engineer in the Infra group, you'll play a vital role in developing, enhancing and maintaining highly scalable machine learning infrastructure and tools.
About Algo platform:
The objective of the algo platform group is to own the existing algo platform (including health, stability, productivity and enablement), to facilitate and be involved in new platform experimentation within the algo craft and lead the platformization of the parts which should graduate into production scale. This includes support of ongoing ML projects while ensuring smooth operations and infrastructure reliability, owning a full set of capabilities, design and planning, implementation and production care.
The group has deep ties with both the algo craft as well as the infra group. The group reports to the infra department and has a dotted line reporting to the algo craft leadership.
The group serves as the professional authority when it comes to ML engineering and ML ops, serves as a focal point in a multidisciplinary team of algorithm researchers, product managers, and engineers and works with the most senior talent within the algo craft in order to achieve ML excellence.
How you'll make an impact:
As a Staff MLOps Engineer, you'll bring value by:
Develop, enhance and maintain highly scalable Machine-Learning infrastructures and tools, including CI/CD, monitoring and alerting and more
Have end to end ownership: Design, develop, deploy, measure and maintain our machine learning platform, ensuring high availability, high scalability and efficient resource utilization
Identify and evaluate new technologies to improve performance, maintainability, and reliability of our machine learning systems
Work in tandem with the engineering-focused and algorithm-focused teams in order to improve our platform and optimize performance
Optimize machine learning systems to scale and utilize modern compute environments (e.g. distributed clusters, CPU and GPU) and continuously seek potential optimization opportunities.
Build and maintain tools for automation, deployment, monitoring, and operations.
Troubleshoot issues in our development, production and test environments
Directly influence the way billions of people discover the internet.
Requirements:
Experience developing large-scale systems. Experience with filesystems, server architectures, distributed systems, SQL and NoSQL. Experience with Spark and Airflow or other orchestration platforms is a big plus.
Highly skilled in software engineering methods, with 5+ years of experience.
Passion for ML engineering and for creating and improving platforms
Experience with designing and supporting ML pipelines and models in production environment
Excellent coding skills in Java & Python
Experience with TensorFlow is a big plus
Possess strong problem solving and critical thinking skills
BSc in Computer Science or related field.
Proven ability to work effectively and independently across multiple teams and beyond organizational boundaries
Deep understanding of computer science fundamentals: object-oriented design, data structures, systems and application programming, and multithreaded programming
Strong communication skills to be able to present insights and ideas, and excellent English, required to communicate with our global teams.
Bonus points if you have:
Experience in leading Algorithms projects or teams.
Experience in developing models using deep learning techniques and tools
Experience in developing software within a distributed computation framework.
This position is open to all candidates.
 
8335911
02/09/2025
Location: Tel Aviv-Yafo
Job Type: Full Time
We are revolutionizing the way businesses interact with the digital world by revealing to them everything that happens online.
Our unique data and solutions empower over 4,300 customers globally, including industry giants like Google, eBay, and Adidas, to make game-changing decisions that drive their digital strategies.
In 2021, we went public on the New York Stock Exchange, and we continue to reach new heights! Come work alongside people across the globe who are bright, curious, practical and good.
What You'll Do:
You'll be part of the group responsible for our main products. You will impact the work of developers in the group by designing, building and maintaining the core infrastructure of our solution, leading the research and development of new technologies, and maintaining code standards and practices.
What does the day-to-day of an Infrastructure Engineer look like?
You will be working on our core B2B platform, which serves tens of thousands of customers and hundreds of terabytes in production. Our backend engineers are responsible for the entire data lifecycle: from our endless data lakes, through choosing the right serving methods and databases, all the way to our API services.

Your role will include:
Design and implement scalable backend services and libraries that are reusable and maintainable, serving as the foundation for various applications across the company.
Build and maintain tools that streamline development workflows, enabling product teams to focus on delivering business value.
Define and promote best practices for code quality, performance, and reliability, ensuring healthy production environments and rapid development cycles.
Lead the adoption and integration of AI tools to assist in code generation, testing, documentation, and debugging, thereby accelerating development processes.
Perform proof-of-concepts (POCs) on emerging technologies, including AI agents and platforms, to assess their applicability and benefits to our development ecosystem.
Drive cross-team technical projects aimed at improving infrastructure scalability, reliability, and developer experience.
Analyze and resolve complex production issues, ensuring minimal downtime and optimal performance.
Contribute to the evolution of our system architecture, ensuring it supports rapid development and scaling needs.
Requirements:
Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
5+ years of experience in backend development, with a strong focus on infrastructure and platform engineering.
Proficiency in programming languages such as C#, Python, Java, or Go.
Experience building large-scale infrastructure applications or large-scale web applications.
Experience improving the stability of large-scale systems by monitoring, resolving bottlenecks and making appropriate changes.
High coding standards, the ability to work independently, and experience leading long-term technical tasks involving many teams and stakeholders.
Experience with cloud platforms (e.g., AWS, GCP, Azure) and container orchestration tools like Kubernetes.
Familiarity with CI/CD pipelines and infrastructure-as-code tools (e.g., Terraform, Ansible).
Demonstrated experience in integrating and leveraging AI tools to enhance development workflows.
This position is open to all candidates.
 
8329714