3 hours ago
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We're now looking for a visionary Site Reliability Engineer to join the R&D team. In this critical role, you will support our growing operations, network, and systems. You will play a pivotal role in administering our internal systems and participate in key design decisions. In this position, you can innovate, build best-practice processes, and consistently work with new ideas.
Responsibilities:
Responsible for the company's production environment, including complex architecture with multiple virtual servers, deployments & various cloud technologies.
Manage the availability, latency, scalability and efficiency of our company services by engineering reliability into software and systems.
Respond to and resolve emergent service problems; build tools and automation to prevent problem recurrence.
Review and influence new and evolving design, architecture, standards, and methods for operating services and systems.
Participate in software and system performance analysis and tuning, service capacity planning and demand forecasting.
Requirements:
5+ years of experience in a similar role as SRE / DevOps Engineer.
Experience with AWS or other cloud providers.
Solid knowledge in cloud native technologies (Kubernetes, Prometheus, Grafana, etc.)
Experience programming in one or more of the following languages: Python or Go.
Solid understanding of Unix/Linux operating systems.
Expertise in designing, analyzing, and troubleshooting large-scale distributed systems.
A knowledge seeker, curious nature, with the need to understand how things work "under the hood".
Excellent communication and teamwork skills.
Advantages:
Experience with IaC tools such as Terraform, Ansible, Chef, etc.
Solid knowledge in networking and internet technologies - e.g. HTTP servers, DNS, firewalls, proxies, etc.
Experience working for a SaaS company.
Experience with building and maintaining ETL pipelines.
This position is open to all candidates.
 
Job ID: 8441567

Similar jobs that may interest you
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
As a Lead DevOps Engineer, your role involves the design and development of robust, scalable, and resilient distributed systems. You'll define product specifications, leveraging your technical expertise to create optimal solutions hosted in Kubernetes on AWS Cloud. This position requires extensive collaboration with various teams throughout the software development lifecycle. You will lead design discussions and code reviews, contributing to the overall quality of engineering within the organization.

Your responsibilities also include creating and supporting reusable application components and patterns, considering both business and technology perspectives. You'll utilize developer tools and a range of AWS services for task management, source code handling, building, deployment, operations, and real-time communication. You are expected to demonstrate advanced skills in application design, implementation, and maintenance, often with minimal supervision.

Beyond technical tasks, you will mentor other engineers, sharing your knowledge and actively contributing to the enhancement of best practices and processes within and across teams.

Responsibilities:

Design, build, and maintain the scalable cloud infrastructure and CI/CD pipelines necessary to support our cutting-edge AI and optimization services.

Champion Infrastructure as Code (IaC) practices using tools like Terraform and Kubernetes to automate the deployment, scaling, and management of our production environments.

Implement robust monitoring, logging, and alerting systems to ensure the high availability, performance, and reliability of all services.

Partner with development teams to streamline the software development lifecycle, improve deployment velocity, and embed best practices for security and operational excellence.


JR314438
Requirements:
4+ years of hands-on experience in DevOps Concepts and Cloud Architecture.

4+ years of experience with AWS (knowledge of S3, SQS, DynamoDB, IAM, and KMS concepts is mandatory); experience with similar concepts on other cloud providers, e.g., GCP and Azure, is optional.

4+ years of experience deploying and managing CI/CD pipelines, e.g., Jenkins and/or Spinnaker.

Advanced programming experience with at least two modern languages such as Go, Java, C++, or Python, including object-oriented design.

Proven understanding of microservices-oriented architecture and extensible REST and gRPC APIs. Experience building the architecture and design (architecture, design patterns, reliability and scaling) of new and current systems.

Knowledge and experience in Kubernetes cluster management, ensuring that workloads in Deployments and StatefulSets remain reliable, available, and secure and meet performance expectations.

Experience with Kubernetes packaging technologies such as Helm and experience administering Kubernetes ConfigMaps, Services, Deployments, and StatefulSets.

Experience with monitoring production and staging of test and development environments for a number of applications in a dynamic organization.

Good command of version control tools, including but not limited to Git.

Strong expertise in troubleshooting complex production issues. Excellent problem-solving, critical thinking, and communication skills.

Degree or equivalent relevant experience required. Experience will be evaluated based on the core competencies for the role (e.g. extracurricular leadership roles, military experience, volunteer roles, work experience, etc.).
This position is open to all candidates.
 
Job ID: 8431996

Location: Tel Aviv-Yafo
Job Type: Full Time
Wanted: Site Reliability Engineer, Panda team
Realize your potential by joining the leading performance-driven advertising company!
As a Site Reliability Engineer on the IT Production team in our Tel Aviv office, you'll play a vital role in building robust services and solving infrastructure challenges with automation while working with cutting-edge technologies and pushing them to their limits on our mostly on-prem, cloud-like infrastructure.
As a Site Reliability Engineer, you'll bring value by:
Ensure Reliability & Scalability: Design, implement and manage highly reliable and scalable distributed systems across our on-premise, cloud and AI/ML environments. Proactively optimize performance, efficiency, resource utilization and cloud cost.
Drive Automation: Automate repetitive tasks, infrastructure provisioning, configuration and deployments using IaC and scripting languages (e.g., Python, Go, Rust).
Develop Observability & Capacity: Implement comprehensive monitoring and alerting systems to ensure system health. Collaborate on capacity planning to meet future growth.
Maintain Security & Compliance: Integrate security best practices and ensure compliance with industry standards.
Lead Incident Management: Participate in on-call rotations, lead incident responses and conduct root cause analysis to minimize downtime.
Foster Collaboration & Improvement: Work closely with development, operations and security teams to drive shared responsibility and continuous improvement in SRE practices.
Our Tech Stack:
Linux, Kubernetes, nginx, Istio, AWS, GCP, Azure, Alicloud, Fastly, Terraform, Consul, Prometheus, Loki, Grafana, Airflow, Redis, Kafka, Vector, Hadoop, Cassandra, Vertica, MySQL, HDFS, ELK.
Requirements:
4+ years of experience in software development with a proven track record of designing and developing internal tools, automation frameworks and platform components in large-scale distributed production environments, with a focus on Linux operating systems.
Deep, demonstrable expertise in one of the following programming languages: Go, C, Rust, Python, or Java.
Experience in observability tooling development, specifically implementing custom metrics, tracing and logging within application code.
Practical understanding of the HTTP protocol (including HTTP methods, status codes and headers). Proven ability to design, implement and instrument robust internal APIs (e.g., using REST or gRPC).
Understanding of Linux operating system internals: kernel configuration, system calls, process management, memory and I/O.
Proven ability to troubleshoot and optimize performance bottlenecks under heavy load using advanced monitoring and profiling tools for high-throughput and low-latency applications.
Bonus points if you have:
Experience as an SRE, DevOps Engineer, System Administrator in a large distributed environment with focus on Linux operating systems.
This position is open to all candidates.
 
Job ID: 8439403

Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
Wanted: Site Reliability Engineer
Realize your potential by joining the leading performance-driven advertising company!
As a Site Reliability Engineer on the IT Production team in our TLV office, you'll play a vital role in building robust services and solving infrastructure challenges with automation while working with cutting-edge technologies and pushing them to their limits on our mostly on-prem, cloud-like infrastructure.
How you'll make an impact:
As a Site Reliability Engineer, you'll bring value by:
Ensure Reliability & Scalability: Design, implement and manage highly reliable and scalable distributed systems across our on-premise, cloud and AI/ML environments. Proactively optimize performance, efficiency, resource utilization and cloud cost.
Drive Automation: Automate repetitive tasks, infrastructure provisioning, configuration and deployments using IaC and scripting languages (e.g., Python, Go, Rust).
Develop Observability & Capacity: Implement comprehensive monitoring and alerting systems to ensure system health. Collaborate on capacity planning to meet future growth.
Maintain Security & Compliance: Integrate security best practices and ensure compliance with industry standards.
Lead Incident Management: Participate in on-call rotations, lead incident responses and conduct root cause analysis to minimize downtime.
Foster Collaboration & Improvement: Work closely with development, operations and security teams to drive shared responsibility and continuous improvement in SRE practices.
Our Tech Stack:
Linux, Kubernetes, nginx, Istio, AWS, GCP, Azure, Alicloud, Fastly, Terraform, Consul, Prometheus, Loki, Grafana, Airflow, Redis, Kafka, Vector, Hadoop, Cassandra, Vertica, MySQL, HDFS, ELK.
Requirements:
7 years of experience as an SRE, DevOps Engineer, System Administrator in a large distributed environment with focus on Linux operating systems.
Experience supporting, troubleshooting and scaling large distributed systems in production.
Deep understanding of HTTP protocol, including HTTP/1.1, HTTP/2, caching semantics, TLS and gRPC delivery.
Experience configuring and operating CDN services (e.g., Akamai, Fastly, Cloudflare, AWS CloudFront).
Deep understanding of Linux system internals and system performance tuning.
Experience with Configuration Management Tools (Puppet, Ansible, Chef, Terraform).
Experience programming in at least one of the following languages (Python, Golang, Rust, Ruby, C++, Java).
Experience with monitoring and metrics collection systems (Prometheus, Grafana, ELK).
Experience with cloud providers and platforms (AWS, Azure, GCP, Alibaba).
Experience with containerization technologies (Kubernetes, Docker).
Deep understanding of networking principles (TCP/IP, DNS, load balancing).
This position is open to all candidates.
 
Job ID: 8439391

Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for a skilled DevOps Engineer to join our R&D infrastructure team and play a key role in building and scaling cloud-based platforms. In this position, you will design, implement, and maintain modern CI/CD pipelines, manage multi-cloud environments, and support microservices-based architectures. You will work closely with developers, QA, and product teams to streamline delivery processes, improve system reliability, and ensure smooth deployments. This is a hands-on role where you will directly influence the stability, scalability, and efficiency of our production systems while leveraging cutting-edge technologies across AWS and Azure.

Responsibilities
Infrastructure as Code (IaC): Develop and maintain infrastructure using tools like Terraform and Ansible.
Cloud Infrastructure Management: Deploy, manage, and monitor applications in cloud environments (AWS and Azure).
Collaboration & Support: Work closely with developers, QA, and product teams to streamline releases and improve productivity. Provide technical support for development and operations teams during incidents and deployments.
CI/CD Pipeline Management: Design, implement, and maintain continuous integration and delivery pipelines. Automate build, test, and deployment processes to improve speed and reliability.
Requirements:
3-5 years of experience as a DevOps Engineer, SRE, or Platform Engineer
Strong problem-solving skills
Microservices architecture & container orchestration (Docker and Kubernetes)
Experience with IaC tools (e.g. Terraform)
Strong knowledge of CI/CD tools such as Jenkins, GitHub.
Experience with configuration management tools (e.g. Chef, Ansible or Puppet)
Experience with GitOps (e.g. ArgoCD)
Proven Scripting capabilities: PowerShell/Bash/Python
Hands-on experience with cloud platforms AWS/Azure/GCP
Strong troubleshooting skills
Familiarity with monitoring and logging tools (Prometheus, Grafana, ELK, etc.)
Excellent collaboration and communication skills for working across development, QA, and operations teams
BSc degree in computer science, computer engineering, relevant technical discipline, or equivalent practical experience
This position is open to all candidates.
 
Job ID: 8423258

Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for a Senior DevOps Engineer to join our R&D team in developing the next rising product in the health tech landscape. If you are looking for a challenging, influential position and are passionate about making an impact, this might be the role for you.

As a Senior DevOps Engineer, you'll play a key role in the design, development, testing, deployment, and monitoring of our infrastructure and products. In this position, you'll make significant contributions to our observability stack, helping build and maintain robust systems for logs, metrics, traces, and alerting.

Our ideal candidate is passionate about DevOps and observability, has strong communication skills, and thrives on constant improvement of both technology and processes. If you enjoy working on multiple projects in parallel and are a proactive team player, you'll fit right in.

This is a unique opportunity to join the core team of a fast-growing startup, where your contributions will have a direct impact on our product and success.

Responsibilities

Support and collaborate with cross-functional engineering teams using cutting-edge technologies.
Contribute to the design, implementation, and maintenance of monitoring, logging, and alerting systems (e.g., Prometheus, Grafana, Loki)
Secure, scale, and manage our cloud environments (AWS and GCP)
Design and implement automation solutions for both development and production
Manage and improve our CI/CD pipelines for fast and safe delivery
Lead best practices in infrastructure, observability, configuration management, and system hardening
Continuously assess and improve existing infrastructure in line with industry standards
Requirements:
BSc in Computer Science, Engineering, or equivalent experience
5+ years of experience as a DevOps Engineer or similar software engineering role
Proven experience with Docker and Kubernetes (EKS preferred)
Hands-on experience with monitoring and observability tools, including Prometheus, Grafana, Datadog, or similar.
Expertise in Terraform for AWS infrastructure-as-code deployments
Strong collaboration and interpersonal communication skills
Excellent analytical thinking and problem-solving mindset
Proficiency with relational databases
Solid knowledge of Python and Bash scripting
Experience with test automation is an advantage
This position is open to all candidates.
 
Job ID: 8398069

4 hours ago
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for a DevOps Engineer.
As a key member of our engineering team, you'll work at the intersection of development, operations, and reliability. You'll automate cloud infrastructure, ensure system performance, and maintain secure, scalable deployments in a regulated fintech environment.
Responsibilities:
Manage and enhance cloud infrastructure (AWS, GCP, Azure, or similar).
Develop, maintain, and automate CI/CD pipelines to streamline application delivery.
Implement Infrastructure as Code (e.g., Terraform, Ansible, CloudFormation) for provisioning and managing environments.
Set up and maintain monitoring, observability, and alerting systems using tools like Prometheus, Grafana, Splunk, New Relic, ELK, etc.
Define, track, and act upon SRE metrics (SLIs, SLOs, error budgets) to balance reliability and development velocity.
Participate in incident response, including root cause analysis and remediation.
Automate repetitive tasks to reduce toil and increase system resiliency and uptime.
Collaborate with developers and security teams to embed security and compliance best practices (e.g., PCI DSS, DevSecOps).
Support on-call rotation and continuously improve operational processes.
Requirements:
5-8 years of experience in DevOps, SRE, or related engineering roles.
Proven experience working with at least one cloud provider (AWS, GCP, Azure).
Proven experience with containerization and orchestration (Docker, Kubernetes, GKE).
Proficiency in CI/CD tooling (e.g., GitLab CI, Jenkins, GitHub Actions).
Hands-on experience with Infrastructure as Code tools (Terraform, Ansible, CloudFormation).
Strong command of monitoring and observability tools (Prometheus, Grafana, ELK stack, Splunk, New Relic).
Solid scripting ability in Python, Bash, or similar.
Familiarity with Linux/Unix systems, networking, and basic system administration.
Comfortable working in fast-paced, collaborative environments and able to handle operational incidents effectively.
Excellent communication skills and a mindset geared toward continuous learning and improvement.
Nice to Have:
Exposure to containerization and orchestration (Docker, Kubernetes, GKE).
Understanding of SLA/SLI/SLO frameworks, error budgets, and reliability engineering principles.
Awareness of financial compliance standards like PCI DSS.
Knowledge of DevSecOps practices (security-as-code, shifting security left).
Familiarity with incident management and on-call culture.
This position is open to all candidates.
 
Job ID: 8441385

31/10/2025
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We're looking for a DevOps Engineer to join the R&D team and spread the power of our company.
In this role, you will design and implement scalable systems that will keep our company running smoothly and support our significant business growth.
You will join an innovative, high-performance team and work with cutting-edge technologies in a dynamic and agile environment.
WHAT YOU'LL DO
Take an active part in all DevOps areas: our company's infrastructure and cloud environment, tools, services, and uptime.
Be part of our product's architectural and infrastructure design; examine and implement new cloud technologies and open-source tools to improve the delivery and availability of the product.
Plan and push forward the growth and scale of data capacity for various products.
Be responsible for the smooth production-grade execution of provided solutions.
Requirements:
4+ years of experience as a DevOps engineer on a high-scale distributed system, working in a Linux environment.
Proven hands-on experience with containerized environments and microservices; Docker and Kubernetes - a must
Extensive experience with working in a multi-cloud environment
The mindset and approach for automating away from manual efforts
In-depth knowledge of build/release systems, CI/CD pipelines.
Scripting/programming skills with Python/Bash/Go.
An innovative approach, with the ability to quickly learn technologies
A strong sense of ownership and accountability
Full professional fluency (written and verbal) in both Hebrew and English.
This position is open to all candidates.
 
Job ID: 8394348

Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
As a Staff Engineer, you will be the technical lead and driving force behind the group's most complex initiatives. You will work closely with engineers, tech leads, architects, and product managers to solve high-scale distributed systems challenges, improve performance, and design robust, future-proof systems.
This role is ideal for experienced software architects and senior developers who are passionate about system architecture, performance at scale, and leading cross-team engineering efforts without formal management duties.

Key Responsibilities:
Act as the technical authority for large-scale backend systems within the Execution group.
Gain a deep understanding of the Orchestration group's services, the campaign targeting flow, and how the product works as a whole, in order to make architectural decisions in the broader product context.
Champion the group's strategic adoption of AI and Vibe Coding practices, becoming a key enabler for increasing developer efficiency through the use of cutting-edge AI development tools.
Lead the design and implementation of distributed, high-throughput, low-latency services that support billions of message executions monthly.
Partner with Engineering Managers and Architects to shape the group's long-term technical vision and architecture roadmap.
Define and enforce engineering standards and best practices across services.
Conduct in-depth design and code reviews, mentoring other engineers and elevating technical excellence.
Proactively identify cross-cutting concerns and drive group-wide engineering initiatives (e.g., observability, resiliency, fault tolerance).
Analyze and improve system bottlenecks in data flow, message queuing, storage, and processing pipelines.
Take ownership of non-functional requirements such as reliability, scalability, maintainability, and security.
Collaborate with Product and Data Science teams to ensure engineering plans align with business priorities.
Requirements:
Technical Skills and Experience:
10+ years of software engineering experience, with at least 3 years in senior or staff-level roles involving architectural decision-making.
Proven experience designing and building scalable, distributed systems and services in .NET/C# (preferred) or other modern languages (Java, Go, etc.).
Expertise in designing event-driven architectures using Kafka or equivalent messaging systems.
Deep understanding of data pipelines, message queues, batch and stream processing at scale.
Strong experience with cloud-native development, container orchestration, and infrastructure-as-code (e.g., GCP, Docker, Kubernetes, Terraform).
Experience with relational and NoSQL databases and an understanding of their tradeoffs.
Strong familiarity with performance monitoring, alerting, and observability tools.
Experience driving technical design documents, evaluating new technologies, and communicating decisions effectively to varied audiences.
Curiosity and hands-on experience with AI-powered development workflows, LLM tools, and productivity boosters is a strong plus.
Leadership & Impact
Recognized as a go-to expert and trusted advisor by engineers across the group.
Strong mentoring skills: willing and able to guide others through design challenges and deep technical problems.
Comfortable operating in ambiguity, proposing solutions, and reducing complexity.
Influences architecture, priorities, and processes beyond their immediate team.
Passionate about creating a culture of engineering excellence, ownership, and continuous improvement.
Leads cross-functional technical initiatives that span multiple teams and disciplines.

Preferred Qualifications:
Experience in a high-growth SaaS company or one with high-throughput systems.
Background in campaign orchestration, marketing automation, or messaging systems.
Experience working with data engineering tools and pipelines (e.g., Airflow, BigQuery, dbt) is a plus.
Contributor to open-source or internal developer communities.
This position is open to women and men alike.
 
Job ID: 8386300

31/10/2025
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We're looking for a Site Reliability Engineer (SRE) to join the R&D team and spread the power of our company.
In this role, you'll design and build scalable systems to ensure our company runs seamlessly and supports our rapid business growth.
You'll be part of an innovative, high-performing team, working with cutting-edge technologies in a fast-paced, agile environment.
WHAT YOU'LL DO
Work closely with R&D engineers on coordination, communication, and execution of production-related operations
Maintain a safe and healthy production environment by following and enforcing standards and procedures
Work in an agile & fast-growing environment
Identify operational problems by observing and studying system functioning and performance results
Provide operational management information by collecting and analyzing operating and engineering data and trends
Constantly improve the technology stack, supporting the data growth and customer requirements
Work with cutting-edge technologies in a dynamic and agile environment.
Requirements:
Proven hands-on experience with containerized environments and microservices; Docker and Kubernetes - a must
4 years of experience as a DevOps/SRE engineer on a high-scale distributed system, working in a Linux environment
Proven experience with building monitoring tools
An innovative approach, with the ability to quickly learn technologies
The mindset and approach for automating away from manual efforts
A strong sense of ownership and accountability
Excellent problem-solving and troubleshooting - system, application, and database level
Full professional fluency (written and verbal) in both Hebrew and English.
This position is open to all candidates.
 
Job ID: 8394327

1 day ago
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We're looking for a Senior SRE Engineer who combines strong infrastructure expertise with solid programming skills to help scale our platform, and who can balance operational excellence with software development.
This is an exciting opportunity to build SRE processes from the ground up - creating new reliability pipelines, monitoring frameworks, and foundational practices that will scale with our rapid growth.
You'll lead our infrastructure and reliability efforts while writing code to automate, optimize, and enhance our systems. This role requires both deep technical expertise and the ability to mentor team members as we scale.
Stack: AWS, Python, EKS, K8s, Kafka, RabbitMQ, Pulumi, PostgreSQL, Databricks, GitHub Actions
Core Responsibilities:
Design and implement scalable, reliable infrastructure solutions on AWS using Infrastructure as Code (Terraform/Pulumi).
Build and maintain sophisticated CI/CD pipelines with GitOps methodologies.
Develop custom tooling and automation scripts in Python/Go/similar languages to improve operational efficiency.
Architect and implement comprehensive observability solutions (metrics, logging, tracing, alerting).
Define and track SLIs/SLOs/Error Budgets to ensure system reliability.
Lead incident response, conduct thorough post-mortems, and drive systemic improvements.
Optimize cloud costs through data-driven analysis and architectural improvements.
Collaborate with development teams to improve application reliability and performance.
Mentor team members on SRE best practices and infrastructure design patterns.
Requirements:
5+ years of DevOps/SRE experience in production environments.
Solid programming skills in at least one language (Python, Go, Java, or similar) with ability to write production-quality code.
Strong understanding of SRE principles: reliability engineering, capacity planning, chaos engineering.
Deep expertise with Kubernetes (EKS preferred) including operators, CRDs, and advanced networking.
Proven experience implementing Infrastructure as Code at scale.
Hands-on experience with observability stacks (Prometheus, Grafana, ELK, Datadog, or similar).
Experience with distributed systems concepts and troubleshooting.
Excellent problem-solving skills with a systematic approach to debugging.
Strong communication skills and ability to work across teams.
What Sets You Apart:
You write code to solve operational problems, not just configure existing tools.
You think in systems and can identify root causes across complex architectures.
You're passionate about automation and eliminating toil.
You balance perfectionism with pragmatism to deliver reliable solutions quickly.
You stay current with cloud-native technologies and best practices.
You can translate technical concepts for various audiences.
This position is open to all candidates.
 
Job ID: 8439435