Jobs » Engineering » Site Reliability Engineer

5 hours ago
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
Join our DeviantArt team as a Senior DevOps Engineer and play a pivotal role in maintaining and architecting a robust infrastructure that powers one of the largest online art communities. You'll be at the forefront of ensuring our platform's high availability, performance, and security, handling over 1.5 billion monthly page views.
The DeviantArt DevOps Team is a very small remote team that performs all tasks normally inclusive of SRE/DevOps/Infrastructure Engineers, with a bit of networking, security, and database administration mixed in. We are responsible for the day-to-day management and implementation of large-scale, mission-critical production systems that run on a public cloud.
This role requires wearing a lot of hats, and is equal parts fun and challenging. In this role, you will:
Architect and maintain a highly available infrastructure with a focus on proactive and reactive DDoS mitigation, autoscaling, self-healing, site performance, and cost optimization
Participate in a 24/7 on-call rotation, responding swiftly to outages or performance issues, and focus on less urgent alerts during normal work hours
Maintain and develop a developer environment and CI/CD pipelines in parity with production systems, for seamless testing and release of changes
Automate infrastructure provisioning and management using configuration management tools, complete with tests and documentation
Optimize and support sharded MySQL databases for efficient and reliable data handling amidst growing data reads and writes
Regularly update system components to avoid security issues and ensure up-to-date technology
We take our work seriously, but we don't take ourselves too seriously! We enjoy designing and building systems using open source tools and industry standards, and are in the fortunate position to be able to make decisions as a team about adopting newer technologies, and redesigning our infrastructure when appropriate.
This role is on a fully remote and distributed team, and asynchronous communication within and across teams is crucial. To be successful in this role, a candidate will need to work flexibly, balancing server and service issues, needs from development teams, security needs, and shifting priorities in our own tasks in managing our infrastructure.
Requirements:
5+ years of experience managing systems at scale as a DevOps Engineer, Site Reliability Engineer, or Platform Engineer
Excellent technical and analytical skills, with the ability to implement DDoS mitigation, troubleshoot complex problems, analyze system bottlenecks, and implement effective solutions, from frontend through backend systems, sometimes during production degradation or an outage on a high-traffic site
Exceptional command line Linux skills, with proficiency in Bash and Python for investigation of server and services issues, scripting, and automation
In-depth knowledge of AWS services, infrastructure as code using Terraform, GitOps tools and methodologies, and container orchestration using Docker, Helm, and Kubernetes
Experience with setup, administration, and maintenance of sharded MySQL database clusters while maintaining no downtime or data loss
Excellent communication skills with fluent English, and the ability to collaborate effectively across teams while articulating technical concepts to non-technical stakeholders
The ability to get up to speed on systems, make decisions, be flexible, and execute independently with attention to detail for production systems.
This position is open to all candidates.
 
8220324
Similar jobs that may interest you
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time and Hybrid work
We're growing and looking to hire a Site Reliability Engineer (SRE) who embodies our core values: People First, Customer Obsession, Strive for Excellence, and Integrity.
We are looking for a skilled and motivated Site Reliability Engineer (SRE) to join our team and help ensure our production cloud environment's reliability, performance, and scalability. As an SRE, you will work at the intersection of software engineering and operations, taking ownership of system stability, incident response, automation, and continuous improvement of our infrastructure.
This role is ideal for engineers who thrive in dynamic environments, value reliability, and enjoy building resilient and scalable systems.
As an SRE, your impact will be:
Production Reliability: Ensure system uptime and performance by identifying and addressing potential issues before they affect end users.
Incident Response: Serve as part of the on-call rotation, rapidly diagnosing and resolving incidents, and conducting root cause analysis and postmortems.
Monitoring and Alerting: Build and maintain monitoring dashboards and alerting systems to detect and respond to anomalies in real time.
Automation and Tooling: Develop and maintain automation tools for deployments, scaling, and operational efficiency using Terraform, Ansible, Bash, or Python.
Infrastructure Maintenance: Perform regular maintenance and upgrades of production infrastructure to ensure security, stability, and performance.
Release Engineering: Support and optimize the rollout of new features and updates, minimizing risk and impact on production environments.
Staging Environment Management: Ensure staging environments accurately reflect production for robust testing and validation of changes.
Requirements:
Experience in SRE, DevOps, or production engineering roles
Strong skills in system troubleshooting, incident response, and root cause analysis
Proficiency with tools such as:
Jenkins, Terraform, Ansible, Git, GitHub
Bash, Python
AWS, ArgoCD, or similar CI/CD and cloud platforms
Familiarity with observability tools and practices (metrics, logging, tracing)
Ability to work effectively in cross-functional teams
Strong communication and documentation skills
Bachelor's degree in Computer Science, Information Technology, or a related field (preferred)
Familiarity with Agile development methodologies
This position is open to all candidates.
 
8198455
Location: Tel Aviv-Yafo
Job Type: Full Time
Required: Site Reliability Engineer - Infra
Realize your potential by joining the leading performance-driven advertising company!
As a Site Reliability Engineer - Infra on our Infrastructure team at the TLV office, you will play a key role in ensuring the reliability, scalability, and performance of our critical systems. You will be responsible for managing and improving our core infrastructure, with a focus on automation, monitoring, and incident response. You will work with a wide range of technologies, including Kubernetes, monitoring and observability tools, configuration management systems, and core networking services.
How you'll make an impact:
As a Site Reliability Engineer, you'll bring value by:
Ensure the reliability, availability, and performance of our infrastructure services.
Manage and maintain our Kubernetes infrastructure, including KubeVirt.
Design, implement, and maintain our monitoring and observability stack (SensuGo, VictoriaMetrics, Prometheus, ELK).
Automate infrastructure provisioning, configuration, and deployment processes using Puppet and Ansible.
Manage and maintain core services such as DNS and networking.
Troubleshoot and resolve complex infrastructure issues in a timely and efficient manner.
Participate in on-call rotations and incident response.
Develop and maintain infrastructure-as-code (IaC).
Identify and implement proactive measures to prevent incidents and improve system reliability.
Collaborate with development teams to ensure smooth and reliable deployments.
Contribute to the design and implementation of new infrastructure solutions.
Drive improvements in system architecture, processes, and tools.
Mentor and coach other team members.
Requirements:
To thrive in this role, you'll need:
5+ years of experience in a Site Reliability Engineering, Systems Engineering, or similar role.
Deep understanding of Site Reliability Engineering principles and practices.
Extensive experience with Kubernetes, including deployment, management, and troubleshooting.
Strong experience with monitoring and observability tools such as SensuGo, Zabbix, VictoriaMetrics, Prometheus, and ELK.
Proficiency in configuration management tools such as Puppet and Ansible.
Solid understanding of Linux internals and networking.
Experience with managing and maintaining core services such as DNS and networking.
Strong programming skills in Python and/or Go.
Experience with both on-premises and cloud environments.
Experience with KubeVirt.
Excellent troubleshooting and problem-solving skills.
Strong communication and collaboration skills.
Ability to work in a fast-paced, dynamic environment.
Ability to participate in on-call rotations including weekends.
Preferred Qualifications:
Experience with large-scale, distributed systems.
Experience with other cloud providers (e.g., AWS, Azure, GCP).
Contributions to open-source projects.
This position is open to all candidates.
 
8205377
51 minutes ago
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
As a Senior DevOps Engineer, you will be responsible for building and maintaining scalable, reliable infrastructure and deployment pipelines with a strong emphasis on security integration throughout the software delivery lifecycle. You will work closely with development teams to improve development velocity while ensuring system reliability, security, and performance. This role is critical in bridging the gap between software development and operations, implementing DevSecOps best practices throughout our organization.

Main Responsibilities:
Infrastructure Management: Design, implement, and maintain cloud-based infrastructure using Infrastructure as Code principles
CI/CD Implementation: Build and optimize continuous integration and continuous deployment pipelines to enable rapid, reliable software delivery
Automation: Develop automation scripts and tools to streamline operations and eliminate manual processes
Containerization: Manage containerization strategies and orchestration using Docker and Kubernetes
Security Integration: Implement security scanning, testing, and validation throughout the CI/CD pipeline
Vulnerability Management: Conduct regular security assessments and remediate vulnerabilities in infrastructure and application code
Compliance Automation: Automate compliance checks and reporting to ensure adherence to security standards
Performance Optimization: Analyze and optimize system performance, scalability, and cost-efficiency
Documentation: Create and maintain thorough documentation for infrastructure, deployment processes, and operational procedures
Incident Response: Participate in on-call rotations and lead incident resolution with thorough post-mortem analysis
Requirements:
5+ years of experience in DevOps, DevSecOps, or similar roles
Cloud Platforms: Extensive hands-on experience with AWS services and architecture patterns
Infrastructure as Code: Proficiency with Terraform, AWS CloudFormation, or similar IaC tools
Containerization: Advanced knowledge of Docker and Kubernetes ecosystem
Kubernetes Technologies: Experience with ArgoCD, Prometheus, Grafana, and other Kubernetes tooling
Programming/Scripting: Strong coding skills in Python, Bash, or Go
CI/CD Tools: Experience implementing and maintaining CI/CD pipelines using Jenkins, GitLab CI, GitHub Actions, or similar
Advantage - Security Tools: Experience with security scanning tools (SonarQube, OWASP ZAP, Snyk, etc.)
Advantage - Networking: Solid understanding of networking principles, load balancing, and security concepts
Exceptional problem-solving abilities and analytical thinking
Strong communication skills with the ability to explain complex technical concepts to various audiences
Collaborative mindset with experience working in cross-functional teams
Self-motivated with the ability to work independently
Proactive approach to identifying and resolving potential issues before they impact production

Advantage:
Experience with PostgreSQL, NoSQL, shell scripting, networking, firewalls, and system security
Strong background in software development with security focus
This position is open to all candidates.
 
8220838
20/05/2025
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are seeking a results-driven DevOps Team Lead to head the DevOps and Infrastructure team within our R&D organization. This role requires strategic vision, technical expertise, and a proactive approach to drive operational excellence, empower teams with robust tools and automation, and ensure high system reliability and scalability.

As the DevOps Team Lead, you will be instrumental in delivering critical KPIs, including system uptime, automation, incident management, and collaboration with development and QA teams to enable self-sufficiency. Additionally, you will serve as the leader for strategic projects, identifying opportunities to improve infrastructure and operational processes, setting long-term goals, and executing initiatives that align with Cyberint's business objectives and growth.

Key Responsibilities
Strategic Leadership

Identify and lead strategic projects to enhance Cyberint's platform scalability, reliability, and operational efficiency.
Develop and execute a roadmap for critical infrastructure and DevOps initiatives that drive business success.
Collaborate with senior stakeholders to align projects with organizational priorities and deliver measurable outcomes.
System Reliability & Uptime

Lead initiatives to ensure system reliability, minimize disruptions, and maintain high availability for Cyberint's SaaS platform.
Establish and manage proactive monitoring, alerting, and preventive maintenance strategies.
Drive incident prevention efforts, ensuring robust failover and disaster recovery mechanisms.
Develop and maintain playbooks to enable rapid diagnosis and resolution of issues.
Automation, Infrastructure as Code (IaC), & Self-Service Enablement

Champion the adoption of automation and IaC to streamline infrastructure management and deployments.
Build and enhance self-service tools and frameworks, empowering R&D teams to operate independently with minimal reliance on DevOps.
Continuously improve CI/CD pipelines to optimize deployment speed and reliability.
Collaboration & Support for Self-Sufficiency

Collaborate closely with development, QA, and support teams to deliver tools and frameworks that promote team autonomy and efficiency.
Advocate for cross-functional engagement to align operational processes with R&D objectives.
Provide training and mentorship to teams on using DevOps tools effectively.
Accountability, Ownership, & Scalability

Take ownership of all systems and infrastructure, ensuring solutions are scalable, resilient, and aligned with Cyberint's growth objectives.
Establish clear accountability frameworks for maintaining infrastructure and delivering on key projects.
Design and execute a roadmap to support self-service-oriented and scalable solutions.
Requirements:
5+ years of experience in DevOps or SRE roles, with 2+ years in a leadership capacity.
Proven expertise in building and maintaining highly available, cloud-native environments (AWS preferred).
Experience with Kubernetes, Terraform, CI/CD pipelines, and monitoring technologies and tools (Prometheus, Grafana, Jenkins, ArgoCD, Elasticsearch, Redis, EKS, etc.).
Skills & Expertise

Strong understanding of automation, Infrastructure as Code (IaC), and self-service enablement.
Expertise in incident management and a track record of delivering reliable, scalable systems.
Hands-on experience with scripting and automation tools (Python, Bash).
Deep understanding of containerization, orchestration, and cloud-native architectures.
Familiarity with cost monitoring and optimization strategies to ensure infrastructure is both efficient and cost-effective.
Knowledge of security best practices for infrastructure and DevOps environments.
This position is open to all candidates.
 
8185042
6 days ago
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
At UVeye, we're on a mission to redefine vehicle safety and reliability on a global scale. Founded in 2016, we have pioneered the world's first fully automated suite of vehicle inspection systems. At the heart of this innovation lies our advanced AI-driven technology, representing the pinnacle of machine learning, GenAI, and computer vision within the automotive sector. With close to $400M in funding and strategic partnerships with industry giants such as Amazon, General Motors, Volvo, and CarMax, UVeye stands at the forefront of automotive technological advancement. Our growing global team of over 200 employees is committed to creating a workplace that celebrates diversity and encourages teamwork. Our drive for innovation and pursuit of excellence are deeply embedded in our vibrant company culture, ensuring that each individual's efforts are recognized and valued as we unite to build a safer automotive world.
We are seeking a highly motivated and skilled Release Engineer to join our AIOps group. In this role, you'll play a critical part in bridging the gap between development and operations, ensuring the seamless qualification, deployment, and monitoring of our AI algorithms and infrastructure, and be responsible for the end-to-end operationalization of our core technology.
A day in the life and how you’ll make an impact:
* Manage the end-to-end release process of machine learning algorithms and infrastructure components, from qualification through deployment.
* Validate and test new algorithm releases to ensure they meet performance, stability, and compliance standards.
* Create and execute deployment plans across various environments (staging, production), ensuring minimal risk and downtime.
* Collaborate closely with AI researchers, MLOps, and software engineers to understand release requirements, share feedback, and resolve pre-release issues.
* Identify and drive automation opportunities within the release pipeline to improve efficiency, reliability, and traceability.
* Oversee updates to infrastructure components, ensuring compatibility and performance across systems.
* Monitor deployments, proactively identify issues related to model behavior or infrastructure anomalies, and drive resolution with relevant teams.
* Maintain clear and accurate release documentation, including version history, deployment notes, and incident reports.
Requirements:
* Bachelor's degree in Computer Science, Software Engineering, or industry equivalent.
* 2+ years of experience in QA & Automation
* Proficiency in scripting languages (e.g., Python, Bash).
* Experience with containerization technologies (e.g., Docker, Kubernetes).
* Familiarity with CI/CD pipelines (e.g., GitLab CI/CD, Jenkins).
* Experience with cloud platforms (e.g., AWS, GCP).
* Familiarity with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).
* Excellent problem-solving skills and attention to detail.
* Strong communication and collaboration skills, with the ability to work effectively with cross-functional teams.
Bonus if you have:
* Strong understanding of the machine learning lifecycle, from experimentation to deployment and monitoring.
* Experience with specific MLOps platforms or tools.
* Experience in a fast-paced startup environment.

Why UVeye:
* Pioneer Advanced Solutions: Harness cutting-edge technologies in AI, machine learning, and computer vision to revolutionize vehicle inspections.
* Drive Global Impact: Your innovations will play a crucial role in enhancing automotive safety and reliability, impacting lives and businesses on an international scale.
* Career Growth Opportunities: Participate in a journey of rapid development, surrounded by groundbreaking advancements and strategic industry partnerships.
This position is open to all candidates.
 
8214831
20/05/2025
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for a Senior DevOps Engineer to join our Cloud Network Security group.

Key Responsibilities
As a DevOps Engineer at Check Point, you will design, implement, and manage CI/CD pipelines, collaborate with cross-functional teams, and ensure the high availability and reliability of our cloud-based services and solutions.

Responsibilities:

Design, implement, and manage CI/CD pipelines to automate the deployment of SaaS
Collaborate with development, QA, and operations teams to ensure smooth and reliable software releases.
Monitor system performance and troubleshoot issues to ensure high availability and reliability of our services.
Implement and manage infrastructure as code (IaC) using tools like Terraform, CloudFormation and ARM.
Optimize system performance, scalability, and security.
Develop and maintain documentation for infrastructure and deployment processes.
Requirements:
2-4 years of experience in DevOps or a related role, working with distributed systems and SaaS applications.
Proficiency with CI/CD tools such as Gerrit, GitLab CI, GitHub
Experience with cloud providers such as AWS, Azure, and GCP
Solid foundation in cloud account user management and cost optimization (FinOps principles)
Solid understanding of networking, security, and system administration.
Familiarity with logging and monitoring stacks (e.g., Elasticsearch, CloudWatch, Grafana, Prometheus).
Proficiency in scripting (Python, Bash) for automation and tooling.
Solid grasp of IaC & GitOps principles and best practices (Terraform, Helm, ArgoCD, Crossplane).
Knowledge of agile methodologies and practices
Strong knowledge of distributed systems, microservices, and orchestration technologies
Expertise in containerization and orchestration tools like Docker and Kubernetes
This position is open to all candidates.
 
8185035
25/05/2025
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
At UVeye, we're on a mission to redefine vehicle safety and reliability on a global scale. Founded in 2016, we have pioneered the world's first fully automated suite of vehicle inspection systems. At the heart of this innovation lies our advanced AI-driven technology, representing the pinnacle of machine learning, GenAI, and computer vision within the automotive sector. With close to $400M in funding and strategic partnerships with industry giants such as Amazon, General Motors, Volvo, and CarMax, UVeye stands at the forefront of automotive technological advancement. Our growing global team of over 200 employees is committed to creating a workplace that celebrates diversity and encourages teamwork. Our drive for innovation and pursuit of excellence are deeply embedded in our vibrant company culture, ensuring that each individual's efforts are recognized and valued as we unite to build a safer automotive world.
We are looking for a DevOps Engineer to join our DevOps R&D team. In this position, you will be responsible for integrating developers and operations teams to improve collaboration and productivity by automating infrastructure, automating workflows, and continuously measuring application performance.
A day in the life and how you’ll make an impact:
* Establish, maintain, and evolve concepts in continuous integration and deployment (CI/CD) pipelines for existing and new services.
* Collaborate with Engineering and Operations teams to improve automation of workflows, infrastructure, code testing, and deployment of on-premise and cloud services.
* Remain up-to-date on industry trends, share knowledge among teams, and abide by industry best practices for configuration management and automation.
* Implement effective monitoring and increase the sophistication of our alerting and escalation mechanisms
* Identify and resolve performance and scalability issues in products and infrastructure.
Requirements:
* 5+ years of experience in systems and production engineering and 3+ years of DevOps experience in a Linux environment
* Experience maintaining and deploying highly available, fault-tolerant systems at scale
* Experience developing in Python and scripting in Bash
* Practical experience with Docker containerization and clustering (Kubernetes)
* Experience with configuration management tools (e.g. Ansible, Terraform)
* Experience implementing CI/CD (e.g. Jenkins, GitHub Actions, Bitbucket Pipelines)
* Experience with cloud providers (e.g. AWS, GCP)
Ideally, we’re looking for:
* Bachelor's or master’s degree in CS
* AWS Certification
* Experience working in and advocating for agile environments
* Knowledge of Linux Kernel fundamentals, including job management, memory management, file systems, networking & debugging

Why UVeye:
* Pioneer Advanced Solutions: Harness cutting-edge technologies in AI, machine learning, and computer vision to revolutionize vehicle inspections.
* Drive Global Impact: Your innovations will play a crucial role in enhancing automotive safety and reliability, impacting lives and businesses on an international scale.
* Career Growth Opportunities: Participate in a journey of rapid development, surrounded by groundbreaking advancements and strategic industry partnerships.
This position is open to all candidates.
 
8010890
29/05/2025
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for a highly skilled and versatile Senior DevOps Engineer to join our development team. As a Senior DevOps Engineer, you'll play a vital role in ensuring the reliability, scalability, and performance of our systems while collaborating with cross-functional teams to deliver high-quality software products.


Responsibilities:
Take an active part in all DevOps areas: develop and maintain monitoring and alerting infrastructure to ensure system reliability and performance
Oversee building and maintaining tools and procedures for monitoring, deployment, and alerting for our SaaS multi-tenant product family
Design and implement CI/CD processes for continuous integration and deployment of software applications
Develop and manage a containerized production environment using technologies such as Kubernetes, Docker, and Helm
Define and enforce DevOps standards, best practices, and procedures across the organization
Provide ad-hoc custom solutions to meet the technical needs of other teams
Act as a resource and mentor for engineers with less DevOps experience, providing guidance and support
Requirements:
5+ years of experience as a DevOps Engineer
Hands-on experience with any public cloud provider (such as GCP)
Hands-on experience with Kubernetes and Docker containers
Strong knowledge of IaC tools such as Terraform and Helm charts
Hands-on experience with CI/CD automation - GitHub Actions and ArgoCD
Experience with ELK Stack (Elasticsearch, Logstash, Kibana) for log analysis and monitoring
Experience in architecting and scaling in cloud environments
Extensive experience with Linux operating system and proficiency in bash scripting
This position is open to all candidates.
 
8199545
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
Required Site Reliability Engineer
Realize your potential by joining the leading performance-driven advertising company!
As a Site Reliability Engineer on the IT Production team in our TLV office, you'll play a vital role in building robust services and solving infrastructure challenges through automation, working with cutting-edge technologies and pushing them to their limits on our mostly on-prem, cloud-like infrastructure.
How you'll make an impact:
As a Site Reliability Engineer, you'll bring value by:
Ensure Reliability & Scalability: Design, implement and manage highly reliable and scalable distributed systems across our on-premise, cloud and AI/ML environments. Proactively optimize performance, efficiency, resource utilization and cloud cost.
Drive Automation: Automate repetitive tasks, infrastructure provisioning, configuration and deployments using IaC and scripting languages (e.g., Python, Go, Rust).
Develop Observability & Capacity: Implement comprehensive monitoring and alerting systems to ensure system health. Collaborate on capacity planning to meet future growth.
Maintain Security & Compliance: Integrate security best practices and ensure compliance with industry standards.
Lead Incident Management: Participate in on-call rotations, lead incident responses and conduct root cause analysis to minimize downtime.
Foster Collaboration & Improvement: Work closely with development, operations and security teams to drive shared responsibility and continuous improvement in SRE practices.
Our Tech Stack:
Linux, Kubernetes, nginx, Istio, AWS, GCP, Azure, Alicloud, Fastly, Terraform, Consul, Prometheus, Loki, Grafana, Airflow, Redis, Kafka, Vector, Hadoop, Cassandra, Vertica, MySQL, HDFS, ELK.
Requirements:
7 years of experience as an SRE, DevOps Engineer, or System Administrator in a large distributed environment, with a focus on Linux operating systems.
Experience supporting, troubleshooting and scaling large distributed systems in production.
Deep understanding of the HTTP protocol, including HTTP/1.1, HTTP/2, caching semantics, TLS, and gRPC delivery.
Experience configuring and operating CDN services (e.g., Akamai, Fastly, Cloudflare, AWS CloudFront).
Deep understanding of Linux system internals and system performance tuning.
Experience with Configuration Management Tools (Puppet, Ansible, Chef, Terraform).
Experience programming in at least one of the following languages (Python, Golang, Rust, Ruby, C++, Java).
Experience with monitoring and metrics collection systems (Prometheus, Grafana, ELK).
Experience with cloud providers and platforms (AWS, Azure, GCP, Alibaba).
Experience with containerization technologies (Kubernetes, Docker).
Deep understanding of networking principles (TCP/IP, DNS, load balancing).
This position is open to all candidates.
 
8205371
05/06/2025
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are hiring a DevOps Engineer to manage all engineering infrastructure, improve the CI/CD process, and develop and provide the support needed to drive engineering efforts forward. In this role, you will be responsible for managing our infrastructure across all accounts, taking full ownership of the CI/CD process in all environments, and overseeing production runtime. You will also implement best practices to enhance the product's security, stability, and monitoring.
Responsibilities:
Design, implement, and manage infrastructure in AWS
Ensure that infrastructure is scalable, resilient, and fault-tolerant
Develop and maintain CI/CD pipelines using tools such as Jenkins, GitLab CI, or similar
Automate build and deployment processes to improve the speed and reliability of product releases
Set up and manage comprehensive monitoring and logging for applications and infrastructure using tools like Prometheus, Grafana, Elastic, New Relic, Bugsnag, etc.
Act on monitoring alerts and logs to troubleshoot and ensure systems' performance, availability, and stability
Develop scripts and tools to automate repetitive tasks and processes
Implement and maintain security and compliance by following proper access controls and vulnerability scanning
Design and implement runtime environments for development, staging, and production
Work with developers to identify areas where DevOps can enhance productivity and quality.
Requirements:
4+ years of experience in DevOps engineering roles, ideally within a SaaS environment
Hands-on experience with the AWS ecosystem
Proficiency in CI/CD tools (Jenkins / GitLab) and IaC tools (Terraform, CloudFormation)
Strong scripting skills (Python/bash)
Strong understanding of Linux/UNIX-based systems
Experience with Kubernetes and containerization
Experience coding AWS Lambda functions
Familiarity with monitoring and logging tools
Excellent communication skills, with the ability to work collaboratively in a fast-paced environment
Problem-solving mindset, with attention to detail and a proactive approach to challenges
Ability to take on new challenges, with a can-do mindset
Experience in a high-availability SaaS environment.
This position is open to all candidates.
 
8205103