Jobs » Software » DevOps Engineer

19 hours ago
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are transforming the way products are made, distributed, sold, used, reused, and ultimately recycled by enabling a new level of visibility of products in the wholesale channel, cutting the capital tied up in inventories and the environmental footprint, while driving up sales through real-time insights into products wherever they are used. We invite you to help make sense of the world.
We are looking for a DevOps engineer to join the team.

Responsibilities
Own operational responsibility for the application and platform layers under a critical SLA.
Oversee and own overall production: deployments, maintenance, and enhancements.
Ensure high availability of the production SaaS platform, working with various teams.
Improve our systems, deployments, operations, and overall cloud activities.
Take responsibility for deployment, tools, troubleshooting, and performance tuning.
Manage incidents, focusing on root cause analysis, prevention measures, and knowledge transfer.
Develop and maintain processes, documentation, and automation.
Platform maintenance and testing initiatives.
Requirements:
BSc in Engineering or equivalent experience.
At least 5 years of experience running solutions in production, with a hands-on approach.
Experience with a cloud provider is a must (AWS preferred).
Experience managing Docker and Kubernetes (K8s) in production.
Strong troubleshooting, problem-solving skills and incident management principles.
UNIX/Linux experience and system administration knowledge.
Experience with major monitoring solutions (e.g., Prometheus, Grafana, Loki).
Write and maintain technical documents and standard operating procedures.
Strong verbal and written communications in English.
Knowledge of CI/CD concepts and tools (Jenkins, Bitbucket, GitLab).
Team player, get-stuff-done attitude with self-learning skills.
Experience working for a global company
It would be an advantage for you to have:
Development background
Terraform, Helm
Knowledge of security and networking of production environments
This position is open to all candidates.
 
8218535
29/05/2025
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for a highly skilled and versatile Senior DevOps Engineer to join our development team. As a Senior DevOps Engineer, you'll play a vital role in ensuring the reliability, scalability, and performance of our systems while collaborating with cross-functional teams to deliver high-quality software products.


Responsibilities:
Take an active part in all DevOps areas: develop and maintain monitoring and alerting infrastructure to ensure system reliability and performance
Oversee building and maintaining tools and procedures for monitoring, deployment, and alerting for our SaaS multi-tenant product family
Design and implement CI/CD processes for continuous integration and deployment of software applications
Develop and manage a containerized production environment using technologies such as Kubernetes, Docker, and Helm
Define and enforce DevOps standards, best practices, and procedures across the organization
Provide ad-hoc custom solutions to meet the technical needs of other teams
Act as a resource and mentor for engineers with less DevOps experience, providing guidance and support
Requirements:
5+ years of experience as a DevOps Engineer
Hands-on experience with a public cloud provider (such as GCP)
Hands-on experience with Kubernetes and Docker containers
Strong knowledge of IaC tools such as Terraform and Helm charts
Hands-on experience with CI/CD automation using GitHub Actions and ArgoCD
Experience with ELK Stack (Elasticsearch, Logstash, Kibana) for log analysis and monitoring
Experience in architecting and scaling in cloud environments
Extensive experience with Linux operating system and proficiency in bash scripting
This position is open to all candidates.
 
8199545
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time and Hybrid work
We're growing and looking to hire a Site Reliability Engineer (SRE) who embodies our core values: People First, Customer Obsession, Strive for Excellence, and Integrity.
We are looking for a skilled and motivated Site Reliability Engineer (SRE) to join our team and help ensure our production cloud environment's reliability, performance, and scalability. As an SRE, you will work at the intersection of software engineering and operations, taking ownership of system stability, incident response, automation, and continuous improvement of our infrastructure.
This role is ideal for engineers who thrive in dynamic environments, value reliability, and enjoy building resilient and scalable systems.
As an SRE, your impact will be:
Production Reliability: Ensure system uptime and performance by identifying and addressing potential issues before they affect end users.
Incident Response: Serve as part of the on-call rotation, rapidly diagnosing and resolving incidents, and conducting root cause analysis and postmortems.
Monitoring and Alerting: Build and maintain monitoring dashboards and alerting systems to detect and respond to anomalies in real time.
Automation and Tooling: Develop and maintain automation tools for deployments, scaling, and operational efficiency using Terraform, Ansible, Bash, or Python.
Infrastructure Maintenance: Perform regular maintenance and upgrades of production infrastructure to ensure security, stability, and performance.
Release Engineering: Support and optimize the rollout of new features and updates, minimizing risk and impact on production environments.
Staging Environment Management: Ensure staging environments accurately reflect production for robust testing and validation of changes.
Requirements:
Experience in SRE, DevOps, or production engineering roles
Strong skills in system troubleshooting, incident response, and root cause analysis
Proficiency with tools such as:
Jenkins, Terraform, Ansible, Git, GitHub
Bash, Python
AWS, ArgoCD, or similar CI/CD and cloud platforms
Familiarity with observability tools and practices (metrics, logging, tracing)
Ability to work effectively in cross-functional teams
Strong communication and documentation skills
Bachelor's degree in Computer Science, Information Technology, or a related field (preferred)
Familiarity with Agile development methodologies
This position is open to all candidates.
 
8198455
5 days ago
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
At UVeye, we're on a mission to redefine vehicle safety and reliability on a global scale. Founded in 2016, we have pioneered the world's first fully automated suite of vehicle inspection systems. At the heart of this innovation lies our advanced AI-driven technology, representing the pinnacle of machine learning, GenAI, and computer vision within the automotive sector. With close to $400M in funding and strategic partnerships with industry giants such as Amazon, General Motors, Volvo, and CarMax, UVeye stands at the forefront of automotive technological advancement. Our growing global team of over 200 employees is committed to creating a workplace that celebrates diversity and encourages teamwork. Our drive for innovation and pursuit of excellence are deeply embedded in our vibrant company culture, ensuring that each individual's efforts are recognized and valued as we unite to build a safer automotive world.
We are seeking a highly motivated and skilled Release Engineer to join our AIOps group. In this role, you'll play a critical part in bridging the gap between development and operations, ensuring the seamless qualification, deployment, and monitoring of our AI algorithms and infrastructure, and be responsible for the end-to-end operationalization of our core technology.
A day in the life and how you’ll make an impact:
* Manage the end-to-end release process of machine learning algorithms and infrastructure components, from qualification through deployment.
* Validate and test new algorithm releases to ensure they meet performance, stability, and compliance standards.
* Create and execute deployment plans across various environments (staging, production), ensuring minimal risk and downtime.
* Collaborate closely with AI researchers, MLOps, and software engineers to understand release requirements, share feedback, and resolve pre-release issues.
* Identify and drive automation opportunities within the release pipeline to improve efficiency, reliability, and traceability.
* Oversee updates to infrastructure components, ensuring compatibility and performance across systems.
* Monitor deployments, proactively identify issues related to model behavior or infrastructure anomalies, and drive resolution with relevant teams.
* Maintain clear and accurate release documentation, including version history, deployment notes, and incident reports.
Requirements:
* Bachelor's degree in Computer Science, Software Engineering, or industry equivalent.
* 2+ years of experience in QA & Automation
* Proficiency in scripting languages (e.g., Python, Bash).
* Experience with containerization technologies (e.g., Docker, Kubernetes).
* Familiarity with CI/CD pipelines (e.g., GitLab CI/CD, Jenkins).
* Experience with cloud platforms (e.g., AWS, GCP).
* Familiarity with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).
* Excellent problem-solving skills and attention to detail.
* Strong communication and collaboration skills, with the ability to work effectively with cross-functional teams.
Bonus if you have:
* Strong understanding of the machine learning lifecycle, from experimentation to deployment and monitoring.
* Experience with specific MLOps platforms or tools.
* Experience in a fast-paced startup environment.

Why UVeye:
* Pioneer Advanced Solutions: Harness cutting-edge technologies in AI, machine learning, and computer vision to revolutionize vehicle inspections.
* Drive Global Impact: Your innovations will play a crucial role in enhancing automotive safety and reliability, impacting lives and businesses on an international scale.
* Career Growth Opportunities: Participate in a journey of rapid development, surrounded by groundbreaking advancements and strategic industry partnerships.
This position is open to all candidates.
 
8214831
20/05/2025
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for a Senior DevOps Engineer to join our Cloud Network Security group.

Key Responsibilities
As a DevOps Engineer at Check Point, you will design, implement, and manage CI/CD pipelines, collaborate with cross-functional teams, and ensure the high availability and reliability of our cloud-based services and solutions.

Responsibilities:

Design, implement, and manage CI/CD pipelines to automate the deployment of SaaS
Collaborate with development, QA, and operations teams to ensure smooth and reliable software releases.
Monitor system performance and troubleshoot issues to ensure high availability and reliability of our services.
Implement and manage infrastructure as code (IaC) using tools like Terraform, CloudFormation and ARM.
Optimize system performance, scalability, and security.
Develop and maintain documentation for infrastructure and deployment processes.
Requirements:
2-4 years of experience in DevOps or a related role, working with distributed systems and SaaS applications.
Proficiency with CI/CD tools such as Gerrit, GitLab CI, GitHub
Experience with cloud providers such as AWS, Azure, and GCP.
Solid foundation in cloud account user management and cost optimization (FinOps principles).
Solid understanding of networking, security, and system administration.
Familiarity with logging and monitoring stacks (e.g., Elasticsearch, CloudWatch, Grafana, Prometheus).
Proficiency in scripting (Python, Bash) for automation and tooling.
Solid grasp of IaC & GitOps principles and best practices (Terraform, Helm, ArgoCD, Crossplane).
Knowledge of agile methodologies and practices
Strong knowledge of distributed systems, microservices, and orchestration technologies
Expertise in containerization and orchestration tools like Docker and Kubernetes
This position is open to all candidates.
 
8185035
01/06/2025
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for a Site Reliability Engineer (SRE) to join our Engineering team: someone who has a passion for observability, monitoring, automation, and high-availability systems, and a desire to solve complex technological challenges with a proactive approach to continuous improvement.

We use an interesting and mixed technology stack: Kubernetes, Terraform, CI/CD pipelines, Datadog, Prometheus, and cloud-native architectures.

In this position, you will use your expertise in building and scaling SRE operations, and will design, implement, and operate a world-class reliability strategy.


Key Responsibilities
Develop and maintain our monitoring, alerting, and logging systems, ensuring high visibility into production environments.
Implement automation to improve system reliability, scalability, and efficiency.
Troubleshoot and resolve production incidents, leading root cause analyses and implementing permanent fixes.
Collaborate with software engineers and DevOps teams to enhance application performance and resilience.
Continuously improve operational processes, focusing on reducing toil and improving reliability.
Requirements:
3+ years of experience as an SRE, DevOps Engineer, or in a similar role.
Hands-on experience with monitoring and observability tools like Datadog, Prometheus, and Grafana.
Strong understanding of Linux systems, networking, and cloud-native architectures.
Experience with Kubernetes, Terraform, and CI/CD pipelines.
A problem solver, capable of finding creative solutions and getting things done.
Fluent in incident management, RCA processes, and operational best practices.
This position is open to all candidates.
 
8200136
05/06/2025
Location: Tel Aviv-Yafo
Job Type: Full Time
We empower organizations to run in the cloud by aligning operations and security around access management. Our platform provides companies with Just-In-Time and Just Enough access across their hybrid environments, reducing access risk while improving productivity.
The company has offices in New York and Tel Aviv, supports dozens of customers across the US and the world, including large Fortune 500 companies, and was honored in Gartner's Magic Quadrant for Privileged Access Management.

What are we looking for?
We are seeking a highly skilled and motivated Software Engineering Team Leader to build and lead our new Analysis & Discovery team. In this pivotal role, you will be responsible for the technical vision, execution, and management of the team, driving the development of modules that provide actionable intelligence in the realm of identity management, permissions, and digital identities.

About the Analysis & Discovery Team:
The Analysis & Discovery team plays a crucial role in empowering administrators with deep insights into their organization's identity and permission landscape. Our mission is to develop cutting-edge modules that allow administrators to discover, analyze, and act upon how permissions and entitlements are granted and utilized across their systems. We are at the forefront of tackling complex challenges in identity management, including the new developing field of non-human (digital) identities.

Responsibilities:
Lead, mentor, and grow a team of talented software engineers in a fast-paced startup environment.
Spearhead the design, development, and maintenance of modules focused on high-scale data ingestion, processing, and inferring insights.
Pioneer the development of new modules to manage and secure non-human identities.
Lead and apply Agile/SCRUM methodologies, including facilitating SCRUM ceremonies and fostering a culture of continuous improvement.
Perform deep technical dives to resolve complex issues and ensure high-quality deliverables.
Collaborate closely with product management, security researchers, and other engineering teams to define roadmaps and deliver impactful solutions.
Foster a culture of innovation, security-first, technical excellence, and strong teamwork.
Requirements:
A proactive "startup mindset" with a passion for building innovative solutions from the ground up.
At least 6 years of experience in software development, with a strong background in backend systems.
At least 3 years of experience in a team leadership or engineering management role.
High familiarity with the cloud identity management world, including entitlements and security models.
Strong understanding of authentication, authorization & zero-trust principles & best practices
Experience with data analysis, log processing, or building data insight pipelines.
Proven expertise and hands-on experience with Kotlin and Go for backend development.
Experience working with cloud provider APIs and services (e.g., AWS, Azure, GCP).
Demonstrable experience in managing software development lifecycle using SCRUM methodologies in a startup environment.
Exceptional problem-solving skills, with an ability to dissect complex problems and devise elegant solutions.
Excellent communication, leadership, and interpersonal skills, with a "team player" mindset.
High proficiency in English
Advantages:
Experience or familiarity with frontend development using React and TypeScript for building user interfaces that present complex data.
Knowledge of security best practices and threat modeling.
Familiarity with non-human identities, service accounts, and machine-to-machine authentication.
Experience with Docker, Kubernetes, Temporal, RabbitMQ, PostgreSQL
This position is open to all candidates.
 
8205418
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are seeking a DevOps Engineer to join our dynamic Dev team in Midtown TLV. In this role, you'll develop AI tools to empower our Dev team, ensuring every day brings new challenges and rewards. You'll collaborate closely on our production infrastructure, build Continuous Integration and Continuous Deployment (CI/CD) pipelines, and architect complex solutions.
Develop AI tools to enhance and empower the Dev team.
Design, build, and maintain Continuous Integration and Continuous Deployment (CI/CD) pipelines.
Architect complex solutions to support development and deployment processes.
Develop and maintain HiBob's cutting-edge infrastructure.
Manage the full development cycle from source code to production.
Monitor key performance indicators (KPIs) to measure success and optimize performance.
Contribute to technological advancements and drive operational efficiency.
Oversee and improve monitoring, disaster recovery (DR), and security systems.
Requirements:
5+ years of experience as a DevOps Engineer
Coding experience in Python and/or Go
Experience with LLM/AI development (strong advantage)
Hands-on experience with AWS or similar cloud services
Experience in managing Kubernetes environments with Helm in production
Ability to lead large-scale cross-team projects.
Enthusiastic about solving complex problems and building enabling tools for developers
Skilled in designing and building GitOps CI/CD with tools like GitHub Actions and ArgoCD
Knowledge of IaC using tools like Crossplane and Terraform (required)
Familiarity with Datadog or an equivalent monitoring and observability platform
This position is open to all candidates.
 
8199329
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for a Software Developer.
As a Software Developer, you will play a key role in designing and implementing infrastructure solutions to support our development and deployment processes. This role requires a versatile individual with strong technical skills, an out-of-the-box thinker with a passion for solving challenging problems in cloud-based environments.
Key Responsibilities:
Develop, maintain, and optimize Python-based services and tools for cloud platforms (AWS preferred).
Design and implement scalable and secure infrastructure solutions, leveraging modern cloud technologies.
Build and maintain APIs, microservices, and server-side applications using Python frameworks
Collaborate with cross-functional teams to integrate CI/CD pipelines and improve deployment efficiency.
Troubleshoot and resolve technical bottlenecks in cloud-based systems.
Contribute to system architecture design and ensure best practices are followed.
Stay up-to-date with emerging technologies and trends to continuously improve development and deployment processes.
Requirements:
5+ years of professional experience as a backend developer, with strong skills in Python
Solid experience in developing services on cloud platforms, particularly AWS (experience with Azure or GCP is a plus).
Expertise in Python frameworks and Node.js frameworks (e.g., Express).
Strong understanding of software engineering principles, including system design and data structures.
Experience building RESTful APIs and microservices architectures.
Familiarity with relational and NoSQL databases (e.g., MySQL, MongoDB).
Strong debugging, performance optimization, and troubleshooting skills.
Strong communication and collaboration skills, with the ability to work effectively in a team.
Proactive, detail-oriented, and self-motivated with the ability to thrive in a fast-paced environment.
Nice-to-Have:
Experience with .js
Knowledge of serverless architecture (e.g., AWS Lambda).
Experience with DevOps tools and practices, including:
Infrastructure as Code (IaC) tools like Terraform.
Containerization and orchestration tools like Docker and Kubernetes (K8s).
CI/CD tools such as Jenkins or similar.
Knowledge of security best practices for cloud-based environments.
Previous experience working in an infrastructure or architecture team.
This position is open to all candidates.
 
8184620
01/06/2025
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are seeking an experienced Senior DevOps Engineer to join our high-performing SaaS Security team.

Born from a successful cybersecurity startup acquisition, we operate as a dynamic team within a larger organization, driven by innovation and a passion for building cutting-edge security solutions.

As our lead DevOps engineer, you will build and maintain our large-scale, automated, SaaS infrastructure.

Our technology stack includes GCP, K8s, Prometheus, Terraform, PostgreSQL, Kafka, Temporal and Hasura. You'll work collaboratively with software engineering teams and tech leads, help automate and streamline our operations and processes, and troubleshoot issues in our development, test and production environments.

Key Responsibilities
Design and develop our SaaS infrastructure, CI/CD pipelines, permissions model and deployment strategy.
Improve overall security posture, manage permissions and improve work processes.
Assist in introducing new technologies into group infrastructure.
Be a focal point for the stability of our infrastructure.
Requirements:
3+ years of experience in a DevOps role
Experience with Terraform or other IaC systems
Experience with Kubernetes/Helm.
Understanding of CI/CD pipelines and build tools (Git, Bazel, CircleCI, Docker, Artifactory)
A working understanding of coding and scripting (Bash, Python)
Experience with GCP or other cloud providers
Basic experience with databases (PostgreSQL, BigQuery) - an advantage
Experience with K8s monitoring applications (Graphite, Grafana, Prometheus, Sensu, etc.) - an advantage
Experience with implementing GCP SOC2 / security standards - an advantage
This position is open to all candidates.
 
8200070
Location: Tel Aviv-Yafo
Job Type: Full Time
Required: Site Reliability Engineer - Infra
Realize your potential by joining the leading performance-driven advertising company!
As a Site Reliability Engineer - Infra on our Infrastructure team at the TLV office, you will play a key role in ensuring the reliability, scalability, and performance of our critical systems. You will be responsible for managing and improving our core infrastructure, with a focus on automation, monitoring, and incident response. You will work with a wide range of technologies, including Kubernetes, monitoring and observability tools, configuration management systems, and core networking services.
How you'll make an impact:
As a Site Reliability Engineer, you will:
Ensure the reliability, availability, and performance of our infrastructure services.
Manage and maintain our Kubernetes infrastructure, including KubeVirt.
Design, implement, and maintain our monitoring and observability stack (SensuGo, VictoriaMetrics, Prometheus, ELK).
Automate infrastructure provisioning, configuration, and deployment processes using Puppet and Ansible.
Manage and maintain core services such as DNS and networking.
Troubleshoot and resolve complex infrastructure issues in a timely and efficient manner.
Participate in on-call rotations and incident response.
Develop and maintain infrastructure-as-code (IaC).
Identify and implement proactive measures to prevent incidents and improve system reliability.
Collaborate with development teams to ensure smooth and reliable deployments.
Contribute to the design and implementation of new infrastructure solutions.
Drive improvements in system architecture, processes, and tools.
Mentor and coach other team members.
Requirements:
To thrive in this role, you'll need:
5+ years of experience in a Site Reliability Engineering, Systems Engineering, or similar role.
Deep understanding of Site Reliability Engineering principles and practices.
Extensive experience with Kubernetes, including deployment, management, and troubleshooting.
Strong experience with monitoring and observability tools such as SensuGo, Zabbix, VictoriaMetrics, Prometheus, and ELK.
Proficiency in configuration management tools such as Puppet and Ansible.
Solid understanding of Linux internals and networking.
Experience with managing and maintaining core services such as DNS and networking.
Strong programming skills in Python and/or Go.
Experience with both on-premises and cloud environments.
Experience with KubeVirt.
Excellent troubleshooting and problem-solving skills.
Strong communication and collaboration skills.
Ability to work in a fast-paced, dynamic environment.
Ability to participate in on-call rotations including weekends.
Preferred Qualifications:
Experience with large-scale, distributed systems.
Experience with other cloud providers (e.g., AWS, Azure, GCP).
Contributions to open-source projects.
This position is open to all candidates.
 
8205377