09/04/2026
This job has been marked by the employer as no longer relevant.
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We're looking for a DevOps Architect to help shape the infrastructure strategy behind our Revenue AI platform. This role sits at the center of our engineering ecosystem, driving architectural direction, improving operational excellence, and enabling teams to scale with confidence. You'll work across engineering groups to identify systemic gaps, define scalable standards, and accelerate execution without becoming a delivery bottleneck.
You'll Own:
Infrastructure Strategy & Standards: Define and evolve our cloud and infrastructure architecture across Kubernetes, networking, observability, security, and data platforms. Establish clear standards and scalable best practices that enable teams to move faster with consistency and reliability.
Technical Debt & System Health Visibility: Continuously identify, prioritize, and drive resolution of cross-team technical debt, architectural gaps, and operational inefficiencies. Create organizational visibility around the most critical infrastructure challenges and opportunities.
Cross-Org Technical Leadership: Partner closely with engineering leaders and teams to influence architectural decisions, challenge assumptions, and ensure solutions are scalable, maintainable, and secure. Lead through expertise and influence, not direct ownership.
Developer Enablement & Engineering Velocity: Provide frameworks, tooling direction, and lightweight prototypes or POCs that empower teams to execute independently with higher quality and efficiency.
Critical Infrastructure Initiatives: Drive major cross-functional initiatives around reliability, scalability, security, observability, and cost optimization from identification through execution and measurable impact.
You'll Solve:
Scaling Complexity: How do we maintain simplicity, reliability, and operational clarity while supporting rapid growth and increasingly complex distributed systems?
Cross-Team Alignment: How do we create architectural consistency across independent engineering groups without slowing down innovation and execution?
Operational Excellence at Scale: How do we proactively surface and resolve systemic weaknesses before they become production issues?
Balancing Speed & Sustainability: How do we enable fast delivery today while protecting the long-term health and scalability of the platform?
AI Infrastructure Evolution: How do we build infrastructure that supports modern AI/ML workloads, GPUs, large-scale data pipelines, and future platform requirements?
You'll Impact:
Platform Reliability & Scalability: Your work will directly improve the resilience, scalability, and operational maturity of our infrastructure platform.
Engineering Efficiency: By creating better standards, tooling, and architectural guidance, you'll act as a force multiplier for engineering teams across the company.
Long-Term System Health: You'll help reduce operational friction, minimize technical debt, and ensure our infrastructure can support long-term business growth.
Execution Quality Across Teams: Your influence will elevate engineering quality, decision-making, and operational discipline throughout the organization.
Requirements:
A Deep Technical Expert: Someone with 8+ years of hands-on experience with AWS and cloud-native infrastructure at scale, including strong Kubernetes expertise and distributed systems knowledge.
An Infrastructure Architect: Someone with deep experience in Infrastructure as Code and GitOps methodologies using tools like Terraform, Crossplane, or Pulumi.
A Pragmatic Builder: A strong engineer with programming experience in Python or Go who can build tools, prototypes, and automation when needed.
A Systems Thinker: Someone who can identify patterns, uncover systemic issues, and drive improvements across complex technical environments.
An Influential Technical Leader: Someone with proven experience leading cross-team initiatives and driving alignment without direct authority.
This position is open to all candidates.
 
8665155
24/05/2026
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for an experienced SRE Team Lead to drive the reliability, observability, and automation practices across our private cloud infrastructure and operations. In this role, you will lead a team of site reliability engineers, own the engineering roadmap for monitoring and automation, and act as a key liaison between development, operations, and platform teams. You bring at least 3-4 years of hands-on people management experience and a deep technical background in SRE or DevOps disciplines.



What will you do?

Leadership & Team Management

Lead, mentor, and grow a team of SREs, providing technical direction, career development guidance, and day-to-day management.

Own the team roadmap for reliability, observability, and automation initiatives - prioritizing work, removing blockers, and driving delivery.

Conduct regular 1:1s, performance reviews, and hiring processes to build and sustain a high-performing team.

Foster a culture of operational excellence, blameless post-mortems, and continuous improvement.

Act as an escalation point for complex incidents and reliability issues, leading post-incident reviews and ensuring follow-through on action items.


Automation & Infrastructure

Design, develop, and maintain automation tools to support infrastructure and operations teams at scale.

Manage pipelines and infrastructure workflows using Jenkins, Ansible, Python, and Bash.

Drive the adoption of infrastructure-as-code practices across the organization.

Collaborate with system engineers to improve scalability, performance, and fault tolerance of critical systems.


Monitoring & Observability

Build and extend monitoring and alerting systems using Grafana, the ELK (Elastic) stack, Zabbix, and custom scripts.

Implement and enforce observability best practices to ensure full visibility into systems, applications, and infrastructure.

Define and track SLIs, SLOs, and error budgets across key services.

Partner with development teams to embed observability earlier in the software development lifecycle.


Database & Platform Support

Support monitoring and infrastructure integration for databases including MongoDB and PostgreSQL.

Maintain documentation and champion knowledge sharing around automation, monitoring, and reliability practices.
Requirements:
What you need:

Experience & Leadership

3-4+ years of experience in a people management or team lead capacity within SRE, DevOps, or infrastructure engineering.

5-8+ years of overall experience in SRE, DevOps, or infrastructure automation roles.

Proven track record of building, coaching, and retaining high-performing engineering teams.

Experience owning an engineering roadmap and driving cross-functional reliability initiatives.



Technical Skills

Strong scripting skills in Python and Bash; comfortable building and maintaining production-grade automation.

Hands-on experience with infrastructure automation tools, particularly Ansible.

Solid experience with monitoring and observability platforms - ELK stack, Grafana, and Zabbix.

Good understanding of CI/CD pipelines and related tooling, including Jenkins.

Familiarity with managing and monitoring MongoDB and PostgreSQL in a production environment.

Comfortable working in Linux-based environments.

Excellent problem-solving skills and strong written and verbal communication.



Ability to support the following:

Experience with cloud providers - AWS, GCP, or Azure.

Exposure to containerization technologies such as Docker and Kubernetes.

Familiarity with infrastructure provisioning using Terraform.

Experience introducing SRE practices (SLOs, error budgets, chaos engineering) at an organizational level.

Exposure to and experience with migrating or building AI tools to improve processes.
This position is open to all candidates.
 
8662300
01/06/2026
Location: Tel Aviv-Yafo
Job Type: Full Time
As a Senior DevOps Engineer within the XSPM team, you will be a critical, go-to technical expert responsible for the health, performance, and evolution of our database and infrastructure systems. When production databases degrade or behave unexpectedly, you are the person who dives deep, investigating root causes hands-on, understanding the underlying mechanics of the problem, and designing lasting solutions. Your mastery of database systems makes you the authority the team relies on to diagnose complex performance issues, architect better data solutions, and ensure our infrastructure scales with confidence.

Beyond databases, you will drive our DevOps practices end-to-end - CI/CD pipelines, infrastructure automation, and operational reliability across the XSPM platform. This is a high-impact, highly visible role at the intersection of database engineering and DevOps, where your expertise directly shapes how the team delivers and operates at scale.

We're a highly collaborative, friendly, inclusive and diverse group that prizes collaboration over competition. We provide opportunities to learn new skills, mentor fellow engineers, and contribute to the direction of both the team and the products for which we're responsible. We work in a distributed, high-trust environment where you manage your own time and have the flexibility to balance your work and personal life.

What You Will Do:

Serve as the team's database expert, the first person to investigate, diagnose, and resolve complex performance problems across our production database systems (MongoDB, OpenSearch, PostgreSQL, Cassandra).

Perform deep-dive root cause analysis on database performance issues, understanding query execution internals, resource consumption patterns, cluster behavior, and system-level interactions to identify the real source of problems, not just symptoms.

Design and propose better database architectures and solutions, recommending when to re-architect data models, migrate workloads, introduce new technologies, or redesign how services interact with their data layer.

* You will put in every effort within the team to ensure the data architecture is well designed.

Own capacity planning, scaling strategies, and high-availability designs for database clusters, ensuring systems are built to handle the team's growth trajectory.

Act as the bridge between development and infrastructure, advising engineers on how their application patterns impact database performance and guiding them toward sustainable solutions.

Build and maintain CI/CD pipelines, infrastructure-as-code (Terraform, Helm, Kubernetes manifests), and automated deployment workflows for the XSPM team's services.

Design and manage observability stacks, dashboards, alerting rules, and SLOs, to maintain best-in-class availability for critical data pipelines and services.

Drive infrastructure automation to reduce operational toil, including automated scaling, self-healing systems, and configuration management.

Participate in on-call rotations, incident response, and post-incident reviews, driving root-cause analysis and long-term reliability improvements.

Evaluate and adopt new database technologies and infrastructure tooling that align with the team's evolving data architecture needs.
Requirements:
7+ years of experience in DevOps, SRE, DBA, or infrastructure engineering, with significant hands-on responsibility for production database systems at scale.

Expert-level knowledge of a common database such as MongoDB, OpenSearch, or PostgreSQL, including a deep understanding of its internals, performance characteristics, replication, and sharding, and the ability to diagnose and solve complex issues from first principles.
This position is open to all candidates.
 
8675475
13/05/2026
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
At UVeye, we're on a mission to redefine vehicle safety and reliability on a global scale. Founded in 2016, we have pioneered the world's first fully automated suite of vehicle inspection systems. At the heart of this innovation lies our advanced AI-centric technology, representing the pinnacle of computer vision, machine learning, and generative AI within the automotive sector. With over $380M in funding and strategic partnerships with industry giants such as Toyota, Amazon, General Motors, Volvo, and Hertz, our technology is utilized in manufacturing plants, dealerships, wholesale auctions, delivery fleets, seaports, and more. Our growing global team of over 200 employees is committed to creating a workplace that celebrates diversity, encourages teamwork, and strives for excellence.
We are looking for a driven, systems-minded Release Engineer to join our AI-Ops team. In this role, you will be the execution layer of our delivery pipeline—the technical gatekeeper who owns the safe, predictable deployment of software and AI models to global edge and cloud systems. You will balance high-speed deployment velocity with rock-solid operational stability. But you won't just be deploying code; you'll be acting as an internal project manager, driving our organizational roadmap by building Agentic AI tools and automating processes to scale our delivery capabilities continuously.
A day in the life and how you’ll make an impact:
* Act as the technical gatekeeper, validating and transitioning versions through strict release gates. Enforce rigorous governance and ensure strict "Definition of Done" criteria are met.
* Lead risk-mitigated rollouts across diverse global hardware environments.
* Monitor real-time deployment performance, Quality of Service (QoS), and algorithmic accuracy, making decisive, crisis-resilient calls to proceed, pause, or rollback to prevent regressions.
* Define and execute comprehensive test plans that verify cross-team dependencies. Validate that new versions meet detection accuracy requirements without degrading infrastructure.
* Triage complex production failures. Look beyond immediate issues to identify root causes using system metrics, logs, and container states, delivering actionable evidence to R&D.
* Build and integrate Agentic AI and LLM-based tools to accelerate log analysis, risk assessment, and deployment troubleshooting.
* Architect automated workflows to eliminate manual overhead and enhance system observability with robust monitors and dashboards.
Requirements:
* 2+ years of experience in Release Engineering, DevOps, QA, or a similar operations-centric role.
* Strong systems-level troubleshooting skills with the ability to analyze data, system metrics, logs, and container states.
* Demonstrated ability to maintain decisive control and make smart risk-management decisions during live, high-stakes deployments.
* Experience enforcing data integrity and process governance using Jira or similar issue-tracking tools.
Bonus if you have:
* Experience building or integrating AI, LLMs, or Agentic workflows into operational tooling.
* Familiarity with deploying software to both cloud environments and distributed edge hardware.
* Experience with performance benchmarking (throughput, bandwidth, algorithmic accuracy).
* Prior experience acting as a project manager for internal engineering initiatives or tools
Why UVeye:
* Pioneer Advanced Solutions: Harness cutting-edge technologies in AI, machine learning, and computer vision to revolutionize vehicle inspections.
* Drive Global Impact: Your innovations will play a crucial role in enhancing automotive safety and reliability, impacting lives and businesses on an international scale.
* Career Growth Opportunities: Participate in a journey of rapid development, surrounded by groundbreaking advancements and strategic industry partnerships.
This position is open to all candidates.
 
8649166
24/05/2026
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for an experienced SRE Engineer to drive the reliability, observability, and automation practices across our private cloud infrastructure and operations. In this role, you will be part of a team of site reliability engineers, own the engineering roadmap for monitoring and automation, and act as a key liaison between development, operations, and platform teams. You bring at least 4 years of hands-on people management experience and a deep technical background in SRE or DevOps disciplines.

What will you do?

Automation & Infrastructure
- Design, develop, and maintain automation tools to support infrastructure and operations teams at scale.
- Manage pipelines and infrastructure workflows using Jenkins, Ansible, Python, and Bash.
- Drive the adoption of infrastructure-as-code practices across the organization.
- Collaborate with system engineers to improve scalability, performance, and fault tolerance of critical systems.

Monitoring & Observability
- Build and extend monitoring and alerting systems using Grafana, the ELK (Elastic) stack, Zabbix, and custom scripts.
- Implement and enforce observability best practices to ensure full visibility into systems, applications, and infrastructure.
- Define and track SLIs, SLOs, and error budgets across key services.
- Partner with development teams to embed observability earlier in the software development lifecycle.

Database & Platform Support
- Support monitoring and infrastructure integration for databases including MongoDB and PostgreSQL.
- Maintain documentation and champion knowledge sharing around automation, monitoring, and reliability practices.
Requirements:
What you need:

4+ years of overall experience in SRE, DevOps, or infrastructure automation roles.

Strong scripting skills in Python and Bash; comfortable building and maintaining production-grade automation.

Hands-on experience with infrastructure automation tools, particularly Ansible.

Solid experience with monitoring and observability platforms - ELK stack, Grafana, and Zabbix.

Good understanding of CI/CD pipelines and related tooling, including Jenkins.

Familiarity with managing and monitoring MongoDB and PostgreSQL in a production environment.

Comfortable working in Linux-based environments.

Excellent problem-solving skills and strong written and verbal communication.


Ability to support the following:
Experience with cloud providers - AWS, GCP, or Azure.
Exposure to containerization technologies such as Docker and Kubernetes.
Familiarity with infrastructure provisioning using Terraform.
Experience introducing SRE practices (SLOs, error budgets, chaos engineering) at an organizational level.
Exposure to and experience with migrating or building AI tools to improve processes.
This position is open to all candidates.
 
8662378
20/05/2026
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for a seasoned, execution-driven DevOps Team Lead. In this role you will drive the evolution of our infrastructure and cloud operations, empowering our entire R&D organization. You will fully lead the company's DevOps function, including strategic infrastructure management within a multi-cloud environment, and leverage cutting-edge AI technologies to accelerate important DevOps flows. As a technical lighthouse, you will set the standard for engineering excellence, cloud-native architecture, and operational scale. You will work in a global environment with highly skilled engineering teams reporting to the VP R&D.



What You'll Do

Build and lead a strong DevOps team fostering a culture of ownership, collaboration, and technical excellence.
Take end-to-end leadership and project management of large-scale cloud infrastructures, driving technical decisions, execution, and outcomes.
Continuously modernize DevOps practices (CI/CD, GitOps, monitoring, etc.) to drive efficiency, agility, and performance across the organization.
Use AI-powered tools and agentic workflows to improve R&D efficiency.
Design and maintain cloud environments using Infrastructure as Code.
Our stack - Mainly GCP and AWS, Kubernetes, Terraform, Terragrunt, Node.js, Golang and more
Requirements:
Proven track record of managing DevOps team(s) in hyper-growth tech companies.
Hands-on experience and expertise in AWS or GCP production environments at scale.
Hands-on experience and expertise in Kubernetes and in managing large-scale Prod+Dev environments.
Experience with FinOps and cloud cost optimization.
Experience in backend microservices / API development.
A proactive Can-Do attitude and a drive to build something new.
This position is open to all candidates.
 
8659707
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
Required DevOps Engineer
About the role:
Our DevOps team operates the infrastructure that powers our AI and Computer Vision platform across construction sites in 15+ countries. From data pipelines and ML workloads to backend services - you'll work with a diverse, modern, Kubernetes-based stack and have real influence on how we build, deploy, and operate.
What you'll do:
Own Multi-Cloud Infrastructure: Work alongside the team to design, scale, and operate our high-scale, multi-region production infrastructure across AWS and GCP, powering construction sites globally.
Drive Kubernetes at Scale: Manage and evolve our Kubernetes platform on EKS, leveraging GitOps practices with ArgoCD and Helm to enable safe, fast, and reliable deployments.
Build Robust CI/CD: Design and maintain CI/CD pipelines that empower dozens of engineers to ship confidently - with automation, testing, and progressive delivery built in.
Tackle Diverse Infrastructure Challenges: Work hands-on with a wide variety of workloads - from heavy data processing and Computer Vision pipelines to backend services and ML inference - each with unique scaling, performance, and reliability requirements.
Ensure Reliability & Observability: Build and maintain world-class observability (metrics, logs, tracing, alerting) so that issues are caught early and resolved fast. Performance, reliability, and scalability are at the core of what you do.
Security & Cost: Partner with the team to strengthen our security posture, identity and access management, compliance, and cloud cost optimization across both clouds.
Ownership from 0 to 1: You will have real influence over our architecture and tooling. We want engineers who care about shaping what we build and how we build it, ensuring performance, security, and observability are baked in from day one.
Requirements:
A seasoned DevOps / Infrastructure engineer (5+ years) with strong hands-on experience in production cloud environments.
Proven expertise operating large-scale, distributed systems - with deep understanding of Kubernetes, networking, and cloud-native architecture.
Strong experience with multi-cloud environments (AWS and/or GCP), Infrastructure-as-Code (Terraform), and GitOps workflows (ArgoCD, Flux, or similar).
Hands-on experience with CI/CD systems (Jenkins, GitHub Actions, etc.).
Solid scripting and automation skills (Python, Bash, or Go).
Proven track record of being a collaborative team player who partners closely with developers, ML engineers, and cross-functional stakeholders across the organization.
Experience with observability stacks (Prometheus, Grafana, OpenTelemetry, Logz.io, or similar).
Experience with databases (relational and/or NoSQL) - including operational aspects like backups, migrations, and performance tuning.
AI-Native Engineering: You are an AI-native engineer who leverages LLMs and agentic tools (like Cursor, Copilot, or Claude) not just for command completion, but as a core operational partner - automating diagnostics, runbooks, and infrastructure workflows so you can focus on the critical things.
This position is open to all candidates.
 
8670484
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for a Senior DevOps Engineer to join our R&D team in developing the next rising product in the health tech landscape. If you are looking for a challenging, influential position and are passionate about making an impact, this might be the role for you.

As a Senior DevOps Engineer, you'll play a key role in the design, development, testing, deployment, and monitoring of our infrastructure and products. In this position, you'll make significant contributions to our observability stack, helping build and maintain robust systems for logs, metrics, traces, and alerting.

Our ideal candidate is passionate about DevOps and observability, has strong communication skills, and thrives on constant improvement for both technology and processes. If you enjoy working on multiple projects in parallel and are a proactive team player, you'll fit right in.

This is a unique opportunity to join the core team of a fast-growing startup, where your contributions will have a direct impact on our product and success.

Responsibilities

Support and collaborate with cross-functional engineering teams using cutting-edge technologies.
Contribute to the design, implementation, and maintenance of monitoring, logging, and alerting systems (e.g., Prometheus, Grafana, Loki)
Secure, scale, and manage our cloud environments (AWS and GCP)
Design and implement automation solutions for both development and production
Manage and improve our CI/CD pipelines for fast and safe delivery
Lead best practices in infrastructure, observability, configuration management, and system hardening
Continuously assess and improve existing infrastructure in line with industry standards
Requirements:
BSc in Computer Science, Engineering, or equivalent experience
5+ years of experience as a DevOps Engineer or similar software engineering role
Proven experience with Docker and Kubernetes (EKS preferred)
Hands-on experience with monitoring and observability tools, including Prometheus, Grafana, Datadog, or similar.
Expertise in Terraform for AWS infrastructure-as-code deployments
Strong collaboration and interpersonal communication skills
Excellent analytical thinking and problem-solving mindset
Proficiency with relational databases
Solid knowledge of Python and Bash scripting
Experience with test automation - an advantage
This position is open to all candidates.
 
8671069
Location: Tel Aviv-Yafo
Job Type: Full Time and Hybrid work
Required AI Infrastructure & Reliability Engineer
What this role is really about
You'll join a 3-person platform team within our Business Technology group, owning the internal infrastructure that our AI platform and its users depend on. This isn't a product engineering role, and it isn't ticket work or babysitting pipelines someone else built. You're building and operating the internal foundation that the company runs on. The work covers the full stack of platform engineering: core cloud infrastructure (AWS, Kubernetes, IaC), CI/CD pipelines, AI-driven infrastructure components, and the SRE and observability practice that keeps it all honest: metrics, alerting, incident response, and reliability standards. As our AI capabilities grow, so does the complexity underneath them, and staying ahead of that is central to the role. If you treat infrastructure as a product (reusable, automated, observable, and built to last), this is your kind of role.
Job responsibilities
DevOps & AI-Driven Infrastructure - own CI/CD, deployment processes, and release reliability. Build and operate cloud infrastructure that is automated, intelligent, and continuously self-improving - not just managed.
Design and build our Terraform repository and IaC pipeline from scratch, with AI-assisted generation, drift detection, and policy enforcement built in.
Build AI-driven GitHub Actions pipelines: automated code review, risk assessment, and intelligent deployment decisions.
Manage Kubernetes workloads across AWS accounts: zero downtime, fully automated, nothing left behind.
Embed AI into the operational layer: proactive drift detection, automated remediation, and intelligent scaling toward a self-healing runtime.
Reliability & SRE - improve uptime, resilience, and incident response.
Define and enforce SLOs/SLIs, error budgets, and on-call practices.
Lead incident response, postmortems, and systemic reliability improvements.
Own AI-specific reliability: model latency SLOs, token quota monitoring, rate limit handling, fallback and retry strategies, and cost-per-request alerting.
Observability & Telemetry - increase visibility, reduce noise, improve troubleshooting.
Establish and continuously evolve the observability stack: metrics, logs, distributed tracing, and alerting tuned for both application and AI workloads.
AI / LLM Operations - bring AI systems to production and operate them at scale, with a focus on reliability, performance, and trust.
Own the AI infrastructure layer: rate limits, quota management, latency SLOs, and fallback strategies (retries, circuit breakers).
Operate LLM APIs in production with resilience and cost attribution per team/model.
Requirements:
2-4 years of hands-on DevOps, SRE, or infrastructure engineering experience in production SaaS environments.
Strong AWS experience: multi-account architecture, cross-account IAM, serverless and event-driven services (Lambda, SQS, SNS, EventBridge), and EKS cluster management.
Proven Kubernetes experience in production, including cross-account migrations and stateful workload management.
Proficiency with Terraform - repository structure design, module architecture, and CI/CD pipeline implementation.
Hands-on experience building and maintaining GitHub Actions pipelines for end-to-end CI/CD workflows.
Working Python proficiency for scripting, internal tooling, and workflow automation.
Practical experience implementing observability stacks from scratch: metrics, logging, distributed tracing, and alerting.
Experience owning reliability practices: SLOs, incident response, and postmortem culture.
Nice to have:
Hands-on experience operating LLM APIs in production: rate-limit and quota management, cost attribution per team/model, latency monitoring, and resilience patterns (retries, fallbacks, circuit breakers).
FinOps experience across cloud, AI, and observability spend.
Experience introducing self-healing or auto-remediation patterns in production.
This position is open to all candidates.
 
3 days ago
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for an IT Administration & DevOps professional to join our great, fast-growing team. You'll be in charge of setting up, managing, and troubleshooting our internal and production systems and providing service to all employees in the company. You will also be responsible for the design, development, testing, and deployment of our products. You should be available outside regular working hours when needed.
* Manage IT projects across teams and sites
* Set up production and development IT systems
* Resolve technical problems with LAN, WAN, WiFi, and other network equipment
* Be the first responder for production issues
* Configure and maintain all technical equipment, including computers, printers, switches, conference rooms, TVs, and back-office systems
* Manage and maintain the IT inventory and purchasing of technical equipment
* Handle onboarding / offboarding processes for employees
* Provide help desk support
Requirements:
* At least 3 years of experience as an IT administrator, including installing, monitoring, and supporting systems, plus an understanding of back-end environments (GPO, Single Sign-On, Active Directory, O365, G-Suite)
* Experience with Linux systems - a must
* Experience with AWS cloud services - a big advantage
* Experience with Git, Jenkins, and CI/CD - an advantage
* Excellent ability to install, administer, and troubleshoot computer hardware and software (Mac OS X - a must, Windows - a big advantage)
* Knowledge of networking components and infrastructure (LAN / WAN, TCP/IP, DHCP, DNS, switches, firewalls)
* Experience with monitoring systems
* Strong troubleshooting skills
* Fluent in Hebrew & English
* Knowledge of MySQL/PostgreSQL/Elastic - an advantage
* Strong customer-service orientation
About the Company: Our mission is to protect every mobile app worldwide and its users. We provide mobile brands with the only patented, centralized, data-driven Mobile Cyber Defense Automation platform. Our platform delivers rapid no-code, no-SDK mobile app security, anti-fraud, anti-malware, anti-cheat, and anti-bot implementations, configuration-as-code ease, Threat-Events threat-aware UI / UX control, ThreatScope Mobile XDR, and Certified Secure DevSecOps Certification in one integrated system. With us, mobile developers and cyber and fraud teams can accelerate delivery, guarantee compliance, and leverage automation to build, test, release, and monitor the full range of cyber, anti-fraud, and other defenses needed in mobile apps from within mobile DevOps and CI/CD pipelines. Leading financial, healthcare, m-commerce, consumer, and B2B brands use us to upgrade mobile DevSecOps and protect Android & iOS apps, mobile customers, and businesses globally. Today, our customers use our platform to secure more than 50,000 mobile apps, with protection for over 1 billion mobile end users projected. We are an Equal Opportunity Employer. We are committed to diversity, equity, and inclusion in our workplace. We do not discriminate based on race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability, veteran status, or any other characteristic protected by law. All qualified applicants will be considered for employment without regard to these characteristics.
This position is open to all candidates.
 