MLOps Engineer - AI Infra Group

15/01/2026
This job has been marked by the employer as no longer relevant
Company name confidential
Job location: Tel Aviv-Yafo
Job type: Full time
Similar jobs that may interest you
6 days ago
Location: Tel Aviv-Yafo
Job Type: Full Time
It starts with you - an engineer driven to build resilient, automated infrastructure that enables teams to move fast with confidence. You care about operational excellence, developer experience, and reliability at scale. You'll architect and operate the compute and networking infrastructure that powers our AI platform - from CI/CD pipelines to Kubernetes clusters to observability systems - across cloud and on-prem environments.
If you want to build infrastructure that powers mission-critical AI systems at national scale, join our company's mission - this role is for you.
The Responsibilities
Architect and operate Kubernetes-based infrastructure across AWS and on-prem environments, ensuring high availability, security, and performance.
Design and maintain CI/CD pipelines for application and service deployments with automated testing, security scanning, and rollback capabilities.
Drive infrastructure-as-code practices for compute and networking - building reproducible, auditable, and version-controlled infrastructure.
Own reliability and incident response - establish SLOs, build alerting systems, lead incident resolution, and drive post-incident improvements.
Enable AI-native operations - support agentic deployment pipelines, self-healing infrastructure, and secure sandboxing for model experimentation.
Build and maintain observability systems - metrics, logging, tracing, and dashboards that provide visibility into system health.
Optimize infrastructure cost and performance - right-size resources, implement auto-scaling, and identify efficiency opportunities.
Collaborate with Engineering, Data Platform, Data Engineering, and Security teams to align infrastructure with platform needs.
Shape infrastructure characteristics that support data freshness, correctness, and low-latency pathways for AI training/inference, retrieval, and agentic workflows.
Contribute paved-road tooling - reusable CI/CD patterns for services, IaC modules for compute and networking, and runbooks - that streamline delivery across teams.
Collaborate with Engineering, Data Platform, Data Engineering, Security, Product, AI/ML, Data Science, and Analytics to anticipate and meet cross-functional needs.
Requirements:
6+ years in DevOps, SRE, or infrastructure engineering, with hands-on experience building and operating infrastructure at scale.
Container orchestration - Kubernetes (EKS, on-prem), Helm, service mesh technologies like Istio or Linkerd
Cloud & infrastructure - AWS services (EC2, EKS, S3, IAM, VPC, Lambda), hybrid cloud architectures, on-prem infrastructure
Infrastructure-as-Code - Terraform, Pulumi, or CloudFormation; GitOps practices with ArgoCD or Flux
CI/CD - GitHub Actions, GitLab CI, Jenkins, or similar; artifact management, deployment strategies (blue-green, canary)
Observability - Prometheus, Grafana, ELK/OpenSearch, Datadog, or similar; distributed tracing, log aggregation, alerting
Security & compliance - Secrets management (Vault, AWS Secrets Manager), network security, compliance automation
Scripting & automation - Python, Bash, Go; configuration management with Ansible or similar.
This position is open to all candidates.
 
8561434
17/02/2026
Location: Tel Aviv-Yafo
Job Type: Full Time and Hybrid work
Required: Machine Learning Operations Engineer
Your Mission:
As an MLOps Engineer, your mission is to design, build, and operate the platforms that power our machine learning and generative AI products, spanning real-time use cases such as large-scale fraud scoring and support for MCP and agentic workflows. You'll create reliable CI/CD for models and agents, robust data/feature pipelines, secure model serving, and comprehensive observability. You will also support our agentic AI ecosystem and Model Context Protocol (MCP) services so that models can safely use tools, data, and actions across.
You will partner closely with Data Scientists, Data/Platform Engineers, Product, and SRE to ensure every model, from classic ML to LLM/RAG agents, moves from prototype to production with strong reliability, governance, cost efficiency, and measurable business impact.
Responsibilities:
Operate & Develop ML/LLM platforms on Kubernetes + cloud (Azure; AWS/GCP ok) with Docker, Terraform, and other relevant tools
Manage object storage, GPUs, and autoscaling for training & low-latency model serving
Manage cloud environment, networking, service mesh, secrets, and policies to meet PCI-DSS and data-residency requirements
Build end-to-end CI/CD for models/agents/MCP tooling (versioning, tests, approvals)
Deliver real-time fraud/risk scoring & agent signals under strict latency SLOs.
Maintain MCP servers/clients: tool/resource definitions, versioning, quotas, isolation, access controls
Integrate agents with microservices, event streams, and rule engines; provide SLAs, tracing, and on-call runbooks
Measure operational metrics of ML/LLM (latency, throughput, cost, tokens, tool success, safety events)
Enforce governance: RBAC/ABAC, row-level security, encryption, PII/secrets management, audit trails.
Partner with DS on packaging (wheels/conda/containers), feature contracts, and reproducible experiments.
Lead incident response and post-mortems.
Drive FinOps: right-sizing, GPU utilization, batching/caching, budget alerts.
Requirements:
4+ years in DevOps/MLOps/Platform roles building and operating production ML systems (batch and real-time)
Strong hands-on with Kubernetes, Docker, Terraform/IaC, and CI/CD
Practical experience with Spark/Databricks and scalable data processing
Proficiency in Python & Bash
Ability to operate DS code and optimize runtime performance.
Experience with model registries (MLflow or similar), experiment tracking, and artifact management.
Production model serving using FastAPI/Ray Serve/Triton/TorchServe, including autoscaling and rollout strategies
Monitoring and tracing with Prometheus/Grafana/OpenTelemetry; alerting tied to SLOs/SLAs
Solid understanding of PCI-DSS/GDPR considerations for data and ML systems
Experience with the Azure cloud environment is a big plus
Operating LLM/agent workloads in production (prompt/config versioning, tool execution reliability, fallback/retry policies)
Building/maintaining RAG stacks (indexing pipelines, vector DBs, retrieval evaluation, hybrid search)
Implementing guardrails (policy checks, content filters, allow/deny lists) and human-in-the-loop workflows
Experience with feature stores - Qwak Feature Store, Feast
A/B testing for models and agents, offline/online evaluation frameworks
Payments/fraud/risk domain experience; integrating ML outputs with rule engines and operational systems - Advantage
Familiarity with Databricks Unity Catalog, dbt, or similar tooling.
This position is open to all candidates.
 
8550121
11/02/2026
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
Join our company's AI research group, a cross-functional team of ML engineers, researchers and security experts building the next generation of AI-powered security capabilities. Our mission is to leverage large language models to understand code, configuration, and human language at scale, and to turn this understanding into security AI capabilities that will drive our company's future security solutions.
We foster a hands-on, research-driven culture where you'll work with large-scale data, modern ML infrastructure, and a global product footprint that impacts over 100,000 organizations worldwide.
Key Responsibilities
Your Impact & Responsibilities
As a Data Engineer - AI Technologies, you will be responsible for building and operating the data foundation that enables our LLM and ML research: from ingestion and augmentation, through labeling and quality control, to efficient data delivery for training and evaluation.
You will:
Own data pipelines for LLM training and evaluation
Design, build and maintain scalable pipelines to ingest, transform and serve large-scale text, log, code and semi-structured data from multiple products and internal systems.
Drive data augmentation and synthetic data generation
Implement and operate pipelines for data augmentation (e.g., prompt-based generation, paraphrasing, negative sampling, multi-positive pairs) in close collaboration with ML Research Engineers.
Build tagging, labeling and annotation workflows
Support human-in-the-loop labeling, active learning loops and semi-automated tagging. Work with domain experts to implement tools, schemas and processes for consistent, high-quality annotations.
Ensure data quality, observability and governance
Define and monitor data quality checks (coverage, drift, anomalies, duplicates, PII), manage dataset versions, and maintain clear documentation and lineage for training and evaluation datasets.
Optimize training data flows for efficiency and cost
Design storage layouts and access patterns that reduce training time and cost (e.g., sharding, caching, streaming). Work with ML engineers to make sure the right data arrives at the right place, in the right format.
Build and maintain data infrastructure for LLM workloads
Work with cloud and platform teams to develop robust, production-grade infrastructure: data lakes / warehouses, feature stores, vector stores, and high-throughput data services used by training jobs and offline evaluation.
Collaborate closely with ML Research Engineers and security experts
Translate modeling and security requirements into concrete data tasks: dataset design, splits, sampling strategies, and evaluation data construction for specific security use.
Requirements:
What You Bring
3+ years of hands-on experience as a Data Engineer or ML/Data Engineer, ideally in a product or platform team.
Strong programming skills in Python and experience with at least one additional language commonly used for data / backend (e.g., SQL, Scala, or Java).
Solid experience building ETL / ELT pipelines and batch/stream processing using tools such as Spark, Beam, Flink, Kafka, Airflow, Argo, or similar.
Experience working with cloud data platforms (e.g., AWS, GCP, Azure) and modern data storage technologies (object stores, data warehouses, data lakes).
Good understanding of data modeling, schema design, partitioning strategies and performance optimization for large datasets.
Familiarity with ML / LLM workflows: train/validation/test splits, dataset versioning, and the basics of model training and evaluation (you don't need to be the primary model researcher, but you understand what the models need from the data).
Strong software engineering practices: version control, code review, testing, CI/CD, and documentation.
Ability to work independently and in collaboration with ML engineers, researchers and security experts, and to translate high-level requirements into concrete data engineering tasks.
Nice to Have
This position is intended for women and men alike.
 
8541065
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for an ML Engineer / MLOps Tech Lead to promote machine learning engineering excellence. Someone who is passionate about building scalable, high-quality data products and processes, while ensuring production systems maintain strong real-time performance observability.
You will focus on designing and maintaining the core infrastructure that empowers the Machine Learning Engineers working within Data Science product teams. You'll collaborate closely with stakeholders across data science, product, and engineering, playing a pivotal role in driving the business by architecting and enabling the infrastructure for machine learning model development, serving, and lifecycle management, the foundation of our product.
Responsibilities:
Partner with MLEs in Data Science product teams and key stakeholders to design and maintain infrastructure for:
Data wrangling - supporting and enabling data requirements for research, training, validation, and testing.
End-to-end ML delivery - enabling model performance development, training, validation, testing, and version control.
Drive engineering best practices including code and model versioning, CI/CD pipelines, rollout strategies, and disaster recovery procedures.
Build and support monitoring and observability tools - dashboards, alerts, and performance tracking of models in production.
Lead architecture projects such as:
Feature Store - centralizing feature engineering and serving across teams.
Vector Databases - enabling large-scale embedding storage and retrieval for advanced ML applications.
GPU Cluster Scaling - optimizing distributed training and inference infrastructure.
Collaborate with product, data science, and engineering teams to solve complex problems, identify trends, and create opportunities through robust ML infrastructure.
Requirements:
3+ years of experience as an ML Engineer / MLOps
2+ years of experience in a technical leadership role (leading engineers or data scientists)
Strong programming skills in Python and SQL
Hands-on experience with MPP frameworks such as Spark, Flink, Ray, Dask, or equivalent
Strong analytical and critical thinking skills
Experience in a similar role - big advantage
Experience as a backend or DevOps engineer - advantage.
This position is open to all candidates.
 
8515740
6 days ago
Location: Tel Aviv-Yafo
Job Type: Full Time
It starts with you - a senior ML engineer responsible for building, training, evaluating, and operating machine learning systems in production. The role focuses on data pipelines, model training, experimentation, evaluation, and scalable deployment.
If you want to grow your skills building AI products for mission-critical AI, join our company's mission - this role is for you.
The Responsibilities
Design, train, and evaluate ML models for production use.
Build and maintain data pipelines for training, validation, and inference.
Own experimentation workflows: feature engineering, training runs, and comparison.
Implement model evals, monitoring, and drift detection.
Package and deploy models to production systems.
Optimize training and inference performance, cost, and reliability.
Collaborate with data, platform, and product teams.
Mentor engineers and promote ML engineering best practices.
Requirements:
4+ years software engineering experience with 2+ years applied ML in production.
Strong foundations in machine learning, statistics, and data analysis.
Hands-on experience with model training frameworks (e.g., PyTorch, TensorFlow, JAX).
Experience with distributed training and large-scale datasets.
Experience building data pipelines, feature engineering, and dataset versioning.
Proven experience designing and operating ML evals, experiment tracking, and monitoring.
Familiarity with feature stores, model registries, and ML lifecycle management.
Experience with model serving patterns and production deployment.
Proficiency in Python and strong system design skills.
Experience deploying ML systems on Kubernetes or similar platforms.
Familiarity with GPU acceleration and performance optimization.
This position is open to all candidates.
 
8561447
28/01/2026
Location: Tel Aviv-Yafo
Job Type: Full Time
Your Career
The SASE Platform team builds and operates highly available, secure, and globally distributed services that protect users, applications, and data for some of the world's largest enterprises. Our mission is to deliver cloud-native security and networking capabilities that seamlessly converge networking and security at scale. As enterprises accelerate adoption of cloud, remote work, and AI-driven workloads, the need for resilient, observable, and secure SASE platforms has never been greater. As an SRE, you will play a critical role in ensuring our platform is reliable, scalable, performant, and secure from day one.
Your Impact
As a Site Reliability Engineer, you will be an integral part of the product and platform lifecycle, partnering closely with software engineers, security experts, and infrastructure teams. You will:
Collaborate with development teams to embed reliability, scalability, and operability into services from the earliest design stages
Design, review, and evolve cloud-native architectures to improve availability, performance, cost efficiency, and fault tolerance
Build and operate automation for provisioning, deploying, and managing infrastructure at global scale using Infrastructure as Code
Improve CI/CD pipelines and release processes to enable safe, fast, and repeatable deployments
Drive observability best practices, including metrics, logs, traces, SLIs/SLOs, and data-driven incident analysis
Participate in on-call rotations, continuously reducing MTTR through automation, runbooks, and proactive reliability improvements
Mentor and guide engineers on large-scale cloud and SASE deployments, fostering a strong SRE culture
Participate in architecture and design reviews, bringing a reliability and operational excellence mindset
Champion reliability, security, and operational maturity across the organization.
Requirements:
Your Experience
Bachelor's degree in Engineering, Computer Science, or a related technical field (or equivalent practical experience)
5+ years of experience working with Unix/Linux systems (shell, tools, networking, storage, kernel concepts)
2+ years of hands-on experience with microservices architectures running on Kubernetes and container platforms
Strong understanding of distributed systems design, fault tolerance, scalability patterns, and high-availability architectures
Experience operating workloads in public cloud environments (AWS, GCP, Azure, or hybrid) at medium to large scale
Proficiency in building automation and tools in Python, Java, or similar languages for production environments
Strong Infrastructure as Code experience (Terraform, Ansible, Chef, Puppet, or similar)
Experience designing and operating monitoring, alerting, and observability systems at scale
A tools-first mindset with a passion for reducing toil and increasing engineering efficiency
Excellent communication skills and the ability to lead discussions across engineering and security teams
Experience applying reliability and security frameworks to design, review, and operate production systems
Nice to have:
Networking expertise, including TCP/IP, DNS, BGP, routing, load balancing, proxies, VPNs, and cloud networking concepts, especially relevant to SASE architectures
Experience operating or supporting SASE, SD-WAN, Zero Trust, or network security platforms
Familiarity with AI/LLM technologies, including:
Using LLMs to improve operational workflows (incident analysis, alert enrichment, runbooks, automation)
Experience integrating AI/ML services into production systems
Understanding of reliability, security, and governance considerations for AI-driven services.
This position is open to all candidates.
 
8522215
10/02/2026
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for a Senior DevOps Engineer with strong engineering skills and a product mindset, who is passionate about platform engineering, developer experience (DevEx), and AI-assisted automation.
This role is ideal for someone who enjoys building scalable internal platforms and paved roads that empower R&D teams to independently take services from design to production. You will focus on reducing friction, increasing developer autonomy, and embedding best practices through automation and tooling.
Location: Tel Aviv (Hybrid)
About Us
We are a global leader in cybersecurity, delivering advanced security solutions that protect organizations worldwide.
The Harmony SASE rocket is building a cloud-native, high-scale security platform that enables secure connectivity for the modern, distributed workforce.
Our DevOps Platform team supports Harmony SASE R&D by providing the infrastructure, CI/CD, observability, and tooling that allow teams to move fast while operating safely in production. We are heavily investing in Platform Engineering, DevEx, and AI-assisted operations to scale our engineering velocity and reliability.
Key Responsibilities
Design and build platform capabilities and self-service tooling that enable R&D teams to deploy and operate services independently.
Develop and maintain Infrastructure as Code and deployment patterns for large-scale cloud environments.
Build and evolve CI/CD pipelines and automation using GitHub Actions and cloud-native services.
Improve developer experience, observability, and operational readiness across production systems.
Explore and integrate AI-driven automation and intelligent tooling into DevOps and platform workflows.
Collaborate with R&D and architecture teams to support new services from design to production.
Requirements:
6+ years of experience in DevOps / Platform / Infrastructure Engineering, including ownership of large-scale production environments.
Experience building infrastructure solutions for high-scale SaaS systems.
Strong hands-on experience with Infrastructure as Code (Terraform, Terragrunt, or Pulumi).
Strong programming skills in Python or Go.
Experience designing and maintaining CI/CD pipelines, preferably with GitHub Actions.
Strong experience with AWS, including services such as ECS, EKS, Lambda, and API Gateway.
Solid understanding of microservices architecture, Linux systems, and cloud networking.
Experience with monitoring and logging tools such as Datadog, Prometheus, and Grafana.
Advantages
Experience with platform engineering, internal developer platforms, or DevEx initiatives.
Hands-on experience integrating AI-assisted automation or tooling into engineering workflows.
Experience with HashiCorp tools (Vault, Consul, Nomad).
Familiarity with configuration management tools (Ansible, Chef).
Strong networking fundamentals (DNS, HTTP/S, proxies, CDN).
This position is open to all candidates.
 
8540398
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
Required Senior MLOps Engineer
Realize your potential by joining the leading performance-driven advertising company!
As a Senior MLOps Engineer on the Infra group, you'll play a vital role in developing, enhancing, and maintaining highly scalable Machine-Learning infrastructures and tools.
About Algo platform:
The objective of the algo platform group is to own the existing algo platform (including health, stability, productivity and enablement), to facilitate and be involved in new platform experimentation within the algo craft and lead the platformization of the parts which should graduate into production scale. This includes support of ongoing ML projects while ensuring smooth operations and infrastructure reliability, owning a full set of capabilities, design and planning, implementation and production care.
The group has deep ties with both the algo craft as well as the infra group. The group reports to the infra department and has a dotted line reporting to the algo craft leadership.
The group serves as the professional authority when it comes to ML engineering and ML ops, serves as a focal point in a multidisciplinary team of algorithm researchers, product managers, and engineers and works with the most senior talent within the algo craft in order to achieve ML excellence.
How you'll make an impact:
As a Senior MLOps Engineer, you'll:
Develop, enhance and maintain highly scalable Machine-Learning infrastructures and tools, including CI/CD, monitoring and alerting and more
Have end to end ownership: Design, develop, deploy, measure and maintain our machine learning platform, ensuring high availability, high scalability and efficient resource utilization
Identify and evaluate new technologies to improve performance, maintainability, and reliability of our machine learning systems
Work in tandem with the engineering-focused and algorithm-focused teams in order to improve our platform and optimize performance
Optimize machine learning systems to scale and utilize modern compute environments (e.g. distributed clusters, CPU and GPU) and continuously seek potential optimization opportunities.
Build and maintain tools for automation, deployment, monitoring, and operations.
Troubleshoot issues in our development, production and test environments
Directly influence the way billions of people discover the internet
Our tech stack:
Java, Python, TensorFlow, Spark, Kafka, Cassandra, HDFS, vespa.ai, ElasticSearch, AirFlow, BigQuery, Google Cloud Platform, Kubernetes, Docker, git and Jenkins.
Requirements:
Experience developing large scale systems. Experience with filesystems, server architectures, distributed systems, SQL and No-SQL. Experience with Spark and Airflow / other orchestration platforms is a big plus.
Highly skilled in software engineering methods. 5+ years experience.
Passion for ML engineering and for creating and improving platforms
Experience with designing and supporting ML pipelines and models in production environment
Excellent coding skills - in Java & Python
Experience with TensorFlow - a big plus
Possess strong problem solving and critical thinking skills
BSc in Computer Science or related field.
Proven ability to work effectively and independently across multiple teams and beyond organizational boundaries
Deep understanding of Computer Science fundamentals: object-oriented design, data structures, systems, application programming, and multithreaded programming
Strong communication skills to be able to present insights and ideas, and excellent English, required to communicate with our global teams.
Bonus points if you have:
Experience in leading Algorithms projects or teams.
Experience in developing models using deep learning techniques and tools
Experience in developing software within a distributed computation framework.
This position is open to all candidates.
 
8559413
Location: Tel Aviv-Yafo
Job Type: Full Time and Hybrid work
We're looking for a Senior Software Engineer to join the AZ Team in Tel Aviv - a group of passionate developers building the secure, scalable backbone of our Customer Experience (CX) Platform.

As a key member of the AZ Team, you'll play a pivotal role in shaping the foundations of our CX Platform - driving features and architectural decisions from concept to production-grade solutions. You'll design and build secure, scalable systems for user management, authentication, authorization, and data access that serve thousands of developers across us.

You will:

Design, develop, and build platform-wide authentication and authorization services, creating a cohesive identity fabric that integrates seamlessly with multiple identity vendors and systems.

Lead the evolution of the data consumption layer, enabling governed, efficient, and context-aware access to data across the CX ecosystem.

Drive architectural decisions from concept to production, ensuring solutions are secure, scalable, and optimized for both developer experience and operational excellence.

Leverage AI and automation to enhance access control, anomaly detection, and developer productivity - turning complex platform insights into actionable intelligence.

Collaborate cross-functionally with product, data, and infrastructure teams to build interoperable solutions that power our next-generation developer platform.

Influence platform-wide engineering standards, promoting robust design, observability, and maintainability across services.

Champion developer experience, crafting APIs, SDKs, and tools that simplify integration and accelerate innovation.

Mentor and guide engineers, fostering a culture of technical depth, curiosity, and impact-driven innovation.
Requirements:
Minimum Qualifications:

8+ years of professional software engineering experience, with proven ability to design, implement, and deliver complex distributed systems in production.

Strong problem-solving, debugging, and system-design skills, with a focus on scalability and maintainability.

Validated experience in backend or full-stack development using one or more of the following languages: Java, TypeScript/Node.js, Go, or Python.

Proven understanding of distributed systems, microservices architecture, and RESTful or GraphQL APIs.

Hands-on experience with cloud-native development on AWS, including containerized workloads running on EKS (Kubernetes).

Proficiency with databases - relational (e.g., PostgreSQL) or NoSQL (e.g., MongoDB, Redis, OpenSearch) - and familiarity with data-driven application design.

Deep understanding of authentication, authorization, and modern identity and access management concepts.

Familiarity with streaming and messaging systems, such as Apache Kafka.

Preferred Qualifications:

Experience building or integrating with multiple identity providers (e.g., Okta, Azure AD, Ping) and designing identity fabric or zero-trust architectures.

Exposure to AI-driven platforms, leveraging AI/ML for developer productivity, anomaly detection, or access intelligence.

Knowledge of Infrastructure as Code (IaC) tools such as Helm and Terraform, and familiarity with observability stacks (Prometheus, Grafana, OpenTelemetry).

Background in security-focused design, including secrets management, policy-as-code, and compliance automation.

Experience contributing to platform engineering or developer-enablement initiatives in large-scale environments.

Passion for innovation, continuous improvement, and building tools that make developers' lives easier.
This position is open to all candidates.
 
8545937
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
Required Staff MLOps Engineer
Realize your potential by joining the leading performance-driven advertising company!
As a Staff MLOps Engineer in the Infra group, you'll play a vital role in developing, enhancing, and maintaining highly scalable machine-learning infrastructure and tools.
About Algo platform:
The objective of the algo platform group is to own the existing algo platform (including health, stability, productivity and enablement), to facilitate and take part in new platform experimentation within the algo craft, and to lead the platformization of the parts that should graduate to production scale. This includes supporting ongoing ML projects while ensuring smooth operations and infrastructure reliability, and owning a full set of capabilities: design and planning, implementation, and production care.
The group has deep ties with both the algo craft and the infra group. It reports to the infra department and has dotted-line reporting to the algo craft leadership.
The group serves as the professional authority on ML engineering and MLOps, acts as a focal point in a multidisciplinary team of algorithm researchers, product managers, and engineers, and works with the most senior talent within the algo craft to achieve ML excellence.
How you'll make an impact:
As a Staff MLOps Engineer, you'll bring value by:
Develop, enhance and maintain highly scalable machine-learning infrastructure and tools, including CI/CD, monitoring, alerting, and more
Have end to end ownership: Design, develop, deploy, measure and maintain our machine learning platform, ensuring high availability, high scalability and efficient resource utilization
Identify and evaluate new technologies to improve performance, maintainability, and reliability of our machine learning systems
Work in tandem with the engineering-focused and algorithm-focused teams in order to improve our platform and optimize performance
Optimize machine learning systems to scale and utilize modern compute environments (e.g. distributed clusters, CPU and GPU) and continuously seek potential optimization opportunities.
Build and maintain tools for automation, deployment, monitoring, and operations.
Troubleshoot issues in our development, production and test environments
Directly influence the way billions of people discover the internet
Our tech stack:
Java, Python, TensorFlow, Spark, Kafka, Cassandra, HDFS, vespa.ai, Elasticsearch, Airflow, BigQuery, Google Cloud Platform, Kubernetes, Docker, Git and Jenkins.
Requirements:
Experience developing large-scale systems. Experience with filesystems, server architectures, distributed systems, SQL and NoSQL. Experience with Spark and Airflow or other orchestration platforms is a big plus.
Highly skilled in software engineering methods, with 5+ years of experience.
Passion for ML engineering and for creating and improving platforms
Experience designing and supporting ML pipelines and models in a production environment
Excellent coding skills in Java and Python
Experience with TensorFlow - a big plus
Strong problem-solving and critical-thinking skills
BSc in Computer Science or related field.
Proven ability to work effectively and independently across multiple teams and beyond organizational boundaries
Deep understanding of Computer Science fundamentals: object-oriented design, data structures, systems, application programming and multithreaded programming
Strong communication skills for presenting insights and ideas, and excellent English for communicating with our global teams.
Bonus points if you have:
Experience in leading algorithm projects or teams.
Experience in developing models using deep learning techniques and tools
Experience in developing software within a distributed computation framework.
This position is open to all candidates.
 
8559811