Senior ML Data Engineer

11 hours ago
Confidential company
Location: Ramat Gan
Job Type: Full Time
The AI Engineering group builds modern infrastructure and solutions that improve how algorithms are developed at our company.
We are a small, independent team of experienced engineers with a mix of skills in algorithms, software, and infrastructure. We work in a DevOps style and build cross-team solutions that support research and development of advanced perception algorithms.
Our flagship project is a unified AV dataset used to train and evaluate next-generation models. We take large volumes of multi-camera video, object labels, HD maps, and sensor data from across the organization, and turn it into a curated, high-quality training set - at scale.
We are looking for someone who brings ML and computer-vision depth to the team - someone who can help shape the intelligence layer that decides what data is worth training on.
What will your job look like:
Work collaboratively with shared ownership. Your focus area will be the curation and ML side of our data pipeline, but you will contribute across the full stack alongside the rest of the team.
Build and improve the curation pipeline - from vision-model embeddings and scene detection, through VLM-based scene analysis, to scoring, deduplication, and sampling that produces a balanced and diverse dataset.
Run and optimize GPU inference at scale (embedding extraction, VLM inference) across thousands of driving sessions using workflow orchestration.
Develop scoring and sampling strategies that ensure rare but important scenarios (night driving, adverse weather, hazardous situations) are well-represented in the final dataset.
Work with algorithm teams to understand what data gaps hurt model performance and translate those into curation criteria.
Build validation and diagnostics that measure dataset quality - not just pipeline health, but whether the data is actually good for training.
Contribute to the core dataset SDK, converter, and 3D-geometry tooling (camera projection, calibration, coordinate transforms).
Requirements:
4+ years in data engineering or backend/software engineering with serious data work - pipelines that run in production, not just notebooks.
Strong Python and the PyData stack (NumPy, PyArrow, Pandas, DuckDB).
Some background in research, algorithms, or ML - enough that you can read a paper, understand a model's outputs, and have informed conversations with algorithm engineers.
Comfort working with vision-model outputs as data: embeddings, detection results, VLM responses.
Ability to work across team boundaries - this role lives between algorithm teams, infra teams, and our own.
Advantages:
Experience with autonomous-driving datasets or perception pipelines.
3D geometry and camera model intuition (or the mathematical background to ramp up).
Workflow orchestration (Argo, Airflow, Kubeflow).
Vector databases or columnar analytics (LanceDB, DuckDB, Parquet at scale).
Familiarity with curation concepts (active learning, hard-example mining, distribution balancing) - useful context, not a requirement.
Exposure to LLM agents or agentic workflows for data tasks.
This position is open to all candidates.
 
Job ID: 8699030
Similar jobs that may interest you
18 hours ago
Jobs at One DatAI
Location: Ramat Gan and Petah Tikva
Job Type: Full Time and Hybrid work
What you'll do:
Lead design and delivery of enterprise data platforms
Own end-to-end pipelines from ingestion to serving
Architect scalable lakehouse solutions on Databricks
Drive best practices across Spark, Python, and SQL
Lead and mentor data engineers across projects
Work closely with stakeholders to define solutions
Optimize performance, cost, and reliability of pipelines
Implement data governance, quality, and monitoring
Requirements:
What we're looking for:
5+ years of experience as a Data Engineer
Proven experience leading large data projects
Strong hands-on experience with Databricks (must)
Deep knowledge of Spark, Python, and SQL
Experience designing lakehouse architectures
Strong understanding of batch and streaming pipelines
Experience with data modeling and large-scale data processing
Ability to translate business needs into technical solutions
This position is open to all candidates.
 
Job ID: 8614034
16 hours ago
Jobs at Harel Insurance and Finance
Location: Ramat Gan
Job Type: Full Time
We are looking for an experienced Data Engineer to join our data team and play a central role in building, developing, and operating an advanced enterprise Data & AI platform in an Azure environment.

What the role involves:
Design, develop, and operate an enterprise Lakehouse on Azure Databricks
Build modern end-to-end data pipelines on Delta Lake using the Bronze / Silver / Gold model
Develop smart data ingestion processes from core systems, APIs, and event streams, including incremental loading and CDC
Work with Unity Catalog to manage data governance, permissions, lineage, and secure data sharing between teams
Use advanced Databricks capabilities daily:
Workflows & Jobs
Autoloader
Delta Live Tables
Structured Streaming
Work closely and continuously with Data Scientists and support ML workflows
Implement and manage MLflow:
Experiment management
Model Tracking
Versioning and reproducibility
Build organizational standards for data quality, monitoring, and observability
Optimize performance and cost in the Azure environment (cluster sizing, caching, partitioning)
Support the build-out of an enterprise AI / Data platform that supports agents, machine learning, and data products
Requirements:
What are we looking for?
3-5 years of experience as a Data Engineer in a cloud environment
Significant hands-on experience with Azure Databricks
Strong command of Python, SQL, and Spark / PySpark
Experience working with Delta Lake, Unity Catalog, and Databricks Workflows / Jobs
Experience building data pipelines in a production environment, and experience working jointly with Data Scientists and on ML pipelines
Familiarity with MLflow for managing models and experiments
Good understanding of data modeling in a Lakehouse environment
Experience working with Git and CI/CD for data processes
Ability to work in a multidisciplinary team, product thinking, and a system-level view of data

We are located in Ramat Gan, in the Diamond Exchange (Bursa) district, near the Savidor Center railway station and the light rail.

* This position is open to women and men alike.
 
Job ID: 8570851
10 hours ago
Confidential company
Location: Ramat Gan
Job Type: Full Time
We are seeking a strong ML Software Engineer to join our deep learning LiDAR & Radar group and help scale the systems that bring cutting-edge perception models into production. You'll build the software layers, data pipelines, and runtime systems that turn advanced neural networks into reliable, high-performance solutions running on edge devices.
This is a hands-on, high‑ownership role within a growing group working closely with algorithm developers. The work spans Python and C++, ML infrastructure, model integration, performance optimization, and production delivery.
** The role includes working on-site at our Jerusalem office several days per week.
What will your job look like:
Lead end-to-end development of features - from design and implementation to integration, testing, and deployment
Build ML pipelines for data-driven creation of diverse datasets and for efficient model inference
Design data selection and sampling strategies to ensure coverage of rare and critical scenarios
Partner with algorithm teams to translate model weaknesses into data curation criteria
Develop validation and diagnostics to measure dataset quality - not just pipeline health, but training effectiveness
Integrate neural network models into C++ production systems, including runtime, data flow, and pre/post‑processing
Bring models from research/prototype stage into robust, production‑ready deployments
Optimize runtime performance (latency, memory, and throughput) in resource‑constrained environments
Contribute to deployment flows (e.g., model conversion, profiling, optimization)
Build and improve CI/CD pipelines, automated testing, and development workflows.
Requirements:
B.Sc. in Computer Science, Software Engineering, or equivalent
3+ years of hands-on C++ development experience
3+ years of hands-on Python development experience, including the PyData stack (NumPy, Pandas)
Experience working in Linux environments
Strong motivation to work closely with deep learning algorithms and production AI systems
Interest in neural network deployment on edge devices, including inference runtimes, performance optimization, and model integration
Proven ability to work across team boundaries (algorithms, infra, product)
Advantages:
Experience with autonomous-driving datasets or perception pipelines
Background in 3D geometry and/or strong mathematical foundation
Experience with workflow orchestration tools (Airflow, Argo)
Familiarity with data curation techniques (e.g., active learning, hard example mining, distribution balancing)
2+ years in data engineering or backend systems with large‑scale data (production environments).
This position is open to all candidates.
 
Job ID: 8699125
9 hours ago
Location: Ramat Gan
Job Type: Full Time
Our company's ML Platform group builds and operates the core infrastructure that powers large-scale AI workloads across on-prem bare-metal GPU clusters and multi-cloud environments. Our goal is to deliver the modern infrastructure and tooling that accelerates our company's entire algorithm development lifecycle - from a researcher's first experiment to a production deployment.
We are a small, independent group of engineers with diverse skills across software, infrastructure, and systems. We set the standards, build the cross-company products, and take end-to-end ownership of everything we ship.
What will your job look like?
Design, develop, and maintain the Python framework that enables algorithm developers across our company to train, validate, quantize, and deploy deep learning models - locally, on-prem, and across cloud providers - through a single unified interface
Build high-performance data streaming libraries, written in Rust with Python interfaces, that feed large-scale distributed training pipelines
Set the standard for reliable, reproducible research at scale - experiment tracking, configuration management, checkpoint handling, and multi-node training
Work directly alongside algorithm researchers to understand friction, propose solutions, and ship them - without layers of process in between
Contribute to open source when the right fix/feature belongs upstream.
Requirements:
A value-first mindset focused on shipping early and often
2+ years of hands-on experience as a software engineer in the industry or in a similar relevant role
B.Sc. in Computer Science, Software Engineering, or equivalent hands-on experience
Strong software engineering skills in Python - tested, production-grade code that other engineers can build on
Familiarity with deep learning frameworks (ideally PyTorch) and distributed training workflows
Experience with containerization and CI/CD pipelines
Contributions to open source projects
Familiarity with Linux internals - networking, file systems, process management
Experience in Rust/C/CUDA
Experience with cloud infrastructure (AWS or similar) and distributed storage
Exposure to infrastructure-as-code or Kubernetes-based deployments.
This position is open to all candidates.
 
Job ID: 8699142
Confidential company
Location: Ramat Gan
Job Type: Full Time
Required AI Infrastructure Engineer
Description
We are building our internal AI infrastructure layer from the ground up. We have real agents running in production, a growing base of employees using AI in their daily work, and a clear architectural direction. What we don't have yet is a dedicated engineer to own it.
You'll be the first. Your job is to close the gap between "working prototype" and "production platform" - owning the foundation that hosts our agents, the pipelines that ship them, and the reliability layer (observability, cost controls, audit trails, evals) that makes it safe to run AI at scale in a trust & safety company.
This is an infrastructure-first role with deep AI fluency - not a prompt engineer, not a wrapper-framework operator, not a no-code builder. You should be equally comfortable writing a Terraform module, debugging a Kubernetes pod, and tracing an agent's tool-call chain.
We don't operate with a predefined backlog here; you will be responsible for identifying high-impact needs and bringing them to life. The perfect fit for this role has a track record of deploying agentic systems that have held up under real-world usage, balances a focus on infrastructure with a deep concern for user experience, and recognizes that the primary hurdle in AI integration is rarely the model itself.
Responsibilities:
Platform & Infrastructure:
Architect, build, and run the AWS/Kubernetes platform that hosts our internal AI agents and tools; drive AWS Well-Architected pillars (operational excellence, security, reliability, performance, cost, sustainability).
Own Infrastructure-as-Code: Terraform modules, standards, and reviews for Bedrock, agent runtimes, vector DBs, and supporting services.
AI Systems:
Design and ship production-grade agents and multi-agent pipelines using the Anthropic Agent SDK, Claude Code, AWS Bedrock, and MCP - not wrapper frameworks.
Own the full agent lifecycle: scoping → prototyping → eval → deploy → monitor → iterate.
Integrate agentic workflows into internal and product systems via APIs, databases, webhooks, Slack, and email.
Reliability, Observability, Cost:
Build first-class observability across apps and infra: OpenTelemetry, Prometheus, plus LLM-specific tracing (Langfuse or equivalent), token/cost metrics, and eval pipelines.
Define SLOs/SLIs and error budgets for AI services - latency, model fallback chains, eval regression gates, agent success rates. Lead incident readiness, response, and post-mortems.
Drive FinOps: model routing by cost, cache hit rates, batch vs. realtime tradeoffs, budget alarms, per-team chargeback visibility.
Implement guardrails: prompt-injection defenses, PII redaction, model allowlists, human-in-the-loop checkpoints, audit trails.
Org Impact:
Identify high-leverage workflows across the organization and translate them into scalable agentic automations.
Partner with R&D, Delivery, security, and external vendors to deliver platform capabilities.
Requirements:
Requirements (must-have)
3-5 years in software engineering, shipping and operating production-grade systems.
2+ years hands-on AWS, Kubernetes, and Terraform in production - not familiarity, ownership.
1-2 years hands-on building and deploying LLM-powered or agentic systems in production.
Proficiency in Python: async patterns, REST APIs, cloud-native architecture.
Production experience with native agentic SDKs (Anthropic Agent SDK, Claude Code) and MCP - tool-calling patterns, server configuration, memory systems, vector DBs.
Hands-on AWS Bedrock for model access, IAM-based auth, and enterprise deployment patterns.
Production CI/CD ownership (GitHub Actions, Argo CD, or equivalent) and observability stack experience (OpenTelemetry + Prometheus, plus LLM tracing).
Proven ownership: design → implement → release → operate → improve, independently and within a team.
Strong debugging instincts across multi-step agent chains and distributed ...
This position is open to women and men alike.
 
Job ID: 8688629
17/05/2026
Location: Ramat Gan
Job Type: Full Time and Hybrid work
We are looking for a Senior AI Researcher to work closely with our delivery team, translating real-world clinical and scientific questions into rigorous AI and machine learning solutions. In this role, you will lead the development of methodologies, models, and benchmarking frameworks that address complex clinical problems using our unique immune-cell data. Your work will directly impact how our partners understand biology, evaluate hypotheses, and make decisions.

Location: Ramat Gan (hybrid model).

What will you do?
Partner with the collaborations team and scientists to frame complex clinical and biological questions as well-defined data science and modeling problems.
Design, implement, and evaluate the right analytical and modeling approaches for each problem, including robust benchmarking and validation strategies.
Clearly communicate results, limitations, and tradeoffs to internal teams and external partners, enabling confident, data-driven decisions.
Establish best practices for rigor, reproducibility, evaluation, and documentation across delivery-focused data science work.
Work hand-in-hand with immunologists, computational biologists, AI scientists, and engineers to ensure solutions are scientifically sound and production-aware.
Translate modeling solutions into engineered and iterable products used by the internal collaborations team.
Requirements:
Required qualifications:
MSc or PhD in Computer Science, Statistics, Mathematics, Machine Learning/Data Science, Physics, Computational Biology, or a related field
At least 4 years of industry experience.
Strong foundation in machine learning - neural networks/classical machine learning.
Hands-on experience with Python-based data science and ML tooling.
Proven experience working with large, complex datasets (biological data preferred).
Experience with MLOps/DataOps practices for robust and reproducible ML training and evaluation pipelines, including experience with benchmarking tools and pipeline optimization.
Experience in statistics, probability, mathematical modeling and experimental design - preferred.
Experience using transformer-based models for applied solutions - preferred.
Experience in biotech, life sciences, or healthcare - preferred.

Desired personal traits:
You want to make an impact on humankind.
You prioritize "We" over "I".
You enjoy getting things done and striving for excellence.
You collaborate effectively with people of diverse backgrounds and cultures.
You have a growth mindset.
You are candid, authentic, and transparent.
This position is open to all candidates.
 
Job ID: 8653666
03/06/2026
Confidential company
Location: Ramat Gan
Job Type: Full Time and Hybrid work
As we scale our portfolio of live automations, we're looking for an experienced AI Automation & Agent Engineer to take on a broad and high-impact role. You'll lead new automation projects from ideation to production, own the reliability of existing systems, and serve as the go-to expert helping our employees get the most out of AI tools - especially Claude Code. This is a hands-on, cross-functional position at the center of how we adopt and scale AI internally.
Responsibilities
Lead Automation Projects:
Partner with department leads across sales, support, finance, and HR to identify high-impact AI and automation opportunities.
Design and build LLM-based workflows, integrating APIs, MCP servers, and internal tools.
Develop agent-like automations and internal copilots that augment decision-making and execution.
Own the full lifecycle - from ideation and process design through development, testing, and production launch.
Present project plans, progress updates, and outcomes with measurable impact to stakeholders at all levels.
Build AI Systems & Integrations:
Build robust, maintainable workflows using N8N, Claude Code, and other orchestration tools.
Integrate across systems using REST APIs, webhooks, and external/internal tools.
Design reusable patterns for skills, agents, and workflows that can scale across teams.
Continuously evaluate and adopt new AI tooling, MCP capabilities, and agent frameworks.
Maintain & Improve Live Systems:
Monitor, triage, and resolve issues across all live AI automations, copilots, and agents.
Identify recurring failure patterns and implement systemic improvements to reliability, performance, and cost.
Ship incremental improvements and new capabilities quickly and safely.
Maintain clear documentation for workflows, agents, and system behavior.
Drive AI Adoption, Skills & Governance Across:
Act as the internal expert and first point of contact for employees using Claude Code and AI tools.
Help teams build and scale AI skills - from basic usage to advanced workflows and agent design.
Manage and optimize AI usage and performance across the organization (tokens, costs, reliability, adoption).
Build and evolve an internal AI control tower - providing visibility into usage, performance, governance, and impact.
Run onboarding sessions, workshops, and create practical guides that empower teams to work independently with AI.
Guide teams through MCP integrations, tool configurations, and best practices.
Stay current on Claude Code updates, new MCP capabilities, and emerging AI tooling - and proactively share relevant developments with the team.
Requirements:
Must-haves:
3+ years of experience in a technical role in software development, data analysis, or AI/ML operations.
Proven ability to lead projects end-to-end, from requirements to production.
Hands-on experience building LLM-based workflows, automations, or agents.
Strong experience with workflow tools (N8N, Zapier, Make, Temporal, or similar).
Solid coding skills in Python and/or JavaScript.
Experience integrating systems using APIs, webhooks, and structured data (JSON).
Strong communication skills - able to work closely with non-technical teams and translate needs into solutions.
Nice-to-haves:
Experience building internal copilots or AI-powered tools.
Familiarity with multi-agent systems, MCP ecosystem, or orchestration frameworks.
Experience defining best practices, patterns, or frameworks for AI usage.
Background working across business domains (sales, finance, support, HR)
Experience enabling AI tool adoption - training, documentation, or internal consulting for business teams.
This position is open to all candidates.
 
Job ID: 8678758
03/06/2026
Confidential company
Location: Ramat Gan
Job Type: Full Time and Hybrid work
Required Data Engineering Manager
Description
We are one of the most popular and downloaded apps in the world. Working with us provides a unique opportunity to influence hundreds of millions of our users and to be part of the journey that makes us a super-app. Our mission is to make people's lives easier by enabling meaningful connections, from precious moments with family and friends, through managing business relationships, to pursuing their passions.
As an Engineering Manager in the data department, you'll build and scale the data platform and data apps that power our business insights. You'll design and implement robust pipelines to process billions of daily records, leveraging cutting-edge cloud technologies to transform data into actionable intelligence.
If you're passionate about data engineering and driving business growth through insights, we'd love to hear from you!
Responsibilities
Lead and grow Data Engineering and Machine Learning teams in a high-scale environment (tens of billions of events per day).
Own the design and evolution of a self-service data platform enabling internal teams to easily build, ship, and consume data products.
Architect and scale batch and streaming pipelines powering core business and ML use cases.
Drive production ML systems end-to-end (recommendation, ranking, prediction) with direct business KPI impact.
Ensure reliability, scalability, and observability of large-scale data and ML systems in production.
Requirements:
3+ years of engineering management experience leading Data / ML / Software engineering teams in production environments.
6+ years of experience building large-scale distributed systems in Data Engineering, ML Engineering, or Software Engineering roles.
Proven ownership of production-grade data or ML platforms, including delivery and adoption across R&D and Product stakeholders.
Hands-on experience building and operating high-scale distributed data systems (Spark, Storm, Flink) in production.
Strong experience with Java and Python in AWS cloud environments.
Advantages
Proven track record leading multi-disciplinary teams and driving measurable business impact through data/ML systems.
Experience building ML platforms, feature stores, or self-serve data infrastructure at scale.
Deep experience with modern ML/infra stack (PyTorch, TensorFlow, SageMaker, Kubernetes, Argo).
Experience with modern data lakehouse and analytics stack (Iceberg, Athena, ClickHouse, data catalogs, data quality frameworks).
Experience deploying LLM-based systems or AI-driven infrastructure in production environments.
This position is open to all candidates.
 
Job ID: 8678744
Confidential company
Location: Ramat Gan
Job Type: Full Time
We're looking for a Data Platform Engineer to own and scale the Kubernetes infrastructure powering our large-scale data processing platform.

This is a hands-on role at the intersection of infrastructure and data engineering. You'll operate Kubernetes clusters running thousands of nodes, supporting workloads like Spark, Airflow, and remote shuffle services. Your focus: making distributed data workloads reliable, cost-efficient, and performant at scale.

This is not a traditional DevOps or SRE role. You won't be building CI/CD pipelines or managing web services. Instead, you'll be deep in Spark executor scaling, shuffle optimization, batch scheduler tuning, and capacity planning for clusters that process massive datasets daily.

If you've tuned Spark on Kubernetes at scale, wrestled with shuffle storage bottlenecks, or optimized batch scheduling across thousands of concurrent pods - this role is for you.

WHAT YOU'LL DO:
Operate and scale Kubernetes clusters with thousands of nodes supporting large-scale Spark and data processing workloads.
Manage and optimize Apache Spark on Kubernetes - executor autoscaling, driver scheduling, resource tuning, spot instance strategies.
Deploy and tune remote shuffle services (e.g., Apache Celeborn) to handle shuffle data at scale across multiple availability zones.
Operate and improve self-hosted Apache Airflow infrastructure on Kubernetes
Configure and optimize batch schedulers (e.g., YuniKorn, Volcano) for gang scheduling, fair-share queuing, and resource prioritization.
Drive cost optimization across large compute fleets - spot vs. on-demand strategies, node right-sizing, autoscaling policies, local SSD utilization.
Support and collaborate with Data Engineering teams on workload performance, resource allocation, and infrastructure requirements.
Manage infrastructure-as-code (Terraform) and GitOps deployments (ArgoCD, Helm) for data platform services.
Integrate with managed data platforms (e.g., Databricks) and cloud storage for hybrid processing architectures.
Requirements:
REQUIREMENTS:
3+ years of experience operating Kubernetes in production at significant scale (hundreds to thousands of nodes).
Hands-on experience with Apache Spark on Kubernetes - you understand executors, drivers, dynamic allocation, shuffle behavior, and how they map to K8s primitives.
Strong understanding of Kubernetes internals - scheduling, resource management, node autoscaling, pod lifecycle, taints/tolerations, local storage
Experience with cloud infrastructure (GCP preferred) - managed Kubernetes, spot/preemptible instances, local SSDs, networking at scale.
Comfortable with infrastructure-as-code (Terraform) and GitOps workflows.
Proficiency in Python or Go.

NICE TO HAVE:
Experience operating Apache Airflow at scale on Kubernetes.
Experience with Apache Celeborn or similar remote shuffle services.
Familiarity with YuniKorn or Volcano batch schedulers.
Experience with Databricks administration and integration.
Knowledge of data formats and storage systems (Parquet, Delta Lake, cloud object storage).
Experience with streaming or messaging systems (Kafka).
Experience with Prometheus/Grafana observability stacks for data platform monitoring.
Contributions to open-source data infrastructure projects.
This position is open to all candidates.
 
Job ID: 8655796
10 hours ago
Location: Ramat Gan
Job Type: Full Time
What will your job look like:
Design and optimize algorithms and pipelines for large-scale model inference
Build scalable systems for high-throughput data processing and streaming
Develop data transformation and preprocessing components for ML workloads
Improve performance, efficiency, and reliability across distributed inference systems
Work closely with ML researchers, infrastructure, and platform teams
Drive architectural decisions for production ML and data systems.
Requirements:
5+ years of experience in Algorithm Engineering, ML Infrastructure, or Data Systems
Strong programming skills in Python
Hands-on experience with Spark, Polars, Pandas, DuckDB, and AWS
Strong understanding of distributed systems, scalability, and performance optimization
Experience building or supporting ML inference pipelines in production
Strong system design and architecture skills.
This position is open to all candidates.
 
Job ID: 8699091