Senior ML Platform Engineer - Sovereign AI Engineering

Posted 4 hours ago
Location: Tel Aviv-Yafo
Job Type: Full Time
At our company, we redefine the vision of cyber defense by combining AI and human expertise to create products that protect nations and critical infrastructure. This is more than a job; it's a dream job. Here we tackle real-world challenges, redefine AI and security, and make the digital world safer. Let's build something extraordinary together.
Our company's AI cybersecurity platform applies a new, out-of-the-ordinary, multi-layered approach, covering endless and evolving security challenges across the entire infrastructure of the most critical and sensitive networks. Built as part of a broader sovereign AI platform, our technology is designed to operate in on-premise, private cloud, and air-gapped environments, enabling nations to maintain full control over their data, infrastructure, and AI capabilities. Central to our company's proprietary Cyber Language Models are innovative technologies that provide contextual intelligence for the future of cybersecurity.
At our company, our talented team, driven by passion, expertise, and innovative minds, inspires us daily. We are not just dreamers, we are dream-makers.
Responsibilities
Build and operate ML training infrastructure - distributed training pipelines, compute scheduling, and reproducible experiment workflows that data scientists rely on daily.
Own model serving and inference systems - packaging, deployment, autoscaling, A/B testing, canary rollouts, and latency/cost optimization for production models.
Run feature stores, model registries, and dataset versioning - enabling self-serve feature engineering, model lineage, and reproducible experiments across teams.
Build experiment tracking and evaluation infrastructure - automated evals, comparison dashboards, drift detection, and monitoring that give teams visibility into model behavior and performance.
Build and maintain production pipelines for training, fine-tuning workflows, and serving domain models - owning reliability, reproducibility, and scale.
Build and maintain the monitoring and observability layer - model performance tracking, data and prediction drift detection, data quality validation, and alerting.
Improve performance and cost across the ML stack - training throughput, inference latency, batch vs. real-time tradeoffs, and compute cost management.
Ship shared tooling - libraries, templates, CI/CD for models, IaC, and runbooks - while collaborating across Data Platform, AI, Data Science, Engineering, and DevOps. Own architecture, documentation, and operations end-to-end.
Requirements:
5+ years in software engineering, with 2+ years focused on ML infrastructure, MLOps, or data-intensive systems
Engineering craft - Strong Python, distributed systems design, testing, secure coding, API design, CI/CD discipline, and production ownership.
ML platform & serving - Model serving frameworks (e.g., Triton, TorchServe, vLLM, Ray Serve); model packaging, deployment pipelines, and inference optimization
Training infrastructure - Distributed training pipelines (e.g., with frameworks like PyTorch or JAX), experiment orchestration, and reproducibility
ML lifecycle tooling - Feature stores, model registries, experiment tracking (e.g., MLflow, Weights & Biases); dataset versioning and lineage
Data pipelines - Building training and inference data pipelines; familiarity with tools like Spark, Airflow/Dagster, and streaming ingestion
Comfortable with AI coding tools like Cursor, Claude Code, or Copilot
Nice to Have:
Experience operating in constrained environments - on-premise, private cloud, or air-gapped deployments
Hands-on experience with simulation environments, synthetic data generation, or reinforcement learning workflows
Platform & infra - Kubernetes, AWS, Terraform or similar IaC, CI/CD, observability, incident response
Hands-on data science or applied ML experience.
This position is open to all candidates.
 
Job ID: 8664296
Similar jobs that may interest you
Posted 5 hours ago
Jobs at CrowdStrike
Location: Tel Aviv-Yafo
Job Type: Full Time
CrowdStrike's Data Science Studio is seeking a pioneering Senior MLOps Engineer to establish and lead our MLOps function from the ground up. As the first MLOps engineer in the studio, you will play a foundational role in shaping how we build, deploy, and scale machine learning systems that protect thousands of organizations worldwide.

This is a unique opportunity to define the technical strategy, influence the technology stack, and architect the infrastructure that will power our AI/ML-driven security solutions for years to come.

This role combines strategic vision with hands-on execution. You'll work at the intersection of data science, engineering, and production operations - building production-grade systems that operate at immense scale while collaborating closely with highly technical data scientists and ML engineering teams across CrowdStrike.

What You'll Do:
- Architect MLOps infrastructure from the ground up: Design and implement the foundational MLOps platform, establishing best practices, tooling, and workflows that will scale with our growing data science initiatives
- Define technology strategy: Evaluate, select, and integrate MLOps technologies and platforms that best serve our needs - from experiment tracking and model versioning to deployment pipelines and monitoring systems
- Build production-grade ML pipelines: Develop robust, scalable pipelines for model training, validation, deployment, and monitoring that handle massive data volumes and ensure reliability in production
- Enable data scientist productivity: Create tools, frameworks, and automation that empower data scientists to move quickly from research to production while maintaining high quality and reliability standards
- Establish monitoring and observability: Implement comprehensive monitoring, logging, and alerting systems to ensure ML models perform optimally in production and issues are detected proactively
- Drive MLOps culture and practices: Champion best practices in ML engineering, CI/CD for ML, model governance, and reproducibility across the data science organization
- Collaborate cross-functionally: Partner closely with data scientists to understand their workflows and pain points, and work with ML engineering teams to ensure seamless integration with broader platform capabilities
- Scale for the future: Design systems with scalability, security, and maintainability in mind, anticipating the needs of a rapidly growing ML portfolio
Requirements:
- 6+ years of experience in MLOps, ML engineering, DevOps, or related infrastructure roles with focus on machine learning systems
- Production ML systems expertise: Proven track record of building and operating ML systems at scale in production environments
- Strong infrastructure and automation skills: Deep knowledge of cloud platforms (AWS, Azure, or GCP), containerization (Docker, Kubernetes), and infrastructure-as-code (Terraform, CloudFormation)
- ML pipeline proficiency: Hands-on experience with ML workflow orchestration tools (e.g., Airflow, Kubeflow, MLflow, Metaflow) and building end-to-end ML pipelines
- Programming excellence: Strong coding skills in Python; experience with additional languages is a plus
- CI/CD and DevOps practices: Expertise in building automated deployment pipelines, version control, and modern DevOps methodologies
- Strategic and hands-on balance: Ability to think architecturally about long-term solutions while rolling up your sleeves to implement them
- Collaborative mindset: Excellent communication skills and ability to work effectively with data scientists, engineers, and stakeholders with varying technical backgrounds
- Startup mentality: Comfort with ambiguity and ability to build from scratch in a fast-paced environment
This position is open to all candidates.
 
Job ID: 8611396
Posted 4 hours ago
Location: Tel Aviv-Yafo
Job Type: Full Time
At our company, we redefine the vision of cyber defense by combining AI and human expertise to create products that protect nations and critical infrastructure. This is more than a job; it's a dream job. Here we tackle real-world challenges, redefine AI and security, and make the digital world safer. Let's build something extraordinary together.
Our company's AI cybersecurity platform applies a new, out-of-the-ordinary, multi-layered approach, covering endless and evolving security challenges across the entire infrastructure of the most critical and sensitive networks. Built as part of a broader sovereign AI platform, our technology is designed to operate in on-premise, private cloud, and air-gapped environments, enabling nations to maintain full control over their data, infrastructure, and AI capabilities. Central to our company's proprietary Cyber Language Models are innovative technologies that provide contextual intelligence for the future of cybersecurity.
At our company, our talented team, driven by passion, expertise, and innovative minds, inspires us daily. We are not just dreamers, we are dream-makers.
Responsibilities
Set technical direction for the ML platform - training pipelines, model serving, feature stores, experiment tracking, and compute orchestration - through RFCs, prototypes, design reviews, and build-vs-buy decisions
Lead and grow a team of ML Engineers - hire, mentor, pair on hard problems, and raise the bar through code and design reviews
Contribute to critical systems, debug production issues, and maintain deep context on the codebase to inform technical decisions
Own operational excellence for model serving - set and enforce SLAs, run capacity planning, and keep compute costs predictable
Establish ML engineering standards - reproducible experiments, automated evals, model packaging, CI/CD for models, and observability
Support the full lifecycle of our company's models - from training on domain-specific data to low-latency inference powering production systems
Work closely with Data Platform, AI, Data Science, and Product teams - translate business priorities into engineering work and manage cross-team dependencies
Measure and improve developer experience - deploy friction, onboarding time, CI turnaround - as seriously as model performance.
Requirements:
6+ years in software engineering, ML engineering, or platform engineering, with hands-on experience building and operating ML infrastructure at scale.
2+ years leading an engineering team - hiring, mentoring, conducting design reviews, and shipping alongside your team
Engineering craft - Strong Python, distributed systems design, testing, secure coding, API design, CI/CD discipline, and production ownership.
ML platform & serving - Model serving frameworks (e.g., Triton, TorchServe, vLLM, Ray Serve); model packaging, deployment pipelines, and inference optimization
Training infrastructure - Distributed training pipelines (e.g., with frameworks like PyTorch or JAX), experiment orchestration, and reproducibility
ML lifecycle tooling - Feature stores, model registries, experiment tracking (e.g., MLflow, Weights & Biases); dataset versioning and lineage
Data pipelines - Building training and inference data pipelines; familiarity with tools like Spark, Airflow/Dagster, and streaming ingestion
Comfortable with AI coding tools like Cursor, Claude Code, or Copilot
Nice to Have:
Experience operating in constrained environments - on-premise, private cloud, or air-gapped deployments
Hands-on experience with simulation environments, synthetic data generation, or reinforcement learning workflows
Platform & infra - Kubernetes, AWS, Terraform or similar IaC, CI/CD, observability, incident response
Hands-on data science or applied ML experience.
This position is open to all candidates.
 
Job ID: 8664328
Posted 4 hours ago
Location: Tel Aviv-Yafo
Job Type: Full Time
At our company, we redefine the vision of cyber defense by combining AI and human expertise to create products that protect nations and critical infrastructure. This is more than a job; it's a dream job. Here we tackle real-world challenges, redefine AI and security, and make the digital world safer. Let's build something extraordinary together.
Our company's AI cybersecurity platform applies a new, out-of-the-ordinary, multi-layered approach, covering endless and evolving security challenges across the entire infrastructure of the most critical and sensitive networks. Built as part of a broader sovereign AI platform, our technology is designed to operate in on-premise, private cloud, and air-gapped environments, enabling nations to maintain full control over their data, infrastructure, and AI capabilities. Central to our company's proprietary Cyber Language Models are innovative technologies that provide contextual intelligence for the future of cybersecurity.
At our company, our talented team, driven by passion, expertise, and innovative minds, inspires us daily. We are not just dreamers, we are dream-makers.
Responsibilities
Architect and evolve the AI platform - agent orchestration, LLM gateways, context engineering pipelines, evaluation infrastructure, tool-calling systems, and retrieval pipelines - through RFCs, prototypes, and design reviews.
Lead and grow a small team of AI Engineers building the agent framework, production backend services, and AI platform infrastructure - hire, mentor, pair on hard problems, and raise the bar through hands-on code and design reviews.
Contribute to critical systems, debug production incidents, and maintain enough codebase context to make sound technical calls.
Own reliability across AI and agent services - set and enforce SLAs, build observability for non-deterministic systems, and harden tool execution environments for cost and security.
Set the standard for AI engineering practices - agent testing strategies, evaluation frameworks with human-in-the-loop oversight, retrieval quality benchmarks, and CI/CD for AI systems.
Work closely with ML Platform, Data Platform, DevOps, Data Science, and Product teams across the Applied AI Engineering group - ensure the AI platform evolves to serve teams building agentic workflows across the organization.
Measure and improve developer experience - deploy friction, onboarding time, CI turnaround - as seriously as system performance.
Requirements:
6+ years in backend software engineering, with 4+ years focused on production systems that integrate AI/ML models or LLMs.
2+ years leading an engineering team - hiring, mentoring, conducting design reviews, and shipping alongside your team.
Engineering craft - Strong Python, Go, or Java, system architecture, API design, testing, and secure coding practices.
Agentic systems & LLM integration - Deep understanding of agent orchestration, tool-use architectures, LLM integration patterns, context engineering, and frameworks like LangGraph or similar, or custom-built equivalents
Backend & platform engineering - Experience building and operating production APIs, services, and platform infrastructure at scale; comfortable working with relational databases, message queues, and event-driven architectures
RAG & retrieval - Experience with production RAG pipelines, vector databases, embedding systems, and retrieval quality
Evaluation & observability - Experience building LLM and agent eval infrastructure, monitoring AI quality, and observability for non-deterministic systems
Nice to Have:
Platform & infra - Kubernetes, AWS, Terraform or similar IaC, CI/CD, service architecture, incident management
Experience with MCP or similar tool-use protocols for agent-to-service communication
Hands-on ML experience - model training, fine-tuning, or working directly with ML pipelines.
This position is open to all candidates.
 
Job ID: 8664323
Posted 4 hours ago
Location: Tel Aviv-Yafo
Job Type: Full Time
At our company, we redefine the vision of cyber defense by combining AI and human expertise to create products that protect nations and critical infrastructure. This is more than a job; it's a dream job. Here we tackle real-world challenges, redefine AI and security, and make the digital world safer. Let's build something extraordinary together.
Our company's AI cybersecurity platform applies a new, out-of-the-ordinary, multi-layered approach, covering endless and evolving security challenges across the entire infrastructure of the most critical and sensitive networks. Built as part of a broader sovereign AI platform, our technology is designed to operate in on-premise, private cloud, and air-gapped environments, enabling nations to maintain full control over their data, infrastructure, and AI capabilities. Central to our company's proprietary Cyber Language Models are innovative technologies that provide contextual intelligence for the future of cybersecurity.
At our company, our talented team, driven by passion, expertise, and innovative minds, inspires us daily. We are not just dreamers, we are dream-makers.
Responsibilities
Design and build agentic systems - single and multi-agent workflows with planning, memory, context engineering, and tool use - for both internal automation and product-facing autonomous capabilities operating over long time horizons.
Build and operate the AI platform layer - LLM gateways, prompt management, structured output handling, tool-calling infrastructure, and cost/latency optimization - deployed on Kubernetes, consumed by every team for their agentic work.
Own the agent framework layer - orchestration primitives, execution environments, state management, and sandboxed tool execution - giving every team at our company the building blocks to create and operate their own agents.
Build evaluation infrastructure that gives teams confidence in agent behavior - automated LLM and agent evals for quality, correctness, safety, latency, cost, and regressions, including human-in-the-loop oversight for mission-critical workflows.
Productionize and harden backend services (APIs, gRPC, async workers) that integrate LLMs - with proper error handling, retries, circuit breakers, and high-availability patterns.
Own RAG pipelines and retrieval systems - indexing, chunking, embedding, vector database management, filtering, and relevance tuning for production retrieval.
Optimize performance and cost across the AI stack - model routing, caching, batching, and inference cost management.
Ship shared tooling - libraries, SDKs, agent templates, and documentation - while working closely with ML Platform, Data Platform, DevOps, and other teams across the Applied AI Engineering group. Own architecture, documentation, and operations end-to-end.
Requirements:
5+ years in backend or distributed systems engineering, with 2+ years focused on production systems that integrate AI/ML models or LLMs.
Engineering craft - Strong Python, Go, or Java, system architecture, API design, testing, and secure coding practices.
Agentic systems - Experience designing and building agent orchestration, tool-use systems, and autonomous workflows; familiarity with frameworks like LangGraph or similar, or experience building an equivalent from scratch
Backend engineering - Experience building production APIs and services (FastAPI or similar); async programming, service architecture, high-availability, and reliability patterns (retries, circuit breakers, backpressure)
LLM integration - Hands-on experience integrating LLMs via SDKs and APIs; context engineering, structured outputs, tool calling, and model routing
RAG & retrieval - Experience with embedding pipelines, vector databases (e.g., Milvus, Qdrant, Pinecone), chunking strategies, and relevance tuning
Evaluation & observability - Experience designing LLM and agent evals, monitoring AI system quality, and building observability for non-deterministic systems.
This position is open to all candidates.
 
Job ID: 8664306
Posted 4 hours ago
Location: Tel Aviv-Yafo
Job Type: Full Time
At our company, we redefine the vision of cyber defense by combining AI and human expertise to create products that protect nations and critical infrastructure. This is more than a job; it's a dream job. Here we tackle real-world challenges, redefine AI and security, and make the digital world safer. Let's build something extraordinary together.
Our company's AI cybersecurity platform applies a new, out-of-the-ordinary, multi-layered approach, covering endless and evolving security challenges across the entire infrastructure of the most critical and sensitive networks. Built as part of a broader sovereign AI platform, our technology is designed to operate in on-premise, private cloud, and air-gapped environments, enabling nations to maintain full control over their data, infrastructure, and AI capabilities. Central to our company's proprietary Cyber Language Models are innovative technologies that provide contextual intelligence for the future of cybersecurity.
At our company, our talented team, driven by passion, expertise, and innovative minds, inspires us daily. We are not just dreamers, we are dream-makers.
Responsibilities
Design and build complex, interactive UIs with React, TypeScript, and Next.js.
Apply design engineering principles to translate Figma designs into highly polished, responsive implementations.
Own component architecture: build reusable, composable, and well-documented components.
Own production health and observability to ensure system reliability at scale.
Ensure accessibility and cross-browser compatibility.
Write tests and maintain code quality across the frontend codebase.
Collaborate with design, backend, and product teams to ship features end-to-end.
Mentor engineers and promote frontend engineering best practices.
Leverage AI-assisted development tools to accelerate workflows and improve code quality.
Requirements:
5+ years frontend engineering experience.
Strong foundations in JavaScript, TypeScript, HTML, and CSS.
Deep experience with React and the Next.js ecosystem, including modern state management patterns.
Hands-on experience translating Figma designs into production code.
Experience building and maintaining component libraries or design systems.
Proven ability to manage frontend observability, track core web vitals, and maintain application health in production.
Strong understanding of accessibility standards and implementation.
Experience with modern build tools (Vite, Turbopack) and testing frameworks (Jest, Playwright, Cypress).
Familiarity with REST and GraphQL APIs and frontend data fetching patterns.
Experience with CI/CD pipelines and frontend deployment workflows.
Eye for design detail and strong collaboration with design teams.
Proficiency with AI coding assistants (Cursor, Claude Code) and a track record of using them to ship faster without sacrificing quality.
Ability to write effective prompts for code generation, review AI-generated code critically, and integrate AI tools into daily development workflows.
Nice to Have:
Experience with AI agent design and orchestrating frontend interactions with frameworks like LangChain or LangGraph.
This position is open to all candidates.
 
Job ID: 8664313
13/05/2026
Location: Tel Aviv-Yafo
Job Type: Full Time
Join our company's AI research group, a cross-functional team of ML engineers, researchers, and security experts building the next generation of AI-powered security capabilities. Our mission is to leverage large language models to understand code, configuration, and human language at scale, and to turn this understanding into security AI capabilities that will drive our company's future AI security solutions.
We foster a hands-on, research-driven culture where you'll work with large-scale data, modern ML infrastructure, and a global product footprint that impacts over 100,000 organizations worldwide.
Key Responsibilities
As a Senior ML Research Engineer, you will be responsible for the end-to-end lifecycle of large language models: from data definition and curation, through training and evaluation, to providing robust models that can be consumed by product and platform teams.
Own training and fine-tuning of LLMs / seq2seq models: Design and execute training pipelines for transformer-based models (encoder-decoder, decoder-only, retrieval-augmented, etc.), and fine-tune open-source LLMs on our company-specific data (security content, logs, incidents, customer interactions).
Apply advanced LLM training techniques such as instruction tuning, preference / contrastive learning, LoRA / PEFT, continual pre-training, and domain adaptation where appropriate.
Work deeply with data: define data strategies with product, research and domain experts; build and maintain data pipelines for collecting, cleaning, de-duplicating and labeling large-scale text, code and semi-structured data; and design synthetic data generation and augmentation pipelines.
Build robust evaluation and experimentation frameworks: define offline metrics for LLM quality (task-specific accuracy, calibration, hallucination rate, safety, latency and cost); implement automated evaluation suites (benchmarks, regression tests, red-teaming scenarios); and track model performance over time.
Scale training and inference: use distributed training frameworks (e.g. DeepSpeed, FSDP, tensor/pipeline parallelism) to efficiently train models on multi-GPU / multi-node clusters, and optimize inference performance and cost with techniques such as quantization, distillation and caching.
Collaborate closely with security researchers and data engineers to turn domain knowledge and threat intelligence into high-value training and evaluation data, and to expose your models through well-defined interfaces to downstream product and platform teams.
Requirements:
What You Bring
5+ years of hands-on work in machine learning / deep learning, including 3+ years focused on NLP / language models.
Proven track record of training and fine-tuning transformer-based models (BERT-style, encoder-decoder, or LLMs), not just consuming hosted APIs.
Strong programming skills in Python and at least one major deep learning framework (PyTorch preferred; TensorFlow).
Solid understanding of transformer architectures, attention mechanisms, tokenization, positional encodings, and modern training techniques.
Experience building data pipelines and tools for large-scale text / log / code processing (e.g. Spark, Beam, Dask, or equivalent frameworks).
Practical experience with ML infrastructure, such as experiment tracking (Weights & Biases, MLflow or similar), job orchestration (Airflow, Argo, Kubeflow, SageMaker, etc.), and distributed training on multi-GPU systems.
Strong software engineering practices: version control, code review, testing, CI/CD, and documentation.
Ability to own research and engineering projects end-to-end: from idea, through prototype and controlled experiments, to models ready for integration by product and platform teams.
Good communication skills and the ability to work closely with non-ML stakeholders (security experts, product managers, engineers).
This position is open to all candidates.
 
Job ID: 8650168
10/05/2026
Location: Tel Aviv-Yafo
Job Type: Full Time and Hybrid work
We are always looking for exceptional talent to join us on the journey!


Your Mission

As an MLOps Engineer at Nuvei, your mission is to design, build, and operate the platforms that power our machine learning and generative AI products, spanning real-time use cases such as large-scale fraud scoring and support for MCP and agentic workflows. You'll create reliable CI/CD for models and agents, robust data/feature pipelines, secure model serving, and comprehensive observability. You will also support our agentic AI ecosystem and Model Context Protocol (MCP) services so that models can safely use tools, data, and actions across .
You will partner closely with Data Scientists, Data/Platform Engineers, Product, and SRE to ensure every model, from classic ML to LLM/RAG agents, moves from prototype to production with strong reliability, governance, cost efficiency, and measurable business impact.
Responsibilities:
Operate and develop ML/LLM platforms on Kubernetes and cloud (Azure; AWS/GCP also acceptable) with Docker, Terraform, and other relevant tools
Manage object storage, GPUs, and autoscaling for training & low-latency model serving
Manage cloud environment, networking, service mesh, secrets, and policies to meet PCI-DSS and data-residency requirements
Build end-to-end CI/CD for models/agents/MCP tooling (versioning, tests, approvals)
Deliver real-time fraud/risk scoring & agent signals under strict latency SLOs.
Maintain MCP servers/clients: tool/resource definitions, versioning, quotas, isolation, access controls
Integrate agents with microservices, event streams, and rule engines; provide SLAs, tracing, and on-call runbooks
Measure operational metrics of ML/LLM (latency, throughput, cost, tokens, tool success, safety events)
Enforce governance: RBAC/ABAC, row-level security, encryption, PII/secrets management, audit trails.
Partner with DS on packaging (wheels/conda/containers), feature contracts, and reproducible experiments.
Lead incident response and post-mortems.
Drive FinOps: right-sizing, GPU utilization, batching/caching, budget alerts.
Requirements:
4+ years in DevOps/MLOps/Platform roles building and operating production ML systems (batch and real-time)
Strong hands-on with Kubernetes, Docker, Terraform/IaC, and CI/CD
Practical experience with Spark/Databricks and scalable data processing
Proficiency in Python & Bash
Ability to operate DS code and optimize runtime performance.
Experience with model registries (MLflow or similar), experiment tracking, and artifact management.
Production model serving using FastAPI/Ray Serve/Triton/TorchServe, including autoscaling and rollout strategies
Monitoring and tracing with Prometheus/Grafana/OpenTelemetry; alerting tied to SLOs/SLAs
Solid understanding of PCI-DSS/GDPR considerations for data and ML systems
Experience with the Azure cloud environment is a big plus
Operating LLM/agent workloads in production (prompt/config versioning, tool execution reliability, fallback/retry policies)
Building/maintaining RAG stacks (indexing pipelines, vector DBs, retrieval evaluation, hybrid search)
Implementing guardrails (policy checks, content filters, allow/deny lists) and human-in-the-loop workflows
Experience with feature stores - Qwak Feature Store, Feast
A/B testing for models and agents, offline/online evaluation frameworks
Payments/fraud/risk domain experience; integrating ML outputs with rule engines and operational systems - Advantage
Familiarity with Databricks Unity Catalog, dbt, or similar tooling
This position is open to all candidates.
 
8644480
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for a Data Science Team Lead.
As the Data Science Team Lead, you will lead a talented team of data scientists and ML engineers building the infrastructure, systems, and workflows for designing, training, evaluating, and deploying machine learning models that protect millions of users worldwide from fraud and account compromise.
This role combines hands-on technical leadership with people management and strategic ownership. You will drive innovation across real-time model serving, customer-specific model tuning, offline AI evaluations, and scalable ML systems in a production-grade SaaS environment.
If you are passionate about applied machine learning, fraud detection, and building intelligent systems at scale - we want you on our team.
What you'll do:
Lead and mentor a team of Data Scientists and ML Engineers focused on fraud detection and response capabilities.
Build ML infrastructure for designing, training, evaluating, and optimizing machine learning models for real-time fraud prevention and risk assessment.
Own the lifecycle of ML models in production, including experimentation, deployment, monitoring, retraining, and performance optimization.
Drive customer-specific model training and tuning strategies to improve accuracy and adaptability across different customer environments.
Build and improve offline AI evaluation frameworks to measure model quality, drift, effectiveness, and business impact.
Collaborate closely with Engineering, Product, Security, and Data teams to deliver scalable and reliable AI-powered capabilities.
Define best practices for model serving, feature engineering, experimentation, observability, and operational excellence.
Balance model performance, latency, scalability, explainability, and operational constraints in high-scale production environments.
Promote a culture of technical excellence, continuous improvement, ownership, and innovation.
Requirements:
5+ years of experience in Data Science, Machine Learning, or Applied AI roles, with at least 2 years in a leadership capacity.
Strong hands-on experience building and deploying ML models in production environments.
Experience with real-time inference/model serving architectures and low-latency prediction systems.
Deep understanding of model training, evaluation, tuning, and monitoring methodologies.
Experience designing customer-specific ML solutions and personalization strategies.
Strong programming skills in Python and experience with modern ML frameworks and tooling.
Proven ability to lead technical initiatives and guide teams in fast-paced, production-focused environments.
Strong analytical and problem-solving skills with a data-driven mindset.
Excellent communication and cross-functional collaboration skills.
Advantages:
Experience with fraud detection, identity risk, cybersecurity, or behavioral analytics systems.
Experience with MLOps practices and tooling.
Background in Data Engineering and large-scale data processing systems.
Experience with feature stores, stream processing, and real-time data pipelines.
Familiarity with cloud platforms such as AWS, GCP, or Azure.
Experience with Kubernetes, Kafka, Spark, Airflow, or similar distributed systems technologies.
Bachelor's degree in Computer Science, Mathematics, Statistics, Engineering, or a related field.
This position is open to all candidates.
 
8659154
5 days ago
Location: Tel Aviv-Yafo
Job Type: Full Time
The MLIL DataPlane team is looking for a Senior Software Development Engineer to own the design and implementation of our inference data plane. We build the software that makes large models run efficiently on custom hardware - spanning model execution, memory management, data movement, and serving integration.
Our work covers the full inference path: integrating serving engines with custom hardware, developing high-performance compute kernels, enabling efficient data movement, and driving models from early validation through production. We operate at frontier scale with large distributed models.
This is a ground-up effort with rapidly evolving hardware and software. We need a senior IC who can write and optimize low-level code for custom hardware, validate model architectures end-to-end, build test and profiling infrastructure, and drive performance across the stack.

Key job responsibilities
- Develop and optimize compute kernels for a custom ML accelerator architecture, targeting production-level performance for large language model inference.
- Implement and validate LLM architectures (decoder-only, mixture-of-experts) end-to-end - from PyTorch model definition through distributed execution on custom hardware.
- Integrate custom accelerator backends into open-source ML serving frameworks (vLLM, PyTorch), including scheduler extensions, memory management, and model parallelism.
- Build and maintain test infrastructure for model correctness validation across CPU, GPU, simulator, and hardware targets.
- Profile and optimize inference workloads - identify bottlenecks, instrument critical paths, and drive latency and throughput improvements from simulation through hardware bringup.
- Own features end-to-end: from design through implementation, testing, and integration into the broader software stack.
- Contribute to CI/CD pipelines that gate model and kernel changes on correctness and performance regressions.
- Mentor engineers, drive design reviews, and raise the engineering bar across the team.
Requirements:
Basic Qualifications
- Bachelor's degree in computer science or equivalent.
- 7+ years of experience across the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations.
- Knowledge of Machine Learning and LLM fundamentals, including transformer architecture, training/inference lifecycles, and optimization techniques.
- Knowledge of computer architecture, operating systems, and parallel computing.
- Strong proficiency in C/C++.
- Strong Linux systems knowledge.
- Experience developing compute kernels for GPUs, DSPs, or custom accelerators.
- Proven track record of owning and delivering complex software features end-to-end.

Preferred Qualifications
- Knowledge of ML frameworks including JAX, PyTorch, vLLM, SGLang, Dynamo, TorchXLA, and TensorRT.
- Experience in developing and deploying LLMs in production on GPUs, Neuron, TPU or other AI acceleration hardware, or experience with CUDA kernels or ML/low-level kernels.
- Familiarity with speculative decoding, KV cache optimization, or other LLM serving optimizations.
- Experience with distributed systems - collective communication, RDMA, or high-speed interconnect programming.
- Experience with hardware simulation environments and model validation workflows.
- Demonstrated early adopter of AI-assisted development tools - uses LLMs or code-generation agents as part of daily workflow.
This position is open to all candidates.
 
8660080
03/05/2026
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for a Senior Engineer with a data engineering background to join our growing ML Platform team. This is a great opportunity, whether you have experience with ML and are looking for an ML-focused product or are an experienced Data Engineer looking to enter the world of ML. Together we'll provide tools to develop more effective models, get them into production faster, and ensure that they continue to perform well over time.
ML is central to our work. It enables us to process billions of dollars' worth of e-commerce transactions, make decisions in real time, identify fraud rings, and quickly detect new attack methods. Precision is crucial - bad decisions by our models cost us directly and put money into the pockets of fraudsters.
Our adoption by merchants around the world provides us with billions of fresh data points each day. Our team of data scientists, analysts, and cyber intelligence specialists continually identify new signals, engineer new features, and research new models. But as the volume of data and the number and complexity of models grow, so do the engineering challenges.
If this kind of working environment sounds exciting to you, if you understand that Engineering is about building the most effective and elegant solution within a given set of constraints - consider applying for this position.
Why should you join us?
You'll be part of a highly proficient engineering team that is a focal point for all ML engineering activity, striving to constantly bring innovation and leverage ML capabilities across all company teams and products.
This role presents a unique opportunity to enter the ML domain. For those already experienced in ML infrastructure, it offers the chance to grow within a team that specializes in high-scale, Big Data and ML systems.
What you will be doing:
Designing, building, and maintaining the ML infrastructure that allows our models to make billions of real-time decisions every year.
Building a platform that enables managing a full ML model lifecycle - from researching to training, deploying, and serving predictions in real-time.
Building distributed data processing pipelines to support model development.
Acting as a consultant to researchers, data scientists, and expert analysts and enabling them to research new models faster and with greater precision by providing cutting-edge tooling.
Expanding our ML infrastructure to make it scalable, quick, and efficient to bring diverse models to production and to monitor their performance and drift over time.
Expanding the pool of internal customers able to use ML, working with them to understand their needs and helping them make the most of the infrastructure that we'll provide.
Acting as an advocate for MLOps, continually improving our processes, and raising our standards.
Requirements:
4+ years of experience with large-scale data processing, ideally with Apache Spark.
5+ years developing complex software projects in at least one general-purpose language (preferably Python, but not a must).
Backend and server-side development experience of complex, highly scalable systems
Experienced with machine learning concepts and frameworks.
Motivation to understand the needs of internal users, provide them with great tooling, and teach them how to use it.
Experience working with public clouds (AWS / GCP / Azure)
Fluent in written and spoken English
It'd be really cool if you also:
Are familiar with Databricks or Airflow.
Are comfortable in a containerized environment.
Have experience with maintaining highly available, low latency, real-time services.
This position is open to all candidates.
 
8633510