MLOps Team Lead

Posted 50 minutes ago
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for an exceptional MLOps Team Lead to own, build, and scale the infrastructure and automation that powers Labs state-of-the-art Large Language Models (LLMs) and AI systems.
This is a technical leadership role that blends hands-on engineering with strategic vision. You will define MLOps best practices, build high-performance ML infrastructure, and lead a world-class team working at the intersection of AI research and production-grade ML systems.
You will work closely with LLM Algorithm Researchers, ML Engineers, and Data Scientists to enable fast, scalable, and reliable ML workflows covering everything from distributed training to real-time inference optimization.
If you have deep technical expertise, thrive in high-scale AI environments, and want to lead the next generation of MLOps, we want to hear from you.
Requirements:
3+ years of experience in MLOps, ML infrastructure, or AI platform engineering.
2+ years of hands-on experience in ML pipeline automation, large-scale model deployment, and infrastructure scaling.
Expertise in deep learning frameworks (like PyTorch, TensorFlow, JAX) and MLOps platforms (like Kubeflow, MLflow, TFX).
Proven track record of building production-grade ML systems that scale to billions of predictions daily.
Deep knowledge of Kubernetes, cloud-native architectures (AWS/GCP), and infrastructure as code (Terraform, Helm, ArgoCD).
Strong software engineering skills in Python, Bash, and Go, with a focus on writing clean, maintainable, and scalable code.
Experience with observability & monitoring stacks (Prometheus, Grafana, Datadog, OpenTelemetry).
Strong background in security, compliance, and model governance for AI/ML systems.
This position is open to all candidates.
 
8443005
Similar jobs that may interest you
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for a highly skilled AI Engineer with a strong engineering mindset to bridge the gap between research and production.

In this role, you will be responsible for validating AI models developed by our Data Science team against real-world production systems, and then leading their optimization, deployment, and ongoing maintenance.

You will be part of the R&D team, working closely with engineers, data scientists, and product managers to ensure our AI solutions are scalable, reliable, and deliver long-term value.

If you enjoy working at the intersection of AI and engineering (bringing models to life in production, optimizing for performance, and building reliable systems), this role is for you!

Responsibilities
Lead the transition of AI models from proof-of-concept to full-scale production, ensuring they meet architectural, scalability, and performance standards.
Build observability and troubleshooting tools for AI services in production, including logging, performance tracking, and failure analysis pipelines.
Optimize inference performance, including latency, resource usage, and throughput, while maintaining model quality.
Manage model versioning and deployment readiness, including handoff processes, rollback plans, and configuration management.
Partner with the Data Science team to assess model readiness for production, validate input and output compatibility, and ensure assumptions align with real-world system behavior.
Collaborate cross-functionally with DevOps, backend engineers, and data scientists to ensure scalable, secure, and cost-effective deployment of ML models.
Requirements:
3-5 years of experience in ML Engineering, AI model deployment, or MLOps roles.
Strong software engineering background with hands-on experience (Python or Java preferred) building and maintaining production ML services.
Solid understanding of machine learning systems and inference pipelines.
Familiarity with monitoring practices and production diagnostics for ML services (e.g., logs, metrics, alerting).
Proven experience in optimizing AI models for performance (response time, memory, CPU usage), particularly in real-time or large-scale environments.
Strong proficiency with ML frameworks (TensorFlow, PyTorch, Scikit-Learn, etc.).
Experience deploying AI solutions in cloud environments (AWS, GCP, or Azure).
This position is open to all candidates.
 
8423266
Posted 2 days ago
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
The ML Detection team is a new, cross-functional team at the core of our product strategy. Its mission is to own and advance the machine learning-based validations that form the foundation of our accessibility testing solutions. This includes everything from data collection and research to deploying, monitoring, and maintaining models in production. You will lead a multidisciplinary team of Data Scientists, ML/Data Engineers, and Algorithm Developers to build foundational models and standalone validations that extend the reach and precision of our products, giving Evinced a critical competitive edge.

Your Impact:

Lead, mentor, and grow a multidisciplinary team of Data Scientists, ML Engineers, and Algorithm Developers, fostering a culture of trust, collaboration, and professional growth.
Own the end-to-end delivery of ML-based detection capabilities, from initial research and data collection to model deployment, monitoring, and algorithmic post-processing.
Provide hands-on leadership, actively contributing to technical solutions, architecture, and design reviews for new algorithms and models.
Collaborate closely with Product, DevOps, and R&D teams to align on priorities, manage integrations, and ensure efficient, impactful development cycles.
Architect and oversee the development of the infrastructure required for the entire ML lifecycle, including data management, experimentation platforms, and model deployment/monitoring systems.
Define and refine team processes, tools, and methodologies to optimize performance, quality, and stability, ensuring data science tasks have clear definitions of done.
Drive the strategic vision for the evolution of our detection capabilities, including UI patterns, interactive elements and accessibility issues, to meet current and future needs.
Act as the primary problem solver and technical resource for the team, guiding them through complex research and engineering challenges.
Requirements:
8+ years of experience in software development or data science, with a focus on delivering ML/AI-powered features or products.
4+ years of experience leading, mentoring, and managing development teams, preferably multidisciplinary teams that include Data Scientists and ML Engineers.
Strong hands-on technical expertise in the ML project lifecycle, including data engineering, research, and production workflows.
Experience with modern cloud environments (GCP/Azure/AWS) and MLOps practices (e.g., model monitoring, managed training pipelines, feature stores).
Familiarity with Python and common data science/ML frameworks.
Proven ability to architect and design scalable, reliable ML systems and the infrastructure to support them.
Experience collaborating closely with product managers and other engineering teams in a fast-paced environment.
Excellent problem-solving skills with a proactive, "can-do" attitude and the ability to navigate ambiguity.
Bachelor's degree in Computer Science, Engineering, a related field, or equivalent experience.
This position is open to all candidates.
 
8437596
Posted 1 day ago
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We're seeking a visionary and technically hands-on AI Team Leader to join Alison.ai and take charge of driving innovation across our core AI systems. As the AI Team Leader, you'll guide the architecture and development of advanced machine learning models that power our video and creative analysis platform.
You will collaborate cross-functionally with product managers, engineering, and business development teams to deliver cutting-edge features that push the boundaries of what's possible in creative performance analysis.
This is a high-impact role for someone who thrives at the intersection of machine learning, product strategy, and creativity.
Key Responsibilities:
Lead the development, deployment, and optimization of machine learning models for video, image, and creative performance analysis.
Architect scalable AI/ML pipelines and infrastructure that support real-time and batch processing of multimedia data.
Guide research and experimentation initiatives: identify new technologies, modeling techniques, and opportunities for innovation.
Mentor and grow a team of machine learning engineers and data scientists as we scale.
Champion AI ethics, fairness, and explainability in our model development and deployment.
Stay ahead of industry trends in generative AI, computer vision, NLP, and Martech innovation, translating insights into competitive advantages.
Requirements:
Hands-on experience in generative AI or ML, with a strong track record of delivering real-world tools, prototypes, or research-backed systems.
Experience integrating AI solutions with APIs, data pipelines, and external systems in production environments.
Deep expertise in multimodal learning, generative models, or agent-based frameworks, especially involving LLMs.
Strong programming skills in Python and SQL, with hands-on experience in building and deploying AI/ML pipelines.
Understanding of cloud platforms (e.g., AWS, GCP, Azure) and AI infrastructure, including MLOps best practices.
Proven ability to integrate and operationalize AI-assisted development tools (e.g., GitHub Copilot, Cursor).
Previous experience in leading or mentoring AI/ML teams, fostering collaboration and technical excellence.
Excellent communication and collaboration skills, with the ability to translate technical advances into business impact.
This position is open to all candidates.
 
8439359
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We're hiring a Machine Learning Engineering Manager to guide and grow a high-impact ML team driving AI-powered innovation across a B2B SaaS platform. You'll lead the design and delivery of AI solutions while mentoring engineers and setting the technical direction for AI-first development at scale.
This is a leadership role with a balance of hands-on engineering and team management, perfect for someone who thrives on solving technical challenges, inspiring a team, and shaping the future of AI in fintech automation.
What You Will Do:
Lead & Mentor: Manage, mentor, and grow a team of ML engineers, fostering technical excellence and career development.
Set Technical Direction: Define the ML strategy, ensuring best practices in architecture, frameworks, and operationalization.
Build and deploy AI-based solutions: Oversee the development and deployment of GenAI/LLM-powered solutions that address real-world challenges across products.
Scale & Operationalize: Establish scalable ML infrastructure, CI/CD, observability, and data pipelines for high-availability production systems.
Collaborate Cross-Functionally: Partner with product managers, engineers, and business stakeholders, and clearly communicate progress, challenges, and outcomes.
Requirements:
7+ years of experience as a Backend Developer / Data Engineer / ML Engineer.
3+ years in a technical leadership role.
Proficiency in Python (Java is an advantage).
Bachelor's degree in Computer Science or a related STEM field (Master's preferred).
Proven track record of building and deploying AI-based solutions at scale.
Deep expertise with LLMs and ML frameworks (e.g., LangChain, LangGraph, Hugging Face, TensorFlow, PyTorch).
Strong background in system design, cloud-native architecture, and microservices.
Experience with NoSQL and real-time data processing pipelines.
Exceptional leadership, mentorship, and communication skills.
Strategic mindset with the ability to balance hands-on coding and team leadership.
This position is open to all candidates.
 
8435436
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We're hiring an ML Engineer to accelerate AI-driven innovation across a B2B SaaS platform.
You'll be at the forefront of building intelligent systems that power core product experiences and automate internal operations, driving efficiency, speed, and scale across the organization. This is a high-impact, hands-on role in a fast-growing, AI-first company where machine learning is a foundational pillar, not a bolt-on feature. You'll partner with product, engineering, and operations teams to design and implement powerful ML and LLM-based solutions that make a measurable difference.
What You Will Do:
Build Intelligent Systems: Design and develop ML/LLM-powered solutions that solve real-world challenges across product and internal workflows.
Own Full Lifecycles: Take projects from concept all the way to production, including model training, evaluation, integration, and monitoring.
Leverage State-of-the-Art Tools: Work with leading frameworks like LangChain, Hugging Face, TensorFlow, and PyTorch to deliver cutting-edge functionality.
Collaborate Cross-Functionally: Partner with product managers, engineers, and stakeholders to embed AI capabilities into user-facing features and backend services.
Ship at Scale: Build and maintain scalable APIs and services, integrating best practices in CI/CD, observability, and cloud infrastructure.
Report with Impact: Share progress, challenges, and results clearly with technical and executive stakeholders.
Requirements:
6+ years of experience as a Backend Developer, Data Engineer, or ML Engineer
Bachelor's degree in Computer Science or a related STEM field
Strong proficiency in Python and ML tooling
Proven ability to build production-grade ML systems end-to-end
Deep experience with LLMs and ML frameworks (e.g., LangChain, LangGraph, Hugging Face, TensorFlow, PyTorch)
Solid foundation in system design, architecture, and microservice patterns
Excellent problem-solving skills and ownership mindset
Strong collaboration and communication abilities
Bonus if you have:
M.Sc. in Computer Science, Software Engineering, or similar field
Experience building and scaling LLM-powered applications
Familiarity with AWS and DevOps best practices (CI/CD, monitoring, IaC)
Exposure to NoSQL and real-time data processing pipelines
This position is open to all candidates.
 
8435449
Location: Tel Aviv-Yafo
Job Type: Full Time
In this role, you will lead a team of Machine Learning Scientists and Engineers dedicated to building, training, and deploying cutting-edge Generative AI models. This includes developing foundation models trained on our company's extensive textual data and creating fine-tuned models designed to tackle complex travel-related tasks. Your leadership will play a pivotal role in advancing the application of AI to transform the travel experience for millions of customers.
As a technical manager, you should be passionate about GenAI technology, keep up to date with recent breakthroughs in the field, define and shape the team's ML roadmap, and not be afraid to get your hands dirty with code when needed.
You are expected to be the focal point for all technical aspects, make sure your team members deliver on their tasks, and work together with other stakeholders to define and shape the roadmap of our products. You will work independently and will also be responsible for making technical decisions within your team.
When it comes to management, your expertise in handling people will motivate and inspire them to reach outstanding success! You should have experience in developing people. You will mentor and coach your team while working closely with a Product Manager.
Key Job Responsibilities and Duties:
Leadership in LLM Development: Build, guide, and mentor a team of ML scientists and ML engineers in the development, fine-tuning, and deployment of large language models (LLMs) tailored for the travel domain.
Expertise in the engineering aspects of deploying LLMs at scale with minimal latency. This includes optimizing model performance, scalability, and efficiency to meet the demands of real-time, high-traffic applications.
Define and communicate the technical vision and strategy for LLM-related initiatives, ensuring alignment with company goals and customer needs.
Foster a culture of collaboration, innovation, and excellence within the team.
Prioritize work in collaboration with Product Managers, depending on business needs and keeping stakeholders aligned at all times.
Translate machine learning vision and strategy into planning and execution, and ensure timely delivery of their plans.
Develop innovative ML models, algorithms, and engineering approaches or identify existing ones, with the potential to impact our business.
Design and execute applied research plans to understand, apply, test, evolve, and generalize these technologies into reusable frameworks.
Translate business problems into viable, reliable and robust ML and AI solutions, accounting for constraints of the production environment.
Monitor product health, performance and business impact and act accordingly when requirements are not met.
Identify underlying issues and opportunities across domains and situations that are not obviously related through application of structured thinking and logic.
Solve issues by applying methods and insights gained from a variety of disciplines, navigating a variety of environments.
Requirements:
Leadership Experience: At least 4 years of experience leading ML teams in Natural Language Processing (NLP) or Generative AI (GenAI) domains, with a proven ability to guide teams in achieving impactful results.
LLM Expertise: Advanced knowledge and experience in managing teams developing Large Language Models (LLMs), with strong expertise in the engineering aspects of scalable LLM deployment, ensuring optimal performance and minimal latency.
Academic and Applied Background:
MSc with 6+ years of professional experience, or PhD with 4+ years of experience, applying Machine Learning to solve business challenges.
Master's, PhD, or equivalent experience in a quantitative field (e.g., Computer Science, Engineering, Mathematics, Artificial Intelligence, Physics, etc.).
Strong advantage for candidates whose MSc or PhD thesis work is related to NLP, showcasing deep research capabilities in this field.
This position is open to all candidates.
 
8430187
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
Required: Senior MLOps Engineer
Realize your potential by joining the leading performance-driven advertising company!
As a Senior MLOps Engineer in the Infra group, you'll play a vital role in developing, enhancing, and maintaining highly scalable machine learning infrastructure and tools.
About Algo platform:
The objective of the algo platform group is to own the existing algo platform (including health, stability, productivity and enablement), to facilitate and be involved in new platform experimentation within the algo craft and lead the platformization of the parts which should graduate into production scale. This includes support of ongoing ML projects while ensuring smooth operations and infrastructure reliability, owning a full set of capabilities, design and planning, implementation and production care.
The group has deep ties with both the algo craft as well as the infra group. The group reports to the infra department and has a dotted line reporting to the algo craft leadership.
The group serves as the professional authority when it comes to ML engineering and ML ops, serves as a focal point in a multidisciplinary team of algorithm researchers, product managers, and engineers and works with the most senior talent within the algo craft in order to achieve ML excellence.
How you'll make an impact:
As a Senior MLOps Engineer, you'll bring value by:
Develop, enhance and maintain highly scalable Machine-Learning infrastructures and tools, including CI/CD, monitoring and alerting and more
Have end to end ownership: Design, develop, deploy, measure and maintain our machine learning platform, ensuring high availability, high scalability and efficient resource utilization
Identify and evaluate new technologies to improve performance, maintainability, and reliability of our machine learning systems
Work in tandem with the engineering-focused and algorithm-focused teams in order to improve our platform and optimize performance
Optimize machine learning systems to scale and utilize modern compute environments (e.g. distributed clusters, CPU and GPU) and continuously seek potential optimization opportunities.
Build and maintain tools for automation, deployment, monitoring, and operations.
Troubleshoot issues in our development, production and test environments
Directly influence the way billions of people discover the internet
Our tech stack:
Java, Python, TensorFlow, Spark, Kafka, Cassandra, HDFS, vespa.ai, ElasticSearch, AirFlow, BigQuery, Google Cloud Platform, Kubernetes, Docker, git and Jenkins.
Requirements:
To thrive in this role, you'll need:
Experience developing large-scale systems. Experience with filesystems, server architectures, distributed systems, SQL and NoSQL. Experience with Spark and Airflow or other orchestration platforms is a big plus.
Highly skilled in software engineering methods, with 5+ years of experience.
Passion for ML engineering and for creating and improving platforms
Experience with designing and supporting ML pipelines and models in production environment
Excellent coding skills in Java & Python
Experience with TensorFlow is a big plus
Possess strong problem solving and critical thinking skills
BSc in Computer Science or related field.
Proven ability to work effectively and independently across multiple teams and beyond organizational boundaries
Deep understanding of Computer Science fundamentals: object-oriented design, data structures, systems, application programming, and multithreaded programming
Strong communication skills to be able to present insights and ideas, and excellent English, required to communicate with our global teams.
Bonus points if you have:
Experience in leading Algorithms projects or teams.
Experience in developing models using deep learning techniques and tools
Experience in developing software within a distributed computation framework.
This position is open to all candidates.
 
8437899
Posted 28/10/2025
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are a fast-growing digital health SaaS company that's on a mission to transform the way physicians interact with patient data. Thousands of clinicians across the United States already use our company's AI-powered solution that transforms complex and fragmented patient data into concise "patient portraits" and actionable clinical insights at the point of care. With our company, physicians experience less burnout, reduce missed diagnoses, and can devote more time to giving better care to their patients.
Our company has been named one of the Top 100 AI companies globally by CB Insights and made the list of the Top 50 Digital Health startups. We are already working with industry-leading value-based organizations including Privia Health and Agilon.
We're looking for talented, versatile and highly independent Machine Learning Engineers to join our winning team. You will be in charge of end-to-end development and you'll have a massive impact on the product and the technical decisions made.
The ML Engineers are part of our infrastructure group, working closely with the exceptional research team, taking trained models and scaling them out to production.
Responsibilities
Take ownership of the entire machine learning engineering lifecycle - from building scalable training and evaluation pipelines to deploying models in production, with robust monitoring and maintenance systems.
Help in creating scalable solutions by enabling us to continuously increase the accuracy of our algorithms across thousands of clinics.
Design a secure large-scale system that is suitable for sensitive patient data.
Continue enhancing our deep learning infrastructure to support our AI models at scale, including CI/CD, automation, testing, and monitoring.
Collaborate with the research, medical and product teams in implementing ML solutions to the digital health space.
Requirements:
5+ years of hands-on experience in software engineering (Backend preferably in Python).
2+ years of experience in machine learning pipelines on cloud environments.
Knowledge in statistics and machine learning techniques.
Proven ability to lead product feature development, from concept to production.
Experience with large scale, high performance, production environments.
Experience working with SQL and NoSQL databases.
Experience in AWS cloud environment.
Advantages:
B.Sc. / M.Sc. in Computer Science / Software Engineering.
Experience with ML frameworks such as PyTorch, TensorFlow, and MLflow.
Experience with Deep Learning, NLP and LLM pipelines (RAG and agentic systems).
This position is open to all candidates.
 
8390090
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
Required: Staff MLOps Engineer
Realize your potential by joining the leading performance-driven advertising company!
As a Staff MLOps Engineer in the Infra group, you'll play a vital role in developing, enhancing, and maintaining highly scalable machine learning infrastructure and tools.
About Algo platform:
The objective of the algo platform group is to own the existing algo platform (including health, stability, productivity and enablement), to facilitate and be involved in new platform experimentation within the algo craft and lead the platformization of the parts which should graduate into production scale. This includes support of ongoing ML projects while ensuring smooth operations and infrastructure reliability, owning a full set of capabilities, design and planning, implementation and production care.
The group has deep ties with both the algo craft as well as the infra group. The group reports to the infra department and has a dotted line reporting to the algo craft leadership.
The group serves as the professional authority when it comes to ML engineering and ML ops, serves as a focal point in a multidisciplinary team of algorithm researchers, product managers, and engineers and works with the most senior talent within the algo craft in order to achieve ML excellence.
How you'll make an impact:
As a Staff MLOps Engineer, you'll bring value by:
Develop, enhance and maintain highly scalable Machine-Learning infrastructures and tools, including CI/CD, monitoring and alerting and more
Have end to end ownership: Design, develop, deploy, measure and maintain our machine learning platform, ensuring high availability, high scalability and efficient resource utilization
Identify and evaluate new technologies to improve performance, maintainability, and reliability of our machine learning systems
Work in tandem with the engineering-focused and algorithm-focused teams in order to improve our platform and optimize performance
Optimize machine learning systems to scale and utilize modern compute environments (e.g. distributed clusters, CPU and GPU) and continuously seek potential optimization opportunities.
Build and maintain tools for automation, deployment, monitoring, and operations.
Troubleshoot issues in our development, production and test environments
Directly influence the way billions of people discover the internet
Our tech stack:
Java, Python, TensorFlow, Spark, Kafka, Cassandra, HDFS, vespa.ai, ElasticSearch, AirFlow, BigQuery, Google Cloud Platform, Kubernetes, Docker, git and Jenkins.
Requirements:
Experience developing large-scale systems. Experience with filesystems, server architectures, distributed systems, SQL and NoSQL. Experience with Spark and Airflow or other orchestration platforms is a big plus.
Highly skilled in software engineering methods, with 5+ years of experience.
Passion for ML engineering and for creating and improving platforms
Experience with designing and supporting ML pipelines and models in production environment
Excellent coding skills in Java & Python
Experience with TensorFlow is a big plus
Possess strong problem solving and critical thinking skills
BSc in Computer Science or related field.
Proven ability to work effectively and independently across multiple teams and beyond organizational boundaries
Deep understanding of Computer Science fundamentals: object-oriented design, data structures, systems, application programming, and multithreaded programming
Strong communication skills to be able to present insights and ideas, and excellent English, required to communicate with our global teams.
Bonus points if you have:
Experience in leading Algorithms projects or teams.
Experience in developing models using deep learning techniques and tools
Experience in developing software within a distributed computation framework
This position is open to all candidates.
 
8439446
Location: Tel Aviv-Yafo
Job Type: Full Time
We are seeking a highly skilled and motivated Team Leader to build and lead a new team dedicated to developing orchestration tools and software solutions for AI datacenters.
The main goal of this team is to design and deliver customer-focused orchestration platforms that simplify the deployment, management, and monitoring of large-scale AI workloads.
This role combines technical leadership with hands-on development, covering the entire AI datacenter ecosystem including switches, hosts, smart NICs, GPUs, ROCm, and RCCL. The team will primarily develop in Python, complemented by modern full-stack technologies for user interfaces and control systems.
Key Responsibilities:
Lead and mentor a team of engineers building orchestration tools that manage complex AI datacenter infrastructures.
Define the team's vision, roadmap, and architecture for orchestration solutions that enhance customer experience and operational efficiency.
Design and implement distributed control and orchestration systems using Python and full-stack frameworks.
Collaborate with networking, compute, and AI acceleration teams to integrate orchestration capabilities across all datacenter components (switches, NICs, GPUs, and software stacks).
Work closely with product, QA, and DevOps teams to identify customer requirements and translate them into scalable, production-grade orchestration platforms.
Ensure software reliability, scalability, and maintainability through strong design principles, testing, and CI/CD practices.
Foster a culture of innovation, technical excellence, and cross-functional collaboration.
Requirements:
5+ years of software development experience, including 2+ years in a team leadership or technical lead role.
Strong proficiency in Python for backend, orchestration, and systems integration.
Proven experience in designing and implementing orchestration or control-plane systems for datacenter or cloud environments.
Deep understanding of datacenter infrastructure: networking, compute, storage, or GPU acceleration.
Hands-on experience with containers, orchestration frameworks, and CI/CD pipelines (Kubernetes, Docker, etc.).
Excellent problem-solving, leadership, and communication skills.
Preferred Qualifications:
Experience with AI workloads and GPU software stacks (ROCm, RCCL, PyTorch, TensorFlow).
Familiarity with control-plane architectures, distributed systems, or cluster management frameworks.
Background in telemetry, resource scheduling, or performance optimization for large-scale systems.
Knowledge of microservices, REST/gRPC APIs, and cloud-native architectures.
Practical experience with full-stack development (React, Angular, Node.js, or similar).
Experience with testing frameworks (pytest, Robot Framework, etc.).
This position is open to all candidates.
 
8423027