Jobs » Electrical and Electronics » Senior Software Engineer, Deep Learning Inference

Posted: 5 days ago
Location: Tel Aviv-Yafo
Job Type: Full Time
We seek a versatile senior software engineer who is passionate about performance optimization and generative AI. Our team builds software solutions that enable efficient inference on the latest generative AI models. We tackle problems at every level of the stack, from server-level request batching to GPU kernel fusion, and collaborate with teams across diverse disciplines to push NVIDIA's hardware to its full potential.
What you'll be doing:
Cooperate with research teams to onboard new LLMs and VLMs into NVIDIA's open-source AI runtimes
Optimize inference workloads using sophisticated profiling and simulation tools
Build solid, extensible inference software systems and refine robust APIs
Implement and debug low-level GPU code to harness the latest hardware features
Own end-to-end inference acceleration features and work with teams around the world to deliver production-grade products
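The server-level request batching mentioned above can be sketched in a few lines. This is a toy illustration only (the class and parameter names are invented, not any real serving API): requests accumulate in a queue and are flushed as one batch when either a size cap or a wait deadline is hit, which captures the basic latency-vs-utilization trade-off.

```python
import time
from collections import deque

class ToyBatcher:
    """Collects incoming requests and flushes them as one batch when
    either max_batch_size is reached or max_wait_s has elapsed."""
    def __init__(self, max_batch_size=4, max_wait_s=0.01):
        self.max_batch_size = max_batch_size
        self.max_wait_s = max_wait_s
        self.queue = deque()
        self.oldest_ts = None  # arrival time of the oldest queued request

    def submit(self, request):
        if not self.queue:
            self.oldest_ts = time.monotonic()
        self.queue.append(request)

    def maybe_flush(self):
        """Return a batch if a flush condition holds, else None."""
        if not self.queue:
            return None
        full = len(self.queue) >= self.max_batch_size
        timed_out = time.monotonic() - self.oldest_ts >= self.max_wait_s
        if full or timed_out:
            n = min(self.max_batch_size, len(self.queue))
            return [self.queue.popleft() for _ in range(n)]
        return None

b = ToyBatcher(max_batch_size=2, max_wait_s=10.0)
b.submit("req-a")
b.submit("req-b")
b.submit("req-c")
print(b.maybe_flush())  # ['req-a', 'req-b']  (size-triggered flush)
print(b.maybe_flush())  # None  (one request queued, deadline not reached)
```

Real inference servers layer continuous (in-flight) batching, priorities, and per-model scheduling on top of this basic pattern.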
Requirements:
What we need to see:
B.Sc., M.Sc., or equivalent experience in computer science or computer engineering
5+ years of relevant hands-on software engineering experience
Profound knowledge of software design principles
Strong proficiency in at least one systems language and one scripting language
Strong grasp of machine learning concepts
A people person with excellent communication skills who enjoys collaboration and teamwork
Ways to stand out from the crowd:
Familiarity with NVIDIA's DL software stack, e.g. Triton Inference Server, TensorRT-LLM, and Model Optimizer
Proven track record of performance modeling, profiling, debugging, and development in a performance-critical setting with NVIDIA's accelerators
Familiarity with LLM quantization, fine-tuning, and caching algorithms
Proficiency in GPU kernel programming (CUDA or OpenCL)
Prior experience working on a large software project with 50+ contributors
This position is open to all candidates.
 
Job ID: 8593825

Posted: 6 days ago
Location: Tel Aviv-Yafo
Job Type: Full Time
Our company is seeking a sharp, innovative, and hands-on architect to help shape the future of LLM inference at scale. Join our dynamic end-to-end architecture group, where we build cutting-edge systems powering the next generation of generative AI workloads. In this role, you will work across software and hardware domains to design and optimize inference infrastructure for large language models running on some of the most advanced GPU clusters in the world.
You'll help define how AI models are deployed and scaled in production, driving decisions on everything from memory orchestration and compute scheduling to inter-node communication and system-level optimizations. This is an opportunity to work with top engineers, researchers, and partners across our company and leave a mark on the way generative AI reaches real-world applications.
 
What you'll be doing:
Design and evolve scalable architectures for multi-node LLM inference across GPU clusters
Develop infrastructure to optimize latency, throughput, and cost-efficiency of serving large models in production
Collaborate with model, systems, compiler, and networking teams to ensure holistic, high-performance solutions
Prototype novel approaches to KV cache handling, tensor/pipeline-parallel execution, and dynamic batching
Evaluate and integrate new software and hardware technologies relevant to model inference (e.g., memory hierarchy, network topology, modern inference architectures)
Work closely with internal teams and external partners to translate high-level architecture into reliable, high-performance systems
Author design documents, internal specs, and technical blog posts, and contribute to open-source efforts when appropriate
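The KV cache handling named in the list above follows a simple append-on-decode pattern that a minimal sketch can illustrate. All names here are invented for illustration; production runtimes keep preallocated, paged GPU buffers rather than Python lists.

```python
class ToyKVCache:
    """Per-sequence KV cache sketch: one (keys, values) list pair per
    transformer layer. Illustrative only, not a real runtime API."""
    def __init__(self, num_layers):
        self.keys = [[] for _ in range(num_layers)]
        self.values = [[] for _ in range(num_layers)]

    def append(self, layer, k, v):
        self.keys[layer].append(k)
        self.values[layer].append(v)

    def seq_len(self):
        return len(self.keys[0])

cache = ToyKVCache(num_layers=2)
# "Prefill": process the 3-token prompt once, caching K/V for every layer.
for k, v in [(0.1, 1.0), (0.2, 2.0), (0.3, 3.0)]:
    for layer in range(2):
        cache.append(layer, k, v)
# "Decode": each generated token appends exactly one K/V entry per layer,
# so attention over earlier tokens reuses cached values instead of
# recomputing them.
for layer in range(2):
    cache.append(layer, 0.4, 4.0)
print(cache.seq_len())  # 4
```

The reason this cache dominates serving-memory budgets is visible even in the sketch: it grows linearly with sequence length times layer count, which is what paged allocation and cache-aware batching are designed to manage.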
Requirements:
What we need to see:
Bachelor's, master's, or PhD in computer science, electrical engineering, or equivalent experience
5+ years of experience building large-scale distributed systems or performance-critical software
Deep understanding of deep learning systems, GPU acceleration, and AI model execution flows
Solid software engineering skills in C++ and/or Python, with strong familiarity with CUDA or similar platforms
Strong system-level thinking across memory, networking, scheduling, and compute orchestration
Excellent communication skills and the ability to collaborate across diverse technical domains
Ways to stand out from the crowd:
Experience working on LLM inference pipelines, transformer model optimization, or model-parallel deployments
Demonstrated success in profiling and optimizing performance bottlenecks across the LLM training or inference stack
Familiarity with data-center-scale orchestration, cluster schedulers, or AI service deployment pipelines
Passion for solving tough technical problems and shipping high-impact solutions
This position is open to all candidates.
 
Job ID: 8593692

Posted: 5 days ago
Location: Tel Aviv-Yafo
Job Type: Full Time
Our company has been redefining computer graphics, PC gaming, and accelerated computing for more than 25 years. It's a unique legacy of innovation fueled by great technology and amazing people. Today, we're tapping into the unlimited potential of AI to define the next era of computing, an era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what's never been done before takes vision, innovation, and the world's best talent. Being an employee means being part of a diverse and encouraging setting that enables everyone to perform at their peak. Come join the team and discover how you can have a lasting influence on the world.
Our company is in search of a senior software architect: a creative, forward-thinking, and practical researcher to improve the framework for large-scale LLM training and inference. As part of our dynamic end-to-end architecture group, you will design and optimize systems driving generative AI workloads, working at the intersection of software and hardware on some of the most advanced GPU clusters worldwide. You will define how AI models are deployed and scaled in production using the Spectrum-X networking platform, influencing decisions from inter-node communication and compute scheduling to system-level optimization. This is an opportunity to collaborate with best-in-class engineers and researchers and shape the future of generative AI in real-world applications. Your work will make a lasting impact by enabling generative AI technologies to reach real-world applications and improve global computing capabilities.
What you'll be doing:
Lead research and development of end-to-end networking solutions for distributed AI training and inference at scale, with a focus on job completion time, failure resiliency, telemetry, scheduling, and placement
Analyze current deployments, develop prototypes, and recommend architectural improvements
Stay abreast of the latest research; become the team's authority on emerging networking techniques and technologies
Design, simulate, and validate new systems using a novel, scalable network simulator, NSX
Develop and test prototypes on large-scale GPU clusters (e.g., Israel-1)
Collaborate across hardware, firmware, and software teams to translate ideas into real networking product features
Publish patents and present research at leading conferences
Requirements:
What we need to see:
M.Sc. or PhD (preferred) in computer science, electrical/computer engineering, or a related field, or B.Sc. with research experience and publications
5+ years of relevant experience
Deep expertise in networking and communication internals (NCCL, RDMA, congestion control, routing)
Strong software engineering skills in C++ and/or Python
Excellent system-level design and problem-solving abilities
Outstanding communication and collaboration skills across technical domains
Ways to stand out from the crowd:
Proven passion for solving sophisticated technical problems and delivering impactful solutions
Record of publications in top-tier conferences
Experience designing and building large-scale AI training clusters
Post-PhD research experience
Practical understanding of deep learning systems, GPU acceleration, and AI model execution flows
This position is open to all candidates.
 
Job ID: 8593805

Posted: 5 days ago
Location: Tel Aviv-Yafo
Job Type: Full Time
Looking for a strong technical senior architect to join us in shaping the future. Senior architects are innovators who can translate business needs into workable technology solutions. Their expertise is deep and broad, and they are hands-on, producing both detailed technical work and high-level architectural designs.
As a senior architect on the AI networking research team, you will explore technological challenges in accelerated networking and building AI data centers: researching new transport functions and semantics for optimizing AI workloads, AI systems communication and acceleration, and much more. You will also lead architectural and development efforts across numerous technological fields related to the modern AI data center, such as distributed AI and deep learning solutions, data analytics, high-performance computing (HPC), software-defined networking (SDN), virtualization, storage, and more.
What you'll be doing:
Co-design hardware features (e.g., in GPUs, DPUs, or interconnects) that accelerate data movement and enable new capabilities for inference and model serving
Identify and evaluate new technologies, innovations, and partner relationships for alignment with our technology roadmap and business value
Lead the architecture and design of new technologies and innovations such as runtime systems, communication libraries, and AI-specific technologies
Lead proof-of-concept development to evaluate and drive such technologies
Requirements:
What we need to see:
M.Sc. or PhD in computer science, electrical or computer engineering from a leading university (or equivalent experience)
5+ years of industry experience (or equivalent) in system architecture, AI systems architecture, scaling of AI, parallelism of AI frameworks, or deep learning training workloads
Experience in algorithm design, system programming, computer architecture, and operating systems
Experience in virtualization, networking, and storage
Deep understanding of performance profiling and optimization techniques, together with defining and using hardware features
Strong programming and software development skills
Ability and flexibility to work and communicate effectively in a multi-national, multi-time-zone corporate environment
Ways to stand out from the crowd:
Demonstrated research track record
Experience with and passion for system architecture: CPU/GPU/memory/storage/networking
Stellar communication skills
Knowledge of deep learning frameworks and AI communication libraries (NCCL, UCX, MPI, and equivalents)
Deep understanding of inference and training workloads and optimizations, such as prefill/decode, data parallelism, tensor parallelism, FSDP, and others
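As a rough illustration of the tensor parallelism named above, here is a toy column-parallel matrix multiply in pure Python (all function names are ours, invented for this sketch): each simulated device holds a slice of the weight matrix's columns and computes a partial output, and concatenating the partials reproduces the full result.

```python
def matmul(a, b):
    """Plain row-major matrix multiply on nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def column_parallel_matmul(x, w, num_shards):
    """Column-parallel linear layer: each 'device' owns a slice of W's
    columns, computes x @ W_shard locally, and the partial outputs are
    concatenated (an all-gather in a real multi-GPU deployment)."""
    cols = len(w[0])
    per = cols // num_shards  # columns per shard (assumes even split)
    shards = [[row[s * per:(s + 1) * per] for row in w]
              for s in range(num_shards)]
    partials = [matmul(x, shard) for shard in shards]  # one per device
    # Concatenate each row's partial outputs back into the full row.
    return [sum((p[i] for p in partials), []) for i in range(len(x))]

x = [[1.0, 2.0]]
w = [[1.0, 0.0, 2.0, 0.0],
     [0.0, 1.0, 0.0, 2.0]]
assert column_parallel_matmul(x, w, num_shards=2) == matmul(x, w)
print(column_parallel_matmul(x, w, num_shards=2))  # [[1.0, 2.0, 2.0, 4.0]]
```

Row-parallel layers work the same way with the roles of rows and columns swapped, requiring an all-reduce instead of an all-gather; pipeline parallelism splits by layer rather than within a layer.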
This position is open to all candidates.
 
Job ID: 8593803

22/02/2026
Location: Tel Aviv-Yafo
Job Type: Full Time
We're growing fast, and our team is passionate about pushing AI engineering to new heights, solving complex problems in LLM training, inference optimization, reasoning, and agent orchestration at scale.
About the Role:
As a Machine Learning Engineer, you'll work on cutting-edge code-focused LLMs and AI agent systems that power a next-generation developer platform. You'll be at the center of research, model training, and productionization of intelligent systems that understand software deeply, collaborate with developers, and help automate engineering workflows end-to-end. Your work will immediately impact millions of engineers worldwide.
Responsibilities:
Push LLM Innovation: Research, design, and fine-tune domain-specific LLMs for code generation, refactoring, debugging, and multi-turn reasoning.
Agent-Oriented Development: Build multi-agent coding systems that integrate retrieval-augmented generation (RAG), code execution, testing, and tool use to create autonomous, context-aware coding workflows.
Production-Grade AI: Own the training-to-inference pipeline for large code models; optimize inference with quantization, distillation, and caching techniques.
Rapid Experimentation: Prototype and validate ideas quickly; leverage reinforcement learning, human feedback, and synthetic data generation to push accuracy and reasoning.
Cross-Functional Collaboration: Partner with product, engineering, and design teams to ship AI-powered features that help developers focus on high-impact work.
Scale the Platform: Contribute to distributed training, scalable serving systems, and GPU/TPU-efficient architectures for ultra-low-latency developer tools.
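The quantization mentioned under "Production-Grade AI" can be sketched as symmetric per-tensor int8 quantization. This is a toy version with invented function names; real deployments typically use per-channel scales and calibration data.

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: scale = max|w| / 127.
    Returns (int8 values, scale)."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # avoid scale == 0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values."""
    return [v * scale for v in q]

w = [0.5, -1.27, 0.0, 1.0]
q, s = quantize_int8(w)
print(q)  # [50, -127, 0, 100]
w_hat = dequantize(q, s)
# Reconstruction error is bounded by half the scale per element.
print(max(abs(a - b) for a, b in zip(w, w_hat)))
```

The inference win comes from storing and moving 1 byte per weight instead of 2 or 4, which matters because LLM decoding is usually memory-bandwidth bound.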
Requirements:
2+ years of hands-on experience designing, training, and deploying machine-learning models
M.Sc. or higher in Computer Science / Mathematics / Statistics or equivalent from a university, or B.Sc. with strong hands-on ML experience
Practical experience with Natural Language Processing (NLP) and LLMs
Experience with data acquisition, data cleaning, and data pipelines
A passion for building products and helping people, both customers and colleagues
All-around team player, fast, self-learning individual
Nice to have:
3+ years of development experience with a passion for excellence
Experience building AI coding assistants, code reasoning models, or dev-focused LLM agents.
Familiarity with RAG, function-calling, and tool-using LLMs.
Knowledge of model optimizations (quantization, distillation, LoRA, pruning).
Startup or product-driven ML experience, especially in high-scale, latency-sensitive environments.
Contributions to open-source AI or developer tools.
This position is open to all candidates.
 
Job ID: 8556109

22/03/2026
Location: Tel Aviv-Yafo
Job Type: Full Time
Help build an Always-On, low-overhead GPU profiling service that runs in production, scales across cluster environments, and delivers actionable insights for ML workloads. You will be hands-on delivering our profiling solutions across system software, drivers, and CUDA to make profiling continuously available and reliable.

What you'll be doing:

Develop low-overhead, high-reliability implementations in C/C++, with bounded CPU/memory budgets.

Lead end-to-end feature delivery spanning user-mode components, driver/platform layers, and performance counter/trace providers.

Establish profiling models that integrate with existing ML/AI workflows (e.g., PyTorch/XLA) to turn low-level signals into actionable insights.
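The "bounded CPU/memory budgets" requirement above is often met with fixed-capacity trace buffers. Here is a toy sketch (class and method names are invented, not a real profiler API): memory stays constant no matter how long the service runs, at the cost of overwriting the oldest events.

```python
class BoundedTraceBuffer:
    """Fixed-capacity ring buffer for profiling events. Memory use is
    bounded for an always-on service; the oldest events are overwritten
    once the buffer wraps."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.buf = [None] * capacity
        self.count = 0  # total events ever recorded

    def record(self, event):
        self.buf[self.count % self.capacity] = event
        self.count += 1

    def snapshot(self):
        """Surviving events in arrival order, oldest first."""
        if self.count <= self.capacity:
            return list(self.buf[:self.count])
        start = self.count % self.capacity  # index of the oldest survivor
        return self.buf[start:] + self.buf[:start]

tb = BoundedTraceBuffer(capacity=3)
for i in range(5):
    tb.record(f"kernel_launch_{i}")
print(tb.snapshot())  # ['kernel_launch_2', 'kernel_launch_3', 'kernel_launch_4']
```

Production profilers pair structures like this with sampling and flush thresholds so that both memory and CPU overhead stay within the stated budget.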
Requirements:
What we need to see:

BS or MS degree or equivalent experience in Computer Engineering, Computer Science, or related degree.

5+ years of system-level C/C++ development, including concurrency, memory management, and performance engineering.

Familiarity with system software design, operating systems fundamentals, computer architectures, performance analysis, and delivering production-quality software.

Strong interpersonal, verbal, and written communication; able to influence across organizations and build trust with external collaborators.

Ways to stand out from the crowd:

Extensive experience with profiling/tracing stacks for CPU/GPU (e.g., CUPTI, Nsight, performance counters, event correlation) and debugging highly concurrent systems.

Deep hands-on knowledge of CUDA and GPU architecture, including runtime/driver APIs, CUDA streams/graphs, and kernel behavior.

Track record building continuous, always-on, or multi-client profiling systems designed for predictable overhead at scale.

Hands-on experience tuning ML training/inference loops based on deep profiling analysis, with familiarity in ML ecosystems (e.g., PyTorch, JAX) and correlating application events with GPU metrics to translate data into actionable performance insights (e.g., bottleneck triage, compute vs. memory bound).

Experience with user-mode driver development and integration within platform security and permissions models.
This position is open to all candidates.
 
Job ID: 8586600

Posted: 6 days ago
Location: Tel Aviv-Yafo
Job Type: Full Time
Our company has evolved AI infrastructure by merging GPU virtualization with Kubernetes-native capabilities. Our world-class AI platform allows organizations to improve productivity and efficiency for data scientists and machine learning engineers. With deep Kubernetes expertise and a focus on innovation, we are dedicated to developing cutting-edge technologies, delivering the best user experience for our customers, and providing deep visibility into workload performance through rich metrics that help users optimize their AI workloads. We are looking for highly skilled software engineers to join our platform group and help shape the future of AI infrastructure.
The role of a senior software engineer in the platform group is to design and develop scalable, high-performance systems that support the next generation of AI workloads. You will collaborate with experts across domains, tackle complex challenges, and drive innovations that empower our users to push the limits of AI capabilities.
What you'll be doing:
Designing and developing enterprise-grade systems with a strong focus on scalability, reliability, and performance
Building and optimizing microservices-based architectures using Kubernetes and cloud-native technologies
Collaborating closely with backend engineers, product managers, and other collaborators to deliver impactful solutions
Writing clean, maintainable, and testable code in Go
Conducting code and design reviews to uphold high-quality standards and mentor team members
Requirements:
What we need to see:
B.Sc. in computer science or a related field
5+ years of proven experience in backend software development, including system design and architecture
Proficiency in at least one backend programming language (we write in Go)
Strong understanding of microservices architecture, RESTful APIs, and relational databases
Deep familiarity with Kubernetes and the cloud-native ecosystem
Demonstrated ability to tackle complex technical challenges and deliver high-quality solutions
Ways to stand out from the crowd:
Expertise in Kubernetes internals and advanced cloud-native technologies
Hands-on experience with HPC or AI/ML platforms
Familiarity with AI inference workloads and performance optimization
Proficiency in Linux, with knowledge of networking, security, storage, and virtualization
This position is open to all candidates.
 
Job ID: 8593557

22/03/2026
Job Type: Full Time
We're looking for a Senior AI/MLOps Engineer to join a group that specializes in security and networking, and specifically in ML, AI, and agent development. As a Senior AI/MLOps Engineer, you'll build and maintain the infrastructure, tools, and processes necessary to support the AI lifecycle in a production environment. You will collaborate closely with data scientists, software engineers, security architects, and DevOps teams to ensure smooth deployment, modeling, and optimization of AI models. This role involves creative problem solving alongside engineering teams, and is pivotal for the continued success of AI networking security.

What youll be doing:

Developing, improving and optimizing scalable infrastructure for handling and deploying security and networking AI models and agents in production, ensuring high availability, scalability, reproducibility, and performance.

Optimizing AI models and agents for performance, scalability, and resource utilization, considering factors such as latency, efficiency, and cost.

Monitoring and deploying agentic systems, LLMs, and ML models in production.

Designing and implementing frameworks/pipelines for AI training, inference, and experimentation.

Collaborating closely with data scientists, security architects, and software engineers to operationalize and deploy AI models and agents, including packaging and integration with existing systems, and participating in developing and reviewing code, design documents, use case reviews, and test plan reviews.

Collaborating with DevOps teams to integrate pipelines and workflows into the CI/CD process, ensuring flawless deployments and rollbacks.

Building and maintaining monitoring and alerting systems to proactively identify and resolve issues relating to quality, performance and infrastructure.

Implementing access controls, authentication mechanisms, and encryption standards for AI models and data.

Documenting guidelines and standard operating procedures for MLOps/AI processes and sharing knowledge with the wider team.

Developing proofs of concept for new features.
Requirements:
What we need to see:

BSc/MSc in CS/CE or related field (or equivalent experience).

At least 5 years of experience and a strong background in AI, including deploying and monitoring AI/ML models, LLMs, and agents in production systems at scale, including distributed and multi-node environments.

Proficiency in programming languages such as Python, Java, or Scala, along with experience in using ML/AI frameworks and libraries (e.g. TensorFlow, PyTorch).

Proficiency in microservices architecture, container orchestration, cloud platforms, and scalable infrastructure for training and inference workloads.

Knowledge of inference optimization techniques.

Understanding of build infrastructure and CI/CD tools and practices (e.g. GitLab, GitHub Actions, Jenkins).

You are detail-oriented and care deeply about robust, well tested, high-performance code in production environments.

You are proactive, take full ownership of your deliverables, have a can-do approach, and excellent communication and collaboration skills, able to work effectively in multifunctional teams.

Ways to stand out from the crowd:

Knowledge of network protocols and Linux internals.

Security and networking background, with knowledge of security protocols, network architectures, firewalls, intrusion detection systems, and other relevant security and networking concepts.

Experience deploying and optimizing generative models and agents.

Knowledge of network security principles and practices.
This position is open to all candidates.
 
Job ID: 8586605
Posted: 5 days ago
Location: Tel Aviv-Yafo
Job Type: Full Time
Our company is leading the way in groundbreaking developments in artificial intelligence, high-performance computing, and visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables amazing creativity and discovery, and powers what were once science-fiction inventions, from artificial intelligence to autonomous cars.
Come work for the team that brought you NCCL, NVSHMEM, and GPUDirect. Our GPU communication libraries are crucial for scaling deep learning and HPC applications! We are looking for a motivated partner enablement engineer to guide our key partners and customers with NCCL. Most DL/HPC applications run on large clusters with high-speed networking (InfiniBand, RoCE, Ethernet). This is an outstanding opportunity to gain an end-to-end understanding of the AI networking stack. Are you ready to contribute to the development of innovative technologies and help realize our vision?
What you will be doing:
Engage with our partners and customers to root-cause functional and performance issues reported with NCCL
Conduct performance characterization and analysis of NCCL and DL applications on groundbreaking GPU clusters
Develop tools and automation to isolate issues on new systems and platforms, including cloud platforms (Azure, AWS, GCP, etc.)
Guide our customers and support teams on HPC knowledge and standard methodologies for running applications on multi-node clusters
Document and conduct trainings/webinars for NCCL
Engage with internal teams in different time zones on networking, GPUs, storage, infrastructure, and support
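Since the posting centers on NCCL, a toy simulation of the ring all-reduce collective (the algorithm NCCL popularized for large messages: a reduce-scatter pass followed by an all-gather pass) may help make the workload concrete. This is an illustrative sketch with invented names, not NCCL's implementation; "sends" are staged per step to mimic simultaneous exchange between ranks.

```python
def ring_allreduce(rank_buffers):
    """Simulate a ring all-reduce: every rank ends with the element-wise
    sum of all ranks' buffers. Each inner list is one rank's local data;
    buffer length must divide evenly by the rank count."""
    n = len(rank_buffers)
    chunk = len(rank_buffers[0]) // n
    bufs = [list(b) for b in rank_buffers]

    # Pass 1 (reduce-scatter): at step s, rank r forwards chunk (r - s) % n
    # to rank r + 1, which accumulates it. After n - 1 steps, rank r holds
    # the fully reduced chunk (r + 1) % n.
    for step in range(n - 1):
        sends = []
        for r in range(n):
            c = (r - step) % n
            sends.append(((r + 1) % n, c, bufs[r][c * chunk:(c + 1) * chunk]))
        for dst, c, data in sends:
            for i, v in enumerate(data):
                bufs[dst][c * chunk + i] += v

    # Pass 2 (all-gather): each rank circulates its completed chunk around
    # the ring, overwriting stale partial sums.
    for step in range(n - 1):
        sends = []
        for r in range(n):
            c = (r + 1 - step) % n
            sends.append(((r + 1) % n, c, bufs[r][c * chunk:(c + 1) * chunk]))
        for dst, c, data in sends:
            bufs[dst][c * chunk:(c + 1) * chunk] = data
    return bufs

print(ring_allreduce([[1, 2], [10, 20]]))  # [[11, 22], [11, 22]]
```

The appeal of the ring is that each rank sends and receives only about 2x the buffer size in total regardless of rank count, which is why bandwidth, topology, and congestion control dominate NCCL performance debugging.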
Requirements:
What we need to see:
B.Sc./M.Sc. degree in CS/CE or equivalent experience, with 5+ years of relevant experience, including parallel programming and at least one communication runtime (MPI, NCCL, UCX, NVSHMEM)
Excellent C/C++ programming skills, including debugging, profiling, code optimization, performance analysis, and test design
Experience working with the engineering or academic research community supporting HPC or AI
Practical experience with high-performance networking: InfiniBand/RoCE/Ethernet networks, RDMA, topologies, congestion control
Expert in Linux fundamentals and a scripting language, preferably Python
Familiarity with containers, cloud provisioning, and scheduling tools (Docker, Docker Swarm, Kubernetes, Slurm, Ansible)
Adaptability and passion for learning new areas and tools
Flexibility to work and communicate effectively across different teams and time zones
Ways to stand out from the crowd:
Experience conducting performance benchmarking and developing infrastructure on HPC clusters; prior system administration experience, especially for large clusters; experience debugging network configuration issues in large-scale deployments
Familiarity with CUDA programming and/or GPUs; a good understanding of machine learning concepts and experience with deep learning frameworks such as PyTorch and TensorFlow
Deep understanding of technology and passion for what you do
This position is open to all candidates.
 
Job ID: 8593743
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We are seeking a skilled software engineer to join our NPU software stack development team. This role involves developing high-performance GPU programming frameworks, runtime systems, and libraries for AI/ML workloads. You will be responsible for implementing, optimizing, and maintaining GPU software stack components to support distributed AI training and inference.
Key Responsibilities:
Identify and analyze bottlenecks, and optimize the distributed NPU ecosystem
Design and develop the NPU memory management system
Design and develop an optimized NPU development framework, execution path, and debugging support
Develop compatibility with AI frameworks (Triton, PyTorch, JAX)
Write high-quality, well-tested code with comprehensive documentation
Collaborate with other teams (hardware, network, QA, AI framework integration)
Participate in code reviews and technical design discussions.
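The device memory management responsibility above can be illustrated with a toy first-fit free-list allocator. Class and method names are invented for this sketch; real NPU/GPU runtimes add alignment, size classes, streams, and fragmentation-aware policies.

```python
class ToyDevicePool:
    """Minimal free-list allocator for a device memory pool: first-fit
    allocation plus coalescing of adjacent free blocks on free."""
    def __init__(self, total_bytes):
        self.free = [(0, total_bytes)]  # sorted (offset, size) free blocks

    def alloc(self, size):
        for i, (off, blk) in enumerate(self.free):
            if blk >= size:  # first-fit: take the first block that fits
                if blk == size:
                    self.free.pop(i)
                else:
                    self.free[i] = (off + size, blk - size)
                return off
        raise MemoryError("pool exhausted")

    def free_block(self, off, size):
        self.free.append((off, size))
        self.free.sort()
        # Coalesce adjacent free blocks to fight fragmentation.
        merged = [self.free[0]]
        for o, s in self.free[1:]:
            lo, ls = merged[-1]
            if lo + ls == o:
                merged[-1] = (lo, ls + s)
            else:
                merged.append((o, s))
        self.free = merged

pool = ToyDevicePool(1024)
a = pool.alloc(256)   # offset 0
b = pool.alloc(256)   # offset 256
pool.free_block(a, 256)
print(pool.alloc(512))  # 512  (first-fit skips the freed 256-byte hole)
```

Even this sketch shows why memory tiering and pooling matter for distributed inference: naive alloc/free patterns fragment the pool, and coalescing plus placement policy determines how large a contiguous tensor can still be served.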
Requirements:
5+ years of experience in distributed system programming
3+ years of experience with NPU programming (Triton, CUDA, HIP, OpenCL)
Expert-level C/C++ programming with focus on performance optimization
Expert-level Python programming with focus on DL/ML frameworks (PyTorch/JAX/etc)
Deep understanding of NPU architecture, memory tiering, and programming models
Knowledge of NPU runtime systems
Experience with performance profiling and optimization tools
Strong problem-solving and debugging skills
Experience with version control systems, Ticking system and collaborative development
Team player with excellent communication skills
Fast learner, highly organized, detail-oriented with high motivation
Preferred Qualifications
Experience with NPU software stack development
Experience with large-scale NPU systems (100+ NPUs)
Experience with DL/ML workloads (oriented AI) and distributed training / inferencing
Familiarity with containerization and orchestration.
This position is open to all candidates.
 
Job ID: 8595732