Manager, AI Networking Performance Research and Analysis

Posted 5 days ago
Location: Yokne'am
Job Type: Full Time
Lead performance research and evaluation of advanced networking technologies supporting AI workloads, including LLM training and inference at supercomputing scale.
Define end-to-end performance test plans and methodology for next-generation networking hardware and networking technologies, including performance expectations and target KPIs.
Drive benchmarking, profiling, reporting, and deep performance characterization of networking workloads and offload features.
Collaborate closely with simulation, architecture, chip-design, firmware, and software teams to assess performance tradeoffs and identify bottlenecks.
Perform deep root cause analysis (RCA) for performance gaps and stability issues, and drive cross-team mitigation plans.
Develop and enhance performance analysis tools, automation frameworks, and scalable methodologies for cluster-level performance evaluation.
Own performance observability efforts, including telemetry pipelines, dashboards, and job-level performance analytics.
Requirements:
What we need to see:
B.Sc. in Computer Science or Software Engineering
5+ years of experience with high-performance networking technologies (RDMA, storage, security, OVS, MPI)
3+ years as an engineering team manager
Demonstrated performance analysis skills and methodologies.
Experience with cluster-level performance, telemetry, NICs, DPUs, switches, and GPUs.
Fast self-learner with strong analytical and problem-solving skills
Programming languages: Python, Bash, and C/C++
Experience with Linux distributions
Team player and leader with good communication and interpersonal skills
Ways to stand out from the crowd:
Deep system-level architecture knowledge (Intel/AMD/Arm CPUs, NVIDIA GPUs, HCA/DPU architecture, memory subsystems, PCIe, storage, NVLink).
Strong expertise in RDMA networking performance and AI communication stacks (e.g., NCCL).
Proven experience analysing AI workload communication patterns and benchmarking distributed LLM training workloads at scale.
Experience designing telemetry frameworks, monitoring pipelines, and performance dashboards for large clusters.
Familiarity with modern AI tooling, including performance-driven agents, automation pipelines, and RAG-based applications.
This position is open to all candidates.
 
8594141
Similar jobs that may interest you
Posted 5 days ago
Confidential company
Location: Yokne'am
Job Type: Full Time
In this role, you will help build and evolve systems that support performance analysis, telemetry, and optimization for large-scale GPU- and CPU-based clusters used in AI and high-performance computing environments. You will work closely with hardware, networking, firmware, and software teams to collect, analyze, and interpret performance data from live systems. This is a fast-paced R&D environment where system behavior and requirements evolve rapidly, requiring adaptable engineering solutions and strong analytical thinking.
What you'll be doing:
Profile, benchmark, and analyze AI and HPC workloads on GPU and CPU clusters
Explore performance characteristics of high-performance networking and collective communications (e.g., NCCL, RDMA, MPI, RoCE)
Identify performance bottlenecks across networking, compute, memory, and system architecture
Develop and enhance performance analysis, benchmarking, and diagnostic tools
Define performance test plans and establish expectations for new technologies and platforms
Collaborate across hardware, firmware, networking, systems, and software teams to provide actionable performance insights
Support telemetry collection and data refinement efforts to enable accurate performance analysis
Maintain high standards for data quality, reproducibility, and traceability of performance results
Requirements:
What we need to see:
B.Sc. or M.Sc. in Computer Science, Computer Engineering, Software Engineering, or equivalent experience
5+ years of experience in performance analysis, systems engineering, or HPC/AI infrastructure
Demonstrated expertise in performance analysis skills and methodologies
Hands-on experience with high-performance networking (RDMA, MPI, NCCL, congestion control)
Strong understanding of system performance metrics (latency, throughput, resource utilization)
Exposure to hardware, firmware, or embedded telemetry environments
Strong analytical, problem-solving, and communication skills
Ability to work effectively in cross-functional, fast-paced R&D teams
Ways to stand out from the crowd:
Knowledge of CUDA, NCCL internals, and congestion control algorithms
Deep system-level understanding of CPU architectures, GPUs, HCAs, memory, and PCIe
Experience with NVIDIA GPUs, CUDA, and deep learning frameworks such as PyTorch or TensorFlow
Experience with cloud platforms
Proficiency in Python; experience with Bash and C/C++ is a plus, as is strong experience working in Linux environments
This position is open to all candidates.
 
8594112
18/03/2026
Location: Yokne'am
Job Type: Full Time
We are looking for a Senior networking test engineer with strong system‑level debugging skills to join our End‑to‑End Verification team. You will work on cutting‑edge Ethernet‑based AI clusters, owning complex issues across hardware, system software and AI workloads. NVIDIA is widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative and autonomous, we want to hear from you!

What you'll be doing:

Design and review test and product requirements across the Ethernet / NIC / DPU / Switch portfolio, focusing on large‑scale AI cluster behavior.

Build and maintain realistic customer‑like testbeds, including heterogeneous hardware, OS / driver combinations and complex network fabrics.

Own end‑to‑end cluster troubleshooting: reproduce customer scenarios, triage across the stack and drive issues to root cause and fix.

Read and understand relevant source code to identify defects, validate fixes and improve logging and instrumentation.

Collaborate closely with development teams to debug NCCL, RoCE/RDMA and related networking components using logs, code inspection and targeted experiments.

Define tests and guide the automation team to implement robust suites that produce actionable logs, metrics and traces.

Run Regression, Performance, Functional and Scale testing, analyze results and provide clear, data‑driven reports to stakeholders.

Profile and benchmark deep learning training and inference workloads, correlating model‑level metrics with system and network telemetry to uncover bottlenecks.
Requirements:
What we need to see:

B.A./B.Sc. in Computer Science, Electrical Engineering, or equivalent IT/Network/Systems experience.

5+ years of hands‑on networking or system‑level testing and debugging on Linux.

Strong Linux networking and debugging skills (for example perf, tcpdump, ethtool, iproute2).

Proven production‑grade debugging experience: forming hypotheses, running experiments, and driving issues to root cause under pressure.

Expertise in host‑side NIC validation and tuning (offloads, queues, interrupts, firmware/driver interactions).

Strong knowledge of AI networking libraries (such as NCCL) and protocols (such as RoCE and RDMA), including performance and correctness debugging.

Ability to read and reason about source code (C/C++/Python or similar) and collaborate closely with developers on fixes.

Solid scripting and automation skills with Bash / Python / Ansible for setup, log collection, and experiment orchestration.

Fast learner, familiar with modern AI tools and workflows, able to adapt quickly.

Excellent analytical, problem‑solving and communication skills, with strong ownership and a collaborative mindset.

Ways to stand out from the crowd:

Hands‑on debugging of collective communication libraries (for example NCCL) or large‑scale LLM training / inference clusters.

Experience with large cluster environments (tens to thousands of GPUs or nodes), including incident response and post‑mortem analysis.

Deep expertise in tuning and debugging congestion control and lossless Ethernet for AI workloads (for example DCQCN, ECN, PFC).

Familiarity with NVIDIA networking technologies (for example BlueField / BF3, ConnectX NICs) and their software stack and diagnostics.

Experience debugging issues that span multiple layers (L2/L3, transport, AI frameworks) or contributing to open‑source networking / AI systems.
This position is open to all candidates.
 
8584095
22/03/2026
Location: Yokne'am
Job Type: Full Time
We are looking for a Senior networking test engineer with strong system‑level debugging skills to join our End‑to‑End Verification team. You will work on cutting‑edge Ethernet‑based AI clusters, owning complex issues across hardware, system software and AI workloads. We are widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative and autonomous, we want to hear from you!

What you'll be doing:

Design and review test and product requirements across the Ethernet / NIC / DPU / Switch portfolio, focusing on large‑scale AI cluster behavior.

Build and maintain realistic customer‑like testbeds, including heterogeneous hardware, OS / driver combinations and complex network fabrics.

Own end‑to‑end cluster troubleshooting: reproduce customer scenarios, triage across the stack and drive issues to root cause and fix.

Read and understand relevant source code to identify defects, validate fixes and improve logging and instrumentation.

Collaborate closely with development teams to debug NCCL, RoCE/RDMA and related networking components using logs, code inspection and targeted experiments.

Define tests and guide the automation team to implement robust suites that produce actionable logs, metrics and traces.

Run Regression, Performance, Functional and Scale testing, analyze results and provide clear, data‑driven reports to stakeholders.

Profile and benchmark deep learning training and inference workloads, correlating model‑level metrics with system and network telemetry to uncover bottlenecks.
Requirements:
What we need to see:

B.A./B.Sc. in Computer Science, Electrical Engineering, or equivalent IT/Network/Systems experience.

5+ years of hands‑on networking or system‑level testing and debugging on Linux.

Strong Linux networking and debugging skills (for example perf, tcpdump, ethtool, iproute2).

Proven production‑grade debugging experience: forming hypotheses, running experiments, and driving issues to root cause under pressure.

Expertise in host‑side NIC validation and tuning (offloads, queues, interrupts, firmware/driver interactions).

Strong knowledge of AI networking libraries (such as NCCL) and protocols (such as RoCE and RDMA), including performance and correctness debugging.

Ability to read and reason about source code (C/C++/Python or similar) and collaborate closely with developers on fixes.

Solid scripting and automation skills with Bash / Python / Ansible for setup, log collection, and experiment orchestration.

Fast learner, familiar with modern AI tools and workflows, able to adapt quickly.

Excellent analytical, problem‑solving and communication skills, with strong ownership and a collaborative mindset.

Ways to stand out from the crowd:

Hands‑on debugging of collective communication libraries (for example NCCL) or large‑scale LLM training / inference clusters.

Experience with large cluster environments (tens to thousands of GPUs or nodes), including incident response and post‑mortem analysis.

Deep expertise in tuning and debugging congestion control and lossless Ethernet for AI workloads (for example DCQCN, ECN, PFC).

Familiarity with our networking technologies (for example BlueField / BF3, ConnectX NICs) and their software stack and diagnostics.

Experience debugging issues that span multiple layers (L2/L3, transport, AI frameworks) or contributing to open‑source networking / AI systems.
This position is open to all candidates.
 
8586994
Posted 5 days ago
Location: Yokne'am
Job Type: Full Time
We are looking for a senior networking test engineer with strong system-level debugging skills to join our end-to-end verification team. You will work on cutting-edge Ethernet-based AI clusters, owning complex issues across hardware, system software and AI workloads. We are widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative and autonomous, we want to hear from you!
What you'll be doing:
Design and review test and product requirements across the Ethernet / NIC / DPU / switch portfolio, focusing on large-scale AI cluster behavior
Build and maintain realistic customer-like testbeds, including heterogeneous hardware, OS / driver combinations and complex network fabrics
Own end-to-end cluster troubleshooting: reproduce customer scenarios, triage across the stack and drive issues to root cause and fix
Read and understand relevant source code to identify defects, validate fixes and improve logging and instrumentation
Collaborate closely with development teams to debug NCCL, RoCE/RDMA and related networking components using logs, code inspection and targeted experiments
Define tests and guide the automation team to implement robust suites that produce actionable logs, metrics and traces
Run regression, performance, functional and scale testing, analyze results and provide clear, data-driven reports to stakeholders
Profile and benchmark deep learning training and inference workloads, correlating model-level metrics with system and network telemetry to uncover bottlenecks
Requirements:
What we need to see:
B.A./B.Sc. in Computer Science, Electrical Engineering, or equivalent IT/network/systems experience
5+ years of hands-on networking or system-level testing and debugging on Linux
Strong Linux networking and debugging skills (for example perf, tcpdump, ethtool, iproute2)
Proven production-grade debugging experience: forming hypotheses, running experiments, and driving issues to root cause under pressure
Expertise in host-side NIC validation and tuning (offloads, queues, interrupts, firmware/driver interactions)
Strong knowledge of AI networking libraries (such as NCCL) and protocols (such as RoCE and RDMA), including performance and correctness debugging
Ability to read and reason about source code (C/C++/Python or similar) and collaborate closely with developers on fixes
Solid scripting and automation skills with Bash / Python / Ansible for setup, log collection, and experiment orchestration
Fast learner, familiar with modern AI tools and workflows, able to adapt quickly
Excellent analytical, problem-solving and communication skills, with strong ownership and a collaborative mindset
This position is open to all candidates.
 
8594163
22/03/2026
Job Type: Full Time
We're looking for a Senior AI/MLOps Engineer to join a group that specializes in Security and Networking, and specifically ML, AI and agent development. As a Senior AI/MLOps Engineer, you'll build and maintain the infrastructure, tools and processes necessary to support the AI lifecycle in a production environment. You will collaborate closely with data scientists, software engineers, security architects and DevOps teams to ensure smooth deployment, modeling and optimization of AI models. This role involves creative problem solving alongside engineering teams, and is pivotal for the continued success of AI networking security.

What you'll be doing:

Developing, improving and optimizing scalable infrastructure for handling and deploying security and networking AI models and agents in production, ensuring high availability, scalability, reproducibility, and performance.

Optimizing AI models and agents for performance, scalability, and resource utilization, considering factors such as latency, efficiency, and cost.

Monitoring and deploying agentic systems, LLMs, and ML models in production.

Designing and implementing frameworks/pipelines for AI training, inference, and experimentation.

Collaborating closely with data scientists, security architects and software engineers to operationalize and deploy AI models and agents, including packaging and integration with existing systems. Participate in developing and reviewing code, design documents, use case reviews, and test plan reviews.

Collaborating with DevOps teams to integrate pipelines and workflows into the CI/CD process, ensuring flawless deployments and rollbacks.

Building and maintaining monitoring and alerting systems to proactively identify and resolve issues relating to quality, performance and infrastructure.

Implementing access controls, authentication mechanisms, and encryption standards for AI models and data.

Documenting guidelines and standard operating procedures for MLOps/AI processes and sharing knowledge with the wider team.

Developing proofs of concept for new features.
Requirements:
What we need to see:

BSc/MSc in CS/CE or related field (or equivalent experience).

Strong background in AI with experience deploying and monitoring AI/ML models, LLMs and agents to production systems at scale, including distributed and multi-node environments - at least 5 years of experience.

Proficiency in programming languages such as Python, Java, or Scala, along with experience in using ML/AI frameworks and libraries (e.g. TensorFlow, PyTorch).

Proficiency in microservices architecture, container orchestration, cloud platforms, and scalable infrastructure for training and inference workloads.

Knowledge of inference optimization techniques.

Understanding of build infrastructure and CI/CD tools and practices (e.g. GitLab, GitHub Actions, Jenkins).

You are detail-oriented and care deeply about robust, well tested, high-performance code in production environments.

You are proactive, take full ownership of your deliverables, have a can-do approach, and excellent communication and collaboration skills, able to work effectively in multifunctional teams.

Ways to stand out from the crowd:

Knowledge of network protocols and Linux internals.

Security and networking background, with knowledge of security protocols, network architectures, firewalls, intrusion detection systems, and other relevant security and networking concepts.

Experience deploying and optimizing generative models and agents.

Knowledge of network security principles and practices.
This position is open to all candidates.
 
8586605
Posted 6 days ago
Location: Yokne'am
Job Type: Full Time
We are looking for a senior HPC and AI cluster administrator to join the networking clusters solutions HPC/AI infrastructure team. We are building supercomputers and AI clusters based on groundbreaking technologies. We are looking for a system administrator to play a key role in working with the most exciting computing hardware and software and to contribute to the latest breakthroughs in artificial intelligence and GPU computing.
You will work with the latest accelerated computing and deep learning software and hardware platforms, and with many scientific researchers, developers, and customers to craft improved workflows and develop new, leading differentiated solutions. You will interact with HPC, OS, GPU compute, and systems specialists to architect, develop and bring up large-scale performance platforms. Does this sound like you? If so, we would love to hear from you!
What you will be doing:
Deploy, manage and maintain large-scale HPC/AI clusters
Manage Linux job/workload schedulers and orchestration tools
Support and maintain continuous integration and delivery pipelines
Troubleshoot and fix issues bottom-up, from bare metal through the operating system and software stack to the application level
Support research & development activities and engage in POCs/POVs for future improvements.
Requirements:
What we need to see:
Bachelor's degree in Computer Science, Engineering, or a related field; or equivalent experience
5+ years of experience
Knowledge of HPC and AI solution technologies from CPUs and GPUs to high-speed interconnects and supporting software
Experience with job scheduling workloads and orchestration tools such as Slurm, K8s
Excellent knowledge of Windows and Linux (Red Hat/CentOS and Ubuntu) networking (sockets, firewalls, iptables, Wireshark, etc.) and internals, ACLs, OS-level security protection, and common protocols (e.g., TCP, DHCP, DNS).
Experience with multiple storage solutions such as Lustre, GPFS, ZFS and XFS. Familiarity with newer and emerging storage technologies.
Python programming and Bash scripting experience; automation and configuration management tools such as Jenkins, Ansible, GitOps
Knowledge of networking protocols like InfiniBand, Ethernet
Experience with virtual systems (for example VMware, Hyper-V, KVM)
Familiarity with cloud computing platforms (e.g., AWS, Azure, Google Cloud)
Ways to stand out from the crowd:
Knowledge of CPU and/or GPU architecture
Knowledge of Kubernetes and container-related microservice technologies
Experience with GPU-focused hardware/software (DGX, CUDA)
Background with RDMA (InfiniBand or RoCE) fabrics
Our company has been redefining computer graphics, PC gaming, and accelerated computing for more than 25 years. We have a unique legacy of innovation that's fueled by great technology and amazing people. Today, we're tapping into the unlimited potential of AI to define the next era of computing, an era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what's never been done before takes vision, innovation, and the world's best talent. Our teams are composed of driven, innovative professionals dedicated to pushing the boundaries of technology. We offer highly competitive salaries, an extensive benefits package, and a work environment that promotes diversity, inclusion, and flexibility. As an equal opportunity employer, we are committed to fostering a supportive and empowering workplace for all.
#il-hybrid
This position is open to all candidates.
 
8593421
Posted 6 days ago
Location: Yokne'am
Job Type: Full Time
We are looking for a data center network deployment engineer to join the networking clusters solutions HPC/AI infrastructure team. We are building supercomputers and AI clusters based on groundbreaking technologies. We are looking for a network/system engineer to play a key role in working with the most exciting computing hardware and software and to contribute to the latest breakthroughs in artificial intelligence and GPU computing.
You will work with the latest accelerated computing and deep learning software and hardware platforms, and with many scientific researchers, developers, and customers to craft improved workflows and develop new, leading differentiated solutions. You will interact with HPC, OS, GPU compute, and systems specialists to architect, develop and bring up large-scale performance platforms. Does this sound like you? If so, we would love to hear from you!
What you'll be doing:
Deploy, manage and maintain large-scale AI data centers: control, network and storage stack
Work with multiple software and hardware teams to optimize cluster networking health and performance
Develop and implement automation scripts for network, compute and storage operations and deployments
Support research & development activities and engage in POCs/POVs for future improvements.
Requirements:
What we need to see:
B.Sc. in Engineering or CCNP certificate
3+ years of proficiency in networking fundamentals, configuring Ethernet switches, understanding the TCP/IP stack, and data center architecture.
Excellent knowledge of Windows and Linux (Red Hat/CentOS and Ubuntu) networking (sockets, firewalls, iptables, Wireshark, etc.) and internals, ACLs, OS-level security protection, and common protocols (e.g., TCP, DHCP, DNS).
Proactive individual with the ability to work independently, prioritizing tasks to optimize technology and enhance customer experience.
Provides ad hoc knowledge transfers, develops handover materials, and offers deployment support for engagements.
Ways to stand out from the crowd:
Combination of interpersonal skills and technical competence
Knowledge of HPC and AI solution technologies from CPUs and GPUs to high-speed interconnects and supporting software
Experience with multiple storage solutions such as Lustre, GPFS, and newer and emerging storage technologies.
Automation tooling background (Ansible, Salt, Puppet, etc.).
We are widely considered to be one of the technology world's most desirable employers! We have some of the most forward-thinking and hardworking individuals in the world working for us. If you're creative and autonomous, we want to hear from you!
#il-hybrid
This position is open to all candidates.
 
8593381
Posted 5 days ago
Location: Yokne'am
Job Type: Full Time
Seeking a highly skilled and modern software engineer to develop and prototype brand new advancements in distributed training and inference using NVIDIA's Spectrum-X AI fabric. This role offers a rare chance to pioneer AI and networking technology, contributing to ground-breaking projects that will define the landscape of large-scale AI systems. Improve the connection between AI applications and the network by refining communication, crafting congestion control, coding NIC firmware, and expanding switch SDK features for enhanced AI factory efficiency. Your work impacts large AI system development, scaling, and speed.
What you'll be doing:
Prototype end-to-end solutions to improve distributed training and disaggregated inference performance.
Analyze and optimize communication flows across application, transport, and network layers.
Develop system software spanning communication libraries, drivers, and firmware integrations.
Collaborate with hardware, firmware, and SDK teams to co-design network features.
Validate and integrate prototypes into NVIDIA's AI infrastructure and products.
Requirements:
What we need to see:
B.Sc./M.Sc./Ph.D. in Computer Science or Electrical Engineering
5+ years of relevant experience and/or knowledge
Deep understanding of networking and communication internals: NCCL, RDMA/RoCE, congestion control.
Hands-on experience with HW/SW/FW integration and low-level programming (C/C++, kernel, drivers).
Some background in distributed training systems (such as PyTorch DDP, Megatron-LM, DeepSpeed).
Ways to stand out from the crowd:
Demonstrated innovation and leadership turning prototypes into impactful product features.
Experience with programmable data planes (P4, eBPF, DOCA SDK, or switch SDKs).
Familiarity with NIC firmware scheduling, in-network compute, or congestion management.
Contributions to open-source projects, academic papers, or performance benchmarking tools.
Strong background in AI factory architectures, distributed inference, or network telemetry.
This position is open to all candidates.
 
8593751
02/03/2026
Location: Yokne'am
Job Type: Full Time
The Networking Advanced Development Software team develops new groundbreaking technologies to enable new market shares for the company and tighten customer relationships. These are emerging technologies in networking and distributed computing for the booming AI factories and data centers. They span areas such as AI neural networks, Deep Learning, High Performance Computing (HPC), Storage, Cloud, SW Defined Network, Network Function Virtualization and more. We develop the solutions top-down, all the way from application behavioral analysis, to architecture definition and down to the implementation, using our world-leading devices. The development traverses any needed component: application SW, middleware SW, OS kernel subsystems, device drivers, embedded SW (Firmware) and CUDA GPU. We collaborate with partners and key customers in the analysis processes and engage with open source communities introducing our leading features.

What you'll be doing:

Design and implement solutions throughout all layers from high level application, OS and driver subsystem to firmware.

Work on impactful projects involving state-of-the-art high-performance computing hardware and software.

Provide insight and technical guidance and collaborate with peers from across the company - including software architecture, chip architecture, and engineering departments to improve our future technology.

Collaborate with our partners and customers.
Requirements:
What we need to see:

B.Sc. in Computer Science, Electrical Engineering, Computer Engineering, or a related field.

5+ overall years of industry experience in system programming or related fields.

Understanding of multi core hardware, operating systems design, concurrency, virtual memory, caching, interrupts, device drivers, real-time

Excellent programming skills.

Ability to learn complex concepts in a fast-paced environment.

A teammate with a can-do attitude, high energy and excellent interpersonal skills.

Ways to stand out from a crowd:

Familiarity with networking protocols.

Hands-on experience with CUDA programming and GPU acceleration.

Hands-on experience with LLM serving frameworks.

Experience with open-source projects (coursework, personal, or contributions).

Working in a fast-paced and dynamic environment.
This position is open to all candidates.
 
8566056
Posted 5 days ago
Location: Yokne'am
Job Type: Full Time
We are now looking for a TensorRT-LLM software development engineer! Our company is hiring software engineers for its TensorRT-LLM team. Academic and commercial groups around the world are using GPUs to power a revolution in deep learning-powered AI, enabling breakthroughs in areas like LLM, ChatGPT and generative AI that have put DL at the iPhone moment for AI. Join the team which is building the inferencing software which is foundational to product lines within our company and across the industry! The ability to work on a fast-paced, delivery-focused team is required and excellent interpersonal skills are a must.
What you'll be doing:
Craft and develop robust inference software that can be scaled to multiple platforms for functionality and performance
Performance analysis, optimization, and tuning for large language models (LLMs)
Conduct unit tests and performance tests for different stages of the inference pipeline.
Closely follow academic developments in the field of artificial intelligence and update TensorRT-LLM with new features
Write safe, scalable, modular, and high-quality (C++/Python) code for our core backend software for LLM inference.
Collaborate across the company to guide the direction of deep learning inference, working with software, research and product teams
Requirements:
What we need to see:
Bachelor's, master's or higher degree in Computer Engineering, Computer Science, Applied Mathematics or related computing-focused degree (or equivalent experience).
5+ years of relevant software development experience.
Excellent Python programming, software design, and software engineering skills
Awareness of the latest developments in LLM architectures and LLM inference techniques
Experience working with deep learning frameworks like PyTorch and Hugging Face
Proactive and able to work without supervision
Excellent written and oral communication skills in English
Ways to stand out from the crowd:
Prior experience with an LLM inference framework (TensorRT-LLM, SGLang, vLLM, etc.) or a DL compiler in inference, deployment, algorithms, or implementation
Prior experience with performance modeling, profiling, debugging, and code optimization of a DL/HPC/high-performance application
Excellent C/C++ programming and software design skills, including debugging, performance analysis, and test design.
Architectural knowledge of CPU and GPU
GPU programming experience (CUDA or OpenCL)
This position is open to all candidates.
 
8594191