QA Engineer, DPU Firmware

7 hours ago
Confidential company
Location: Yokne'am
Job Type: Full Time
We are looking for a highly motivated and experienced Software QA Engineer to join our DPU Firmware QA Team. Quality is an integral aspect of our products, and as a Firmware QA Engineer you will have a significant impact in helping us deliver our products to customers in a fast-paced and constantly evolving environment. You will bring your upbeat personality, excellent communication skills, attention to detail, and real passion for a positive user experience. The candidate will take part in a wide range of software testing, systems testing, networking, and automation in a Linux environment.

What you'll be doing:

Design, develop, and execute tests (manual and automated).

Learn new networking standards, features and protocols and improve QA coverage.

Participate in reviews and provide feedback on product feature requirements, specifications, and technical design documents.

Collaborate with various teams, including project management, hardware, and software developers, and provide technical analysis of bugs.

Execute tests in different scopes: regression, performance, functional, security; report the progress of testing and provide summary reports of testing activity.

Develop and enhance tools, applications, or processes to improve quality and test efficiency.

Define and build setup topologies for appropriate product coverage.
Requirements:
What we need to see:

B.A./B.Sc. in Computer Science or Software Engineering, or Practical Software Engineer.

Networking knowledge and system experience.

Programming skills.

High level of English.

Fast learner with outstanding technical skills.

Ability to self-manage and show leadership; good analytical skills.

Ways to stand out from the crowd:

Experience as a Software QA Engineer.

Proven Python programming skills.

Familiarity with network protocols.

Experience working with BMC and Host Management Network interfaces and standards.

Experience with Crypto and Security Networking standards.
This position is open to all candidates.
 
8586981
Similar jobs that may interest you
10/02/2026
Location: Yokne'am
Job Type: Full Time
We are looking for an experienced engineer to join our SW/FW testing team. This position will be part of the QA group, which enables our products to meet industry-leading benchmarks of efficiency and quality. The ideal candidate will engage in testing of industry-leading features, be able to learn new features, execute manual and automated tests, and participate in automation development.


What you'll be doing:

Review architecture design and requirements documents for new features introduced in different domains.

Design, develop and implement tests for the new features, as part of SW/FW update releases.

Automate newly added tests in the existing automation framework, add new capabilities and features to the framework.

Report bugs found during execution, assist with reproduction and debugging to understand the root cause, verify bug fixes provided by the R&D team, and escalate if not fixed.

Perform tests in different scopes: regression, performance, functional, security; report progress of testing and provide summary reports of testing activity.
Requirements:
What we need to see:

Practical Engineer / B.Sc. in Computer Science or Electrical Engineering.

2+ years of proven experience in QA.

Scripting skills and proven experience in Python.

Clear verbal and written communication, proficient written and spoken English.

Independent worker, able to plan and shine in area of responsibility.

Standout colleague with good communication skills and a desire to lead.


Ways to stand out from the crowd:

Knowledge / experience in Networking.

Familiarity with ITU-T standards and synchronization protocols, such as SyncE and PTP or equivalent experience.

Proven experience in QA methodologies and test design.
This position is open to all candidates.
 
8540052
10/02/2026
Location: Yokne'am
Job Type: Full Time
We are looking for a Software QA Engineer with a strong background in Networking and Automation to join our InfiniBand (IB) and NVLINK (NVL) Switch QA team. Our team is responsible for qualifying the software stack for our IB Switch, Router, Gateway and NVLINK systems, delivering world-class networking solutions. You will work at the heart of cutting-edge technology, validating software management features, designing topologies, developing automated test suites, and collaborating with engineering and product teams to ensure delivery of robust and scalable systems.

What you'll be doing:
Design, develop, and execute manual and automated tests as part of software stack releases.
Define, build, and manage testbed topologies for functional, regression, and performance validation.
Analyze architectural designs and feature requirements for new networking capabilities.
Debug failures, identify root causes, and verify fixes delivered by development teams.
Schedule test runs, track testing progress, and generate test status reports with detailed defect documentation.
Write and maintain automation tests across multiple frameworks (Python, Perl), enhancing test efficiency and scalability.
Collaborate with cross-functional global teams including R&D, product marketing, and system verification.
Requirements:
What we need to see:
B.Sc./ M.Sc. in Computer Science, Information Systems, Electrical Engineering, or related technical field.
2+ years of hands-on experience in QA, preferably with a focus on networking.
Strong understanding of software testing methodologies, test planning, and bug lifecycle.
Proficiency in automation scripting (Python, Perl, or Shell) on Unix/Linux platforms.
Familiarity with networking concepts, protocols, and devices (e.g., switches, NICs).
Strong analytical and debugging skills with an eye for detail.
Excellent communication skills, both written and verbal.

Ways to stand out from the crowd:
Experience in Python automation and working with source control tools (Git, Gerrit); solid knowledge of Linux and kernel internals.
Hands-on experience with virtualized and mixed computing environments (KVM, VMware, Linux/Windows).
In-depth understanding of TCP/IP, routing protocols, LAN switching, and data center topologies.
Exposure to QA methodologies, release management, and end-to-end test lifecycle.
Familiarity with NVIDIA technologies such as InfiniBand, NVLINK, and GPUs is a strong advantage.
This position is open to all candidates.
 
8539917
3 days ago
Location: Yokne'am
Job Type: Full Time
We are looking for a System/Network Test Engineer to join our End-to-End Cloud solution team. You will be part of the E2E Verification team, whose main goal is to test and help define the most sophisticated Ethernet/InfiniBand NIC and Switch features and topologies, enabling our products to quickly meet growing market demands. The ideal candidate will engage in testing and defining industry-leading networking systems and will bring the ability to learn new features, technologies, and protocols quickly.

What you'll be doing:

Contribute to design reviews and product feature requirements across the whole Ethernet/InfiniBand NIC & Switch portfolio and AI network.

Design and build setup topologies for appropriate product coverage, with an emphasis on emulating large-scale, complex customer environments.

Design requirements for the test automation team and implement tests for new features, as part of our growing network switch and adapter division.

Lead innovation by preparing and deploying POC activities based on growing field demands.

Generate comprehensive test reports during the release execution procedure; assist with reproducing and debugging complex customer use cases and determining the root cause; serve as the engineering PIC for full verification cycles of customer use-case fixes provided by the R&D team.

Execute end-to-end test scenarios in different scopes: regression, performance, functional, and scale; report the progress of testing and provide summary reports of testing activity.
Requirements:
What we need to see:

B.A./B.Sc. in Computer Science or Electrical Engineering or equivalent experience as IT/Network Engineer.

2+ years of practical experience.

Strong hands-on experience with Linux-based platforms.

Experience with L2 & L3 network protocols.

Fast, self-directed learner with outstanding technical skills.

Independent, responsible worker, able to plan and complete.

Effective troubleshooting and problem-solving skills.

Standout colleague with good communication and interpersonal skills.

Ways to stand out from the crowd:

Experience with virtualization technologies (KVM, Hyper-V, VMware, OpenStack, Kubernetes).

Experience with congestion control/DCQCN and switches, and knowledge of collective communication libraries (NCCL, MPI, etc.).

Scripting skills and experience: Bash / Python.
This position is open to all candidates.
 
8585200
11/02/2026
Location: Yokne'am
Job Type: Full Time
We are looking for an experienced Software QA Engineer to join us.
This position will be part of the software QA team, whose main goal is to test the physical layer area, which enables our products to meet industry-leading benchmarks of efficiency and quality. The ideal candidate will be responsible for developing and implementing automation solutions that will drive efficiency and productivity across our engineering efforts. Your work will directly contribute to the flawless execution of our projects and help us maintain our position as a world-class technology leader. You will also engage in creating tests for many industry-leading features, learn new technologies and features, and develop automated tests.

What you'll be doing:

Reviewing architecture design and requirements for new features introduced in the SoC software.

Collaborating with diverse teams to identify automation requirements and devise creative solutions.

Designing, developing, and implementing automation tests, frameworks, and tools to streamline engineering processes.

Assisting with issue reproduction and debugging to understand the root cause and reporting new bugs.

Maintaining and executing automated tests to ensure the quality and reliability of our products.

Resolving automation issues and optimizing operations for seamless functionality and minimal interruptions.

Providing technical guidance and mentoring to junior engineers to foster their growth and development.
Requirements:
What we need to see:

B.A. / B.Sc. in Computer Science or Electrical Engineering.

2+ years of experience in the QA field or Automation.

Strong programming skills in Python.

Background with QA Methodologies and testing tools.

Strong problem-solving abilities to identify and resolve automation issues effectively.

Independent, responsible worker, able to plan and deliver.

Great teammate with good communication and interpersonal skills.

Ways to stand out from the crowd:

Experience in VMware.

Experience in Linux operating systems.

Knowledge of Networking test design and automation.

Familiarity with virtual machines and hypervisor environments with good knowledge of computer hardware.

Fluent English communication skills.
This position is open to all candidates.
 
8541444
3 days ago
Confidential company
Location: Yokne'am and Tel Hai
Job Type: Full Time
We are looking for a QA Manager to own a new testing domain, which enables our products to meet industry-leading benchmarks of efficiency and quality. The ideal candidate will be able to learn new features, execute manual and automated tests, and be a technical expert for the team.

What you'll be doing:

Lead the QA team from the planning stage through execution up to GA release, with the highest quality for the products.

Collaborate with peer engineering teams, and program/product management to ensure that product requirements, goals and objectives are met or exceeded.

Lead and implement analysis, debugging, and tracking flows in accordance with QA and SW methodologies.

Review the team's test plans for product coverage and strive for full automation.

Report bugs found during execution, assist with reproduction and debugging to understand the root cause, and verify bug fixes provided by the R&D team.

Mentor and guide the professional and technical development of the team members.

Review and identify improvement opportunities in established processes, infrastructure, and practices to ensure the teams are performing in the most efficient way.
Requirements:
What we need to see:

B.Sc. degree or equivalent experience in Engineering/Computer Science/related field

3+ years of overall experience as a QA and QA Automation Engineer

2+ years of managerial experience

Demonstrated ability to collaborate and engage other peers in the organization and with support organizations

Experience with a scripting language

Outstanding leadership, communication, interpersonal, and analytical skills with the ability to successfully lead multiple teams in a highly dynamic matrix organization

Good understanding of networking concepts.

Ways to stand out from the crowd:

Expert level in Python programming

Experience with Kubernetes, Docker, and virtualization

Background with release management, V&V, E2E testing
This position is open to all candidates.
 
8585133
02/03/2026
Location: Yokne'am
Job Type: Full Time
We are looking for a top-tier Senior QA Software Test Engineer to join the Cumulus system Ethernet group. This position will be part of our QA team, whose main goal is testing our Ethernet Switch/Router and Spectrum-X AI Solution. You will participate in requirements and design reviews, test plan development, test execution, and automated test development for specific features of Cumulus Linux. The role will include lab activity and setup installation.

What you'll be doing:

Functional and system testing of various Layer 2 and Layer 3 features of our Mellanox Spectrum series Ethernet Switch systems running Cumulus Linux.

Writing detailed feature and system test plans, defining and designing test beds and topologies.

Setting up and installing new switches/servers in the lab, including network configuration.

Hardware/Firmware related testing for new and legacy systems.

Identifying and reporting issues found during testing into the defect tracking system and validating the fixes and workarounds.

Developing automated test suites for different features of NVIDIA-Cumulus Linux.

Building and maintaining automation required to ensure quality via continuous functional regression.
Requirements:
What we need to see:

B.S. degree in Engineering/Computer Science/related field, or equivalent experience.

5+ years of demonstrable experience in Software Quality Engineering.

Strong skills in Python or other scripting languages are a must.

Strong technical abilities, problem solving, designing, coding and debugging skills.

Hands-on experience with L2 and L3 protocols such as MLAG, VLAN, STP, OSPF, BGP, EVPN, etc.

Experience with using test tools from Ixia or Spirent and working experience in test management.

Experience with IT and switch/server installation, including operating systems.

Proficiency with tools like HP ALM.

Good experience working on Unix or Linux-based operating systems.

Ways to stand out from the crowd:

Experience with bring-up and troubleshooting of Ethernet interfaces and modules.

Knowledge of performance testing and solving performance issues.

Experience with CI methodology & tools (Git, Gerrit, Jenkins etc.).
This position is open to all candidates.
 
8566062
11/02/2026
Location: Yokne'am
Job Type: Full Time
We are looking for a Senior networking test engineer with strong system‑level debugging skills to join our End‑to‑End Verification team. You will work on cutting‑edge Ethernet‑based AI clusters, owning complex issues across hardware, system software and AI workloads. We are widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative and autonomous, we want to hear from you!

What you'll be doing:

Design and review test and product requirements across the Ethernet / NIC / DPU / Switch portfolio, focusing on large‑scale AI cluster behavior.

Build and maintain realistic customer‑like testbeds, including heterogeneous hardware, OS / driver combinations and complex network fabrics.

Own end‑to‑end cluster troubleshooting: reproduce customer scenarios, triage across the stack and drive issues to root cause and fix.

Read and understand relevant source code to identify defects, validate fixes and improve logging and instrumentation.

Collaborate closely with development teams to debug NCCL, RoCE/RDMA and related networking components using logs, code inspection and targeted experiments.

Define tests and guide the automation team to implement robust suites that produce actionable logs, metrics and traces.

Run Regression, Performance, Functional and Scale testing, analyze results and provide clear, data‑driven reports to stakeholders.

Profile and benchmark deep learning training and inference workloads, correlating model‑level metrics with system and network telemetry to uncover bottlenecks.
Requirements:
What we need to see:

B.A./B.Sc. in Computer Science, Electrical Engineering, or equivalent IT/Network/Systems experience.

5+ years of hands‑on networking or system‑level testing and debugging on Linux.

Strong Linux networking and debugging skills (for example perf, tcpdump, ethtool, iproute2).

Proven production‑grade debugging experience: forming hypotheses, running experiments, and driving issues to root cause under pressure.

Expertise in host‑side NIC validation and tuning (offloads, queues, interrupts, firmware/driver interactions).

Strong knowledge of AI networking libraries (such as NCCL) and protocols (such as RoCE and RDMA), including performance and correctness debugging.

Ability to read and reason about source code (C/C++/Python or similar) and collaborate closely with developers on fixes.

Solid scripting and automation skills with Bash / Python / Ansible for setup, log collection, and experiment orchestration.

Fast learner, familiar with modern AI tools and workflows, able to adapt quickly.

Excellent analytical, problem‑solving and communication skills, with strong ownership and a collaborative mindset.

Ways to stand out from the crowd:

Hands‑on debugging of collective communication libraries (for example NCCL) or large‑scale LLM training / inference clusters.

Experience with large cluster environments (tens to thousands of GPUs or nodes), including incident response and post‑mortem analysis.

Deep expertise in tuning and debugging congestion control and lossless Ethernet for AI workloads (for example DCQCN, ECN, PFC).

Familiarity with NVIDIA networking technologies (for example BlueField / BF3, ConnectX NICs) and their software stack and diagnostics.

Experience debugging issues that span multiple layers (L2/L3, transport, AI frameworks) or contributing to open‑source networking / AI systems.
This position is open to all candidates.
 
8541388
4 days ago
Location: Yokne'am
Job Type: Full Time
We are looking for a Senior networking test engineer with strong system‑level debugging skills to join our End‑to‑End Verification team. You will work on cutting‑edge Ethernet‑based AI clusters, owning complex issues across hardware, system software and AI workloads. NVIDIA is widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative and autonomous, we want to hear from you!

What you'll be doing:

Design and review test and product requirements across the Ethernet / NIC / DPU / Switch portfolio, focusing on large‑scale AI cluster behavior.

Build and maintain realistic customer‑like testbeds, including heterogeneous hardware, OS / driver combinations and complex network fabrics.

Own end‑to‑end cluster troubleshooting: reproduce customer scenarios, triage across the stack and drive issues to root cause and fix.

Read and understand relevant source code to identify defects, validate fixes and improve logging and instrumentation.

Collaborate closely with development teams to debug NCCL, RoCE/RDMA and related networking components using logs, code inspection and targeted experiments.

Define tests and guide the automation team to implement robust suites that produce actionable logs, metrics and traces.

Run Regression, Performance, Functional and Scale testing, analyze results and provide clear, data‑driven reports to stakeholders.

Profile and benchmark deep learning training and inference workloads, correlating model‑level metrics with system and network telemetry to uncover bottlenecks.
Requirements:
What we need to see:

B.A./B.Sc. in Computer Science, Electrical Engineering, or equivalent IT/Network/Systems experience.

5+ years of hands‑on networking or system‑level testing and debugging on Linux.

Strong Linux networking and debugging skills (for example perf, tcpdump, ethtool, iproute2).

Proven production‑grade debugging experience: forming hypotheses, running experiments, and driving issues to root cause under pressure.

Expertise in host‑side NIC validation and tuning (offloads, queues, interrupts, firmware/driver interactions).

Strong knowledge of AI networking libraries (such as NCCL) and protocols (such as RoCE and RDMA), including performance and correctness debugging.

Ability to read and reason about source code (C/C++/Python or similar) and collaborate closely with developers on fixes.

Solid scripting and automation skills with Bash / Python / Ansible for setup, log collection, and experiment orchestration.

Fast learner, familiar with modern AI tools and workflows, able to adapt quickly.

Excellent analytical, problem‑solving and communication skills, with strong ownership and a collaborative mindset.

Ways to stand out from the crowd:

Hands‑on debugging of collective communication libraries (for example NCCL) or large‑scale LLM training / inference clusters.

Experience with large cluster environments (tens to thousands of GPUs or nodes), including incident response and post‑mortem analysis.

Deep expertise in tuning and debugging congestion control and lossless Ethernet for AI workloads (for example DCQCN, ECN, PFC).

Familiarity with NVIDIA networking technologies (for example BlueField / BF3, ConnectX NICs) and their software stack and diagnostics.

Experience debugging issues that span multiple layers (L2/L3, transport, AI frameworks) or contributing to open‑source networking / AI systems.
This position is open to all candidates.
 
8584095
10/02/2026
Location: Ra'anana and Yokne'am
Job Type: Full Time
We are seeking a QA Engineer to join our Cloud Testing team. As part of the broader QA organization, you'll focus on testing the software layer that enables our products to achieve industry-leading benchmarks in performance and quality. This role involves hands-on testing of cutting-edge cloud and networking features, learning new technologies, and contributing to both manual and automated validation efforts.

What You'll Be Doing:
Design, develop, and execute test plans for new features across software and update releases.
Identify and report bugs, assist in reproducing issues, and collaborate with R&D to isolate root causes and validate fixes.
Perform a variety of tests including functional, regression, performance, and security testing.
Analyze network flows and debug issues through log and system analysis.
Develop automation scripts and contribute to building scalable test infrastructures.
Requirements:
What We Need to See:
Practical Engineer / BA / BSc in Computer Science or a related field.
2+ years of experience in software testing, with a focus on networking and virtualization technologies.
Proficiency in scripting languages and automation tools such as Python, pytest, or Bash.
Strong hands-on experience with Linux systems.
Ability to work independently, take ownership, and execute tasks effectively.
Willingness and availability to work with physical lab setups.
Strong verbal and written communication skills in English.
A collaborative team player with a proactive and leadership-oriented mindset.

Ways to Stand Out from the Crowd:
Experience with containerized environments (Docker, Kubernetes)
Familiarity with DevOps practices and CI/CD pipelines using Jenkins or GitLab
Understanding and experience with Agile and Scrum methodologies
This position is open to all candidates.
 
8539974
7 hours ago
Location: Yokne'am
Job Type: Full Time
We are looking for a Senior networking test engineer with strong system‑level debugging skills to join our End‑to‑End Verification team. You will work on cutting‑edge Ethernet‑based AI clusters, owning complex issues across hardware, system software and AI workloads. We are widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative and autonomous, we want to hear from you!

What you'll be doing:

Design and review test and product requirements across the Ethernet / NIC / DPU / Switch portfolio, focusing on large‑scale AI cluster behavior.

Build and maintain realistic customer‑like testbeds, including heterogeneous hardware, OS / driver combinations and complex network fabrics.

Own end‑to‑end cluster troubleshooting: reproduce customer scenarios, triage across the stack and drive issues to root cause and fix.

Read and understand relevant source code to identify defects, validate fixes and improve logging and instrumentation.

Collaborate closely with development teams to debug NCCL, RoCE/RDMA and related networking components using logs, code inspection and targeted experiments.

Define tests and guide the automation team to implement robust suites that produce actionable logs, metrics and traces.

Run Regression, Performance, Functional and Scale testing, analyze results and provide clear, data‑driven reports to stakeholders.

Profile and benchmark deep learning training and inference workloads, correlating model‑level metrics with system and network telemetry to uncover bottlenecks.
Requirements:
What we need to see:

B.A./B.Sc. in Computer Science, Electrical Engineering, or equivalent IT/Network/Systems experience.

5+ years of hands‑on networking or system‑level testing and debugging on Linux.

Strong Linux networking and debugging skills (for example perf, tcpdump, ethtool, iproute2).

Proven production‑grade debugging experience: forming hypotheses, running experiments, and driving issues to root cause under pressure.

Expertise in host‑side NIC validation and tuning (offloads, queues, interrupts, firmware/driver interactions).

Strong knowledge of AI networking libraries (such as NCCL) and protocols (such as RoCE and RDMA), including performance and correctness debugging.

Ability to read and reason about source code (C/C++/Python or similar) and collaborate closely with developers on fixes.

Solid scripting and automation skills with Bash / Python / Ansible for setup, log collection, and experiment orchestration.

Fast learner, familiar with modern AI tools and workflows, able to adapt quickly.

Excellent analytical, problem‑solving and communication skills, with strong ownership and a collaborative mindset.

Ways to stand out from the crowd:

Hands‑on debugging of collective communication libraries (for example NCCL) or large‑scale LLM training / inference clusters.

Experience with large cluster environments (tens to thousands of GPUs or nodes), including incident response and post‑mortem analysis.

Deep expertise in tuning and debugging congestion control and lossless Ethernet for AI workloads (for example DCQCN, ECN, PFC).

Familiarity with our networking technologies (for example BlueField / BF3, ConnectX NICs) and their software stack and diagnostics.

Experience debugging issues that span multiple layers (L2/L3, transport, AI frameworks) or contributing to open‑source networking / AI systems.
This position is open to all candidates.
 
8586994