Interconnect Failure Analysis Hardware Engineer

08/02/2026
Location: Yokne'am
Job Type: Full Time
We are looking for a hardware failure analysis engineer to join the Interconnect FA team. This position will be part of the product engineering team, which controls our Interconnect production lines. The ideal candidate will carry out failure analysis of products, reviewing both the design and the hardware assembly and executing manual and automated tests.

What you'll be doing:

Review architectural design (electrical, mechanical, process, software) of the tested units to locate potential design issues.

Design, develop and execute test suites to find the root cause of HW malfunctions in customer-returned or production material.

Connect with internal / external customers for background review.

Finalize the investigation with summary reports.
Requirements:
What we need to see:

B.Sc. in Electrical Engineering / Practical Electrical Engineering.

Minimum of 2 years' experience in design / debug / FA of electrical circuits.

Hands-on experience reading schematics and operating lab tools (DVM, oscilloscopes, pattern generators, etc.).

Knowledge of electrical equipment.

Independent, responsible worker, able to plan and execute.

Team player with good communication and interpersonal skills.

Ways to stand out from the crowd:

Experience in Python scripting.

Background in the Linux shell.

Knowledge of optical hardware design.

Experience operating networking equipment (switch CLI, NIC driver commands).
This position is open to all candidates.
 
Job ID: 8536588

Similar jobs that may interest you
08/02/2026
Location: Yokne'am
Job Type: Full Time
We are looking for a hardware failure analysis engineer to join the Interconnect FA team. This position will be part of the product engineering team, which controls NVIDIA Interconnect production lines. The ideal candidate will carry out failure analysis of products, reviewing both the design and the hardware assembly and executing manual and automated tests.

What you'll be doing:

Review architectural design (electrical, mechanical, process, software) of the tested units to locate potential design issues.

Design, develop and execute test suites to find the root cause of HW malfunctions in customer-returned or production material.

Connect with internal / external customers for background review.

Finalize the investigation with summary reports.
Requirements:
What we need to see:

B.Sc. in Electrical Engineering / Practical Electrical Engineering.

Minimum of 2 years' experience in design / debug / FA of electrical circuits.

Hands-on experience reading schematics and operating lab tools (DVM, oscilloscopes, pattern generators, etc.).

Knowledge of electrical equipment.

Independent, responsible worker, able to plan and execute.

Team player with good communication and interpersonal skills.

Ways to stand out from the crowd:

Experience in Python scripting.

Background in the Linux shell.

Knowledge of optical hardware design.

Experience operating networking equipment (switch CLI, NIC driver commands).
This position is open to all candidates.
 
Job ID: 8536572

08/02/2026
Confidential company
Location: Yokne'am
Job Type: Full Time
We are looking for a Hardware Engineer for our System Product Engineering group. The candidate will take full responsibility for product support engineering matters for networking systems. The position requires an understanding of hardware, firmware/software, mechanical integration, and manufacturing. The candidate will supervise product engineering activities across various contract manufacturing (CM) lines, working closely with groups across the networking organization: Hardware/Software/Firmware R&D, Architecture, Test Engineering, Quality & Reliability, and Operations. The candidate will take part in direct communications with our internal and external teams. The position provides an opportunity to gain deep knowledge of our networking systems and a system-level view of our solutions, along with working in a dynamic and positive environment.

What you'll be doing:

Leading Cross-Organizational Task Forces: You will be responsible for managing and leading cross-functional teams to address quality events and failure modes in our networking products. This involves coordinating efforts from various departments, such as hardware, software, and manufacturing, to collectively work toward identifying and resolving issues.

Root Cause Analysis and Test Planning: You will perform in-depth root cause analysis by meticulously analyzing failure modes, and will monitor and control the test yield for all stations.

Onsite Engineering Support: You will be the technical owner for all test issues, test infrastructures, and test setups, and will deliver fast and professional responses to all issues.

The workplace is located at the FLEX factory in Migdal Haemek, Israel.
Requirements:
What we need to see:

B.Sc. in Electrical Engineering, or equivalent experience.

5+ years of technical experience in hardware or system design.

Experience in failure analysis, debugging, architecture, or product engineering.

Familiar with the manufacturing processes of electrical products.

Proficient with lab equipment and lab setups, with strong troubleshooting skills.

A highly motivated teammate who always stays up-to-date with new technologies and FA methodologies.

Proficient English is required.

Ways to stand out from the crowd:

Experience as a test engineer or system/product engineer.

Experience in board or system design.

Familiar with ICT / JTAG technology and functional tests.

Experience with Linux operating systems and programming languages such as Perl or Python.
This position is open to all candidates.
 
Job ID: 8536251

10/02/2026
Confidential company
Location: Yokne'am
Job Type: Full Time
We are looking for an excellent System Product Engineer! NVIDIA Networking division is a leading supplier of innovative end-to-end InfiniBand and Ethernet connectivity solutions and services for servers and storage. We offer market-leading solutions that include adapter cards, switches, cables and software to support networking technologies. Our products optimize data center performance and deliver industry-leading bandwidth and scalability. In addition, we serve a wide range of markets including high performance computing, enterprise, data centers, cloud computing, big data and Web 2.0. We are constantly reinventing ourselves to stay ahead of the market and bring groundbreaking products and services to the industry.

The System Product Engineer is responsible for stabilizing hardware products from the first build unit through ramp-up and transition to high-volume mass production. This role plays a critical part in ensuring product quality, yield, and test robustness for NVIDIA's cutting-edge platforms by leading debug infrastructure development, systematic failure analysis, and data-driven continuous improvement across the NPI lifecycle.

What you'll be doing:

Own end-to-end product stabilization from first article to production release.

Develop and maintain hardware debug infrastructures, tools, and methodologies to support efficient root cause analysis.

Monitor, analyze, and improve yield performance during NPI.

Perform deep-dive failure analysis and root cause investigations to improve product robustness, quality, and throughput.

Improve test stability, coverage, and repeatability while reducing false failures.

Collaborate closely with Mechanical NPI engineers, Test Development teams, R&D, and manufacturing partners.

Apply data analysis techniques to identify systemic issues, trends, and optimization opportunities.
Requirements:
What we need to see:

B.Sc. in Electrical Engineering.

4+ years of relevant experience in product engineering, failure analysis/debug, or HW/test design.

Proven hands-on experience in HW debugging.

Strong problem-solving abilities with a diligent approach.

Outstanding interpersonal and communication skills in English.

Ability to thrive in a collaborative and dynamic environment.


Ways to stand out from the crowd:

Background in test design and HW lab equipment.

Experience with data/yield analysis.

Experience with JMP and/or other data analysis tools.

Experience working in a manufacturing environment.

Basic programming knowledge (C, Perl, Python) in a Linux environment.
This position is open to all candidates.
 
Job ID: 8540040

08/02/2026
Confidential company
Location: Yokne'am
Job Type: Full Time
Our networking product engineering team is looking for an excellent Test Development Engineer. The position requires an understanding of both HW and SW to provide stable, efficient, and smoothly running production tests that enable high availability while ensuring the quality of the products being shipped to customers.

What you'll be doing:

Design and develop automated tests for networking switches and DPU adapters while working closely with the HW, ASIC, and SW engineering teams to achieve reliable tests with high coverage.

Be exposed to various aspects of design, DFT, and test of our next-generation network products.

Take a significant part in the definition and development of tests, from early development level to the final test of Network-Systems in mass production.

Work with overseas manufacturing Mass Production teams to increase yields, test coverage and capacity, and to reduce production costs.

Lead test solution innovations to reduce test time and improve overall product quality.

Utilize test suites to find, debug, and resolve problems in the production process, and drive issues to closure.

Support and fix bugs in existing test code; support production lines for deployments, patches, and ongoing maintenance.
Requirements:
What we need to see:

BSc in Computer Science, Electrical Engineering, or a related field.

3+ years of related experience in software development.

3+ years of proven experience in Python development.

Programming experience in one or more additional languages such as Perl or C - an advantage.

Excellent knowledge of a version control system (preferably Git).

Experience working with UNIX/Linux Operating Systems.

Background in software/hardware product integration and HW lab measurement equipment.

Excellent communication skills and hands-on experience collaborating with global, cross-functional teams.

Self-motivated and a great teammate.

Ways to stand out from the crowd:

Excellent programming, debugging, performance analysis, and test design skills.

Knowledge of Hardware testing, Mass-Production flows, and yield improvement methodologies.

Creativity: the ability to find solutions for challenging requirements.

A highly motivated teammate who always stays up-to-date with new technologies and test methodologies.
This position is open to all candidates.
 
Job ID: 8536401

10/02/2026
Location: Yokne'am
Job Type: Full Time
We are looking for an experienced engineer to join our SW/FW testing team. This position will be part of the QA group, which enables our products to meet industry-leading benchmarks of efficiency and quality. The ideal candidate will engage in testing of industry-leading features, be able to learn new features, execute manual and automated tests, and participate in automation development.


What you'll be doing:

Review architecture design and requirements documents for new features introduced in different domains.

Design, develop and implement tests for the new features, as part of SW/FW update releases.

Automate newly added tests in the existing automation framework, and add new capabilities and features to the framework.

Report bugs found during execution, assist with reproduction and debugging to understand the root cause, verify bug fixes provided by the R&D team, and escalate if not fixed.

Perform tests in different scopes: regression, performance, functional, security; report progress of testing and provide summary reports of testing activity.
Requirements:
What we need to see:

Practical Engineer or B.Sc. in Computer Science or Electrical Engineering.

2+ years of proven experience in QA.

Scripting skills and proven experience in Python.

Clear verbal and written communication; proficient written and spoken English.

Independent worker, able to plan and shine in their area of responsibility.

Standout colleague with good communication skills and a desire to lead.


Ways to stand out from the crowd:

Knowledge / experience in Networking.

Familiarity with ITU-T standards and synchronization protocols, such as SyncE and PTP or equivalent experience.

Proven experience in QA methodologies and test design.
This position is open to all candidates.
 
Job ID: 8540052

05/02/2026
Location: Yokne'am
Job Type: Full Time
We are looking for a skilled Software Developer with strong hardware knowledge to join our diverse team passionate about developing low-level software and test infrastructure for our networking and Interconnect products. As a technical focal point, you will work at the intersection of hardware and software, taking ownership of driver development, hardware bring-up, and test system architecture. You will be instrumental in driving innovation by developing low-level software that directly controls hardware, debugging sophisticated hardware-software interactions, and creating robust testing solutions. If you're passionate about code development, automation, innovation, reading schematics, debugging hardware with software tools, and becoming a key technical contributor, we'd love to have you on board!

What You'll Be Doing:

Collaborate with multi-functional teams including hardware, electrical, thermal, and mechanical engineers to understand hardware specifications and develop corresponding test requirements.

Take the lead in debugging hardware-software integration issues using instruments and software tools to pinpoint root causes.

Develop low-level drivers in C/C++ for hardware components and build DLL wrappers for integration with higher-level C# applications.

Review and interpret hardware schematics and datasheets to understand signal flows, timing requirements, and implement appropriate software control mechanisms.

Lead all aspects of hardware bring-up, validation, and deployment of test solutions to production environments, ensuring accurate integration.
Requirements:
What We Need to See:

Bachelor's or master's degree in Electrical Engineering, Computer Engineering, or Software Engineering with significant hardware/electronics coursework or hands-on experience.

5+ years of hands-on experience in hardware-software integration, low-level driver development, or firmware development with proven ability to lead technical projects.

Strong proficiency in one or more programming languages such as Python, Java, C#, with additional experience in C/C++ for low-level programming.

Demonstrated ability to read and interpret electrical schematics, block diagrams, and hardware datasheets.

Experience with hardware debugging tools such as oscilloscopes, logic analyzers, JTAG debuggers, or similar instruments.

Excellent problem-solving skills in developing software solutions for sophisticated hardware-software interactions. Ability to collaborate with hardware teams, demonstrating deep technical ownership of hardware-software integration projects.

Ways To Stand Out from the Crowd:

Hands-on experience with PCIe, I2C, SPI, UART, or other hardware communication protocols. Experience developing device drivers for Windows or Linux environments.

Background in embedded systems, microcontrollers, DSPs, FPGAs, or custom ASIC integration. Experience wrapping native C/C++ libraries into managed DLLs for .NET/C# applications.

Knowledge of hardware validation methodologies and experience with automated hardware test equipment. Previous work in networking hardware, high-speed interconnects, or semiconductor validation environments.

Strong ability to bridge communication between hardware and software teams, translating hardware requirements into software solutions.
This position is open to all candidates.
 
Job ID: 8534014

02/03/2026
Location: Yokne'am
Job Type: Full Time
We are looking for a skilled Software Developer with strong hardware knowledge to join our diverse team passionate about developing low-level software and test infrastructure for our networking and Interconnect products. As a technical focal point, you will work at the intersection of hardware and software, taking ownership of driver development, hardware bring-up, and test system architecture. You will be instrumental in driving innovation by developing low-level software that directly controls hardware, debugging sophisticated hardware-software interactions, and creating robust testing solutions. If you're passionate about code development, automation, innovation, reading schematics, debugging hardware with software tools, and becoming a key technical contributor, we'd love to have you on board!


What You'll Be Doing:

Collaborate with multi-functional teams including hardware, electrical, thermal, and mechanical engineers to understand hardware specifications and develop corresponding test requirements.

Take the lead in debugging hardware-software integration issues using instruments and software tools to pinpoint root causes.

Develop low-level drivers in C/C++ for hardware components and build DLL wrappers for integration with higher-level C# applications.

Review and interpret hardware schematics and datasheets to understand signal flows, timing requirements, and implement appropriate software control mechanisms.

Lead all aspects of hardware bring-up, validation, and deployment of test solutions to production environments, ensuring accurate integration.
Requirements:
What We Need to See:

Bachelor's or master's degree in Electrical Engineering, Computer Engineering, or Software Engineering with significant hardware/electronics coursework or hands-on experience.

5+ years of hands-on experience in hardware-software integration, low-level driver development, or firmware development with proven ability to lead technical projects.

Strong proficiency in one or more programming languages such as Python, Java, C#, with additional experience in C/C++ for low-level programming.

Demonstrated ability to read and interpret electrical schematics, block diagrams, and hardware datasheets.

Experience with hardware debugging tools such as oscilloscopes, logic analyzers, JTAG debuggers, or similar instruments.

Excellent problem-solving skills in developing software solutions for sophisticated hardware-software interactions. Ability to collaborate with hardware teams, demonstrating deep technical ownership of hardware-software integration projects.


Ways To Stand Out from the Crowd:

Hands-on experience with PCIe, I2C, SPI, UART, or other hardware communication protocols. Experience developing device drivers for Windows or Linux environments.

Background in embedded systems, microcontrollers, DSPs, FPGAs, or custom ASIC integration. Experience wrapping native C/C++ libraries into managed DLLs for .NET/C# applications.

Knowledge of hardware validation methodologies and experience with automated hardware test equipment. Previous work in networking hardware, high-speed interconnects, or semiconductor validation environments.

Strong ability to bridge communication between hardware and software teams, translating hardware requirements into software solutions.
This position is open to all candidates.
 
Job ID: 8566021

11/02/2026
Location: Yokne'am
Job Type: Full Time
We are looking for a skilled Software Developer with strong hardware knowledge to join our diverse team passionate about developing low-level software and test infrastructure for NVIDIA's networking and Interconnect products. As a technical focal point, you will work at the intersection of hardware and software, taking ownership of driver development, hardware bring-up, and test system architecture. You will be instrumental in driving innovation by developing low-level software that directly controls hardware, debugging sophisticated hardware-software interactions, and creating robust testing solutions. If you're passionate about code development, automation, innovation, reading schematics, debugging hardware with software tools, and becoming a key technical contributor, we'd love to have you on board!

What You'll Be Doing:

Collaborate with multi-functional teams including hardware, electrical, thermal, and mechanical engineers to understand hardware specifications and develop corresponding test requirements.

Take the lead in debugging hardware-software integration issues using instruments and software tools to pinpoint root causes.

Develop low-level drivers in C/C++ for hardware components and build DLL wrappers for integration with higher-level C# applications.

Review and interpret hardware schematics and datasheets to understand signal flows, timing requirements, and implement appropriate software control mechanisms.

Lead all aspects of hardware bring-up, validation, and deployment of test solutions to production environments, ensuring accurate integration.
Requirements:
What We Need to See:

Bachelor's or master's degree in Electrical Engineering, Computer Engineering, or Software Engineering with significant hardware/electronics coursework or hands-on experience.

5+ years of hands-on experience in hardware-software integration, low-level driver development, or firmware development with proven ability to lead technical projects.

Strong proficiency in one or more programming languages such as Python, Java, C#, with additional experience in C/C++ for low-level programming.

Demonstrated ability to read and interpret electrical schematics, block diagrams, and hardware datasheets.

Experience with hardware debugging tools such as oscilloscopes, logic analyzers, JTAG debuggers, or similar instruments.

Excellent problem-solving skills in developing software solutions for sophisticated hardware-software interactions. Ability to collaborate with hardware teams, demonstrating deep technical ownership of hardware-software integration projects.

Ways To Stand Out from the Crowd:

Hands-on experience with PCIe, I2C, SPI, UART, or other hardware communication protocols. Experience developing device drivers for Windows or Linux environments.

Background in embedded systems, microcontrollers, DSPs, FPGAs, or custom ASIC integration. Experience wrapping native C/C++ libraries into managed DLLs for .NET/C# applications.

Knowledge of hardware validation methodologies and experience with automated hardware test equipment. Previous work in networking hardware, high-speed interconnects, or semiconductor validation environments.

Strong ability to bridge communication between hardware and software teams, translating hardware requirements into software solutions.
This position is open to all candidates.
 
Job ID: 8542252

11/02/2026
Confidential company
Location: Yokne'am
Job Type: Full Time
Our Networking division is looking for an excellent System Validation Engineer! NVIDIA Networking division is a leading supplier of innovative end-to-end InfiniBand and Ethernet connectivity solutions and services for servers and storage. We offer market-leading solutions that include adapter cards, switches, cables, and software to support networking technologies. Our products optimize data center performance and deliver industry-leading bandwidth and scalability. In addition, we serve a wide range of markets including high-performance computing, enterprise, data centers, cloud computing, big data, and Web 2.0. We are constantly reinventing ourselves to stay ahead of the market and bring groundbreaking products and services to the industry.

What you'll be doing:

Test Verification and Validation (both HW and SW).
Maintain product quality by improving test stability, coverage, design, and manufacturing process.
Execute tests in different scopes, such as regression, performance degradation, functional, and security; report the progress of testing and provide summary reports of the activity.
Troubleshoot and streamline/optimize our testing procedures.
Support production matters.
Requirements:
What we need to see:
B.Sc. in Electrical/Computer Engineering or a related field with 3+ years of experience.
Experience in hardware/software validation.
Programming experience in one or more programming languages: Perl, Python, C, C++.
Strong automation/scripting skills.
Coding experience, including the ability to understand a large Python project codebase and derive unit tests.
A good knowledge of simulation flow and test automation development.
Strong problem-solving ability and experience in product engineering, failure analysis and debug, or HW/test design.
Excellent interpersonal and communication skills in English.

Ways to stand out from the crowd:
Experience as a Verification Engineer.
Experience in high-speed electrical testing and unit testing.
Background with production manufacturing flows.
This position is open to all candidates.
 
Job ID: 8541432

11/02/2026
Location: Yokne'am
Job Type: Full Time
We are looking for a Senior Networking Test Engineer with strong system‑level debugging skills to join our End‑to‑End Verification team. You will work on cutting‑edge Ethernet‑based AI clusters, owning complex issues across hardware, system software and AI workloads. We are widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative and autonomous, we want to hear from you!

What you'll be doing:

Design and review test and product requirements across the Ethernet / NIC / DPU / Switch portfolio, focusing on large‑scale AI cluster behavior.

Build and maintain realistic customer‑like testbeds, including heterogeneous hardware, OS / driver combinations and complex network fabrics.

Own end‑to‑end cluster troubleshooting: reproduce customer scenarios, triage across the stack and drive issues to root cause and fix.

Read and understand relevant source code to identify defects, validate fixes and improve logging and instrumentation.

Collaborate closely with development teams to debug NCCL, RoCE/RDMA and related networking components using logs, code inspection and targeted experiments.

Define tests and guide the automation team to implement robust suites that produce actionable logs, metrics and traces.

Run Regression, Performance, Functional and Scale testing, analyze results and provide clear, data‑driven reports to stakeholders.

Profile and benchmark deep learning training and inference workloads, correlating model‑level metrics with system and network telemetry to uncover bottlenecks.
Requirements:
What we need to see:

B.A./B.Sc. in Computer Science, Electrical Engineering, or equivalent IT/Network/Systems experience.

5+ years of hands‑on networking or system‑level testing and debugging on Linux.

Strong Linux networking and debugging skills (for example perf, tcpdump, ethtool, iproute2).

Proven production‑grade debugging experience: forming hypotheses, running experiments, and driving issues to root cause under pressure.

Expertise in host‑side NIC validation and tuning (offloads, queues, interrupts, firmware/driver interactions).

Strong knowledge of AI networking libraries (such as NCCL) and protocols (such as RoCE and RDMA), including performance and correctness debugging.

Ability to read and reason about source code (C/C++/Python or similar) and collaborate closely with developers on fixes.

Solid scripting and automation skills with Bash / Python / Ansible for setup, log collection, and experiment orchestration.

Fast learner, familiar with modern AI tools and workflows, able to adapt quickly.

Excellent analytical, problem‑solving and communication skills, with strong ownership and a collaborative mindset.

Ways to stand out from the crowd:

Hands‑on debugging of collective communication libraries (for example NCCL) or large‑scale LLM training / inference clusters.

Experience with large cluster environments (tens to thousands of GPUs or nodes), including incident response and post‑mortem analysis.

Deep expertise in tuning and debugging congestion control and lossless Ethernet for AI workloads (for example DCQCN, ECN, PFC).

Familiarity with NVIDIA networking technologies (for example BlueField / BF3, ConnectX NICs) and their software stack and diagnostics.

Experience debugging issues that span multiple layers (L2/L3, transport, AI frameworks) or contributing to open‑source networking / AI systems.
This position is open to all candidates.
 
Job ID: 8541388