Senior AI Test Architect

8541318

Similar jobs that may interest you

3 days ago

Senior AI Test Architect

Confidential company

Location: Yokne'am

Job Type: Full Time

We are looking for an AI Test Architect to join the E2E Verification group and profile innovative large-scale distributed training on NVIDIA AI end-to-end solutions in large-scale supercomputing clusters. You will provide insights on at-scale system design and tuning mechanisms for large-scale compute runs. You will work with the latest accelerated computing and deep learning software and hardware platforms, and with researchers, developers, and customers, to craft improved workflows and develop new, leading, differentiated solutions. You will interact with HPC, OS, switch, HCA, CPU and GPU compute, and systems specialists to architect, develop, and bring up large-scale performance platforms.

What you'll be doing:

Profiling, benchmarking, and analyzing deep learning models to identify areas for optimization and improvement in terms of performance, efficiency, and accuracy, with a strong emphasis on networking aspects.

Collaborating closely with data scientists, researchers, development, and automation teams to design and implement scalable training pipelines and frameworks that demonstrate large-scale, high-performance networking capabilities.

Staying up-to-date with the latest advancements in deep learning algorithms, architectures, NVIDIA GPU technologies, and high-performance networking solutions.

Optimizing deep learning models for performance, memory usage, and power efficiency while maximizing high-performance networking features on NVIDIA supercomputers.

Providing insights and recommendations based on the analysis of large-scale training results, specifically focusing on networking bottlenecks and optimizations, to improve model outcomes and achieve business objectives.

Collaborating with hardware engineers to guide the development and integration of efficient networking solutions for deep learning, including exploring network architecture optimizations and bringing to bear technologies such as RDMA or InfiniBand.

Requirements:
What we need to see:

B.Sc. in Computer Science, Software Engineering, or equivalent experience.

Strong understanding and practical experience with machine learning algorithms and techniques, with a specialization in deep learning and expertise in high-performance networking.

8+ years of overall experience, with CUDA programming for deep learning frameworks such as TensorFlow and PyTorch, combined with expertise in networking libraries and protocols.

Ability to profile and optimize deep learning workflows, focusing on networking-related bottlenecks and optimizations, to improve overall performance and efficiency.

Exceptional analytical and problem-solving skills, with keen attention to detail, particularly in identifying and resolving networking performance issues.

Excellent communication and collaboration skills, enabling effective teamwork and cooperation.

Familiarity with supercomputers, parallel computing, distributed systems, and high-performance networking technologies such as RDMA or InfiniBand.

Ways to stand out from the crowd:

Demonstrated experience in successfully profiling and optimizing large-scale deep learning training on our supercomputers, with a significant focus on high-performance networking enhancements.

Experience with distributed deep learning, distributed training frameworks, or large-scale data pipelines enhanced by high-performance networking solutions.

Expertise in optimizing networking parameters, such as bandwidth, latency, or congestion control, for deep learning workloads.

Familiarity with NVIDIA's networking technologies, such as Mellanox InfiniBand, and their integration with deep learning workflows.

Strong understanding of high-performance networking protocols and standards and their application to deep learning.

This position is open to all candidates.

8536135

11/01/2026

Senior AI Test Architect

Confidential company

Location: Yokne'am

Job Type: Full Time

Requirements:
What we need to see:

B.Sc. in Computer Science, Software Engineering, or equivalent experience.

Strong understanding and practical experience with machine learning algorithms and techniques, with a specialization in deep learning and expertise in high-performance networking.

8+ years of overall experience, with CUDA programming for deep learning frameworks such as TensorFlow and PyTorch, combined with expertise in networking libraries and protocols.

Ability to profile and optimize deep learning workflows, focusing on networking-related bottlenecks and optimizations, to improve overall performance and efficiency.

Exceptional analytical and problem-solving skills, with keen attention to detail, particularly in identifying and resolving networking performance issues.

Excellent communication and collaboration skills, enabling effective teamwork and cooperation.

Familiarity with supercomputers, parallel computing, distributed systems, and high-performance networking technologies such as RDMA or InfiniBand.

Ways to stand out from the crowd:

Demonstrated experience in successfully profiling and optimizing large-scale deep learning training on our supercomputers, with a significant focus on high-performance networking enhancements.

Experience with distributed deep learning, distributed training frameworks, or large-scale data pipelines enhanced by high-performance networking solutions.

Expertise in optimizing networking parameters, such as bandwidth, latency, or congestion control, for deep learning workloads.

Familiarity with our networking technologies, such as Mellanox InfiniBand, and their integration with deep learning workflows.

Strong understanding of high-performance networking protocols and standards and their application to deep learning.

This position is open to all candidates.

8496288

12 hours ago

Senior Networking Solution Test Engineer, AI Cluster Debugging

Confidential company

Location: Yokne'am

Job Type: Full Time

We are looking for a Senior Networking Test Engineer with strong system-level debugging skills to join our End-to-End Verification team. You will work on cutting-edge Ethernet-based AI clusters, owning complex issues across hardware, system software, and AI workloads. We are widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative and autonomous, we want to hear from you!

What you'll be doing:

Design and review test and product requirements across the Ethernet / NIC / DPU / Switch portfolio, focusing on large‑scale AI cluster behavior.

Build and maintain realistic customer‑like testbeds, including heterogeneous hardware, OS / driver combinations and complex network fabrics.

Own end‑to‑end cluster troubleshooting: reproduce customer scenarios, triage across the stack and drive issues to root cause and fix.

Read and understand relevant source code to identify defects, validate fixes and improve logging and instrumentation.

Collaborate closely with development teams to debug NCCL, RoCE/RDMA and related networking components using logs, code inspection and targeted experiments.

Define tests and guide the automation team to implement robust suites that produce actionable logs, metrics and traces.

Run Regression, Performance, Functional and Scale testing, analyze results and provide clear, data‑driven reports to stakeholders.

Profile and benchmark deep learning training and inference workloads, correlating model‑level metrics with system and network telemetry to uncover bottlenecks.

Requirements:
What we need to see:

B.A./B.Sc. in Computer Science, Electrical Engineering, or equivalent IT/Network/Systems experience.

5+ years of hands‑on networking or system‑level testing and debugging on Linux.

Strong Linux networking and debugging skills (for example perf, tcpdump, ethtool, iproute2).

Proven production‑grade debugging experience: forming hypotheses, running experiments, and driving issues to root cause under pressure.

Expertise in host‑side NIC validation and tuning (offloads, queues, interrupts, firmware/driver interactions).

Strong knowledge of AI networking libraries (such as NCCL) and protocols (such as RoCE and RDMA), including performance and correctness debugging.

Ability to read and reason about source code (C/C++/Python or similar) and collaborate closely with developers on fixes.

Solid scripting and automation skills with Bash / Python / Ansible for setup, log collection, and experiment orchestration.

Fast learner, familiar with modern AI tools and workflows, able to adapt quickly.

Excellent analytical, problem‑solving and communication skills, with strong ownership and a collaborative mindset.

Ways to stand out from the crowd:

Hands‑on debugging of collective communication libraries (for example NCCL) or large‑scale LLM training / inference clusters.

Experience with large cluster environments (tens to thousands of GPUs or nodes), including incident response and post‑mortem analysis.

Deep expertise in tuning and debugging congestion control and lossless Ethernet for AI workloads (for example DCQCN, ECN, PFC).

Familiarity with NVIDIA networking technologies (for example BlueField / BF3, ConnectX NICs) and their software stack and diagnostics.

Experience debugging issues that span multiple layers (L2/L3, transport, AI frameworks) or contributing to open‑source networking / AI systems.

This position is open to all candidates.

8541388

5 hours ago

Senior HPC and AI Cluster Administrator

Confidential company

Location: More than one

Job Type: Full Time

We are looking for a Senior HPC and AI Cluster Administrator to join the Networking clusters solutions HPC/AI Infrastructure team. We are building supercomputers and AI clusters based on groundbreaking technologies. We are looking for a system administrator who will be a key player working with the most exciting computing hardware and software and will contribute to the latest breakthroughs in artificial intelligence and GPU computing.

You will work with the latest accelerated computing and deep learning software and hardware platforms, and with many scientific researchers, developers, and customers, to craft improved workflows and develop new, leading, differentiated solutions. You will interact with HPC, OS, GPU compute, and systems specialists to architect, develop, and bring up large-scale performance platforms. Does this sound like you? If so, we would love to hear from you!

What you will be doing:

Deploy, manage and maintain large scale HPC/AI clusters.

Managing Linux job/workload schedules and orchestration tools.

Support and maintain continuous integration and delivery pipelines.

Troubleshooting and fixing, bottom up from bare metal, operating system, software stack and application level.

Supporting Research & Development activities and engaging in POCs/POVs for future improvements.

Requirements:
What we need to see:
Bachelor's Degree in Computer Science, Engineering, or a related field; or equivalent experience.

5+ years of experience.

Knowledge of HPC and AI solution technologies from CPUs and GPUs to high speed interconnects and supporting software.

Experience with job scheduling and orchestration tools such as Slurm and Kubernetes (K8s).

Excellent knowledge of Windows and Linux (Red Hat/CentOS and Ubuntu) networking (sockets, firewalls, iptables, Wireshark, etc.) and internals, ACLs, OS-level security protection, and common protocols such as TCP, DHCP, and DNS.

Experience with multiple storage solutions such as Lustre, GPFS, ZFS, and XFS. Familiarity with newer and emerging storage technologies.

Python programming and Bash scripting experience, and experience with automation and configuration management tools such as Jenkins, Ansible, and GitOps.

Knowledge of Networking Protocols like InfiniBand, Ethernet.

Experience with virtual systems (for example VMware, Hyper-V, KVM).

Familiarity with cloud computing platforms (e.g. AWS, Azure, Google Cloud).

Ways to stand out from the crowd:

Knowledge of CPU and/or GPU architecture.

Knowledge of Kubernetes, container related microservice technologies.

Experience with GPU-focused hardware/software (DGX, CUDA).

Background with RDMA (InfiniBand or RoCE) fabrics.

This position is open to all candidates.

8542260

3 days ago

Senior Network Performance Exploration Engineer

Confidential company

Location: Tel Aviv-Yafo and Yokne'am

Job Type: Full Time

We seek a highly motivated Network Performance Exploration Engineer to join our team of experts and help shape the foundational infrastructure for the AI revolution. Our next-generation networking systems are at the forefront of connecting and powering the world's most advanced AI clusters. As a key member of our architecture team, you will be responsible for exploring and identifying critical network optimization opportunities across our entire hardware and software stack, analyzing how system-level changes impact application-level performance.

What You'll Be Doing:

Explore and validate end-to-end application performance, defining comprehensive test plans and critical metrics to identify optimization opportunities in both hardware and software.

Establish and maintain a comprehensive database of benchmark results, tracking performance across releases to drive data-informed decisions.

Conduct deep-dive analysis into communication libraries (like NCCL), system software, and hardware configurations to investigate performance characteristics, validate architectural theories, and identify bottlenecks.

Provide critical performance data to correlate and enhance simulation tools, ensuring our models accurately predict real-world hardware behavior.

Analyze application-level traffic patterns (e.g., LLMs) on our advanced networking fabrics to identify hardware and software optimization opportunities and tune system parameters.

Lead Proof-of-Concept (POC) projects to prototype and evaluate potential hardware and software optimizations and their impact on application performance.

Requirements:
What We Need To See:

B.Sc. or M.Sc. degree in Computer Science, Computer Engineering, or Electrical Engineering, or equivalent experience.

5+ years of relevant industry or research experience in high-performance computing, computer architecture, or computer networks.

Hands-on programming skills in Python and/or C/C++ for system analysis, automation, and customizing benchmarks.

Excellent understanding of large-scale system behavior and the effect of distributed computing workloads on network and system performance.

Proven experience in performance analysis, benchmarking, and identifying system bottlenecks.

Exceptional analytical, problem-solving, and systems-thinking skills, with the ability to dive deep into complex software and hardware interactions.

Ability to thrive in a fast-paced, dynamic environment and work concurrently with multiple cross-functional teams.

Ways To Stand Out From The Crowd:

Deep understanding of and hands-on experience with communication libraries such as NCCL, UCX, or MPI.

Direct experience debugging or modifying the source code of a major communication library.

Expertise in the architecture and system-level requirements of large-scale, distributed Deep Learning workloads (e.g., LLMs).

Expertise in high-performance network protocols (Ethernet, InfiniBand, RoCE) and interconnect technologies like NVLink.

Familiarity with the PyTorch ecosystem, especially for distributed workloads.

This position is open to all candidates.

8536022

11/01/2026

Principal System Networking Architect

Confidential company

Location: More than one

Job Type: Full Time

We seek a highly motivated and experienced System Architect specializing in Data-Center, AI Fabric, and Ethernet Networking to join our team of experts and help shape the future of high-performance ML/AI computing. You will have the opportunity to work on some of the most pioneering technologies and help drive the innovation of our next-generation networks. You will play a key role in defining end-to-end solutions, networking protocols and features, interworking with orchestration systems, and help address new business opportunities in exciting areas. Our Architects also represent us in open-source projects, technical conferences, and standard development organizations.

What you'll be doing:

Explore new technologies and end-to-end solutions for our Ethernet Networking Platforms.

Be familiar with data-center and AI fabric network topologies, AI/ML clusters operation and network usage, as well as with the Ethernet Switch platforms' design and characteristics.

Define robust architectures and technical requirements for network operating systems and end-to-end solution offerings that meet the needs of AI/ML workloads and high-performing network operations.

Lead the work with R&D and Validation teams, providing technical guidelines, close support, and thorough reviews of detailed designs and test plans.

Collaborate with architects across various fields, including Chip Design, Firmware, Hardware Platforms, and System teams.

Work closely with product marketing, program managers, and account managers to ensure the successful execution of projects.

Support engagements with key customers, issue patents, publish white papers and blogs, and be proactive in technical forums and industry working groups.

Promote innovation through the design and implementation of Proof-of-Concept (PoC).

Requirements:
What we need to see:

B.Sc., M.Sc. or Ph.D. in Computer Science, Computer Engineering, or Electrical Engineering.

15+ years of experience in embedded software development for networking products, including 7+ years functioning as a System and/or Networking Architect.

Expert-level knowledge in Ethernet/IP technologies, network topologies, and networking features in data center, telco and/or edge networks.

Highly experienced in system software design and networking fundamentals.

Excellent understanding of large-scale network behavior and the effect of distributed computing workloads on the network.

Demonstrated ability to maintain technical foresight, conducting deep research and development into new technologies to generate innovative ideas and functional applications.

Leadership skills and accountability, demonstrated in past projects.

Clear verbal and written communication with the ability to build consensus within a large organization.

Possess problem-solving and critical thinking skills.

Ability to operate in a highly dynamic environment.

Ways to stand out from the crowd:

Extensive knowledge of various switch ASIC hardware and Software Development Kits (SDKs).

Demonstrated ability to prototype ideas and demonstrate their value.

Applying ML/AI methods to solve networking problems.

This position is open to all candidates.

8496586

6 days ago

Senior PCIe Firmware Engineer

Confidential company

Location: Tel Aviv-Yafo and Yokne'am

Job Type: Full Time

We are looking for a creative and experienced Senior Firmware Engineer to join our PCIe Firmware team: someone passionate about using artificial intelligence to engineer the foundational hardware of the AI revolution.

As an integral part of our team, you'll architect and implement the core of our next-generation devices. This senior role places you at the center of innovation, where you will have a direct impact on our business and technology by solving sophisticated technical challenges. It's a unique opportunity to shape our technology and empower customers to build the supercomputers and AI fabrics of tomorrow.

What You'll Be Doing:

Lead the architectural design, development, and optimization of cutting-edge PCIe firmware, using AI-driven modeling and insights to deliver exceptional performance.

Serve as a trusted technical expert by investigating, debugging, and resolving challenging PCIe firmware issues for our most important customers.

Collaborate closely with our Chip Design, Verification, Software, and Architecture engineers to find root causes and develop robust, long-term solutions.

Champion the integration of AI-assisted diagnostics and generative AI tools across the entire development lifecycle to boost team productivity and innovation.

Translate customer needs and field data into actionable feedback that directly shapes the future of our products.

Requirements:
What We Need to See:

A degree in Electrical Engineering, Computer Science, Computer Engineering, or equivalent practical experience.

8+ years of significant professional experience in embedded firmware development, with a deep understanding of PCIe.

A strong foundation in computer architecture, operating systems, and object-oriented programming.

Proficiency in scripting languages like Python to automate tasks and workflows.

An innovative approach with a genuine desire to apply AI and machine learning to accelerate firmware development.

Ways to Stand Out from the Crowd:

Track record of applying AI-powered tools like Cursor to accelerate the development lifecycle.

Previous experience in a customer-facing or application engineering role.

Direct, hands-on experience with PCIe switch architecture and its firmware in high-performance applications.

Deep knowledge of hardware verification concepts and tools (e.g., C++, Python, Jenkins).

Extensive knowledge of networking protocols and the Linux operating system.

This position is open to all candidates.

8533752

18/01/2026

Software Architect, Advanced Development

Confidential company

Location: More than one

Job Type: Full Time

We are looking for a passionate, modern software engineer or junior architect early in their career. The role involves developing and prototyping new scalable training and inference advancements using our Spectrum-X AI fabric.

This role offers a rare opportunity to work on innovative AI and networking technologies, building prototypes that influence the development of large-scale AI systems. You will help improve AI application-network interaction by refining communication, crafting congestion control, contributing to NIC and switch capabilities, and enhancing AI factory performance at scale.

What you'll be doing:

Prototype end-to-end solutions to improve distributed training and disaggregated inference performance.

Analyze and optimize communication flows across application, transport, and network layers.

Develop system software spanning communication libraries, drivers, and firmware integrations.

Collaborate with hardware, firmware, and SDK teams to co-design network features.

Validate and integrate prototypes into our AI infrastructure and products.

Requirements:
What we need to see:

Bachelor's or Master's degree in Computer Science or Electrical Engineering.

0-2 years of experience in relevant fields.

Programming knowledge in C/C++.

Ability to work closely with architects and R&D teams.

Passion to learn and innovate independently.

Ways to stand out from the crowd:

Demonstrated innovation and leadership turning prototypes into impactful product features.

Understanding of networking protocols such as Ethernet and InfiniBand.

Ability to quickly adapt to new technology and go deep into new areas.

Contributions to open-source projects, academic papers, or performance benchmarking tools.

Background in AI factory architectures, distributed inference, or network telemetry.

This position is open to all candidates.

8506688

18/01/2026

Senior AI Software Development Engineer, TensorRT-LLM

Confidential company

Location: Yokne'am

Job Type: Full Time

We are now looking for a TensorRT-LLM Software Development Engineer! We are hiring software engineers for our TensorRT-LLM team. Academic and commercial groups around the world are using GPUs to power a revolution in deep learning-powered AI, enabling breakthroughs in areas like LLMs, ChatGPT, and generative AI that have brought deep learning to the iPhone moment for AI. Join the team that is building the inference software that is foundational to product lines within our company and across the industry! The ability to work on a fast-paced, delivery-focused team is required, and excellent interpersonal skills are a must.

What you'll be doing:

Craft and develop robust inference software that can be scaled to multiple platforms for functionality and performance.

Perform performance analysis, optimization, and tuning for Large Language Models (LLMs).

Conduct unit tests and performance tests for different stages of the inference pipeline.

Closely follow academic developments in the field of artificial intelligence and update TensorRT-LLM with new features.

Write safe, scalable, modular, and high-quality (C++/Python) code for our core backend software for LLM inference.

Collaborate across the company to guide the direction of deep learning inference, working with software, research and product teams.

Requirements:
What we need to see:

Bachelor's, Master's, or higher degree in Computer Engineering, Computer Science, Applied Mathematics, or a related computing-focused field (or equivalent experience).

5+ years of relevant software development experience.

Excellent Python programming skills, software design, and software engineering skills.

Awareness of the latest developments in LLM architectures and LLM inference techniques.

Experience working with deep learning frameworks like PyTorch and HuggingFace.

Proactive and able to work without supervision.

Excellent written and oral communication skills in English.

Ways to stand out from the crowd:

Prior experience with an LLM inference framework (TensorRT-LLM, SGLang, vLLM, etc.) or a DL compiler in inference, deployment, algorithms, or implementation.

Prior experience with performance modeling, profiling, debugging, and code optimization of a DL/HPC/high-performance application.

Excellent C/C++ programming and software design skills, including debugging, performance analysis, and test design.

Architectural knowledge of CPU and GPU.

GPU programming experience (CUDA or OpenCL).

This position is open to all candidates.

8506686

18/01/2026

HPC and AI Data Center Engineer

Confidential company

Location: More than one

Job Type: Full Time

We are looking for an HPC and AI Data Center Engineer to join the networking cloud solutions HPC/AI Infrastructure team. We are focused on building supercomputers and HPC clusters based on groundbreaking technologies. We are looking for a lab manager who will be a key player working with the most exciting computing hardware and software and will contribute to the latest breakthroughs in artificial intelligence and GPU computing. Take part in building large-scale compute and deep learning software and hardware platforms, and work with and support many scientific researchers, developers, and customers to craft improved workflows and develop new, leading, differentiated solutions.

What you will be doing:

Plan and build complex clusters and supercomputers in various data centers and labs.

Rack, stack, and manage cabling to ensure efficient use of space and easy maintenance.

Ensure power and cooling efficiency in data centers and labs while optimizing rack space utilization.

Handle daily operation and support of data centers and labs.

Install a variety of infrastructure and solutions: cloud, VMs, storage, network, HPC, and AI.

Perform troubleshooting across network, optic cabling, bare metal, and operating system.

Support Research & Development activities.

Requirements:
What we need to see:

MCSE or MCITP/CCNA certification.

3+ years of experience as lab manager.

Experience in supporting large and complex data centers.

Proven hands-on experience in Linux troubleshooting, with good problem identification and resolution skills.

In-depth knowledge of Linux and Windows core services: DHCP, DNS, NIS, AD, etc.

Team player, service-oriented, and organized.

Ways to stand out from the crowd:

Scripting experience in Bash and/or Python.

Experience with widely used configuration management tools (e.g., Ansible, Puppet).

Experience with CI tools and job schedulers (e.g., Jenkins, Slurm).

Virtualization: KVM / VMware / Hyper-V.

Experience with L2 & L3 network protocols.

This position is open to all candidates.

8506713
