Senior HPC DevOps Engineer

2 hours ago
Confidential company
Location: Yokne'am
Job Type: Full Time
We are looking for an experienced HPC DevOps and Network Engineer to help us build the supercomputers and HPC clusters of the future. As a Senior HPC DevOps Engineer, you'll be a key player in groundbreaking advancements in artificial intelligence and GPU computing. Your expertise will drive the latest breakthroughs, providing insights on at-scale system design and tuning mechanisms for large-scale compute runs. You will work with the latest accelerated computing and deep learning software and hardware platforms, and with many scientific researchers, developers, and customers to craft improved workflows and develop new, leading, differentiated solutions. You will interact with HPC, OS, GPU compute, and systems specialists to architect, develop, and bring up large-scale performance platforms.
What you'll be doing:
Innovate and Implement: Design, implement, and maintain large-scale HPC/AI clusters with state-of-the-art monitoring, logging, and alerting systems.
Infrastructure as Code (IaC): Utilize and develop tools to manage infrastructure as code, ensuring scalable and repeatable deployments.
Streamline CI/CD Pipelines: Develop and maintain continuous integration and continuous delivery (CI/CD) pipelines to automate and streamline deployment processes.
Automate Everything: Develop automation scripts and tools to automate deployment, configuration management, and operational monitoring.
Develop complex networking automation.
Troubleshoot Complex Issues: Perform comprehensive troubleshooting from bare metal to application level, ensuring system reliability and efficiency.
Lead and Educate: Serve as a technical resource, developing and sharing best practices with internal teams.
Drive Innovation: Support R&D activities and engage in proof of concepts (POCs) and proof of values (POVs) for future improvements.
Requirements:
B.Sc. in Computer Science, Engineering, or a related field with 5+ years of experience.
Deep knowledge of HPC and AI solution technologies, including CPUs, GPUs, high-speed interconnects, and supporting software.
Advanced proficiency in programming and scripting languages, with a solid understanding of object-oriented programming principles.
Familiarity with Jenkins, Ansible, Puppet/Chef.
Excellent knowledge of Windows and Linux (Red Hat/CentOS and Ubuntu), networking, and OS-level security.
Deep understanding of networking protocols such as InfiniBand and Ethernet.
Experience with job scheduling and workload orchestration tools such as Slurm and Kubernetes.
Experience with multiple storage solutions like Lustre, GPFS, ZFS, and XFS.
Expertise with virtual systems (VMware, Hyper-V, KVM, Citrix).
Familiarity with cloud platforms (AWS, Azure, Google Cloud).
Ways to stand out from the crowd:
Proven networking experience or strong knowledge through professional networking training.
Architectural Insight: Knowledge of CPU and/or GPU architecture.
Container Expertise: Understanding of Kubernetes and container-related microservice technologies.
GPU Focus: Experience with GPU-focused hardware/software (DGX, CUDA).
RDMA Fabrics: Background with RDMA (InfiniBand or RoCE) fabrics.
This position is open to all candidates.
 
Job ID: 8465332
Similar jobs that may interest you
2 hours ago
Confidential company
Location: More than one
Job Type: Full Time
We are looking for an outstanding engineer to join the Networking Software Cross DevOps team. The position is part of a team that supports sophisticated systems and integrations and maintains proprietary, open source, and in-house tools, striving to improve development workflows and overall efficiency. We are a one-of-a-kind company, crafting the future of computing and challenging existing conventions. With your help, we will forge the next generation of compute infrastructure, multiplying the power of the CPU, GPU, and DPU with groundbreaking networking technology for the age of AI.
What you'll be doing:
Develop DevOps tools using company, proprietary, and open source technologies to help build better CI/CD workflows across the company.
Craft efficiency and usability improvements in the company's product development environments to streamline release pipelines and processes across the company.
Be in the critical path of supporting hundreds of developers as well as intimately understanding the values of predictability, automation and reliability.
Design and build sophisticated automations and AI powered applications.
Requirements:
Bachelor's or Master's degree in computer science, computer engineering, or an equivalent program
5+ years of proven experience
Expert development ability in at least one scripting language (e.g. Python, Groovy or similar)
Profound understanding of the Linux operating system.
Deep understanding of containerization and cloud technologies architecture.
Outstanding problem-solving and critical thinking abilities.
Ability to self-manage, lead technically, and communicate effectively.
Ways to stand out from the crowd:
Expertise in version control systems (e.g. Git, GitLab, GitHub, Gerrit)
Expertise with Jenkins or similar.
Strong interest in groundbreaking technologies
Ability to take initiative and drive it across multiple functional teams.
Familiarity with CSP infrastructure and tooling.
This position is open to all candidates.
 
Job ID: 8465352
3 hours ago
Location: Tel Aviv-Yafo and Yokne'am
Job Type: Full Time
The company's Networking Advanced Development Software team develops groundbreaking new technologies to win new market share for the company and strengthen customer relationships. These are emerging technologies in networking and distributed computing for the booming AI factories and data centers. They span areas such as AI neural networks, deep learning, high-performance computing (HPC), storage, cloud, software-defined networking, network function virtualization, 5G NR, and more. We develop solutions top-down, all the way from application behavioral analysis, through architecture definition, down to implementation, using the company's world-leading devices. Development spans every needed component: application software, middleware, OS kernel subsystems, device drivers, embedded software (firmware), and CUDA GPU code. We collaborate with partners and key customers in the analysis process and engage with open source communities to introduce our leading features.
What you'll be doing:
Lead a team of 5 engineers in advanced technology development
Design and implement solutions throughout all layers from high level application, OS and driver subsystem to firmware
Work on impactful projects involving state-of-the-art high-performance computing hardware and software
Provide insight and technical guidance and collaborate with peers from across the company - including software architecture, chip architecture, and engineering departments to improve our future technology
Collaborate with the company's partners and customers.
Requirements:
B.Sc. in Computer Science, Electrical Engineering, Computer Engineering, or a related field, or equivalent practical experience
10+ years of overall industry experience in system programming or related fields and 3+ years of experience leading a team
Understanding of multi-core hardware, operating systems design, concurrency, virtual memory, caching, interrupts, device drivers, and real-time systems
Excellent programming skills
Ability to learn complex concepts in a fast-paced environment
A teammate with a can-do attitude, high energy and excellent interpersonal skills
Ways to stand out from the crowd:
Familiarity with networking protocols
Experience with open-source projects (coursework, personal, or contributions)
Experience working in a fast-paced and dynamic environment.
This position is open to all candidates.
 
Job ID: 8465195
20/11/2025
Location: Yokne'am and Tel Hai
Job Type: Full Time
We are searching for an outstanding Software Linux Engineer to expand our installation and packaging capabilities for our networking software. As a Software Engineer, you will focus on designing, developing, and maintaining user-space tools, packaging systems, and installation flows across leading Linux distributions. This position offers the opportunity to have a real impact in a dynamic, technology-focused company, contributing to product lines that power the most advanced data centers, cloud environments, and HPC systems in the world.

Are you passionate about Linux, system-level integration, and delivering seamless installation experiences? Do you want to help drive the deployment of high-speed networking solutions across multiple Linux ecosystems? Are you excited to work on cutting-edge technologies while enabling customers and internal teams to accelerate adoption?

What you'll be doing:
Design, implement, and maintain installation and packaging workflows for our networking software across major Linux distributions (Debian, Ubuntu, RHEL, SLES, etc.).
Develop and support meta-packages, profiles, and tools to streamline user-space installation and configuration.
Work with distribution maintainers to ensure compatibility and smooth delivery through native packaging systems (DEB, RPM).
Perform system-level testing and verification of package installations on various platforms and OS versions.
Collaborate with cross-functional teams (kernel, QA, release engineering, and support) to ensure installation quality and maintainability.
Contribute to Linux kernel driver development and backporting to support advanced networking features.
Requirements:
What we need to see:
BS in Computer Science, Computer/Software Engineering, or a related field.
4+ years of software development experience, with a strong focus on Linux system-level development.
Proficiency in Linux package management systems (dpkg, RPM, yum, apt, zypper) and scripting languages such as Python or Bash.
Experience with packaging standards, automation tools, and release workflows.
Familiarity with kernel backporting, patch management, and driver installation (a plus).
Excellent communication and collaboration skills with a customer-focused mindset.
Strong debugging and troubleshooting skills, especially across varied Linux environments.


Ways to stand out from the crowd:
MS in Computer Science, Electrical Engineering, or a related field.
Deep knowledge of Linux operating systems and distribution lifecycle management.
Experience contributing to public Linux distributions or upstream projects.
Familiarity with cloud or containerized environments (e.g., Docker, Kubernetes).
Experience supporting large-scale deployment environments in data center or HPC settings.
This position is open to all candidates.
 
Job ID: 8421990
3 hours ago
Location: Tel Aviv-Yafo and Yokne'am
Job Type: Full Time
The company's Networking Advanced Development Software team develops groundbreaking new technologies to win new market share for the company and strengthen customer relationships. These are emerging technologies in networking and distributed computing for the booming AI factories and data centers. They span areas such as AI neural networks, deep learning, high-performance computing (HPC), storage, cloud, software-defined networking, network function virtualization, 5G NR, and more. We develop solutions top-down, all the way from application behavioral analysis, through architecture definition, down to implementation, using the company's world-leading devices. Development spans every needed component: application software, middleware, OS kernel subsystems, device drivers, embedded software (firmware), and CUDA GPU code. We collaborate with partners and key customers in the analysis process and engage with open source communities to introduce our leading features.
What you'll be doing:
Design and implement solutions throughout all layers from high level application, OS and driver subsystem to firmware
Work on impactful projects involving state-of-the-art high-performance computing hardware and software
Provide insight and technical guidance and collaborate with peers from across the company - including software architecture, chip architecture, and engineering departments to improve our future technology
Collaborate with the company's partners and customers.
Requirements:
B.Sc. in Computer Science, Electrical Engineering, Computer Engineering, or a related field
Understanding of multi-core hardware, operating systems design, concurrency, virtual memory, caching, interrupts, device drivers, and real-time systems
Programming skills
Ability to learn complex concepts in a fast-paced environment.
A teammate with a can-do attitude, high energy and excellent interpersonal skills
Ways to stand out from the crowd:
Familiarity with networking protocols
Experience with open-source projects (coursework, personal, or contributions)
Experience working in a fast-paced and dynamic environment.
This position is open to all candidates.
 
Job ID: 8465199
18/11/2025
Location: Yokne'am
Job Type: Full Time
We are building state-of-the-art accelerated computing platforms that know no boundaries. Our next-generation InfiniBand, NVLink, and Ethernet systems will continue to be at the forefront of connecting and powering the world's most advanced AI clusters. We are looking for a highly motivated and experienced senior networking software engineer to join our SAI development team.

This is an outstanding opportunity to join our high-performance multi-site team, work on some of the most pioneering technologies, and implement and lead cutting-edge networking features for cloud, HPC, and AI networks. We drive the data growth of the world's biggest companies. With talented engineers around the globe, the work environment is dynamic, meaningful, and fast-paced.

What you'll be doing:

Develop first-tier features with groundbreaking multi-protocol networking technology.

Lead features from planning through design and development, until delivery to the customer.

Work closely with other development teams, architecture, and verification to ensure on-time, high-quality feature delivery.

Gain deep understanding of our products and technologies.
Requirements:
What we need to see:

B.Sc. degree or equivalent experience in Engineering/Computer Science/related field.

At least 5 years of experience in development positions in the industry.

C programming experience is a must; Python programming experience is an advantage.

Strong technical understanding and learning skills, with specification, design, programming, integration, and debugging abilities.

Self-motivated, with the ability to work with little definition and supervision while multi-tasking and prioritizing across a number of projects and initiatives.

Experience with testing methodologies; some tasks will include developing sophisticated, fully automated testing environments.

Excellent English communication and leadership skills.

Ways to stand out from the crowd:

Experience in Ethernet switching product development and knowledge of routing/bridging protocols.

Experience working in a multi-functional team and collaborating with teams at overseas sites.

Linux networking knowledge, TCP/IP stack.
This position is open to all candidates.
 
Job ID: 8418809
3 hours ago
Location: Tel Aviv-Yafo and Yokne'am
Job Type: Full Time
We seek a highly motivated Network Performance Exploration Engineer to join our team of experts and help shape the foundational infrastructure for the AI revolution. Our next-generation networking systems are at the forefront of connecting and powering the world's most advanced AI clusters. As a key member of our architecture team, you will be responsible for exploring and identifying critical network optimization opportunities across our entire hardware and software stack, analyzing how system-level changes impact application-level performance.
What You'll Be Doing:
Explore and validate end-to-end application performance, defining comprehensive test plans and critical metrics to identify optimization opportunities in both hardware and software.
Establish and maintain a comprehensive database of benchmark results, tracking performance across releases to drive data-informed decisions.
Conduct deep-dive analysis into communication libraries (like NCCL), system software, and hardware configurations to investigate performance characteristics, validate architectural theories, and identify bottlenecks.
Provide critical performance data to correlate and enhance simulation tools, ensuring our models accurately predict real-world hardware behavior.
Analyze application-level traffic patterns (e.g., LLMs) on our advanced networking fabrics to identify hardware and software optimization opportunities and tune system parameters.
Lead Proof-of-Concept (POC) projects to prototype and evaluate potential hardware and software optimizations and their impact on application performance.
Requirements:
B.Sc. or M.Sc. degree in Computer Science, Computer Engineering, or Electrical Engineering, or equivalent experience.
5+ years of relevant industry or research experience in high-performance computing, computer architecture, or computer networks.
Hands-on programming skills in Python and/or C/C++ for system analysis, automation, and customizing benchmarks.
Excellent understanding of large-scale system behavior and the effect of distributed computing workloads on network and system performance.
Proven experience in performance analysis, benchmarking, and identifying system bottlenecks.
Exceptional analytical, problem-solving, and systems-thinking skills, with the ability to dive deep into complex software and hardware interactions.
Ability to thrive in a fast-paced, dynamic environment and work concurrently with multiple cross-functional teams.
Ways To Stand Out From The Crowd:
Deep understanding of and hands-on experience with communication libraries such as NCCL, UCX, or MPI.
Direct experience debugging or modifying the source code of a major communication library.
Expertise in the architecture and system-level requirements of large-scale, distributed Deep Learning workloads (e.g., LLMs).
Expertise in high-performance network protocols (Ethernet, InfiniBand, RoCE) and interconnect technologies like NVLink.
Familiarity with the PyTorch ecosystem, especially for distributed workloads.
This position is open to all candidates.
 
Job ID: 8465097
3 hours ago
Location: Yokne'am
Job Type: Full Time
We are now looking for a TensorRT-LLM Software Development Engineer!
We are hiring software engineers for our TensorRT-LLM team. Academic and commercial groups around the world are using GPUs to power a revolution in deep learning-powered AI, enabling breakthroughs in areas like LLMs, ChatGPT, and generative AI that have brought deep learning to the iPhone moment for AI. Join the team that is building the inferencing software that is foundational to product lines within our company and across the industry! The ability to work on a fast-paced, delivery-focused team is required, and excellent interpersonal skills are a must.
What you'll be doing:
Craft and develop robust inference software that can be scaled to multiple platforms for functionality and performance
Performance analysis, optimization, and tuning for Large Language Models (LLMs)
Conduct unit tests and performance tests for different stages of the inference pipeline.
Closely follow academic developments in the field of artificial intelligence and update TensorRT-LLM features accordingly
Write safe, scalable, modular, and high-quality (C++/Python) code for our core backend software for LLM inference.
Collaborate across the company to guide the direction of deep learning inference, working with software, research and product teams.
Requirements:
Bachelor's, Master's, or higher degree in Computer Engineering, Computer Science, Applied Mathematics, or a related computing-focused degree (or equivalent experience).
5+ years of relevant software development experience.
Excellent Python programming skills, software design, and software engineering skills
Awareness of the latest developments in LLM architectures and LLM inference techniques
Experience working with deep learning frameworks like PyTorch and HuggingFace
Proactive and able to work without supervision
Excellent written and oral communication skills in English
Ways to stand out from the crowd:
Prior experience with an LLM inference framework (TensorRT-LLM, SGLang, vLLM, etc.) or a DL compiler in inference, deployment, algorithms, or implementation
Prior experience with performance modeling, profiling, debugging, and code optimization of a DL/HPC/high-performance application
Excellent C/C++ programming and software design skills, including debugging, performance analysis, and test design.
Architectural knowledge of CPU and GPU
GPU programming experience (CUDA or OpenCL).
This position is open to all candidates.
 
Job ID: 8465145
19/11/2025
Location: Yokne'am
Job Type: Full Time
Our Networking Software Group is rapidly growing, and we are hiring a Software Engineer for the InfiniBand Switch Software Development team. Come and join a diverse group of engineers spread across the globe who come together in small, close-knit teams to innovate and develop groundbreaking solutions.

As a member of the team, you will be a part of a cutting-edge Python-based SW project using advanced techniques to solve complex issues. You will gain unique knowledge of how operating systems and the Linux kernel work and how large-scale networks are constructed. Teams utilize the latest software engineering methodologies and tools in an agile fashion to release on time. Are you ready for this challenge?

The Networking Hardware Acceleration team develops a cutting-edge, high-speed API for NVIDIA's Network Interface Cards (NICs). We power foundational projects like DPDK and DOCA-Flow, driving next-generation networking performance. Join us to gain deep insights into NVIDIA's hardware acceleration technology and make a meaningful impact on both software and hardware innovation.

What You'll Be Doing:
Learn new networking features, plan their verification strategy, and implement it on top of a Python-based in-house developed environment.
Design, develop, optimize, and maintain an OS/Kernel verification testing platform.
Collaborate with team members, architects, design, QA teams, and customers (both external and internal).
Innovate! We are always looking for new ways to make NVIDIA's Networking driver products shine in customers' eyes.
Requirements:
What We Need To See:
B.S. degree or equivalent experience in Engineering/Computer Science/related field.
1+ years of relevant experience.
Strong technical abilities, problem-solving, design, coding, and debugging skills.
Ability to lead feature development, take full ownership of tasks from A-Z and deliver independently with minimum supervision.
Great teammate with strong interpersonal skills.

Ways To Stand Out From The Crowd:
Proven experience in Python programming.
Knowledge of networking protocols and the Linux kernel.
Experience in software verification or validation.
This position is open to all candidates.
 
Job ID: 8421170
13/11/2025
Location: Tel Aviv-Yafo and Yokne'am
Job Type: Full Time
We are looking for a versatile Senior Software Engineer for the DPU Platform team. This position offers the opportunity to have a real impact in a multifaceted, technology-focused company, affecting product lines that empower the most advanced data centers in the world. Using your deep knowledge of embedded platforms, operating systems, and software distribution technologies, you will work with a world-wide development team to solve the unique challenges of delivering the world's most powerful platforms.

What you'll be doing:

Develop system software components including processor firmware and bootloaders, kernel drivers/modules, and user space applications and libraries

Collaborate with hardware and product design teams to develop software for sophisticated SoC platform designs.

Assist world-wide teams with various customer and internal DPU projects.

Tackle complex system-level optimization and resource utilization challenges.

Participate across all levels of a product development lifecycle that values high standards for clear requirements, software quality, and performance.

Collaborate within a worldwide matrixed software development team, and have broad impact within our highly-dynamic and technology-focused company.
Requirements:
What we need to see:

Bachelor's degree in Computer Science/Engineering or equivalent experience.

5+ years developing software for embedded systems (C is required, Python).

Proven understanding of the system software stack, with a focus on software/hardware interaction, including platform firmware, device drivers, Linux kernel, and how user-space applications utilize system services to achieve high performance.

A deep knowledge of high-performance processor architecture including CPU and cache coherency concepts, as well as hardware accelerators.

Well-rounded engineering skills, including technical investigation, design, testing, and agile software engineering process.

Outstanding written and oral communication skills.

Must be proficient in the C programming language.

Experienced with build environment tools (gcc, git, GitHub, make, BitBake, shell scripts, Gerrit, Jenkins, etc.).

Ways to stand out from the crowd:

Background with ARMv8 microarchitecture, ATF and/or UEFI software is a strong plus.

Experience with multiple Linux distributions, with the ability to compare and contrast them.

Experience developing security key management solutions is very desirable.

Exposure to secure boot flows and/or trusted computing environments.
This position is open to all candidates.
 
Job ID: 8412818
2 hours ago
Location: More than one
Job Type: Full Time
Our company has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It's a unique legacy of innovation that's fueled by great technology and amazing people. Today, we're tapping into the unlimited potential of AI to define the next era of computing: an era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what's never been done before takes vision, innovation, and the world's best talent. As an employee, you'll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world.
We are seeking a highly skilled and modern software engineer to develop and prototype brand new advancements in distributed training and inference using our company's Spectrum-X AI fabric. This role offers a rare chance to pioneer AI and networking technology, contributing to groundbreaking projects that will define the landscape of large-scale AI systems. You will improve the connection between AI applications and the network by refining communication, crafting congestion control, coding NIC firmware, and expanding switch SDK features for enhanced AI factory efficiency. Your work will impact the development, scaling, and speed of large AI systems.
What you'll be doing:
Prototype end-to-end solutions to improve distributed training and disaggregated inference performance.
Analyze and optimize communication flows across application, transport, and network layers.
Develop system software spanning communication libraries, drivers, and firmware integrations.
Collaborate with hardware, firmware, and SDK teams to co-design network features.
Validate and integrate prototypes into our company's AI infrastructure and products.
Requirements:
BSc/MSc/PhD in Computer Science or Electrical Engineering
5+ years of relevant experience and/or knowledge
Deep understanding of networking and communication internals: NCCL, RDMA/RoCE, and congestion control.
Hands-on experience with HW/SW/FW integration and low-level programming (C/C++, kernel, drivers).
Some background in distributed training systems (such as PyTorch DDP, Megatron-LM, DeepSpeed).
Ways to stand out from the crowd:
Demonstrated innovation and leadership turning prototypes into impactful product features.
Experience with programmable data planes (P4, eBPF, DOCA SDK, or switch SDKs).
Familiarity with NIC firmware scheduling, in-network compute, or congestion management.
Contributions to open-source projects, academic papers, or performance benchmarking tools.
Strong background in AI factory architectures, distributed inference, or network telemetry.
This position is open to all candidates.
 
Job ID: 8465368