Senior Software Engineer, DOCA

9 hours ago
Confidential company
Location: Yokne'am and Ra'anana
Job Type: Full Time
We are looking for a Senior Software Engineer. You will work with highly experienced engineers to deliver outstanding SmartNIC products for cloud computing, research, medical, automotive, finance, weather, telco, and more. We develop some of the core libraries of the company's DOCA SDK, rapidly growing DOCA's functionality and use cases. With DOCA, developers can program the data center infrastructure by creating software-defined, cloud-native, secured, HW-accelerated services.
We also take a significant part in the Linux Foundation DPDK (dpdk.org) project, which provides the framework and common API for fast packet processing in user space, and expand the company's Mellanox PMD in particular. Our goal is to enable breakthrough network performance using our company's SmartNIC hardware capabilities, and to address the performance, scale, and security demands of modern software-defined enterprise data centers and public cloud infrastructure.
What you'll be doing:
You will architect, design, and develop the next-generation technology in network acceleration, as well as work with best-in-class technical leaders in this domain
Engage with customers and architects to understand the requirements and derive the software design accordingly
Collaborate with other engineering teams that develop upper-layer applications such as virtual switches (OVS, VPP, etc.) and lower layers such as driver, kernel, FW, and HW.
Requirements:
B.Sc. (or equivalent experience) in computer science/software engineering
5+ years of proven experience programming in C/C++
5+ years of proven experience with the Linux environment and tools
Deep experience with networking protocols, mainly Ethernet, and with security protocols
Experience with virtualization technologies
Strong analytical, debugging, and problem-solving skills
Deep knowledge of computer architecture and operating systems.
Experience in performance optimizations
Ways to stand out from the crowd:
Knowledge and experience in DPDK
Knowledge and experience with designing SDKs
Open-source software contributor to relevant projects (OVS, DPDK, Linux kernel, etc.)
A positive demeanor, a growth mindset, and excellent interactions with colleagues.
This position is open to all candidates.
 
8321760
Similar jobs that may interest you
9 hours ago
Confidential company
Location: Ra'anana
Job Type: Full Time
We are looking for an experienced SW Engineer with the desire and ability to contribute to and lead the cutting-edge Network Management System of the most powerful supercomputers in the world. Our team is growing, and we are looking for hardworking, self-motivated engineers to lead the building of advanced, high-scale SDN management solutions. You will be part of a dynamic team, working with amazing people. This crucial role will give you a rare opportunity to craft and deliver a new class of Data Center NMS product line.
What you'll be doing:
The team develops infrastructure for monitoring and gathering telemetry from production environments, running on the world's largest supercomputers and data centers.
The work environment is dynamic and challenging; we are innovating and inventing software products at the forefront of technology in terms of performance, scalability, and features.
Our team works closely with other engineering teams to co-design new features and software APIs.
Requirements:
B.Sc. or equivalent experience in computer science / software engineering.
5 years of experience programming in Python and C/C++.
3 years of experience with the Linux environment and tools.
Deep knowledge of networking protocols: InfiniBand, Ethernet.
Expert knowledge in computer architecture and operating systems.
Experience in performance optimizations.
Ways to stand out from the crowd:
You have a positive attitude and work well with others.
Demonstrated use of creative ideas, providing solutions to challenging problems.
Knowledge in RDMA technology.
This position is open to all candidates.
 
8321918
9 hours ago
Location: Tel Aviv-Yafo and Yokne'am
Job Type: Full Time
We are looking for a versatile Senior Software Engineer for the company's DPU Platform team. This position offers the opportunity to have real impact in a multifaceted, technology-focused company, affecting product lines that empower the most advanced data centers in the world. Using your deep knowledge of embedded platforms, operating systems, and software distribution technologies, you will work with a worldwide development team to solve the unique challenges of delivering the world's most powerful platforms.
What you'll be doing:
Develop system software components including processor firmware and bootloaders, kernel drivers/modules, and user space applications and libraries
Collaborate with hardware and product design teams to develop software for sophisticated SoC platform designs
Assist worldwide teams with various customer and internal DPU projects
Tackle complex system-level optimization and resource utilization challenges
Participate across all levels of a product development lifecycle that values high standards for clear requirements, software quality, and performance
Collaborate within a worldwide matrixed software development team, and have broad impact within our highly-dynamic and technology-focused company.
Requirements:
Bachelor's degree in Computer Science/Engineering or equivalent experience
5+ years developing software for embedded systems (C is required, Python)
Proven understanding of the system software stack, with a focus on software/hardware interaction, including platform firmware, device drivers, Linux kernel, and how user-space applications utilize system services to achieve high performance
A deep knowledge of high-performance processor architecture including CPU and cache coherency concepts, as well as hardware accelerators
Well-rounded engineering skills, including technical investigation, design, testing, and agile software engineering process
Outstanding written and oral communication skills
Must be proficient in the C programming language
Experienced with build environment tools (gcc, git, GitHub, make, BitBake, shell scripts, Gerrit, Jenkins, etc.)
Ways to stand out from the crowd:
Background with ARMv8 microarchitecture, ATF and/or UEFI software is a strong plus
Experience with multiple Linux distributions, with the ability to compare and contrast them
Experience developing security key management solutions is very desirable
Exposure to secure boot flows and/or trusted computing environments.
This position is open to all candidates.
 
8321946
9 hours ago
Location: Tel Aviv-Yafo and Yokne'am
Job Type: Full Time
We are seeking a highly motivated Senior Software Engineer with expertise in embedded software development to join our Data Processing Unit (DPU) Software Group. We are looking for a candidate who can thrive in an environment with sophisticated software and hardware designs, take ownership, and lead the SW development of key components of the DPU. The role includes working closely with HW, FW, and SW teams all over the world and taking our product to the next level.
What you'll be doing:
Design and develop high-performance networking solutions based on our company's outstanding BlueField networking card hardware
Engage closely with customers and partners.
Collaborate with multiple teams in our multi-functional environment on developing new features/improvements.
Stay up to date with industry best practices, new technologies, and emerging trends in software verification.
Write fast, effective, maintainable, reliable and well documented code
Innovate! Make our company's DPU products shine in customers' eyes.
Requirements:
Bachelor's degree in Computer Science, Software Engineering, or a related field (or equivalent work experience).
5+ years of experience in writing programs using C/C++.
Experience with embedded SW development
Good background in designing, implementing, and debugging Software.
Experience in development under a Linux environment.
Extensive knowledge of software debugging and problem solving.
Strong design, coding, analytical, debugging and problem-solving skills
Ability to work concurrently with multiple groups in the organization
Creative, motivated, and value driven person
Ways to stand out from the crowd:
Experience with networking applications and protocols.
Expertise in driver development along with deep knowledge of modern C++ programming.
Proficiency in Python development.
Background in BMC, UEFI, Secure Boot, U-Boot, ATF, and Yocto.
Previous experience working closely with hardware and board design teams.
Experience in software development within the Linux kernel.
This position is open to all candidates.
 
8321937
1 day ago
Location: Ra'anana
Job Type: Full Time
We are looking for an excellent Software Engineer for the Switch SDK Group. You will join the SDK group and take our product to the next level, working closely with various other design and architecture teams and gaining a deep understanding of our company's products and technologies. Our company has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It's a unique legacy of innovation that's fueled by great technology and amazing people.
Today, we're tapping into the unlimited potential of AI to define the next era of computing, an era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what's never been done before takes vision, innovation, and the world's best talent. As an employee, you'll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world.
What you'll be doing:
Design, develop, optimize and maintain APIs, tools and libraries for Switching, Routing, Analytics, Telemetry and many other modules
Collaborate with team members, Architects, QA teams, and customers (both external and internal)
Innovate & rapidly develop POC prototypes that can then be developed into full-fledged products/solutions.
Requirements:
B.Sc. in Software Engineering, Computer Science, or a related field (equivalent work experience will also be considered)
10+ years of experience as a Software Engineer, including experience with C programming
Experience with embedded / real-time embedded systems
Excellent C programming skills, with a keen eye for performance and writing optimized code
Strong analytical skills, deep knowledge of algorithms and proficiency with data structures
Excellent communication and documentation skills
Ways to stand out from the crowd:
Previous experience with Ethernet Switching or Routing protocols
Hands-on Linux development, user-space and/or kernel-space.
This position is open to all candidates.
 
8319754
10 hours ago
Location: Tel Aviv-Yafo and Yokne'am
Job Type: Full Time
We are leading groundbreaking developments in Artificial Intelligence, High Performance Computing and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables groundbreaking creativity and discovery, and powers inventions that were once considered science fiction, from artificial intelligence to autonomous cars. Come work for the team that brought you NCCL, NVSHMEM & GPUDirect. Our GPU communication libraries are crucial for scaling Deep Learning and HPC applications! We're seeking a Senior Software Architect to help co-design next-gen data center platforms and scalable communications software.
DL and HPC applications have huge compute demands and already run at scales of up to tens of thousands of GPUs. GPUs are connected with high-speed interconnects (e.g. NVLink, PCIe) within a node and with high-speed networking (e.g. InfiniBand, Ethernet) across nodes. Efficient and fast communication between GPUs directly impacts end-to-end application performance. This impact continues to grow with the increasing scale of next-generation systems. This is an outstanding opportunity to advance the state of the art, break performance barriers, and deliver platforms the world has never seen before. Are you ready to build the new and innovative technologies that will help realize our company's vision?
What you will be doing:
Investigate opportunities to improve communication performance by identifying bottlenecks in today's systems.
Design and implement new communication technologies to accelerate AI and HPC workloads.
Explore innovative solutions in HW and SW for our next generation platforms as part of co-design efforts involving GPU, Networking, and SW architects.
Build proofs of concept, conduct experiments, and perform quantitative modeling to evaluate and drive new innovations.
Use simulation to explore the performance of large GPU clusters (think scales of hundreds of thousands of GPUs).
Requirements:
M.S./Ph.D. degree in CS/CE or equivalent experience.
5+ years of relevant experience.
Excellent C/C++ programming and debugging skills.
Experience with parallel programming models (MPI, SHMEM) and at least one communication runtime (MPI, NCCL, NVSHMEM, OpenSHMEM, UCX, UCC).
Deep understanding of operating systems, computer and system architecture.
Solid grasp of the fundamentals of network architecture, topology, algorithms, and communication scaling relevant to AI and HPC workloads.
Strong experience with Linux.
Ability and flexibility to work and communicate effectively in a multi-national, multi-time-zone corporate environment.
Ways to stand out from the crowd:
Expertise in related technology and passion for what you do. Experience with CUDA programming and our company GPUs. Knowledge of high-performance networks like InfiniBand, RoCE, NVLink, etc.
Experience with Deep Learning frameworks such as PyTorch, TensorFlow, etc. Knowledge of deep learning parallelisms and their mapping to the communication subsystem. Experience with HPC applications.
Strong collaborative and interpersonal skills and a proven track record of effectively guiding and influencing within a dynamic and multi-functional environment.
This position is open to all candidates.
 
8321599
1 day ago
Confidential company
Location: Tel Aviv-Yafo and Ra'anana
Job Type: Full Time
Our company has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It's a unique legacy of innovation that's fueled by great technology and amazing people. Today, we're tapping into the unlimited potential of AI to define the next era of computing, an era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what's never been done before takes vision, innovation, and the world's best talent. As an employee, you'll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world.
Are you ready to help build next-generation solutions for high-performance web servers and proxy servers, content delivery networks, the financial services market's high-frequency trading (HFT) platforms, and high-performance disaggregated storage? We are looking for an excellent Senior Software Engineer to develop groundbreaking networking acceleration solutions for a variety of markets. Development is done in an exciting, Agile environment on widely deployed products that need constant improvement.
What you'll be doing:
Design and develop high-performance networking solutions based on our company's outstanding ConnectX and BlueField networking card hardware.
Work in a startup mode/group developing groundbreaking networking solutions.
Development of the entire solutions stack, from application level to networking card hardware access.
Development of a kernel-bypass, user-space TCP/IP stack on top of our outstanding networking card hardware.
Invent and implement creative ways to improve performance and scalability.
Deliver at large scale and high quality.
Work closely with customers and partners.
Collaborate with multiple teams in our multi-functional environment on developing new features/improvements.
Requirements:
B.Sc. or M.Sc. in Computer Science or Electrical Engineering or equivalent experience.
5+ years of experience in each of the following areas: Software development in C/C++, Networking protocols, Linux environment.
Strong design, coding, analytical, debugging and problem-solving skills.
Ability to quickly adapt to new technology and go deep into new areas.
Independence and agility.
Good social and interpersonal skills.
Ways to stand out from the crowd:
Experience with low latency acceleration and performance improvement.
Experience with Linux user space/driver/kernel development.
Deep knowledge and understanding of TCP/IP stack.
Good view of system architecture and performance.
This position is open to all candidates.
 
8319787
9 hours ago
Location: Yokne'am
Job Type: Full Time
Our company's networking performance team is developing tools for benchmarking and profiling our company's cutting-edge networking technologies. Our products are used by engineers within our company and across the industry to optimize networking performance quickly and accurately. We are evolving our suite of expert-driven tools to create an AI-powered assistant that can autonomously benchmark and analyze complex networking performance issues in the world's most advanced data centers. This is an opportunity to work on an innovative strategic project, combining systems engineering with cutting-edge generative AI, directly impacting how our company's top customers optimize infrastructure.
What You'll Be Doing:
Build the core components of our AI assistant, translating a visionary architecture into a robust, production-ready system
Enhance and extend our core telemetry capture engine (Python and C++) to provide the high-fidelity data needed for analysis
Develop intelligent agents and workflows using modern AI frameworks to automate the 'benchmark-analyze-remediate' lifecycle
Design and implement sophisticated techniques for grounding LLMs with private data (e.g., RAG), ensuring the assistant's outputs are factually accurate and reliable
Partner closely with our domain experts in performance benchmarking and data analysis to create a cohesive, end-to-end platform.
Requirements:
B.Sc. in Computer Science or a related engineering field
At least 5 years of professional software development experience; an advanced degree with a strong portfolio of peer-reviewed research, conference presentations, significant open-source contributions, or other demonstrated expertise may be considered equivalent
Experience developing applications with Large Language Models (LLMs), demonstrated through professional work, significant open-source contributions, or advanced academic research projects
Familiarity with modern AI development frameworks (e.g., LangChain, LangGraph, LlamaIndex) and concepts (Agentic AI, MCP servers, etc.)
A collaborative mindset with excellent communication skills, and a passion for mentoring and learning from talented peers
Ways To Stand Out From The Crowd:
M.Sc. or Ph.D. in a relevant field, particularly with a focus on AI or distributed systems
Proven proficiency in both C++ and Python
Experience connecting AI agents to unstructured data sources (e.g., databases, Confluence APIs, knowledge graphs)
A background in data analysis and visualization (e.g., pandas, Jupyter).
This position is open to all candidates.
 
8321836
1 day ago
Location: Ra'anana
Job Type: Full Time
Our company has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It's a unique legacy of innovation that's fueled by great technology and amazing people. Today, we're tapping into the unlimited potential of AI to define the next era of computing, an era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what's never been done before takes vision, innovation, and the world's best talent. As an employee, you'll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world.
We are looking for a Senior Software Development Engineer to contribute to the cutting-edge Network Management System of the most powerful supercomputers in the world. Our team is growing, and we are looking for hardworking and self-motivated engineers to develop and verify advanced, high-scale SDN management solutions. You will be part of a dynamic team, working with amazing people.
What You'll Be Doing:
You will have a significant impact in developing the next-generation Unified Fabric Manager (UFM) product.
Help drive the underlying technology stack and implementation methodology, ensuring it competes at a world-class level.
Collaborate closely with other SW R&D teams and SW Architects to successfully implement ambitious projects.
Engage in performance tuning and automation to build a flawless operational environment.
Design and implement micro-services architecture to support our advanced, high-scale SDN management solutions.
Work in an agile environment, ensuring continuous improvement and innovation.
Requirements:
We are looking for candidates with the following proven qualifications and experience:
B.Sc. or equivalent experience in Computer Science or a related field.
10+ years of hands-on experience with system software design, development, and maintenance, particularly in C/C++ programming.
Debugging and performance analysis skills are strictly required.
Significant advantage if you have Python programming experience.
Proficiency with Docker, Kubernetes, and other orchestration tools.
Background with RESTful web services and experience with Continuous Integration and Continuous Delivery.
Excellent interpersonal and written communication skills to foster collaboration and inclusion.
Ways to stand out from the crowd:
Extensive knowledge and deep understanding of Linux system programming.
A track record of solving sophisticated problems with elegant solutions.
Demonstrated ability to deliver complex projects in previous roles.
Experience building infrastructures and tools to speed up development, testing, and release.
Experience in agile software development methodology.
This position is open to all candidates.
 
8319866
10 hours ago
Location: Ra'anana and Yokne'am
Job Type: Full Time
We are looking for an excellent Senior Software Developer to work on open-source cloud platforms such as Kubernetes. We are seeking an experienced engineer who is deeply technical, hands-on, and has a wide system view. You will design, build, and deploy high-performance and scalable clouds based on our company's superior ConnectX NICs and BlueField DPUs. We are looking to grow our teams with the smartest people in the world. If you're creative and autonomous, we want to hear from you!
What you'll be doing:
Design and implement new features to accelerate Network and Storage
Work closely with open source communities, participate in the relevant working groups
Work with different teams across our company
Mentor members of the team, enabling them to deliver high-quality software.
Requirements:
BSc in Computer Science or equivalent program experience
5+ years of hands-on experience in software development, preferably with C/Python/Golang
Highly motivated with strong communication skills, ability to work successfully with multi-functional teams, developers, and architects
Coordinate effectively across organizational boundaries and geographies
Strong self-initiative, independence, and flexibility in adopting new technology
Deep understanding of network protocols, virtualization, and containers
Strong background in designing, implementing, and debugging complex software
Wide hands-on experience with the Kubernetes or OpenStack ecosystems
Ways to stand out from the crowd:
Experience with working on open source projects
Background with SR-IOV, K8S, K8S controllers, CNI.
Wide hands-on experience with OVN and OVS.
This position is open to all candidates.
 
8321730
10 hours ago
Confidential company
Location: Tel Aviv-Yafo and Yokne'am
Job Type: Full Time
We are leading the way in groundbreaking developments in Artificial Intelligence, High Performance Computing and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables amazing creativity and discovery, and powers what were once science-fiction inventions, from artificial intelligence to autonomous cars.
Come work for the team that brought you NCCL, NVSHMEM & GPUDirect. Our GPU communication libraries are crucial for scaling Deep Learning and HPC applications! We are looking for a motivated Performance Engineer to influence the roadmap of our communication libraries. Today's DL and HPC applications have huge compute demands and run at scales of up to tens of thousands of GPUs. The GPUs are connected with high-speed interconnects (e.g. NVLink, PCIe) within a node and with high-speed networking (e.g. InfiniBand, Ethernet) across nodes. Communication performance between the GPUs has a direct impact on end-to-end application performance, and the stakes are even higher at huge scales! This is an outstanding opportunity for someone with an HPC and performance background to advance the state of the art in this space. Are you ready to contribute to the development of innovative technologies and help realize our company's vision?
What you will be doing:
Conduct in-depth performance characterization and analysis on large multi-GPU and multi-node clusters.
Study the interaction of our libraries with all HW (GPU, CPU, Networking) and SW components in the stack
Evaluate proof-of-concepts, conduct trade-off analysis when multiple solutions are available
Triage and root-cause performance issues reported by our customers
Collect a lot of performance data; build tools and infrastructure to visualize and analyze the information
Collaborate with a very dynamic team across multiple time zones.
Requirements:
M.S. (or equivalent experience) or Ph.D. in Computer Science or a related field, with relevant performance engineering and HPC experience
3+ years of experience with parallel programming and at least one communication runtime (MPI, NCCL, UCX, NVSHMEM)
Experience conducting performance benchmarking and triage on large scale HPC clusters
Good understanding of computer system architecture, HW-SW interactions and operating systems principles (aka systems software fundamentals)
Implement micro-benchmarks in C/C++, read and modify the code base when required
Ability to debug performance issues across the entire HW/SW stack. Proficient in a scripting language, preferably Python
Familiar with containers, cloud provisioning and scheduling tools (Kubernetes, SLURM, Ansible, Docker)
Adaptability and passion to learn new areas and tools. Flexibility to work and communicate effectively across different teams and timezones
Ways to stand out from the crowd:
Practical experience with InfiniBand/Ethernet networks in areas like RDMA, topologies, congestion control
Experience debugging network issues in large scale deployments
Familiarity with CUDA programming and/or GPUs
Experience with Deep Learning frameworks such as PyTorch, TensorFlow.
This position is open to all candidates.
 
8321604