Senior Data Engineer

8333451

Similar jobs that may interest you

28/08/2025

Senior Software Engineer, Benchmarking and Analytics

Confidential company

Location: Yokne'am

Job Type: Full Time

Our company's Networking Performance team develops advanced software tools that drive the optimization of the world's fastest networking technologies. Our mission spans benchmarking, telemetry capture, and performance analysis, enabling both our company's R&D teams and our company's customers across the industry to root-cause bottlenecks, maximize throughput, and achieve world-class performance for AI and HPC workloads.
We are looking for a Senior Software Engineer to lead the design and development of next-generation performance engineering frameworks. In this role, you will focus on Python-based benchmarking and analysis systems, while also contributing to high-performance components in C/C++. You'll work on challenges at the intersection of networking, distributed computing, and AI, building tools that run at scale on clusters, clouds, and data centers.
What you'll be doing:
Design and implement performance benchmarking frameworks for next-generation AI and HPC workloads
Take full technical ownership of our core telemetry engine
Work across Python (primary) and C/C++ (for performance-critical modules) to deliver reliable and scalable tools
Collaborate with experts in networking, AI, and systems to translate performance engineering needs into powerful software solutions
Enhance our DevOps, owning the CI/CD pipelines and release processes for your projects
Drive technical innovation in the performance engineering ecosystem, including taking part in building our next-gen agentic AI assistant.

Requirements:
B.Sc. in Computer Science, or a related engineering field
5+ years of professional software development experience
A proven track record of technical ownership, making key architectural decisions, driving a technical agenda, and problem solving
Expert-level Python development skills, building robust, well-structured, production-grade applications
C/C++ experience, especially for performance-critical or low-level components
Experience with modern CI/CD pipelines and DevOps practices
Ways to stand out from the crowd:
Linux systems knowledge, including software packaging (RPM, DEB), and an understanding of the complexities of software distribution and dependencies
Experience with the Python data analysis and visualization frameworks (e.g., h5py, pandas, NumPy, Matplotlib/Plotly)
Experience with Slurm, Kubernetes, MPI, or other distributed job orchestration and cluster management systems
Familiarity with agentic AI concepts or frameworks (e.g., RAG techniques, LangChain, LangGraph, LlamaIndex, etc.)
Experience contributing to open-source projects.

This position is open to all candidates.

8324017

27/08/2025

Senior HPC DevOps Engineer

Confidential company

Location: Yokne'am

Job Type: Full Time

We are looking for an experienced HPC DevOps Engineer to help us build the supercomputers and HPC clusters of the future. As a Senior HPC DevOps Engineer, you'll be a key player in groundbreaking advancements in artificial intelligence and GPU computing. Your expertise will drive the latest breakthroughs, providing insights on at-scale system design and tuning mechanisms for large-scale compute runs. You will work with the latest accelerated computing and deep learning software and hardware platforms, and with many scientific researchers, developers, and customers to craft improved workflows and develop new, leading differentiated solutions. You will interact with HPC, OS, GPU compute, and systems specialists to architect, develop, and bring up large-scale performance platforms.
What you'll be doing:
Innovate and Implement: Design, implement, and maintain large-scale HPC/AI clusters with state-of-the-art monitoring, logging, and alerting systems.
Infrastructure as Code (IaC): Utilize and develop tools to manage infrastructure as code, ensuring scalable and repeatable deployments.
Streamline CI/CD Pipelines: Develop and maintain continuous integration and continuous delivery (CI/CD) pipelines to automate and streamline deployment processes.
Automate Everything: Develop automation scripts and tools to automate deployment, configuration management, and operational monitoring.
Enhance Monitoring: Deploy advanced monitoring solutions for servers, networks, and storage to ensure seamless operations.
Troubleshoot Complex Issues: Perform comprehensive troubleshooting from bare metal to application level, ensuring system reliability and efficiency.
Lead and Educate: Serve as a technical resource, developing and sharing best practices with internal teams.
Drive Innovation: Support R&D activities and engage in proof of concepts (POCs) and proof of values (POVs) for future improvements.

Requirements:
B.Sc. in Computer Science, Engineering, or a related field with 5+ years of experience.
Deep knowledge of HPC and AI solution technologies, including CPUs, GPUs, high-speed interconnects, and supporting software.
Advanced proficiency in programming and scripting languages, with a solid understanding of object-oriented programming principles.
Familiarity with Jenkins, Ansible, Puppet/Chef.
Excellent knowledge of Windows and Linux (Red Hat/CentOS and Ubuntu), networking, and OS-level security.
Deep understanding of networking protocols such as InfiniBand and Ethernet.
Experience with job scheduling workloads and orchestration tools such as Slurm and Kubernetes.
Experience with multiple storage solutions like Lustre, GPFS, ZFS, and XFS.
Expertise with virtual systems (VMware, Hyper-V, KVM, Citrix).
Familiarity with cloud platforms (AWS, Azure, Google Cloud).
Ways to stand out from the crowd:
Architectural Insight: Knowledge of CPU and/or GPU architecture.
Container Expertise: Understanding of Kubernetes and container-related microservice technologies.
GPU Focus: Experience with GPU-focused hardware/software (DGX, CUDA).
RDMA Fabrics: Background with RDMA (InfiniBand or RoCE) fabrics.

This position is open to all candidates.

8321669

4 days ago

Senior Data Center Engineer

Confidential company

Location: Yokne'am

Job Type: Full Time

Are you a person who likes to work in a fast-paced organization? For two decades, our company has pioneered visual computing, the art and science of computer graphics. With our invention of the GPU, the engine of modern visual computing, the field has expanded to encompass personal computer games, movie production, product design, medical diagnosis, and scientific research. Today, visual computing is becoming increasingly central to how people interact with technology, and there has never been a more exciting time to join our excellent team. We are now passionate about innovation at the intersection of visual processing, high performance computing, and artificial intelligence.
Our data centers, and the thousands of servers installed in them, are the foundation upon which our creative products and services are delivered. Our customer base consists of hundreds of deeply intelligent engineers who continually provide us with complex problems to be solved. We are looking for a highly motivated Data Center Engineer to join our team. Depth and breadth of knowledge of working in data center facilities in a large-scale distributed environment is a strength you'll need. You should have deep knowledge and experience in at least one of the following core areas: Project Management, Hardware Operations, Network Operations, or Data Center Operations.
What you'll be doing:
Act as technical liaison for our technical teams.
Project-manage highly impactful large- and small-scale data center and lab projects.
Participate in the installation, monitoring, maintenance, support, and optimization of all production server hardware.
Contribute to the development of the global DC knowledge base
Lead teams to deploy new data center infrastructure, deliver server upgrades, integration, rebuild and other projects as required
Build plans to optimize power, cooling and network resources
Build forecasts to efficiently match supply to demand. Predict data center growth and scaling issues before they occur and implement solutions. You will acquire new facilities and optimize existing ones to support business growth.
Hunt for and resolve issues, and analyze data for trends and systemic issues. Implement process improvements and develop a cutting-edge operational environment.

Requirements:
4+ years of experience in large-scale data center hardware deployments, building data centers and directly leading data center relocation projects.
Technical certification and/or relevant work experience.
Ability to understand internal customers' problems and provide technical solutions in a timely manner.
Demonstrable ability to prioritize effectively, and experience with technical project management.
Understanding of data center infrastructure, including power, cooling, and structured cabling (copper, fiber).
Experience in developing and reviewing MOPs and SOPs to minimize our operating risks.
Strong interpersonal skills and ability to communicate effectively with a diverse group of customers and staff.
Ways to stand out from the crowd:
Deep understanding of data center power and cooling (liquid, air, etc.) infrastructure, network and cabling infrastructure, and root cause analysis skills.
Knowledge of programming languages (Python, SQL, Bash, etc.).
Experience with Nautobot/NetBox, monitoring tools, ServiceNow, and inventory management tools.
You're a self-starter with an attitude for growth and continuous learning, constantly looking to improve the team and build strong business relationships.
Attention to detail with superb interpersonal skills and the ability to effectively manage multiple priorities.

This position is open to all candidates.

8328347

27/08/2025

Senior HPC Performance Engineer

Confidential company

Location: Tel Aviv-Yafo and Yokne'am

Job Type: Full Time

We are leading the way in groundbreaking developments in Artificial Intelligence, High Performance Computing and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables amazing creativity and discovery, and powers what were once science fiction inventions, from artificial intelligence to autonomous cars.
Come work for the team that brought you NCCL, NVSHMEM & GPUDirect. Our GPU communication libraries are crucial for scaling Deep Learning and HPC applications! We are looking for a motivated Performance Engineer to influence the roadmap of our communication libraries. The DL and HPC applications of today have a huge compute demand and run at scales of up to tens of thousands of GPUs. The GPUs are connected with high-speed interconnects (e.g., NVLink, PCIe) within a node and with high-speed networking (e.g., InfiniBand, Ethernet) across the nodes. Communication performance between the GPUs has a direct impact on the end-to-end application performance, and the stakes are even higher at huge scales! This is an outstanding opportunity for someone with an HPC and performance background to advance the state of the art in this space. Are you ready to contribute to the development of innovative technologies and help realize our company's vision?
What you will be doing:
Conduct in-depth performance characterization and analysis on large multi-GPU and multi-node clusters.
Study the interaction of our libraries with all HW (GPU, CPU, Networking) and SW components in the stack
Evaluate proof-of-concepts, conduct trade-off analysis when multiple solutions are available
Triage and root-cause performance issues reported by our customers
Collect a lot of performance data; build tools and infrastructure to visualize and analyze the information
Collaborate with a very dynamic team across multiple time zones.

Requirements:
M.S. (or equivalent experience) or Ph.D. in Computer Science or a related field, with relevant performance engineering and HPC experience
3+ years of experience with parallel programming and at least one communication runtime (MPI, NCCL, UCX, NVSHMEM)
Experience conducting performance benchmarking and triage on large-scale HPC clusters
Good understanding of computer system architecture, HW-SW interactions, and operating systems principles (aka systems software fundamentals)
Ability to implement micro-benchmarks in C/C++, and to read and modify the code base when required
Ability to debug performance issues across the entire HW/SW stack. Proficient in a scripting language, preferably Python
Familiar with containers, cloud provisioning, and scheduling tools (Kubernetes, Slurm, Ansible, Docker)
Adaptability and passion to learn new areas and tools. Flexibility to work and communicate effectively across different teams and time zones
Ways to stand out from the crowd:
Practical experience with InfiniBand/Ethernet networks in areas like RDMA, topologies, and congestion control
Experience debugging network issues in large-scale deployments
Familiarity with CUDA programming and/or GPUs
Experience with Deep Learning frameworks such as PyTorch and TensorFlow.

This position is open to all candidates.

8321604

27/08/2025

Senior Software Engineer, AI-Driven Performance Engineering

Confidential company

Location: Yokne'am

Job Type: Full Time

Our company's networking performance team is developing tools for benchmarking and profiling our company's cutting-edge networking technologies. Our products are used by engineers within our company and across the industry to optimize networking performance quickly and accurately. We are evolving our suite of expert-driven tools to create an AI-powered assistant that can autonomously benchmark and analyze complex networking performance issues in the world's most advanced data centers. This is an opportunity to work on an innovative strategic project, combining systems engineering with cutting-edge generative AI, directly impacting how our company's top customers optimize infrastructure.
What You'll Be Doing:
Build the core components of our AI assistant, translating a visionary architecture into a robust, production-ready system
Enhance and extend our core telemetry capture engine (Python and C++) to provide the high-fidelity data needed for analysis
Develop intelligent agents and workflows using modern AI frameworks to automate the 'benchmark-analyze-remediate' lifecycle
Design and implement sophisticated techniques for grounding LLMs with private data (e.g., RAG), ensuring the assistant's outputs are factually accurate and reliable
Partner closely with our domain experts in performance benchmarking and data analysis to create a cohesive, end-to-end platform.

Requirements:
B.Sc. in Computer Science or a related engineering field
5+ years of professional software development experience; an advanced degree with a strong portfolio of peer-reviewed research, conference presentations, significant open-source contributions, or other demonstrated expertise may be considered an equivalent
Experience developing applications with Large Language Models (LLMs), demonstrated through professional work, significant open-source contributions, or advanced academic research projects
Familiarity with modern AI development frameworks (e.g., LangChain, LangGraph, LlamaIndex) and concepts (Agentic AI, MCP servers, etc.)
A collaborative mindset with excellent communication skills, and a passion for mentoring and learning from talented peers
Ways To Stand Out From The Crowd:
M.Sc. or Ph.D. in a relevant field, particularly with a focus on AI or distributed systems
Proven proficiency in both C++ and Python
Experience connecting AI agents to unstructured data sources (e.g., databases, Confluence APIs, knowledge graphs)
A background in data analysis and visualization (e.g., pandas, Jupyter).

This position is open to all candidates.

8321836

4 days ago

Senior Software and Automation Engineer ICPE

Confidential company

Location: Yokne'am

Job Type: Full Time

We are looking for a creative and independent Software Engineer to develop tools, infrastructure, and workflows for the IC test and product engineering group in our company's Networking Business Unit.
Our company's Networking Business Unit has continuously reinvented itself over two decades. Our high-speed buses and network products lead the market with innovative ways to improve speed and bandwidth from one generation to the next, and today we are known as the go-to place for end-to-end high-speed Ethernet and InfiniBand solutions.
We're looking to grow our company and build our teams with smart people who can join us at the cutting edge of technology. We need a creative individual who will help move network silicon IC products (Switch, NIC, SmartNIC) from design to mass production. You will work with test engineers, the test house, Design, IT, and many other professionals in the organization to develop tools and test infrastructure that speed time to market and enable next-generation test capabilities, characterization, and data analysis.
If you are passionate about enabling the highest-quality network products in the market, we want to hear from you!
What you'll be doing:
Design, develop, and maintain mission-critical engineering applications and automation tools.
Build systems that automate test program validation, execution, and release processes.
Architect infrastructure for scalable test and data workflows targeting next-generation network silicon.
Collaborate with cross-functional teams to enhance HW/SW automation flows and characterization pipelines.
Support integration and deployment in manufacturing environments and Contract Manufacturers (CM).
Enable new capabilities in the CM
Leverage DevOps best practices (CI/CD, version control, infrastructure automation) to accelerate internal development cycles.
Work with various teams at our company to improve and automate data analysis capabilities for all engineering and characterization test results.

Requirements:
BSc or higher in Computer Science or related field, with 7+ years of hands-on software development experience.
Proficiency in C# and Python; C/C++ experience is a strong plus.
Proven experience in GUI, application development, and tool integration; web/cloud background is advantageous.
High proficiency in Git.
Outstanding customer orientation.
Hands-on experience with CI/CD (Jenkins, GitLab pipelines), Git-based workflows, Linux environments, shell scripting, and virtualized infrastructure.
Passion for "it just works" automation and for eliminating repetitive tasks.
Excellent communication skills with diverse teams and functional groups.
Agile and self-learning, with high execution-quality standards.
Innovative approach to problem solving.
Ways to stand out from the crowd:
VBA or VB6 experience is a huge plus.
Semiconductor test knowledge or hands-on experience with ATE/DFT workflows.
Experience with HW/SW interfaces.

This position is open to all candidates.

8328340

27/08/2025

Senior Software Architect - Deep Learning and HPC Communications

Confidential company

Location: Tel Aviv-Yafo and Yokne'am

Job Type: Full Time

We are leading groundbreaking developments in Artificial Intelligence, High Performance Computing and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables groundbreaking creativity and discovery, and powers inventions that were once considered science fiction, from artificial intelligence to autonomous cars. Come work for the team that brought you NCCL, NVSHMEM & GPUDirect. Our GPU communication libraries are crucial for scaling Deep Learning and HPC applications! We're seeking a Senior Software Architect to help co-design next-gen data center platforms and scalable communications software.
DL and HPC applications have huge compute demands and already run at scales of up to tens of thousands of GPUs. GPUs are connected with high-speed interconnects (e.g., NVLink, PCIe) within a node and with high-speed networking (e.g., InfiniBand, Ethernet) across nodes. Efficient and fast communication between GPUs directly impacts end-to-end application performance. This impact continues to grow with the increasing scale of next-generation systems. This is an outstanding opportunity to advance the state of the art, break performance barriers, and deliver platforms the world has never seen before. Are you ready to build the new and innovative technologies that will help realize our company's vision?
What you will be doing:
Investigate opportunities to improve communication performance by identifying bottlenecks in today's systems.
Design and implement new communication technologies to accelerate AI and HPC workloads.
Explore innovative solutions in HW and SW for our next generation platforms as part of co-design efforts involving GPU, Networking, and SW architects.
Build proofs of concept, conduct experiments, and perform quantitative modeling to evaluate and drive new innovations.
Use simulation to explore the performance of large GPU clusters (think scales of hundreds of thousands of GPUs).

Requirements:
M.S./Ph.D. degree in CS/CE or equivalent experience.
5+ years of relevant experience.
Excellent C/C++ programming and debugging skills.
Experience with parallel programming models (MPI, SHMEM) and at least one communication runtime (MPI, NCCL, NVSHMEM, OpenSHMEM, UCX, UCC).
Deep understanding of operating systems, computer and system architecture.
Solid grasp of the fundamentals of network architecture, topology, algorithms, and communication scaling relevant to AI and HPC workloads.
Strong experience with Linux.
Ability and flexibility to work and communicate effectively in a multi-national, multi-time-zone corporate environment.
Ways to stand out from the crowd:
Expertise in related technology and passion for what you do. Experience with CUDA programming and our company's GPUs. Knowledge of high-performance networks like InfiniBand, RoCE, NVLink, etc.
Experience with Deep Learning frameworks such as PyTorch, TensorFlow, etc. Knowledge of deep learning parallelisms and their mapping to the communication subsystem. Experience with HPC applications.
Strong collaborative and interpersonal skills and a proven track record of effectively guiding and influencing within a dynamic and multi-functional environment.

This position is open to all candidates.

8321599

27/08/2025

Senior Software Developer

Confidential company

Location: Tel Aviv-Yafo and Yokne'am

Job Type: Full Time

We are spearheading the AI revolution and the creation of state-of-the-art accelerated compute platforms for global utilization. Our Network Modeling and Performance Insights group is seeking a skilled and driven Software Developer to design and develop the infrastructure for a complex networking simulation offered as a service. In this role, you will be responsible for developing and optimizing our network simulation software and enhancing its performance and quality. You will integrate this infrastructure with cloud computation services for various use cases and ensure the simulation is available as a service for internal and external customers. If you're passionate about tackling intricate challenges and contributing to comprehensive software solutions, we want to hear from you.
What you'll be doing:
Enhance simulation runtime and memory consumption through innovative optimization techniques.
Improve the quality of the simulation as a software product, ensuring robustness and reliability.
Expand the simulation's versatility to accommodate new, varied, and complex use cases and bleeding-edge requirements.
Design and expose the simulation as a service to facilitate easier access for different stakeholders.
Integrate a new simulation management system, making simulated experiments data accessible to all users.
Design and develop a CI/CD infrastructure for our complex networking simulation tool, ensuring efficient deployment and smooth integration processes.

Requirements:
BSc or above in Computer Science, Computer Engineering, or a related field, or equivalent experience.
5+ years of relevant practical experience in software development, including working on a large-scale software product, preferably with strict performance considerations.
Proficiency in C++ and optimization techniques for improving code performance
In-depth knowledge of computer science fundamentals, and computer architecture.
Strong communication skills.
Experience with simulation environments (specifically, network related) - a significant advantage
Prior experience with multi-core computation and parallel code acceleration
Familiarity with cloud computing and parallelization of computational workloads - an advantage.
Experience in developing CI/CD pipelines and integrating services - an advantage.

This position is open to all candidates.

8321816

27/08/2025

Senior System Software Engineer, NCCL - Partner Enablement

Confidential company

Location: Tel Aviv-Yafo and Yokne'am

Job Type: Full Time

We are leading the way in groundbreaking developments in Artificial Intelligence, High Performance Computing and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables amazing creativity and discovery, and powers what were once science fiction inventions, from artificial intelligence to autonomous cars.
Come work for the team that brought you NCCL, NVSHMEM & GPUDirect. Our GPU communication libraries are crucial for scaling Deep Learning and HPC applications! We are looking for a motivated Partner Enablement Engineer to guide our key partners and customers with NCCL. Most DL/HPC applications run on large clusters with high-speed networking (InfiniBand, RoCE, Ethernet). This is an outstanding opportunity to get an end-to-end understanding of the AI networking stack. Are you ready to contribute to the development of innovative technologies and help realize our company's vision?
What you will be doing:
Engage with our partners and customers to root-cause functional and performance issues reported with NCCL
Conduct performance characterization and analysis of NCCL and DL applications on groundbreaking GPU clusters
Develop tools and automation to isolate issues on new systems and platforms, including cloud platforms (Azure, AWS, GCP, etc.)
Guide our customers and support teams on HPC knowledge and standard methodologies for running applications on multi-node clusters
Document and conduct trainings/webinars for NCCL
Engage with internal teams in different time zones on networking, GPUs, storage, infrastructure and support.

Requirements:
B.S./M.S. degree in CS/CE or equivalent experience with 5+ years of relevant experience. Experience with parallel programming and at least one communication runtime (MPI, NCCL, UCX, NVSHMEM)
Excellent C/C++ programming skills, including debugging, profiling, code optimization, performance analysis, and test design
Experience working with engineering or academic research community supporting HPC or AI
Practical experience with high-performance networking: InfiniBand/RoCE/Ethernet networks, RDMA, topologies, congestion control
Expert in Linux fundamentals and a scripting language, preferably Python
Familiar with containers, cloud provisioning, and scheduling tools (Docker, Docker Swarm, Kubernetes, Slurm, Ansible)
Adaptability and passion to learn new areas and tools
Flexibility to work and communicate effectively across different teams and timezones
Ways to stand out from the crowd:
Experience conducting performance benchmarking and developing infrastructure on HPC clusters. Prior system administration experience, especially for large clusters. Experience debugging network configuration issues in large-scale deployments
Familiarity with CUDA programming and/or GPUs. Good understanding of Machine Learning concepts and experience with Deep Learning frameworks such as PyTorch and TensorFlow
Deep understanding of technology and passion for what you do.

This position is open to all candidates.

8321595

28/08/2025

Senior System DevOps Engineer

Confidential company

Location: Yokne'am

Job Type: Full Time

We are seeking a highly skilled DevOps Engineer to join our Networking IC Product Engineering Group (ICPE). This is a unique opportunity to become a cornerstone of our DevOps practice, owning the critical systems that power our engineering innovation. You'll be responsible for the entire DevOps lifecycle, from robust CI/CD pipelines to production-line package releases, driving efficiency, scalability, and reliability across the organization. You will work with a high degree of autonomy and will be expected to independently lead initiatives, design and implement optimal solutions, and collaborate with both internal stakeholders and external partners. If you're a self-motivated engineer who thrives in dynamic environments, takes initiative without waiting for direction, and enjoys improving and scaling engineering ecosystems, we want you with us.
What You'll Be Doing:
Develop and maintain robust, scalable CI/CD pipelines to ensure seamless software integration and delivery.
Collaborate with cross-functional teams to enhance build system reliability and efficiency.
Monitor, troubleshoot, and optimize system performance to ensure continuous, reliable operation.
Diagnose and resolve complex issues affecting the stability and performance of development and production environments.

Requirements:
Bachelor's degree in computer science, computer engineering, or equivalent experience.
5+ years of hands-on experience in CI/CD pipeline development and automation (e.g., Jenkins, GitLab CI/CD).
5+ years of experience in Python development.
5+ years of working with Linux distributions (e.g., Red Hat, Ubuntu).
Proficiency in scripting languages (e.g., Bash, Ruby, Groovy) in a Unix/Linux environment.
Strong background in configuration and deployment management.
Expertise in version control systems (e.g., GitLab, Gerrit).
Exceptional problem-solving skills, with a focus on identifying root causes and implementing long-term fixes.
Excellent communication and interpersonal skills; strong team spirit and cross-team collaboration mindset.
Proven ability to work independently, prioritize tasks, and drive initiatives without constant supervision.
Ways To Stand Out From The Crowd:
Experience with PyTest or other testing frameworks.
Previous leadership experience or a track record of mentoring/team-leading.
Familiarity with databases (e.g., MongoDB or similar).
Hands-on experience with containerization and orchestration technologies (e.g., Docker, Kubernetes).

This position is open to all candidates.

8322916
