Senior Software Developer

5 hours ago
Confidential company
Location: Yokne'am and Tel Aviv-Yafo
Job Type: Full Time
We are spearheading the AI revolution and the creation of state-of-the-art accelerated compute platforms for global utilization. Our Network Modeling and Performance Insights group is seeking a skilled and driven Software Developer to design and develop our infrastructure for a complex networking simulation as a service. In this role, you will be responsible for developing and optimizing our network simulation software and for enhancing its performance and quality. You will work on integrating this infrastructure with cloud computation services for various use cases and ensure the simulation is available as a service for internal and external customers. If you're passionate about tackling intricate challenges and contributing to comprehensive software solutions, we want to hear from you.

What you'll be doing:

Enhance simulation runtime and memory consumption through innovative optimization techniques.

Improve the quality of the simulation as a software product, ensuring robustness and reliability.

Expand the simulation's versatility to accommodate new, varied, and complex user use cases and bleeding-edge requirements.

Design and expose the simulation as a service to facilitate easier access for different stakeholders.

Integrate a new simulation management system, making simulated experiments data accessible to all users.

Design and develop a CI/CD infrastructure for our complex networking simulation tool, ensuring efficient deployment and smooth integration processes.
Requirements:
BSc or above in Computer Science, Computer Engineering, or a related field, or equivalent experience.

5+ years of relevant practical experience in software development, including working on a large-scale software product, preferably with strict performance considerations.

Proficiency in C++ and optimization techniques for improving code performance

In-depth knowledge of computer science fundamentals, and computer architecture.

Strong communication skills.

Experience with simulation environments (specifically, network related) - a significant advantage

Prior experience with multi-core computation and parallel code acceleration

Familiarity with cloud computing and parallelization of computational workloads - an advantage.

Experience in developing CI/CD pipelines and integrating services - an advantage.
This position is open to all candidates.
 
8499597
Similar jobs that may interest you
4 days ago
Location: Tel Aviv-Yafo
Job Type: Full Time
We are searching for a strong technical leader to own the backbone of our Networking Research capabilities. We are looking for an Engineering Manager to lead the development of our high-fidelity Network Simulation platform and the extensive on-premise infrastructure that powers it.

In this role, you will lead a team of performance simulation software engineers and DevOps/Infrastructure specialists. You will own the "Simulation-as-a-Service" product, a critical platform used by internal researchers to model next-generation data center architectures. Your mission is to ensure our simulations are accurate, performant, and accessible, while managing the large-scale compute clusters required to run them.

What you'll be doing:

Team Leadership: Manage and mentor a team of C++ software engineers and DevOps infrastructure engineers, fostering a culture of performance, reliability, and code quality.

Product Ownership (Sim-as-a-Service): Treat the internal simulation platform as a product. Work with research partners to define the roadmap, prioritize features, and ensure high availability for users.

High-Performance Simulation: Be responsible for the architecture and optimization of complex network simulation engines (C++ based), ensuring they can scale to model extensive data center topologies with high fidelity.

Infrastructure Management: Own the lifecycle of our on-premise compute clusters and servers. Drive decisions on hardware upgrades, prioritisation, and managing system resources.

DevOps & Automation: Lead the strategy for CI/CD pipelines, automated testing, and containerized deployments to ensure rapid iteration and stability of the simulation platform.

Multi-functional Collaboration: Partner with the AI Agents team to expose simulation APIs, enabling agents to run experiments and gather data autonomously.
Requirements:
What we need to see:

MSc, Ph.D. or equivalent experience in Computer Science, Electrical Engineering, or a related field.

8+ years of hands-on software engineering experience, with a proven track record of leading technical teams in systems or infrastructure domains for 3+ years.

3+ years of managerial experience.

C++ Expertise: Strong background in C++ development for high-performance applications (System-level programming, concurrent programming).

Infrastructure & DevOps: Practical experience managing on-premise servers, Linux environments, and modern DevOps tools (Kubernetes, Slurm, Docker, Ansible).

Operational Rigor: Ability to manage "heavy" operations: ensuring uptime, monitoring system health, and optimizing hardware utilization.

Ways to stand out from the crowd:

Networking Knowledge: Deep understanding of computer networking fundamentals (TCP/IP, Ethernet, InfiniBand, Congestion Control) and data center architectures.

Simulation/Modeling: Experience with discrete event simulation (DES) or modeling complex systems.

HPC Background: Experience working with MPI, CUDA, or other High-Performance Computing frameworks.

Specific Simulators: Familiarity with standard network simulators like OMNeT++, NS-3, or similar proprietary tools.

Hardware Knowledge: Understanding of switch micro-architecture or NIC design is a significant plus.
This position is open to all candidates.
 
8494134
21/12/2025
Location: Tel Aviv-Yafo
Job Type: Full Time
We are seeking a highly skilled Senior Networking AI Platform Engineer to join our Applied Networking AI group. In this role, you will help design and develop cutting-edge AI solutions, integrating them seamlessly into a variety of products. You'll collaborate closely with multi-functional teams of data scientists, software engineers, and DevOps professionals to ensure the efficient deployment, monitoring, and optimization of machine learning (ML) models.
As a key contributor, you will drive the entire software development lifecycle, from conceptualization and architecture to implementation and production, while working closely with engineering teams to solve complex problems and help build a successful company practice.
What you'll be doing:
Lead the design, development, and deployment of robust software systems across different platforms and environments
Architect, design, and implement scalable and high-performance software solutions, handling complex requirements and integrating various subsystems
Ensure systems are maintainable, flexible, and well-documented, with an emphasis on performance and security
Adapt to new tools, technologies, and frameworks, and be capable of taking ownership of the development process from conception to deployment
Supply innovative ideas and solutions, driving continuous improvement in both code quality and system efficiency
Develop and maintain scalable infrastructure for handling and deploying security and networking ML models in production, ensuring high availability, scalability, and performance.
Design and implement data pipelines to efficiently process and transform large volumes of data for training and inference purposes.
Optimize and fine-tune ML models for performance, scalability, and resource utilization, considering factors such as latency, efficiency, and cost.
Collaborate with data scientists and software engineers to operationalize and deploy ML models, including model versioning, packaging, and integration with existing systems.
Requirements:
Bachelor's or Master's degree in Computer Science, Data Science, or a closely related discipline.
Over 5 years of experience in software development and/or MLOps.
Strong proficiency in programming languages such as Python, Java, C++.
Deep understanding of cloud services architecture and the ability to create real-world applications that include telemetry, authentication, authorization, and security standard methodologies.
Proven track record of leading complex software projects from concept to delivery.
A "can do" attitude with exceptional problem-solving skills and the ability to thrive in fast-paced environments.
Strong problem-solving skills and the ability to resolve sophisticated issues in a timely manner.
Excellent communication and collaboration skills, with the ability to work effectively in multi-functional teams.
Attention to detail and a focus on quality, ensuring robustness and reliability in production ML systems.
Experience with Kubernetes architecture and management is a plus.
Ways to stand out from the crowd:
Exude high energy and a positive attitude.
Stellar verbal and written communication skills.
Passionate about data science and implementation.
Have data science and GPU performance experience.
Want to make what was impossible possible!
This position is open to all candidates.
 
8465950
3 hours ago
Location: Tel Aviv-Yafo and Yokne'am
Job Type: Full Time
We are looking for a passionate Software Engineer to join our Simulation development team. Our team is growing, and we are looking for hardworking and experienced engineers to take part in building advanced networking simulation solutions. You will be part of a team developing next-generation Switch simulation, working closely with other SW R&D teams and SW Architects.

What you will be doing:

Design and develop advanced features simulating our world class Switches.

Develop solutions using advanced virtualization technology.

Write clean, efficient and maintainable code.

Collaborate with team members, SW R&D, Architects, Chip Design and FW.
Requirements:
What we need to see:

B.Sc. degree or equivalent experience in Computer Science / Software Engineering.

4+ years of experience.

Proficient knowledge and experience in C/C++.

Strong design, coding, analytical, debugging and problem-solving skills.

Full ownership & end-to-end responsibility.

Excellent social and written communication skills.

Ways To Stand Out From The Crowd:

Can-do attitude, independence and agility.

Ability to quickly adapt to new technology and go deep into new areas.

Understanding of Networking Protocols - Ethernet, InfiniBand is an advantage.

Knowledge of Virtualization, especially with KVM/QEMU is an advantage.

Knowledge of Linux/Windows kernel and drivers development is an advantage.
This position is open to all candidates.
 
8499894
2 days ago
Confidential company
Location: Yokne'am
Job Type: Full Time
We are looking for an AI Test Architect to join the E2E Verification group and profile innovative large-scale distributed training on our AI end-to-end solutions in large-scale supercomputing clusters. You will provide insights on at-scale system design and tuning mechanisms for large-scale compute runs. You will work with the latest Accelerated Computing and Deep Learning software and hardware platforms, and with researchers, developers, and customers, to craft improved workflows and develop new, leading differentiated solutions. You will interact with HPC, OS, Switch, HCA, CPU and GPU compute, and systems specialists to architect, develop and bring up large-scale performance platforms.

What you'll be doing:

Profiling, benchmarking, and analyzing deep learning models to identify areas for optimization and improvement in terms of performance, efficiency, and accuracy, with a strong emphasis on networking aspects.

Collaborating closely with data scientists, researchers, development, and automation teams to design and implement scalable training pipelines and frameworks that demonstrate large-scale high-performance networking capabilities.

Staying up-to-date with the latest advancements in deep learning algorithms, architectures, our GPU technologies, and high-performance networking solutions.

Optimizing deep learning models for performance, memory usage, and power efficiency while maximizing high-performance networking features on our supercomputers.

Providing insights and recommendations based on the analysis of large-scale training results, specifically focusing on networking bottlenecks and optimizations, to improve model outcomes and achieve business objectives.

Collaborating with hardware engineers to guide the development and integration of efficient networking solutions for deep learning, including exploring network architecture optimizations and bringing to bear technologies such as RDMA or InfiniBand.
Requirements:
What we need to see:

B.Sc. in Computer Science, Software Engineering, or equivalent experience.

Strong understanding and practical experience with machine learning algorithms and techniques, with a specialization in deep learning and expertise in high-performance networking.

8+ years of overall experience, with CUDA programming for deep learning frameworks like TensorFlow, PyTorch, combined with expertise in networking libraries and protocols.

Ability to profile and optimize deep learning workflows, focusing on networking-related bottlenecks and optimizations, to improve overall performance and efficiency.

Exceptional analytical and problem-solving skills, with a keen attention to detail, particularly in identifying and resolving networking performance issues.

Excellent communication and collaboration skills, enabling effective teamwork and cooperation.

Familiarity with supercomputers, parallel computing, distributed systems, and high-performance networking technologies like RDMA or InfiniBand.

Ways to stand out from the crowd:

Demonstrated experience in successfully profiling and optimizing large-scale deep learning training on our supercomputers, with a significant focus on high-performance networking enhancements.

Experience with distributed deep learning, distributed training frameworks, or large-scale data pipelines enhanced by high-performance networking solutions.

Expertise in optimizing networking parameters, such as bandwidth, latency, or congestion control, for deep learning workloads.

Familiarity with our networking technologies, such as Mellanox InfiniBand, and their integration with deep learning workflows.

Strong understanding of high-performance networking protocols and standards and their application to deep learning.
This position is open to all candidates.
 
8496288
3 hours ago
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for a 100% hands-on Storage Services Software engineer to join the block storage group. You will be a member of a team that builds the next generation block storage capabilities. You will work closely with a variety of teams and architects including the networking team, and external customers. You will take part in defining the software architecture and implementation of the most advanced storage services! Services that will need to meet extreme performance and scalability demands! We have crafted a team of extraordinary people stretching around the globe, whose mission is to push the frontiers of what is possible today and define the platform of tomorrow.

We work, think and learn as a team. We thrive in a deeply strong environment, and we're passionate about a culture that demands innovation and the highest standards. The rewards are sweet and include collaborating with some of the smartest people in the industry, an aggressive compensation plan that rewards top performers, and the opportunity to work on products that transform the way people work and play.

What you'll be doing:

100% hands-on coding role in C, in both kernel and userspace.

Research, design, implement and test, new and existing, networking features for distributed storage services and features of our block storage solution, in both Host and DPU environments.

Acquire understanding of the algorithms, the technicalities and the interaction with other components across our block storage ecosystem.

Analyze and solve challenging bugs and customer cases in large scale production systems, identifying issues in our or inbox kernel modules and often in other components. Drive new solutions based on any issues that arise.
Requirements:
What we need to see:

B.Sc. or M.Sc. in Computer Science, Electrical Engineering or related discipline (or equivalent experience).

15+ years of experience as a senior developer, preferably in the domains of storage, networking, and/or operating-systems.

Strong proficiency in C/C++ programming.

Knowledge of networking fundamentals and experience in Linux-based networking environments.

Familiarity with RDMA technologies, including InfiniBand, RoCE, or iWARP, and experience with RDMA programming models, control and data paths. Comprehension of large and complex systems.

Proven professional experience in designing and developing distributed systems; advantage for experience in block storage and/or networking systems.

Ability to work autonomously, with a proactive mindset and perseverance to solve day to day challenges.

Ability to quickly adapt to new technology and go deep into new areas

Excellent communication skills and a collaborative mindset.

Innovative approach, identifying opportunities to improve, accelerate, and reuse existing solutions.

Knowledge of cloud computing concepts, including virtualization, scalability, and data management.

Ways to Stand Out From the Crowd:

Linux Kernel coding experience.

Linux Kernel internals knowledge including memory management, scheduling, etc.
This position is open to all candidates.
 
8499984
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for a motivated and experienced Senior Software Engineer to join our Cloud and K8s Group. The successful candidate will possess a strong technical background in low-level systems programming and will excel in developing performant, efficient, and reliable software across multiple operating systems. Expertise in C++ and deep knowledge of Linux, macOS, and Windows internals are essential for this role, as you will be instrumental in building and optimizing our agent.

Key Responsibilities:

Design, implement, and optimize low-level system software components and libraries with a focus on performance and efficiency.
Analyze and debug complex issues related to operating system internals (kernel, drivers, memory management) across Linux, macOS, and Windows platforms.
Develop networking capabilities and optimize networking stack interactions within software modules.
Write clean, maintainable, and well-tested C++ code, while mentoring and reviewing peers' contributions.
Collaborate closely with infrastructure, security, and product teams to design scalable and secure systems.
Contribute to CI/CD pipelines and automation workflows to streamline build, test, and deployment processes.
Develop and maintain scripting tools (e.g., Python, Bash, PowerShell) to support development and operational tasks.
Stay up to date with emerging technologies in systems programming, cybersecurity, and networking to continuously improve our solutions.
Requirements:
Bachelor's or Master's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
Minimum of 5 years of experience in software development with a strong focus on C++ and low-level programming.
Deep understanding of Linux, macOS, and Windows internals including kernel architecture, system calls, process and memory management.
Strong knowledge of networking protocols and experience writing performant and efficient code.
Experience with Golang is an advantage.
Background or interest in cybersecurity is a plus.
Familiarity with .NET development is beneficial.
Experience with CI/CD tools and pipelines (e.g., Jenkins, GitHub Actions) is preferable.
Proficient in scripting languages such as Python, Bash, or PowerShell.
Strong problem-solving skills and ability to work independently and in a team environment.
Excellent communication and collaboration skills.
This position is open to all candidates.
 
8496587
Location: Tel Aviv-Yafo
Job Type: Full Time
Our company's software engineers develop the next-generation technologies that change how billions of users connect, explore, and interact with information and one another. Our products need to handle information at massive scale, and extend well beyond web search. We're looking for engineers who bring fresh ideas from all areas, including information retrieval, distributed computing, large-scale system design, networking and data storage, security, artificial intelligence, natural language processing, UI design and mobile; the list goes on and is growing every day. As a software engineer, you will work on a specific project critical to our company's needs with opportunities to switch teams and projects as you and our fast-paced business grow and evolve. We need our engineers to be versatile, display leadership qualities and be enthusiastic to take on new problems across the full stack as we continue to push technology forward.
In this role, you will work with system teams and the CPU Architecture team to develop an understanding of the Central Processing Unit (CPU), System on a Chip (SoC), performance metrics, benchmarks/measuring tools, and available optimization knobs. You will define methods and technologies to model CPU performance at different accuracy levels by supporting architectural explorations and decision making. You will correlate performance projections with measured post-silicon data.
The AI and Infrastructure team is redefining what's possible. We empower our company's customers with breakthrough capabilities and insights by delivering AI and Infrastructure at unparalleled scale, efficiency, reliability and velocity. Our customers include our company's Cloud customers and billions of our company's users worldwide.
We're the driving force behind our company's groundbreaking innovations, empowering the development of our cutting-edge AI models, delivering unparalleled computing power to global services, and providing the essential platforms that enable developers to build the future. From software to hardware, our teams are shaping the future of world-leading hyperscale computing, with key teams working on the development of our TPUs, Vertex AI for our company's Cloud, our company's Global Networking, Data Center operations, systems research, and much more.
Responsibilities
Write product or system development code.
Design, develop, test, deploy, maintain, and improve Central Processing Unit (CPU) software modeling and other software tools.
Manage project priorities, deadlines, and deliverables.
Collaborate with hardware and software CPU architecture teams, SoC performance modeling team, and other company Software teams.
Requirements:
Minimum qualifications:
Bachelor's degree in Electrical Engineering, Computer Engineering, Computer Science, or a related field, or equivalent practical experience.
2 years of experience with software development in C++ programming language or 1 year of experience with an advanced degree.
Preferred qualifications:
Master's degree or PhD in Engineering, Computer Science, or a related technical field.
2 years of experience with data structures and algorithms.
Experience in modern CPU/Machine Learning (ML) architecture and micro-architecture.
Ability to learn coding languages.
Excellent object-oriented database design and SQL skills.
This position is open to all candidates.
 
8473147
19/12/2025
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for a motivated and experienced Senior Software Engineer to join our Cloud and K8s Group. The successful candidate will possess a strong technical background in low-level systems programming and will excel in developing performant, efficient, and reliable software across multiple operating systems. Expertise in C++ and deep knowledge of Linux, macOS, and Windows internals are essential for this role, as you will be instrumental in building and optimizing our agent.
Key Responsibilities:
* Design, implement, and optimize low-level system software components and libraries with a focus on performance and efficiency.
* Analyze and debug complex issues related to operating system internals (kernel, drivers, memory management) across Linux, macOS, and Windows platforms.
* Develop networking capabilities and optimize networking stack interactions within software modules.
* Write clean, maintainable, and well-tested C++ code, while mentoring and reviewing peers' contributions.
* Collaborate closely with infrastructure, security, and product teams to design scalable and secure systems.
* Contribute to CI/CD pipelines and automation workflows to streamline build, test, and deployment processes.
* Develop and maintain scripting tools (e.g., Python, Bash, PowerShell) to support development and operational tasks.
* Stay up to date with emerging technologies in systems programming, cybersecurity, and networking to continuously improve our solutions.
Requirements:
* Bachelor's or Master's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
* Minimum of 5 years of experience in software development with a strong focus on C++ and low-level programming.
* Deep understanding of Linux, macOS, and Windows internals including kernel architecture, system calls, process and memory management.
* Strong knowledge of networking protocols and experience writing performant and efficient code.
* Experience with Golang is an advantage.
* Background or interest in cybersecurity is a plus.
* Familiarity with .NET development is beneficial.
* Experience with CI/CD tools and pipelines (e.g., Jenkins, GitHub Actions) is preferable.
* Proficient in scripting languages such as Python, Bash, or PowerShell.
* Strong problem-solving skills and ability to work independently and in a team environment.
* Excellent communication and collaboration skills.
This position is open to all candidates.
 
8398145
21/12/2025
Confidential company
Location: Yokne'am
Job Type: Full Time
We are looking for an experienced HPC DevOps and Network Engineer to help us build the supercomputers and HPC clusters of the future. As a Senior HPC DevOps Engineer, you'll be a key player in groundbreaking advancements in artificial intelligence and GPU computing. Your expertise will drive the latest breakthroughs, providing insights on at-scale system design and tuning mechanisms for large-scale compute runs. You will work with the latest Accelerated Computing and Deep Learning software and hardware platforms, and with many scientific researchers, developers, and customers to craft improved workflows and develop new, leading differentiated solutions. You will interact with HPC, OS, GPU compute, and systems specialists to architect, develop and bring up large-scale performance platforms.
What you'll be doing:
Innovate and Implement: Design, implement, and maintain large-scale HPC/AI clusters with state-of-the-art monitoring, logging, and alerting systems.
Infrastructure as Code (IaC): Utilize and develop tools to manage infrastructure as code, ensuring scalable and repeatable deployments.
Streamline CI/CD Pipelines: Develop and maintain continuous integration and continuous delivery (CI/CD) pipelines to automate and streamline deployment processes.
Automate Everything: Develop automation scripts and tools to automate deployment, configuration management, and operational monitoring.
Develop complex Networking automations.
Troubleshoot Complex Issues: Perform comprehensive troubleshooting from bare metal to application level, ensuring system reliability and efficiency.
Lead and Educate: Serve as a technical resource, developing and sharing best practices with internal teams.
Drive Innovation: Support R&D activities and engage in proof of concepts (POCs) and proof of values (POVs) for future improvements.
Requirements:
B.Sc. in Computer Science, Engineering, or a related field with 5+ years of experience.
Deep knowledge of HPC and AI solution technologies, including CPUs, GPUs, high-speed interconnects, and supporting software.
Advanced proficiency in programming and scripting languages, with a solid understanding of object-oriented programming principles.
Familiarity with Jenkins, Ansible, Puppet/Chef.
Excellent knowledge of Windows and Linux (Red Hat/CentOS and Ubuntu), networking and OS-level security.
Deep understanding of networking protocols such as InfiniBand and Ethernet.
Experience with job scheduling workloads and orchestration tools such as Slurm and Kubernetes.
Experience with multiple storage solutions like Lustre, GPFS, ZFS, and XFS.
Expertise with virtual systems (VMware, Hyper-V, KVM, Citrix).
Familiarity with cloud platforms (AWS, Azure, Google Cloud).
Ways to stand out from the crowd:
Proven networking experience or strong knowledge through professional networking training.
Architectural Insight: Knowledge of CPU and/or GPU architecture.
Container Expertise: Understanding of Kubernetes and container-related microservice technologies.
GPU Focus: Experience with GPU-focused hardware/software (DGX, CUDA).
RDMA Fabrics: Background with RDMA (InfiniBand or RoCE) fabrics.
This position is open to all candidates.
 
8465332
21/12/2025
Location: Tel Aviv-Yafo and Yokne'am
Job Type: Full Time
We seek a highly motivated Network Performance Exploration Engineer to join our team of experts and help shape the foundational infrastructure for the AI revolution. Our next-generation networking systems are at the forefront of connecting and powering the world's most advanced AI clusters. As a key member of our architecture team, you will be responsible for exploring and identifying critical network optimization opportunities across our entire hardware and software stack, analyzing how system-level changes impact application-level performance.
What You'll Be Doing:
Explore and validate end-to-end application performance, defining comprehensive test plans and critical metrics to identify optimization opportunities in both hardware and software.
Establish and maintain a comprehensive database of benchmark results, tracking performance across releases to drive data-informed decisions.
Conduct deep-dive analysis into communication libraries (like NCCL), system software, and hardware configurations to investigate performance characteristics, validate architectural theories, and identify bottlenecks.
Provide critical performance data to correlate and enhance simulation tools, ensuring our models accurately predict real-world hardware behavior.
Analyze application-level traffic patterns (e.g., LLMs) on our advanced networking fabrics to identify hardware and software optimization opportunities and tune system parameters.
Lead Proof-of-Concept (POC) projects to prototype and evaluate potential hardware and software optimizations and their impact on application performance.
Requirements:
B.Sc. or M.Sc. degree in Computer Science, Computer Engineering, or Electrical Engineering, or equivalent experience.
5+ years of relevant industry or research experience in high-performance computing, computer architecture, or computer networks.
Hands-on programming skills in Python and/or C/C++ for system analysis, automation, and customizing benchmarks.
Excellent understanding of large-scale system behavior and the effect of distributed computing workloads on network and system performance.
Proven experience in performance analysis, benchmarking, and identifying system bottlenecks.
Exceptional analytical, problem-solving, and systems-thinking skills, with the ability to dive deep into complex software and hardware interactions.
Ability to thrive in a fast-paced, dynamic environment and work concurrently with multiple cross-functional teams.
Ways To Stand Out From The Crowd:
Deep understanding of and hands-on experience with communication libraries such as NCCL, UCX, or MPI.
Direct experience debugging or modifying the source code of a major communication library.
Expertise in the architecture and system-level requirements of large-scale, distributed Deep Learning workloads (e.g., LLMs).
Expertise in high-performance network protocols (Ethernet, InfiniBand, RoCE) and interconnect technologies like NVLink.
Familiarity with the PyTorch ecosystem, especially for distributed workloads.
This position is open to all candidates.
 
8465097