Software Engineer - AI Datacenter Orchestration

Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for a talented and motivated Software Engineer to join our newly formed team developing orchestration tools and platforms for AI datacenters.
The main goal of this team is to create customer-focused orchestration solutions that simplify the deployment, management, and optimization of large-scale AI workloads across a full datacenter stack - including switches, hosts, smart NICs, GPUs, ROCm, and RCCL.
You will work on the design and development of orchestration systems that bridge compute, networking, and AI acceleration domains, primarily using Python and modern full-stack technologies.
Key Responsibilities
* Design and develop software components for orchestration platforms managing AI datacenter infrastructure.
* Implement control and coordination mechanisms for compute, network, and AI acceleration resources.
* Develop backend services, APIs, and UI components using Python and modern full-stack frameworks.
* Collaborate with cross-functional teams - including networking, GPU, and system software - to integrate orchestration capabilities across multiple layers.
* Participate in architecture discussions, code reviews, and continuous integration processes.
* Contribute to testing, validation, and performance improvements of orchestration systems.
* Engage with product and customer teams to translate operational needs into effective software solutions.
Preferred Qualifications
* Exposure to **AI workloads** and GPU ecosystems (ROCm, RCCL, PyTorch, TensorFlow).
* Experience with **distributed systems, control-plane software, or cluster management frameworks**.
* Familiarity with **REST/gRPC APIs**, **microservices**, and **cloud-native architectures**.
* Background in **monitoring, telemetry, or resource scheduling systems**.
* Practical experience in **full-stack development** (React, Angular, Node.js, or equivalent).
* Experience with **test automation frameworks** (pytest, Robot Framework, etc.).
Requirements:
3+ years of experience in software development, preferably in infrastructure, orchestration, or systems software.
Strong proficiency in Python, including experience with backend or orchestration frameworks.
Familiarity with datacenter or cloud infrastructure, including networking, compute, or storage systems.
Experience with containers and orchestration platforms (Docker, Kubernetes).
Solid understanding of software engineering principles, including design patterns, testing, and CI/CD.
Strong collaboration and communication skills, with the ability to work in a multidisciplinary environment.
This position is open to all candidates.
 
Similar jobs that may interest you
Location: Tel Aviv-Yafo
Job Type: Full Time
We are seeking a Senior Network Engineer to join our AI datacenter development team. This role involves testing, validating, and scaling advanced network infrastructure across multi-vendor environments. You will be responsible for replicating real-world AI datacenter topologies, validating performance under stress, and ensuring network solutions deliver uncompromising reliability, scalability, and performance.
Key Responsibilities
Define and execute regression, functional, performance, and scale test suites for network infrastructure
Investigate complex performance and scaling bottlenecks in AI datacenters
Design and build customer-scale testbeds emulating diverse network architectures
Write and debug automation scripts (Python, Bash) to drive traffic generators and manipulate test environments
Analyze large-scale test data to identify root causes of hardware/software issues across multi-vendor platforms
Collaborate with R&D and architecture teams to validate ASIC and micro-architectural behaviors
Support POC efforts, innovation initiatives, and customer use-case reproductions
Generate comprehensive test reports highlighting results, insights, and recommendations
Validate network protocols and hardware-software interactions in AI workloads.
Requirements:
8+ years of experience in system validation, performance testing, or troubleshooting in networking environments
Strong understanding of network protocols (L2/L3) and hardware-software interactions
Hands-on experience with congestion control, collective communication frameworks, and AI-scale workloads
Proficiency in Linux-based systems with strong Python scripting and automation skills
Practical experience with network traffic generators for performance and stress testing
Ability to debug across multi-vendor platforms
Excellent troubleshooting skills, curiosity, and problem-solving mindset
Strong communication and collaboration skills with ownership over end-to-end testing
Preferred Qualifications
Industry certifications (CCNP) or equivalent expertise
Knowledge of AI/ML workloads and distributed training
Experience with high-speed networking (InfiniBand, Ethernet, RDMA)
Familiarity with containerization and orchestration.
This position is open to all candidates.
 
28/01/2026
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
Your Career
We are looking for a highly talented technical individual to join the Cortex XDR infrastructure team. The team is responsible for developing automation infrastructure and various cloud-based tools and platforms that are used across the research, development and QA departments to ensure the functionality, stability and quality of the XDR product, alongside the efficiency of the infrastructure and processes used to build, test and deploy on various clouds and distributions. We believe that the platforms and infrastructure the team provides are a critical part of our department's progress toward a modern future and one of our key growth factors.
As a Platform Engineer, you will play a pivotal role in enhancing our development and automation experience by pushing forward modern automation approaches, eliminating manual effort and introducing new development operations for continuous integration, scale and durability using advanced cloud services. Your expertise will be used in areas such as infrastructure development, cloud-based automation, serverless infrastructure, automation as a service, providing technical guidance, and promoting infrastructure/configuration as code and a GitOps approach across the development departments.
To succeed in this role, you should have a strong foundation in modern cloud-based automation methodologies and a comprehensive understanding of industry best practices, especially in the redundancy and scalability of large systems and the ability to control them via SCM-based declarative configs. You should be familiar with modern public cloud approaches and serverless architectures, including virtualization, containers and container-based orchestration, including multiple Kubernetes-based deployments. You should be comfortable engaging in complex technical discussions and advocating for optimal solutions in a fast-paced, growing environment as part of our quest for continuous improvement.
Your Impact
Utilize modern technologies, including serverless cloud services, Kubernetes, and Terraform, among others, in an infrastructure/configuration-as-code GitOps approach to manage everything via source code and continuous integration processes.
Design and implement (hands-on) the next generation of platforms, automation frameworks, SDKs, and tools to be used across our entire R&D group, and be part of our infrastructure transition to the cloud.
Develop and maintain a cloud-based test execution system that supports parallel executions on multiple operating systems and multiple cloud providers at a very large scale, thereby helping reduce the effort required for automated and manual testing and reducing time to market.
Provide tools, systems and simulators for scaling up all lifecycle phases of our products and services, including cross-company and third-party integrations and frameworks to be used at high scale.
Introduce progress, help revolutionize our operations, and lay the foundation for innovation and growth.
Requirements:
Your Experience
At least 4 years of hands-on experience as one of the following: Platform/InfraOps Engineer, DevOps, Cloud Infrastructure Engineer, or equivalent.
Hands-on experience working with cloud services in big public clouds (Azure, AWS, GCP).
Experience designing and implementing cloud-based infrastructure (especially serverless components), alongside using infrastructure-as-code tools such as Terraform and Pulumi to automatically build and maintain the provisioned cloud infrastructure.
Strong programming skills in Python (or another high-level language), with extensive experience in object-oriented programming, including design patterns, algorithms and data structures.
Strong experience with containerization technologies (Docker, containerd) and orchestration, especially with various Kubernetes deployments, both self-managed and cloud-managed.
This position is open to all candidates.
 
28/01/2026
Location: Tel Aviv-Yafo
Job Type: Full Time
Your Career
The SASE Platform team builds and operates highly available, secure, and globally distributed services that protect users, applications, and data for some of the world's largest enterprises. Our mission is to deliver cloud-native security and networking capabilities that seamlessly converge networking and security at scale. As enterprises accelerate adoption of cloud, remote work, and AI-driven workloads, the need for resilient, observable, and secure SASE platforms has never been greater. As an SRE, you will play a critical role in ensuring our platform is reliable, scalable, performant, and secure from day one.
Your Impact
As a Site Reliability Engineer, you will be an integral part of the product and platform lifecycle, partnering closely with software engineers, security experts, and infrastructure teams. You will:
Collaborate with development teams to embed reliability, scalability, and operability into services from the earliest design stages.
Design, review, and evolve cloud-native architectures to improve availability, performance, cost efficiency, and fault tolerance.
Build and operate automation for provisioning, deploying, and managing infrastructure at global scale using Infrastructure as Code.
Improve CI/CD pipelines and release processes to enable safe, fast, and repeatable deployments.
Drive observability best practices, including metrics, logs, traces, SLIs/SLOs, and data-driven incident analysis.
Participate in on-call rotations, continuously reducing MTTR through automation, runbooks, and proactive reliability improvements.
Mentor and guide engineers on large-scale cloud and SASE deployments, fostering a strong SRE culture.
Participate in architecture and design reviews, bringing a reliability and operational excellence mindset.
Champion reliability, security, and operational maturity across the organization.
Requirements:
Your Experience
Bachelor's degree in Engineering, Computer Science, or a related technical field (or equivalent practical experience).
5+ years of experience working with Unix/Linux systems (shell, tools, networking, storage, kernel concepts).
2+ years of hands-on experience with microservices architectures running on Kubernetes and container platforms.
Strong understanding of distributed systems design, fault tolerance, scalability patterns, and high-availability architectures.
Experience operating workloads in public cloud environments (AWS, GCP, Azure, or hybrid) at medium to large scale.
Proficiency in building automation and tools in Python, Java, or similar languages for production environments.
Strong Infrastructure as Code experience (Terraform, Ansible, Chef, Puppet, or similar).
Experience designing and operating monitoring, alerting, and observability systems at scale.
A tools-first mindset with a passion for reducing toil and increasing engineering efficiency.
Excellent communication skills and the ability to lead discussions across engineering and security teams.
Experience applying reliability and security frameworks to design, review, and operate production systems.
Nice to have:
Networking expertise, including TCP/IP, DNS, BGP, routing, load balancing, proxies, VPNs, and cloud networking concepts, especially relevant to SASE architectures.
Experience operating or supporting SASE, SD-WAN, Zero Trust, or network security platforms.
Familiarity with AI/LLM technologies, including:
Using LLMs to improve operational workflows (incident analysis, alert enrichment, runbooks, automation).
Experience integrating AI/ML services into production systems.
Understanding of reliability, security, and governance considerations for AI-driven services.
This position is open to all candidates.
 
7 days ago
Location: Tel Aviv-Yafo
Job Type: Full Time
We are transforming how organizations build, run, and scale AI and accelerated compute workflows with NeuralMesh, our intelligent, adaptive mesh storage system. Unlike traditional data infrastructures, which become more fragile as compute environments grow and performance demands increase, NeuralMesh becomes faster, stronger, and more efficient as it scales - providing a flexible, adaptable foundation for enterprise and agentic AI innovation that maximizes GPU utilization, accelerates time to first token, and lowers the cost of innovation.
We are a growth-stage company backed by world-class venture capital investors and AI infrastructure industry leaders. Our technology, purpose-built for AI, has garnered over 140 patents and is trusted by more than 30% of Fortune 50 enterprises, as well as the world's leading hyperscalers, neoclouds, and AI innovators. Our team is customer-obsessed and works accountably, boldly, and collaboratively to ensure customer success. If we sound like your kind of people, join us!
About the role
At our company, we're building a next-generation platform for validating large-scale distributed systems. Our goal is to continuously ensure the correctness, performance, and resilience of the company's Data Platform across every layer of the stack.
As a Senior Software Engineer, you'll work hands-on on the systems and frameworks that test, stress, and validate complex distributed infrastructure under real-world conditions. You'll help design and build automated environments that simulate scale, concurrency, and failure scenarios, and you'll contribute to evolving how we ensure reliability and correctness in modern infrastructure systems.
This role is ideal for engineers with a strong distributed systems background who enjoy deep technical problem-solving, working close to the system, and building tools that improve quality, stability, and confidence at scale.
What You'll Do
Design and implement core components of a distributed testing infrastructure and quality platform.
Build automated frameworks to validate functionality, performance, and resilience at scale.
Collaborate closely with infrastructure, storage, and platform teams to ensure quality is built into the development lifecycle.
Contribute to improving tooling, test coverage, and engineering best practices across the organization.
Requirements:
Strong experience (5+ years) building or working on large-scale distributed systems in areas such as storage, networking, cloud infrastructure, or backend platforms.
Solid understanding of concurrency, system correctness, and reliability in production systems.
Hands-on programming experience in one or more of the following languages: Go, C++, Rust, or Python.
Experience building test frameworks, infrastructure tooling, or internal platforms is a strong advantage.
Curiosity and interest in modern approaches to testing, automation, and system validation (including AI-assisted techniques).
Ability to work independently on complex technical problems while collaborating effectively with cross-functional teams.
Nice to Have
Experience with observability, performance testing, fault injection, or chaos engineering.
Familiarity with CI/CD pipelines for large-scale systems.
Exposure to AI/ML-driven testing or automation tools.
This position is open to all candidates.
 
Location: Tel Aviv-Yafo
Job Type: Full Time
We are looking for a motivated and experienced Senior Software Engineer to join our Cloud and K8s Group. The successful candidate will possess a strong technical background in low-level systems programming and will excel in developing performant, efficient, and reliable software across multiple operating systems. Expertise in C++ and deep knowledge of Linux, macOS, and Windows internals are essential for this role, as you will be instrumental in building and optimizing our agent.

Key Responsibilities:

Design, implement, and optimize low-level system software components and libraries with a focus on performance and efficiency.
Analyze and debug complex issues related to operating system internals (kernel, drivers, memory management) across Linux, macOS, and Windows platforms.
Develop networking capabilities and optimize networking stack interactions within software modules.
Write clean, maintainable, and well-tested C++ code, while mentoring peers and reviewing their contributions.
Collaborate closely with infrastructure, security, and product teams to design scalable and secure systems.
Contribute to CI/CD pipelines and automation workflows to streamline build, test, and deployment processes.
Develop and maintain scripting tools (e.g., Python, Bash, PowerShell) to support development and operational tasks.
Stay up to date with emerging technologies in systems programming, cybersecurity, and networking to continuously improve our solutions.
Requirements:
Bachelor's or Master's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
Minimum of 5 years of experience in software development with a strong focus on C++ and low-level programming.
Deep understanding of Linux, macOS, and Windows internals including kernel architecture, system calls, process and memory management.
Strong knowledge of networking protocols and experience writing performant and efficient code.
Experience with Golang is an advantage.
Background or interest in cybersecurity is a plus.
Familiarity with .NET development is beneficial.
Experience with CI/CD tools and pipelines (e.g., Jenkins, GitHub Actions) is preferable.
Proficient in scripting languages such as Python, Bash, or PowerShell.
Strong problem-solving skills and ability to work independently and in a team environment.
Excellent communication and collaboration skills.
This position is open to all candidates.
 
18/01/2026
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
We're looking for a Senior Software Engineer to join our growing R&D team. In this role, you will play a critical part in designing, building, and optimizing complex systems that power our AI-driven platform. You'll work across the stack, primarily on backend services, with opportunities to influence architectural decisions and build highly scalable and performant systems. You'll collaborate closely with AI, product, and frontend teams to bring advanced features to life and ensure a seamless, intelligent experience for our users.

This is a high-impact role for someone who is passionate about engineering excellence, eager to shape systems end-to-end, and ready to grow with a fast-moving, AI-first company.

Key Responsibilities:
Design, develop, and maintain robust backend systems and services.
Ensure the scalability, performance, and security of backend components.
Collaborate with front-end developers and data teams to integrate user-facing elements with server-side logic.
Optimize the platform's infrastructure to handle large-scale data processing and analysis.
Troubleshoot and debug complex issues, identifying and implementing the most effective solutions.
Contribute to the architecture and system design decisions for the backend infrastructure.
Stay up to date with industry trends and new technologies to continuously improve backend performance.
Requirements:
7+ years of software development experience in a fast-paced SaaS environment.
Strong experience with server-side technologies, particularly Node.js, Python and SQL.
In-depth knowledge of databases; experience in schema design and optimization.
Expertise in API development and microservices architecture.
Familiarity with cloud platforms such as Google Cloud/AWS.
Understanding of containerization and orchestration tools (Docker, Kubernetes).
Experience with message queues (e.g., RabbitMQ, Kafka, or their cloud alternatives such as SQS or Pub/Sub) and data processing.
Experience with client-side technologies (e.g., React) is a plus.
Applied AI or video editing knowledge is a big plus.
Excellent problem-solving skills with a focus on scalability and performance.
Ability to work independently while also thriving in a collaborative team environment.
This position is open to all candidates.
 
8 hours ago
Location: More than one
Job Type: Full Time
We are looking for a Senior HPC and AI Cluster Administrator to join the Networking clusters solutions HPC/AI Infrastructure team. We are building supercomputers and AI clusters based on groundbreaking technologies. We are looking for a system administrator to play a key role in working with the most exciting computing hardware and software and to contribute to the latest breakthroughs in artificial intelligence and GPU computing.

You will work with the latest accelerated computing and deep learning software and hardware platforms, and with many scientific researchers, developers, and customers to craft improved workflows and develop new, differentiated solutions. You will interact with HPC, OS, GPU compute, and systems specialists to architect, develop and bring up large-scale performance platforms. Does this sound like you? If so, we would love to hear from you!

What you will be doing:

Deploying, managing and maintaining large-scale HPC/AI clusters.

Managing Linux job/workload schedulers and orchestration tools.

Supporting and maintaining continuous integration and delivery pipelines.

Troubleshooting and fixing issues bottom-up, from bare metal through the operating system and software stack to the application level.

Supporting Research & Development activities and engaging in POCs/POVs for future improvements.
Requirements:
What we need to see:
Bachelor's Degree in Computer Science, Engineering, or a related field; or equivalent experience.

5+ years of experience.

Knowledge of HPC and AI solution technologies from CPUs and GPUs to high speed interconnects and supporting software.

Experience with job scheduling workloads and orchestration tools such as Slurm, K8s.

Excellent knowledge of Windows and Linux (Red Hat/CentOS and Ubuntu) networking (sockets, firewalls, iptables, Wireshark, etc.) and internals, ACLs and OS-level security protection, and common protocols, e.g. TCP, DHCP, DNS, etc.

Experience with multiple storage solutions such as Lustre, GPFS, ZFS and XFS. Familiarity with newer and emerging storage technologies.

Python programming and Bash scripting experience, and experience with automation and configuration management tools and practices such as Jenkins, Ansible, and GitOps.

Knowledge of Networking Protocols like InfiniBand, Ethernet.

Experience with virtual systems (for example VMware, Hyper-V, KVM).

Familiarity with cloud computing platforms (e.g. AWS, Azure, Google Cloud).

Ways to stand out from the crowd:

Knowledge of CPU and/or GPU architecture.

Knowledge of Kubernetes, container related microservice technologies.

Experience with GPU-focused hardware/software (DGX, CUDA).

Background with RDMA (InfiniBand or RoCE) fabrics.
This position is open to all candidates.
 
Confidential company
Location: Tel Aviv-Yafo
Job Type: Full Time
Software Engineer - AI Platform
About Us
We help modern, mid-size businesses transform the way they manage people, giving HR and managers all they need to connect, engage, develop, and retain top talent. Since 2015, we've achieved consecutive triple-digit year-over-year growth, all backed by our amazing team of Bobbers from across the globe, making us the HRIS of choice for over 4,500 midsize and multinational companies and over 1 million users.
Our HR platform is intuitive, data-driven, and built for the way people work today: globally, remotely, and collaboratively.
What this role is really about
We're building our internal AI Platform as a foundation that enables safe, scalable, and cost‑effective AI across the company. Our mission is to support internal business units in becoming more productive and driving business growth. As a Software Engineer on Business Innovation (Internal), you'll develop high‑leverage platform capabilities used by Product, Customer Success, Sales, Operations, Data, and IT. You'll work across the entire stack, from APIs and services to data and UI, transforming business requi
Build core AI platform services - Design and implement agent orchestration, prompt management, RAG, Connectors, and evaluation pipelines that power AI experiences across the company.
Develop complex agentic workflows - Develop a multi-step workflow that coordinates tools and services with proper observability, guardrails, and cost controls (using OpenAI Agent SDK, LangGraph, or a similar framework).
Build LLM evaluation and optimisation processes - Develop evaluation harnesses, offline/online experiments, prompt-testing frameworks, and dashboards to balance quality, latency, and spend across all AI services.
Requirements:
5+ years of hands‑on software engineering experience building production systems at scale.
Strong proficiency in Python, with practical knowledge of databases.
Strong grounding in LLM/AI application patterns (RAG, tool use, function calling, guardrails) and vendor APIs (OpenAI or similar).
Experience with vector stores (pgvector, Pinecone, OpenSearch), feature/semantic layers, or retrieval pipelines.
Familiarity with: eval frameworks, prompt/version management, offline/online A/B testing, and cost/latency optimization.
Clear written and verbal communication; able to drive alignment with concise design docs and reviews.
Nice to have:
Experience building developer platforms or internal tooling
Familiarity with workflow orchestration (Airflow, Prefect, Dagster) or multi-model routing strategies
Hands-on experience with model optimisation, fine-tuning, or distillation techniques.
Deep experience with cloud infrastructure (AWS), containers (Docker, Kubernetes), and distributed systems.
Experience with frontend development frameworks such as React.
Background in SaaS/enterprise environments with compliance requirements (SOC2, GDPR).
This position is open to all candidates.
 
15/01/2026
Location: Tel Aviv-Yafo
Job Type: Full Time
We're on the lookout for a driven and experienced hands-on Engineering Team Leader to lead a group of engineers building the AI application foundation behind our next-generation cyber-AI product.
This team owns the core backend and frontend infrastructure that powers all user-facing applications - as well as the agentic and conversational frameworks that bring intelligence and automation into the experience.
As a hands-on leader, you'll guide a talented team of engineers in designing and evolving the systems that enable every other Engineering group to create fast, scalable, and intelligent apps on top of our platform.
Responsibilities:
Lead a multidisciplinary team responsible for backend services, frontend frameworks, and AI agent infrastructure - the technical bedrock of our product experience.
Mentor engineers, grow the team, and foster a culture of technical excellence and innovation.
Design and build robust frameworks and systems that power all customer-facing applications.
Develop the agentic and conversational architecture enabling LLM-powered user interactions and intelligent workflows.
Collaborate with AI research, data, and product teams to seamlessly integrate AI-driven capabilities into production systems.
Define and drive the technical roadmap, ensuring scalability, developer productivity, and rapid iteration.
Requirements:
7+ years of software development experience, with 2+ years leading and mentoring engineers.
Strong expertise in backend development (Go, Python, Node.js, or similar) and familiarity with modern frontend technologies (React, Vue, TypeScript, etc.).
Proven experience designing distributed and web application architectures.
Solid understanding of developer experience - how to build frameworks and tooling that enable other teams to move faster.
Experience working with or integrating AI systems, LLMs, or agent-based architectures.
Excellent collaboration skills and a startup mindset - hands-on, pragmatic, and impact-oriented.
A product-oriented mindset and the ability to work in a fast-paced, team-driven environment.
Advantages:
Experience with agent orchestration, context management, or retrieval-augmented architectures.
Experience in cybersecurity.
Hands-on expertise in Go development.
This position is open to all candidates.
 
8504160
Posted 15 hours ago
Location: Tel Aviv-Yafo
Job Type: Full Time
Join our company's AI research group, a cross-functional team of ML engineers, researchers and security experts building the next generation of AI-powered security capabilities. Our mission is to leverage large language models to understand code, configuration, and human language at scale, and to turn this understanding into security AI capabilities that will drive our company's future AI security solutions.
We foster a hands-on, research-driven culture where you'll work with large-scale data, modern ML infrastructure, and a global product footprint that impacts over 100,000 organizations worldwide.
Key Responsibilities
Your Impact & Responsibilities
As a Senior ML Research Engineer, you will be responsible for the end-to-end lifecycle of large language models: from data definition and curation, through training and evaluation, to providing robust models that can be consumed by product and platform teams.
Own training and fine-tuning of LLMs / seq2seq models: Design and execute training pipelines for transformer-based models (encoder-decoder, decoder-only, retrieval-augmented, etc.), and fine-tune open-source LLMs on our company-specific data (security content, logs, incidents, customer interactions).
Apply advanced LLM training techniques such as instruction tuning, preference / contrastive learning, LoRA / PEFT, continual pre-training, and domain adaptation where appropriate.
Work deeply with data: define data strategies with product, research and domain experts; build and maintain data pipelines for collecting, cleaning, de-duplicating and labeling large-scale text, code and semi-structured data; and design synthetic data generation and augmentation pipelines.
Build robust evaluation and experimentation frameworks: define offline metrics for LLM quality (task-specific accuracy, calibration, hallucination rate, safety, latency and cost); implement automated evaluation suites (benchmarks, regression tests, red-teaming scenarios); and track model performance over time.
Scale training and inference: use distributed training frameworks (e.g. DeepSpeed, FSDP, tensor/pipeline parallelism) to efficiently train models on multi-GPU / multi-node clusters, and optimize inference performance and cost with techniques such as quantization, distillation and caching.
Collaborate closely with security researchers and data engineers to turn domain knowledge and threat intelligence into high-value training and evaluation data, and to expose your models through well-defined interfaces to downstream product and platform teams.
Requirements:
What You Bring
5+ years of hands-on work in machine learning / deep learning, including 3+ years focused on NLP / language models.
Proven track record of training and fine-tuning transformer-based models (BERT-style, encoder-decoder, or LLMs), not just consuming hosted APIs.
Strong programming skills in Python and at least one major deep learning framework (PyTorch preferred; TensorFlow also acceptable).
Solid understanding of transformer architectures, attention mechanisms, tokenization, positional encodings, and modern training techniques.
Experience building data pipelines and tools for large-scale text / log / code processing (e.g. Spark, Beam, Dask, or equivalent frameworks).
Practical experience with ML infrastructure, such as experiment tracking (Weights & Biases, MLflow or similar), job orchestration (Airflow, Argo, Kubeflow, SageMaker, etc.), and distributed training on multi-GPU systems.
Strong software engineering practices: version control, code review, testing, CI/CD, and documentation.
Ability to own research and engineering projects end-to-end: from idea, through prototype and controlled experiments, to models ready for integration by product and platform teams.
Good communication skills and the ability to work closely with non-ML stakeholders (security experts, product managers, engineers).
This position is open to all candidates.
 
8541239