We are looking for an experienced senior developer to design and build high-performance storage systems optimized for AI inference workloads, particularly large language models (LLMs). This role involves developing scalable, GPU-accelerated storage infrastructure that integrates tightly with modern AI inference frameworks and distributed architectures.
Position located in Kfar Saba, Israel.
Responsibilities:
Design and implement scalable storage solutions tailored for AI/ML inference pipelines.
Optimize data pipelines, caching, and I/O patterns to maximize GPU utilization and minimize inference latency.
Research and prototype innovative storage/compute co-design approaches for transformer-based models.
Stay current with advancements in distributed storage, high-performance networking, and AI inference technologies.
Contribute to open-source AI infrastructure projects where applicable.
Requirements:
At least 5 years of industry experience.
Expert-level Python and proficient Rust programming skills.
Strong knowledge of distributed storage architectures, object storage, and high-performance filesystems.
Hands-on experience with GPU acceleration technologies (CUDA, NCCL) and GPU memory management.
Familiarity with AI/ML frameworks and transformer model architectures.
Excellent problem-solving, debugging, and performance optimization skills.
Self-motivated, able to work independently in fast-paced, innovative environments.
Deep expertise in at least one of the following: AI models, storage, or networking.
Preferred qualifications:
Experience with high-performance networking protocols (InfiniBand, RoCE).
Knowledge of HPC technologies (MPI, NVLink).
Contributions to open-source AI or storage projects.
University degree.
Startup experience.
This position is open to all candidates.