We are looking for a highly motivated PhD or MSc student to join our team for a summer internship. The focus of the internship is on cost-efficient serving of AI inference workloads, with a particular emphasis on optimizing routing strategies and managing KV (Key-Value) cache usage across distributed systems.
The intern will work on:
Designing and evaluating routing algorithms to improve inference performance and reduce serving cost.
Investigating strategies for efficient KV cache management at scale.
Prototyping and benchmarking ideas to optimize inference serving systems.
This internship offers a unique opportunity to work at the intersection of AI and systems, with real-world impact on scalable inference serving.
Our summer internship program offers you the opportunity to join our research team for a 3-month internship (working 5 days a week) at either our Haifa or Tel Aviv site (depending on the specific internship). During the internship, you will work with our talented researchers on leading projects, helping create the next generation of AI, security, quantum, cloud and much more.
Required education
Bachelor's Degree
Preferred education
Master's Degree
Required technical and professional expertise
MSc or PhD candidate in Computer Science, in advanced stages of studies.
Background in Computer Science, Machine Learning Systems, or related fields.
Knowledge of distributed systems, networking, or inference infrastructure is a plus.
Strong programming skills (Python, Go, or similar).
Interest in AI infrastructure and large-scale system optimization.
Ability to work independently while also being an excellent team player.
Familiarity with Kubernetes (K8s) is an advantage.
Preferred technical and professional experience
Publications at top-tier peer-reviewed conferences or journals.
This position is open to all candidates.