Come join the team as a Staff Machine Learning Engineer. We are seeking a highly skilled ML engineer who is passionate about building a world-class, high-scale platform focused on delivering AI capabilities. You will be part of a vibrant team of AI Scientists and ML Engineers, helping to build the next generation of products and experiences using cutting-edge Generative AI technology. If you love stretch goals, hard challenges, and making customers incredibly happy while indulging your obsessive need for perfect code and user experience, this is the job for you.
Responsibilities
Design, implement, and enhance services at large scale, specifically focusing on improving Generative AI inference, quantization, optimization, finetuning, and evaluation.
Use your coding expertise to design and implement scalable, modular, and secure services.
Develop backend systems that support serving of LLMs and AI Agents at scale, utilizing the latest industry tools and techniques.
Work cross-functionally with product managers, AI scientists, business units, and other engineers to understand, design, implement, and refine Generative AI models.
Help automate, deliver, monitor, and improve GenAI solutions while taking end-to-end responsibility, including technical documentation and automated tests.
Interact with a variety of data sources, working closely with peers to refine features from the underlying data and build end-to-end pipelines.
Resolve defects and bugs found during testing, in production, and in post-release patches, and participate in peer code reviews.
Explore state-of-the-art technologies and apply them to deliver customer benefits.
Requirements
7+ years of active software engineering experience with a focus on building AI-driven applications, machine learning systems, and microservices at large scale.
Proven experience building AI products that serve at high scale, coupled with experience designing and developing Generative AI architectures.
Extensive knowledge of large language models (LLMs) and building agents at scale is a great plus.
Experience with LLM tools and frameworks such as LangChain, vLLM, and the Hugging Face toolkit.
Proficiency in Java and Python, as well as data-oriented languages, tools, and frameworks like Spark.
Strong understanding of software design and architecture, and experience working with cloud technologies, in particular AWS, and container technologies like Docker, Kubernetes, and Kubeflow/MLflow.
Solid software engineering fundamentals, including version control systems (Git, GitHub), the ability to write production-ready code, and an understanding of data structures, algorithms, and performance implications.
Experience with machine learning techniques (classification, regression, clustering), mathematics fundamentals (linear algebra, calculus, probability), and data processing tools (relational, NoSQL, stream processing).
Bachelor's, Master's, or PhD degree in Computer Science or a related field, or equivalent practical/work experience.
This position is open to all candidates.