Required: SoC Quality and Reliability Engineer, Cloud
Note: By applying to this position you will have an opportunity to share your preferred working location from the following: Tel Aviv, Israel; Haifa, Israel.
About the job
In this role, you'll work to shape the future of AI/ML hardware acceleration. You will have an opportunity to drive cutting-edge TPU (Tensor Processing Unit) technology that powers our most demanding AI/ML applications. You'll be part of a team that pushes boundaries, developing custom silicon solutions that power the future of our TPU. You'll contribute to the innovation behind products loved by millions worldwide, and leverage your design and verification expertise to verify complex digital designs, with a specific focus on TPU architecture and its integration within AI/ML-driven systems.
Our data centers are the most advanced in the world. In this role, you will help build the SoCs that power these data centers by driving quality and reliability processes from the integrated circuit perspective. You will create silicon and follow it into the field (and back) to drive improvements for the next generations of chips.
You will have an understanding of IC flows, wafer processing, testing, qualification, yield, reliability, and failure analysis. You will work with various cross-functional teams to develop quality and reliability specifications, develop and deploy design guidelines, and develop and execute test plans. Within the larger organization, you will collaborate with global hardware quality and reliability teams, as well as silicon design, validation, and engineering teams.
The AI and Infrastructure team is redefining what's possible. We empower our customers with breakthrough capabilities and insights by delivering AI and Infrastructure at unparalleled scale, efficiency, reliability and velocity.
Responsibilities
Drive the strategic definition and development of design-for-reliability (DfR) guidelines, collaborating with cross-functional subject matter experts to integrate reliability into early design stages.
Define and lead the development of qualification hardware and test methodologies, managing internal teams and external vendors to ensure silicon and package verification.
Execute comprehensive silicon and package qualification programs (including high-temperature operating life (HTOL), early life failure rate (ELFR), electrostatic discharge/latch-up (ESD/LU), biased highly accelerated stress test (b/HAST), etc.) and conduct failure analysis to resolve quality issues.
Extract and analyze data from qualification programs, high-volume manufacturing, and field returns to identify failure mechanisms and trends for yield and reliability optimization.
Develop and implement physics-based statistical quality and reliability models (e.g., ELF, time-dependent dielectric breakdown (TDDB), negative bias temperature instability (NBTI)) to predict device failure mechanisms and lifetime behaviors.
Requirements
Minimum qualifications:
Bachelor's degree in Electrical Engineering, Materials Science, Physics, or a related field, or equivalent practical experience.
8 years of experience in IC silicon quality or reliability.
Experience leading the product reliability lifecycle from post-tapeout through high-volume manufacturing.
Experience with semiconductor complementary metal-oxide-semiconductor (CMOS) technology, device physics, and failure mechanisms.
Preferred qualifications:
Master's degree in Electrical Engineering, Materials Science, or a related field.
Expertise in statistical data analysis using tools such as JMP, Python, or JSL.
Knowledge of design-for-reliability (DfR) rules and implementation techniques.
Familiarity with electrical failure analysis (EFA) and physical failure analysis (PFA) techniques.
Track record with silicon reliability on process nodes and advanced packaging technologies.
This position is open to all candidates.