We are transforming how organizations build, run, and scale AI and accelerated compute workflows with NeuralMesh, our intelligent, adaptive mesh storage system. Unlike traditional data infrastructures, which become more fragile as compute environments grow and performance demands increase, NeuralMesh becomes faster, stronger, and more efficient as it scales, providing a flexible, adaptable foundation for enterprise and agentic AI innovation that maximizes GPU utilization, accelerates time to first token, and lowers the cost of innovation.
We are a growth-stage company backed by world-class venture capital investors and AI infrastructure industry leaders. Our technology, purpose-built for AI, has garnered over 140 patents and is trusted by more than 30% of Fortune 50 enterprises, as well as the world's leading hyperscalers, neoclouds, and AI innovators. Our team is customer-obsessed and works accountably, boldly, and collaboratively to ensure our customers' success. If we sound like your kind of people, join us!
About The Position
At our company, we don't just "test" software; we push the boundaries of high-performance distributed systems. We are looking for a Group Lead for System Quality Engineering, someone who views quality as a complex engineering challenge, not a checkbox.
You will lead a group of 20 world-class software engineers and Team Leads. Your mission is to build the "adversarial" engineering force that ensures our platform remains resilient, performant, and unbreakable under the world's most demanding data workloads. We want a leader who treats testing as a performance art and a systems science, likely coming from a background in Backend Engineering, SRE, or highly technical System Verification.
Key Responsibilities:
Architect the "Breaking" Strategy: Design the end-to-end strategy for validating a massive-scale distributed file system. This isn't just about coverage; it's about finding the architectural breaking points.
Lead a High-Caliber Engineering Group: Manage and mentor a team of "Quality Hackers." You will set the technical bar, promote a culture of engineering excellence, and move away from traditional "manual-first" mindsets.
Deep-Tech Collaboration: Work as a peer to the Product and Core Development leads. You'll influence the product roadmap by identifying systemic risks early in the design phase.
Evolve the Automation Ecosystem: Partner with Infrastructure teams to build sophisticated, automated test environments that simulate chaotic, real-world customer environments at scale.
Field-to-Core Feedback Loop: Bridge the gap between how our product is used in the field and how we stress-test it in the lab, ensuring our company excels in the most extreme AI and HPC use cases.
Data-Driven Reliability: Define and track high-signal metrics (System Recovery Time, Latency P99s under stress, Mean Time to Detection) to provide a transparent view of product health.
Requirements:
10+ years in Software Engineering: You have spent significant time "in the trenches" building or breaking complex systems.
8+ years of Leadership: Experience managing managers and large teams (15+) in a fast-paced, high-growth environment.
Systems Thinking: Deep experience with Distributed Systems, High-Performance Computing (HPC), or Storage (NAS, Object, SAN). You understand IO paths, metadata consistency, and network protocols.
Polyglot/Hacker Mindset: Proficiency in Python for automation and a deep understanding of C++, Go, or Rust to navigate and debug the core codebase.
Production-First Mentality: Experience in SRE, Production Engineering, or high-scale System Verification. You know how systems fail in the real world (network partitions, disk failures, race conditions).
Analytical Rigor: The ability to look at a complex architecture and instinctively know where the "hidden" bugs live.
Communication: Excellent verbal and written communication skills for cross-functional collaboration.
This position is open to all candidates.