We are transforming how organizations build, run, and scale AI and accelerated compute workflows with NeuralMesh, our intelligent, adaptive mesh storage system. Unlike traditional data infrastructures, which become more fragile as compute environments grow and performance demands increase, NeuralMesh becomes faster, stronger, and more efficient as it scales, providing a flexible, adaptable foundation for enterprise and agentic AI innovation that maximizes GPU utilization, accelerates time to first token, and lowers the cost of innovation.
We are a growth-stage company backed by world-class venture capital investors and AI infrastructure industry leaders. Our technology, purpose-built for AI, has garnered over 140 patents and is trusted by more than 30% of Fortune 50 enterprises, as well as the world's leading hyperscalers, neoclouds, and AI innovators. Our team is customer-obsessed and works accountably, boldly, and collaboratively to ensure customer success. If we sound like your kind of people, join us!
About the role
At our company, we're building a next-generation platform for validating large-scale distributed systems. Our goal is to continuously ensure the correctness, performance, and resilience of the company's Data Platform across every layer of the stack.
As a Senior Software Engineer, you'll work hands-on on the systems and frameworks that test, stress, and validate complex distributed infrastructure under real-world conditions. You'll help design and build automated environments that simulate scale, concurrency, and failure scenarios, and you'll contribute to evolving how we ensure reliability and correctness in modern infrastructure systems.
This role is ideal for engineers with a strong distributed systems background who enjoy deep technical problem-solving, working close to the system, and building tools that improve quality, stability, and confidence at scale.
What You'll Do
Design and implement core components of a distributed testing infrastructure and quality platform.
Build automated frameworks to validate functionality, performance, and resilience at scale.
Collaborate closely with infrastructure, storage, and platform teams to ensure quality is built into the development lifecycle.
Contribute to improving tooling, test coverage, and engineering best practices across the organization.
Requirements
Strong experience (5+ years) building or working on large-scale distributed systems in areas such as storage, networking, cloud infrastructure, or backend platforms.
Solid understanding of concurrency, system correctness, and reliability in production systems.
Hands-on programming experience in one or more of the following languages: Go, C++, Rust, or Python.
Experience building test frameworks, infrastructure tooling, or internal platforms is a strong advantage.
Curiosity and interest in modern approaches to testing, automation, and system validation (including AI-assisted techniques).
Ability to work independently on complex technical problems while collaborating effectively with cross-functional teams.
Nice to Have
Experience with observability, performance testing, fault injection, or chaos engineering.
Familiarity with CI/CD pipelines for large-scale systems.
Exposure to AI/ML-driven testing or automation tools.
This position is open to all candidates.