You will join the R&D team building the next generation of our AI-powered metadata and context management. This system is the backbone that enables instant, actionable answers via our MCP and agentic use cases by translating complex business questions into secure, cost-aware data queries.
This is a high-impact, data-intensive role requiring production-grade backend development expertise, focused on quality, governance, and low-latency retrieval across complex data sources.
What you'll do:
This role focuses on the end-to-end backend development and engineering quality for our core AI retrieval plane:
Design and Scale Context Retrieval: Develop and deploy services for our hybrid RAG/GraphRAG systems, combining lexical and vector search on data from various sources.
Implement Governance Gates: Build mission-critical backend logic to enforce data security policies, ensuring customer data and internal business data are filtered, masked, or isolated based on policy tags before context is ever generated.
Own the Quality Lifecycle: Design and implement automated evaluation to continuously measure the precision, recall, and faithfulness of the retrieved context against golden datasets.
Optimize Performance and Cost: Architect backend solutions for performance, including designing semantic caching strategies, memory, and context compression.
Data Modeling & APIs: Contribute to the graph model and APIs for retrieving metadata (glossary terms, lineage, schemas), enabling sophisticated context assembly for the agent.
Full Observability: Implement comprehensive tracing and monitoring for every agent component and tool call, ensuring high service reliability and quick root-cause analysis.
Requirements: What you have
Primary Language: 4+ years of hands-on experience developing and deploying production-grade services at scale in your primary programming language.
Data & Backend Excellence: Proven experience with large-scale distributed systems, and familiarity with production databases and data-flow principles at scale.
Search & Information Retrieval Expertise: Strong hands-on experience with search engines, vector indexing, and designing a hybrid retrieval mechanism.
Cloud Data Fluency: Experience with BigQuery and familiarity with general cloud platforms (AWS/GCP) for deploying data-intensive applications.
Quality Mindset: A strong commitment to quality engineering, test automation, and an ability to drive product excellence through data-driven decisions.
Bonus points:
Familiarity with metadata frameworks (DataHub, OpenMetadata, Dataplex, etc.).
Passion for building or integrating LLM-based systems (LangChain, RAG, vector DBs, etc.).
Python experience.
Recommended by one of our employees.
This position is open to all candidates.