It starts with you - a technical leader who's passionate about data pipelines, data modeling, and growing high-performing teams. You care about data quality, business logic correctness, and delivering trusted data products to analysts, data scientists, and AI systems. You'll lead the Data Engineering team in building ETL/ELT pipelines, dimensional models, and quality frameworks that turn raw data into actionable intelligence.
If you want to lead a team that delivers the data products powering mission-critical AI systems, join our mission - this role is for you.
Responsibilities:
Lead and grow the Data Engineering team - hiring, mentoring, and developing engineers while fostering a culture of ownership and data quality.
Define the data modeling strategy - dimensional models, data marts, and semantic layers that serve analytics, reporting, and ML use cases.
Own ETL/ELT pipeline development using platform tooling - orchestrated workflows that extract from sources, apply business logic, and load into analytical stores.
Drive data quality as a first-class concern - validation frameworks, testing, anomaly detection, and SLAs for data freshness and accuracy.
Establish lineage and documentation practices - ensuring consumers understand data origins, transformations, and trustworthiness.
Partner with stakeholders to understand data requirements and translate them into well-designed data products.
Build and maintain data contracts with consumers - clear interfaces, versioning, and change management.
Collaborate with Data Platform to define requirements for new platform capabilities; work with Datastores on database needs; partner with ML, Data Science, Analytics, Engineering, and Product teams to deliver trusted data.
Design retrieval-friendly data products - RAG-ready paths, feature tables, and embedding pipelines - while maintaining freshness and governance SLAs.
Requirements:
8+ years in data engineering, analytics engineering, or BI development, with 2+ years leading teams or technical functions. Hands-on experience building data pipelines and models at scale.
Data modeling - dimensional modeling (Kimball), data vault, or similar; fact/dimension design, slowly changing dimensions, semantic layers
Transformation frameworks - dbt, Spark SQL, or similar; modular SQL, testing, documentation-as-code
Orchestration - Airflow, Dagster, or similar; DAG design, dependency management, scheduling, failure handling, backfills
Data quality - Great Expectations, dbt tests, Soda, or similar; validation rules, anomaly detection, freshness monitoring
Batch processing - Spark, SQL engines; large-scale transformations, optimization, partitioning strategies
Lineage & cataloging - DataHub, OpenMetadata, Atlan, or similar; metadata management, impact analysis, documentation
Messaging & CDC - Kafka, Debezium; event-driven ingestion, change data capture patterns
Languages - SQL (advanced), Python; testing practices, code quality, version control
This position is open to all candidates.