We are looking for a Senior Data & Machine Learning Engineer to operate at the intersection of data platform engineering and machine learning enablement. This role is responsible for building scalable, efficient, and reliable data systems while enabling Data Science and Analytics teams to develop and deploy ML-driven features.
You will take ownership of the data and ML infrastructure layer, ensuring that pipelines, storage models, and compute usage are optimized, while also shaping how data workflows and ML solutions are designed across the organization.
Responsibilities
Data Platform & Infrastructure
Design, build, and maintain scalable data pipelines and storage systems supporting analytics and ML use cases
Ensure compute and cost efficiency across pipelines, storage models, and processing workflows
Own and improve data orchestration, transformation, and serving layers (e.g., Spark, DBT, streaming/batch systems)
Build and maintain shared infrastructure components, including:
IO managers and data access abstractions
Integrations with DBT, Spark, and other data frameworks
Internal tooling to improve developer productivity and reliability
ML Enablement & Collaboration
Partner closely with Data Science to design and productionize ML solutions for new features and research initiatives
Translate experimental models into robust, scalable production systems
Support feature engineering, training pipelines, and inference workflows
Help define best practices for ML lifecycle management (training, validation, deployment, monitoring)
Data Quality, Governance & Best Practices
Enforce best practices for building and maintaining data processes across the Data Analytics and Data Science teams
Define standards for:
Data modeling and transformations
Pipeline reliability and observability
Testing, versioning, and documentation
Improve data quality, consistency, and discoverability across the organization
Performance & Reliability
Optimize systems for performance, scalability, and cost efficiency
Monitor and troubleshoot data pipelines and ML systems in production
Implement observability (logging, metrics, alerting) across data workflows
Requirements
Strong programming skills in Python (or similar language)
Proven experience building and maintaining production-grade data pipelines
Hands-on experience with data processing frameworks (e.g., Spark or similar)
Familiarity with DBT or modern data transformation workflows
Experience working with cloud environments (AWS, GCP, or Azure)
Solid understanding of data modeling, distributed systems, and ETL/ELT patterns
This position is open to all candidates.