Work closely with data scientists, analysts, and other stakeholders to identify and prioritize data engineering projects and to ensure that the data infrastructure is aligned with business goals and objectives
Design, build, and maintain an optimal data pipeline architecture for extracting, transforming, and loading data from a wide variety of sources, including external APIs, data streams, and data stores.
Continuously monitor and optimize the performance and reliability of the data infrastructure, and identify and implement solutions to improve scalability, efficiency, and security
Stay up-to-date with the latest trends and developments in the field of data engineering, and leverage this knowledge to identify opportunities for improvement and innovation within the organization
Solve challenging problems in a fast-paced and evolving environment while maintaining uncompromising quality.
Implement data privacy and security requirements to ensure solutions comply with security standards and frameworks.
Enhance the team's DevOps capabilities.
Requirements:
Bachelor's or Master's degree in Computer Science, Engineering, or a related field
2+ years of proven experience developing large-scale software using an object-oriented or functional language.
5+ years of professional experience in data engineering, focusing on building and maintaining data pipelines and data warehouses
Strong experience with Spark, Scala, and Python, including the ability to write high-performance, maintainable code
Experience with AWS services, including EC2, S3, Athena, Lambda, and EMR
Familiarity with data warehousing concepts and technologies, such as columnar storage, data lakes, and SQL
Experience with data pipeline orchestration and scheduling using tools such as Airflow
Strong problem-solving skills and the ability to work independently as well as part of a team
High level of English proficiency - a must
A team player with excellent collaboration skills.
Nice to Have:
Expertise with Vertica or Redshift, including experience with query optimization and performance tuning
Experience with machine learning and/or data science projects
Knowledge of data governance and security best practices, including data privacy regulations such as GDPR and CCPA.
Knowledge of Spark internals (tuning, query optimization)
This position is open to all candidates.