We are seeking an experienced Software Engineer to become an integral member of our Data-Core team, whose mission is to process, digitize, and analyze hundreds of millions of data sources. Your role will be pivotal in creating a unified, up-to-date, and accurate utilities map, along with the services and tools that accelerate our mapping operations. Your contributions will directly impact our core product's success.
Your Mission
Collaborate with cross-functional teams to design, build, and maintain data processing pipelines, ensuring efficient data integration, transformation, and loading.
Implement geospatial data processing techniques and contribute to the visualization of data on our unified map, enhancing the product's geospatial features.
Drive the scalability and performance optimization of data systems, addressing infrastructure challenges as data volume and complexity grow.
Develop and optimize complex Python-based code for data cleansing, validation, and standardization, enhancing the quality and accuracy of datasets.
Create and manage data infrastructure components, including ETL workflows, data warehouses, and databases, supporting seamless data flow and accessibility.
Collaborate in designing and implementing data architecture, ensuring effective data storage, retrieval, and security.
Design and implement CI/CD processes for data processing, model training, releasing, and monitoring, ensuring robustness and consistency.
Requirements
7+ years of proven experience as a backend/software engineer.
2+ years of proven experience as a Data Infrastructure Engineer or in a similar role, managing and processing large-scale datasets.
Proven experience in building scalable online services.
Experience with technologies such as Kafka, data lakes, Airflow, Docker, and Kubernetes (K8s) to build data processing and exploration pipelines, along with ML infrastructure to power our intelligence.
Experience in AWS/Google Cloud environments, writing in Python/Java/Scala/Go.
Experience in deploying a diverse range of cloud-based technologies to support mission-critical projects, including expertise in writing, testing, and deploying code within a Kubernetes environment.
Experience working with both SQL and NoSQL databases, such as Postgres, MySQL, Redis, or DynamoDB.
This position is open to all candidates.