We are looking for a skilled and motivated Software Engineer to join our backend data infrastructure team. You will be at the core of our data ecosystem, building and maintaining high-performance data services and pipelines that support both real-time and batch workloads. Your work will directly impact how data is accessed and leveraged across the company, from live production environments to ML training pipelines. You will design and maintain systems that span hybrid infrastructure (on-prem and cloud) and ensure our data platform is fast, reliable, and scalable. We value engineers who are curious, open-minded, and excited to learn new technologies and practices as the landscape evolves.
As a Big Data Engineer, you will:
Design, implement, and maintain backend services for managing and processing large-scale data.
Build and operate production-grade data pipelines and infrastructure.
Develop utilities, libraries, and services to support high-throughput data retrieval and access patterns.
Ensure observability, stability, and performance of data services in hybrid (cloud/on-prem) environments.
Monitor and troubleshoot issues in live systems and continuously improve their robustness.
Work cross-functionally to ensure data is accessible, well-modeled, and easy to consume by other teams.
Requirements:
Strong programming experience in at least one of the following: C++, Java, Rust, .NET, or Python.
Experience working with Python data analytics libraries (such as NumPy, pandas, Polars).
Experience working on backend services or data-intensive applications.
Understanding of distributed systems, data pipelines, and production monitoring.
Experience in hybrid infrastructure environments (on-prem + cloud).
An open-minded technologist with a willingness to learn and adopt new technologies and best practices.
Nice to Have:
Familiarity with Apache Iceberg or other table/data format technologies (e.g., Delta Lake, Hudi, Parquet, ORC).
Familiarity with streaming technologies (e.g., Kafka, Flink).
Experience with orchestration tools like Airflow or Argo.
Exposure to analytics engines (e.g., Spark, DuckDB, Trino).
Knowledge of Kubernetes and containerized deployments.
Experience in MLOps or supporting machine learning pipelines.
This position is open to all candidates.