data drives our decisions. Technology is at our core. And innovation is everywhere. But our company is more than datasets, lines of code or A/B tests. Were the thrill of the first night in a new place. The excitement of the next morning. The friends you encounter. The journeys you take. The sights you see. And the memories you make. Through our products, partners and people, we make it easier for everyone to experience the world.
Leadership/Team Quote:
This opening is for the Content Intelligence team within the Central Tech department at Booking.com.
The Content Intelligence team is at the forefront of Generative AI innovation, driving solutions for travel-related chatbots, text generation and summarization applications, Q&A systems, and free-text search. Beyond this, the team is building a cutting-edge platform that processes millions of images and textual inputs daily, enriching them with ML capabilities. These enriched datasets power downstream applications, helping personalize the customer experiencefor example, selecting and displaying the most relevant images and reviews as customers plan and book their next vacation.
Role Description:
As a Data Engineer, youll collaborate with top-notch engineers and data scientists to elevate our platform to the next level and deliver exceptional user experiences. Your primary focus will be on the data engineering aspectsensuring the seamless flow of high-quality, relevant data to train and optimize content models, including GenAI foundation models, supervised fine-tuning, and more.
Youll work closely with teams across the company to ensure the availability of high-quality data from ML platforms, powering decisions across all departments. With access to petabytes of data through MySQL, Snowflake, Cassandra, S3, and other platforms, your challenge will be to ensure that this data is applied even more effectively to support business decisions, train and monitor ML models and improve our products.
Key Job Responsibilities and Duties:
Rapidly developing next-generation scalable, flexible, and high-performance data pipelines.
Dealing with massive textual sources to train GenAI foundation models.
Solving issues with data and data pipelines, prioritizing based on customer impact.
End-to-end ownership of data quality in our core datasets and data pipelines.
Experimenting with new tools and technologies to meet business requirements regarding performance, scaling, and data quality.
Providing tools that improve Data Quality company-wide, specifically for ML scientists.
Providing self-organizing tools that help the analytics community discover data, assess quality, explore usage, and find peers with relevant expertise.
Acting as an intermediary for problems, with both technical and non-technical audiences.
Promote and drive impactful and innovative engineering solutions
Technical, behavioral and interpersonal competence advancement via on-the-job opportunities, experimental projects, hackathons, conferences, and active community participation
Collaborate with multidisciplinary teams: Collaborate with product managers, data scientists, and analysts to understand business requirements and translate them into machine learning solutions. Provide technical guidance and mentorship to junior team members.
דרישות:
Bachelors or masters degree in computer science, Engineering, Statistics, or a related field.
Minimum of 3 years of experience as a Data Engineer or a similar role, with a consistent record of successfully delivering ML/Data solutions.
You have built production data pipelines in the cloud, setting up data-lake and server-less solutions; you have hands-on experience with schema design and data modeling and working with ML scientists and ML engineers to provide production level ML solutions.
You have experience designing systems E2E and knowledge of basic concepts (lb, db, caching, NoSQL, etc)
Strong programming skills in languages such as Python and המשרה מיועדת לנשים ולגברים כאחד.