Member of Engineering (Pre-training / Data Engineering), Poolside

Data Scientist

Posted: January 30, 2026
Last seen in crawl: June 25, 2026
Estimated Expiry: March 6, 2026
Field
Role & Management
Role Level: Mid-Level
Management Tier: No People Management
Job Type
Experience
3 years
Required Languages

Job Description

Join Poolside as a core member of the Pretraining Data team, responsible for building and scaling high-performance data pipelines for large-scale model training. Focus on data ingestion, deduplication, and streaming systems at petabyte scale to support AI research and development.

Company Information

Technology
Headcount: 50 - 200

Data shown is based on historical job postings from our database.

Job Details

Responsibilities

  • Build high-performance data pipelines
  • Deliver diverse datasets for training
  • Collaborate with research and engineering teams

Requirements

  • Experience with distributed data systems
  • Knowledge of orchestration tools (Slurm, Airflow, Dagster)
  • Proficiency in Python
  • Experience with GPU clusters and distributed pipelines
  • Knowledge of libraries like Polars, Dask, PySpark

Skills & Technologies

Distributed data systems, Python, GPU clusters, Polars, Dask, PySpark

Education Level

Bachelor

Recruitment Process

  1. Intro call
  2. Technical interview
  3. Team fit interview
  4. Final interview