Role overview from JobGrid
Senior Data Engineer (F/M/D) at AlphaIgnis: Munich, Germany; On-site; Full-time; Lead; IT. JobGrid adds normalized role facts, source context, and a path to the employer application page so candidates can compare the listing before applying.
- Location and workplace: Munich, Germany, On-site
- Role classification: IT, Data Engineer, Full-time, Lead
- Source freshness: checked by JobGrid on 2026-05-28.
- Application path: candidates continue to the employer application page with non-personal referral tags.
The Opportunity
We’re looking for a Senior Data Engineer to architect and scale the data backbone powering next-generation AI models in robotics and real-world environments.
This role sits at the intersection of distributed systems, multimodal data processing, and applied machine learning, with a strong focus on building high-quality datasets for robotic foundation models. You will ensure that data pipelines, infrastructure, and data strategy directly translate into measurable improvements in model performance.
Your Responsibilities
- Drive the model–data loop: connect application requirements to data collection, and translate model failures into targeted data improvements through collection, curation, and augmentation
- Build and scale distributed data pipelines (Ray/Anyscale or similar) for TB-scale video, sensor, and robotics datasets
- Design multimodal data schemas aligning video, actions, and high-frequency sensor streams
- Develop Python tooling for data quality, including cleaning, anomaly detection, and dataset versioning
- Own dataset quality and coverage, including annotation workflows, data diversity, and storage trade-offs
- Lead a small team and coordinate with data providers and annotation vendors
- Oversee real-world data collection, including technical setup, compliance, and secure data handling
Technologies
- Python (advanced, production-grade)
- Ray / Anyscale or Apache Spark
- AWS / GCP for large-scale data and GPU training pipelines
- Video and sensor data formats (H.264/H.265, ROS bags, MCAP)
- PyTorch, NumPy
- DVC, LakeFS or similar data versioning tools
- Distributed data processing and storage systems