Modal

Member of Technical Staff - ML Performance

🇺🇸 New York, United States; New York City, United States · On-site · Technology · Senior · Posted Apr 21, 2026

Work mode: On-site

Seniority: Senior

Category: Technology

IT category: Data Science and ML

Language: English

Posted: April 21, 2026

Last verified: May 27, 2026

Where this position is available

2 locations

United States

New York, United States
New York City, United States

JobGrid context

Job summary by JobGrid

Member of Technical Staff - ML Performance at Modal: New York, United States; New York City, United States; On-site; Senior; Technology; Data Science and ML. JobGrid adds normalized role facts, source context, and a path to the employer application page so candidates can compare the listing before applying.

Location and workplace: New York, United States; New York City, United States; On-site
Role classification: Technology, Data Science and ML, Senior
Source freshness: checked by JobGrid on 2026-05-27.
Application path: candidates continue to the employer application page with non-personal referral tags.

About Us:

Modal provides the infrastructure foundation for AI teams. With instant GPU access, sub-second container startups, and native storage, Modal makes it simple to train models, run batch jobs, and serve low-latency inference. We have thousands of customers who rely on us for production AI workloads, including Lovable, Scale AI, Substack, and Suno.

We're a fast-growing team based out of NYC, SF, and Stockholm. We've hit 9-figure ARR and recently raised a Series B at a $1.1B valuation. Our investors include Lux Capital, Redpoint Ventures, Amplify Partners, and Elad Gil.

Working at Modal means joining one of the fastest-growing AI infrastructure organizations at an early stage, with many opportunities to grow within the company. Our team includes creators of popular open-source projects (e.g. Seaborn, Luigi), academic researchers, international olympiad medalists, and engineering and product leaders with decades of experience.

The Role

We are looking for strong engineers with experience in making ML systems performant at scale. If you are interested in contributing to open-source projects and Modal’s container runtime to push language and diffusion models towards higher throughput and lower latency, we’d love to hear from you!

Requirements

5+ years of experience writing high-quality, high-performance code.
Experience working with torch, high-level ML frameworks, and inference engines (vLLM or TensorRT).
Familiarity with Nvidia GPU architecture and CUDA.
Experience with ML performance engineering (tell us a story about boosting GPU performance: debugging SM occupancy issues, rewriting an algorithm to be compute-bound, eliminating host overhead, etc.).
Nice-to-have: familiarity with low-level operating system foundations (Linux kernel, file systems, containers, etc.).