Location: Maslak, TR
Work mode: On-site
Contract: Full-time
Language: English
Posted: May 6, 2026
Last verified: May 8, 2026
Lucida is teaching the world to speak.
Two billion people are trying to learn a language. Almost all of them are stuck: not because they lack motivation, but because the only thing that actually works (talking to a human tutor) is too expensive, too inconvenient, or too embarrassing.
We're building the alternative: a voice-first AI tutor you can actually have a conversation with, anytime, in your pocket. Real-time. Sub-second. Feels-like-a-person. Already serving a million learners.
We're well-funded, seed-stage, and we're hiring the engineer who'll build the backbone behind that product.
The role
You'll own a meaningful surface of our backend: the systems that turn audio, models, prompts, and user state into a working tutor at scale. Day-to-day, you'll:
- Design and operate the real-time conversational pipeline: streaming services and WebSocket interfaces that keep latency budgets honest at the scale of a million users
- Build and harden the LLM orchestration layer: prompt design as code, structured outputs, streaming, retries, fallbacks, cost control across multiple providers
- Treat prompts as engineering artifacts: versioned, evaluated, regression-tested. Vibes are not a methodology.
- Take open-source models (LLM, ASR, TTS, avatar) from a paper or HF repo and put them on our GPUs: benchmark, optimize, serve, monitor
- Fine-tune and train our own models on top of open-source bases: curate datasets, run training jobs, evaluate against production criteria, and ship the result
- Design event-driven media flows: webhooks, post-session processing, recording and export pipelines
- Own third-party integrations end-to-end: contracts, retries, observability, the boring-important stuff
- Make architecture decisions with the founders, not after them
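For a concrete flavor of the orchestration work above, here is a minimal sketch of streaming with fallback across providers. All provider names and functions are hypothetical stand-ins, not our actual stack:

```python
import asyncio
from typing import AsyncIterator, Callable

# Hypothetical provider adapters: each yields response tokens as they arrive.
async def primary_provider(prompt: str) -> AsyncIterator[str]:
    raise TimeoutError("primary provider timed out")  # simulate an outage
    yield  # unreachable; marks this function as an async generator

async def fallback_provider(prompt: str) -> AsyncIterator[str]:
    for token in ["Bon", "jour", "!"]:
        await asyncio.sleep(0)  # stand-in for network latency
        yield token

async def stream_with_fallback(
    prompt: str,
    providers: list[Callable[[str], AsyncIterator[str]]],
) -> list[str]:
    """Try providers in order; on any failure (timeout, rate limit, ...)
    discard partial output and fall back to the next provider."""
    last_error: Exception | None = None
    for provider in providers:
        tokens: list[str] = []
        try:
            async for token in provider(prompt):
                tokens.append(token)
            return tokens
        except Exception as exc:
            last_error = exc
            continue
    raise RuntimeError("all providers failed") from last_error

tokens = asyncio.run(
    stream_with_fallback("Say hello in French", [primary_provider, fallback_provider])
)
print("".join(tokens))  # → Bonjour!
```

The real pipeline adds the hard parts this sketch skips: per-token latency budgets, backpressure, cost accounting, and deciding when a mid-stream failure should retry versus surface to the learner.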
What we're looking for
- 5+ years writing production Python you're not embarrassed by: typed, tested, readable
- Deep fluency in asyncio and concurrent/streaming code
- Strong command of HTTP, WebSockets, and event-driven systems
- Hands-on experience integrating with LLM APIs in production: streaming, tool use, structured outputs, and the operational realities (rate limits, retries, cost control)
- A real sense of prompt engineering as engineering: you've shipped prompts that survived contact with users, iterated on them with data, and didn't just "feel good in the playground"
- A real fine-tuning / training track record: you've taken an open-source model, prepared the data, run the training, evaluated it honestly, and shipped the result to users. Not a notebook tutorial. A model that moved a metric.
- Experience deploying and serving your own models on GPUs: quantization, batching, KV-cache, latency/throughput tradeoffs
- A debugging instinct for distributed systems at scale: traces, profiling, backpressure, capacity planning
- Comfort with Postgres, Redis, and a queue/broker layer
- Pragmatism: you ship, you measure, you iterate. You don't over-engineer, and you don't under-test.
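"Prompts as engineering artifacts" means something you can pin down in code. A minimal sketch of what we have in mind, with hypothetical names and cases, not our production setup:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PromptTemplate:
    """A versioned prompt artifact: bump the version on every change."""
    name: str
    version: int
    template: str

    def render(self, **kwargs: str) -> str:
        return self.template.format(**kwargs)

TUTOR_PROMPT = PromptTemplate(
    name="tutor_correction",
    version=3,
    template=(
        "You are a {language} tutor. Correct the learner's sentence "
        "and explain the fix in one short line.\nLearner: {sentence}"
    ),
)

# Regression cases pinned against the current version; re-run in CI whenever
# the template changes, alongside model-output evals.
REGRESSION_CASES = [
    {"language": "French", "sentence": "Je suis allé au marché hier."},
    {"language": "Spanish", "sentence": "Yo tengo veinte años."},
]

def check_prompt(prompt: PromptTemplate) -> None:
    for case in REGRESSION_CASES:
        rendered = prompt.render(**case)
        assert case["sentence"] in rendered  # input survives templating
        assert len(rendered) < 500           # stays inside the token budget
        assert "{" not in rendered           # no unfilled placeholders

check_prompt(TUTOR_PROMPT)
```

Template checks like these are the cheap layer; the expensive layer is scoring model outputs against graded examples before a prompt version ships.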
Nice to have
- Real-time media systems (WebRTC, SFU, streaming pipelines)
- Audio or speech model deployment and fine-tuning in production
- Distillation, synthetic data generation, or RLHF/DPO-style alignment work
- Multi-region or multi-cloud infrastructure
- Cost optimization at scale, token economics, GPU utilization, caching strategies
- Open-source contributions
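"Token economics" is less exotic than it sounds: it is back-of-envelope arithmetic done honestly and at scale. A toy illustration, with all prices and token counts invented for the example:

```python
# Illustrative per-token pricing (hypothetical numbers, USD per 1K tokens).
PRICE_PER_1K_INPUT = 0.0005
PRICE_PER_1K_OUTPUT = 0.0015

def session_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost of one tutoring session at the assumed prices."""
    return (
        (input_tokens / 1000) * PRICE_PER_1K_INPUT
        + (output_tokens / 1000) * PRICE_PER_1K_OUTPUT
    )

# A 10-minute session: ~30 turns with a growing context window, so input
# tokens dominate. That asymmetry is why context caching and truncation
# strategies matter at a million users.
cost = session_cost(input_tokens=45_000, output_tokens=6_000)
print(f"${cost:.4f} per session")  # → $0.0315 per session
```

Multiply that by sessions per day per learner and the caching, truncation, and GPU-utilization items above stop being "nice to have" and start being the margin.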