clera

Staff Engineer — Agentic AI

🇺🇸 San Francisco, United States On-site IT Lead Posted Jun 7, 2026

Location San Francisco, United States

Workplace On-site

Seniority Lead

Category IT

IT Category Data Science & ML

Language English

Posted June 7, 2026

Last verified June 9, 2026

About the Role

We're hiring a senior technical leader to own the core agent intelligence that turns engineers' intent into reliable, cost-efficient multi-step workflows across desktop engineering tools. This role sits at the intersection of applied agentic AI, user research, and product delivery and will determine the product's real-world value to enterprise customers.

You'll report to the CTO and serve as technical lead for a small team of AI engineers, a user researcher, and domain-expert contractors in an early-stage, high-impact environment (Series A, Fortune 100 customers, direct line to leadership).

What You'll Do

Lead development of the core agent intelligence layer that executes multi-step workflows across complex desktop engineering software.
Own the full product loop: define agent capabilities from user stories, build implementations, and benchmark against real workflows.
Drive agent task success rate by defining evaluation frameworks, establishing baselines, and iterating to improve completion metrics.
Set and enforce per-task token budgets and track cost per completed workflow to ensure commercial viability.
Build rigorous, reproducible evaluation infrastructure grounded in validated user stories.
Lead user story mapping and validation through interviews and close collaboration with domain experts.
Translate validated user stories into testable evals and close the loop between research and benchmarking.
Own agent architecture decisions including tool-calling, state management, error recovery, model routing, and context management.
Act as a player-coach: write production code, review designs, unblock the team, and raise engineering standards.
Collaborate cross-functionally with integrations, product, and customers during POCs to align agent behavior with real-world usage.

What We're Looking For

7+ years in software engineering, including at least 2 years building LLM-based agents that act in the real world.
Deep experience designing LLM application architectures, including model selection, context-window management, retrieval, and orchestration patterns.
Proven ability to build evaluation and benchmarking frameworks measuring task completion, cost efficiency, and failure modes.
Technical leadership experience setting direction for small teams (3–6 engineers) and performing meaningful code review.
Strong Python skills and familiarity with LLM tooling (function calling, tool APIs, observability/tracing, evaluation frameworks).
Experience with desktop automation or programmatic control of applications (COM or similar).
Nice to have: Domain experience in mechanical engineering, CAD/CAE, PLM, or adjacent industries.
Nice to have: Understanding of enterprise deployment constraints on locked-down corporate workstations.
Nice to have: Track record contributing to public benchmarks, publications, or open-source agentic AI projects.