nectar-social

Senior Site Reliability Engineer

Location Palo Alto
Work mode On-site
Language English
Posted May 30, 2026
Last verified May 30, 2026

About Us

We're living through a fundamental shift in how people discover, evaluate, and purchase products. The next generation doesn't respond to traditional marketing -- they build relationships with brands through authentic social interactions, seek recommendations from communities they trust, and expect personalized experiences that feel human, not corporate.

At Nectar Social, we're building the AI-native social operating system that enables this new era of commerce. We believe every social interaction should deepen the relationship between brands and their communities while creating genuine value for both sides.

Founded by ex-Meta product and engineering leaders, we've raised over $30M in total capital from investors including GV and True Ventures. We work with brands like Oura Health, Caraway, e.l.f. Cosmetics, Kosas, OLIPOP, and many more. We're building the future of social commerce -- where community, conversation, and commerce converge.

The Role

We're looking for a Senior Site Reliability Engineer to own the reliability, scalability, and operational excellence of the production systems that power Nectar's platform. We run high-volume data ingestion pipelines and real-time AI agents on top of a fast-growing customer base, and we need a seasoned SRE to help us scale these systems safely and keep them running flawlessly.

As one of our first dedicated SREs, you'll have outsized impact and ownership. You'll define how we measure, operate, and harden our infrastructure -- establishing the reliability foundations that the rest of the engineering team builds on as we scale.

What You'll Be Doing

  • Own the reliability and scalability of our production systems as they handle rapidly growing volumes of social data and real-time AI workloads

  • Define and drive SLOs, SLIs, and error budgets, and build the observability, alerting, and on-call practices to support them

  • Lead incident response and blameless postmortems, then turn what we learn into systemic improvements that prevent recurrence

  • Improve performance, cost efficiency, and capacity planning across our cloud infrastructure as the platform scales

  • Harden our infrastructure-as-code, deployment, and CI/CD pipelines for resilience and repeatability

  • Partner with engineering teams to embed reliability into system design and raise the operational bar across the org

What We're Looking For

  • 5+ years of experience operating production systems as an SRE or as an infrastructure or platform engineer

  • Experience scaling databases, data infrastructure, or complex production platforms under significant load

  • Hands-on expertise with cloud infrastructure (AWS or similar) and infrastructure-as-code tooling

  • Solid programming skills for building automation, tooling, and operational services

  • Comfort operating in fast-moving startup environments with high ownership and autonomy

  • A reliability-first mindset balanced with pragmatism about velocity and cost

Bonus Points

  • Experience standing up or maturing an SRE practice at an early-stage or rapidly scaling company

  • Familiarity with our tech stack: AWS, Pulumi, Postgres, ClickHouse, Turbopuffer, or Temporal

  • Background in capacity planning, performance engineering, or cost optimization at scale

What We Offer

  • Competitive compensation and early equity

  • Health, vision, and dental benefits + 401(k) match

  • Clear career growth opportunities as the company scales

  • Free lunch in the heart of University Ave. in Palo Alto

  • Deep exposure to cutting-edge AI tooling and the opportunity to shape how brands use it

  • A collaborative, ambitious team defining a new category of AI-native marketing infrastructure