Callosum

Cloud Systems & Resource Orchestration - Member of Technical Staff

🇬🇧 London, United Kingdom
Workplace: On-site
Category: IT
IT category: DevOps / SRE
Language: English
Published: May 20, 2026
Last checked: June 3, 2026

About Us

Artificial intelligence scaled on a bet: that bigger models, more identical chips, and more data would keep delivering. As problems grow more complex and the requirements of intelligence more diverse, that bet is breaking down. The next era belongs to heterogeneous intelligence: diverse models on diverse chips, each with distinct strengths, co-evolving into systems of capability unreachable by any single model or accelerator.

Callosum is the Intelligent Systems company. We built the infrastructure to make that possible. Our co-evolution engine optimises simultaneously across workflows, agents, and silicon. We launched in early 2026, showing orders-of-magnitude improvements in performance and a shift in the cost-performance frontier that no single chip or model provider can match.

We believe intelligence comes from the system, not the model.

We are scientists and engineers solving what others consider impossible. If you thrive on hard problems and are energised by the scale of the challenge, we'd love to hear from you.

About the Role

Callosum believes that orders-of-magnitude improvements in AI systems will come through application-aware orchestration across heterogeneous hardware. We are building that vision: infrastructure that treats the full landscape of compute as a unified, co-evolving system that extends beyond GPUs. Current orchestration stacks were built for a homogeneous world; they are naive to the strengths of new chips and blind to the demands of modern multi-agent workflows.

This role defines how Callosum addresses this problem at the cloud and cluster level, transforming a fragmented compute ecosystem into a unified, exploitable resource pool. We are building a new paradigm of orchestration that understands accelerator-specific constraints and capabilities. Your work is what makes heterogeneous compute intelligent at scale: every chip placed precisely and allocated efficiently in a stack that is resource-aware and diversity-native.

What You’ll Build

  • Design and build multi-cloud orchestration systems that abstract provider-specific differences behind a unified deployment and scheduling layer

  • Extend Kubernetes, particularly Dynamic Resource Allocation (DRA), to be aware of heterogeneous accelerator topologies and capabilities and of multi-agent AI workflows

  • Implement intelligent load balancing and placement strategies across cloud providers, regions, and hardware types

  • Build control plane systems that enable efficient allocation and management of heterogeneous accelerator capacity while preserving the ability to exploit hardware-specific strengths

  • Collaborate with an Accelerator Systems Software engineer to surface low-level scheduling primitives into the orchestration layer

What You Bring

  • Strong experience with Kubernetes internals: custom controllers, schedulers, device plugins, CRDs, and the DRA framework

  • You've built or operated multi-cloud infrastructure and have a detailed understanding of the networking, storage, and compute differences between major providers

  • Familiarity with GPU/accelerator resource management in cluster environments (e.g. MIG, time-slicing, device plugins, topology-aware scheduling)

  • Experience with infrastructure-as-code, fleet management, and the reliability engineering required to keep large-scale heterogeneous systems running

What We Offer

  • Competitive Salary, determined by skills and experience

  • Equity & Ownership

  • Private healthcare

  • We offer visa sponsorship and relocation benefits to hire the best in the world

  • We work in person at our London office. You'll have the tools, space and setup to do your best work, and if you have specific needs, just tell us

We're committed to building an inclusive workplace where everyone feels welcome, and believe in equal opportunities for all.