Robots & Pencils

Senior Testing Engineer

Location: Canada (Hybrid)
Workplace: Hybrid
Seniority: Lead
Category: IT (QA / Test Automation)
Language: English
Posted: May 14, 2026
Last verified: May 14, 2026

JobGrid listing details

JobGrid.eu keeps the employer description in its original language and adds clear listing facts, freshness, and source context so candidates can evaluate the role before applying.

Key details: 2 locations, IT, QA / Test Automation, Hybrid, Lead
Current openings: 11 active jobs
Original language: English
Source and freshness: Collected from public career pages and reviewed through JobGrid.eu source availability checks. Last verified: May 14, 2026.
Apply path: JobGrid.eu sends candidates to the original application page and adds non-personal referral parameters.

Company Overview 

Robots & Pencils is an applied AI engineering firm building the next frontier of business architecture. We design and ship AI co-workers that integrate into enterprise operations and deliver measurable results for our clients. We’re all in on AWS, combining deep UX capability with senior engineering talent to get AI into production fast and keep it there. We’ve earned the trust of leaders across Consumer Products and Retail, Education, Energy, Financial Services, Healthcare, Manufacturing, and more, and built a reputation as the nimble alternative to traditional global systems integrators. Founded in 2009, with delivery centers in Canada, the United States, Eastern Europe, and Latin America, we are smaller, faster, and more senior by design. Our teams average 15+ years of experience. We move fast, sweat the details, and build things that actually ship.

Position Overview 

We’re looking for a Senior Testing Engineer to join our team and own quality across a cloud-native AI/ML platform built on AWS. This is not a traditional QA position: it is a hands-on engineering role for someone who writes test code across a production Python and AWS stack. In this role, you will evaluate the platform, create a comprehensive test coverage plan, and drive best practices across the team. You’ll propose and implement improvements to our testing infrastructure, modify production code to improve testability when necessary, and work with the broader engineering team to establish patterns that other developers can adopt and carry forward. You will be the authoritative voice on test automation and test engineering on this engagement, working with minimal supervision.

Why This Role Matters 

At Robots & Pencils, we design AI systems for a human world. Our name says it all. Robots and pencils means engineering paired with creativity, because every agent we ship has to work for real people in real workflows. That balance is baked into how we operate. 

This platform delivers AI-powered learning and automation to real users—and its reliability depends on the quality infrastructure you build. You won’t be auditing a test suite. You’ll be building one from the ground up, shaping what “done” means across every layer of the stack: Lambdas, DynamoDB, SQS, event-driven flows, and agentic AI pipelines. When this platform works, people learn better and move faster. That’s what’s on the line. 

What You’ll Do 

Craft & Delivery 

  • Evaluate the platform, produce a thorough test coverage plan, and design a scalable testing architecture for the Python/AWS stack (Lambda, DynamoDB single-table design, SQS, S3, EventBridge, CDK) across unit, integration, E2E, agentic eval, and synthetic learner layers 
  • Write production-grade test code using PyTest and Python-native frameworks, build and maintain agentic evals and synthetic learner pipelines that validate AI-driven workflows end-to-end, and own quality gates in CI/CD pipelines (e.g. GitHub Actions)—modifying production code to improve testability when warranted 
  • Bring an AI-forward mindset to your daily work, using tools like Claude Code and Cursor to ship higher-quality work at pace  
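To make the craft expectations above concrete, here is a minimal PyTest-style sketch of the kind of Python test code this role involves. The helper and key schema are hypothetical, not taken from the actual platform; they illustrate testing a composite-key builder for a DynamoDB single-table design:

```python
# Hypothetical helper under test: builds the composite key for a
# DynamoDB single-table design (PK = entity partition, SK = sortable timestamp).
def build_item_key(entity_type: str, entity_id: str, created_at: str) -> dict:
    if not entity_id:
        raise ValueError("entity_id is required")
    return {
        "PK": f"{entity_type.upper()}#{entity_id}",
        "SK": f"CREATED#{created_at}",
    }


def test_key_is_prefixed_and_sortable():
    key = build_item_key("learner", "42", "2026-05-14T00:00:00Z")
    assert key["PK"] == "LEARNER#42"
    assert key["SK"].startswith("CREATED#")


def test_missing_id_is_rejected():
    try:
        build_item_key("learner", "", "2026-05-14T00:00:00Z")
        raise AssertionError("expected ValueError")
    except ValueError:
        pass  # expected: empty ids must never reach the table
```

PyTest discovers `test_`-prefixed functions and treats bare `assert` statements as checks; the same pattern scales up to the integration and E2E layers named above.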

Collaboration & Communication 

  • Partner with engineering and product leadership to align test strategy with delivery goals and platform architecture decisions 
  • Translate test coverage status, quality risks, and recommended investments into terms technical and non-technical stakeholders can act on 
  • Lead test planning sessions and release readiness assessments, driving clear go/no-go signals across the team 

Leadership & Influence 

  • Establish the testing standards, frameworks, and patterns the broader engineering team adopts and extends, and mentor junior and mid-level engineers on testing practices so that quality ownership spreads across the team rather than centering on any one person 
  • Take ownership of quality end-to-end, including the unglamorous work of stabilizing flaky suites and paying down test debt 
  • Evaluate and introduce emerging tools and methodologies, continuously improving testing quality and velocity without chasing novelty for its own sake 

What You’ll Bring 

  • 5+ years of professional software engineering experience with a strong focus on testing—unit, integration, E2E, and/or AI/ML system testing 
  • Strong Python programming skills; this role writes test code, not just test plans 
  • Hands-on experience with AWS services including Lambda, DynamoDB, SQS, S3, and EventBridge; CDK experience a strong plus 
  • Deep expertise with PyTest and Python-native testing frameworks, with a track record of designing and scaling test automation infrastructure 
  • Experience writing and maintaining E2E and integration tests for event-driven, serverless, or microservices architectures 
  • Familiarity with DynamoDB single-table design and the specific challenges of testing against it 
  • Experience building or validating agentic or LLM-based systems; comfort with evals, output consistency testing, and hallucination/accuracy validation 
  • Strong CI/CD expertise, with experience owning quality gates in delivery pipelines (e.g. GitHub Actions) 
  • Working knowledge of AI safety and responsible AI principles as they apply to validating LLM behavior, prompt injection defenses, and PII handling in test data 
  • Demonstrated ability to work independently, drive architectural recommendations, and deliver with minimal supervision 
  • Demonstrable usage of AI-forward tools such as Claude Code and Cursor 
  • Strong problem-solving skills and sound judgment in ambiguous technical territory 
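The agentic-eval and output-consistency items in the list above can be sketched in a few lines. This is an illustrative pattern only: the model call is a deterministic stub so the sketch runs offline, and every name here is hypothetical rather than part of the real platform:

```python
import json
from collections import Counter


def call_model(prompt: str) -> str:
    # Stand-in for a real LLM call; deterministic here for illustration.
    return json.dumps({"answer": "4", "rationale": "2 + 2 = 4"})


def consistency_rate(prompt: str, runs: int = 5) -> float:
    # Re-run the same prompt and measure how often the structured
    # field we care about agrees with the majority answer.
    answers = [json.loads(call_model(prompt))["answer"] for _ in range(runs)]
    majority_count = Counter(answers).most_common(1)[0][1]
    return majority_count / runs


def test_answer_is_stable():
    # An eval gate: fail the pipeline if consistency drops below a threshold.
    assert consistency_rate("What is 2 + 2?") >= 0.8
```

Against a real model the stub would become an API call and the threshold would be tuned per workflow; the point is that eval gates are ordinary test code that can run inside a CI pipeline.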

 

You’ll Do Well Here if You Are 

  • A doer. You see something broken and fix it. You’d rather move on clarity than wait for certainty. 
  • A fast learner who knows you don’t know everything. The AI landscape changes weekly. You’re senior enough to know better and curious enough to keep learning anyway. 
  • Direct in a way that makes the work better. You give honest feedback. You’d rather have the hard conversation than blow smoke. 
  • Obsessed with craft. You know genius is in the details. You ship exceptional, not perfect, and you don’t put your name on work you wouldn’t stand behind. 
  • Built for ownership. You honor commitments, admit mistakes fast, and back your teammates when a decision costs something. No handoffs, no finger-pointing. 
  • All in. You treat clients’ businesses like your own. You take the work seriously without taking yourself seriously. 
  • Resourceful when the budget, timeline, or team is tight. Constraints don’t slow you down. They sharpen you. 
  • Glad to be in the room with people who care as much as you do. Our teams average fifteen-plus years of experience. We hire people who push each other to do better work. 
