Posts tagged "mcp-tooling"

2 posts · all topics

Asking the test authoring agent what tools it wanted

There's a growing recognition in the industry that designing for agents-as-users is a different problem than designing for human users. We've been working through this on my team for a while, and I think it's worth naming what we've actually been doing.

When I was scoping the recent refactor of our test authoring agent's tool surface, I had the agent itself analyze its own tools and write me a report — what surprised it, what felt redundant, what it wanted that wasn't there. That report shaped the tool list in a recent upgrade that made the authoring agent a lot more capable. My teammate, Anja, did something similar with the new results analysis tool, asking a coding agent to explain why it kept reaching for one tool over another, so the design would hold up through MCP.

The pattern across both: when you're building something for an agent to use, the agent itself is the closest user-research subject you have. I want to reach for this approach earlier next time.

mabl remote MCP catching a deployment-breaking selector regression while my PR was still open.

I rewrote the workspace menu on a branch this weekend, swapping a Bootstrap dropdown for a Material-UI button, and the mabl remote MCP told me — in the PR, before I'd merged anything — that I'd just broken every mabl-on-mabl test that goes through login. The agent walked the chain: get_mabl_deployment surfaced the failing deploy preview runs, analyze_failure produced the root-cause synopsis, and get_mabl_test_details located the offending step. Our "App - Login" reusable flow asserts on a specific selector to confirm the workspace dropdown is present. My rewrite replaced that control with one mabl's intelligent find recognizes natively — so the right fix is to delete the brittle assertion and let intelligent find do its job. The agent proposed both: a one-line backwards-compat id to unblock the PR today, and a follow-up to clean up the e2e flow at leisure. Ten seconds in context. I'd have caught this at deploy time without the MCP; I caught it while I was still in the diff.