the differentiator

The stack.

Most AI agencies resell Vapi or Bland and pay a per-minute platform tax on every client call. Ascero AI owns the stack from telephony to memory — open-source, multi-model, multi-cloud. The result: flat-rate client pricing, no single-vendor lock-in, and a margin curve that improves with scale instead of getting worse.

Below is the actual architecture, with every component named and linked. No black boxes.

Voice layer
01

Owned voice receptionist — Pipecat + LiveKit on Twilio

Ascero AI runs Pipecat (open-source real-time voice orchestration) on LiveKit (WebRTC transport), with Twilio handling the PSTN interconnect. Your receptionist runs on our infrastructure, not on someone else's billing meter, so margins improve as call volume grows.

Why this matters: Per-minute platform fees rise in lockstep with call volume, so a reseller's costs climb exactly as its clients succeed. Owning the voice stack lets Ascero AI offer flat-rate pricing instead of usage-meter pricing, which the research shows is the #1 SMB conversion lift in the AI receptionist category.
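
To make the pricing argument concrete, here is a minimal sketch of the arithmetic. Every figure in it is a hypothetical placeholder chosen for illustration, not an Ascero AI price or any vendor's actual rate.

```python
def per_minute_cost(minutes: float, rate_per_minute: float) -> float:
    """Monthly cost under usage-meter pricing."""
    return minutes * rate_per_minute


def break_even_minutes(flat_monthly_fee: float, rate_per_minute: float) -> float:
    """Call volume above which a flat fee is cheaper than the meter."""
    if rate_per_minute <= 0:
        raise ValueError("rate_per_minute must be positive")
    return flat_monthly_fee / rate_per_minute


# Hypothetical figures for illustration only.
FLAT_FEE = 300.0    # assumed flat monthly price, USD
METER_RATE = 0.15   # assumed per-minute platform rate, USD

if __name__ == "__main__":
    for minutes in (500, 2000, 5000):
        metered = per_minute_cost(minutes, METER_RATE)
        print(f"{minutes} min: metered ${metered:.2f} vs flat ${FLAT_FEE:.2f}")
    print(f"break-even near {break_even_minutes(FLAT_FEE, METER_RATE):.0f} minutes")
```

Under these assumed numbers the metered bill overtakes the flat fee once monthly volume passes the break-even point, and the gap keeps widening from there.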

Reasoning layer
02

Multi-model routing across Anthropic, OpenAI, Google

Every model has a different shape. Claude is the default for nuanced reasoning, multi-step planning, and the AskSummit chat agent. GPT-realtime carries the voice receptionist's conversational layer. Gemini handles long-context document reasoning and Google Workspace integration. Switching the model is a config change, not a rewrite.

Why this matters: Single-model agencies are downstream of one vendor's pricing, latency, and policy decisions. Multi-model routing keeps Ascero AI insulated from any single provider and lets each task land on the model that does it best.
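
As a rough illustration of "a config change, not a rewrite", the sketch below keeps the task-to-model mapping in a plain table. The task names, the `Route` type, and the model identifier strings are all hypothetical placeholders, not Ascero AI's actual configuration or real provider model IDs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    provider: str
    model: str  # placeholder identifier, not a real model ID


# Hypothetical routing table mirroring the split described above.
ROUTES: dict[str, Route] = {
    "reasoning": Route("anthropic", "claude-placeholder"),
    "voice": Route("openai", "gpt-realtime-placeholder"),
    "long_context": Route("google", "gemini-placeholder"),
}
DEFAULT_TASK = "reasoning"


def pick_route(task: str, routes: dict[str, Route] = ROUTES) -> Route:
    """Return the route for a task, falling back to the default task."""
    return routes.get(task, routes[DEFAULT_TASK])


if __name__ == "__main__":
    print(pick_route("voice"))
    # Moving voice to another provider only touches the table.
    swapped = {**ROUTES, "voice": Route("google", "gemini-placeholder")}
    print(pick_route("voice", swapped))
```

Because callers only ever ask `pick_route` for a task name, moving a task to a different provider means editing one table entry.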

Browser + research layer
03

Browser-native agents — Stagehand + browser-use

Half of small-business work happens inside web portals: insurance carriers, MLS listings, scheduling tools, e-commerce admins. Stagehand v3 (Browserbase) handles deterministic, reliability-critical flows like booking and checkout. browser-use handles exploratory research and data-entry tasks via accessibility-tree navigation, with no XPath and no brittle selectors. Together they cover any task where a human currently logs into a portal.

Why this matters: Outsourced VAs cost $1,200-$3,000/mo per worker and break the moment a portal redesigns. Browser-native agents are deterministic where it matters (checkout, booking) and resilient where it doesn't (research). Ascero AI productizes this as a deployable tier.
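
The split between the two tools can be pictured as a small dispatcher. This is a sketch of the routing decision only: the task kinds and the `choose_engine` helper are hypothetical, and it deliberately calls neither Stagehand nor browser-use.

```python
from enum import Enum


class Engine(Enum):
    STAGEHAND = "stagehand"
    BROWSER_USE = "browser-use"


# Hypothetical task kinds, grouped the way the text above describes them.
DETERMINISTIC_TASKS = {"booking", "checkout"}
EXPLORATORY_TASKS = {"research", "data_entry"}


def choose_engine(task_kind: str) -> Engine:
    """Pick the engine for a task kind; reject anything unclassified."""
    if task_kind in DETERMINISTIC_TASKS:
        return Engine.STAGEHAND
    if task_kind in EXPLORATORY_TASKS:
        return Engine.BROWSER_USE
    raise ValueError(f"unknown task kind: {task_kind!r}")


if __name__ == "__main__":
    for kind in ("checkout", "research"):
        print(kind, "->", choose_engine(kind).value)
```

Raising on unknown task kinds, rather than guessing, keeps a reliability-critical flow from silently landing on the exploratory engine.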

Memory + ops layer
04

Persistent agency memory — claude-mem + multica

Every Ascero AI client engagement gets its own persistent memory store. The chat agents, the receptionist, and the workflows all share context across sessions instead of cold-starting every conversation. multica runs the internal agency ops (content drafts, outbound campaigns, build tasks), assigning work to agents the way a Linear board assigns it to humans.

Why this matters: Most AI agencies lose context between sessions and re-explain every project from scratch. Persistent memory is the difference between an agency that gets better over time and one that re-introduces itself every Monday.
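
The idea of one store per client, shared by every agent, can be sketched in a few lines. This is not claude-mem's API: the `ClientMemory` class, the JSON-file layout, and the field names are hypothetical stand-ins used only to show the shape of the pattern.

```python
import json
import tempfile
from pathlib import Path


class ClientMemory:
    """Hypothetical per-client store: one JSON file per client ID."""

    def __init__(self, root: Path, client_id: str) -> None:
        self._path = root / f"{client_id}.json"

    def load(self) -> list[dict]:
        """Return all remembered entries, or an empty list for a new client."""
        if not self._path.exists():
            return []
        return json.loads(self._path.read_text(encoding="utf-8"))

    def remember(self, agent: str, note: str) -> None:
        """Append a note, tagged with the agent that wrote it."""
        entries = self.load()
        entries.append({"agent": agent, "note": note})
        self._path.parent.mkdir(parents=True, exist_ok=True)
        self._path.write_text(json.dumps(entries, indent=2), encoding="utf-8")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # The receptionist writes a note in one session...
        ClientMemory(root, "acme").remember("receptionist", "Prefers morning calls")
        # ...and the chat agent reads it back in another.
        print(ClientMemory(root, "acme").load())
```

A production store would need locking, validation of client IDs, and retention rules; the sketch only shows that a later session starts from what earlier sessions wrote.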

Tool + skill layer
05

Skill marketplaces — anthropics/skills + curated agency stack

Ascero AI ships every client engagement with a curated skill set: official Anthropic skills (document generators, code skills) plus a Summit-specific layer (vertical-tuned receptionist scripts, lead-qualification logic, ROI math templates). Skills compose like Lego bricks — every project starts further down the cost curve than the last.

Why this matters: Building from scratch on every engagement is the agency anti-pattern. A curated skill marketplace plus persistent memory means month-12 engagements take 30% of the labor that month-1 engagements did.
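
One way to picture skills composing "like Lego bricks" is as small functions that each take a context and return an enriched one. The sketch below is an assumption about the pattern, not the anthropics/skills format; the two example skills, their field names, and the budget threshold are hypothetical.

```python
from typing import Callable

Skill = Callable[[dict], dict]


def compose(*skills: Skill) -> Skill:
    """Chain skills left to right into a single reusable skill."""
    def pipeline(context: dict) -> dict:
        for skill in skills:
            context = skill(context)
        return context
    return pipeline


def qualify_lead(context: dict) -> dict:
    """Hypothetical skill: flag leads at or above an assumed budget floor."""
    return {**context, "qualified": context.get("budget", 0) >= 1000}


def estimate_roi(context: dict) -> dict:
    """Hypothetical skill: monthly savings from hours saved."""
    savings = context.get("hours_saved", 0) * context.get("hourly_rate", 0)
    return {**context, "monthly_savings": savings}


if __name__ == "__main__":
    intake = compose(qualify_lead, estimate_roi)
    print(intake({"budget": 1500, "hours_saved": 10, "hourly_rate": 40}))
```

Because `compose` returns another skill, a pipeline built for one engagement can itself be dropped into the next one as a single brick.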

the thesis

Every other AI agency is downstream of someone else's billing meter.

Ascero AI owns the voice stack, the browser stack, and the memory layer. Open source where it counts. Multi-model where it matters. Self-hostable when a client demands it.

That's the moat, and it's why Ascero AI can sell flat-rate while the rest of the field sells per-minute.

Book a stack walkthrough →

last reviewed ·

Written and maintained by Kadin Nestler and Jaiden Lawlor, the two co-founders of Ascero AI.

Find a mistake? Email us and we'll update the page within 48 hours.