    AI receptionist: how it works, what it costs, and when a small business needs one

    An AI receptionist answers your calls, takes messages, books appointments, and routes enquiries automatically, around the clock. This guide covers the full technical stack (Twilio, VAPI, ElevenLabs, n8n, RAG), the cloud-versus-on-prem question, what it costs, and the honest case for when it is not worth building.

    13 June 2026
    10 min read

    An AI receptionist is a voice or chat agent that handles your inbound calls, texts, or web chats automatically. It answers questions about your business, takes a message, books an appointment, or transfers the caller to the right person. It does this 24 hours a day without anyone picking up the phone.

    I run KlarifAi, an AI consultancy based in Sunderland. We build these systems for UK businesses, from manufacturers to professional services. This post covers the full technical picture, including the tools, the architecture, and what to expect when you build or commission one.

    What an AI receptionist actually is

    The term covers a range of setups. At the simple end, it is a chatbot on your website that answers common questions. At the more capable end, it is a voice agent that picks up your phone, holds a natural conversation in real time, retrieves accurate information about your specific business, and logs the outcome somewhere useful.

    The use cases that justify the build cost for a UK SME are usually:

    • Inbound calls going unanswered outside office hours
    • Staff spending a large part of their day fielding the same five questions
    • Appointment or booking requests coming in by phone that need to be logged manually
    • A small team (2 to 10 people) where nobody has a dedicated front-desk role

    Manufacturers and trade businesses are a strong fit. A 15-person fabrication shop cannot have someone on reception all day. An AI agent can answer calls about lead times, delivery status, and contact details while the team is on the floor.

    How an AI receptionist works

    Five layers, each with a clear job. Here is the full flow:

    How an AI receptionist handles a call

    • Incoming call / SMS / web chat: a customer dials your number, sends a text, or starts a chat widget.
    • Telephony layer (Twilio): manages the real phone number and routes the inbound call via a webhook to your AI.
    • Voice layer (VAPI or ElevenLabs): hears the caller, speaks the reply, and keeps the conversation flowing. VAPI is a full voice AI platform with STT + LLM + TTS in one loop. ElevenLabs is a premium TTS voice that you pair with Deepgram or Whisper for STT.
    • AI brain / orchestration (n8n or custom code): decides what to do with the request. Calls tools, queries knowledge, routes to a human if needed. Runs on n8n (cloud or self-hosted VPS) or custom software on your own server.
    • RAG + vector DB: retrieves your FAQs, pricing, and products. Options: Pinecone, Weaviate, Supabase pgvector, Chroma.
    • Storage layer: logs the call and saves the enquiry. Cloud (Supabase, Airtable) or on-prem (PostgreSQL on VPS).
    • Response to caller + optional follow-ups: speaks the answer back. Optionally sends an SMS confirmation, creates a CRM record, books a calendar slot, or alerts your team.

    Each layer is a separate piece of software or service. The diagram shows how they connect. The sections below explain what each one does and the specific tools used.

    The technology stack explained

    You do not need to be a developer to understand this. Each piece has a specific job, and knowing which tool does what makes it easier to have an informed conversation with whoever builds it for you.

    Twilio

    Phone number and call routing

    Twilio provides the actual UK phone number. When someone calls it, Twilio fires a webhook to your AI system. It handles call routing, recording consent prompts, and SMS fallback. Cost is roughly £1 per month per number plus per-minute usage. You can also buy a number directly inside VAPI if you want to keep everything in one place.

    twilio.com
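
    To make the webhook step concrete, here is a minimal sketch of what the receiving end could look like. It assumes Twilio posts a form-encoded body that includes the caller's number as From, and that the reply is TwiML (Twilio's XML instruction format) using the Say, Dial, and Record verbs. The function name, the office_open flag, and the phone numbers are placeholders for illustration; in the stack described here the webhook would more likely point at VAPI or the orchestration layer than at a hand-written handler.

```python
from typing import Tuple
from urllib.parse import parse_qs
from xml.etree.ElementTree import Element, SubElement, tostring


def handle_incoming_call(form_body: str, office_open: bool, team_number: str) -> Tuple[str, str]:
    """Read a form-encoded webhook body and return (caller, TwiML reply)."""
    params = parse_qs(form_body)
    caller = params.get("From", ["unknown"])[0]  # caller ID as sent in the webhook

    response = Element("Response")
    say = SubElement(response, "Say", language="en-GB")
    if office_open:
        # During office hours: greet the caller, then forward to a person.
        say.text = "Thanks for calling. Connecting you now."
        SubElement(response, "Dial").text = team_number
    else:
        # Out of hours: take a message rather than ring an empty office.
        say.text = "We are closed right now. Please leave a message after the tone."
        SubElement(response, "Record", maxLength="120")

    return caller, tostring(response, encoding="unicode")


# Placeholder numbers, not real lines.
caller, twiml = handle_incoming_call(
    "From=%2B441914980000&To=%2B441914980001", office_open=False, team_number="+441914980123"
)
```

    The same pattern extends to SMS fallback: the webhook receives the event, and the reply tells the telephony layer what to do next.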

    VAPI

    Voice AI platform (STT + LLM + TTS in one loop)

    VAPI manages the real-time conversation loop: it transcribes the caller's speech (speech-to-text), sends the text to a language model, and converts the reply into natural-sounding speech (text-to-speech). It handles the latency engineering that makes phone conversations feel natural rather than robotic. This is the most complete off-the-shelf solution for an AI phone agent.

    vapi.ai
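
    The conversation loop described above can be sketched in a few lines. This is not VAPI's API: the three callables are stand-ins for whichever speech-to-text, language model, and text-to-speech providers are plugged in, and the fake implementations exist only so the example runs. A real platform also streams audio and handles interruptions, which this sketch leaves out.

```python
from typing import Callable, List, Tuple

History = List[Tuple[str, str]]  # (speaker, text) pairs for the call so far


def run_turn(
    audio_in: bytes,
    history: History,
    stt: Callable[[bytes], str],
    llm: Callable[[History], str],
    tts: Callable[[str], bytes],
) -> bytes:
    """One caller turn: transcribe, generate a reply, synthesise speech."""
    text = stt(audio_in)                # speech-to-text
    history.append(("caller", text))
    answer = llm(history)               # the model sees the whole conversation
    history.append(("agent", answer))
    return tts(answer)                  # text-to-speech


# Fake providers so the sketch is runnable without any external service.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(history: History) -> str:
    return f"You asked: {history[-1][1]}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


history: History = []
audio_out = run_turn(b"What time do you open?", history, fake_stt, fake_llm, fake_tts)
```

    Keeping the three stages behind plain function signatures is also what makes it possible to swap ElevenLabs in for the TTS stage later.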

    ElevenLabs

    Premium text-to-speech voice engine

    ElevenLabs produces some of the most realistic AI voices available. It is primarily a TTS engine, so for a full phone agent you pair it with a speech-to-text provider (Deepgram and OpenAI Whisper are the common choices). Use ElevenLabs when voice quality is a priority and you are comfortable assembling the stack yourself or working with a developer to do it.

    elevenlabs.io

    n8n (or custom code)

    Orchestration — the brain that decides what to do

    n8n is an open-source workflow tool that connects all the pieces. When VAPI transcribes a caller's question, n8n receives it, queries the knowledge base, calls any relevant APIs (calendar, CRM, stock system), and sends a response back. n8n can run on n8n.cloud or self-hosted on a VPS for full data control. For more complex or performance-sensitive setups, the orchestration layer is built as custom Python or Node.js code and hosted on a VPS or dedicated server.

    n8n.io
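
    For readers who prefer code to workflow diagrams, this is roughly the decision the orchestration layer makes on each turn, written as the custom Python variant. The intent names, keyword lists, and canned replies are all hypothetical; a production build would more likely ask the language model to classify the request and would call real calendar or CRM APIs inside the handlers.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Outcome:
    intent: str
    reply: str
    handoff: bool = False  # True when a person should take over


# Hypothetical keyword lists, checked in this order.
INTENT_KEYWORDS = {
    "booking": ("book", "appointment", "schedule"),
    "order_status": ("delivery", "order", "lead time"),
    "human": ("complaint", "manager", "speak to someone"),
}


def classify(transcript: str) -> str:
    """Pick the first intent whose keywords appear in the transcript."""
    text = transcript.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(word in text for word in keywords):
            return intent
    return "faq"  # default: answer from the knowledge base


def route(transcript: str, handlers: Dict[str, Callable[[str], str]]) -> Outcome:
    """Send the request to the matching handler, or hand off to a human."""
    intent = classify(transcript)
    if intent == "human":
        return Outcome(intent, "Let me put you through to the team.", handoff=True)
    return Outcome(intent, handlers[intent](transcript))


# Placeholder handlers; real ones would call the calendar, CRM, or knowledge base.
HANDLERS: Dict[str, Callable[[str], str]] = {
    "booking": lambda t: "I can book that in. Which day suits you?",
    "order_status": lambda t: "Let me check that order for you.",
    "faq": lambda t: "Here is what I have on that.",
}

outcome = route("Can I book an appointment for Tuesday?", HANDLERS)
```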

    RAG and vector database

    Knowledge retrieval — how the AI knows your business

    RAG (Retrieval Augmented Generation) lets the AI search your business knowledge before answering. You feed the vector database with your FAQs, product catalogue, pricing, opening hours, and policies. When a caller asks 'do you have X in stock?' or 'what are your opening hours?', the AI retrieves the relevant chunk of your data and uses it to answer accurately. Common vector DB choices: Pinecone (managed cloud), Weaviate (cloud or self-hosted), Supabase pgvector (simple and cheap), or Chroma (self-hosted, easy to run locally).
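
    The retrieval step can be illustrated without any external service. The sketch below swaps in simple word-count vectors and cosine similarity in place of a real embedding model and vector database, so it only matches on shared words, but the shape is the same: turn the question into a vector, rank the stored chunks, and pass the best match to the model as context. The business facts in CHUNKS are invented for the example.

```python
import math
import re
from collections import Counter
from typing import List

# Invented knowledge-base chunks for illustration only.
CHUNKS = [
    "Opening hours: Monday to Friday, 8am to 5pm. Closed on bank holidays.",
    "Standard lead time for fabricated parts is 10 working days from order.",
    "Deliveries go out by courier, and tracking details follow by email.",
]


def vectorize(text: str) -> Counter:
    """Word-count vector; a stand-in for a real embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: List[str], top_k: int = 1) -> List[str]:
    """Return the top_k chunks most similar to the question."""
    query = vectorize(question)
    ranked = sorted(chunks, key=lambda chunk: cosine(query, vectorize(chunk)), reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, chunks: List[str]) -> str:
    """Put the retrieved context in front of the caller's question."""
    context = "\n".join(retrieve(question, chunks))
    return f"Answer using only this information:\n{context}\n\nCaller asked: {question}"


best = retrieve("What are your opening hours?", CHUNKS)[0]
```

    With a managed option such as Pinecone or Supabase pgvector, vectorize and the ranking are replaced by the provider's embedding and similarity search; build_prompt stays much the same.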

    Storage layer

    Where call data, enquiries, and bookings are saved

    Every call generates data: who called, what they asked, what the AI said, what action was taken. That data needs to go somewhere. Cloud options include Supabase, Airtable, Google Sheets, or your CRM (HubSpot, Salesforce). On-prem options mean a PostgreSQL database running on your own VPS or server. On-prem keeps your data in-house, which matters for some industries (healthcare, legal, finance). Cloud is faster to set up and easier to maintain.
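
    As a rough picture of what gets stored, here is a minimal call log covering the four items above: who called, what they asked, what the AI said, and what action was taken. It uses SQLite from the Python standard library so the example runs anywhere; the table and column names are assumptions, and the same schema would carry over to PostgreSQL or Supabase with minor changes.

```python
import sqlite3
from datetime import datetime, timezone

SCHEMA = """
CREATE TABLE IF NOT EXISTS call_log (
    id           INTEGER PRIMARY KEY,
    started_at   TEXT NOT NULL,   -- ISO 8601 timestamp in UTC
    caller       TEXT NOT NULL,   -- who called
    question     TEXT NOT NULL,   -- what they asked
    agent_reply  TEXT NOT NULL,   -- what the AI said
    action_taken TEXT NOT NULL    -- e.g. booked, message_taken, transferred
)
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and make sure the call_log table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def log_call(conn: sqlite3.Connection, caller: str, question: str,
             agent_reply: str, action_taken: str) -> int:
    """Insert one call record and return its row id."""
    started_at = datetime.now(timezone.utc).isoformat()
    cursor = conn.execute(
        "INSERT INTO call_log (started_at, caller, question, agent_reply, action_taken) "
        "VALUES (?, ?, ?, ?, ?)",
        (started_at, caller, question, agent_reply, action_taken),
    )
    conn.commit()
    return cursor.lastrowid


conn = open_db()
row_id = log_call(conn, "+441914980000", "What are your opening hours?",
                  "We are open 8am to 5pm, Monday to Friday.", "answered_faq")
```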

    These tools connect via webhooks and APIs. You do not need to choose all of them. A simpler setup might use VAPI (which includes its own phone number, STT, and TTS) wired to n8n, with a Supabase database for storage and a simple FAQ document for the knowledge base. That is already a functional AI receptionist that handles the majority of routine calls. See our custom AI agents service for how we approach these builds.

    Cloud vs on-prem: which to choose

    Almost everything in this stack can run either as a managed cloud service or self-hosted on a VPS or your own server. The right choice depends on your data sensitivity, technical capacity, and budget.

    Cloud

    • Faster to set up, no server management
    • Automatic updates and backups
    • Scales without infrastructure changes
    • Monthly cost grows with volume
    • Your data lives on third-party servers

    Tools: n8n.cloud, Pinecone, Supabase (hosted), Airtable

    On-prem / VPS

    • Full data control, stays in-house
    • Predictable fixed cost at scale
    • Better for regulated industries (healthcare, legal)
    • Needs someone to manage updates and backups
    • Higher setup effort

    Tools: self-hosted n8n, Chroma or Weaviate, PostgreSQL, Hetzner or DigitalOcean VPS

    For most UK SMEs starting out, cloud is the right call. The data involved is typically call logs and enquiry notes rather than sensitive records, although caller names and phone numbers still count as personal data under UK GDPR. If you handle patient data, legal documents, or financial records, on-prem or a UK-hosted private cloud is worth the extra setup. The ICO's guidance on AI and data protection is worth reading before you decide where caller data is stored.

    What an AI receptionist costs

    Two buckets: infrastructure costs (the tools themselves) and build costs (designing and wiring it together).

    Typical monthly infrastructure costs

    • Twilio phone number + usage: ~£1/month per number + £0.01–0.02/min call time
    • VAPI voice AI usage: ~£0.05–0.15/min of active conversation
    • Vector database (e.g. Pinecone or Supabase): free to £20/month for SME volumes
    • n8n.cloud: free tier available; paid from ~£17/month
    • VPS (if self-hosting n8n or storage): £4–15/month on Hetzner or DigitalOcean

    For a business taking 200–500 calls per month, total infrastructure typically runs £50–150/month.
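
    A rough way to sanity-check that range against your own volume. The per-minute rates and fixed costs are taken from the figures above; the average call length is not stated anywhere in this post, so the 2 minutes used in the example is an assumption, as are the mid-range values chosen for each rate.

```python
def monthly_infrastructure_cost(
    calls: int,
    avg_minutes: float,
    telephony_per_min: float,
    voice_per_min: float,
    number_rental: float = 1.0,    # ~£1/month per Twilio number
    vector_db: float = 0.0,        # free to £20/month
    orchestration: float = 17.0,   # n8n.cloud paid tier from ~£17/month
    vps: float = 0.0,              # £4–15/month if self-hosting
) -> float:
    """Estimated monthly cost in pounds: usage charges plus fixed fees."""
    minutes = calls * avg_minutes
    usage = minutes * (telephony_per_min + voice_per_min)
    return round(number_rental + usage + vector_db + orchestration + vps, 2)


# Mid-range example: 350 calls, assumed 2 minutes each, mid-point rates.
estimate = monthly_infrastructure_cost(
    calls=350,
    avg_minutes=2.0,          # assumption, not a measured figure
    telephony_per_min=0.015,
    voice_per_min=0.10,
    vector_db=10.0,
)
```

    The result is most sensitive to call length and the voice per-minute rate. Longer calls at the top of each range can push the total above the typical band, so it is worth running the numbers with your own averages.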

    Build and setup costs at KlarifAi: from £300/month on our Quick Wins tier for an ongoing engagement, or from £5,000 for a complete bespoke build that includes the telephony setup, knowledge base, orchestration, storage, and training for your team. The first 30-minute call is free, and you leave with a cost-versus-saving estimate specific to your call volume.

    The honest ROI test: if you currently miss 30 calls a month outside office hours and each call is worth £100 in potential business, that is £3,000 of at-risk revenue. An AI receptionist that captures even half of those pays back quickly.
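
    The same test as a small calculation. The 30 missed calls, £100 per call, and 50% capture rate come from the paragraph above, and the £5,000 build cost is the bespoke starting price; the £100 monthly running cost is an assumed figure within the infrastructure range. Bear in mind that £100 is potential business per call, not closed revenue, so a real estimate would apply your own conversion rate.

```python
from typing import Optional


def monthly_recovered_revenue(missed_calls: int, value_per_call: float, capture_rate: float) -> float:
    """Revenue at risk each month that the agent is expected to capture."""
    return missed_calls * value_per_call * capture_rate


def payback_months(build_cost: float, monthly_running_cost: float,
                   monthly_recovered: float) -> Optional[float]:
    """Months to recover the build cost, or None if running costs eat the gain."""
    net_gain = monthly_recovered - monthly_running_cost
    if net_gain <= 0:
        return None
    return build_cost / net_gain


recovered = monthly_recovered_revenue(missed_calls=30, value_per_call=100.0, capture_rate=0.5)
months = payback_months(build_cost=5000.0, monthly_running_cost=100.0, monthly_recovered=recovered)
```

    With these inputs the agent recovers £1,500 a month and the build pays back in under four months. If the net gain is zero or negative, the function returns None, which is the numerical version of "do not build this yet".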

    When you do not need an AI receptionist

    The honest part: there are several situations where it is the wrong call.

    • Your callers mostly need human judgment. An AI handles rules-based, predictable calls well. If most of your enquiries are complex, emotionally sensitive, or highly variable, the agent will frustrate callers rather than help them.
    • You get fewer than 30–40 calls a week. At that volume, the infrastructure cost and build effort rarely pay for themselves. A voicemail-to-email setup (free with most phone systems) is the right tool.
    • Your knowledge base does not exist yet. The AI is only as good as the information you give it. If your FAQs, pricing, and product details are not written down anywhere, the first job is writing them, not building the agent.
    • Your industry has specific regulatory requirements around phone calls. Financial services and healthcare calls are subject to FCA and CQC rules on recording, consent, and what an automated agent can and cannot say. Check compliance before you build, not after.

    How to get started

    The practical starting point is to document what your current phone calls actually look like. For one week, note down every type of call your team handles and roughly how long each takes. Group them into categories: FAQs, bookings, complaints, sales, wrong numbers. The categories that are high-volume and predictable are where the AI starts.
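
    If the week of notes ends up in a spreadsheet, a few lines of code will rank the categories for you. This sketch assumes each note is a (category, minutes) pair; the category names and durations below are made up for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def summarise(call_notes: List[Tuple[str, float]]) -> List[Tuple[str, Dict[str, float]]]:
    """Total calls and minutes per category, busiest category first."""
    totals: Dict[str, Dict[str, float]] = defaultdict(lambda: {"calls": 0, "minutes": 0.0})
    for category, minutes in call_notes:
        totals[category]["calls"] += 1
        totals[category]["minutes"] += minutes
    return sorted(totals.items(), key=lambda item: item[1]["calls"], reverse=True)


# Made-up notes from one week of calls.
week = [
    ("faq", 2), ("booking", 4), ("faq", 3), ("sales", 12),
    ("faq", 2), ("booking", 5), ("complaint", 15), ("wrong number", 1),
]

for category, stats in summarise(week):
    print(f"{category}: {stats['calls']} calls, {stats['minutes']:.0f} minutes")
```

    Categories near the top with short, similar call lengths are the predictable ones; a category with few calls but long durations, such as complaints, usually belongs with a person.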

    From there, the build follows the stack in this post: phone number, voice layer, orchestration, knowledge base, storage. We build these on n8n for most clients because it is transparent: your team can see exactly what happens on each call, and we can hand over a system they actually understand. This ties directly to our workflow automation approach.

    We have done this for UK manufacturers, including Roundel Manufacturing. If you are in a similar situation, a free 30-minute call is the fastest way to find out whether the numbers stack up for your volume.

    Frequently asked questions

    What is an AI receptionist?

    A voice or chat agent that handles your inbound calls, texts, or web chats automatically. It answers questions about your business, takes a message, books an appointment, or routes the caller, 24 hours a day, without a human picking up. The stack: a phone number provider (Twilio), a voice AI layer (VAPI or ElevenLabs), an orchestration tool (n8n or custom code), and a knowledge base the AI queries for accurate answers.

    What is VAPI used for in an AI receptionist?

    VAPI manages the full phone call loop: it converts the caller's speech to text, sends it to an AI model, and converts the reply into natural-sounding speech. It connects to a Twilio phone number on one side and to your workflow on the other. It is a common choice for building AI phone agents because it handles the real-time latency that makes phone conversations feel natural.

    What is the difference between VAPI and ElevenLabs?

    VAPI is a complete voice agent platform handling the whole call (speech recognition, AI reasoning, and speech synthesis). ElevenLabs is primarily a text-to-speech engine known for highly realistic voice quality. For a full AI receptionist, VAPI handles more of the infrastructure out of the box. ElevenLabs is typically paired with a separate speech-to-text provider (Deepgram or OpenAI Whisper) when you want more control over voice quality in a custom setup.

    What is RAG and why does an AI receptionist need it?

    RAG stands for Retrieval Augmented Generation. The AI searches your specific business knowledge before answering, rather than relying on general training data. You feed the vector database with your product catalogue, pricing, opening hours, and FAQs. When a caller asks something specific, the AI retrieves the relevant information and gives an accurate, up-to-date answer. Without RAG, the AI can only answer generic questions.

    Will AI replace receptionists?

    The honest answer: an AI handles the routine 70% of calls well (FAQs, appointment booking, message taking, call routing). For calls that need real judgment, empathy, or complex problem-solving, a human is still the right answer. An AI receptionist takes the predictable, repetitive work so your team spends time on the calls that actually need them.

    How much does an AI receptionist cost?

    Infrastructure runs roughly £50 to £150 per month for a business taking 200 to 500 calls per month (Twilio number, VAPI usage, vector database, n8n). Build costs at KlarifAi start at £300 per month for an ongoing engagement or from £5,000 for a complete bespoke build. A free 30-minute call gives you a specific number for your call volume.

    Find out if an AI receptionist pays back for your call volume

    Book a free 30-minute call. We will map your current call types, estimate the infrastructure cost, and give you an honest view on whether the build makes sense. You keep the analysis either way.

    Further reading: ICO guidance on AI and data protection, VAPI voice AI platform, and ElevenLabs text-to-speech.
