Create a Telephone IVR Using ChatGPT, NextJS and Twilio

ChatGPT is truly a dream come true for phone system programmers and designers. The AI is so smart, it feels like you're talking to a real person! Gone are the days of programming in hundreds of canned responses, or worse, paying a voice actor to read a list of prompts.

By using Twilio to build your IVR, ChatGPT as the AI backend, and NextJS for the middleware, you can get an AI-powered IVR set up in less than 20 minutes.

And I'm going to show you how.

Step 1 - Get an OpenAI API Key

You'll need to create an OpenAI account to get an API key. It's currently free, but that may change in the future. Once you're logged in, click on your profile image and select View API Keys.

That's all you need to access ChatGPT.

Step 2 - Create a New NextJS Project

In this tutorial, I'm going to use NextJS solely for its built-in API routes, which will act as the middleman between Twilio and ChatGPT. The extra hop adds to the delay between asking a question and getting an answer, so in my opinion it would not be suitable for a production environment, but it works!

As always, we begin with create-next-app:

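npx create-next-app chat-api

Turn on TypeScript and ESLint if you prefer, then install the dependencies. The code below uses the Configuration and OpenAIApi classes from version 3 of the openai package, so pin that version:

cd chat-api
npm i nextjs-cors openai@3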

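Create a new file in the project root called .env.local and add your OpenAI key. Next.js inlines NEXT_PUBLIC_-prefixed variables into any browser code that references them, so read this one only from the API route: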

NEXT_PUBLIC_OPENAI_KEY=blahblahblahblahblah

We'll also need a token that Twilio sends with every request, so that nobody else can call our API and run up the OpenAI bill. Use a secure random key generator to create one, then add it to .env.local:

NEXT_PUBLIC_API_KEY=yoursecurerandomapikey
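
If you'd rather generate the key locally, Node's built-in crypto module can do it. This is a minimal sketch; generateApiToken is a hypothetical helper name, and the 32-byte length is my assumption, not a requirement of Twilio or NextJS:

```javascript
// Generate a random API token with Node's built-in crypto module.
const crypto = require("node:crypto");

// 32 random bytes rendered as hex gives a 64-character token.
// (The helper name and the default length are illustrative assumptions.)
function generateApiToken(byteLength = 32) {
  return crypto.randomBytes(byteLength).toString("hex");
}

console.log(generateApiToken());
```

Save it as a scratch file, run it with node, and paste the printed value into .env.local.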

Finally, open up your project in VS Code by typing:

code .

Ok, let's create our API route under pages/api/chat.js:

import { Configuration, OpenAIApi } from "openai";
import NextCors from 'nextjs-cors';

const configuration = new Configuration({
  apiKey: process.env.NEXT_PUBLIC_OPENAI_KEY,
});

const openai = new OpenAIApi(configuration);

export default async function handler(req, res) {
    // Create our CORS policy to allow remote hosts
    await NextCors(req, res, {
        methods: ['POST', 'HEAD'],
        origin: '*',
        optionsSuccessStatus: 200,
    });

    // Destructure the request body (fall back to {} if no body was sent)
    const { prompt, token } = req.body ?? {};

    // Validate our API token from Twilio
    if (token !== process.env.NEXT_PUBLIC_API_KEY) {
        res.status(403).send("HTTP 403 / Forbidden");
        return;
    }

    // Make our API request to OpenAI and send the result to Twilio
    try {
        const completion = await openai.createCompletion({
            // GPT-3.5 completion model (not ChatGPT itself). OpenAI has since
            // retired it; if the call fails, swap in a current completions
            // model such as gpt-3.5-turbo-instruct.
            model: "text-davinci-003",
            prompt: prompt,
            max_tokens: 2048, // Upper limit on the length of the response, in tokens
        });

        // Optional chaining on choices too, in case the array is missing
        const result = completion?.data?.choices?.[0]?.text;

        if (result) {
            res.status(200).json({ result: result });
        } else {
            const error = completion?.data?.error;
            console.log(error);
            res.status(200).json({ result: "I'm sorry, an error occurred." });
        }
    } catch (e) {
        // Reply with 200 so the Studio flow still reads the apology to the caller
        console.log(e);
        res.status(200).json({ result: "I'm sorry, an error occurred." });
    }
}
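
The handler compares the token with !==, which is fine for a demo. If you'd rather not leak timing information through the comparison, a constant-time check is one option. This is a sketch under my own assumptions, not part of the route above: tokensMatch is a hypothetical helper, and hashing both values first is simply a way to satisfy crypto.timingSafeEqual, which requires buffers of equal length.

```javascript
const crypto = require("node:crypto");

// Hypothetical helper: compare two tokens in constant time.
// Both values are hashed with SHA-256 so the buffers always have
// the same length, which crypto.timingSafeEqual requires.
function tokensMatch(supplied, expected) {
  if (typeof supplied !== "string" || typeof expected !== "string") {
    return false; // a missing token never matches
  }
  const a = crypto.createHash("sha256").update(supplied).digest();
  const b = crypto.createHash("sha256").update(expected).digest();
  return crypto.timingSafeEqual(a, b);
}

console.log(tokensMatch("abc", "abc")); // true
console.log(tokensMatch("abc", "abd")); // false
```

In the route you would import crypto rather than require it, and the check would become something like if (!tokensMatch(token, process.env.NEXT_PUBLIC_API_KEY)).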

Now deploy your project to Vercel either by the Vercel command line tool or via GitHub. If you're not sure how to do this, check out this guide.

Remember to add your environment variables under the settings tab of your Vercel project.

Also, you may want to change the default Vercel domain to something more memorable.
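
Before involving Twilio at all, it can be worth checking that the deployed route answers. The sketch below builds a form URL encoded POST with prompt and token fields, which is the shape of request the route expects. buildChatRequest is a hypothetical helper, and the URL and token in the usage comment are placeholders for your own values:

```javascript
// Hypothetical helper: build fetch() options for a form URL encoded POST
// carrying the two fields the API route reads from req.body.
function buildChatRequest(prompt, token) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: new URLSearchParams({ prompt, token }).toString(),
  };
}

// Usage (Node 18+ ships a global fetch). Placeholder URL and token:
//
//   const res = await fetch(
//     "https://yourdomain.vercel.app/api/chat",
//     buildChatRequest("What is the capital of France?", "yoursecurerandomapikey")
//   );
//   console.log(await res.json()); // expected shape: { result: "..." }
```

A 403 response here usually means the token doesn't match the one stored in your Vercel environment variables.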

Step 3 - Create a Twilio Account

Now head over to Twilio.com and create a free account. You'll need to enter credit card info to pay for your usage, but it's super cheap. They also give you $10 in credit to play around with, so you can try this out without paying anything. If you decide to take this all the way and use it as a real solution, I highly recommend Twilio: in my experience it is the most comprehensive and extensible telephony platform on the market, and it is very reasonably priced. I top up about $25 every couple of months for my personal usage.

Once you register your account, you'll need to requisition a phone number. To do this click on Phone Numbers >> Manage and Buy a number. In the search criteria, you can enter your area code to get a local number, or 888 to get a toll-free number (which costs a dollar more per month):

Ok, once you have your number, we need to create our IVR. There are many ways to accomplish this, but we are going to do it the easy way by using Twilio Studio. To get there, type Studio in the search bar at the top of your dashboard. Then click the + icon to create a new Studio flow and name it Chat IVR or something. Then click Create from Scratch.

You should be presented with a blank grid. To create our flow, you simply drag a widget from the right-hand column over to the grid area and then change its parameters. It's stupid easy! The first thing we'll want to do is greet the user when a call comes in. To do this, we'll use the Say / Play widget. Change the parameters as follows:

WIDGET NAME: Greeting
SAY OR PLAY MESSAGE OR DIGITS: Say a Message
TEXT TO SAY: Hello, ask me anything!
LANGUAGE: English
MESSAGE VOICE: [Polly] Salli
NUMBER OF LOOPS: 1

Click save.

You can select whatever voice you like, but I'm using the Polly Salli voice, which is provided by Amazon's Polly TTS engine. Now click and drag from Incoming Call under your triggers and connect it to your new widget. When a call comes in to this flow, the caller will be greeted by this voice. Your flow should look like this:

Next, we'll route the call to another widget called Gather Input on Call. Configure it as follows:

WIDGET NAME: ReadInput
SPEECH RECOGNITION LANGUAGE: English
PROFANITY FILTER: True | False (your choice)

Leave everything else at its default setting and click save. Then connect the Audio Complete node from Greeting to the input node on ReadInput. Your flow should look like this:

Since there is such a long delay from the time your speech is recognized to when the response is received from our API, it's necessary to indicate to the caller that their voice has been recognized and that it's being processed. To do this, we'll simply play a little tone. We'll use the Say / Play widget for this:

WIDGET NAME: Acknowledge
SAY OR PLAY MESSAGE OR DIGITS: Play a Message
URL OF AUDIO FILE: https://chatendpoint.vercel.app/processing.mp3
NUMBER OF LOOPS: 1

You can download and add this file to your own Vercel project if you like, or simply use the one from my project. Click save and then connect the User Said Something node of ReadInput to the input node of Acknowledge.

Now we'll create our HTTP request to our API. To do that, we need the, you guessed it, Make HTTP Request widget:

WIDGET NAME: RequestAPI
REQUEST METHOD: POST
REQUEST URL: https://yourdomain.vercel.app/api/chat
CONTENT TYPE: Form URL Encoded
REQUEST BODY: [leave blank]
HTTP Parameters:
    prompt: {{widgets.ReadInput.SpeechResult}}
    token: yourgeneratedapitoken

Then click save and connect the Audio Complete node from Acknowledge to the input node of RequestAPI:
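
To see why the route can read prompt and token straight off req.body, it helps to look at what a form URL encoded body decodes to. This is only an illustration of the encoding, using a hypothetical parseFormBody helper; NextJS does the real parsing for you, and the sample body is made up:

```javascript
// Hypothetical helper: decode a form URL encoded body into a plain object,
// roughly what the API route receives as req.body.
function parseFormBody(body) {
  return Object.fromEntries(new URLSearchParams(body));
}

// Made-up example of what the widget might send after the caller speaks.
const example = parseFormBody(
  "prompt=What+time+is+it%3F&token=yourgeneratedapitoken"
);

console.log(example.prompt); // What time is it?
console.log(example.token);  // yourgeneratedapitoken
```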

Next, we need to take the response from our API and say the text back to the caller. Again we'll use the Say / Play widget:

WIDGET NAME: Response
SAY OR PLAY MESSAGE OR DIGITS: Say a Message
TEXT TO SAY: {{widgets.RequestAPI.parsed.result}}
LANGUAGE: English
MESSAGE VOICE: [Polly] Salli
NUMBER OF LOOPS: 1

Click save and link the Success node from RequestAPI to its input.
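
One thing to watch for: completion models often return text with leading newlines or stray line breaks. If the spoken response sounds odd, you could tidy the text in pages/api/chat.js before sending it back. This is a sketch of one way to do it, not something the flow requires; cleanForSpeech is a hypothetical helper:

```javascript
// Hypothetical helper: collapse runs of whitespace (including newlines)
// into single spaces and trim both ends before the text is spoken.
function cleanForSpeech(text) {
  return String(text ?? "").replace(/\s+/g, " ").trim();
}

console.log(cleanForSpeech("\n\nParis is the capital.\n It is in France. "));
// Paris is the capital. It is in France.
```

In the route, that would mean sending { result: cleanForSpeech(result) } instead of the raw text.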

That largely completes our flow. But you'll probably want to do this in a loop. So after the response is read back to the caller, we'll want to play another tone to indicate that they should continue speaking. So grab another Say / Play:

WIDGET NAME: Prompt
SAY OR PLAY MESSAGE OR DIGITS: Play a Message
URL OF AUDIO FILE: https://chatendpoint.vercel.app/prompt.mp3
NUMBER OF LOOPS: 1

Click save and link Audio Complete from Response to the input of Prompt. Then link Audio Complete from Prompt to the input of ReadInput. That should complete the loop.
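
If it helps to double-check your wiring, here is the whole flow written out as a simple transition table. This is just a model for reading, not Twilio's flow definition format; the widget and transition names come from the steps above, while the Trigger key and the walkFlow helper are my own labels:

```javascript
// Each entry names the transition that fires and the widget it connects to.
const transitions = {
  Trigger:     { event: "Incoming Call",       next: "Greeting" },
  Greeting:    { event: "Audio Complete",      next: "ReadInput" },
  ReadInput:   { event: "User Said Something", next: "Acknowledge" },
  Acknowledge: { event: "Audio Complete",      next: "RequestAPI" },
  RequestAPI:  { event: "Success",             next: "Response" },
  Response:    { event: "Audio Complete",      next: "Prompt" },
  Prompt:      { event: "Audio Complete",      next: "ReadInput" },
};

// Follow the transitions for a fixed number of steps and list the widgets visited.
function walkFlow(start, steps) {
  const visited = [start];
  let current = start;
  for (let i = 0; i < steps; i++) {
    current = transitions[current].next;
    visited.push(current);
  }
  return visited;
}

console.log(walkFlow("Trigger", 7).join(" -> "));
// Trigger -> Greeting -> ReadInput -> Acknowledge -> RequestAPI -> Response -> Prompt -> ReadInput
```

The last hop, from Prompt back to ReadInput, is what keeps the conversation going until the caller hangs up.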

Here is what your final flow should look like:

Click the red Publish button to make your flow active.

I know you're probably as eager as I was to try it out, but there's one more thing we must do: route your new phone number to this Studio flow. Go back to Active Numbers and click Manage, then click your number. Scroll down until you find Configure With and select Webhook, TwiML Bin, Function, Studio Flow, Proxy Service. Then, under A Call Comes In, select Studio Flow and pick the flow you created (Chat IVR). Click save.

Now call your number and have a pleasant conversation!

I hope you enjoyed this article, and I hope the possibilities of ChatGPT excite you as much as they excite me! For more great information about web dev, systems administration and telephony programming, please visit the Designly Blog.