I’ve been on a text-to-speech and speech-to-text kick lately. My last post talked about using AWS S3 and Amazon Transcribe to convert your audio files to text, and in previous articles I’ve covered how to create temporary prompts using Polly so you can build out your contact center call flows. Now we’re going to expand our use case to allow a traditional on-premises call center to leverage the cloud and provide dynamic prompts. My use case is simple: I want my UCCX call center to dynamically play a string back to my caller without having to use a traditional TTS service.

First, this is not new in any way and other people have solved this in different ways. This Cisco DevNet GitHub repo provides a method to use voicerss.org to generate TTS for UCCX. However, this process requires loading a JAR file in order to do Base64 decoding. Then there’s this Cisco Live presentation from 2019 by the awesome Paul Tindall, who used a Connector server to do something similar. To be fair, the Connector server allowed for a ton more functionality than what I’m looking for.

Cisco Live Presentation

Second, I wanted this functionality to be as easy to use as possible. While on-premises call center software keeps getting better, there are still limitations: teams may lack the knowledge to leverage new features, and legacy versions that can’t be upgraded make it harder to consume cloud-based services. I wanted the solution to require the fewest moving parts possible. That means no custom Java and no additional servers to stand up.

The solution I came up with leverages Google’s cloud (GCP), specifically Cloud Functions. However, the same functionality can be achieved using AWS Lambda or Azure’s equivalent. At a high level, we have an HTTP endpoint that you pass your text string to, and in return you get a wav file in the right format, which you can then play back.

Flow Diagram

The URL would look something like this:

https://us-central1-myFunction.cloudfunctions.net/synthesize_text_to_wav?text=American%20cookies%20are%20too%20big
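
Because the text travels in the query string, it has to be URL-encoded before the request is made. Here is a minimal sketch of building that URL in Python. The host is the placeholder from the example above, and the helper name build_tts_url is hypothetical.

```python
from urllib.parse import quote, urlencode

# Placeholder endpoint taken from the example URL above; replace with your own.
BASE_URL = "https://us-central1-myFunction.cloudfunctions.net/synthesize_text_to_wav"


def build_tts_url(text: str, base_url: str = BASE_URL) -> str:
    """Return the function URL with `text` percent-encoded in the query string."""
    # quote_via=quote encodes spaces as %20 rather than "+".
    return f"{base_url}?{urlencode({'text': text}, quote_via=quote)}"


if __name__ == "__main__":
    print(build_tts_url("American cookies are too big"))
```

Encoding the text this way also keeps characters such as & or ? inside the prompt from being read as part of the URL itself.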

The Good Things About This

Pay-as-you-go pricing for TTS. Looking at the pricing calculator, a few hours of TTS a month would run under $2.00.
Infinitely scalable. Whether you’re handling 1 call or 100 calls, your function will always return data.
Easy to use.

The Bad Things About This

There is a delay between making the request and getting the wav file. I’ve seen it take as long as 7 seconds at times. I would only use this in a very targeted manner and make sure it doesn’t affect the caller experience too drastically.
Requires your on-premises IVR to have internet access. This is often a big no-no for businesses.
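
One possible way to soften the delay for repeated prompts is to cache the generated audio under a hash of the text, so the same string is only synthesized once. This is a sketch of the idea, not something the function in this post does. The names cached_wav and synthesize and the cache directory are hypothetical, and /tmp in a cloud function is per instance and not persistent, so this would only help on warm instances.

```python
import hashlib
import os
from typing import Callable


def cached_wav(text: str, synthesize: Callable[[str], bytes],
               cache_dir: str = "/tmp/tts_cache") -> str:
    """Return the path of a wav file for `text`, synthesizing only on a cache miss.

    `synthesize` is a hypothetical callable that turns text into wav bytes.
    """
    os.makedirs(cache_dir, exist_ok=True)
    # Identical text always maps to the same file name.
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    path = os.path.join(cache_dir, f"{key}.wav")
    if not os.path.exists(path):
        with open(path, "wb") as out:
            out.write(synthesize(text))
    return path
```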

Some initial testing with UCCX is showing positive results. I’m going to investigate whether there’s a way to speed up the processing to keep the request and response under 3 seconds, as well as adding the ability to set language, voice, and even SSML via arguments. If you want to build this yourself, here’s the code for the function.

import os

from flask import send_file
from google.cloud import texttospeech
from pydub import AudioSegment


def synthesize_text_to_wav(request):
    """Synthesizes speech from the input string of text."""
    text = request.args.get('text')
    if not text:
        # Nothing to synthesize without the "text" query parameter.
        return ('Missing "text" query parameter', 400)

    client = texttospeech.TextToSpeechClient()
    input_text = texttospeech.SynthesisInput(text=text)
    voice = texttospeech.VoiceSelectionParams(
        language_code="en-US",
        name="en-US-Standard-C",
        ssml_gender=texttospeech.SsmlVoiceGender.FEMALE,
    )
    audio_config = texttospeech.AudioConfig(
        audio_encoding=texttospeech.AudioEncoding.MP3
    )
    response = client.synthesize_speech(
        request={"input": input_text, "voice": voice, "audio_config": audio_config}
    )

    src_file_path = '/tmp/output.mp3'
    dst_file_path = '/tmp/output.wav'

    # Make sure the directory exists.
    os.makedirs(os.path.dirname(src_file_path), exist_ok=True)

    # The response's audio_content is binary.
    with open(src_file_path, "wb") as out:
        out.write(response.audio_content)
        print('Audio content written to file "output.mp3"')

    # Convert the MP3 to an 8 kHz mu-law wav for playback by the IVR.
    AudioSegment.from_mp3(src_file_path).export(
        dst_file_path, format="wav", codec="pcm_mulaw", parameters=["-ar", "8000"]
    )
    return send_file(dst_file_path)
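
The language, voice, and SSML options mentioned earlier could be read from the query string with safe defaults. The sketch below is only an assumption about how that might look and is not part of the function above. The parameter names lang, voice, and ssml and the helper parse_tts_args are hypothetical.

```python
from typing import Mapping

# Defaults match the values hard-coded in the function above.
DEFAULTS = {"lang": "en-US", "voice": "en-US-Standard-C"}


def parse_tts_args(args: Mapping[str, str]) -> dict:
    """Extract text-to-speech options from query arguments.

    Returns a dict with `text`, `lang`, `voice`, and `is_ssml`.
    Raises ValueError when no text is supplied.
    """
    text = (args.get("text") or "").strip()
    if not text:
        raise ValueError('missing "text" argument')
    return {
        "text": text,
        "lang": args.get("lang") or DEFAULTS["lang"],
        "voice": args.get("voice") or DEFAULTS["voice"],
        # Treat ssml=1/true/yes as a request to interpret text as SSML.
        "is_ssml": (args.get("ssml") or "").lower() in {"1", "true", "yes"},
    }
```

With is_ssml set, the input would be built as texttospeech.SynthesisInput(ssml=text) instead of text=text.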

Be awesome!

~david

I’m getting more and more into serverless development. Not having to handle any sort of hardware sounds like a dream come true: no more security patching, load balancing, etc. However, one of the biggest things I struggle with is how to do local development efficiently without having to deploy my code to the cloud every time. Google’s serverless offering, Firebase, released a very cool tool this year that allows you to emulate most of their services locally. Here are some of my learnings so far. These should be especially relevant if you’re doing any React development with Firebase.

Setting Up Your Local Development

Prerequisites:

Node.js. We won’t be running a Node.js server, but we’ll use Node to install components and to develop our React app.
Firebase account.
Firebase CLI. Install it with the command npm install -g firebase-tools
Optional but recommended: create-react-app, installed with npm install -g create-react-app
Optional: your favorite IDE. I’m a sucker for VS Code.

Project Setup

Setting up your project. From the Firebase console, Add project:

Choose a name.
Choose your Google Analytics setting; it’s not relevant for this exercise.
Create project.
Add an app to get started and choose Web.
Choose a name (I generally use the same name as the project) and set up Firebase Hosting.
Click on Database and create a Cloud Firestore database. Choose to start in test mode. Choose your favorite/closest region.

Local Setup

Create a folder where you’ll be doing your development; this will be your root folder. I give this folder a project-relevant name.
First, make sure you log in with the command below, using the Google account associated with the Firebase console above.
- firebase login
Initialize your project with the command below. Make sure to choose the following features and be sure to select the Firebase project we created earlier during the project setup. Accept all the other defaults presented and choose the following emulation settings.
- firebase init
  
  Choose your Firebase services you’ll be using.
  
  Firebase cli emulator settings.

Most important here is to take a look at the firebase.json file generated by the initialization. Pay close attention to the emulators and hosting sections, as these will play an important role later. One thing to watch out for at this point is that the ports you’ve asked the emulator to use are actually free. In the next step we’ll be able to confirm whether they are, but this is the file you edit to change them if you get an error. Here’s what it should look like if you’re following along.

{
  "firestore": {
    "rules": "firestore.rules",
    "indexes": "firestore.indexes.json"
  },
  "functions": {
    "predeploy": [
      "npm --prefix \"$RESOURCE_DIR\" run lint"
    ]
  },
  "hosting": {
    "public": "public",
    "ignore": [
      "firebase.json",
      "**/.*",
      "**/node_modules/**"
    ],
    "rewrites": [
      {
        "source": "**",
        "destination": "/index.html"
      }
    ]
  },
  "emulators": {
    "functions": {
      "port": 5001
    },
    "firestore": {
      "port": 8080
    },
    "hosting": {
      "port": 5000
    },
    "ui": {
      "enabled": true
    }
  }
}
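
If you’d like to check the ports before starting the emulators, one option is a small script that reads the emulators section and tries to bind each port. This is a sketch under a couple of assumptions: the emulators listen on 127.0.0.1, and the helper names emulator_ports, port_is_free, and report are made up for illustration.

```python
import json
import socket


def emulator_ports(config: dict) -> dict:
    """Map each emulator name in a firebase.json dict to its configured port."""
    return {
        name: settings["port"]
        for name, settings in config.get("emulators", {}).items()
        if isinstance(settings, dict) and "port" in settings
    }


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def report(path: str = "firebase.json") -> None:
    """Print one line per emulator saying whether its port is free."""
    with open(path) as fh:
        ports = emulator_ports(json.load(fh))
    for name, port in ports.items():
        state = "free" if port_is_free(port) else "IN USE"
        print(f"{name}: {port} {state}")
```

Calling report() from the project root prints one line per emulator that has a port configured; entries without a port, such as ui here, are skipped.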

Finally, it’s time to take a look at what we have so far. We’re going to start the emulator and see what we get with the out-of-the-box setup for a Firebase project. If you get any errors, it’s more than likely that you have a port conflict. I had a port conflict and moved Functions from 5001 to 5080. If you need to do the same, go back to your firebase.json file, pick a free port, and try again.
- firebase emulators:start

If everything worked you should see the following.

Firebase emulator running

At this point, let’s stop for a second and break down what we have available to us. First, going to http://localhost:5000 will show you Firebase Hosting’s emulation. Next, going to http://localhost:4000 gives you a nice dashboard of all your emulated services and their status, as well as links to the relevant logs and details for those services. Finally, there’s a log window with a very handy search feature for faster troubleshooting.

Firebase emulator UI

Firebase emulator log UI

If you’ve gotten this far, you’ve set up a project through the Firebase console, set up your local dev environment, and emulated Firebase for local testing. Now we’re going to go through a very simple exercise where we use the most popular Firebase services and show what you can and can’t emulate.

React Development Part 1

We are going to create a simple React application that allows a user to register, log in, and then see the registration details they entered. This exercise will walk us through a few things:

Hosting: For the React application
Functions: API for registration and login
Firestore: Database for user information
Authentication: Firebase user management

First, there are a few things we need to set up.

In the Firebase console for your project, select Authentication and “Set up sign-in method”. You’re going to want to set up the Email/Password provider. This will allow users to use those details to authenticate.
In your terminal go to the root of your project and create a new create-react-app (CRA) app. I like to use view as my root React folder, but you can choose whatever you want. You’ll want to end up with the following folder structure.

create-react-app view

Default Firebase and CRA file structure

At this point you have a CRA app inside your Firebase project, but when you go to your Hosting URL you are still going to see the default Firebase website.

Firebase default Hosting webpage

Go back to your firebase.json and change your hosting path to view/build. Build the React app (npm run build inside view) so that folder exists, then restart your Firebase emulator and you should now see your CRA app.

...
"hosting": {
"public": "view/build",
...
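
If you’d rather script that change than edit the file by hand, here is a small sketch. The helper name set_hosting_public is hypothetical, and it assumes a single hosting object like the one in the generated firebase.json.

```python
import copy


def set_hosting_public(config: dict, public_dir: str) -> dict:
    """Return a copy of a firebase.json dict with hosting.public replaced."""
    updated = copy.deepcopy(config)
    # setdefault creates the hosting section if the config lacks one.
    updated.setdefault("hosting", {})["public"] = public_dir
    return updated
```

Loading firebase.json with json.load, passing the result through this function with "view/build", and writing it back with json.dump gives the same edit as above.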

CRA default webpage

I will pick up the rest of the exercise in a follow-up blog post, as this is getting very lengthy.

~david

dmacias . org

Contact Center Musings: UCCE UCCX Amazon Connect Twilio Flex


Adding Text to Speech to Your IVR Using SaaS (Google Cloud Functions)

Serverless Development with Firebase Emulator
