Voice AI Orders A Pizza: Then and Now
In 1974, computer scientist John Eulenberg demonstrated a Voice AI that ordered a pizza. Fifty years later, we still see many of the same problems in modern Voice AI.
Incomplete Calls. People keep hanging up on Voice AI, even when it's their job to take the call. They hung up on the 1974 Voice AI, and they still hang up on modern Voice AI a lot.
Machine Intolerance. People are much more patient with other humans than they are with machines. Most people didn't engage with the 1974 Voice AI and simply hung up. Eventually, the researchers started each call by giving the human on the other end of the line some context. The Voice AI says, "I am using a special device to help me to communicate." I've seen similar strategies work in modern production Voice AI calls.
Interruptions. At first, the human and AI interrupt each other. Then the human gets a sense of the latency and adapts. Turn-taking is an ongoing implicit negotiation in every human-to-human conversation. We're making progress in Voice AI on turn-taking, moving away from hard-coded VAD parameters and towards neat new multimodal and semantic end-pointing models.
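To make "hard-coded VAD parameters" concrete, here is a minimal sketch of fixed-silence end-pointing. It assumes a VAD has already labeled each audio frame as speech or not; the function name, the 20 ms frame size, and the silence thresholds are hypothetical choices for illustration, not the settings of any particular platform.

```python
from typing import Optional, Sequence


def fixed_silence_endpoint(
    speech_flags: Sequence[bool],
    frame_ms: int = 20,         # assumed frame duration
    min_silence_ms: int = 700,  # assumed hard-coded silence threshold
) -> Optional[int]:
    """Return the index of the frame where the turn is declared over, or None.

    The turn ends once speech has been heard and is followed by at least
    `min_silence_ms` of uninterrupted non-speech frames.
    """
    frames_needed = -(-min_silence_ms // frame_ms)  # ceiling division
    heard_speech = False
    silent_run = 0
    for i, is_speech in enumerate(speech_flags):
        if is_speech:
            heard_speech = True
            silent_run = 0  # any speech resets the silence counter
        elif heard_speech:
            silent_run += 1
            if silent_run >= frames_needed:
                return i
    return None


# A speaker who pauses for 200 ms mid-sentence, then finishes the thought.
flags = [False] * 5 + [True] * 10 + [False] * 10 + [True] * 5 + [False] * 40

patient = fixed_silence_endpoint(flags, min_silence_ms=700)
eager = fixed_silence_endpoint(flags, min_silence_ms=200)
```

With the 700 ms threshold the end of turn lands after the speaker has finished (frame 64). With the 200 ms threshold it fires during the mid-sentence pause (frame 24), which is exactly the kind of interruption described above. A single fixed number cannot suit every speaker and every pause, and that is the motivation for semantic end-pointing.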
Spoken Letters. In my opinion, the 1974 CDC 6500 does a better job of spelling spoken letters than most modern TTS platforms we use today. Modern TTS often fails gloriously on spoken letters.
Latency (time between end of human speech and start of AI speech). The 1974 Voice AI latency is about 7 seconds. These days, we're seeing on average about 2 seconds in production. Production Voice AI calls still require patience from the human. And we can't expect all humans to be as remarkably patient and kind as the person in this video.
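The latency definition above can be sketched as a small calculation over turn timestamps. The `Turn` record and its field names are hypothetical, and the timestamps are made-up illustration values rather than measurements from the video or from production.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class Turn:
    speaker: str    # "human" or "ai" (assumed labels)
    start_s: float  # turn start, seconds from call start
    end_s: float    # turn end, seconds from call start


def response_latencies(turns: List[Turn]) -> List[float]:
    """Gap between the end of each human turn and the start of the AI turn that follows."""
    latencies = []
    for prev, curr in zip(turns, turns[1:]):
        if prev.speaker == "human" and curr.speaker == "ai":
            # Clamp at zero: an AI that starts early is an interruption,
            # not negative latency.
            latencies.append(max(0.0, curr.start_s - prev.end_s))
    return latencies


call = [
    Turn("ai", 0.0, 2.0),
    Turn("human", 2.5, 5.0),
    Turn("ai", 7.0, 9.0),
    Turn("human", 9.5, 11.0),
    Turn("ai", 12.5, 14.0),
]

latencies = response_latencies(call)  # [2.0, 1.5]
average = mean(latencies)             # 1.75
```

An average hides the worst moments of a call, so it is worth looking at the longest gaps as well as the mean.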
Uncanny Valley. The most human aspect of the 1974 Voice AI is that it answers an 'or' question ("is this for pickup or delivery") with 'yes'. For a moment there, I thought it was a real person!
Success Rate. The audience applauds at the end of the successful call in the video. Building a great Voice AI agent requires a lot of craftsmanship. The models are getting better, but they're still not there. To build a great Voice AI, you have to augment the models with a lot of workarounds and band-aids. I still applaud when I see a finely executed Voice AI call on our platform.
You can find our complete analysis of modern Voice AI performance here.
Next Step
I love the optimism in the reporter's sign-off: "It may not be very long until we'll all be able to use computers to communicate." We're getting there!
Unfortunately, Adrian and I still see far too many Voice AIs that are shipped just because they run. Do you want to craft a great Voice AI agent that inspires your customers to deploy it for their entire call volume? Then you have to become a student of what the Voice AI agent is doing in production. That's where our Voice AI analytics platform can help. We'd love it if you tried it out!
Tom and Adrian
March 2025