AI Engine Text-To-Speech (TTS)

New! Support for OpenAI’s gpt-4o-mini-tts model:

Voice: Nova

Deepgram text-to-speech for customer service:

Voice: Athena (British – Feminine)

New! Orpheus text-to-speech for natural conversation:

Premium text-to-speech with Cartesia, offering 1-shot voice cloning:

It supports forms too! Here’s an OpenAI form demo instructed to whisper:

This plugin is an all-in-one consolidation of my previous plugins, covering OpenAI, Google, Deepgram, ElevenLabs, Cartesia, and Azure Text-To-Speech (TTS). Click the microphone and start talking!

Enhancements:

  • AI Engine’s default speech recognition has been modified for a conversational experience. Text is automatically sent after you finish speaking, with an option for the microphone to stay on for continuous interaction.
  • Support for OpenAI Whisper and Deepgram Nova 2+, for enhanced transcription and wider browser compatibility.

  • See here for another demo of the OpenAI Text-To-Speech.
  • See here for a demo of the Deepgram Text-To-Speech.
OpenAI TTS costs about $1.00 per hour of audio, which is around the same price as Azure. See pricing.
Azure TTS offers 0.5 million characters free per month, which is around 8 hours of audio. See pricing.

Orpheus costs around $0.35 per hour from DeepInfra, while Cartesia is around $2.35 per hour.

Deepgram costs around $1.60 per hour for TTS and $0.50 per hour for STT, with a 50% discount on both if you opt in to their Model Improvement Program. In short, they will train their TTS/STT models on your data. A setting is available to opt in.
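
To get a feel for how these rates add up, here is a rough Python sketch that estimates a monthly bill from the approximate per-hour figures quoted above. It is not part of the plugin: the function and constant names are made up for illustration, the Azure rate is assumed to be about $1.00 per hour (based only on "around the same price" as OpenAI), and the free tier is treated as roughly 8 hours of audio. Actual provider pricing is per character or per token and changes over time, so treat the output as a ballpark only.

```python
# Approximate TTS prices from this page, in USD per hour of generated audio.
# These are ballpark figures, not official provider rates.
TTS_RATE_PER_HOUR = {
    "openai": 1.00,
    "azure": 1.00,              # assumption: "around the same price" as OpenAI
    "orpheus_deepinfra": 0.35,
    "cartesia": 2.35,
    "deepgram": 1.60,
}

DEEPGRAM_STT_RATE_PER_HOUR = 0.50   # speech-to-text, USD per hour
DEEPGRAM_MIP_DISCOUNT = 0.50        # 50% off with the Model Improvement Program
AZURE_FREE_HOURS_PER_MONTH = 8      # ~0.5 million free characters per month


def estimate_tts_cost(provider: str, hours: float, deepgram_opt_in: bool = False) -> float:
    """Estimate the monthly TTS cost in USD for the given hours of audio."""
    rate = TTS_RATE_PER_HOUR[provider]
    billable_hours = hours

    if provider == "azure":
        # Only the hours beyond the monthly free allowance are billed.
        billable_hours = max(0.0, hours - AZURE_FREE_HOURS_PER_MONTH)
    elif provider == "deepgram" and deepgram_opt_in:
        rate *= 1 - DEEPGRAM_MIP_DISCOUNT

    return round(rate * billable_hours, 2)


def estimate_deepgram_stt_cost(hours: float, opt_in: bool = False) -> float:
    """Estimate the monthly Deepgram speech-to-text cost in USD."""
    rate = DEEPGRAM_STT_RATE_PER_HOUR
    if opt_in:
        rate *= 1 - DEEPGRAM_MIP_DISCOUNT
    return round(rate * hours, 2)


if __name__ == "__main__":
    hours = 10  # example: ten hours of generated speech per month
    for provider in TTS_RATE_PER_HOUR:
        print(f"{provider}: ${estimate_tts_cost(provider, hours):.2f}")
    print(f"deepgram (opted in): ${estimate_tts_cost('deepgram', hours, deepgram_opt_in=True):.2f}")
```

For a low-traffic site, the estimate suggests Azure's free allowance or Orpheus will usually be the cheapest option, while the Deepgram discount only matters if you are comfortable with your data being used for training.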

Text-to-speech also works in popup chatbots:


This is a premium plugin, so please contact me if you’d like to purchase it for a one-time payment of $75 USD.


Let’s enhance your business! I can help you develop custom AI Engine extensions or broader AI strategies.
Contact me.