Verified Video & Streaming Est. 2016

Amazon Polly

An AI service that uses advanced deep learning technologies to synthesize natural-sounding human speech.


US Global


Free Tier

5 million characters per month (Standard voices, for the first 12 months)

Voice Types

Neural and Standard

Languages

30+

About Amazon Polly

Amazon Polly is a cloud service that converts text into lifelike speech, so you can build applications that talk. Its text-to-speech (TTS) engine uses deep learning to synthesize natural-sounding human speech across dozens of voices and languages. It includes a range of Neural Text-to-Speech (NTTS) voices, which sound noticeably more natural than the Standard voices. The service supports Speech Synthesis Markup Language (SSML) for fine-grained control over aspects of speech such as pronunciation, volume, and rate. Developers can stream the generated speech in real time or save it as standard audio files, which makes the service suitable for interactive voice systems, audio content creation, and accessibility applications.
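As a rough illustration of the text-to-speech flow described above, the following Python sketch prepares a request for Polly's SynthesizeSpeech operation and saves the result as an audio file. It assumes the boto3 SDK is installed and AWS credentials are configured; the helper names `build_synthesis_request` and `synthesize_to_file` and the default voice `Joanna` are illustrative choices, not part of the listing.

```python
from typing import Any, Dict


def build_synthesis_request(text: str, voice_id: str = "Joanna",
                            engine: str = "neural",
                            output_format: str = "mp3") -> Dict[str, Any]:
    """Assemble keyword arguments for Polly's SynthesizeSpeech operation.

    The defaults (voice, engine, format) are assumptions for this sketch.
    """
    return {
        "Text": text,
        "VoiceId": voice_id,
        "Engine": engine,
        "OutputFormat": output_format,
    }


def synthesize_to_file(text: str, path: str) -> None:
    """Call Polly and save the audio. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the request builder works without it

    polly = boto3.client("polly")
    response = polly.synthesize_speech(**build_synthesis_request(text))
    # AudioStream is a streaming body; read() returns the full audio bytes.
    with open(path, "wb") as audio_file:
        audio_file.write(response["AudioStream"].read())
```

Calling `synthesize_to_file("Hello from Polly", "hello.mp3")` would write an MP3 file, provided the chosen voice supports the neural engine in your region.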

Core Capabilities

Speech Synthesis

Converts text input into high-quality speech audio.

Neural Voices

Provides natural and expressive speech using Neural Text-to-Speech (NTTS) technology.

Customization

Control speech output with SSML tags for pronunciation, speaking rate, pitch, and volume. Tag support varies by voice type.
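A minimal sketch of how SSML input might be assembled in Python, assuming only rate and volume need adjusting. The helper name `build_ssml` is hypothetical; the `<speak>` and `<prosody>` elements are standard SSML.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text: str, rate: str = "medium", volume: str = "medium") -> str:
    """Wrap plain text in <speak>/<prosody> tags with the given rate and volume."""
    safe_text = escape(text)  # escape &, <, > so the SSML stays well-formed
    return (
        "<speak>"
        f"<prosody rate={quoteattr(rate)} volume={quoteattr(volume)}>"
        f"{safe_text}"
        "</prosody>"
        "</speak>"
    )
```

To have Polly interpret the string as markup rather than literal text, pass it as `Text` together with `TextType="ssml"` in the synthesis request.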

Custom Lexicons

Customize the pronunciation of specific words and phrases.
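Lexicons are written in the W3C Pronunciation Lexicon Specification (PLS) XML format. The sketch below builds a small alias-based lexicon in Python; the helper name `build_lexicon` and the sample entry are assumptions made for illustration.

```python
from typing import Dict
from xml.sax.saxutils import escape

PLS_NAMESPACE = "http://www.w3.org/2005/01/pronunciation-lexicon"


def build_lexicon(entries: Dict[str, str], language: str = "en-US") -> str:
    """Build a PLS document that maps written forms to spoken aliases."""
    lexemes = "".join(
        f"<lexeme><grapheme>{escape(word)}</grapheme>"
        f"<alias>{escape(spoken)}</alias></lexeme>"
        for word, spoken in entries.items()
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f'<lexicon version="1.0" xmlns="{PLS_NAMESPACE}" '
        f'alphabet="ipa" xml:lang="{language}">'
        f"{lexemes}</lexicon>"
    )
```

With boto3, the resulting string could be uploaded via `put_lexicon(Name="acronyms", Content=xml)` and then applied by passing `LexiconNames=["acronyms"]` to `synthesize_speech`; the lexicon name here is only an example.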

Technical Features

Real-Time Streaming

Enables immediate playback of synthesized speech.
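One way to take advantage of streaming is to consume the audio in chunks as it arrives instead of waiting for the whole response. The sketch below assumes a file-like stream with a `read(size)` method, which is how boto3 exposes the `AudioStream` field; the generator name `iter_audio_chunks` and the chunk size are illustrative.

```python
from typing import BinaryIO, Iterator


def iter_audio_chunks(stream: BinaryIO, chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield audio data in fixed-size chunks until the stream is exhausted."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk
```

Each chunk can be handed to an audio player or forwarded to a client as soon as it is read, for example `for chunk in iter_audio_chunks(response["AudioStream"]): player.write(chunk)`, where `player` is a hypothetical playback object.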

Audio Formats

Supports MP3, Ogg Vorbis, and raw PCM audio streams.
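Raw PCM output has no container header, so most players cannot open it directly. The following sketch wraps PCM bytes in a WAV container using Python's standard `wave` module. It assumes Polly's PCM format of 16-bit signed, mono, little-endian samples at 16 kHz (8 kHz is also selectable); the function name `pcm_to_wav` is hypothetical.

```python
import io
import wave


def pcm_to_wav(pcm: bytes, sample_rate: int = 16000) -> bytes:
    """Wrap raw 16-bit mono PCM samples in a WAV container."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)       # mono
        wav_file.setsampwidth(2)       # 2 bytes per sample = 16-bit
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    return buffer.getvalue()
```

The sample rate passed here should match the `SampleRate` used in the synthesis request, otherwise playback will sound too fast or too slow.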

Speech Marks

Provides metadata to synchronize facial animation or highlight text as it's being spoken.
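Speech marks are returned as line-delimited JSON objects when a request uses `OutputFormat="json"` with a `SpeechMarkTypes` list. As a hedged sketch of text highlighting, the Python below parses such a payload and finds the word being spoken at a given playback time; the function names and sample data are made up for illustration.

```python
import json
from typing import Dict, List, Optional


def parse_speech_marks(payload: bytes) -> List[Dict]:
    """Parse line-delimited JSON speech marks into a list of dictionaries."""
    marks = []
    for line in payload.decode("utf-8").splitlines():
        if line.strip():  # skip blank lines such as a trailing newline
            marks.append(json.loads(line))
    return marks


def word_at(marks: List[Dict], time_ms: int) -> Optional[str]:
    """Return the latest word mark that starts at or before time_ms."""
    current = None
    for mark in marks:
        if mark["type"] == "word" and mark["time"] <= time_ms:
            current = mark["value"]
    return current
```

A player could call `word_at(marks, position_ms)` on each timer tick and highlight the returned word; the `start` and `end` fields of each mark give the word's byte offsets in the input text.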

Integration

Accessible via the AWS Management Console, CLI, and SDKs for various programming languages.
