Amazon Polly
Verified Audio & Speech Generation Est. 2016

An AI service that uses advanced deep learning technologies to synthesize natural-sounding human speech.

US Global

Free Tier

5 million characters per month (Standard voices, first 12 months)

Voice Types

Neural and Standard

Languages

30+

About Amazon Polly

Amazon Polly is a cloud text-to-speech (TTS) service that converts text into lifelike speech, so you can build applications that talk. It uses deep learning to synthesize natural-sounding speech across dozens of voices and languages, including Neural Text-to-Speech (NTTS) voices that improve noticeably on the Standard voices in speech quality. The service supports Speech Synthesis Markup Language (SSML) for fine-grained control over pronunciation, volume, and speaking rate. Developers can stream speech in real time or save it as standard audio files, which makes Polly suitable for interactive voice systems, audio content creation, and accessibility applications.

Core Capabilities

Speech Synthesis

Converts text input into high-quality speech audio.
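
A minimal sketch of a synthesis request, assuming the boto3 SDK is installed and AWS credentials are already configured. The helper names build_request and synthesize_to_file are hypothetical, and the "Joanna" voice is only an illustrative default.

```python
from typing import Any, Dict

# Audio formats named in this listing, spelled as the API expects them.
AUDIO_FORMATS = {"mp3", "ogg_vorbis", "pcm"}


def build_request(text: str, voice_id: str = "Joanna", engine: str = "neural",
                  output_format: str = "mp3") -> Dict[str, Any]:
    """Assemble keyword arguments for Polly's SynthesizeSpeech operation."""
    if output_format not in AUDIO_FORMATS:
        raise ValueError(f"unsupported audio format: {output_format!r}")
    return {
        "Text": text,
        "VoiceId": voice_id,
        "Engine": engine,
        "OutputFormat": output_format,
    }


def synthesize_to_file(text: str, path: str, **options: Any) -> None:
    """Call Polly and write the returned audio stream to a local file."""
    import boto3  # third-party AWS SDK, imported lazily (assumed installed)

    polly = boto3.client("polly")
    response = polly.synthesize_speech(**build_request(text, **options))
    with open(path, "wb") as audio_file:
        audio_file.write(response["AudioStream"].read())
```

Calling synthesize_to_file("Hello from Polly", "hello.mp3") would write an MP3 file, provided the account has access to the chosen voice and engine.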

Neural Voices

Provides natural and expressive speech using Neural Text-to-Speech (NTTS) technology.

Customization

Control speech output with SSML tags for pronunciation, speed, pitch, and volume.
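
As an illustration, the sketch below wraps plain text in an SSML prosody tag that sets rate and volume. The function name prosody_ssml is hypothetical. Pitch is left out because, to the best of our knowledge, Neural voices do not accept it.

```python
from xml.sax.saxutils import escape, quoteattr


def prosody_ssml(text: str, rate: str = "medium", volume: str = "medium") -> str:
    """Wrap text in <speak><prosody ...> with XML-safe escaping."""
    return (
        "<speak>"
        f"<prosody rate={quoteattr(rate)} volume={quoteattr(volume)}>"
        f"{escape(text)}"
        "</prosody>"
        "</speak>"
    )
```

The resulting string is sent as the request text with the text type set to "ssml" instead of plain text.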

Custom Lexicons

Customize the pronunciation of specific words and phrases.
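
Lexicons are written in the W3C Pronunciation Lexicon Specification (PLS) XML format. The sketch below builds a small alias-based lexicon. The function name build_lexicon and the sample entries are hypothetical.

```python
from typing import Dict
from xml.sax.saxutils import escape, quoteattr

PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"


def build_lexicon(aliases: Dict[str, str], lang: str = "en-US") -> str:
    """Return a PLS document mapping each written form to a spoken alias."""
    lexemes = "".join(
        "  <lexeme>\n"
        f"    <grapheme>{escape(written)}</grapheme>\n"
        f"    <alias>{escape(spoken)}</alias>\n"
        "  </lexeme>\n"
        for written, spoken in aliases.items()
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<lexicon version="1.0" xmlns="{PLS_NS}" '
        f'alphabet="ipa" xml:lang={quoteattr(lang)}>\n'
        f"{lexemes}"
        "</lexicon>\n"
    )
```

With boto3, a document like this could be uploaded through the client's put_lexicon(Name=..., Content=...) call and then referenced by name in later synthesis requests.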

Technical Features

Real-Time Streaming

Enables immediate playback of synthesized speech.

Audio Formats

Supports MP3, Ogg Vorbis, and raw PCM audio streams.
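
Raw PCM output carries no file header, so most players need it wrapped first. The sketch below adds a WAV header using only the standard library, assuming 16-bit signed little-endian mono samples at 16 kHz. The function name pcm_to_wav is hypothetical, and the sample rate should match whatever was requested.

```python
import io
import wave


def pcm_to_wav(pcm: bytes, sample_rate: int = 16000) -> bytes:
    """Wrap raw 16-bit mono PCM samples in a WAV container."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)      # mono
        wav_file.setsampwidth(2)      # 2 bytes = 16-bit samples
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    return buffer.getvalue()
```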

Speech Marks

Provides metadata to synchronize facial animation or highlight text as it's being spoken.
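
Speech marks are returned as one JSON object per line, each with a time offset in milliseconds from the start of the audio. The sketch below parses that output and looks up the word being spoken at a given moment. The function names and the SAMPLE payload are hypothetical illustrations, not real service output.

```python
import json
from typing import Any, Dict, List, Optional

# Hypothetical payload in the line-delimited JSON shape used for speech marks.
SAMPLE = (
    '{"time":0,"type":"word","start":0,"end":4,"value":"Mary"}\n'
    '{"time":373,"type":"word","start":5,"end":8,"value":"had"}\n'
)


def parse_speech_marks(payload: str) -> List[Dict[str, Any]]:
    """Decode each non-empty line as one speech mark."""
    return [json.loads(line) for line in payload.splitlines() if line.strip()]


def word_at(marks: List[Dict[str, Any]], t_ms: int) -> Optional[Dict[str, Any]]:
    """Return the latest word mark that starts at or before t_ms, if any."""
    current = None
    for mark in marks:
        if mark["type"] != "word":
            continue
        if mark["time"] > t_ms:
            break  # marks arrive in time order, so later ones cannot match
        current = mark
    return current
```

A text highlighter could call word_at on each playback tick and use the mark's start and end offsets to select the span to highlight.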

Integration

Accessible via the AWS Management Console, CLI, and SDKs for various programming languages.

Tags

API Global Enterprise
Founded 2016
Country US
Coverage Global
Access Type Self-serve
Pricing Model Usage-based
Pricing Visibility Public
Auth Method AWS IAM credentials (Signature Version 4)
Sandbox Not available