Descript Audio — User Guide

Edit podcasts by editing text, and remove filler words instantly.

VPN may be required · Freemium · Sign-up required
Strengths
  • Text editing is video editing: deleting text from the transcript automatically deletes the corresponding video clips.
  • AI automatically removes verbal slips and filler words (um, ah, etc.)
  • Voice cloning corrects recording errors without re-recording
  • Subtitles are generated automatically with high accuracy
Best for
  • Podcast recording and post-production
  • YouTube video editing and subtitle generation
  • Online course video production
  • Organizing and editing conference videos

Text-Driven Video Editing

Descript transcribes a video into text, and editing that text edits the video.

Scenario

Quickly remove verbal slips from a video

Prompt example
After importing the video, find the verbal slip in the transcript, select it, and delete it.
Output / what to expect
The corresponding video clips are deleted automatically, and the edit points transition smoothly with no need to drag anything on the timeline.
Tips

Turn on the "Remove filler words" feature. Descript then marks every "um", "ah", and other filler word so you can delete them all with one click.
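The link between text edits and video cuts can be pictured with a small sketch. It is not Descript's implementation. It assumes a transcript in which every word carries a start and end time, and the `Word` type, the `FILLERS` set, and the `keep_ranges` helper are hypothetical names used only for illustration.

```python
from dataclasses import dataclass

# Hypothetical filler list for illustration only.
FILLERS = {"um", "uh", "ah"}


@dataclass
class Word:
    text: str
    start: float  # seconds from the beginning of the recording
    end: float


def is_filler(word: Word) -> bool:
    """True when the word, ignoring case and trailing punctuation, is a filler."""
    return word.text.lower().strip(".,!?") in FILLERS


def keep_ranges(words, deleted):
    """Merge the time spans of all words whose index is not in `deleted`."""
    ranges = []
    for i, w in enumerate(words):
        if i in deleted:
            continue
        if ranges and (i - 1) not in deleted:
            # Previous word was kept too, so extend the current range.
            ranges[-1] = (ranges[-1][0], w.end)
        else:
            ranges.append((w.start, w.end))
    return ranges


words = [
    Word("So", 0.0, 0.3),
    Word("um", 0.3, 0.7),
    Word("welcome", 0.7, 1.2),
    Word("back", 1.2, 1.5),
]
deleted = {i for i, w in enumerate(words) if is_filler(w)}
print(keep_ranges(words, deleted))  # [(0.0, 0.3), (0.7, 1.5)]
```

Deleting a word in the text amounts to adding its index to `deleted`. The remaining ranges are the parts of the media that would be kept, which is why no manual timeline work is needed.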

Scenario

Generate and edit subtitles

Prompt example
Import the video, click "Transcribe", select Chinese as the language, and wait for the automatic transcription to finish.
Output / what to expect
Subtitles are generated automatically with an accuracy of more than 90%. Errors can be corrected directly in the text editor, and the subtitles stay synchronized with the video in real time.
Tips

When exporting subtitles, select the SRT format. SRT files can be uploaded directly to platforms such as YouTube and Bilibili.
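For readers unfamiliar with SRT, the sketch below shows what such a file contains: a sequence number, a time range written as `HH:MM:SS,mmm --> HH:MM:SS,mmm`, the subtitle text, and a blank line between cues. The helper names `srt_timestamp` and `to_srt` and the sample cue are assumptions made for illustration. This is not how Descript produces its export.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues) -> str:
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for n, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{n}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(blocks)


print(to_srt([(0.0, 1.5, "Welcome back."), (1.5, 4.0, "Today we cover editing.")]))
```

Because the format is plain text, a subtitle file exported this way can be opened and spot-checked in any text editor before uploading.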

Voice Cloning and Repair

Use the Overdub feature to clone a voice and fix recording errors.

Scenario

Correct a mistake in a recording

Prompt example
Select the text that needs to be changed, type the correct content, and click "Regenerate with Overdub".
Output / what to expect
The AI regenerates the audio segment in the cloned voice and blends it seamlessly into the original recording, so no re-recording is needed.
Tips

Voice cloning first requires recording 10 minutes of training material. Once training is complete, the result sounds very natural.
