AI Creates Realistic Sound Effects for Video Clips

A team at the University of North Carolina at Chapel Hill is working on an AI that can add sound effects to video clips that are so realistic, humans can’t tell the difference.

When AI Supplies the Sound in Video Clips, Humans Can’t Tell the Difference

Now we have an answer, thanks to the work of Yipin Zhou and colleagues at the University of North Carolina at Chapel Hill, together with collaborators at Adobe Research. They have trained a machine-learning algorithm to generate realistic soundtracks for short video clips.

Indeed, the sounds are so realistic that they fool most humans into thinking they are real. You can take a test yourself here to see if you can tell the difference.

The team takes the standard approach to machine learning. Algorithms are only ever as good as the data used to train them, so the first step is to create a large, high-quality annotated data set of video examples.

The team creates this data set by selecting a subset of clips from a Google collection called AudioSet, which consists of more than two million 10-second YouTube clips that all include audio events. These videos are divided into human-labeled categories covering things like dogs, chainsaws, helicopters, and so on.
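The category-selection step described above can be sketched in a few lines. This is a minimal illustration, not the team's actual pipeline: the CSV layout loosely mirrors AudioSet's public segment files, but the field names and category list here are assumptions.

```python
import csv
import io

# Hypothetical clip metadata: YouTube ID, start second, human-assigned label.
# The layout is illustrative; the real AudioSet files differ in detail.
CSV_DATA = """clip_id,start,label
abc123,30,dog
def456,12,helicopter
ghi789,5,speech
jkl012,44,chainsaw
"""

# Assumed subset of categories with clearly visible sound sources.
TARGET_CATEGORIES = {"dog", "chainsaw", "helicopter"}

def select_clips(csv_text, categories):
    """Keep only the 10-second clips whose label is in the chosen categories."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [row["clip_id"] for row in rows if row["label"] in categories]

print(select_clips(CSV_DATA, TARGET_CATEGORIES))  # → ['abc123', 'def456', 'jkl012']
```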

To train a machine, the team needs clips in which the sound source is clearly visible, so any video containing audio from off-screen events is unsuitable. The team filters these out using crowdsourced workers from Amazon’s Mechanical Turk service, who identify clips in which the audio source is clearly visible and dominates the soundtrack.

Read More at Technology Review
