On Friday, I linked to several videos by Vocal Synthesis, a new YouTube channel dedicated to audio deepfakes — AI-generated speech that mimics human voices, synthesized from text by training a state-of-the-art neural network on a large corpus of audio.
The videos are remarkable, pairing famous voices with unlikely dialogue: Bob Dylan singing Britney Spears, Ayn Rand and Slavoj Žižek dueting Sonny and Cher, Tucker Carlson reading the Unabomber Manifesto, Bill Clinton reciting “Baby Got Back,” or JFK touting the intellectual merits of Rick and Morty.
Many of the videos have been remixed by fans, adding music to create hilarious and surreal musical mashups. Six U.S. presidents from FDR to Obama rap N.W.A.’s “Fuck Tha Police,” George W. Bush covers 50 Cent’s “In Da Club,” Obama covers Notorious B.I.G.’s “Juicy,” and, my personal favorite, Sinatra slurs his way through the Navy Seal copypasta, a decade-old 4chan meme.
Videos Taken Offline
Over the weekend, for the first time, the anonymous creator of Vocal Synthesis received copyright claims on YouTube that took two of his videos offline. Both featured deepfaked audio of Jay-Z: one reciting the “To Be or Not To Be” soliloquy from Hamlet, the other Billy Joel’s “We Didn’t Start the Fire.”
According to the creator, the copyright claims were filed by Roc Nation LLC with an unusual reason for removal: “This content unlawfully uses an AI to impersonate our client’s voice.”