Multilingual Voice Talent For Your Videos On The Sly (Part II)

In the first installment of this article I outlined the issue and presented a few sample videos as proof of concept. Today I'll show you how I produced the voiceovers in those video clips.

If you have watched the sample clips carefully and, most of all, listened to them, you will of course have noticed that the voices of the speakers are synthetic voices. In recent years, speech synthesis has become surprisingly good, making it particularly suitable for video amateurs.

Synthetic speech can be conveniently generated at your own desk, straight from the text to be spoken. The detour via human speakers in front of a microphone is no longer necessary. Google in particular offers a large number of voices in several languages, and these can also be adjusted to the needs of the film.

It should also be noted that synthetic voices create plausible deniability for „lone wulff“ producers, pun intended ;-) In an increasingly fascistoid environment, this is particularly interesting for political commentators. Many of us are on STEEM and not on Facebook, YouTube et al. precisely because of the freedom and privacy it affords us.

Long story short, speech synthesis starts with text. If you need multilingual versions, you first run your text through translation software. Even if you speak the foreign language in question, the translator saves a lot of typing. The DeepL translator by Linguee often produces more polished translations, but the Google translator has the advantage that you can have both texts read out loud right away.

Im ersten Teil dieses Artikels habe ich das Problem umrissen und einige Beispielvideos als Proof of Concept präsentiert. Heute zeige ich euch, wie ich die Voiceovers in diesen Videoclips hingekriegt habe.

Wer sich die Beispielclips aufmerksam angeschaut und vor allem zugehört hat, dem ist natürlich aufgefallen, dass es sich bei den Stimmen der Sprecher um synthetische Stimmen handelt. In den letzten Jahren ist die Sprachsynthese erstaunlich gut geworden, so dass sie sich besonders für Videoamateure geradezu anbietet.

Sprachsynthese lässt sich bequem am eigenen Schreibtisch direkt aus dem zu sprechenden Text heraus generieren. Der Umweg über menschliche Sprecher vor dem Mikro entfällt. Besonders Google bietet eine große Anzahl verschiedener Stimmen in etlichen Sprachen an, die noch zusätzlich an die Bedürfnisse des Films angepasst werden können.

Ebenfalls nicht von der Hand zu weisen ist, dass synthetische Stimmen für einen „Einzelkämpfer“-Produzenten eine glaubhafte Abstreitbarkeit herstellen. In einem mehr und mehr faschistoiden Umfeld ist das besonders für politische Kommentatoren interessant. Viele von uns tummeln sich gerade wegen der Freiheit und Privatsphäre hier auf dem STEEM und eben nicht bei Facebook, Youtube et al.

Lange Rede, kurzer Sinn, Sprachsynthese fängt mit Text an. Wer mehrsprachige Versionen benötigt, lässt diesen Text zunächst durch einen Übersetzer laufen. Auch wenn man die betreffende Fremdsprache beherrscht, spart der Übersetzer eine Menge Tipparbeit. Der DeepL Übersetzer von Linguee liefert oft schlüssigere Übersetzungsergebnisse, aber der Google Übersetzer hat den Vorteil, dass man beide Texte gleich vorlesen lassen kann.

If you want, you can record this speech with your preferred audio program and then get on with it. However, you have more creative options with a dedicated program that can customize voice pitch and speed, and output an audio file directly.

Microsoft and Apple customers may forgive me, but I can't give you a recommendation – Google is your friend! As far as I know, there is no really satisfactory solution for my Linux, which is why I'm using my Android smartphone instead.

I'm a glutton for books and have dozens of e-books on my smartphone at any given time. In the car I usually have them read to me by my „@Voice Aloud Reader“ app (Android), and I'm always amazed at the speech quality! That is how I came up with the video voiceover idea in the first place.

Wer möchte, kann den vorgelesenen Text mit seinem bevorzugten Audioprogramm gleich aufnehmen und dann weiter verwenden. Mehr Möglichkeiten hat man allerdings mit einem Programm, das Stimme und Sprechgeschwindigkeit beeinflussen kann und anschließend direkt eine Audiodatei ausgibt.

Microsoft- und Apple-Kunden mögen es mir verzeihen, aber ich kann euch leider keine Empfehlung abgeben, deshalb: Selber googeln! Für mein Linux gibt es meines Wissens allerdings keine wirklich zufriedenstellende Lösung, weshalb ich stattdessen mein Android-Smartphone verwende.

Ich bin ein großer Büchernarr und habe dutzendweise E-Books auf dem Smartphone. Im Auto lasse ich mir diese oft von meiner „@Voice Aloud Reader“-App (Android) vorlesen und bin immer wieder über die Sprachqualität erstaunt. So bin ich dann letztendlich auf die Idee mit den Voiceovers gekommen.

This free app gives you easy access to a huge selection of TTS voices, and you can easily change their pitch, speed, and volume. Pronunciation issues can be corrected via a custom dictionary, although for the sake of simplicity I do this right in the source text. For example, I have spelled the German name „Nied“ as „Need“ to make it sound right.
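
If you correct pronunciation in the source text rather than in the app's dictionary, the respelling can be automated before the text goes to the TTS engine. What follows is only a minimal sketch in Python: the table `RESPELLINGS` and the function `respell` are hypothetical names made up for illustration, not part of @Voice Aloud Reader or any other tool, and the only entry is the „Nied“ example from above.

```python
import re

# Hypothetical respelling table: written form -> spelling the TTS voice pronounces correctly.
RESPELLINGS = {
    "Nied": "Need",
}


def respell(text, table=RESPELLINGS):
    """Replace whole words found in the table so the TTS voice reads them as intended."""
    if not table:
        return text
    # Longest keys first, so a longer name wins over a shorter one it contains.
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b")
    return pattern.sub(lambda match: table[match.group(1)], text)


print(respell("We crossed the Nied at noon."))
```

The word boundaries (`\b`) keep the replacement from touching longer words that merely start with the same letters, such as „Niedrig“.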

The text reading can then be exported as a WAV file or in OGG format in three different quality levels. The creation of the sound file is much faster than the actual reading time. Shuffling files back and forth is a piece of cake over the home network, and for testing and tweaking talking speed vis-à-vis the video, the smartphone in one hand and the computer mouse in the other is actually an advantage.
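
When matching the talking speed to the video, it can help to put numbers on it. The following sketch reads the playing time of an exported WAV file with Python's standard `wave` module and works out by what factor the speaking rate would have to change to fit a clip of a given length. The helper names and the file name in the usage note are assumptions for illustration, and how such a factor maps onto the app's speed setting is not something this sketch knows.

```python
import wave


def wav_duration_seconds(path):
    """Return the playing time of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def speed_factor(speech_seconds, clip_seconds):
    """Factor by which the speaking rate must change so the speech fits the clip.

    Values above 1.0 mean "speak faster", values below 1.0 mean "speak slower".
    """
    if clip_seconds <= 0:
        raise ValueError("clip_seconds must be positive")
    return speech_seconds / clip_seconds
```

For a hypothetical export named `voiceover.wav` and a 40-second clip, `speed_factor(wav_duration_seconds("voiceover.wav"), 40.0)` gives a rough starting point; the final tweaking is still done by ear.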

Diese kostenlose App ermöglicht leichten Zugriff auf eine riesige Auswahl von TTS-Stimmen, bei denen zusätzlich noch die Stimmlage, Sprechgeschwindigkeit und Lautstärke verändert werden können. Fehlbetonungen können in einem Wörterbuch korrigiert werden, obwohl ich das der Einfachheit halber gleich im Quelltext vornehme. So habe ich z.B. das Wort „Lothringen“ als „Lootringen“ buchstabiert, damit es richtig klingt.

Dann kann man den vorgelesenen Text als WAV-Datei oder im OGG-Format in drei verschiedenen Qualitätsstufen exportieren. Die Erzeugung der Sounddatei ist dabei deutlich schneller als die eigentliche Lesezeit. Über das Heimnetzwerk ist das Hin- und Herschieben der Dateien eine lockere Angelegenheit und für das Ausprobieren der Sprechdauer im Video ist das Smartphone in der einen und die Computermaus in der anderen Hand eigentlich sogar ein Vorteil.

Finally, all you have to do is import the finished sound files into your video editor and refine them a bit. For my example videos I had the text read by a female and a male voice in turn and placed their clips alternately on two timelines. Then I normalized the volume to 100%, positioned the „woman“ slightly left, and the „man“ slightly right within the stereo image. Done!
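
For the curious, the two finishing steps (normalizing to 100% and placing each voice in the stereo image) boil down to simple arithmetic on the audio samples. The sketch below shows one common way to do it in Python on plain lists of floating-point samples in the range -1.0 to 1.0. It is not how any particular video editor implements these functions: the constant-power pan law, the positions of -0.3 and +0.3 for "slightly left" and "slightly right", and the sample values are all assumptions chosen for illustration.

```python
import math


def normalize_peak(samples, target=1.0):
    """Scale samples so the loudest one reaches `target` (1.0 = 100% of full scale)."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silence stays silence
    gain = target / peak
    return [s * gain for s in samples]


def pan(samples, position):
    """Place a mono signal in the stereo image using a constant-power pan law.

    position: -1.0 = hard left, 0.0 = centre, +1.0 = hard right.
    Returns a list of (left, right) sample pairs.
    """
    if not -1.0 <= position <= 1.0:
        raise ValueError("position must be between -1.0 and 1.0")
    angle = (position + 1.0) * math.pi / 4.0  # maps -1..+1 onto 0..pi/2
    left_gain, right_gain = math.cos(angle), math.sin(angle)
    return [(s * left_gain, s * right_gain) for s in samples]


# Made-up sample values standing in for the two voice recordings.
woman = pan(normalize_peak([0.1, -0.25, 0.2]), -0.3)  # slightly left (assumed value)
man = pan(normalize_peak([0.05, 0.4, -0.1]), 0.3)     # slightly right (assumed value)
```

With a constant-power law the perceived loudness stays roughly the same wherever the voice is placed, which is why it is a popular choice for panning speech.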

In closing I would like to go to bat for kdenlive. If you haven't heard of this free NLE video editor yet, you're missing out! kdenlive is available for several Linux flavors, Microsoft Windows, and macOS. For many years I was a loyal user of Sony VEGAS, which is still running in a virtualized Windows on my Linux box. However, I noticed that I don't use it anymore and I don't miss it either. Take a look at kdenlive today!

Letztendlich müsst Ihr nur noch die fertigen Sound-Dateien in euren Videoeditor importieren und dort noch etwas verfeinern. Für meine Beispielvideos habe ich den Text jeweils von einer weiblichen und einer männlichen Stimme lesen lassen und abwechselnd auf zwei Zeitachsen verteilt. Danach habe ich die Lautstärken auf 100% normalisiert, die „Frau“ halblinks und den „Mann“ halbrechts im Stereobild positioniert. Fertig!

Als Schlusswort möchte ich eine Lanze für kdenlive brechen. Wer diesen freien und kostenlosen NLE Videoeditor noch nicht kennt, hat was verpasst! kdenlive ist erhältlich für etliche Linux-Geschmacksrichtungen, Microsoft Windows und Mac OSX. Viele Jahre lang war ich ein treuer Anhänger von Sony VEGAS, was auch nach wie vor noch in einem virtualisierten Windows auf meiner Linux-Box läuft. Allerdings habe ich festgestellt, dass ich es nicht mehr benutze und auch nicht vermisse. Schaut euch kdenlive unbedingt mal an!

Thanks for looking, thanks for reading, and don't forget to upvote and resteem if you liked my musings. Catch you next time!

Danke fürs Anschauen, danke fürs Lesen – und vergesst bitte nicht das Upvoten und Resteemen, wenn euch mein Beitrag gefallen hat. Bis zum nächsten Mal!