23-07-2024 , 24-07-2024
Speechgen offers a unique economical caching feature that significantly reduces time and costs for text-to-speech conversion. In this article, we will explore how this feature works, its benefits, and how it helps you save during voiceovers.
When you synthesize speech, Speechgen remembers the result of each sentence. For example:
Imagine you are working on voicing an educational course with 20 lessons. After completing the work, you decide to add a brief introduction to each lesson. With a regular service, you would have to voice the entire material again, leading to significant costs. With Speechgen, you will only pay for voicing the new introductions, saving resources and time.
Here’s a comparison of Speechgen with other services:
| Example | Other TTS | Speechgen |
| --- | --- | --- |
| Example #1: 30 sentences | 100% cost | 100% cost |
| Example #2: 30 sentences + 10 new | 100% cost | 25% cost |
With other speech synthesis services, every voiceover is billed at 100% of the text you voice. With Speechgen, only new or changed sentences are voiced. As seen in the table, on the repeated voiceover Speechgen used only 25% of the total character count instead of 100%, because the other 75% of the text, the original 30 sentences, was taken from previously voiced content.
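The arithmetic behind the 25% figure can be sketched in a few lines. This is only an illustration: the function name `billed_share` is hypothetical, and it counts sentences as equal-sized units, whereas Speechgen bills by character count.

```python
def billed_share(cached_units: int, new_units: int) -> float:
    """Fraction of the text that is billed when cached units cost nothing."""
    total = cached_units + new_units
    if total == 0:
        return 0.0
    return new_units / total

# Example #1: first voiceover, nothing is cached yet, all 30 sentences are billed.
print(billed_share(cached_units=0, new_units=30))   # 1.0

# Example #2: 30 sentences come from the cache, 10 new ones are voiced.
print(billed_share(cached_units=30, new_units=10))  # 0.25
```

With real texts the share depends on how long the new sentences are compared with the cached ones, so the result will rarely be exactly 25%.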
This means you don't need to worry about repeated costs when revising your text. You can return to your text later and work with it.
Speechgen can voice up to 2,000,000 characters at once, but economical caching works only for texts of up to 100,000 characters. Above that limit, a book mode is used for faster voicing of large texts; it processes the text in large blocks instead of sentence by sentence.
Voiced sentences are stored in memory for only 1 week. You have 7 days to supplement or revise the voiceover.
Additionally, in your profile, the complete voiceover history is stored for 30 days. This means that within 30 days you can download the text and file in their entirety. However, the cache itself will be stored for only 7 days.
If you decide, for example, to add to the voiceover after 25 days, the limits will be deducted again for the entire project. By saving the voiceover to favorites, you can keep the audio with the text forever, but the cache will still only be stored for 7 days.
Your text and audio file are saved in your profile, but not the cache, so please keep this in mind when working.
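The two retention periods are easy to confuse, so here is a minimal sketch of how they relate. The function `retention_status` and its field names are hypothetical, and treating exactly 7 or 30 days as still valid is an assumption; only the 7-day and 30-day periods and the favorites rule come from the text above.

```python
from datetime import datetime, timedelta

CACHE_TTL = timedelta(days=7)     # economical cache lifetime
HISTORY_TTL = timedelta(days=30)  # voiceover history lifetime in the profile

def retention_status(voiced_at: datetime, now: datetime,
                     in_favorites: bool = False) -> dict:
    """Report what is still available for a voiceover made at `voiced_at`."""
    age = now - voiced_at
    return {
        # Favorites keep the text and audio, but never extend the cache.
        "cache_available": age <= CACHE_TTL,
        "download_available": in_favorites or age <= HISTORY_TTL,
    }

voiced = datetime(2024, 7, 1)
print(retention_status(voiced, voiced + timedelta(days=3)))
print(retention_status(voiced, voiced + timedelta(days=25)))
print(retention_status(voiced, voiced + timedelta(days=40), in_favorites=True))
```

On day 25 the file can still be downloaded, but adding to the voiceover means the whole project is billed again, because the cache is gone.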
The cache works only for unchanged sentences. If you change even one letter or remove a comma in a sentence, the system treats it as a new sentence.
Original Text:
Adding a new sentence:
Result: Speechgen takes the first three sentences from the cache and voices only the fourth one. Costs are incurred only for the fourth sentence.
Original Text:
Changing one word in the second sentence:
Result: Speechgen takes the first and third sentences from the cache but voices the second one again.
Original Text:
Removing the commas in the third sentence:
Result: Speechgen will re-voice the third sentence, and take the first and second sentences from the cache. The third sentence is considered changed due to the removal of commas.
If you add a new pause tag, such as break, it is also considered a change to the sentence. The system will reanalyze and revoice it.
<break time="200ms"/>
In fact, sentences are retrieved from the economical cache based on a complete match, character by character. If there is any new character or if a character is missing in the sentence, the program will not be able to match it exactly.
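The exact-match rule can be illustrated with a small sketch. This is not Speechgen's implementation: the naive sentence splitter, the function names, and the sample sentences are all assumptions made for the example. It only shows why a single removed comma causes a cache miss.

```python
import re
from typing import List, Set, Tuple

def split_sentences(text: str) -> List[str]:
    # Naive splitter (an assumption): break after ., ! or ? followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def plan_voiceover(text: str, cache: Set[str]) -> Tuple[List[str], List[str], int]:
    """Return (sentences reused from cache, sentences to voice, billed characters)."""
    reused, to_voice = [], []
    for sentence in split_sentences(text):
        # Exact, character-by-character comparison: membership test on the raw string.
        if sentence in cache:
            reused.append(sentence)
        else:
            to_voice.append(sentence)
    cache.update(to_voice)  # newly voiced sentences become reusable
    billed = sum(len(s) for s in to_voice)
    return reused, to_voice, billed

cache: Set[str] = set()
original = ("The sky is blue. Birds sing in the morning. "
            "First, second, and third items follow.")
plan_voiceover(original, cache)  # first run: everything is voiced

# Adding a sentence: only the new one is voiced.
_, to_voice, _ = plan_voiceover(original + " A fourth sentence is added.", cache)
print(to_voice)

# Removing the commas in the third sentence: only that sentence is voiced again.
edited = original.replace("First, second, and third", "First second and third")
_, to_voice, _ = plan_voiceover(edited, cache)
print(to_voice)
```

Because the lookup uses the raw string, inserting a pause tag into a sentence changes the string and produces a miss in the same way.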
If you change the speed or tone settings, it will be a completely new voiceover, and the economical cache will not work. When you change the speed or tone, the neural network revoices the text with these new parameters. This is not a software speed-up or tone change; it is a full revoice.
Changing the speaker also results in a complete revoice. Here, the neural network does all the work again. Therefore, if you are adjusting the voice, do this for 1-2 sentences, and once you are satisfied with the speed and tone, voice the entire desired text.
On this special page, https://speechgen.io/en/subs/, you can voice subtitles. Speech often has to be sped up to fit the required subtitle timing. In this case, the economical cache works, because Speechgen first voices the subtitle and then speeds it up programmatically.
You can change the pauses in the settings under the voicing field, and the cache will work perfectly. We save entire sentences to memory, and the system then combines them into audio. This way, you can adjust pauses between sentences or paragraphs without additional costs.
If you select a different format (ogg, wav, or opus) and press revoice, no limits are deducted. This is free. If you have voiced a text and then realize you need a different format, you can change it without fear of paying twice.
If you change the Sample Rate in the settings and press revoice again, no limits are deducted either. This is also free.
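One way to picture which settings matter is to imagine a cache key built only from the parameters the neural network actually uses. The sketch below is a hypothetical model, not Speechgen's internals: the `Settings` class, its field names, and its default values are assumptions. It treats speaker, speed, and tone as part of the key, and pauses, format, and sample rate as post-processing.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Settings:
    speaker: str
    speed: float = 1.0
    tone: float = 0.0
    pause_ms: int = 300        # pause between sentences (hypothetical default)
    audio_format: str = "wav"  # ogg, wav, opus
    sample_rate: int = 24000   # hypothetical default

# Settings that force the neural network to voice the sentence again.
SYNTHESIS_FIELDS = ("speaker", "speed", "tone")

def cache_key(sentence: str, settings: Settings) -> tuple:
    """Key made of the exact sentence text plus the synthesis settings only."""
    return (sentence,) + tuple(getattr(settings, f) for f in SYNTHESIS_FIELDS)

base = Settings(speaker="Anna")  # hypothetical speaker name
sentence = "Welcome to lesson one."

# Format, sample rate, and pauses do not change the key: the cache still hits.
same = replace(base, audio_format="ogg", sample_rate=48000, pause_ms=500)
print(cache_key(sentence, base) == cache_key(sentence, same))    # True

# Speed, tone, or speaker changes the key: the sentence is voiced and billed again.
faster = replace(base, speed=1.2)
print(cache_key(sentence, base) == cache_key(sentence, faster))  # False
```

This is also why it pays to tune speaker, speed, and tone on one or two sentences first: every change to those three fields misses the cache for the whole text.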
Speechgen's economical caching system offers significant advantages:
Speechgen saves your resources and provides tools for more efficient work with audio content, making it an ideal choice for those who value efficiency and quality in speech synthesis.