Microsoft Sam TTS Generator is an online interface for part of Microsoft Speech API 4.0, which was released in 1998. Note that the BonziBUDDY voice is actually "Adult Male #2" with a specific pitch and speed. All voices have lower and upper pitch and speed limits.

Wait for the generated audio to appear in the audio player. It should be done nearly instantly, as the interface tries to generate audio at x16777215 real time. To save the generated audio, right-click the audio player and press "Save audio as...".

This section is used to inform website visitors regarding policies with the collection, use, and disclosure of Personal Information if anyone decided to use this service. We want to inform you that whenever you use this service, we collect information that your browser sends to us. This includes your computer's Internet Protocol ("IP") address, browser user-agent, and the time and date of your visit. This information is collected by major web servers by default. We use Google Analytics to understand how the site is being used in order to improve your user experience. User data is all anonymous.

So one of the big reasons games need to cut down on dialogue is voice acting. Having a lot of dialogue means you need a lot of voice acting. It's a lot less acceptable these days to have predominantly text-based dialogue.

A prime example is Morrowind vs Oblivion. The former could have a lot more dialogue than the latter, because its dialogue was mostly text-based. The advantages are clear: it'd be a lot cheaper than hiring voice actors. How far away are we from being able to implement text-to-speech in games? Text-to-speech has gotten a lot better in modern times, in large part due to neural networks/AI. This honestly isn't something I'm too knowledgeable about, hence why I'm asking.

If we could give a game like Morrowind a decent text-to-speech, it'd seem a lot more modern, I think. It'd also open the door for procedurally generated dialogue. I'm hoping that, in the future, NPCs could have more sophisticated AIs, with more free will, reacting more to what's happening around them. An example I thought of: perhaps in a future Elder Scrolls game, if you do something to anger an NPC, instead of them just going aggressive, they could ask you to do a procedurally generated quest based on their character, or something like that. Clearly we wouldn't be able to record voice lines for procedurally generated dialogue like that, unless it's very simplified, so a clever, good text-to-speech would likely be needed.

We could absolutely create completely valid and realistic-sounding voices to read dialogue. Not necessarily dynamically, but with some machine learning and proper tools with pre-defined dialogue, devs could absolutely create realistic voices. The issue comes with the acting part of voice acting. TTS can be fine for things like simple NPCs, shopkeeps, etc. Things that have pretty simple and standard delivery without much emotion behind them.

But when you start to get to proper quests where you'd want NPCs to react properly and the dialogue to feel realistic and exciting, you run into issues. Because emotion is really, really difficult to add, especially for anything more complex than "You said bad thing and now I'm sad". That's about the degree of emoting and understanding we can expect from natural language processing and dynamic TTS. Not to mention, humans are much easier to deal with.