Regular Spelling
Thoughts on language and more

People are Deaf

While I didn't know that it was a Yamaha product until today, I've known about Vocaloid for a long time. It wasn't anything made with it that first brought it to my attention, though; it was suddenly seeing the name "Miku" everywhere coming out of Japan. I had thought it was a fairly safe name when I started Skewed back in 2004. It wasn't a real Japanese name, and the sound of it had a general male alignment. But now, thanks to Miku Hatsune, I can't ever use that name again for a male character.

But that's not why we are here today, no. We're going to talk more about the software itself. I've spoken a fair amount in the past about voice recognition, and hand in hand with that is always voice synthesis. Now, voice synthesis itself is fairly difficult, due to the complexity of vocal speech, but doubly hard is synthesis of the voice with tone put to it: the synthesis of singing. That is, of course, what the Vocaloid software is about.

Now, for a long time I've had my own ideas about methods of creating speech synthesis software, and it's one of the primary reasons I started studying linguistics, so I could understand some more of what goes into it. But I had been holding off on starting work on the software because of computer processing power. So when I first heard about Vocaloid I was rather interested to hear what had been developed so far. However, the original samples I'd heard all sounded so horribly mechanized that I had no desire to hear any more, and no hope for it for another good while.

Images of a PSP game based on Vocaloid caught my eye on an imageboard a few days ago, or more specifically, a Miku Hatsune PSP game by the name of "Project Diva". I'd heard talk about it for some time but I thought it was all in jest, because I didn't think that a product based chiefly on fanmade work would really come into existence on a console like that. Much to my surprise I was wrong, and we can thank SEGA for taking the risk. As I was reading about this game, though, I still had to wonder what they would do for the songs, and, expecting that they would want to use some good stuff for a commercial product, I started to listen to some of the track list.

Much to my delight, it actually can be used for some decent sounding singing. Just to share a couple of my favorites of what I heard, I'll give you the links for "Last Night, Good Night" and the unexpectedly swing "Miracle Painting". Granted, they're a lot smoother than some of the things I've heard staged in the Vocaloid2 software, but it still has a distinct synthesized sound to it. Plus, as a matter of personal taste, I don't really like her generally high, squeaky voice much anyway. But it did convince me that the software was fairly capable.

So after reading some, I decided to look into the software some more, which is where I finally discovered it was by Yamaha. Yamaha does good synthesis, and the XG softsynth of the 90s was one of my favorite MIDI synthesizers, second only to the Brooktree/Conexant WaveStream software, for which I put my Pentium back together so I could use it again. And as I read, apparently the regular Vocaloids ended up being created due to a general reluctance among artists to contribute their voices to create Vocaloid data. There was a new library made recently that the Japanese singer Gackt contributed to, which also used a new build of the Vocaloid software with a new feature. It actually sounds pretty decent, and also sounds really close to how Gackt sounds normally.

So based on that, and my clear recognition of Gackt, I began wondering if maybe the person Miku is based on just has that normally high, squeaky voice. Her name is Saki Fujita, a fairly recent voice actress; I've only actually seen one of the shows she was in. But I couldn't actually think of the voice from that character, so I looked up a random song by her, and found "Crystal Quartz". I can hear where Miku comes out of that, but there's so much difference between the actual voice and the synthesized Miku Hatsune. There are a great many changes of voice in the singing that simply don't get replicated, changes in the throat as well as the mouth that alter the sound of vowels and consonants subtly while still keeping the same vowels and consonants. There are still things that can be improved in Vocaloid.

But what I really found interesting was the conversation I had after that. I know someone who had said he was going to try playing with the Vocaloid software before, and while he may or may not have actually done it, I knew he had known about it, and I have also had some other audiophile-type conversations with him in the past. So while talking to him about it, an increasing confusion brought to my attention a surprising fact: he couldn't tell the difference between Saki's and Miku's singing. I can tell the difference very clearly, but to him they sounded the same. And reading the comments on the Crystal Quartz video, a lot of people seemed to prefer the way the voice sounds in the synthesized singing of Miku over the genuine article.

So I've come to a conclusion that suddenly makes everything regarding Vocaloid clear. People are deaf.


Date posted: 29 June, 2009
Tags: japanese linguistic music pronunciation skewed software video_games