The future looks bright - Sand, software and sound

After reading the owner’s manual and watching the first demonstrations, it’s clear that the Yamaha Genos™ is a beautiful face-lift over the Tyros series, but where is the sonic breakthrough?

As usual, the answer was right in front of my face all along. First, a few facts and figures:

    Feature                        Tyros 5    Genos
    ---------------------------    -------    -----
    Mega Voices                       54        82
    Super Articulation voices        288       390
    Super Articulation 2 voices       44        75
    Live voices                      138       160
    Articulation buttons               2         3

Back before the specifications were officially announced, I saw a leaked version of these specs. Given the big leap in Mega Voice (MV), Super Articulation (SA) and Super Articulation 2 (SA2) voices, I didn’t think the leaked specifications were credible. Now, I believe.

In short, the new tone generation hardware in Genos enables a very large SSD-sized waveform memory capable of holding all of the waveforms needed for the boost in MV, SA and SA2 voices. The end result is greater musical expression, detail and realism for both the Genos player and audiences.

This post takes a focused look at Mega Voice, Super Articulation (1 and 2), and why the “great leap forward” is possible in Genos. PSR/Tyros purists, I hope that you don’t mind my abbreviations for Mega Voice, etc. The short forms are much easier to type without extra punctuation marks.

Background information

MV, SA and SA2 are the trinity of highly detailed, expressive Yamaha voices. All three kinds of voices are built on Yamaha’s sample playback technology, AWM2 (Advanced Wave Memory 2). Super Articulation 2 adds Articulation Element Modeling (AEM) on top of AWM2. Both AWM2 and AEM are covered by many Yamaha patents.

Yamaha did not introduce these voices in one fell swoop. Mega Voices were the first to appear. A Mega Voice divides a voice into two or more velocity ranges and assigns a different waveform to each range. A trumpet voice, for example, is divided into:

    Velocity range    Waveform
    --------------    ----------------------
         1 - 20       mf trumpet
        21 - 40       f trumpet
        41 - 60       ff trumpet
        61 - 80       Legato
        81 - 100      Straight
       101 - 110      Shake
       111 - 120      Falls
       121 - 127      Glissando up

MIDI notes above C6 and above C8 are mapped to valve noise and breath noise, respectively. For other examples of Mega Voices, see the Mega Voice mapping table in the Tyros 5 Data List file. (Also, learn how to create a Mega Voice using Yamaha Expansion Manager.)
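The mapping above boils down to a table lookup. Here is a minimal sketch in Python, not Yamaha's implementation. It assumes the Legato range ends at 80 so that the ranges do not overlap, assumes Yamaha's note naming (C3 = MIDI note 60, so C6 = 96 and C8 = 120), and reads "above" as strictly greater. The function name `select_waveform` is made up for illustration.

```python
# Trumpet Mega Voice layers: (lowest velocity, highest velocity, waveform).
# Assumption: Legato ends at 80 so the ranges tile 1..127 without overlap.
TRUMPET_LAYERS = [
    (1, 20, "mf trumpet"),
    (21, 40, "f trumpet"),
    (41, 60, "ff trumpet"),
    (61, 80, "Legato"),
    (81, 100, "Straight"),
    (101, 110, "Shake"),
    (111, 120, "Falls"),
    (121, 127, "Glissando up"),
]

# Assumption: Yamaha note naming, where C3 is MIDI note 60.
C6 = 96
C8 = 120


def select_waveform(note: int, velocity: int) -> str:
    """Return the waveform a Mega Voice would play for a note-on (hypothetical helper)."""
    if not 0 <= note <= 127:
        raise ValueError("MIDI note must be in 0..127")
    if not 1 <= velocity <= 127:
        raise ValueError("note-on velocity must be in 1..127")
    # Very high notes trigger noises instead of pitched waveforms.
    if note > C8:
        return "breath noise"
    if note > C6:
        return "valve noise"
    # Otherwise the velocity alone picks the waveform.
    for low, high, waveform in TRUMPET_LAYERS:
        if low <= velocity <= high:
            return waveform
    raise AssertionError("velocity table does not cover 1..127")


if __name__ == "__main__":
    print(select_waveform(60, 30))   # a moderately hard note
    print(select_waveform(60, 105))  # a velocity inside the Shake range
```

The point to notice is that velocity stops meaning "how loud" once it passes 60. It becomes a selector, which is why a sequencer can hit these ranges precisely and a human hand cannot.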

The first three ranges and waveforms correspond to velocity switching as we know it. The remaining five ranges correspond to articulations as we know and love them in software instruments. The articulations and noises are the sonic sweeteners that make sequenced music sound more human and natural.

Mega Voices are intended for sequencing. They are used in arranger keyboard styles to make them sound less MIDI-ish. Unless you have the finger control of a god, you cannot reasonably play a Mega Voice through the keyboard.

But, wait a minute! What if you put some smart software between the keyboard and the tone generator? The smart software watches and analyzes your gestures (i.e., key presses, releases, button pushes, etc.), and plays either a regular note or an articulated note. This is the basic idea behind Super Articulation.

In the case of the trumpet, for example, the SA software watches the notes that you play and if you push the right articulation button while playing a note, the software selects and plays a shake instead of a regular trumpet sound. The SA software also analyzes note timing and plays a legato waveform when you strike a second key while holding the first key. SA software even responds to note intervals such as playing a glissando when the interval between two notes is big enough.
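To make the idea concrete, here is a rough sketch of that decision logic in Python. It is only an illustration of the description above, not Yamaha's algorithm. The `Gesture` type, the priority order (button first, then interval, then legato) and the one-octave glissando threshold are all assumptions of mine.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical threshold: intervals of an octave or more trigger a glissando.
GLISS_INTERVAL = 12


@dataclass
class Gesture:
    """One key strike plus the context the SA software would look at."""
    note: int                        # MIDI note being struck
    held_note: Optional[int] = None  # note still held down, if any
    art_button: bool = False         # articulation button pressed?


def choose_articulation(g: Gesture) -> str:
    """Pick a waveform type for the gesture (assumed priority order)."""
    if g.art_button:
        return "Shake"
    if g.held_note is not None:
        # A second key struck while the first is held: legato playing.
        if abs(g.note - g.held_note) >= GLISS_INTERVAL:
            return "Glissando"
        return "Legato"
    return "Regular"


if __name__ == "__main__":
    print(choose_articulation(Gesture(note=62, held_note=60)))
```

The player only plays notes and pushes a button. The software turns those gestures into the articulation choice that a Mega Voice would otherwise demand through exact velocities.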

In the end, Super Articulation makes Mega Voice articulations intuitively playable. I thoroughly enjoy playing the SA voices on my PSR-S950. I don’t have to think too hard at all; I just let it rip as I hear it in my head.

Montage and late model Motif- and MOX-series synthesizers implement Expanded Articulation (XA). Take a look at my deconstruction of the Tenor To The Max voice.

Super Articulation 2 takes SA up another notch. Real musical tones are not discrete sonic events. Tones tend to blend together due to the characteristics of the musical instrument itself and/or playing technique (e.g., legato). SA2 performs a digital blending between notes by analyzing gestures and selecting the appropriate waveform from a very large database of waveform segments. Broadly speaking, these segments belong to three categories:

Head: Attack portion of the sound
Body: Main body of the sound
Tail: Release portion of the sound

Consider two notes where the first note is detached and the second note is legato. SA2 plays the head segment for the first note, sounding the attack. This is followed by the body of the first note. SA2 does not play a head for the second note. It blends the body of the first note into the body of the second note. When the second note is released, SA2 selects and plays a tail for the second note.
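The two-note example can be written down as a small segment planner. This Python sketch only works out which segments play and in what order. It does not model the actual blending (crossfading) between bodies, and the function name and the `head:`/`body:`/`tail:` labels are invented for illustration.

```python
from typing import List, Tuple


def plan_segments(notes: List[Tuple[str, bool]]) -> List[str]:
    """Return the head/body/tail sequence for a phrase.

    Each note is (name, legato), where legato=True means the note is
    joined to the previous one. The flag is ignored on the first note.
    """
    plan: List[str] = []
    for i, (name, legato) in enumerate(notes):
        joined = legato and i > 0
        if not joined:
            if i > 0:
                # A detached note: the previous note gets its release first.
                plan.append(f"tail:{notes[i - 1][0]}")
            plan.append(f"head:{name}")
        # A joined note skips the head; its body follows the previous body.
        plan.append(f"body:{name}")
    if notes:
        plan.append(f"tail:{notes[-1][0]}")
    return plan


if __name__ == "__main__":
    # First note detached, second note legato, as in the example above.
    print(plan_segments([("C4", False), ("E4", True)]))
```

For the detached-then-legato pair this yields one head, two bodies and one tail, which matches the description: no attack on the second note, and a release only at the end.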

All of this blending is computation heavy and is very sensitive to timing and latency. The technology behind SA2 is Articulation Element Modeling (AEM). AEM is actually a deep subject and is patented. (See my related post about Real Acoustic Sound.)

Technical breakthrough, sonic breakthrough

Folks who are familiar with software instruments and sound libraries know that all of this comes with a cost. Sample libraries for orchestral instruments are enormous because there are so many different ways to bow, pluck, strike and generally mess with acoustic instruments. Tens and even hundreds of gigabytes are needed to store the highest quality sample libraries. Then, one needs to have a fast streaming device like an SSD and a computationally husky CPU to play the samples without a glitch or hiccup.

Before Montage and Genos, Yamaha’s mainstay tone generator (TG) integrated circuit (IC) was the SWP51L. This venerable chip carried the load in Motif, MOX, CP, Clavinova, and other mid- to high-end Yamaha products.

Like all things electronic, the SWP51L’s time eventually came and went. The SWP51L communicates with waveform memory over a CPU-like bus with a fixed width address. The SWP51L is limited in three ways. First, the fixed width address is not big enough to address the very large sample library needed to support today’s articulation-heavy voices. Second, the address bus cannot be (easily) made wider. Third, the bus protocol is not directly compatible with relatively inexpensive commodity NAND flash memory. In short, the SWP51L does not scale to a big waveform memory.
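A little arithmetic shows why a fixed width address is a hard ceiling. The Python sketch below uses illustrative numbers only: the 24-bit address and the 2-byte sample word are assumptions for the sake of the example, not the SWP51L's actual specification.

```python
def addressable_bytes(address_bits: int, bytes_per_word: int = 2) -> int:
    """Largest memory reachable with a fixed width address."""
    return (1 << address_bits) * bytes_per_word


def bits_needed(capacity_bytes: int, bytes_per_word: int = 2) -> int:
    """Smallest address width that reaches every word of a memory."""
    words = -(-capacity_bytes // bytes_per_word)  # ceiling division
    return max(1, (words - 1).bit_length())


if __name__ == "__main__":
    MIB = 1 << 20
    GIB = 1 << 30
    # Illustrative only: a 24-bit address over 16-bit sample words.
    print(addressable_bytes(24) // MIB, "MiB reachable with 24 address bits")
    # Each extra address line merely doubles the reachable memory.
    print(addressable_bytes(25) // MIB, "MiB reachable with 25 address bits")
    # A sample-library-sized memory needs many more address lines.
    print(bits_needed(4 * GIB), "address bits needed for 4 GiB")
```

Every added address bit costs pins and board traces and buys only a doubling, so growing from megabytes to gigabytes on a parallel bus is expensive. A command-based NAND interface avoids the problem because addresses are sent as data over a narrow bus.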

The Montage and the Genos deploy the new generation SWP70 tone generator. Unlike the SWP51L, the SWP70 is compatible with commodity NAND flash memory, the same kind of memory used in solid state drives (SSDs). The Open NAND Flash Interface (ONFI) bus protocol is scalable, and so is the Genos.

Thus, Yamaha is finally free to expand waveform memory to sample library scale.

People make much of “SSD, SSD, SSD!” SSDs use a SATA bus for communication, a bus that can become a bottleneck in itself. Yamaha have found a way to integrate SSD functionality into the SWP70 without the need for a SATA bus. The integration promises greater speed (i.e., memory bandwidth) without the cost and latency of a SATA bus. This design approach is patented. Please read one of my earlier posts about the SWP70 for the gory technical details. Hope you know a bit about computer architecture before diving in!

I’ve also speculated about the role of the SWP70 in the implementation of the Genos file system. That post is highly speculative and has not been verified against the Genos service manual.

What does this mean for the player?

The bottom line for the player and audiences is rich sound filled with detail and realism, thanks to big waveform memory, AWM2/AEM synthesis and Yamaha’s sound development expertise. Big waveform capacity and the new mono/stereo tone generation channels in the SWP70 also mean greater use of stereo samples (“Live voices” in PSR/Tyros-speak).

Please look at the table at the beginning of this article. No previous generation-to-generation Tyros upgrade has had such a big jump in the number of Mega Voice, Super Articulation and Super Articulation 2 voices. It can only get better from here as the SWP70 is the Yamaha platform for the next 8 to 10 years.

The Genos promises to be an expressive instrument which will be fun to play. The knobs, sliders and articulation buttons afford a great deal of real time control. I can’t wait to play one of these!

Longer term, what do the technical breakthroughs hold for the Montage series? You ain’t seen or heard nothin’ yet.