The future looks bright

After reading the owner’s manual and watching the first demonstrations, it’s clear that the Yamaha Genos™ is a beautiful face-lift over the Tyros series, but where is the sonic breakthrough?

As usual, the answer was right in front of my face all along. First, a few facts and figures:

    Feature                        Tyros 5    Genos
    ---------------------------    -------    -----
    Mega Voices                       54        82
    Super Articulation voices        288       390
    Super Articulation 2 voices       44        75
    Live voices                      138       160
    Articulation buttons               2         3

Back before the specifications were officially announced, I saw a leaked version of these specs. Given the big leap in Mega Voice (MV), Super Articulation (SA) and Super Articulation 2 (SA2) voices, I didn’t think the leaked specifications were credible. Now, I believe.

In short, the new tone generation hardware in Genos enables a very large SSD-sized waveform memory capable of holding all of the waveforms needed for the boost in MV, SA and SA2 voices. The end result is greater musical expression, detail and realism for both the Genos player and audiences.

This post takes a focused look at Mega Voice, Super Articulation (1 and 2), and why the “great leap forward” is possible in Genos. For PSR/Tyros purists, I hope that you don’t mind my abbreviations for Mega Voice, etc. The short forms are much easier to type without extra punctuation marks.

Background information

MV, SA and SA2 are the trinity of highly detailed, expressive Yamaha voices. MV and SA voices are based on Yamaha’s sample playback technology, AWM2 (Advanced Wave Memory). Super Articulation 2 is based on Articulation Element Modeling (AEM). Both AWM2 and AEM are covered by many Yamaha patents.

Yamaha did not introduce these voices in one fell swoop. Mega Voices were the first to appear. A Mega Voice divides a voice into two or more velocity ranges and assigns a different waveform to each range. A trumpet voice, for example, is divided into:

    Velocity range    Waveform
    --------------    ----------------------
         1 - 20       mf trumpet
        21 - 40       f trumpet
        41 - 60       ff trumpet
        61 - 80       Legato
        81 - 100      Straight
       101 - 110      Shake
       111 - 120      Falls
       121 - 127      Glissando up

MIDI notes above C6 and above C8 are mapped to valve noise and breath noise, respectively. For other examples of Mega Voices, see the Mega Voice mapping table in the Tyros 5 Data List file. (Also, learn how to create a Mega Voice using Yamaha Expansion Manager.)

The first three ranges and waveforms correspond to velocity switching as we know it. The remaining five ranges correspond to articulations as we know and love them in software instruments. The articulations and noises are the sonic sweeteners that make sequenced music sound more human and natural.
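
To make the velocity mapping concrete, here is a minimal Python sketch of the trumpet table as a lookup. The names `TRUMPET_LAYERS` and `select_waveform` are mine, not Yamaha's, and I assume the Legato range ends at 80 so that the ranges do not overlap.

```python
from bisect import bisect_left

# (upper velocity bound, waveform) pairs taken from the trumpet table.
# Assumption: Legato covers 61-80 so that no two ranges overlap.
TRUMPET_LAYERS = [
    (20, "mf trumpet"),
    (40, "f trumpet"),
    (60, "ff trumpet"),
    (80, "Legato"),
    (100, "Straight"),
    (110, "Shake"),
    (120, "Falls"),
    (127, "Glissando up"),
]
UPPER_BOUNDS = [upper for upper, _ in TRUMPET_LAYERS]


def select_waveform(velocity: int) -> str:
    """Return the waveform whose velocity range contains `velocity`."""
    if not 1 <= velocity <= 127:
        raise ValueError("note-on velocity must be in 1..127")
    # bisect_left finds the first range whose upper bound is >= velocity.
    return TRUMPET_LAYERS[bisect_left(UPPER_BOUNDS, velocity)][1]
```

A sequencer that wants a shake simply sends a note-on with a velocity between 101 and 110. That is easy for software and nearly impossible for fingers.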

Mega Voices are intended for sequencing. They are used in arranger keyboard styles to make them sound less MIDI-ish. Unless you have the finger control of a god, you cannot reasonably play a Mega Voice through the keyboard.

But, wait a minute! What if you put some smart software between the keyboard and the tone generator? The smart software watches and analyzes your gestures (i.e., key presses, releases, button pushes, etc.), and plays either a regular note or an articulated note. This is the basic idea behind Super Articulation.

In the case of the trumpet, for example, the SA software watches the notes that you play and if you push the right articulation button while playing a note, the software selects and plays a shake instead of a regular trumpet sound. The SA software also analyzes note timing and plays a legato waveform when you strike a second key while holding the first key. SA software even responds to note intervals such as playing a glissando when the interval between two notes is big enough.
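
Here is a rough Python sketch of that decision logic. It is only my guess at the general shape, not Yamaha's algorithm. The `NoteOn` fields, the rule order and the seven-semitone glissando threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

GLISS_INTERVAL = 7  # semitones; hypothetical threshold for "big enough"


@dataclass
class NoteOn:
    note: int                  # MIDI note number of the key just struck
    held_note: Optional[int]   # key still held from the previous note, if any
    shake_button: bool         # hypothetical articulation button state


def choose_articulation(event: NoteOn) -> str:
    """Pick a trumpet waveform from the player's gesture."""
    if event.shake_button:
        return "Shake"
    if event.held_note is not None:
        # A second key struck while the first is held: legato,
        # or a glissando when the interval is wide.
        if abs(event.note - event.held_note) >= GLISS_INTERVAL:
            return "Glissando"
        return "Legato"
    return "Straight"
```

The point is that the player supplies ordinary gestures and the software supplies the Mega Voice selection that would otherwise need exact velocity values.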

In the end, Super Articulation makes Mega Voice articulations intuitively playable. I thoroughly enjoy playing the SA voices on my PSR-S950. I don’t have to think too hard at all; I just let it rip as I hear it in my head.

Montage and late model Motif- and MOX-series synthesizers implement Expanded Articulation (XA). Take a look at my deconstruction of the Tenor To The Max voice.

Super Articulation 2 takes SA up another notch. Real musical tones are not discrete sonic events. Tones tend to blend together due to the characteristics of the musical instrument itself and/or playing technique (e.g., legato). SA2 performs a digital blending between notes by analyzing gestures and selecting the appropriate waveform from a very large database of waveform segments. Broadly speaking, these segments belong to three categories:

  1. Head: Attack portion of the sound
  2. Body: Main body of the sound
  3. Tail: Release portion of the sound

Consider two notes where the first note is detached and the second note is legato. SA2 plays the head segment for the first note, sounding the attack. This is followed by the body of the first note. SA2 does not play a head for the second note. It blends the body of the first note into the body of the second note. When the second note is released, SA2 selects and plays a tail for the second note.
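
The two-note example can be written out as a small Python sketch. This only illustrates the head/body/tail bookkeeping. The function name `plan_segments` and the "blend" label are my own, and real AEM does far more than concatenate labels.

```python
from typing import List, Tuple


def plan_segments(notes: List[Tuple[str, bool]]) -> List[str]:
    """Plan segments for a phrase given (note name, played legato?) pairs."""
    plan = []
    for i, (name, legato) in enumerate(notes):
        if legato and i > 0:
            # No head: blend the previous body into this note's body.
            plan.append(f"blend {notes[i - 1][0]}->{name}")
        else:
            plan.append(f"head {name}")
        plan.append(f"body {name}")
        next_is_legato = i + 1 < len(notes) and notes[i + 1][1]
        if not next_is_legato:
            # The note is released before the next attack, so it gets a tail.
            plan.append(f"tail {name}")
    return plan


# Detached C4 followed by legato E4, as in the text.
print(plan_segments([("C4", False), ("E4", True)]))
# -> ['head C4', 'body C4', 'blend C4->E4', 'body E4', 'tail E4']
```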

All of this blending is computation heavy and is very sensitive to timing and latency. The technology behind SA2 is Articulation Element Modeling (AEM). AEM is actually a deep subject and is patented. (See my related post about Real Acoustic Sound.)

Technical breakthrough, sonic breakthrough

Folks who are familiar with software instruments and sound libraries know that all of this comes with a cost. Sample libraries for orchestral instruments are enormous because there are so many different ways to bow, pluck, strike and generally mess with acoustic instruments. Tens and even hundreds of gigabytes are needed to store the highest quality sample libraries. Then, one needs to have a fast streaming device like an SSD and a computationally husky CPU to play the samples without a glitch or hiccup.

Before Montage and Genos, Yamaha’s mainstay tone generator (TG) integrated circuit (IC) was the SWP51L. This venerable chip carried the load in Motif, MOX, CP, Clavinova, and other mid- to high-end Yamaha products.

Like all things electronic, the SWP51L’s time eventually came and went. The SWP51L communicates with waveform memory over a CPU-like bus with a fixed width address. The SWP51L is limited in three ways. First, the fixed width address is not big enough to address the very large sample library needed to support today’s articulation-heavy voices. Second, the address bus cannot be (easily) made wider. Third, the bus protocol is not directly compatible with relatively inexpensive commodity NAND flash memory. In short, the SWP51L does not scale to a big waveform memory.
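
The fixed-width address problem is simple arithmetic: n address bits reach at most 2^n locations. The Python sketch below computes the minimum address width for a given capacity, assuming byte addressing. I am not quoting the SWP51L's actual address width here, and `min_address_bits` is just an illustrative helper.

```python
GIB = 2 ** 30


def min_address_bits(capacity_bytes: int) -> int:
    """Smallest address width that can reach every byte of the memory."""
    if capacity_bytes < 1:
        raise ValueError("capacity must be at least one byte")
    return (capacity_bytes - 1).bit_length()


for capacity_gib in (4, 64):
    bits = min_address_bits(capacity_gib * GIB)
    print(f"{capacity_gib} GiB needs at least {bits} address bits")
```

Every doubling of waveform memory costs one more address line on a parallel bus, which is why a fixed-width bus eventually runs out of room.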

The Montage and the Genos deploy the new generation SWP70 tone generator. Unlike the SWP51L, the SWP70 is compatible with commodity NAND flash memory — the same kind of memory used in solid state drives (SSD). The Open NAND Flash Interface (ONFI) bus protocol — and the Genos — is scalable.

Thus, Yamaha is finally free to expand waveform memory to sample library scale.

People make much of “SSD, SSD, SSD!” SSDs use a SATA bus for communication, a bus that can become a bottleneck in itself. Yamaha have found a way to integrate SSD functionality into the SWP70 without the need for a SATA bus. The integration promises greater speed (i.e., memory bandwidth) without the cost and latency of a SATA bus. This design approach is patented. Please read one of my earlier posts about the SWP70 for the gory technical details. Hope you know a bit about computer architecture before diving in!

I’ve also speculated about the role of the SWP70 in the implementation of the Genos file system. This post is highly speculative and has not been verified by reading the Genos service manual.

What does this mean for the player?

The bottom line for the player and audiences is rich sound filled with detail and realism, thanks to big waveform memory, AWM2/AEM synthesis and Yamaha’s sound development expertise. Big waveform capacity and the new mono/stereo tone generation channels in the SWP70 also mean greater use of stereo samples (“Live voices” in PSR/Tyros-speak).

Please look at the chart at the beginning of this article. No previous generation-to-generation Tyros upgrade has had such a big jump in the number of Mega Voice, Super Articulation and Super Articulation 2 voices. It can only get better from here as the SWP70 is the Yamaha platform for the next 8 to 10 years.

The Genos promises to be an expressive instrument which will be fun to play. The knobs, sliders and articulation buttons afford a great deal of real time control. I can’t wait to play one of these!

Longer term, what do the technical breakthroughs hold for the Montage series? You ain’t seen or heard nothin’ yet.

Copyright © 2017 Paul J. Drongowski

Genos internal memory: A speculation

First, you have to get the mule’s attention.

Yamaha Genos™ hasn’t hit the streets yet and here is a speculative article about its hardware design…

I’d like to thank Kari V., Mihai and Joe H. on the PSR Tutorial Forum for getting this mule’s attention. They deserve the credit.

Spex

Here are a few Genos specifications that drew curious looks:

  • Polyphony: 256 (max.) (128 for Preset Voice + 128 for Expansion Voice)
  • Voice expansion memory: Approximately 1.8GBytes
  • Internal memory: Approximately 58GBytes

Normally, a Tyros has a large hard disk inside for bulk storage. The hard drive contains a file system to hold style files, song files, text files and a whole lot more. The Tyros 5 shipped with a 500GB hard disk drive. Tyros 5 internal memory — some form of non-volatile flash — is spec’ed at approximately 6.7MBytes. Yes, megabytes.

Word from the demonstrations is that the Genos has neither a hard disk drive nor a solid state drive (SSD). Thus, “Internal memory” is not directly user expandable or upgradeable. Eliminating the hard disk drive, the bracket and access door makes good sense because it reduces weight and chassis complexity. SSDs are still a little pricey for a cost-sensitive manufacturer like Yamaha. If it’s not a hard drive and if it’s not an SSD, then what is it?

Next, what’s up with that polyphony spec? 128 voice polyphony when you play preset voices only and 128 voice polyphony when you play a voice from user voice expansion memory? That’s rather unorthodox.

The high-level view

This is where the Yamaha SWP70 tone generator (TG) integrated circuit (IC) comes into the story.

The SWP70 uses ONFI-compatible NAND flash as its waveform memory. “ONFI” is the industry standard Open NAND Flash Interface. ONFI-compatible chips are the same NAND flash used in SSDs. The SWP70 caches the waveform data in a fast SDRAM just like an SSD in order to have fast, random access to samples.

Yamaha have created a tone generator IC that integrates an SSD-like flash and cache controller. This design eliminates the cost and latency of the SATA bus which normally connects an SSD within a PC or Mac.

For the hardware inclined, here’s a short speculative answer. There are two tone generator ICs each having their own ONFI flash memory. One TG and flash memory (call this one “TG A”) handles factory presets. The other TG and flash memory (call this “TG B”) handles user expansion voices.

The “TG B” flash memory is 64GBytes of ONFI NAND flash. Through software, it is partitioned into a file system partition (62GB?) and a user expansion voice partition (2GB).

The file system partition contains the initial factory content (4GB). The remaining space (58GB) is the “Internal memory” quoted in the Genos specifications.
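
The arithmetic behind this guess is easy to check. The figures below are the speculative ones from the text, not confirmed values from a service manual, and the variable names are mine.

```python
# Speculative figures from the text (GBytes).
TOTAL_FLASH_GB = 64       # "TG B" ONFI NAND flash
EXPANSION_VOICE_GB = 2    # user expansion voice partition
FACTORY_CONTENT_GB = 4    # initial factory content in the file system

file_system_gb = TOTAL_FLASH_GB - EXPANSION_VOICE_GB
internal_memory_gb = file_system_gb - FACTORY_CONTENT_GB

print(file_system_gb)      # file system partition
print(internal_memory_gb)  # space left over for the user
```

The results, 62GB and 58GB, line up with the file system partition guessed above and with the "Approximately 58GBytes" internal memory in the Genos specifications.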

So, Yamaha engineering decided to use space in one of the ONFI flash memories for bulk storage in order to cut the weight and expense of a magnetic hard drive (heavy) or an SSD (lighter than a hard drive, but not cheap).

If this is true — if — then there are some positive implications for the future of Genos. More at another time.

Ingenious, yes. User expandable, no.

Do I know this for sure? Oh, hell no. We need a service manual. Even a visual inspection of the digital logic board (DM) might not be conclusive.

The low-level view

The notional diagram below shows some of the major interfaces to the SWP70.

  • The CPU bus connects the SWP70 to the main control CPU and other major subsystems that require CPU-based data and control.
  • The ABUS allows SWP70s to communicate with each other when more than one SWP70 is in a system.
  • The waveform memory (NAND flash) communicates with the SWP70 over an Open NAND Flash Interface (ONFI) bus. This open industry standard lets Yamaha use commodity flash memory for waveform ROM. Waveform memory is split into upper and lower bytes with shared control signals. This arrangement instantly doubles bus bandwidth versus a single ONFI data channel.
  • The Serial audio bus brings audio data into the SWP70 (e.g., from the ADC) and sends audio data to the DACs and other subsystems.

Then, the fun begins. The SWP70 has three parallel SDRAM memory channels for wave and DSP working memory.

  • The DSP working memory is a large, scratch-pad memory for effect computation. I believe this memory is also the working memory for Montage FM-X.
  • The Wave working memory is a fast, read/write data cache which holds samples after they are read from the waveform memory. Remember, NAND flash favors sequential block mode read access, transferring data on the byte-serial ONFI bus. The wave working memory plays the same role as the data cache in an SSD storage unit.
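
To show what "data cache" means here, this is a toy Python sketch of a block cache sitting in front of slow flash. The least-recently-used eviction policy and the names `BlockCache` and `read_block` are my assumptions. I do not know how the SWP70 actually manages its wave working memory.

```python
from collections import OrderedDict
from typing import Callable


class BlockCache:
    """Toy cache: keeps recently used flash blocks in fast memory."""

    def __init__(self, capacity_blocks: int, read_block: Callable[[int], bytes]):
        self.capacity = capacity_blocks
        self.read_block = read_block    # slow path: fetch a block from flash
        self.blocks = OrderedDict()     # block number -> data, oldest first
        self.flash_reads = 0

    def get(self, block_no: int) -> bytes:
        if block_no in self.blocks:
            self.blocks.move_to_end(block_no)   # hit: mark as recently used
            return self.blocks[block_no]
        data = self.read_block(block_no)        # miss: go out to flash
        self.flash_reads += 1
        self.blocks[block_no] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)     # evict least recently used
        return data
```

Repeated reads of the same block are served from the cache, so the flash is touched only on a miss.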

Memory capacities vary across products depending upon target polyphony, effect workload and, of course, the sample set.

Here are capacities for the PSR-S770, PSR-S970 and Montage. All capacities are physical (i.e., raw physical storage space).

             AWM     Waveform    Wave     DSP
          Polyphony   Memory   Working  Working
          ---------  --------  -------  -------
PSR-S770     128      512MB      32MB     8MB
PSR-S970     128       2GB       32MB     8MB
Montage      128*      4GB       32MB    16MB
          * Stereo/mono

The Montage DSP working memory is twice as large as the PSR-S970 reflecting the larger number of supported effect units.

The ONFI standard is the same standard used in solid state drives (SSD). Thus, Yamaha can reap the benefit of lower cost commodity flash. The wave working memory caches data just like an SSD. The SWP70 design yields maximum bandwidth to and from NAND flash without the expense or latency of a SATA bus. Thanks to ONFI, Yamaha can increase waveform memory size by dropping in higher capacity ONFI-compatible devices. User waveform (voice) expansion memory resides in these same memory components, so one should expect bigger user expansion memory in the future as well as bigger factory sample sets.

The SWP70 reads and writes two flash memories in tandem effectively sending a 16-bit word on each ONFI bus cycle. (See diagram below.) One memory provides the HIGH byte and the other memory provides the LOW byte. The same ONFI control signals are sent to both. For people who like to trash Yamaha for not using SSD, please note that tandem access doubles the transfer bandwidth over a single ONFI data path solution. (Of course, an SSD could do the same thing.)
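
Here is a small Python sketch of tandem access, assuming each flash device delivers one byte per bus cycle. The function names are mine and this models only the data path, not the ONFI command protocol.

```python
from typing import List


def combine_tandem(high_bytes: bytes, low_bytes: bytes) -> List[int]:
    """Pair one byte from each flash device into a 16-bit word per cycle."""
    if len(high_bytes) != len(low_bytes):
        raise ValueError("both devices must deliver the same number of bytes")
    return [(high << 8) | low for high, low in zip(high_bytes, low_bytes)]


def bus_cycles(total_bytes: int, devices_in_tandem: int) -> int:
    """Cycles needed when every device moves one byte per bus cycle."""
    return -(-total_bytes // devices_in_tandem)  # ceiling division


words = combine_tandem(b"\x12\xab", b"\x34\xcd")
print([hex(w) for w in words])
print(bus_cycles(4096, 1), bus_cycles(4096, 2))
```

Moving 4096 bytes takes 4096 cycles over one byte-wide path and 2048 cycles over two in tandem, which is the doubling mentioned above.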

I’ll bet that using the ONFI waveform memory for file system access made the tone generation guys nervous. Would file system traffic rob memory bandwidth from the tone generators?

Yamaha know latency. They spend a lot of time, money and intellectual effort understanding latency and conquering it. That’s where the second waveform working memory comes into play. Samples heading to the tone generators could be held in one waveform working memory while file system data could be held in the second, separate working memory. This organization separates the memory traffic and prevents file access from disturbing the critical, must-be-predictable sample stream. When the two channels arbitrate for the ONFI bus, the sample stream feeding tone generation could be given priority.
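
A strict-priority arbiter of that kind can be sketched in a few lines of Python. This is speculation layered on speculation: the `BusArbiter` class and the two priority levels are hypothetical, and a real design would also need to keep file system requests from starving.

```python
import heapq
from itertools import count

SAMPLE_STREAM = 0   # lower number = higher priority
FILE_SYSTEM = 1


class BusArbiter:
    """Grant the flash bus to the highest-priority pending request."""

    def __init__(self):
        self._pending = []
        self._arrival = count()   # tie-breaker keeps first-come order

    def request(self, priority: int, label: str) -> None:
        heapq.heappush(self._pending, (priority, next(self._arrival), label))

    def grant(self):
        """Return the next request to service, or None when idle."""
        if not self._pending:
            return None
        return heapq.heappop(self._pending)[2]


arbiter = BusArbiter()
arbiter.request(FILE_SYSTEM, "file read 0")
arbiter.request(SAMPLE_STREAM, "samples, voice 1")
arbiter.request(FILE_SYSTEM, "file read 1")
arbiter.request(SAMPLE_STREAM, "samples, voice 2")
grants = [arbiter.grant() for _ in range(4)]
print(grants)
```

Both sample requests are granted before either file read, even though a file read arrived first.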
