Random answer day (1)

Maybe it’s the first day of the regular NFL season or the phase of the moon. Here’s a recap of a few questions that came into the forums.

How are arranger/synth preset voices stored? First, one may ask, “How is a preset represented?” Typically, a preset voice consists of waveforms (AKA “samples”) and voice (meta-)data. The voice data control how the sample-playback engine applies filtering, amplitude envelope, modulation and so forth. The waveforms, of course, provide the basic digital audio data.
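As a rough mental model, a preset can be pictured as a record that pairs sample data with playback parameters. The sketch below is illustrative only: the class and field names (`Waveform`, `VoiceData`, `PresetVoice`, `filter_cutoff_hz`, and so on) are my own inventions for this example, not Yamaha's actual storage format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Waveform:
    """One sample: the basic digital audio data (hypothetical layout)."""
    name: str
    sample_rate_hz: int
    pcm: bytes


@dataclass
class VoiceData:
    """Meta-data that steers the sample-playback engine (hypothetical fields)."""
    filter_cutoff_hz: float
    attack_ms: float
    release_ms: float
    lfo_depth: float = 0.0


@dataclass
class PresetVoice:
    """A preset voice = waveforms + voice data."""
    name: str
    voice_data: VoiceData
    waveforms: List[Waveform] = field(default_factory=list)

    def waveform_bytes(self) -> int:
        # Counts only the sample data, not the voice meta-data.
        return sum(len(w.pcm) for w in self.waveforms)
```

The split matters for what follows: the waveforms live in wave memory next to the tone generator, while the voice data is what tells the engine how to play them.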

Arranger/synth products span such a broad range of price points that the amount and kind of storage vary quite a lot.

The lowest of the low in the Yamaha range: PSS-A50, -E30, -F30, PSR-F51. Presets are stored on a 2MByte serial flash ROM and are loaded into the processor (SWLL) at start-up. The 2MBytes include code, too! Tone generation is integrated into the SWLL. Insanely small, and very low cost.

The highest of the high in the Yamaha range: Genos. Factory presets are stored in four 1GByte ONFI NAND flash devices. Expansion memory consists of two 1GByte ONFI NAND flash devices. Wave memory connects directly to external tone generators (SWP70).

I’ve looked at the diagrams for Genos and I’m not sure about the size and function of those memory units, especially Genos USER memory and expansion memory.

Yamaha confuses people when they speak of “user memory,” “internal memory,” etc. They are usually referring to logical, user visible storage.

When getting down to the hardware level, there are many different physical memory units. Since we’re not discussing fairy dust or magic, the logical storage must be assigned to one or more physical memory units. And, of course, the physical memory units themselves may be composed of multiple integrated circuits. The other dimension is “what communicates to what.” Memory is passive and needs a processor to initiate reads and writes and to do something with all that data. At the physical level, a memory unit essentially belongs to a single processing unit (host computer, tone generator) and communicates directly with it.

Sometimes I think of the SWP70 as a parallel processor just like a GPU. The CPU/SWP70 is not exactly analogous to host CPU plus GPU, however. Graphics memory is shared between CPU and GPU. The SWP70 does not share its waveform memory with anybody — it’s dedicated to the tone generator. That’s why installing an expansion pack (voice library) is kind of slow and technically complicated, and why a Genos reboot is required.

Yamaha Genos SWP70 tone generators

Staying with Genos, it has two SWP70 tone generators: one handles factory presets and the other handles user expansion voices. The factory SWP70 has 4GBytes of flash memory while the expansion SWP70 has 2GBytes of flash memory (two 1GByte devices). That’s physical memory. Yamaha boosted the effective expansion capacity to 3GB through compression.

The SWP70s also have DSP RAM. As a user, you never know about this memory. It’s scratchpad memory for DSP effects. Physically, the DSP RAM is completely separate and independent from the waveform memory, and communicates with only its parent SWP70.

Yamaha Genos Host CPU

The host CPU has two kinds of memory (as determined by its bus interfaces): 1GB of working RAM on the CPU memory bus (EMIF) and two embedded eMMC memory devices that act like solid state storage drives (MMC0 and MMC1). As far as a user is concerned, the user never sees the 4GB eMMC drive (MMC0) just like you don’t see the DSP RAM; it’s hidden. The MMC0 drive contains the Linux operating system kernel and the root file system.

The user sees only part of the second 64GB eMMC drive (MMC1). The user sees the logical storage which Yamaha calls “Internal memory” or “USER drive.” What’s in the remaining 6GB? I don’t know — Yamaha haven’t left any clues.
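To keep "what belongs to what" straight, here is a small sketch that tabulates the Genos physical memory units by owner, using only the capacities quoted above. The structure and names (`GENOS_MEMORY`, `capacity_by_owner`) are hypothetical, and the DSP RAM is left out because no size is given for it here.

```python
from collections import defaultdict

# (owning processor, memory unit, physical capacity in GBytes)
GENOS_MEMORY = [
    ("SWP70 (factory)", "waveform NAND flash, four 1GByte devices", 4),
    ("SWP70 (expansion)", "waveform NAND flash, two 1GByte devices", 2),
    ("host CPU", "working RAM on EMIF", 1),
    ("host CPU", "eMMC MMC0 (Linux kernel and root file system)", 4),
    ("host CPU", "eMMC MMC1 (holds the USER drive)", 64),
]


def capacity_by_owner(units):
    """Sum physical capacity for each processing unit."""
    totals = defaultdict(int)
    for owner, _unit, gbytes in units:
        totals[owner] += gbytes
    return dict(totals)


# The part of MMC1 the user can see, given the 6GB that is unaccounted for.
MMC1_TOTAL_GB = 64
MMC1_HIDDEN_GB = 6
user_visible_gb = MMC1_TOTAL_GB - MMC1_HIDDEN_GB
```

Each memory unit appears under exactly one owner, which is the point: nothing in this table is shared between the host CPU and a tone generator.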

What about Montage and its 5.67GByte waveform memory? 5.67GB is the capacity when the waveforms (samples) are compressed. Again, this is logical storage capacity.

Yamaha Montage SWP70 tone generators

Montage has two SWP70s. One SWP70 is dedicated to FM-X and it does not have waveform memory. The second SWP70 handles AWM2 synthesis (sample playback) and has waveform memory connected to it. The waveform memory consists of four 1GByte devices totaling 4GBytes. Thanks to Yamaha’s proprietary compression, Montage stores 5.67GBytes-worth of data in the physical waveform memory. The remaining space, 1.75GB physical, is available for user samples.
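The Montage numbers imply an average compression ratio, which is easy to back out. This is a back-of-the-envelope sketch; it assumes the factory content occupies exactly the physical space that is not left over for user samples, which Yamaha does not state.

```python
# Figures quoted in the text (GBytes)
physical_total = 4.0      # four 1GByte devices
user_physical = 1.75      # physical space left for user samples
factory_logical = 5.67    # uncompressed-equivalent size of the factory waveforms

# Assumption: the factory set uses all physical space not reserved for the user.
factory_physical = physical_total - user_physical

# Average compression ratio implied by those numbers
compression_ratio = factory_logical / factory_physical

print(f"Factory set occupies {factory_physical:.2f}GB physical")
print(f"Implied compression ratio: {compression_ratio:.2f}:1")
```

Under that assumption, the factory set fits in 2.25GB of flash, or roughly 2.5:1 on average.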

How does sample capacity relate to price? It doesn’t. Component cost is outweighed by manufacturing costs, software development cost and sound design cost.

If the memory components are so cheap, why isn’t there more waveform memory? If there was more, then you wouldn’t buy the Mark II model, would you? 🙂

I understand that E30/F30 do NOT offer velocity sensitivity. My question is about the internals. Is it confirmed that it’s a keybed with two switches per key, that just aren’t supported in software?

Yes, you need to be careful here. There are hardware model differences: E30 and F30 are not velocity sensitive. A50 is velocity sensitive.

There are two different keybed printed circuit boards (PCB). Yamaha part number VAY27800 for F30/E30 and VAY28500 for A50. The A50 PCB has the necessary diodes installed for velocity sense. The F30/E30 PCB does not have the diodes. Further, the A50 board has a 12-pin connector while the F30/E30 board has an 11-pin connector — perhaps to avoid assembly mistakes.

Yamaha Reface key switch matrix schematic

Is velocity sense worth the extra bucks? There may be other differences, too, but these differences are plainly visible.

And the usual caution/disclaimer — kiss the warranty good-bye! For the money, the PSS should be good mod-fodder. Korg probably sold a mess o’monotron that way. 

Copyright © 2021 Paul J. Drongowski

The future looks bright

After reading the owner’s manual and watching the first demonstrations, it’s clear that the Yamaha Genos™ is a beautiful face-lift over the Tyros series, but where is the sonic breakthrough?

As usual, the answer was right in front of my face all along. First, a few facts and figures:

    Feature                        Tyros 5    Genos
    ---------------------------    -------    -----
    Mega Voices                       54        82
    Super Articulation voices        288       390
    Super Articulation 2 voices       44        75
    Live voices                      138       160
    Articulation buttons               2         3

Back before the specifications were officially announced, I saw a leaked version of these specs. Given the big leap in Mega Voice (MV), Super Articulation (SA) and Super Articulation 2 (SA2) voices, I didn’t think the leaked specifications were credible. Now, I believe.

In short, the new tone generation hardware in Genos enables a very large SSD-sized waveform memory capable of holding all of the waveforms needed for the boost in MV, SA and SA2 voices. The end result is greater musical expression, detail and realism for both the Genos player and audiences.

This blog takes a focused look at Mega Voice, Super Articulation (1 and 2), and why the “great leap forward” is possible in Genos. For PSR/Tyros purists, I hope that you don’t mind my abbreviations for Mega Voice, etc. They are much easier to type without extra punctuation marks.

Background information

MV, SA and SA2 are the trinity of highly detailed, expressive Yamaha voices. MV and SA voices are based on Yamaha’s sample playback technology AWM2 (Advanced Wave Memory). Super Articulation 2 is based on Articulation Element Modeling (AEM). Both AWM2 and AEM are covered by many Yamaha patents.

Yamaha did not introduce these voices in one fell swoop. Mega Voices were the first to appear. A Mega Voice divides a voice into two or more velocity ranges and assigns a different waveform to each range. A trumpet voice, for example, is divided into:

    Velocity range    Waveform
    --------------    ----------------------
         1 - 20       mf trumpet
        21 - 40       f trumpet
        41 - 60       ff trumpet
        61 - 80       Legato
        81 - 100      Straight
       101 - 110      Shake
       111 - 120      Falls
       121 - 127      Glissando up

MIDI notes above C6 and above C8 are mapped to valve noise and breath noise, respectively. For other examples of Mega Voices, see the Mega Voice mapping table in the Tyros 5 Data List file for details. (Also, learn how to create a Mega Voice using Yamaha Expansion Manager.)

The first three ranges and waveforms correspond to velocity switching as we know it. The remaining five ranges correspond to articulations as we know and love them in software instruments. The articulations and noises are the sonic sweeteners that make sequenced music sound more human and natural.
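The trumpet table boils down to a simple lookup on note-on velocity. Here is a minimal sketch keyed on the lower bound of each range; the names `RANGE_STARTS`, `WAVEFORMS` and `select_waveform` are mine, and a real tone generator surely does this in hardware or firmware rather than in Python.

```python
from bisect import bisect_right

# Lower bound of each velocity range in the trumpet table, in ascending order
RANGE_STARTS = [1, 21, 41, 61, 81, 101, 111, 121]
WAVEFORMS = [
    "mf trumpet", "f trumpet", "ff trumpet",                 # velocity switching
    "Legato", "Straight", "Shake", "Falls", "Glissando up",  # articulations
]


def select_waveform(velocity: int) -> str:
    """Return the waveform a Mega Voice trumpet plays for a note-on velocity."""
    if not 1 <= velocity <= 127:
        raise ValueError("note-on velocity must be in 1..127")
    # bisect_right counts how many range starts are <= velocity
    return WAVEFORMS[bisect_right(RANGE_STARTS, velocity) - 1]
```

A sequencer picks an articulation simply by writing the right velocity, e.g., `select_waveform(105)` yields the shake. That is exactly why a human player struggles: you would have to hit velocity 101 to 110 on demand.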

Mega Voices are intended for sequencing. They are used in arranger keyboard styles to make them sound less MIDI-ish. Unless you have the finger control of a god, you cannot reasonably play a Mega Voice through the keyboard.

But, wait a minute! What if you put some smart software between the keyboard and the tone generator? The smart software watches and analyzes your gestures (i.e., key presses, releases, button pushes, etc.), and plays either a regular note or an articulated note. This is the basic idea behind Super Articulation.

In the case of the trumpet, for example, the SA software watches the notes that you play and if you push the right articulation button while playing a note, the software selects and plays a shake instead of a regular trumpet sound. The SA software also analyzes note timing and plays a legato waveform when you strike a second key while holding the first key. SA software even responds to note intervals such as playing a glissando when the interval between two notes is big enough.
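In code, the SA decision logic might look something like the sketch below. It is a guess at the general shape, not Yamaha's algorithm: the function name, the order in which the rules are checked, and the `GLISS_INTERVAL` threshold are all assumptions of mine.

```python
from typing import Optional

GLISS_INTERVAL = 7  # semitones; hypothetical threshold for "big enough"


def choose_articulation(note: int,
                        prev_note: Optional[int],
                        prev_still_held: bool,
                        art_button: bool) -> str:
    """Pick an articulation from the player's gesture (illustrative rules only)."""
    if art_button:
        # Articulation button pressed while playing: shake instead of a regular note
        return "Shake"
    if prev_note is not None and prev_still_held:
        # Second key struck while the first is held
        if abs(note - prev_note) >= GLISS_INTERVAL:
            return "Glissando"
        return "Legato"
    return "Regular"
```

For example, `choose_articulation(62, 60, True, False)` returns `"Legato"`, while the same call with a leap to note 72 returns `"Glissando"`. The player just plays; the software maps the gesture onto the Mega Voice waveforms.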

In the end, Super Articulation makes Mega Voice articulations intuitively playable. I thoroughly enjoy playing the SA voices on my PSR-S950. I don’t have to think too hard at all, I just let it rip as I hear it in my head.

Montage and late model Motif- and MOX-series synthesizers implement Expanded Articulation (XA). Take a look at my deconstruction of the Tenor To The Max voice.

Super Articulation 2 takes SA up another notch. Real musical tones are not discrete sonic events. Tones tend to blend together due to the characteristics of the musical instrument itself and/or playing technique (e.g., legato). SA2 performs a digital blending between notes by analyzing gestures and selecting the appropriate waveform from a very large database of waveform segments. Broadly speaking, these segments belong to three categories:

  1. Head: Attack portion of the sound
  2. Body: Main body of the sound
  3. Tail: Release portion of the sound

Consider two notes where the first note is detached and the second note is legato. SA2 plays the head segment for the first note, sounding the attack. This is followed by the body of the first note. SA2 does not play a head for the second note. It blends the body of the first note into the body of the second note. When the second note is released, SA2 selects and plays a tail for the second note.
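The head/body/tail selection in that example can be sketched as a small planner. This only models which segments get chosen; the actual blending between bodies is the hard, patented part and is not attempted here. The function name and the note representation are hypothetical.

```python
from typing import List, Tuple


def plan_segments(notes: List[Tuple[str, bool]]) -> List[str]:
    """Plan head/body/tail segments for a phrase.

    Each note is (name, legato_from_previous). A legato note skips its head,
    and a note followed by a legato note skips its tail.
    """
    plan = []
    for i, (name, legato) in enumerate(notes):
        if i == 0 or not legato:
            plan.append(f"head:{name}")   # detached note: sound the attack
        plan.append(f"body:{name}")
        next_is_legato = i + 1 < len(notes) and notes[i + 1][1]
        if not next_is_legato:
            plan.append(f"tail:{name}")   # released with nothing to blend into
    return plan
```

For the two-note case described above, `plan_segments([("C4", False), ("D4", True)])` gives `["head:C4", "body:C4", "body:D4", "tail:D4"]`.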

All of this blending is computation heavy and is very sensitive to timing and latency. The technology behind SA2 is Articulation Element Modeling (AEM). AEM is actually a deep subject and is patented. (See my related post about Real Acoustic Sound.)

Technical breakthrough, sonic breakthrough

Folks who are familiar with software instruments and sound libraries know that all of this comes with a cost. Sample libraries for orchestral instruments are enormous because there are so many different ways to bow, pluck, strike and generally mess with acoustic instruments. Tens and even hundreds of gigabytes are needed to store the highest quality sample libraries. Then, one needs to have a fast streaming device like an SSD and a computationally husky CPU to play the samples without a glitch or hiccup.

Before Montage and Genos, Yamaha’s mainstay tone generator (TG) integrated circuit (IC) was the SWP51L. This venerable chip carried the load in Motif, MOX, CP, Clavinova, and other mid- to high-end Yamaha products.

Like all things electronic, the SWP51L’s time eventually came and went. The SWP51L communicates to waveform memory over a CPU-like bus with a fixed width address. The SWP51L is limited in three ways. First, the fixed width address is not big enough to address the very large sample library needed to support today’s articulation-heavy voices. Second, the address bus cannot be (easily) made wider. Third, the bus protocol is not directly compatible with relatively inexpensive commodity NAND flash memory. In conclusion, the SWP51L does not scale to a big waveform memory.
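To see why a fixed-width address is a hard ceiling, consider how capacity grows with address bits. The widths below are purely illustrative; the actual SWP51L address width is not given here, and the 16-bit word size is likewise an assumption.

```python
def addressable_bytes(address_bits: int, word_bytes: int = 2) -> int:
    """Maximum memory reachable with a fixed-width, word-addressed bus."""
    return (1 << address_bits) * word_bytes


# Hypothetical address widths, NOT the real SWP51L figure
for bits in (24, 28, 31):
    mib = addressable_bytes(bits) // (1024 ** 2)
    print(f"{bits} address bits -> {mib} MiB")
```

Every extra address bit only doubles the ceiling and costs another pin and trace. A command-and-address protocol like ONFI sends the address over the data bus in several cycles, so capacity is not tied to pin count in the same way.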

The Montage and the Genos deploy the new generation SWP70 tone generator. Unlike the SWP51L, the SWP70 is compatible with commodity NAND flash memory — the same kind of memory used in solid state drives (SSD). The Open NAND Flash Interface (ONFI) bus protocol — and the Genos — is scalable.

Thus, Yamaha is finally free to expand waveform memory to sample library scale.

People make much of “SSD, SSD, SSD!” SSDs use a SATA bus for communication, a bus that can become a bottleneck in itself. Yamaha have found a way to integrate SSD functionality into the SWP70 without the need for a SATA bus. The integration promises greater speed (i.e., memory bandwidth) without the cost and latency of a SATA bus. This design approach is patented. Please read one of my earlier posts about the SWP70 for the gory technical details. Hope you know a bit about computer architecture before diving in!

I’ve also speculated about the role of the SWP70 in the implementation of the Genos file system. This post is highly speculative and has not been verified by reading the Genos service manual.

What does this mean for the player?

The bottom line for the player and audiences is rich sound filled with detail and realism, thanks to big waveform memory, AWM2/AEM synthesis and Yamaha’s sound development expertise. Big waveform capacity and the new mono/stereo tone generation channels in the SWP70 also mean greater use of stereo samples (“Live voices” in PSR/Tyros-speak.)

Please look at the chart at the beginning of this article. No previous generation-to-generation Tyros upgrade has had such a big jump in the number of Mega Voice, Super Articulation and Super Articulation 2 voices. It can only get better from here as the SWP70 is the Yamaha platform for the next 8 to 10 years.

The Genos promises to be an expressive instrument which will be fun to play. The knobs, sliders and articulation buttons afford a great deal of real time control. I can’t wait to play one of these!

Longer term, what do the technical breakthroughs hold for the Montage series? You ain’t seen or heard nothin’ yet.

Copyright © 2017 Paul J. Drongowski

Genos internal memory: A speculation

Update: 23 December 2017

The article below illustrates the danger of speculation based on a few specifications. It is so easy to convince oneself that “This must be the way they did it!”

Thank goodness for service manuals.

I will eventually write a longer description of the Genos™ compute complex. Here are a few facts to tide you over:

  • The main CPU is a 1GHz TI AM4376 Sitara Cortex-A9 ARM processor (AM4376BZDN100).
  • There are two SWP70 tone generator (TG) integrated circuits.
  • The master TG has 2GBytes (physical) of wave memory (Winbond W29N08GVSIAA).
  • The slave TG has 4GBytes (physical) of wave memory.
  • The internal memory is a Toshiba 64GByte eMMC device (THGBMGG9T4LBAIR).

The eMMC device is where you’ll find the “58GByte internal memory.” The eMMC is connected to one of the ARM’s two MMC interfaces.

The Montage’s main CPU is an 800MHz TI AM3352 Sitara Cortex-A8 ARM processor (AM3352BZCZ80). Who’s the big brother and the little tag-along? 🙂

Everything about the following speculation (below) is utterly wrong. That’s why I try to explicitly label speculation and fact when writing.

A (Discredited) speculation

First, you have to get the mule’s attention.

Yamaha Genos™ hasn’t hit the streets yet and here is a speculative article about its hardware design…

I’d like to thank Kari V., Mihai and Joe H. on the PSR Tutorial Forum for getting this mule’s attention. They deserve the credit.

Spex

Here are a few Genos specifications that drew curious looks:

  • Polyphony: 256 (max.) (128 for Preset Voice + 128 for Expansion Voice)
  • Voice expansion memory: Approximately 1.8GBytes
  • Internal memory: Approximately 58GBytes

Normally, a Tyros has a large hard disk inside for bulk storage. The hard drive contains a file system to hold style files, song files, text files and a whole lot more. The Tyros 5 shipped with a 500GB hard disk drive. Tyros 5 internal memory — some form of non-volatile flash — is spec’ed at approximately 6.7MBytes. Yes, megabytes.

Word from the demonstrations is that the Genos has neither a hard disk drive nor a solid state drive (SSD). Thus, “Internal memory” is not directly user expandable or upgradeable. Eliminating the hard disk drive, the bracket and access door makes good sense because it reduces weight and chassis complexity. SSDs are still a little pricey for a cost-sensitive manufacturer like Yamaha. If it’s not a hard drive and if it’s not an SSD, then what is it?

Next, what’s up with that polyphony spec? 128 voice polyphony when you play preset voices only and 128 voice polyphony when you play a voice from user voice expansion memory? That’s rather unorthodox.

The high-level view

This is where the Yamaha SWP70 tone generator (TG) integrated circuit (IC) comes into the story.

The SWP70 uses ONFI-compatible NAND flash as its waveform memory. “ONFI” is the industry standard Open NAND Flash Interface. ONFI-compatible chips are the same NAND flash used in SSDs. The SWP70 caches the waveform data in a fast SDRAM just like an SSD in order to have fast, random access to samples.

Yamaha have created a tone generator IC that integrates an SSD-like flash and cache controller. This design eliminates the cost and latency of the SATA bus which normally connects an SSD within a PC or Mac.

For the hardware inclined, here’s a short speculative answer. There are two tone generator ICs each having their own ONFI flash memory. One TG and flash memory (call this one “TG A”) handles factory presets. The other TG and flash memory (call this “TG B”) handles user expansion voices.

The “TG B” flash memory is 64GBytes of ONFI NAND flash. Through software, it is partitioned into a file system partition (62GB?) and a user expansion voice partition (2GB).

The file system partition contains the initial factory content (4GB). The remaining space (58GB) is the “Internal memory” quoted in the Genos specifications.

So, Yamaha engineering decided to use space in one of the ONFI flash memories for bulk storage in order to cut the weight and expense of a magnetic hard drive (heavy) or an SSD (lighter than a hard drive, but not cheap).

If this is true — if — then there are some positive implications for the future of Genos. More at another time.

Ingenious, yes. User expandable, no.

Do I know this for sure? Oh, hell no. We need a service manual. Even a visual inspection of the digital logic board (DM) might not be conclusive.

The low-level view

The notional diagram below shows some of the major interfaces to the SWP70.

  • The CPU bus connects the SWP70 to the main control CPU and other major subsystems that require CPU-based data and control.
  • The ABUS allows SWP70s to communicate with each other when more than one SWP70 is in a system.
  • The waveform memory (NAND flash) communicates with the SWP70 over an Open NAND Flash Interface (ONFI) bus. This open industry standard lets Yamaha use commodity flash memory for waveform ROM. Waveform memory is split into upper and lower bytes with shared control signals. This arrangement instantly doubles bus bandwidth versus a single ONFI data channel.
  • The Serial audio bus brings audio data into the SWP70 (e.g., from the ADC) and sends audio data to the DACs and other subsystems.

Then, the fun begins. The SWP70 has three parallel SDRAM memory channels for wave and DSP working memory.

  • The DSP working memory is a large, scratch-pad memory for effect computation. I believe this memory is also the working memory for Montage FM-X.
  • The Wave working memory is a fast, read/write data cache which holds samples after they are read from the waveform memory. Remember, NAND flash favors sequential block mode read access, transferring data on the byte-serial ONFI bus. The wave working memory plays the same role as the data cache in an SSD storage unit.
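The caching role of the wave working memory can be sketched as a small page cache in front of a slow, block-oriented flash read. Least-recently-used replacement is my assumption for the sake of the example; the text does not say what policy the SWP70 uses, and `WaveCache` and `read_page` are hypothetical names.

```python
from collections import OrderedDict


class WaveCache:
    """Page cache in fast RAM in front of slow NAND flash (LRU is assumed)."""

    def __init__(self, capacity_pages, read_page):
        self.capacity = capacity_pages
        self.read_page = read_page    # callable: page number -> bytes (flash read)
        self.pages = OrderedDict()    # page number -> data, oldest first
        self.misses = 0

    def get(self, page_no):
        if page_no in self.pages:
            self.pages.move_to_end(page_no)   # mark as most recently used
            return self.pages[page_no]
        self.misses += 1
        data = self.read_page(page_no)        # slow path: go out to flash
        self.pages[page_no] = data
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)    # evict the least recently used page
        return data
```

Once a page is in working memory, the tone generator gets fast random access to the samples in it without touching flash again.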

Memory capacities vary across products depending upon target polyphony, effect workload and, of course, the sample set.

Here are capacities for the PSR-S770, PSR-S970 and Montage. All capacities are physical (i.e., raw physical storage space).

             AWM     Waveform    Wave     DSP
          Polyphony   Memory   Working  Working
          ---------  --------  -------  -------
PSR-S770     128      512MB      32MB     8MB
PSR-S970     128       2GB       32MB     8MB
Montage      128*      4GB       32MB    16MB
          * Stereo/mono

The Montage DSP working memory is twice as large as that of the PSR-S970, reflecting the larger number of supported effect units.

The ONFI standard is the same standard used in solid state drives (SSD). Thus, Yamaha can reap the benefit of lower cost commodity flash. The wave working memory caches data just like an SSD. The SWP70 design yields maximum bandwidth to and from NAND flash without the expense or latency of a SATA bus. Thanks to ONFI, Yamaha can increase waveform memory size by dropping in higher capacity ONFI-compatible devices. User waveform (voice) expansion memory resides in these same memory components, so one should expect bigger user expansion memory in the future as well as bigger factory sample sets.

The SWP70 reads and writes two flash memories in tandem effectively sending a 16-bit word on each ONFI bus cycle. (See diagram below.) One memory provides the HIGH byte and the other memory provides the LOW byte. The same ONFI control signals are sent to both. For people who like to trash Yamaha for not using SSD, please note that tandem access doubles the transfer bandwidth over a single ONFI data path solution. (Of course, an SSD could do the same thing.)
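Tandem access is easy to picture in code: two byte streams, one per flash device, zipped into 16-bit words. This is just a sketch of the data path under the HIGH/LOW arrangement described above; `tandem_read` is a made-up name.

```python
def tandem_read(high: bytes, low: bytes) -> list:
    """Combine one byte from each flash device into a 16-bit word per bus cycle."""
    if len(high) != len(low):
        raise ValueError("both devices must deliver the same number of bytes")
    return [(h << 8) | l for h, l in zip(high, low)]


# Two bus cycles deliver two 16-bit words instead of two bytes
words = tandem_read(b"\x12\xab", b"\x34\xcd")
print([hex(w) for w in words])
```

For the same number of bus cycles, the tandem pair moves twice as many bytes as a single device would.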

I’ll bet that using the ONFI waveform memory for file system access made the tone generation guys nervous. Would file system traffic rob memory bandwidth from the tone generators?

Yamaha know latency. They spend a lot of time, money and intellectual effort understanding latency and conquering it. That’s where the second waveform working memory comes into play. Samples heading to the tone generators could be held in one waveform working memory while file system data could be held in the second, separate working memory. This organization separates the memory traffic and prevents file access from disturbing the critical, must-be-predictable sample stream. When the two channels arbitrate for the ONFI bus, the sample stream feeding tone generation could be given priority.
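Purely to illustrate the fixed-priority idea (and bearing in mind that this whole section is speculation that later proved wrong), here is a toy arbiter in which sample requests always win the bus over file system requests. The class and method names are hypothetical.

```python
from collections import deque


class OnfiArbiter:
    """Toy fixed-priority arbiter: sample stream first, file system second."""

    def __init__(self):
        self.sample_q = deque()
        self.file_q = deque()

    def request(self, kind, page_no):
        if kind == "sample":
            self.sample_q.append(page_no)
        elif kind == "file":
            self.file_q.append(page_no)
        else:
            raise ValueError("kind must be 'sample' or 'file'")

    def grant(self):
        """Return the next (kind, page) to get the bus, or None if idle."""
        if self.sample_q:
            return ("sample", self.sample_q.popleft())
        if self.file_q:
            return ("file", self.file_q.popleft())
        return None
```

Even if a file request arrives first, pending sample requests are served ahead of it. The price is that file access can starve while the tone generator is busy, which is tolerable for bulk storage but never for audio.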

Copyright © 2017 Paul J. Drongowski