What’s inside of a Yamaha arranger?

Curiosity finally got the better of me and I decided to find out what’s inside of the Yamaha PSR-S750/S950 arranger keyboards. Fortunately, Yamaha provides service manuals for its products. The manuals have block diagrams and schematics as well as disassembly information, etc.

My first impression is that the S750 and S950 are quite different beasts inside even though a fair amount of user-level functionality is similar between the two products. However, some internal differences are pretty obvious and expected due to different features:

  • The S950 has a bigger set of voices and styles.
  • The S950 supports a wider range of effects on all four DSPs.
  • The S950 adds vocal harmony.
  • The S950 has a color display and can display lyrics and so forth through a video output.

Both products are relatively complex, multiprocessor systems, so the analysis below is greatly simplified.

As you might expect, both products have a main processor (CPU) to handle the user interface, the USB interface, and so forth. The S750 has an SWX08 CPU, which is most likely a Yamaha-sourced SH3 or SH4 system-on-a-chip (SOC). The SWX08 has a Yamaha part code and is probably manufactured by Yamaha itself. The S950 has a Renesas SH7331 processor, which has an SH4AL-DSP CPU core. Yamaha has employed Hitachi/Renesas SH processors for many years. The SH4 is a reduced instruction set computer (RISC) that handles both general-purpose computing and digital signal processing (DSP). The SH4AL-DSP can perform a multiply/add step in one clock cycle. Both machines are capable of handling some DSP duties on the main CPU. The SH7331 is clocked at 256MHz while the SWX08 is clocked at 135MHz.
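To put the single-cycle multiply/add in perspective, here is a rough Python sketch of a FIR filter (the classic multiply/add workload) plus a back-of-the-envelope count of how many such steps fit into one output sample period at each clock rate. The 44.1kHz sample rate, the function names, and the idea that every clock cycle could go to multiply/adds are my assumptions for illustration, not figures from the service manuals. I also don't know whether the SWX08 has a single-cycle multiply/add at all, so treat its number as an optimistic ceiling.

```python
def fir(samples, coeffs):
    """Direct-form FIR filter: each inner-loop step is one multiply/add."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * samples[n - k]  # one multiply/add step
        out.append(acc)
    return out


def macs_per_sample(clock_hz, sample_rate_hz=44_100):
    """Upper bound on multiply/add steps per output sample.

    Assumes one multiply/add per clock and no other work on the CPU,
    which is optimistic. The 44.1kHz default is an assumption.
    """
    return clock_hz // sample_rate_hz


# Two-tap moving average over a short test signal.
print(fir([1.0, 2.0, 3.0], [0.5, 0.5]))

# Clock rates quoted in the text.
print("SH7331 @ 256MHz:", macs_per_sample(256_000_000))
print("SWX08  @ 135MHz:", macs_per_sample(135_000_000))
```

In practice the main CPU also runs the user interface, USB, and sequencing, so the real DSP budget is only a fraction of these ceilings.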

The S750 program memory is 256Mbits. The S950 program memory is split between a 64Mbit flash boot memory and a 4Gbit main program memory (Hynix HY27UF084G2M). The Hynix part is NAND flash organized as 512M x 8 bits: the address and data are clocked sequentially through a single 8-bit port. Since this is a relatively low bandwidth interface, the program is loaded into SDRAM working memory first and then executed from there. The S950 working memory consists of four 128Mbit devices plus one 256Mbit device for a total of 96MBytes ((4 * 16MByte) + 32MByte). I wouldn’t be surprised to find audio track data stored in the big NAND flash along with the program image. The S750 working memory is 64MBytes (2 * 256Mbit) of SDRAM.
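Memory parts are specified in bits while the totals above are in bytes, so it is easy to slip by a factor of eight. The short Python check below redoes the arithmetic using only the device densities quoted in the text. The variable and function names are mine, and I'm assuming the usual binary convention (1 Mbit = 2^20 bits).

```python
MBIT = 2**20  # bits per Mbit, binary convention (assumption)
GBIT = 2**30  # bits per Gbit


def mbytes(bits):
    """Convert a size in bits to whole MBytes (2**20 bytes)."""
    return bits // 8 // 2**20


# Working memory (SDRAM)
s950_working = 4 * 128 * MBIT + 1 * 256 * MBIT
s750_working = 2 * 256 * MBIT

# Program memory (flash)
s950_nand = 4 * GBIT
s950_boot = 64 * MBIT
s750_program = 256 * MBIT

print("S950 working memory:", mbytes(s950_working), "MBytes")
print("S750 working memory:", mbytes(s750_working), "MBytes")
print("S950 NAND program memory:", mbytes(s950_nand), "MBytes")
print("S950 boot flash:", mbytes(s950_boot), "MBytes")
print("S750 program memory:", mbytes(s750_program), "MBytes")
```

The 4Gbit NAND works out to 512MBytes, which matches the 512M x 8-bit organization.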

Tone generation on both machines is performed by an SWP51L integrated circuit (IC). This is a custom Yamaha IC. The SWP51L has a 64Mbit by 16-bit SDRAM for DSP, attached through a dedicated channel. The SWP51L is fed by wave ROM divided into HIGH and LOW banks. Each bank sends a 16-bit data stream to the SWP51L. Surprisingly, the wave ROM capacities are the same: both the S750 and the S950 have two banks of 1Gbit NOR flash memory (2Gbits, or 256MBytes, in total).

Neither product has a separate dedicated memory for downloadable expansion packs. The main CPU very likely reserves 64MBytes in the wave ROM for expansion pack samples. (“ROM” is a bit of a misnomer in this context.) Thus, one could expect to see larger expansion memory in future products when more wave memory is added.
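If the 64MByte reservation guess is right, the split of wave memory between factory voices and expansion packs is easy to work out. The Python sketch below does so from the two 1Gbit banks. The 64MByte figure is my guess from the paragraph above, not something stated in the service manuals, and the variable names are made up for this example.

```python
GBIT = 2**30   # bits per Gbit
MBYTE = 2**20  # bytes per MByte

# Two banks (HIGH and LOW) of 1Gbit NOR flash each.
wave_bytes = 2 * GBIT // 8

# Guessed reservation for expansion pack samples (assumption).
expansion_bytes = 64 * MBYTE

factory_bytes = wave_bytes - expansion_bytes
expansion_share = expansion_bytes / wave_bytes

print("Total wave memory:", wave_bytes // MBYTE, "MBytes")
print("Left for factory voices:", factory_bytes // MBYTE, "MBytes")
print("Expansion share: {:.0%}".format(expansion_share))
```

Under that assumption a quarter of the wave memory is set aside for expansion packs, leaving 192MBytes for the built-in voices.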

The vocal harmony and display processing are handled by separate dedicated processors. The vocal harmony processor (SSP2) is connected to the output of the microphone analog-to-digital converter (ADC). SSP2 has its own dedicated DSP RAM and program memory. Each product has a display controller: the S1D13700 Embedded Memory Graphics LCD Controller on the S750 (black and white LCD) and the Yamaha Advanced Video Display Processor 7 (AVDP7) on the S950 (color LCD).

It’s interesting to look back at earlier Yamaha keyboards. The PSR-1500 and PSR-3000 were released in 2004. Here’s a table comparing past (2004) with present (2012).

            PSR-3000  PSR-1500  PSR-S950  PSR-S750
SA                 0         0        62        38
MegaVoice         10         0        23        18
Regular          261       273       571       523
Sweet             14         8        27        24
Cool              18         5        64        46
Live              19         1        39        29
Wave ROM        64MB      16MB     256MB     256MB

The Yamaha MOX and MOXF, for comparison, have 355MBytes and 741MBytes of wave memory, respectively, when converted to 16-bit linear format.

The Super Articulation (SA), MegaVoice and Live voices are the most memory-hungry. Both SA and MegaVoice voices need multiple articulations (multiple waveforms). The Live voices are sampled in stereo and require twice as much space as the equivalent mono (regular) voice. Of course, there are many other factors, such as the number of multi-samples and loop length, that affect memory usage and sound quality, so a grain of salt is needed when interpreting these numbers.
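To get a feel for the stereo penalty, here is a rough Python estimate of sample storage. It assumes uncompressed 16-bit linear samples at 44.1kHz, which is my assumption for illustration. The actual wave ROM format is not described in the service manuals and may well be compressed, so the real numbers could differ considerably.

```python
def sample_bytes(seconds, channels=1, sample_rate_hz=44_100, bits=16):
    """Storage for one uncompressed waveform (format is an assumption)."""
    frames = int(seconds * sample_rate_hz)
    return frames * channels * (bits // 8)


# A hypothetical two-second waveform, mono versus stereo.
mono = sample_bytes(2.0)
stereo = sample_bytes(2.0, channels=2)
print("mono:  ", mono, "bytes")
print("stereo:", stereo, "bytes")

# How much uncompressed mono audio would fit in 256MBytes of wave memory.
WAVE_ROM_BYTES = 256 * 2**20
mono_seconds = WAVE_ROM_BYTES / sample_bytes(1.0)
print("mono capacity: about", round(mono_seconds / 60), "minutes")
```

Even under these simple assumptions, the whole wave memory holds less than an hour of mono audio, which is why multi-sample count and loop length matter so much.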