All is swell (SWL)

Posted on August 22, 2018 by pj

Yamaha develop a wide range of keyboard products from low-cost entry-level ‘boards to high-end synthesizers and digital workstations (AKA “arrangers”).

Within a market segment, the engineering challenge is to develop, manufacture and test a product with the desired feature set at the target selling price. I won’t discuss profit margin here since no one really knows, but Yamaha. We do know, however, that amortized non-recurring and recurring costs must be low enough to produce a significant return. Cost sensitivity is simply a day-to-day reality.

The entry-level segment is the most cost-sensitive segment because most customers in this segment are looking for an inexpensive keyboard with basic functionality. Think “Parents buying a first keyboard for a kid who may walk away from the whole thing in a week or two.” The entry-level segment outsells the mid- and high-end portable keyboard segment by nearly 2 to 1:

    Category                       Units            Retail value
    -----------------------------  ---------------  -------------
    Acoustic guitars               1,499,000 units  $678,000,000
    Electric guitars               1,132,000 units  $506,000,000
    Digital pianos                   135,000 units  $165,000,000
    Keyboard synthesizers             81,000 units  $104,000,000
    Controller keyboards             160,000 units  $ 32,000,000
    Portable keyboards under $199    656,000 units  $ 64,000,000
    Portable keyboards over $199     350,000 units  $123,000,000
    Total portable keyboards       1,006,000 units  $187,000,000

    Sales Statistics for 2014, USA market

Synth fanatics should note that although the average selling price (ASP) is higher for synths, the portable keyboard segment moves a much higher number of units. Fortunately, for manufacturers playing in the entry-level portable keyboard space, volume is relatively high and non-recurring cost can be laid off across a larger number of units than synths.

The entry-level segment has one other important driver — the desire for portable, battery operation. This design consideration limits the amount of electrical power available for computation and thus, limits the amount of computational capacity itself. Some dynamic power can be bought back through lower CPU clock speeds. Folks accustomed to giga-Hertz CPUs may be shocked to see such low clock speeds! Lower clock speeds simplify cooling and reduce overall weight by eliminating heat sinks and cooling fans.

LSI vs. commodity

Yamaha perceive their proprietary expertise in large scale integration (LSI) as a competitive advantage. Although Yamaha exploit commodity components where possible, tone generation and digital signal processing (DSP) are performed in proprietary hardware.

User interface and control (e.g., USB communications, MIDI, LCD, etc.) are a good fit with commodity CPU technology. Yamaha — and Roland — have a long history with H8 and SH architecture CPUs from Hitachi, now Renesas. Early products employed H8 microcontrollers for host CPU functions. Yamaha eventually migrated to the “Super H” reduced instruction set computer (RISC) family. (In 2011, Renesas announced the end of the H8 line.)

Yamaha have a considerable investment in software built and tuned for the SH family. Thus, migration to a new commodity architecture (ARM) is a pretty big deal with a high internal cost. Yamaha have adopted ARM for panel scanning/control in Reface and are using ARM processors for host computation in Montage and Genos. Time and experience will show if ARM is adopted in the entry- and mid-range segments, too.

Old faithful

Yamaha’s entry-level models rely on “old faithful,” the SWL family of proprietary Yamaha processors. The SWL is used in all entry-level models — a good way to drive volume manufacturing of a custom part. The SWL family has undergone several revisions over the years. I don’t intend to recount that history here.

The SWL01U was used in many products including the PSR-E443. The external clock crystal oscillates at 16.9344MHz yielding an internal clock speed of 33.8688MHz by scaling. The relatively low clock speed reduces heat and power consumption. The following diagram shows the typical “compute complex” in an entry-level keyboard. [Click to enlarge.]

The structure in the diagram is generic across Yamaha entry-level products. If you dive into the service manual for a specific entry-level keyboard, you’re likely to find a “compute complex” like this generic one although memory capacities and such are model specific.

The SWL01U provides a CPU bus to which a USB controller (optional), program/wave ROM, flash ROM and SDRAM are attached. The SWL01 has many on-board interfaces: keyboard scanning, LED/LCD interface, bit-serial audio (ADC, DAC), control knob sensing, etc. The SWL01U has an integrated USB controller which can be deployed in ultra low-cost, minimum component count designs.

The SDRAM is, of course, read/write working memory. The flash ROM retains user data when power is turned off.

The program and waveform data are stored in the same physical memory component. In the case of the PSR-E443, the prog/wave memory is a 16MByte parallel NOR flash memory. The factory sound set, therefore, is smaller than 16MBytes. Panel voices, the XGlite sound set, and drum kits are crammed into this small memory along with the E443’s software.

The SWL01U integrates 32 tone generation channels and relatively “lite” DSP effects (reverb, chorus and flanger). I have not had the chance to browse the service manual for the PSR-E453 (or E463). E453 polyphony increased to 48 voices and the DSP effect types got a modest bump. I expect to find a new, updated member of the SWL family in these newer keyboards.

Anyone modestly familiar with microcomputer systems will look at the diagram above and say, “It’s just a computer system,” and they would be right. The simplicity of the system — and its low cost — severely limit tone generation and effect processing, however. The bottleneck is the shared system bus. All traffic must cross this bus whether it is instructions for scanning the keyboard matrix, waveform samples for tone generation, or working data for DSP effects. There is only so much bus (memory) bandwidth and it must be split several ways.

We often think of tone generation as compute-limited. Tone generation may be memory (or bus) bandwidth limited, too. Each mono channel of tone generation must read 88,200 bytes per second:

    44,100Hz * 2bytes = 88,200 bytes per second

For 32 tone generation channels, total required bandwidth is :

    88,200 bytes per second * 32 channels = 2,822,400 bytes per second

This rate must be guaranteed in order to avoid audible artifacts. (Tone generation reads are probably given highest priority by the hardware.)

The system bus does not operate at the same speed as the CPU clock. Assuming 2 clocks per bus operation (conservative estimate), 2.8MBytes/second is a significant fraction of available system bus bandwidth (17 percent). The number of channels cannot be increased without affecting the latency of host operations such as key scanning and real-time player control (e.g., front panel knobs).

Who’s counting?

Entry-level products have a low component count thanks to all of the functionality integrated into the SWL. Low component count has many benefits including smaller printed circuit boards (PCB), lower power, fewer solder connections to go wrong during manufacturing, smaller chassis, etc.

The SWP01U has 176 pins around a modest-sized, quad flat surface mount package. By putting all memory traffic on the CPU bus, i.e., not using a dedicated memory channel for waveform samples, Yamaha have achieved a relatively low pin count. [I never thought I would ever refer to 176 pins as “relatively low.”] Other Yamaha solutions have a much greater pin count due to separate dedicated memory channels. Those solutions, however, deliver a much higher level of performance and polyphony. More about this in future posts.

What’s up, clock?

What’s up with those clock speeds? Why not something “even,” like 16MHz?

Turns out, 16.9344MHz is a multiple of the sample playback frequency:

    16,934,400Hz = 44,100Hz * 24bits * 16

The SWL generates the sample clock for the ADC and the DAC.

The PSR-E443’s ADC is a Texas Instruments PCM1803ADBR 24-bit analog to digital converter. A note in the schematic states “MCLK=768fs, fs=44.1kHz, 24-bit left justified, HPF on, Slave Mode.” 768*fs is 33.8688MHz which is exactly the CPU clock frequency.

The PSR-E443’s DAC is a Cirrus Logic (Wolfson) WM8524CGEDT/R 24-bit digital to analog converter. A note in the schematic states “SYSCLK=33.8688MHz (768fs), BCLK=2.8224Mhz (64fs), WCLK=44.1kHz (1fs), 24-bit left justified.”

You can find the datasheets for the ADC and DAC by searching the Web.

The PCM803A and WM8524 support three audio formats: left justified, right justified and I2S. The formats and clock scheme are rather common and standard, and are supported by most commodity audio ADC and DAC components. The SWL processor, ADC and DAC remain in synch because the CPU clock and the sample clock are one and the same.

So long!

I hope this blog post has given some insight into the design of entry-level musical instrument keyboards.

MIDI TRS pin-out

Posted on August 20, 2018 by pj

The MIDI Association has released Recommended Practice #54 covering the use of TRS connectors for standard current loop MIDI. If you’re not a member of the MIDI Association, I strongly recommend becoming a member. Among other benefits, members have access to the Association’s on-line specifications and recommended practices.

Without putting too fine a point on it, the diagram from RP#54 says nearly all. [Click image to enlarge.]

MIDI device designers are facing the same sort of space issues as phone developers. The physical form factors are getting too darned small! In many cases — no pun intended — connector size is a limiting factor. Old school 5-pin DIN connectors are enormous relative to slender product envelopes. The MIDI Association recommends 2.5mm TRS connectors and a maximum adapter cable length of 2 meters.

So, of course, you’re wondering about the MIDI controllers, etc. that you already own. A quick check with a digital meter or continuity checker will determine if your current device is consistent with the recommended practice or not. I checked the adapters with my Akai MPX8 and, yep, the adapters are consistent.

The Korg HNS-4331 MIDI conversion cable is also consistent with the RP #54 convention. Albeit, both the Akai and Korg adapters have 3.5mm TRS plug, not 2.5mm. 2.5mm is a good idea since it would be all too easy to plug 3.5mm headphones or a live LINE output into a MIDI port. Yikes!

The IK Multimedia cable that came with my iRig MIDI is a bit of an oddball. First, the 5-pin DIN connector is male, not female as shown in the diagram above. Further, the pin 4 and 5 connections are swapped relative to the pin wiring shown above. So, if you’re wiring for IK Multimedia, please take care and check things out for yourself using a continuity checker.

Sure wish this recommended practice was available a few years ago…

In case you missed it, here’s the MIDI reference circuit.

Please join and be sure to read the 2014 update to the MIDI electrical specification.

Yamaha VKB-100 redux

Posted on August 10, 2018 by pj

OK, sooner or later, you knew I would circle back to the Yamaha VKB-100 VOCALOID keyboard. This little gem is a vocal keytar that lets you play pre-loaded lyrics using an installed Vocaloid library sound. Up to five libraries may be installed including the VY-1 library that ships with the VKB-100.

The VKB-100 can be had for the relatively low price of roughly $400USD, depending on shipping cost from Japan. The VKB-100 is only available in Japan at the current time.

Here’s a quick summary of the specs (translated from Japanese):

Number of keys: 37
Keyboard type: HQ (High Quality) MINI keyboard
Maximum polyphony: Vocaloid (mono), Instruments (48)
Number of voices:
    Vocaloid: Up to 5 libraries, Preset: 1 library (VY-1)
    Instrument: Preset: 13
Effects: Reverb, distortion, chorus, tremolo
Equalizer: Flat, Boost, Bright, Mild
Lyric operation: Loop, phrase return/forward, head search, recover
Skill
    Vocaloid: 2 assignable skills
    Instrument: Skill 1: Sustain, Skill 2: Portamento
Memory slots: Vocaloid: 20, PCM sound source: 20
Main controls Pitch Bend Wheel, Expression Wheel, Effect Knob, 
    Select Knob, Selector Slider, Transpose button, Phrase Button, 
    Memory Button,  Skill Button, Octave Button, Loop Button, 
    Master Volume Knob
USB: USB to HOST, USB to DEVICE
Audio connections
    Headphone out
    AUX in
    Line out
Amplifier output: 0.7W
Internal speaker size: 3.6cm
Power adapter: PA-150B
Battery power: 6 x AA alkaline or rechargeable NiMH batteries
Battery life: About 7 hours when using alkaline batteries
Width x depth x height: 821 mm x 121 mm x 65 mm
Width x depth x height: 32.3 in x 4.8 in x 2.6 in
Weight: 1.5kg
Accessory soft case: SC-KB350 (5,500 Yen)

The keybed is probably the Reface keybed. (Sturdy, slightly clack-y.) What’s that? Instruments? Hmmm…

I don’t want to run through the operational particulars, again. Please see the following pages for background information:

It’s cheap, it straps on, it has a keyboard and wheels. Can it be used as a MIDI controller? As an instrument in it’s own right?

The build quality looks pretty decent and the ergonomics are nice. There are two wheels on the neck: pitch bend and expression. And here we hit the first bump in the road — no modulation. A quick look at the MIDI chart and we find that the VKB-100 sends:

Pitch bend
Portamento time (CC# 5)
Expression (CC# 11)
Portamento ON/OFF (CC# 65)
Release time (CC# 72)

That’s it. Now, I suppose one could remap CC# 11 to modulation, but it would be better to have modulation generated natively.

The other gotcha. The VKB-100 has two modes: normal and keyboard mode. The VKB-100 only sends MIDI controller/key messages in keyboard mode, not normal mode. The user must select keyboard mode through a menu and this setting is not retained across power off. That means changing to keyboard mode after every shutdown.

The VKB-100 doesn’t receive much of MIDI anything except some undocumented SysEx. Thus, forget about sequencing.

The final gotcha is MIDI over USB only. If you want to drive an old school 5-pin DIN MIDI module or keyboard, you will need to bridge USB to 5-pin.

So, what about those preset instruments? Here’s a list:

Synth 1
Synth 2
Synth 3
Synth 4
E. Guitar
Harmonica
Tenor Sax
Piano
E. Piano
Synth Bass
Slap Bass
Air Choir
Applause

Quality is comparable to an entry-level PSR keyboard. The Tenor Sax voice, for example, sounds like the Sweet! Tenor Sax in a PSR-E443. At least you have four effect types (reverb, distortion, chorus, tremolo) with four effect depth levels each. Kinda basic.

After a few insipid J-Pop YouTube demos, you want to blow your brains out. I watched them, so you don’t have to. Here are a few demos to check out.

This video demo is not J-Pop. It’s a couple of players who do an interesting quasi-jazz classical improv.
See an English language quick overview by Yamaha’s Blake Angelos
Can the VKB-100 sing English or German? Watch this video from Ploytec.
Here are Instrument demo number one and instrument demo number two.

Hey, Blake! You got to go to Superbooth? I’m jealous!

Bottom line, I was intrigued until I dove into the (Japanese) manuals and found the VKB-100’s limitations as a controller and stand-alone instrument. Ordering from Japan is no big deal, but the VKB-100 would have to offer some real cool features and sounds to compensate for the labor of translating Japanese to English and dealing with display messages in Japanese.

If you want to get your feet wet with Vocaloid, I recommend the Gakken NSX-36 Pocket Miku module. No keytar action, but you get a multi-timbral MIDI module that does Miku Vocaloid at a very modest price (less than $50 USD).

Insertion effects for MIDI songs

Posted on July 9, 2018 by pj

The new Yamaha Genos™ platform greatly expands the number of DSP insertion effects for styles and MIDI songs. No doubt, you would like to put these insertion effects to work in your own styles and MIDI songs. This blog post should help you get started.

There are 28 insertion effect units at your disposal:

Insertion Effect 1 to 19: Keyboard parts (RIGHT1, etc.) and Song channels 1 to 16.
Insertion Effect 20: Microphone and Song channels 1 to 16.
Insertion Effect 21 to 28: Style Parts (except Audio Styles).

Within the constraints of these three groups, any Insertion Effect unit within a group may be assigned to any audio source associated with the group.

I will use the terms “Insertion Effect” and “DSP effect” interchangeably. This is true when you delve into the Yamaha XG parameters, too.

With all this flexibility, effect resource management can easily get out of control. I’ve developed a few personal guidelines to help keep things organized:

Genos assigns RIGHT1, RIGHT2, RIGHT3, and LEFT to Insertion Effects 16, 17, 18 and 19. Avoid using these Insertion Effect units in a MIDI Song.
Assign the remaining Insertion Effect units on a 1-to-1 corresponding basis: DSP unit 1 to Song part 1, DSP unit 2 to Song part 2, etc.

These simple guidelines make it easier to manage track DSP usage when doing the busy-work of Song editing.

Genos also provides a Variation Effect which can be configured as either a System effect or an Insertion Effect. Let’s not even go there for now. The Variation Effect offers additional opportunities for signal routing and control. Unfortunately, opportunity comes at the cost of complicated configuration.

If you want more information about using the Variation Effect, here’s a pair of blog posts for you: PSR/Tyros XG effects and XG effects: SYSTEM mode.

It’s simple then — each DSP unit (Insertion Effect) corresponds to a single Song part. Each unit and its part have the same identifying number.

If you’re sequencing on the Genos itself, you can assign Insertion Effects to Style and Song parts using the Mixer. Go to the Mixer, touch the “Effect” tab at the Left of the screen, and then touch the “Assign Part Setting” button. Genos displays the insertion effect assignment dialog box where you can make assignments. This dialog box is a good way to check that your MIDI sequence is making the correct assignments, too.

I do my MIDI sequencing and editing in BandLab Technologies SONAR (formerly Cakewalk SONAR). This means configuring DSP effects via System Exclusive (SysEx) MIDI messages. Many people fear SysEx because the messages are encoded in hexadecimal numbers. Fear not! I’m going to give you a head start.

At a minimum, we need to create two SysEx messages for each Insertion Effect:

One message to assign the DSP unit to the Song part, and
One message to select the DSP effect type (e.g., British Legend Blues).

This is enough to assign a DSP effect preset (and its algorithm) to a Song part. Once assigned and the MIDI sequence is loaded, you can edit the effect parameters in the Genos GUI by spinning the faux knobs and such. When you hear a setting that you like, you can translate the settings into additional SysEx messages and incorporate the messages into the sequence using a DAW like SONAR.

First things first. The SysEx message to assign the DSP unit to a Song part has the form:

F0 43 10 4C 03 XX 0C YY F7

where XX is the DSP (Insertion Effect) unit number and YY is the Song part number. The only potential gotcha is MIDI unit and part numbering — it starts from zero instead of one. For example, let’s assign DSP unit 6 to MIDI part 6. (I’m assuming that the MIDI part and channel numbers are the same; the usual default situation.) In this example, XX=5 and YY=5, so the final SysEx message is:

F0 43 10 4C 03 05 0C 05 F7

Straightforward.

You may already be aware that hexadecimal (hex) is a way of counting (i.e., representing numeric quantities) in base sixteen. The hex digits 0 to 9 have their usual meaning. Hex digits A, B, C, D, E, and F represent the numeric quantities 10, 11, 12, 13, 14, and 15, respectively, when those quantities are written in base 10, decimal notation. You’ll need those hex digits when connecting DSP units 10 to 16 and Song Parts 10 to 16.

In case you’re still unsure of yourself, here’s a simple table to help you out:

DSP#  Part#   SysEx message
----  -----   -----------------------------------
   1      1   F0 43 10 4C 03 00 0C 00 F7
   2      2   F0 43 10 4C 03 01 0C 01 F7
   3      3   F0 43 10 4C 03 02 0C 02 F7
   4      4   F0 43 10 4C 03 03 0C 03 F7
   5      5   F0 43 10 4C 03 04 0C 04 F7
   6      6   F0 43 10 4C 03 05 0C 05 F7
   7      7   F0 43 10 4C 03 06 0C 06 F7
   8      8   F0 43 10 4C 03 07 0C 07 F7
   9      9   F0 43 10 4C 03 08 0C 08 F7
  10     10   F0 43 10 4C 03 09 0C 09 F7
  11     11   F0 43 10 4C 03 0A 0C 0A F7
  12     12   F0 43 10 4C 03 0B 0C 0B F7
  13     13   F0 43 10 4C 03 0C 0C 0C F7
  14     14   F0 43 10 4C 03 0D 0C 0D F7
  15     15   F0 43 10 4C 03 0E 0C 0E F7
  16     16   F0 43 10 4C 03 0F 0C 0F F7

Find the row in the table for the Insertion Effect (DSP unit) number and Song Part that you want to configure. The third column is the SysEx message to use.

Once the DSP unit is assigned to the Song Part, you need a SysEx message to choose the DSP effect type (e.g., British Lead Dirty). The SysEx message to accomplish this job has the form:

F0 43 10 4C 03 XX 00 MM LL F7

where XX is the DSP unit number, MM is the MSB of the effect type and LL is the LSB of the effect type. The effect types are listed in the Genos Data List PDF file. Look under the “Variation/Assertion Block” section of the Effect Type List. British Lead Dirty is a distortion effect with MSB=102 and LSB=32.

The next step is to convert the MSB and LSB to hexadecimal. I think this is the part that scares some folks the most. Actually, Yamaha have made it easy. While you’re in the Geno Data List PDF file, go to the first “MIDI Data Format” page. You’ll find a table that converts between decimal, hexadecimal and binary. Look up 102 and 32 in the table. The equivalent hex values are 0x66 and 0x20. (The “0x” is my way of marking hexadecimal values.)

After converting, it’s time to select the DSP effect type for unit 6 (and by way of assignment, Part 6). Plug XX=5, MM=66 and LL=20 into the template message above, producing:

F0 43 10 4C 03 05 00 66 20 F7

This message sets the effect type of DSP (Insertion Effect) 6 to British Lead Dirty.

That’s it. At this point, you’re ready to assign DSP preset effects to any of the Song parts. Style parts work the same way. No calculator involved, just a few easy tables.

Changing the DSP effect parameters via SysEx is a little bit more complicated. I’ll save that topic for another day.

Summer snoozer?

Posted on June 26, 2018 by pj

Summer NAMM 2018 starts in a few days and so far it looks like a snoozer. I haven’t found many preliminary announcements of interest. I’m tempted to predict Yamaha’s MOXF replacement (or upgrade), again. However, at some point, one starts to look like a broken clock being right twice a day. 🙂

Yamaha have announced a new product in the P-series digital pianos. The new model P-515 replaces the P-255 as the P-series flagship.

The P-515 is a slab that incorporates much of the new digital piano technology that was introduced with the CSP series (Smart Pianist and Piano Room). The P-515 includes NWX natural wood white keys with synthetic ebony and ivory key tops. Piano sounds include the Yamaha CFX and Bösendorfer Imperial. It has forty panel voices plus a 480-voice XG sound set. The CFX Grand Voice has binaural sampling which recreates the perspective of the player position. Other goodies include key-off samples, smooth release, Virtual Resonance Modeling (VRM), and half pedaling. VRM includes damper, string, Aliquot, and body resonance. Bluetooth audio/MIDI is built-in. Speakers are built-in: Oval 15W (12cm by 6cm) and dome 5W (2.5cm). [Two of everything, of course.] Weight is 48.5 pounds and MSRP is $1,999 USD and MAP (street) is $1,500 USD. A little too heavy for regular gigs out, but its compact size, sound and action make the P-515 a winner for home, especially at $1,500 street. [Manuals now available.]

60s retro is always in with me, so I’m charmed by the Vox Mini SuperBeetle. (Yes, they are using that abominable spelling.) Vox hit the real Super Beatle with a shrink ray, producing a small version that should have some punch. The Mini puts 50 Watts into a 10″ Celestion speaker (or other 4 ohm cab). It has a Korg NuTube tremolo circuit for warmth and 60s reverb.

The Mini has the distinctive Super Beatle chrome stand. I wish Korg/Vox would reissue the Continental in retro form instead of the form factor and stand that everybody pretty much hates. No pricing information yet.

Although it was announced in May, the IK Multimedia UNO analog synth looks like a good time. For a $200 USD synth, the specs aren’t too bad: two independent VCOs, 2-pole multi-mode VCF, LFO, VCA, step sequencer, battery or USB power, and it’s tiny. The UNO has four control knobs which are mapped to synth parameters through a 4 by 4 matrix. Plus, it has five performance buttons. Wisely, the UNO supports old school MIDI IN and OUT via 2.5mm jacks.

If only IK Multimedia had the presence of mind to add 5-pin MIDI to the iRig Keys I/O (25 or 49). Systems thinking, people. Systems thinking.

Roland (and Boss) have announced a new wireless audio system for guitar and other instruments. Two products are aimed at guitar players: the WL-20 and the WL-50. The WL-20 is a very compact transmitter and receiver pair — transmitter for the guitar and receiver for the amp. The WL-50 is a wireless receiver for the pedal board and has additional functionality like providing pedal board power, transmitter recharging, etc. The WL-20L is like the WL-20, but it’s for electronic instruments with line-level audio outputs. Other features include low latency, automatic rendezvous between transmitter and receiver (10 second rendezvous time) and up to 20 meter range. The WL system operates in the 2.4GHz band. I’m interested in the low-latency aspect because Bluetooth doesn’t cut it in this application.

The WLs will be sold under the Boss brand.

News you can use. Avid has announced and released a free version of Sibelius® | First. Sibelius | First is my go-to notation tool for lead sheets, deconstructing MIDI solos, etc. For sure, I will be downloading it and will appreciate the update. (My current copy is rather old, having been part of an M-Audio bundle.) You need to create an Avid account in order to download the free version. Might as well download Pro Tools® | First for free, too.

The modular synth trend has inspired a lot of great products such as the Moog Grandmother. Expect to see new announcements in the Roland System-500 line and literal “plug and play” products in other realms like the Finegear mixerblocks series.

Well, there is always the Moog One rumor. (Save your pennies.)

Time stretching applied to rotary speaker sound

Posted on June 20, 2018 by pj

“Sí, sí, I am very intrigued.”

With Summer NAMM 2018 one week away, I cast the net to see what I can catch. I did a quick sweep of recent patents and came up with a good ‘un.

When folks mention Yamaha, “tonewheel clone” does not immediately come to mind. Other players like Nord, Hammond Suzuki, etc. seem to be ahead in the clone market. So, I was a little surprised to find US Patent 9,899,016 B2, “Musical sound signal generation apparatus that generates sound emulating sound emitting from a rotary speaker.” This patent was issued and assigned to Yamaha on February 20, 2018. It is based on the Japanese patent 2015-171065 issued August 31, 2015.

Yamaha currently use two sample-based methods to generate the basic organ sound:

Playback and mix of waveforms for each individual tone wheel. On Montage and Genos, for example, the musician can adjust the level of each footage using the sliders to mimic drawbars. The generated sound is passed through a rotary speak DSP effect.
Playback of waveforms for “full up” organ registrations with and without the rotary speaker effect “sampled in.” The resulting sound may also be passed through a rotary speaker DSP effect.

In the first case, especially, the overall impression of a genuine B-3 depends upon the quality of the DSP rotary speaker effect. The up-side of the DSP effect is the ability to ramp up and ramp down the rotary speaker speed. So far, reaction to Yamaha’s rotary speaker effects has been mixed.

In the second case, one is not likely to put the sound through a rotary DSP effect — the swirling mass would just not be realistic. The “sampled in” approach can sound more realistic than the rotary DSP effect, but it has two major drawbacks:

The rotary speaker speed cannot ramp up and down between slow and fast rotation.
Sample playback does not align (synchronize) the rotary speaker position, so some noted are “rotating” faster than others and the true spatial characteristics of the horn and rotor are lost.

The second drawback is perhaps the worst of the two since it introduces audible artifacts which are not part of the true rotary speaker sound.

The method in the patent is a different take on sample-based synthesis of tone wheel sound which seeks to eliminate these problems. The notes are sampled for each tone wheel footage after a real world rotary speaker rotating at a particular rate. In each case, wavfeorms are sampled and saved for various rotational angles of the rotary speaker. Thus, the rotary speaker effect is “sampled in.”

Let’s quote from the patent:

Also, the electronic musical instrument has a time stretching function. The time stretching function is a function of changing the length of a sound while maintaining the pitch and formant of the sound. In other words, with the time stretching function, it is possible to extend and shorten a sound in a time axis direction, or in other words, it is possible to change only the reproduction speed (speed with which time advances) of the musical sound signal. The electronic musical instrument uses the time stretching function to extend and shorten each piece of waveform data in the time axis direction by the same extension and shortening rates.

Time stretching is applied to each of the tone wheel samples during playback. Thanks to time stretching, the instrument can reproduce the SLOW and FAST sound, and everything in between when the rotation speed ramps up or down. “A known pitch synchronous overlap and add method is used to achieve the time stretching function.”

The rest of the method — and it is both exhaustive and exhausting! — deals with the synchronization of the waveforms during playback, that is, the alignment of each waveform in accordance with the current virtual position (rotational angle) of the rotary speaker. Throw in separate treatment of the horn and rotor, stereo channels, etc.

The end result is a unique sample-based method that eliminates the problems of “sampled in” rotary speaker effects. I wish that patents came with audio demo files as it would be a treat to hear the method in action and to judge with one’s own ears. Maybe someday in a product?

Boston Music Expo 2018

Posted on June 12, 2018 by pj

After having so much fun last year, I couldn’t pass up the 2018 Boston Music Expo (Saturday, June 9). Music Expo brings people together — artists, producers, engineers, composers, tech companies — the whole panoply of folks at the intersection of musical art and technology.

Sound On Sound Magazine is the chief sponsor. This year’s gold sponsors are Yamaha and Steinberg. Of course, both Steinberg and Yamaha were showing their wares along with many other companies big and small.

Loïc Maestracci — the founder of Music Expo — was at the door with the chance for a quick “Hello!” Let’s get started and go in.

Boston Music Expo 2018 was hosted by The Record Co., located in Boston’s South Bay. The Record Co. has the ambitious mission “to build a sustainable, equitable music scene in Boston.” Although Boston already has a busy scene, it isn’t easy for all artists to grow, collaborate and record. The Record Co. provides subsidized studio space, gear and production resources, thereby lowering the financial barrier for artists looking to record.

The Record Co. has two studios, both kitted out with top-notch gear. Rates are very reasonable. The Studio A live room is quite large and was the venue for one of the two parallel seminar tracks running at Music Expo. Studio A held 40 to 50 seats with space to spare. Studio B is smaller and more intimate.

The thing that I like best about Music Expo is the surprises. While getting my bearings, I was blown away to find people soldering! I had stumbled into the Audio Builders Workshop sponsored by the Boston Chapter of the Audio Engineering Society (AES).

The Audio Builders Workshop offers seminars and group builds to encourage and inspire people to make their own audio electronics. I had a great chat with Brewster LaMacchia (Clockworks Signal Processing) who was leading the group build. The workshop participants were building a small metronome kit ($10 donation). The kit consists of a circuit board, 555 timer, speaker, battery connector, and a handful of discrete components. It’s all through-hole construction and looks like a great way to get started with soldering. If you’re in the Boston area and have an interest in audio electronics, then I definitely recommend getting in touch with this organization.

I bought one of the kits and will eventually build and review it. Sometimes I just like to soldering something up on a rainy day.

Another organization at Music Expo that deserves recognition and support is Beats By Girlz. BBG is a “music technology curriculum, collective, and community template designed to empower females to engage with music technology.” BBG sponsors workshops and other events (hardware and software provided!) to get women and girls into music production, composition and engineering.

That last “E” for “engineering” gets me fired up! Music technology, for me, is the gateway drug to Science, Technology, Engineering and Mathematics (STEM) education and careers. Women are so woefully underrepresented in STEM that I wholeheartedly support groups like Beats By Girlz. In addition to Boston, BBG has chapters in Minnesota, Los Angeles, New York and Chicago. I recommend Women In Music, too, BTW.

I arrived at Music Expo a little later than expected due to a traffic tie-up on the expressway. (Saturday morning? Really?) However, I did manage to catch the two sessions in which I was most interested.

Since it was first announced, I wanted to see and hear Audionamix Xtrax STEMS in action. I’ve tried to spice up my backing tracks with vocal snippets and found center extract (and center cancel) techniques lacking. My first “must-hear” session at Music Expo was an Xtrax STEMS plus Ableton Live presentation by Venomisto. Venomisto used Xtrac STEMS to pull a vocal stem from an existing song and then inserted the vocals into his own remix. Xtrax STEMS is not perfect, but it’s darned good for the money ($99 USD).

I really dug Venomisto’s latin remix, Havana. Toe tappin’, head noddin’. I love this stuff on a Saturday in the city! [I’m listening to it right now and can’t get back to work.] Cruise over to his site and you’ll hear Xtrax STEMS in action, too.

My second “don’t miss” session was “From Score To Stage” by Paul Lipscomb joined by Pieter Schlosser via Skype. Paul ran through the process of sketching and delivering the “Destiny 2” game soundtrack (Bungie Software). Wow, this session could have been a full day.

Although Paul wanted to show people that there are many ways to work and create as an artist, we’re talking “Production” here with a capital “P”. The Destiny 2 soundtrack is a AAA (big) budget production with multiple composers, orchestrators and an orchestra. All I can say, if you want to do this kind of work, be good at the hang and collaboration. Be prepared to work in a geographically dispersed team: client (Bellevue/Seattle), co-writers (Los Angeles, Seattle), orchestrator (The Berkshires in Massachusetts).

Paul classifies music (and the process of getting there) as either linear or interactive. Music for film or video is linear, having a start point, several intermediate points one after another and an end. Game music is interactive and must adapt and re-structure itself to fit the actions of the player.

He demonstrated how one can start with a simple motif (or two) and build your way to a 250 track behemoth. Thanks to the wonderful orchestral libraries available today, composers can put together a rather complete mock-up to present to a client for approval. Even on a big budget job, some of the parts in the mock-up may make it to the final mix simply because there isn’t enough money available to fund everything live (e.g., you can have the orchestra, but not the choir).

Paul uses Steinberg Nuendo and swears by it. Pieter uses Cubase. Nuendo is the bigger brother to Cubase and is geared for post-production and scoring. Paul exports MIDI tracks and provides them to the orchestrator for notation. Yep, good old MIDI.

Paul and Pieter’s presentation was thought provoking, especially about the current state/direction of orchestral music for film, video and games. A discussion about clients and aesthetics would be more appropriate for the “Notes From The Deadline” column in Sound On Sound. [My favorite SOS column, BTW.] However, I’m pondering the age-old question of how to raise our clients to a higher level of musicality. Like Paul, many of us listen to a wide range of music including traditional and modern classical music. (Paul’s advice: “Listen to everything!”) How can we move our clients beyond the limited scope of their own musical experience?

Well, shucks, that’s just two of the fifteen Boston Music Expo sessions on offer. Several sessions dealt with the business side — promotion, social media and collaboration — in addition to the artistic side.

I spent time cruising the exhibitor booths. Here’s a few short-takes and shout-outs:

Scott Esterson at Audionamix demonstrated Instant Dialog Cleaner (IDC) as well as XTrax STEMS. He humored a lot of my crazy questions and comments. Thanks.
The Yamaha folks had Montage6, MX88, MOXF8 and a clutch of Reface keyboards available for trial. Friendly as ever, it was good to touch base. I had an extended conversation with Nithin Cherian (Product Marketing Manager, Steinberg) and I quite appreciate the time that he spent talking with me.
The IK Multimedia iLoud Micro Monitors are excellent for the price. Not quite up to the Genelec studio monitors on show in the room next door, but much more affordable. A definite covet.
Speaking of IK, the iRig Keys I/O have a decent, solid feel and touch. The 25 key model is seriously small and still has full size keys. Suggestion to IK Multimedia: Please bring out a 5-pin MIDI dongle for us dinosaurs with old keyboards. I’d love to hook up an iRig Keys I/O 49 to Yamaha Reface YC.

A special shout-out to Derrick Floyd at the IK Multimedia booth. He epitomizes “good at the hang.”

I said it last year and I’ll say it again, Music Expo bridges the widening gap between customers and technically advanced products. On-line ads and videos just aren’t the same as playing with a product and experiencing it for one’s self. Brick and mortar stores cannot devote much space, inventory or expertise to the broad range of fun tools and toys that are up for sale. With on-line sales as perhaps the dominant sales channel, whoof, tactile customer experience is utterly lost. Music Expo closes the gap.

If Music Expo is coming to your corner of timespace, please don’t hesitate to attend and participate. I’m sure that you will enjoy the experience and will make valuable connections.

Which guitar is which?

Posted on June 4, 2018 by pj

I hope my recent post about single coil and double coil guitar tone and amp simulators was helpful. Today, I want to further reduce theory to practice.

A quick recap

Guitar pickups are important to overall guitar tone. There are two main types of pickup: single coil and double coil. Players generally describe the sound of a single coil pickup as bright or thin and describe the sound of a double coil pickup as warm or heavy. Double coil pickups are also called “humbuckers” because the design mitigates pickup noise and hum. Pickup tone tends to favor certain styles of music:

Single coil: Blues, funk, soul, pop, surf, light rock and country styles
Double coil (Humbucker): Hard rock, metal, punk, blues and jazz styles

Of course, there are no hard and fast rules and exceptions abound!

Fender guitars frequently use single coil pickups while Gibson favors double coil. Three guitar models are favorites and are in wide use:

Fender Telecaster (Usually 2 single coil pick-ups): Bright, banjo-like tone, twangy.
Fender Stratocaster (3 single coil pick-ups): Bright, cutting tone.
Gibson Les Paul (2 humbucker, dual coil pick-ups) Warm tone with sustain.

The Telecaster was originally developed in 1951 for country swing music. It was quickly adopted by early rock and rollers. The Stratocaster appeared in 1954, but is usually associated with 60s rock. It is often used in rock, blues, soul, surf and country music. The darker tone and sustain of the Les Paul make it suitable for hard rock, metal, blues and jazz styles.

These aren’t the only (in)famous guitars around. The Rickenbacker solid and semi-acoustic models are also classic. Think about the chime-y Beatles and Byrds radio hits from the 1960s. Single coil Ricks are not uncommon.

If you would like to hear the difference in raw tone between Fender Telecaster (single coil), Fender Stratocaster (single coil) and Gibson Les Paul (double coil humbucker), cruise over to this comparison video. The demonstrator compares raw tone starting at roughly 7 minutes into the video, ending at about 11 minutes. The first part of the video is the usual yacking and the last part of the video puts the guitars through an overdrive effect with the demonstrator playing over a backing track. The last part is less informative because our ears need to sort out the guitar from the backing track. Plus, once you put a guitar into a distortion effect, all bets are off. Are you hearing the true guitar tone or just an effected, synthesized tone?

Method to the madness

My ultimate goal is to identify and classify synth and arranger guitar voices, single coil vs. double coil, in order to quickly chose an appropriate guitar voice (patch) for MIDI sequencing. I work with Yamaha gear (Genos workstation, PSR-S950 arranger, and MOX6 synthesizer), so the following discussion will focus on Yamaha. However, you should be able to apply the same method (and guesswork about names!) to Korg, Nord, whoever.

Yamaha provides some major clues as to the origin of its guitar samples, but they are quite reticent to use brand names. Arranger (Genos and S950) voice names are especially opaque. Therefore, the best we can do is to use the clues when possible and to always, always use our ears.

Fortunately, the deep voice editing of the MOX6 lets me dive into the guts of a guitar patch to find the base waveform information including waveform name. In order to get the analysis started, I went into the Mega Voice patches to find the underlying waveforms. When Yamaha sample a guitar, they sample multiple articulations (open string, slap, slide, hammer on, etc.). The waveforms for a particular instrument are a family and share the same root name like “60s Clean.” Given the base waveforms, I then can identify regular synth voices which use the same waveforms. The regular voices are more easily played on the keyboard than Mega Voices, making it easier to perform A/B testing.

Mega Voices are a good entry point for analysis because the MOX, Motif and Montage family have roughly equivalent Mega Voices as the S950, Tyros and Genos product family. This allows A/B testing across and within product lines.

Development history is important, too. I took note of new Mega Voices added to each product generation. Each new Mega Voice is a new waveform family. Given a Mega Voice, I look for new Super Articulation (SArt) voices which were also added at the same time and try to find the SArt voices which are based on the Mega Voice. The chosen SArt voices become reference sounds for further A/B testing and starting points for voice selection when sequencing a song.

When A/B testing, all EQ, filter and DSP effects (including reverb and chorus) must be turned OFF. We need to reveal the sound of the underlying raw waveforms (samples). Even so, there may still be sonic differences due to VCF and VCA programming. I found that this kind of critical listening is quite tiring and it’s better to work for 30 minutes, walk away and come back later with fresh ears. Otherwise, everything starts to sound the same!

Breakdown

Enough faffing around, get to the bottom line.

First up is a correspondence table between Montage (Motif, MOX) Mega Voice guiters and Genos (Tyros, PSR S-series) Mega Voice guitars.

       Genos name            Motif/MOX name        Motif/MOX waveform
---------------------------  --------------------  ------------------
8 10 4 60sVintage                                  n/a [Strat]
8 11 4 60sVintageSlap                              n/a [Strat]
8  4 4 50sVintageFinger                            TC Cln Fing *
8  5 4 50sVintageFingerSlap                        TC Cln Fing Slap
8  6 4 50sVintagePick                              TC Cln Pick *
8  7 4 50sVintageSlap                              TC Cln Pick Slap
8  8 4 SlapAmpGuitar       
8  3 4 SingleCoilGuitar      Mega 1coil Old R&R    1Coil *
8  1 4 SolidGuitar1          Mega 60s *            60s Clean *
8  2 4 SolidGuitar2          Mega 60s *            60s Clean *
8  0 4 CleanGuitar           Mega 1coil *          Clean *
8  0 7 JazzGuitar            Mega Jazz Guitar      Jazz *
8  0 5 OverdriveGuitar       Mega Ovdr Fuzz        Overdrive *
8  0 6 DistortionGuitar      Mega Ovdr Distortion  Distortion *

A star (“*”) in the table is a placeholder for all of the voices and variants within a family. Motif/MOX have many variants of “Mega 60s” and “Mega 1coil” voices. They all use the “60s Clean” and “Clean” waveforms in different ways, including different stomp box and amplifier effects. A star in the waveform column denotes a waveform family, i.e., collectively a group of waveforms for all of the articulations sampled from the same instrument.

A few observations. Montage did not add any new guitar Mega Voices. Montage does not have a Stratocaster waveform. [A future upgrade for Montage?] Finally, I couldn’t quite work out where “SlapAmpGuitar” fit into the voice universe.

“Slap,” by the way, is a playing technique borrowed from bass players. The thumb hits a string instead of a pick or finger. Usually the lowest string is slapped because it is the most easily hit by the thumb. The slap may be combined with palm or finger muting to prevent other notes/strings from sounding with the slap.

Beyond Mega Voice

Folks know by now that Mega Voices are for styles and arpeggios. Yamaha never intended them to be played using the keyboard. It’s darn near impossible to play with the kind of precision required to trigger the appropriate articulation (waveform) when needed. They’re good for sequencing (styles, arpeggios) because a sequence can be edited in a DAW with precise control over note velocities.

None the less, musicians wanted to be able to play these great sounding voices and Yamaha responded with Expanded Articulation (Motif XS and later) and Super Articulation (Tyros 2 and later). I won’t dive into Expanded Articulation here. Super Articulation, however, effectively puts a software script in front of a Mega Voice. The script translates each player gesture to one of the several articulation waveforms which comprise a Mega Voice.

This description is notional. I doubt if the software uses an actual Mega Voice as the target. Some gestures like legato technique are handled in the AWM2 engine à la Expanded Articulation.

If you followed my suggestion to audition the Mega Voices without EQ, effects, etc., then you surely know how difficult it is to play a Mega Voice from the keyboard. Should you try this, I recommend setting the touch curve to HARD in order to hit those ultra low key velocities. Or, set RIGHT1, RIGHT2 and RIGHT3 to a fixed velocity. By changing the velocity level, you’ll be able to play a specific waveform within a Mega Voice precisely and reliably. Please refer to the Mega Voice maps in the Data List file to see the correspondence between velocity levels and waveforms.

To audition without Mega Voice and to select Genos (Tyros, S950) voices for sequencing, it’s far easier and fun to play a Super Articulation (SArt) voice. Problem is, with Yamaha’s opaque voice naming, it’s difficult to know the exact waveform family you’re triggering. So, I built a table of SArt reference voices by matching SA voices with their Mega Voice equivalent.

Genos Mega Voice      SArt reference   Waveform
--------------------  ---------------  ------------------------
60sVintage            60sVintageClean  [Strat]
60sVintageSlap        TBD              [Strat]
50sVintageFinger      CleanFingers     TC Cln Fing *
50sVintageFingerSlap  FingerSlapSlide  TC Cln Fing Slap
50sVintagePick        VintageWarm      TC Cln Pick *
50sVintageSlap        TBD              TC Cln Pick Slap
SlapAmpGuitar         TBD              TC Cln Fing Slap Amp/Lin
SingleCoilGuitar      SingleCoilClean  1Coil *
SolidGuitar1          WarmSolid        60s Clean *
SolidGuitar2          WarmSoild        60s Clean *
CleanGuitar           CleanSolid       Clean *
JazzGuitar            JazzClean        Jazz *
OverdriveGuitar       TBD              Overdrive *
DistortionGuitar      HeavyRockGuitar  Distortion *

Single coil vs. double coil? That’s easy. The only double coil guitars are SolidGuitar1, SolidGuitar2, and any SArt voice built on the 60s Clean waveform. All other guitars are single coil.

Hmmm. I’ll bet that a double coil Gibson Les Paul and/or Gibson SG are in the works. Yamaha will eventually fill the gap!

A few entries in the table are TBD, “to be determined.” Definitively identifying slap guitar has eluded me so far. I can hear a difference between non-slap and slap, but finger slap vs. picked slap, my ears aren’t there yet.

All in all, it was a useful exercise to strip away the effects and EQ. It reminds me of the scene in the documentary “It Might Get Loud” in which The Edge demonstrates his effects pedal board. First, the plain tone of the guitar, then the huge sound with all of the effects piled on. Thanks to the tech built into our keyboards, we can be a little bit like The Edge.

Single coil, double coil

Posted on May 30, 2018 by pj

Today’s exploration is practical even if it is excessively wonk-ish.

Last week, I decided to update MIDI sequences for a few classic tunes by The Alan Parsons Project. Parsons and Eric Woolfson laid down 70s progressive rock tracks with serious groove: “I Wouldn’t Want To Be Like You,” “What Goes Up”, and “Breakdown”. Classic in their own right are the guitar solos by Ian Bairnson. Bairnson contributed electric guitar (and the occasional saxophone!) to the Parsons/Woolfson wonder duo.

I’m striving for authenticity, so one of the first questions to ask is “What guitars and amplifiers did Bairnson use for the I Robot and Pyramid albums?” Fortunately, Ian has a page dedicated to his gear. Very likely, he played a Les Paul Custom through a Marshall 50 head driving a 4×12 Marshall angle-front cabinet. Thanks for posting this information, Ian!

The next hurdle is searching through the many tens (or hundreds) of synth guitar patches, amp simulators and speaker cabinet sims to find the most authentic audio waveforms and signal processing effects. Bang, we run into a practical and wonk-ish problem: Which of these many digital choices are likely candidates and which choices can we ignore? Unfortunately, manufacturers (at the very least, their attorneys) make the search difficult by avoiding any use of brand names (e.g., Gibson, Fender, Les Paul, etc.) in patch and effect names. Sometimes the patch/effect names are suggestive euphemisms, most times not.

For these kinds of sequencing jobs, I’m arranging on Yamaha gear, either PSR-S950 or Genos. Although I love their sound, it’s seems that Yamaha have deliberately gone out of their way to divorce patch/effect names from their real-world, branded counterparts. The number of candidates is small in organ-land, i.e., “Organ flutes,” as Yamaha calls them, mean Hammond B-3. The number of candidates in guitar-land is much, much larger and harder to discern.

Here’s some info that might help you out. Kind of decoder for guitar instrument and amp/cabinet sim names. Even though I looked to authoritative sources, there’s still guesswork involved. So, apologies up front if I’ve led anyone astray.

Single vs. double coil

This is a biggy. Guitarists are ever in pursuit of “tone.” Of course, a big part of tone is the electric guitar at the front-end of the signal chain. In this analysis, I’m concentrating mainly on solid body guitars and I’m ignoring acoustic, hollow-body and semi-hollow instruments.

Some might argue that player style, articulations and dynamics are the true front-end. If you want to argue that point, please go to a guitar forum. 🙂

For solid body, the choice of pick-up is important. If you’re not familiar with electric guitars, the pick-up is the set of wire coils beneath the guitar strings that sense vibrating strings and convert mechanical vibration to electrical vibration. The electrical signal is sent to a volume/tone circuit and then on to a guitar amplifier. A guitar may have more than one pick-up, say, one pick-up by the neck, one under the bridge and one in the middle between the two. The pick-ups may be switched into alternative combinations. Along with the volume/tone controls, the tonal possibilities are nearly endless.

Seems kind of pathetic to rely on only one or a few guitar waveforms (samples), doesn’t it?

There are two main kinds of pick-up: single coil and double coil (humbucker). The humbucker was invented and patented by Gibson as a means of mitigating the noise (hum) present produced by a single coil pickup. The sound of a single coil pick-up is often described with terms like “bright,” “crisp,” “bite,” “attack.” Double coil pick-ups are described as “thick,” “round,” “warm,” “dark,” “heavy.”

Due to parentage, Gibson guitars usually have double coil pick-ups. Fender guitars usually have single coil pick-ups. Naturally, the quest for tone has led to hybrids using both kinds of pick-up, regardless of manufacturer.

Reducing these observations to practice, when Ian Bairnston says he used a Gibson Les Paul Custom for his work with The Alan Parsons Project, we should be looking for samples (waveforms) of a double coil electric guitar, of which the Les Paul is an excellent example. Even if you couldn’t give two wits about synth patch names, use your ears an listen for a thick, round, warm, dark, heavy tone.

Detective work

OK, I’m a wonk and did a little detective work.

Yamaha arranger patch names are obtuse about single vs. double, etc. Worse, the voices are pre-programmed with DSP effects which mask the characteristics of the fundamental waveform. So, step zero is to be aware of the masking and turn off all EQ, DSP, chorus and reverb effects when listening and making comparisons.

Doubly worse is the lack of deep voice editing where we can deep dive a voice and discover the basic waveforms underlying a voice patch, including the waveform names. This is where my trusty Yamaha MOX6 synthesizer comes into play. I use the MOX6 to deep dive its patches and then compare patch elements against candidate voices on the PSR-S950 arranger. This always leads to interesting discoveries.

Although I refer to the MOX specifically, please remember that the MOX is a member of the Motif/MOX family. Comments can be extrapolated to the Motif XS on which the MOX is based, and the Motif XF/MOXF which are a superset of the Motif XS/MOX.

A large number of MOX programs have “Dual Coil” in their name. These programs are based on the “60s Clean” waveforms. Think of “60s Clean” as a family of waveforms with multiple articulations: open strings, slide, slap, FX, etc.

Other MOX programs are “Single Coil”. These programs are based on the “Clean” family of waveforms. If you listen and compare “60s Clean” versus “Clean,” you can hear the difference between single coil and double coil. The voice programming switches between the waveforms depending on key velocity, articulation buttons, and so forth.

The “60s Clean” and “Clean” waveform families make up the “Mega 60s Clean” and “Mega 1coil Clean” MOX megavoices, respectively. Please recall that a MegaVoice uses velocity switching, articulation switches (AF1 and AF2) and note ranges to configure a versatile voice suitable for arpeggio and style sequencing. Given the underlying waveforms, we can conclude that Mega 60s Clean is dual coil and Mega 1coil Clean is single coil.

Mid- and upper-range Yamaha arranger workstations also have MegaVoices, albeit they may have small differences in patch programming. The fundamental waveforms, however, are the same. Yamaha, like all manufacturers, recycle waveforms (samples). It’s not that older waveforms are bad; they provide backward compatibility and legacy support. Ever increasing waveform memory capacity makes it easy and inexpensive to include legacy waveforms and voices.

Given that conceptual basis, I did a little A/B testing between the MOX synth and the S950 arranger. Here is a summary of the correspondence between guitar voices:

    PSR-S950 Voice     MOX6 Voice
    -----------------  ---------------------
    MV CleanGuitar     Mega 1coil Clean

    MV SolidGuitar1    Mega 60s Clean
    MV SolidGuitar2    Mega 60s Clean

    MV SingleCoil      n/a
    MV JazzGuitar      n/a

    MV OverdriveGtr    Mega Ovdr Fuzz
    MV DistortionGtr   Mega Ovdr Distortion

    MV SteelGuitar     Mega Steel
    MV NylonGuitar     Mega Nylon

This is what my ears tell me when all of the EQ, DSP, chorus and reverb effects OFF.

MV SolidGuitar1 and MV SolidGuitar2 are based on the same waveform. The patch programming is different: different EQ, VCF and VCA parameter values. The default DSP effects are different, too.

Naturally, you’re curious about the missing S950 MV SingleCoil and MV JazzGuitar voices in the MOX6 column of the table. The MOX does not have equivalent voices. However, the Motif XF eventually added “Mega 1coil Old R&R” and “Mega Jazz Guitar”, both patches based on new single coil and jazz guitar waveform families. Indeed, the MV SingleCoil is great for that old rock’n’roll twang.

Hey, S950 owners! I’ll bet that you didn’t know that you have a piece of the Motif XF under your fingertips.

[I’m still categorizing SArt voices as single or double coil. Watch this space.]

Amplify this!

That’s it for the front-end of the signal chain. What about amp simulation?

The riddle of amp sim names is difficult to solve. Fortunately, guitarists are positively obsessive about vintage amps and the Web has many informative sites. (Too many, perhaps?) Armed with a few clues from the Yamaha Synth site, I forged out onto the Web and arrived at these educated guesses about amp simulators:

    DSP effect/sim      Real-world
    ------------------  ---------------------------------
    US Combo            Fender (Bassman?)
    Jazz Combo          Roland Jazz Chorus
    US High Gain        Boutique (Mesa Boogie Rectifier?)
    British Lead        Marshall Plexi
    British Combo       Vox (AC30)
    British Legend      Marshall (Bluesbreaker? JCM800?)
    Tweed Guy           Fender 55 Tweed Deluxe
    Boutique DC         Matchless DC30 (Boutique AC30)
    Y-Amp               Yamaha V-Amp
    DISTOMP             Yamaha stomp pedal FX
    80s Small Box       No specific make/model
    Small Stereo Dist   No specific make/model
    MultiFX             No specific make/model

The list compares quite favorably with Guitar World’s 10 most iconic guitar amplifiers:

    Vox AC30 Top Boost (1x12, 2x12)                 1958
    Fender Deluxe (1950s tweed)                     1955-1960
    Mesa/Boogie Dual Rectifier                      1989
    Marshall JCM800                                 1981
    Marshall 1959 Super Lead 100 Watt Plexi (4x12)  1965
    Roland JC-120 Jazz Chorus (2x12)                1975
    Peavey 5150 (2004: 6505)                        1992
    Fender Twin Reverb                              1965-1967
    Fender Bassman (4x10)                           1957-1960
    Hiwatt DR103 (4x12)                             1972

Several of the amp sims include cabinet simulation, too. Here are my guesses:

    DSP Sim  Real-world
    -------  --------------------------------
    BS 4x12  British stack (Marshall)
    AC 2x12  American combo (Fender?)
    AC 1x12  American combo (Fender?)
    AC 4x10  American combo (Fender?)
    BC 2x12  British combo (Vox?)
    AM 4x12  American modern (Mesa Boogie?)
    YC 4x12  Yamaha
    JC 2x12  Roland Jazz Chorus
    OC 2x12  Orange combo
    OC 1x8   Orange combo

The abbreviations “BS” and “AC” are potentially confusing. “AC” suggests the (in)famous AC series of Vox amps. “BS” suggests “Bassman”. However, I don’t recall a Vox AC 4×10, while the Fender 4×10 is iconic. A Yamaha site spelled out “BS” as “British Stack,” so I’m sticking with “A” for American and “B” for “British”.

Back to Bairnson, I’m trying the British Legend amp sim with a BS 4×12 cabinet first, then tweak.

I hope you enjoyed this somewhat wonk-ish walk through synthesizer and simulated guitar-ville. In the end, it’s tone that matters and let the ears decide.

Audio Style file format

Posted on April 16, 2018 by pj

Yamaha introduced audio styles in the PSR-S950 arranger workstation. Audio styles are both loved and hated. Loved when they sound good, but hated when people try to change or repurpose them in new styles.

The term “audio style” is a bit of an overstatement. Only the percussion track is audio. At least, that’s how audio styles have been developed and used to this day. Yamaha just released the Audio Phraser application for creating and editing the basic skeleton of an audio style, so this situation may change now that people can more freely create, edit and share their own audio styles.

Audio style file internal format

Ever since Yamaha distributed the audio styles for Genos, I’ve been meaning to take a look inside of an audio style file. Here’s a little preliminary information.

An audio style file is an IFF-like container just like a Standard MIDI File (SMF). In fact, an audio style file has the same internal organization as a regular style file which we know to be a Type 0 SMF with extra chunks.

An audio style file has the following chunks (in order):

    Type    Purpose
    ----    ------------------------------------
    MThd    SMF header chunk
    MTrk    SMF track chunk
    CASM    Yamaha CASM chunk
    AASM    Audio assembly (descriptor) chunk
    AFil    Audio file (waveform) chunk
    OTSc    Yamaha OTS chunk

The AASM and AFil chunks are new, additional chunks beyond the known MIDI, CASM and OTS chunks. All chunks have a four byte chunk identifier and a four byte chunk size. The chunk size does not include the identifier or chunk size bytes, as usual.

The AASM chunk is relatively small, about 2,500 bytes. It consists of 15 variable length ASEG subchunks. The ASEG subchunk has a four byte subchunk size. Each ASEG corresponds to a style section; that’s why there are fifteen of them.

An ASEG subchunk has three parts:

    Type    Purpose
    ----    ------------------------------------
    Adec    Identifies the style section
    Atab    Identifies the audio file; other functions unknown
    AMix    Function unknown

The Adec part is variable length, having an explicit four byte size. The Atab and AMix parts appears to be fixed length (101 and 28 bytes, respectively) and do not have an explicit size field.

The Adec part is ASCII text and is a style section name like “Main A” or “Fill In DD”. That is the only information in Adec.

I don’t know exactly what the Atab does. The Atab part contains an ASCII string which identifies the audio file associated with the style section. This string is clearly visible in a dump. (Example below.) All of the Atab and AMix parts in the test audio file have the same values except for the audio file names.

File Offset:       36965
Subchunk type:     'ASEG'
Subchunk size:     151
Section name:      Main D
Atab type:         'Atab'
   0    0    0   97    0   32   32   32 | 00 00 00 61 00 20 20 20 | ...a.
  32   32   32   32   32   41   56   48 | 20 20 20 20 20 29 38 30 |      )80
 115   67   97  110   97  100  105   97 | 73 43 61 6E 61 64 69 61 | sCanadia
 110   82  111   99  107   95   77   97 | 6E 52 6F 63 6B 5F 4D 61 | nRock_Ma
 105  110   32   68    0    0    0    0 | 69 6E 20 44 00 00 00 00 | in D....
   0    0    0    0    0    0    0    0 | 00 00 00 00 00 00 00 00 | ........
   0    0    0    0    0    0    0    0 | 00 00 00 00 00 00 00 00 | ........
   0    0    0    0    0    0    0    0 | 00 00 00 00 00 00 00 00 | ........
   1   15   -1    7   -1   -1   -1   -1 | 01 0F FF 07 FF FF FF FF | ........
   0    0    0  127    0    0    0    0 | 00 00 00 7F 00 00 00 00 | ........
 127    0    0    0    0    0  127    0 | 7F 00 00 00 00 00 7F 00 | ........
   0    0    0    0  127    0    0    0 | 00 00 00 00 7F 00 00 00 | ........
   0    0    0    0    0    0    0    0 | 00 00 00 00 00 00 00 00 | ........
AMix type:         'AMix'
   0    0    0   24    7 -128    0   -1 | 00 00 00 18 07 80 00 FF | ........
  88    4    4    2   24    8    0  -80 | 58 04 04 02 18 08 00 B0 | X.......
   7   71    0   10   64    0   91    0 | 07 47 00 0A 40 00 5B 00 | .G..@.[.
   0   -1   47    0    0    0    0    0 | 00 FF 2F 00 00 00 00 00 | ../.....

Etienne from the PSR Tutorial Forum points out that the AMix subchunk contains MIDI event codes:

AMix : header
00 00 00 18 : length of data
07 80 : 0780 hex = 1920 decimal (PPQN ?)
00 : delta time
FF 58 04 04 02 18 08 : meta event Time signature 4/4
00 : delta time
0B 07 70 : controller volume
00 : delta time
0A 40 : controller Panpot
00 : delta time
5B 00 : Controller Reverb send level
00 : delta time
FF 2F 00 : end of MTrk trunk

Nice catch, Etienne! The AMix content makes sense because something needs to set up the channel volume, pan and reverb level for the audio phrase. Yamaha love to use MIDI events for other purposes (like voice files, OTS, etc.) Why not?

The AFil chunk has substructure, too. The AFil chunk consists of ADSg chunks. As you might guess, the AFil chunk is pretty big because it contains waveform data.

The following table shows the offset and length information for the first ADSg in the example’s AFil:

    AFil     37287  15261858
    ADSg     37295   1219275      Container for an audio file
    ANdc     37303        50      File name
    AWav     37361   1219209      Container for audio waveform
    WAVE     37369       n/a      Marker (no subchunk size)
    Afmt     37373        16      Audio format information
    Sfmt     37397       217      Container for section information
    Sdec     37608         6      Section name, e.g., Main A
    Adat     37622   1218300      Waveform data
    AInf   1255930       640      Container for audio information
    BPnt   1255938       136
    OPnt   1256082       240
    APnt   1256330       232
    ATmp   1256570         0      Empty, subchunk size is 0
    ADSg   1256578                Container for the next audio file
    ....

The container relationships are important because the containers and subchunks are nested:

    AFil contains ADSg
    ADSg contains ANdc, AWav
    AWav contains WAVE, Afmt, Sfmt, Sdec, Adat, AInf
    AInf contains BPnt, OPnt, APnt, ATmp

The nesting is a bit of a pain in the patootie when writing code to parse a style file.

ADSg is the container chunk holding audio waveform (meta-)information. Like ASEG, there are fifteen ADSg chunks — one for each audio file. The ANdc subchunk inside contains the audio file name which matches up with the name in the ASEG. AWav is the container holding the audio waveform data itself.

The audio “file” format is WAV-like, but it is not exactly WAV (Microsoft RIFF). I was able to playback the audio by importing the audio style file as a raw (untyped) audio file. The audio format seems to be 44,100Hz, 16-bit stereo, big endian. No compression or encryption. It isn’t be too hard to dump the audio.

Yamaha Audio Phraser

Now that you know a little bit about what’s inside of an audio style file, here is brief overview of what the Audio Phraser program generates.

Audio Phraser generates an MThd MIDI file header chunk, a single MTrk chunk (Type 0), an ASEG chunk for each audio waveform, an AFil chunk (containing an ADSg subchunk for each audio file) and a CASM chunk.

The MIDI tempo and time signature are the same as the tempo set in Audio Phraser. The MIDI song title is set to “Audio Phraser”.

The MIDI track contains the usual markers at the beginning: SFF2 and SInt. A single SysEx message is generated after SInt: General MIDI System ON (F0 7E 7F 09 01 F7). The key signature is set to C/Am, followed by:

SMPTE Offset
Sequencer specific metadata: ff 7f 04 43 00 01 00 00

Oddly, MIDI channel 4 has four, whack-looking MIDI OFF events:

    NOTE OFF G#9
    NOTE OFF G5
    NOTE OFF C0
    NOTE OFF C0

A bug? The remaining markers indicate the start of the style sections. The section length corresponds to the length of the audio waveform for the section. Thus, if the audio waveform for “Main A” is 2 bars, then the MIDI section for “Main A” is 2 bars long.

The CASM chunk is minimal and sets NTR/NTT for MIDI channel 9 (Subrhythm). NTR is “Root Fixed” and NTT is “Bypass/Bass Off”. No NTR/NTT is given for channel 10 (rhythm/drums).

Audio Phraser does not generate an OTSc (One Touch Settings) chunk.

Audio Phraser creates an AWI file for each waveform that it imports into an audio style file. The AWI file most likely holds the results of Audio Phraser’s analysis (i.e., beat detection and so forth). It would be interesting and informative to compare the contents of an AWI file against the ASEG and AInf chunks in the resulting audio style file. I’m guessing that the AWI file is the “prototype” for the ASEG and AInf chunks.

Java source code

If you would like to explore audio style files, then download the source code for a simple audio style dump program. The code is relatively brittle and expects to encounter chunks in a certain order and/or quantity. Thus, be prepared to modify the code. This is an experimenter’s kit, after all. 😉

Sand, software and sound

Electronics and computing for the fun of it

Category Archives: Music technology