Polyphonic Arduino synthesizer

If you’re interested in building an Arduino-based ROM-pler, this next project is for you!

One of my long term dreams is to build a low-cost 60s-style combo organ. My latest project uses an Arduino UNO as a sample playback, sound synthesis engine. Although the waveforms are taken from the old VOX Continental and Farfisa Mini Compact organs, the design and code could easily use single cycle waveforms from a vintage synth, a string machine, your first born child, whatever! The 60s combo organ project is essentially a software ROM-pler that plays back up to five waveforms at a 22,050Hz sampling rate.

The project hardware consists of an Arduino UNO and a Narbotic Instruments MidiVOX shield. The MidiVOX shield has a Microchip Technology MCP4921 12-bit digital to analog converter (DAC) and an opto-isolated MIDI input. Although the MidiVOX is no longer in production, its basic circuitry is easy to recreate; several other popular audio shields use the MCP4921.

Waveforms are stored in the Arduino’s program memory (PROGMEM), just like code. Program memory is non-volatile and the waveforms are ready to go just like a pre-loaded sketch. The combo organ sketch sets up TIMER1 to generate interrupts at a 22,050Hz sample playback rate. The interrupt handler reads the next sample for each of five virtual tone generators, sums the samples together, and writes the next aggregate sample to the DAC.
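To put numbers on the interrupt rate, here is a back-of-envelope sketch in Python (it is not the sketch's actual setup code). It assumes the UNO's usual 16 MHz clock and TIMER1 running without a prescaler in a mode that resets the counter on a compare match; the function name is made up for illustration.

```python
F_CPU_HZ = 16_000_000      # assumption: stock Arduino UNO clock
SAMPLE_RATE_HZ = 22_050    # playback rate quoted in the text
NUM_VOICES = 5             # virtual tone generators


def timer_compare_value(f_cpu_hz, rate_hz, prescaler=1):
    """Compare-register value for a timer that resets on compare match.

    The timer counts from 0 up to the value, so one period lasts
    value + 1 timer ticks.
    """
    ticks_per_sample = round(f_cpu_hz / (prescaler * rate_hz))
    return ticks_per_sample - 1


compare = timer_compare_value(F_CPU_HZ, SAMPLE_RATE_HZ)
cycles_per_sample = compare + 1
actual_rate_hz = F_CPU_HZ / cycles_per_sample
cycles_per_voice = cycles_per_sample // NUM_VOICES

print(compare, round(actual_rate_hz, 1), cycles_per_voice)
```

Under these assumptions each sample period is only a few hundred CPU cycles, and that budget is shared by all five tone generators plus the DAC write. That is why the per-voice work in the interrupt handler has to stay cheap.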

MIDI communication is performed through the standard Arduino MIDI library (version 4.2). The sketch registers two callback functions via the library: a note ON handler and a note OFF handler. The MIDI note handlers configure the five virtual tone generators. The sketch’s loop() function is trivial — it merely calls the MIDI library read() function and checks a reset button on the MidiVOX shield.
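As a rough illustration of what the note ON and note OFF handlers have to do, here is a small voice-allocation model in Python. The class and method names are hypothetical, and the policy for a sixth note (simply ignore it) is an assumption; the real sketch may steal a voice or behave differently.

```python
NUM_VOICES = 5  # five virtual tone generators, per the text


class VoiceAllocator:
    """Tracks which MIDI note, if any, each tone generator is playing."""

    def __init__(self, num_voices=NUM_VOICES):
        self.notes = [None] * num_voices

    def note_on(self, note):
        """Return the generator index assigned to the note, or None if all are busy."""
        if note in self.notes:              # already sounding: reuse its generator
            return self.notes.index(note)
        for index, held in enumerate(self.notes):
            if held is None:                # first free generator wins
                self.notes[index] = note
                return index
        return None                         # assumption: extra notes are dropped

    def note_off(self, note):
        """Free the generator playing the note; return its index or None."""
        for index, held in enumerate(self.notes):
            if held == note:
                self.notes[index] = None
                return index
        return None


alloc = VoiceAllocator()
print([alloc.note_on(n) for n in (60, 64, 67, 70, 72, 74)])
```

Keeping this bookkeeping in the MIDI callbacks means the interrupt handler only has to walk the generators that are currently active.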

We all know that Direct Digital Synthesis (DDS) — the usual approach for sample playback — is a compute intensive technique for sound synthesis. DDS dynamically shifts the pitch of a stored waveform from its root pitch (the frequency of the sampled note) to the target pitch (the frequency of the MIDI note played by the musician). DDS performs waveform pitch-shifting through phase accumulation and interpolation. Floating point arithmetic is too slow and most DDS implementations use fixed point arithmetic. Even then, the computational load is heavy.
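For readers who have not seen DDS before, the following is a minimal Python sketch of a fixed point phase accumulator with linear interpolation. The 16-bit fractional width and the function names are assumptions chosen for illustration, not taken from any particular implementation.

```python
FRAC_BITS = 16                   # assumption: 16 fractional bits of phase
FRAC_ONE = 1 << FRAC_BITS
FRAC_MASK = FRAC_ONE - 1


def phase_increment(root_hz, target_hz):
    """Fixed point step per output sample; 1.0 (FRAC_ONE) plays at root pitch."""
    return round((target_hz / root_hz) * FRAC_ONE)


def dds_render(table, increment, count):
    """Read `count` samples from a single-cycle table at the given step."""
    phase = 0
    wrap = len(table) << FRAC_BITS
    out = []
    for _ in range(count):
        index = phase >> FRAC_BITS          # integer part selects the sample
        frac = phase & FRAC_MASK            # fractional part drives interpolation
        a = table[index]
        b = table[(index + 1) % len(table)]
        out.append(a + (((b - a) * frac) >> FRAC_BITS))
        phase = (phase + increment) % wrap
    return out


# A toy 4-sample "waveform" played one octave below its root pitch.
print(dds_render([0, 100, 200, 300], phase_increment(440.0, 220.0), 6))
```

Each output sample costs two table reads, a multiply, and some shifting per voice. On an 8-bit microcontroller that work adds up quickly.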

So, how did I achieve five note polyphony? Instead of storing a single waveform at a single root pitch, my approach stores twelve waveforms — one waveform for each basic pitch in the chromatic scale. The algorithm uses integer phase increments, thereby eliminating floating or fixed point arithmetic and interpolation entirely. The approach requires more space, but is quite fast. Each sampled instrument occupies 20% of program memory, allowing up to four different instruments before running out of PROGMEM.
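One way to picture the twelve-waveform scheme is sketched below in Python. This is a reading of the description, not the published sketch: it assumes each of the twelve tables holds one cycle at the lowest octave, that higher octaves are reached by stepping through the table 2, 4, 8... samples at a time, and that the lowest stored note is MIDI note 24. All of the names are hypothetical.

```python
SAMPLE_RATE_HZ = 22_050
BASE_NOTE = 24   # assumption: lowest stored octave starts at MIDI note 24


def note_frequency(note):
    """Equal-tempered frequency of a MIDI note (A = 440 Hz at note 69)."""
    return 440.0 * 2 ** ((note - 69) / 12)


def table_length(pitch_class):
    """Samples in one cycle of the stored waveform for this pitch class."""
    return round(SAMPLE_RATE_HZ / note_frequency(BASE_NOTE + pitch_class))


def tone_params(note):
    """Return (pitch class, integer phase increment, table length)."""
    pitch_class = (note - BASE_NOTE) % 12
    octave = (note - BASE_NOTE) // 12
    return pitch_class, 1 << octave, table_length(pitch_class)


def played_frequency(note):
    """Pitch that actually comes out, given the rounded table length."""
    _, increment, length = tone_params(note)
    return SAMPLE_RATE_HZ * increment / length


print(tone_params(69), round(played_frequency(69), 2))
```

The trade-off shows up in the last function: because each table length is rounded to a whole number of samples, the played pitch is very close to the ideal pitch but not exact. In exchange, the inner loop needs only an integer add and a table read.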

Here are two quick MP3 demo files: a Farfisa-type sound and a VOX-type sound. I created the vibrato by routing the audio signal through an inexpensive Behringer UV300 vibrato pedal.

As usual, we always publish code. Need a cheap ROM-pler? Now you’ve got one!

Update 22 July 2016: If you’re into retro, be sure to check out the Arduino lo-fi beat box project. Filled with lo-fi TR-808 goodness!

MidiVOX: An appreciation and review

They just don’t make ’em like they used to. In the case of the Narbotic Instruments MidiVOX shield for Arduino, I really mean it!

The MidiVox is a bit of a blast from the past as Narbotic no longer manufacture and sell the MidiVOX shield kit. Major bummer. Luckily, I purchased one of these little gems from the MakerShed when the shields were available a few years ago. Narbotic kindly maintain the design information and code on their Web site.

To me, the MidiVox is a most logical combination of a MIDI IN port and a 12-bit digital-to-analog converter (DAC). The MIDI port incorporates a 6N138 optocoupler for electrical isolation and a 5-pin DIN connector. The port is connected through a “PGM/MIDI” switch to Arduino digital pin D0, also known as the serial receive (RX) pin. The PGM position connects the serial pin the usual way in order to download to the Arduino. The MIDI position connects the Arduino serial RX pin to the MIDI IN circuitry. The switch component is robust and is easily accessible when the MidiVOX is on top of the Arduino and/or other shields.

The 12-bit DAC is a Microchip Technology MCP4921. This DAC is used in several other audio shield designs including the Adafruit Wave Shield and the Nootropic Design Audio Hacker Kit. The MCP4921 connects to the Arduino SPI port through digital pins D13 (SCK), D11 (MOSI), and D9 (chip select/slave select). Conventional practice recommends using D10 as slave select (SS), but it isn’t a big deal to use D9 instead as this is mainly a software issue. Slave Select (called “chip select” in the MCP4921) chooses and enables communication with the slave device. This capability is essential when more than one device is connected to the same SPI interface as in the case of the Nootropic Audio Hacker shield.
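To make the SPI transfer concrete, here is a Python sketch of how a 12-bit sample can be packed into the 16-bit command word that the MCP4921 expects. The bit positions follow the layout in the Microchip datasheet as best I recall it (bit 14 buffer, bit 13 gain select, bit 12 shutdown, bits 11 to 0 data), so check them against the datasheet before relying on them. The function names are hypothetical.

```python
def mcp4921_word(value, buffered=False, gain_1x=True, active=True):
    """Pack a 12-bit sample and configuration bits into one 16-bit word."""
    if not 0 <= value <= 0x0FFF:
        raise ValueError("sample must fit in 12 bits")
    word = value                 # bits 11..0: DAC data; bit 15 stays 0
    if buffered:
        word |= 1 << 14          # BUF: buffered reference input
    if gain_1x:
        word |= 1 << 13          # GA: 1 selects 1x output gain
    if active:
        word |= 1 << 12          # SHDN: 1 keeps the output active
    return word


def spi_bytes(word):
    """Split the word into the two bytes sent over SPI, high byte first."""
    return (word >> 8) & 0xFF, word & 0xFF


print([hex(b) for b in spi_bytes(mcp4921_word(0x800))])
```

On the Arduino side, the two bytes would be sent while chip select (D9 on the MidiVOX) is held low.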

Although it seems like a no-brainer to connect all SPI devices to the Arduino SPI pins, the Adafruit Wave Shield does not follow this approach. It connects the SD card interface to the SPI pins, but connects its MCP4921 to three ordinary digital pins. The Wave shield software bit-bangs the digital pins to transfer data to its DAC. I’m not a fan of this approach, preferring to use standard libraries instead of possibly buggy, poorly documented bit twiddling code.

The MidiVOX shield implements a 2-stage, passive filter following the DAC output. The MidiVOX sends a mono signal through an on-board trim pot into a 3.5mm audio output jack. Trim pots are usually rated for a relatively small number of operating cycles, so it’s best to set this level once and make volume adjustments at an external mixer, preamp, or whatever.

The MidiVOX shield provides a DATA LED controlled by digital pin D7. The shield also has a RESET button (momentary contact switch) connected to digital pin D6. This button is ACTIVE LOW, meaning that the button pulls D6 to ground when it is pressed. Therefore, the pin mode should be configured as INPUT_PULLUP such that D6 is pulled up internally when the button is not pressed (i.e., the momentary contact switch is open).

Construction was easy. The resistors have five color bands, but don’t let this throw you off. The construction directions give the correct color code and you can (and should!) always check resistor values with a meter before insertion and soldering. I replaced the basic header pins with “stackable headers” (two 8-pin and two 6-pin). Stackable headers provide a way to make easy external connections to the shield stack from a breadboard, etc.

The completed board is shown in the photo below. The MidiVOX is stacked on an Arduino UNO with the USB, audio and MIDI cables, and is ready to go.

[Photo: the completed MidiVOX stacked on an Arduino UNO]

I wrote a diagnostic sketch to check out the different parts of the MidiVOX. I wish manufacturers would provide check-out sketches instead of relying on somebody’s possibly flaky application sketch for smoke testing. If something is busted, it’s important to find it early through a directed test that isolates the failure. Fortunately, everything checked out OK the first time!

The MidiVOX diagnostic program is an Arduino sketch to check out parts of a Narbotic Instruments MidiVOX shield. Rename the “loop” functions and rebuild in order to test a particular section of the shield.

Since the MidiVOX is discontinued, we’re all out of luck if we want to get (another) one. However, I strongly recommend studying the MidiVOX design. When I first got started with Arduino and MIDI, I borrowed the MIDI IN circuitry and the low pass filter design. These are simple, solid circuits and are good basic building blocks for other designs and applications.

Where to next? My dream is to build a low-cost 60s combo organ with the era-appropriate look and sound. The organ would look like a Vox Continental with a Z-shaped chrome stand and bright red Tolex covering. It would sound like either a Farfisa or a Vox — nothing too nuanced with all of the drawbars or tabs turned on. I’d like to use a cheap and lightweight MIDI controller as the keyboard. The controller would drive a low-cost (Arduino-based?) sound generator. I’m hacking out a prototype using an Arduino UNO and the MidiVOX shield. More to come…

Make music with MMS on a PSR

Yamaha Mobile Music Sequencer includes features for Motif, MOX and Tyros5, but did you know that you can create music using MMS on your PSR arranger? Yes, you can!

I’m using MMS with both the Yamaha PSR-E443 and PSR-S950 and I have written up a tutorial on making music with MMS on PSR/Tyros. This article concentrates on set-up, MIDI voice selection and MIDI file export which are aspects not covered by the MMS manual. The tutorial complements the many on-line videos that demonstrate composition and mix down. In particular, I show how to use the full 128 voice General MIDI voice set in the PSR, thereby expanding your sonic palette beyond the limited range of voices built into MMS.

Enjoy and keep on keepin’ on!

Scat voice expansion pack

I’m pleased to release version 1 of my jazz scat voice expansion pack for Yamaha PSR-S950 and PSR-S750 arranger workstations. The expansion pack has five PSR voices which let you create “Take 6” style, a cappella arrangements and other kinds of jazz voice performances. Give the MP3 demo a try!

Four of the PSR voices are individual syllables: DOO, DOT, BOP and DOW. The DOO syllable is looped and lets you create sustained chords for backing. The DOT, BOP and DOW syllables are short and provide scat-like expression. All four syllables are combined into a velocity-switched voice where you select and play one of the syllables based on how hard you strike the keys (i.e., MIDI note velocity). You will need to adjust touch response (and practice!) to get the most playable and musical result.

Here is a link to the expansion pack file. You need to download and UNZIP this file, then install the YEP file by following the directions in the Yamaha PSR-S950/PSR-S750 Owner’s Manual. See the section titled “Expanding Voices”.

I am also releasing the multi-samples that I used to create the expansion pack in case you would like to create a scat voice for your own synthesizer or software instrument. If you are curious about how I created the expansion pack voices and the samples, please see this blog post.

Both the scat voice expansion pack and the scat voice samples are released under a Creative Commons Attribution 4.0 International License.

ScatVoices and ScatVoice samples by Paul J. Drongowski are licensed under a Creative Commons Attribution 4.0 International License.

You are free to use the expansion pack voice or samples (even for commercial purposes) as long as you provide a link to http://sandsoftwaresound.net from your own web site AND/OR explicitly credit me in your creative work, e.g., “Scat samples/voice by Paul J. Drongowski”.

Sampling “scat”

In this post, I describe the process and tools that I used to capture samples for my jazz scat voice. I will eventually release the voice (for the Yamaha PSR-S950 workstation) and its samples under the Creative Commons attribution license. I’m not the best singer, so I’ve had to rely on technology as much as possible while still producing a musical result. I want to emphasize that I sang, edited and produced all of the samples and the voice patch; it is original work.

The jazz scat voice is inspired by the (in)famous “jazz voice” patch found in Roland keyboards. The Roland patch is based on samples from the Spectrasonics Vocal Planet library by Eric Persing and Robby Duke. Their work was clearly influenced by Take 6 and other contemporary a cappella artists.

My patch uses four multi-samples where each multi-sample is a particular syllable taken over 12 (or so) pitches. The multi-samples cover the natural range of the human voice from F3 to F6 where C5 is middle C. The four syllables are: DOO, DOT, BOP and DOW. The DOOs are long, looped samples that provide a musical bed or harmony. The remaining three samples are short one shots suitable for melody, punctuation and accents. The DOW syllable falls.

The basic patch design is summarized in the following table.

Syllable  Type      Vel low  Vel high  Gain
DOO       Loop      1        89         0 dB
DOT       One shot  90       105       -3 dB
BOP       One shot  106      119       -6 dB
DOW       One shot  120      127       -9 dB

The table shows the MIDI velocity range assigned to each syllable (multi-sample). It also shows the relative gain for each syllable. The gain decreases as velocity increases in order to maintain a more consistent volume level as the keys are struck harder to trigger the one shots.
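The velocity switching in the table can be modeled in a few lines of Python. The velocity ranges and gains come straight from the table; the function names and the dB to linear conversion are only an illustration of the idea, not how the PSR implements it internally.

```python
# (syllable, lowest velocity, highest velocity, gain in dB), from the table
LAYERS = [
    ("DOO", 1, 89, 0.0),
    ("DOT", 90, 105, -3.0),
    ("BOP", 106, 119, -6.0),
    ("DOW", 120, 127, -9.0),
]


def select_layer(velocity):
    """Return (syllable, gain in dB) for a MIDI note-on velocity of 1..127."""
    for syllable, low, high, gain_db in LAYERS:
        if low <= velocity <= high:
            return syllable, gain_db
    raise ValueError("velocity must be in the range 1..127")


def gain_factor(gain_db):
    """Convert a gain in decibels to a linear amplitude multiplier."""
    return 10 ** (gain_db / 20)


for velocity in (64, 100, 110, 127):
    syllable, gain_db = select_layer(velocity)
    print(velocity, syllable, round(gain_factor(gain_db), 3))
```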

At a strategic level, the sampling production process consists of two major steps:

  1. Capture a natural voice sample for each syllable and pitch. These natural voice samples are the formants to be used in the next step.
  2. Capture a vocoded sample for each syllable and pitch while playing the appropriate formant sample through the PSR-S950 vocoder.

This process produces scat syllable sounds that are consistent, pitch accurate and in the case of the DOO syllable, loopable.

Here’s a run-down of the practical problems that motivated this approach. My voice is an untrained baritone. It cannot possibly cover the F3 to F6 range without hysterical noise and possible voice damage. As I discovered, it is nearly impossible to sing pitch accurate short syllables such as these without proper training! I needed to find a method that would give me a consistent and pitch accurate sound across the desired range of pitches. This is a greater challenge than I originally anticipated and a lot of experimentation led to the two-step method. It took about 3 weeks to find the method and then a further two weeks of production work.

Now, the details.

I used a Roland Micro-BR digital recorder to capture both natural voice and vocoded samples. This little wonder is great: easy to use, fast and above all, quiet. For natural voice, I sang into a Shure PG-81 condenser microphone feeding an ART TUBE MP preamp. The TUBE MP is a real Swiss army knife, providing phantom power for the PG-81, a little bit of tube warmth, and conversion from XLR to a line level audio signal. The output of the TUBE MP is connected to the Micro-BR. For vocoded voice, I connected the line level mono output of the PSR-S950 to the Micro-BR. In both cases, all Micro-BR input effects are disabled and gain staging is established before hitting the RECORD button.

Formants are captured and produced in the following way. I sang each syllable multiple times at each of the desired pitches while recording to the Micro-BR. The pitches cover the F3 to F6 range such that no resulting final sample would be transposed up more than one semi-tone and/or down two semi-tones. Transposing up or down more than these limits negatively affects sound quality (obvious sample speed-up/slow-down). The entire sampling session is converted to WAV format and then transferred to a PC where Sony Sound Forge Audio Studio is used to review the sung syllables and to select the best one at each pitch. Each selected syllable is saved in its own WAV file. The selected syllables are tuned with Celemony Melodyne. The tuned syllables are the formants for the vocoding phase.
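The transposition limits determine how densely the range has to be sampled. The Python sketch below works that out under stated assumptions: MIDI note numbers 41 and 77 are placeholders for the two ends of the range, and the evenly spaced root pitches it generates are hypothetical, not the pitches that were actually sung.

```python
def plan_roots(low, high, max_up=1, max_down=2):
    """Pick sample root notes so every note in low..high is within limits."""
    roots = []
    note = low
    while note <= high:
        root = min(note + max_down, high)   # root may sit max_down above note
        roots.append(root)
        note = root + max_up + 1            # first note this root cannot cover
    return roots


def choose_root(note, roots, max_up=1, max_down=2):
    """Nearest root that can play the note without exceeding the limits."""
    best = None
    for root in roots:
        shift = note - root                 # positive means transposing up
        if -max_down <= shift <= max_up:
            if best is None or abs(shift) < abs(note - best):
                best = root
    return best


def speed_ratio(semitones):
    """Playback speed change caused by transposing a sample."""
    return 2 ** (semitones / 12)


roots = plan_roots(41, 77)                  # assumption: placeholder note range
print(len(roots), roots)
print(round(speed_ratio(1), 4), round(speed_ratio(-2), 4))
```

With these limits a sample is never sped up by more than about 6% or slowed by more than about 11%, which keeps the speed-up/slow-down effect from becoming obvious.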

Sony Sound Forge is a solid audio editor. I can work fast in Sound Forge and its “Copy new” function is ideal for cherry picking a recording session. In a few cases, I had to amplify a sample to compensate for low level. When singing across such a wide range of pitches, one needs to rely on electronics/software for amplification in order to avoid voice strain! For tuning, I used the trial version of Celemony Melodyne Single Track which installed with Sonar X3. Although the procedure to enable the trial period was wonky, Melodyne is a great tool and I will very likely buy a copy.

In the second major production step, the formant syllables are sent to the PSR-S950 vocoder and vocoded syllables are recorded on the Micro-BR. The S950 vocoder is not a true synth vocoder. (The Motif/MOX and Tyros vocoders are true “synth” vocoders.) The S950 vocoder is part of its vocal harmony processor. Its “VocoderMONO” mode is designed to let (untrained) voices sing into a microphone and impose the formants onto a rather natural sounding, pitch accurate synthetic voice sound.

My early investigation found that the PSR-S950 vocoder needs clean formants that are near the desired final pitch. By clean, I mean formants that do not overdrive the vocoder input and are relatively free of the (un)natural gurgles and what not in the sounds made by the human vocal system. (Well, my vocal system anyway.) The first major step in the overall process let me select the cleanest formants. However, attempts to sing outside one’s natural vocal range introduce gurgles and rasps at the low end and off-pitch histrionics and screeches at the high end. The first major process step chooses the cleanest formants and tunes them to the desired pitches.

I loaded the formant samples into a Roland RD-300GX piano as an Audio Key set. Each formant sample is assigned to a particular key and is played by the RD-300GX when the key is struck. Basically, this arrangement gives me a simple one-shot playback engine. The output of the RD-300GX is connected to the microphone/line input of the PSR-S950 in order to drive the vocoder. The mono output of the PSR-S950 is connected to the Micro-BR.

Once everything is connected and levels are set, a little trial and error is needed to find the best formant at each desired vocoder pitch. Think of this as a dry rehearsal for the final recording. Frequently, the formant at the same desired vocoder pitch is the best choice for the vocoded sample. However, sometimes one of the nearby formants is better or produces a more consistent timbre or articulation across the multi-sample. This involves a lot of critical listening and A/B comparison, producing a list of formant and pitch pairs.

Then, it’s time to hit RECORD and capture the vocoded samples by playing the desired pitch on the S950 keyboard and playing the corresponding desired formant on the RD-300GX. Once again, the recording session is converted to WAV format, is transferred to the PC, and is separated into individual WAV files.

At this point, the DOT, BOP and DOW one shot samples are pretty much complete. The DOO samples need to be looped. For some zany reason, Sony Sound Forge Audio Studio saves loop points in Acid METADATA within a WAV file. The Yamaha voice editor does not pick up this information. After searching the Web, I discovered that loop info within a WAV file is not really standardized. Given that the target tool is from Yamaha, I decided to use Yamaha’s Tiny Wave Editor (TWE) to loop the vocoded DOO samples. This worked out pretty well as TWE’s crossfade looping eliminated some bad thumps without introducing artifacts. A lot of trial and error was still involved in choosing the loop points, however. TWE can be found for free on the Web, by the way.

The final production step is to bring all of the vocoded samples into the Yamaha Expansion Voice Editor (EVE) and produce the final voice as part of an S750/S950 expansion pack. I made five voice patches:

  1. DooLoops: DOO syllables over MIDI velocities 1 to 127
  2. GetLayeredUp: All syllables, velocity-switched
  3. DatStuff: DOT syllables over MIDI velocities 1 to 127
  4. BopOnPop: BOP syllables over MIDI velocities 1 to 127
  5. Dow2008: DOW syllables over MIDI velocities 1 to 127

The multi-samples are most easily tested and normalized individually. Plus, the DOO loops and other syllables are musically useful by themselves without velocity switching. I built the GetLayeredUp patch after testing the individual multi-samples and normalizing the volumes of the individual samples within. Choosing the patch names was really fun! (Apologies to George Clinton.)

The Yamaha Expansion Voice Editor is a trial version for which the trial period was, ahem, adjusted. Yamaha needs to just face facts and release an official version of EVE. Zillions of S750/S950 people are already using EVE and if Yamaha is somehow trying to protect its expansion pack franchise, well, that train done left the station a looooooooooong time ago. At this point, an official EVE would enhance the PSR product ecosystem and sales.

EVE does not implement velocity levels/switching. I used V. Muller’s version of the OLE Toy binary editor to set the element velocity ranges in the GetLayeredUp patch. Thank you, V. A huge amount of effort went into the analysis of YEP files and Python coding and he deserves all of the credit.

Thanks to vocoding, the final samples have a consistent sound. They are a little bit plain Jane by themselves, however. I gave each patch a little bit of reverb (reverb send level 20). I also added the “Ensemble Detune 2” DSP effect (send level 64). This is a truly spiffy effect — a chorus without modulation that gives the impression of an ensemble of slightly detuned voices. It is exactly the kind of gloss that the scat voices need.

Although the velocity ranges in GetLayeredUp are reasonable, users should still expect to tweak the keyboard velocity sensitivity and touch response to their personal needs. For example, I need to play GetLayeredUp on the softest touch setting. Your mileage will definitely vary!

Please stay tuned for the initial release of the expansion pack and multi-samples.

PSR-E443: Snap review

Ah, it’s always fun to post a “first impressions” review of a new toy! In this case, the Yamaha PSR-E443 portable arranger.

I like to use a battery powered keyboard at rehearsals since an all-in-one sets up and tears down without a lot of work. Up to this point, I’ve been playing an old Yamaha PSR-273. The 273 first made the scene in 2003, so it was definitely time for an update.

The PSR-E443 is the top of the entry-level portable keyboards from Yamaha. It has 61 keys and a built-in stereo sound system comprising two woofers and two tweeters. The E443 is powered by either an AC adapter (PA-150) or six AA batteries. So far, I’ve only used an AC adapter and don’t have a feel for battery life. Fortunately, the MOX6 uses the same PA-150 adapter and I didn’t need to buy yet another adapter. (The E443 does not ship with an AC adapter.)

For the sake of review, I played similar styles and MIDI songs on the old PSR-273 and the more expensive PSR-S950 arranger workstation ($250 street for the E443 versus $1,900 street for the S950). The E443 sells for about the same price as a mid-range “boutique” guitar pedal. Given that the E443 consists of a computer-based sound generator, analog-to-digital converter (ADC) for the auxiliary audio input, LCD display, keyboard and media content (e.g., styles, DJ patterns, voices), it’s quite a manufacturing feat to deliver a fun, usable product at such an aggressive price point!

In terms of build quality, you definitely get what you pay for. The build quality of the old PSR-273 seems to be more robust than the E443. Yamaha definitely has taken cost out of the E443 in order to sell it for a $250 street price. Although the E443 is a reasonably solid product for the home, it would definitely not hold up on the road. The push buttons do not have the same solid feel as the S950 (or the MOX synthesizer) and one needs (and should use) a gentle touch when pressing buttons. Cosmetically, the only really bothersome observation is the obvious difference between the top C key and the rest of the keys in the key bed. The top C is an add-on key which is not aligned evenly with the rest of the keys and which has a slightly different color (shade of white) than the other white keys. In comparison, the old 273 and the more expensive S950 have nice even keys and consistent key color.

The E443 has a somewhat “retro” sound set augmented by many additional voices that were added over the history of the E4xx series. The E443 and 273 share many of the same panel voices which is a little disappointing. These common voices sound somewhat better on the E443 due to better effects, equalization and sound system. However, with only a few exceptions, the panel voices in common share the same waveforms. One exception is the string voices. The E443 strings sound much better, especially in the lower octaves.

The XG sound set is definitely a step up from the 273 although the S950 XG sound set is at a still higher quantum level in quality. I played the same commercial XG file (“Smooth Operator” by Sade) through all three instruments. The 273 is truly pathetic, the E443 is acceptable, and the S950 is not too bad at all. The E443 does not have the benefit of the XG variation (DSP) effects as available on the S950 and the solo sax sounded just a tad naff. However, I think a typical consumer would be happy with MIDI file playback through the E443; it definitely beats the Microsoft wavetable synthesizer!

Although it sounds a bit negative at this point in the review, the E443 definitely shines brighter than the 273 due to the additional, augmented panel voices. These voices include the several “Cool” and “Sweet” voices, three dynamic velocity-switched voices, a handful of newer voices like “Woodwind Section”, and the many “DJ” synthesizer voices that were added to implement the DJ patterns. There are also some wonderful world voices like Trumpeta Banda and Harmonium. The sound designers also added a few dozen dual (layered) voices. Even though the dual panel voices use the same waveforms as normal non-layered panel voices, many of these dual panel voices are fatter, very playable and usable. I’m looking forward to using these “newer” voices and the improved strings at rehearsals.

The area where the E443 shines brighter than the S950 (!) is the real-time tweaking provided by the two sound control knobs on the front panel. Even though I’m not a huge synth enthusiast, I used the knobs to tweeze voices like the dynamic overdriven guitar while jamming over a style. I’m now sold on having a few knobs around for real-time tweaking and would love to see a couple of knobs on the mid-range arranger workstations. Pressing up/down buttons in the S950 mixing console just doesn’t have the same feel or immediacy. Further, a quick check with MIDI-OX shows that the E443 sends MIDI CC messages for cut-off frequency, resonance, reverb level, chorus level, attack time and release time when the appropriate knob is twisted.
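If you want to watch those knob messages in your own code rather than in MIDI-OX, a control change message is easy to decode. The Python sketch below is only an illustration: the controller numbers are the ones conventionally used for these parameters in General MIDI 2 and XG, and they are an assumption here because the review does not list which numbers the E443 actually sends. The function name is hypothetical.

```python
# Assumption: conventional GM2/XG controller numbers, not verified on the E443
CC_NAMES = {
    71: "resonance",
    72: "release time",
    73: "attack time",
    74: "cut-off frequency",
    91: "reverb level",
    93: "chorus level",
}


def decode_control_change(message):
    """Decode a 3-byte MIDI message; return None if it is not a control change."""
    status, controller, value = message
    if status & 0xF0 != 0xB0:           # high nibble 0xB marks control change
        return None
    channel = (status & 0x0F) + 1       # channels are numbered 1..16 for display
    name = CC_NAMES.get(controller, f"CC {controller}")
    return channel, name, value


print(decode_control_change(bytes([0xB0, 74, 100])))
```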

The E443 also has some advantages over the S650 (the next model up in the arranger family). The E443 supports limited voice programming and stores the same six voice parameters for the main and dual voice. These voice parameters are stored in registration memory. This makes the E443 voices tweakable. The S650 lacks even this rudimentary level of voice editing.

Like voices, the styles are a mix of old and new. The styles include many old chestnuts like “Cool8Beat.” The older styles sound better through the improved sound system, but they retain the same essential phrases. The newer styles, especially those in the “Dance” category, create more excitement. There are also a few fun additions in the Latin and World categories. Each style has a “One Touch Setting” (OTS) that selects a voice that Yamaha deemed to be appropriate for the style. Of course, this is somewhat hit or miss as personal taste and preference vary. There are a few surprises like a very nice Sweet Flute and Piano layer.

The E443 is reasonably adept at playing commercial styles in the original (and older) Style File Format (also known as “SFF” or “SFF1”). I played the styles in the MIDI Spot Soul and Blues pack and got a fairly decent result. These styles were developed for the PSR-9000 (circa 2000). It goes to show that good programming and musicality trumps mere technology! I had more trouble getting the recent “HappyBeat” style to sound decent even though Musicsoft sells this style as “PSR-E443 compatible.” It isn’t just a difference in voicing — the actual harmony sounds off and discordant. I am increasingly disappointed in Musicsoft’s notion of “compatibility.”

I successfully played back the DJX II patterns which I have been converting for PSR. More about this in a future post.

Speaking of DJ patterns, we finally are getting to the E443 functionality that makes it unique in the current arranger product line! There are twenty EDM patterns. I don’t work in the genre, so I’m not really qualified to speak to their currency or quality. However, I do know that EDM styles change with lightning speed! I also know that you cannot load new (user) patterns into the E443. You have to be happy with what Yamaha have provided. Yamaha, even if you continue to keep the internal patterns locked up — a user cannot save or play the patterns to a MIDI file or data stream — please, please, please add the ability to load new patterns. This capability would really enhance the product and create a community of developers around the E4xx series. As Patti Smith said, “This is the era when everyone creates.”

I like the Old Skool and R&B Smooth patterns the best, but that’s just me. Old Skool immediately brings up memories of Grandmaster Flash and “The Message.” Each pattern seems to have an OTS voice (panel voice number 000). The R&B Smooth pattern’s OTS brings up a nice Sweet Flute and Voice Lead layer.

The E443 has 150 arpeggios (musical phrases) for additional instant, real-time fun. The arpeggios track and respond to notes played with the right hand. (BTW, with the main, dual and split voice capability, you can play a left hand bass along with a two-voice layer with your right hand.) Wisely, there are also forty arpeggio voices which automatically bring up a voice and an appropriate arp. This makes it easy to jump into arpeggios without having to do any configuration. Of course, you can change the arp type, voice, etc. to come up with new combinations.

Between the DJ patterns and arpeggios, the E443 approaches the capabilities of the MM6/MM8 “Mini Mo” workstation. The Mini Mo had DSP effects and a smattering of Motif voices, but the E443 has more voice editing and more user style locations — all at a much lower price. If you crave the old MM6/MM8 patterns, they are available through the Yamaha Mobile Music Sequencer (MMS), where Yamaha have re-purposed them. I tried MMS with the E443 and I’m happy to report that you can drive the E443 with MMS on iPad with a little knowledge and consideration of how MMS selects General MIDI voices and drum kits. This is a subject for another day.

The E443 has a pretty decent range of drum kits. Some of the kits have been around the loop once too often and lack punch. When I was experimenting with the DJX II patterns, I noticed that the E443 Dance Kit is the older version of the Dance Kit and has been assigned a different program change number (#113) than the most current kit on the S950. This may be an issue for content creators more so than regular players.

The E443 user interface is a significant refinement of the old PSR-273 era interface. The E443 provides many direct access buttons where you just need to hold a button for a little while in order to be taken to the appropriate editing screen. Further, Yamaha have made it much easier to navigate through the “Function” menu. In the 273 era, one had to repeatedly push the function button to step sequentially through the function menu. With the E443, you navigate through the function menu using the category buttons which do double duty as up and down. Another nice improvement is the transpose button on the front panel. On the 273, I would often skip past the transpose screen and have to circle all the way around the menu. This is a true pain at rehearsals as our music director will often call for a new key right on the spot.

Overall, the E443 is “something old, something new, something borrowed, something blue.” For the street price, it’s hard to find a better value in both sound quality and fun!

Mining the Yamaha DJX II

Update: Follow this link to download a free collection of PSR/Tyros DJX-II styles.

Time to party like it’s 1999!

The Yamaha DJX II was the second generation of Yamaha “DJ” keyboards that were targeted for musicians/producers working in “dance” styles (e.g., tekno, hip-hop and drum’n’bass). Thus, the DJX II uses loop-like “patterns” as its basic musical element instead of arranger styles. The DJX II is best remembered for its unusual keyboard: some octaves had white natural keys while other octaves used grey. That’s because different octaves controlled different functions like selecting a pattern to play or transposing a pattern.

The DJX II had a selection of fairly decent patterns in different dance-oriented genres. Although I’ve never heard a DJX, its sound was probably hobbled a little bit by the sound set. The DJX II had only 4MBytes of wave ROM! The internal and external patterns are available for download from the Yamaha support site. Seems like a place to find and mine some useable musical phrases, and naturally, I’m looking for the funk. The target keyboard is the PSR-S950 arranger workstation.

The ZIP files from Yamaha unpack into a bunch of standard MIDI files (SMF). Each SMF contains a group of ten, musically related patterns that form a construction set. The SMF has a small amount of set up information at the beginning: General MIDI reset, reverb type select and chorus type select messages. Each pattern within the SMF begins with a MIDI text marker from “1” to “10”. In order to convert the SMF for the PSR-S950, I changed these markers to arranger style markers (e.g., “Main A,” “Intro A,” etc.) and added “SFF1” and “SInt” markers to the first measure. The new marker name determines the method by which the arranger will play the pattern. More about this in a second.

As I mentioned above, the DJX patterns are assigned to keys such that a single key press plays a particular pattern. The patterns are laid out according to black and white keys as follows:

Pattern  Type  Key color
-------  ----  ---------
1        Main  White
2        Fill  Black
3        Main  White
4        Fill  Black
5        Main  White
6        Main  White
7        Fill  Black
8        Main  White
9        Fill  Black
10       Main  White

Main patterns are on the white keys and fill patterns are on the black keys. Fill patterns are not restricted to one measure; a pattern may be anywhere from 1 to 256 measures in length.

Given these considerations, you may need to be a bit creative when assigning a pattern to an arranger section. Please recall that arranger introduction, ending and main sections may be 1 to 256 measures in length. Fill-in and break sections are limited to one measure. A DJX “fill” pattern may be greater than one measure and cannot always be assigned to an arranger fill-in section. Further, you may not even want to assign the fill pattern this way, preferring to invoke the pattern from one of the section buttons instead. The three introduction buttons (sections) are good destinations for a “fill” pattern because the section acts like a manually controlled fill button. The arranger will play the fill pattern (introduction) and then automatically proceed to the selected main section.
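The length limits can be written down as a small check. This is only a sketch: the Section enum and the fitsSection function are names made up for illustration, and the only facts encoded are the limits stated above (one measure for fill-in and break sections, 1 to 256 measures for introduction, main and ending sections).

```cpp
#include <cassert>

// Hypothetical names for illustration; not part of any Yamaha tool.
enum class Section { Intro, Main, FillIn, Break, Ending };

// Returns true when a pattern that is `measures` long fits the section.
// Fill-in and break sections hold exactly one measure; introduction,
// main and ending sections hold 1 to 256 measures.
bool fitsSection(Section s, int measures) {
    if (measures < 1 || measures > 256) return false;
    if (s == Section::FillIn || s == Section::Break) return measures == 1;
    return true;
}
```

For example, fitsSection(Section::FillIn, 2) is false, so a two-measure DJX “fill” has to go to an introduction (or some other long-form section) instead.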

Patterns assigned to arranger ending sections are a little problematic. An arranger ending will stop playback unless another section is selected. You’ll need to fast finger the arranger buttons when jamming.

Even though this seems complicated, it’s not really. The more difficult and time-consuming part is dealing with the drum sets and note mappings.

First, some background is needed. The DJX channel layout is very different than the arranger channel layout. Here is the layout for the 53_Soul pattern file, which is typical of all DJX II SMFs:

Channel  DJX PC#     DJX voice         S950 voice/kit
-------  ----------  ------------      --------------
9        126   0  3  BD Kit        --> Real Drums
10       126   0  4  SD Kit        --> Real Drums
11       126   0  1  B900 Kit      --> Hip Hop Kit
12       127   0  5  Analog Kit1   --> Analog Kit
13       0   112 34  Pick Bass     --> Pick Bass
14       0     0  1  Bright Piano  --> Bright Piano
15       0   112 17  Jazz Organ    --> Organ
16       0   113 27  60's Clean    --> Tremolo Guitar

Channels 9 to 12 are rhythm, channel 13 is bass, and channels 14 to 16 are phrases. By (un)convention, channel 9 is bass drum, channel 10 is snare drum, channel 11 is hi-hat and channel 12 is percussion. Channels 9 to 12 must be set up as drum parts:

F0 43 10 4C 08 08 07 01 F7
F0 43 10 4C 08 09 07 01 F7
F0 43 10 4C 08 0A 07 01 F7
F0 43 10 4C 08 0B 07 01 F7

These System Exclusive (SysEx) messages must be added to the initialization part of the SMF in order to select different drum kits independently under XG.
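The four messages differ only in the sixth byte, which tracks the channel. Here is a sketch of a helper that builds them. The name drumPartSysEx is hypothetical, and the byte layout is copied from the messages above rather than from an official reference. I am assuming that the sixth byte is simply the channel number minus one.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Builds the 9-byte message shown above for a MIDI channel (1 to 16).
// Assumption: the part number byte is channel - 1, as in the four
// messages listed (channel 9 -> 0x08 ... channel 12 -> 0x0B).
std::array<std::uint8_t, 9> drumPartSysEx(int channel) {
    const std::uint8_t part = static_cast<std::uint8_t>(channel - 1);
    return {0xF0, 0x43, 0x10, 0x4C, 0x08, part, 0x07, 0x01, 0xF7};
}
```

Calling drumPartSysEx(9) through drumPartSysEx(12) reproduces the four messages in order.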

You’ll need to choose new drum kits for the rhythm channels since the DJX II has its own unique, non-standard kits. This part is totally creative. Who’s to say what the new style should sound like? If it moves your booty, then it’s a winner! Fortunately, the bass drum, snare drum and hi-hat channels seem to use these drum instruments exclusively. This narrows the re-mapping problem. I remapped the kick first just to get a listenable groove going and then tackled the snare followed by the hi-hat. The following chart lists the DJX II drum kits and the roughly equivalent S950 drum kit.

DJX II drum kit           S950 drum kit
------------------------  ------------------------
127 0  5 Analog Kit1      127 0  25 AnalogKit
                          126 0   8 AnalogSet     [GM]
127 0  8 Analog Kit2      127 0  58 AnalogT8Kit   [Major update]
127 0 10 Analog Kit3      127 0  59 AnalogT9Kit   [Major update]
127 0 13 Analog Kit1D     127 0  58 AnalogT8Kit   [Distorted version]
127 0 14 Analog Kit2D     127 0  59 AnalogT9Kit   [Distorted version]
127 0 12 RhBox Kit
127 0  9 Hard Kit
127 0 11 Break Kit        127 0  57 BreakKit
127 0  6 Dance Kit        127 0  27 DanceKit      [Major update]
127 0  4 Electronic Kit1  127 0  24 ElectroKit
                          126 0   3 ElectronicSet [GM]

126 0  0 Electronic Kit2
126 0  1 B900 Kit
126 0  2 DJX Kit                  HipHopKit?
126 0  3 BD Kit
126 0  4 SD Kit
126 0  5 HH Kit
126 0  6 Human Kit        
126 0  7 Scratch Kit

127 0  0 Standard Kit1    127 0  0 Standard Kit1  [Legacy]
127 0  1 Standard Kit2    127 0  1 Standard Kit2  [Legacy]
127 0  2 Room Kit         127 0  8 RoomKit
                          126 0  1 RoomSet        [GM]
127 0  3 Rock Kit         127 0 16 RockKit        [Legacy]
127 0  3 Rock Kit         127 0 90 RockKit2
127 0  7 Jazz Kit         127 0 32 JazzKit
                          126 0 35 JazzSet        [GM]
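
Choosing a kit from the chart comes down to sending three short messages on the rhythm channel. The sketch below assumes that each triple in the chart is bank select MSB, bank select LSB and program number, and that the program number is the 0-based value sent on the wire (the chart includes program 0). The name selectKit is made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Returns the raw MIDI bytes that select a voice or kit on `channel`
// (1 to 16): bank select MSB (CC 0), bank select LSB (CC 32), then a
// program change. Assumption: `program` is the 0-based wire value.
std::vector<std::uint8_t> selectKit(int channel, int msb, int lsb, int program) {
    const std::uint8_t ch = static_cast<std::uint8_t>(channel - 1);
    return {
        static_cast<std::uint8_t>(0xB0 | ch), 0x00, static_cast<std::uint8_t>(msb),
        static_cast<std::uint8_t>(0xB0 | ch), 0x20, static_cast<std::uint8_t>(lsb),
        static_cast<std::uint8_t>(0xC0 | ch), static_cast<std::uint8_t>(program)
    };
}
```

Under those assumptions, selectKit(12, 127, 0, 25) returns the bytes BB 00 7F BB 20 00 CB 19, which would put the AnalogKit on channel 12.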

The DJX-specific kits (BD kit, SD kit, B900 kit, etc.) do not remotely follow General MIDI-ish conventions. It takes a lot of note mapping to get these drum patterns to play sensibly. I recommend playing back the SMF from a DAW (like Sonar) while tweaking the SMF. Do not attempt note remapping on the arranger — you’ll only drive yourself crazy!
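
If you would rather script the note mapping than edit by hand, the core of it is a table lookup per channel. This is a minimal sketch, not the process I used in Sonar. NoteEvent is a simplified, hypothetical event record, and reading and writing the SMF itself is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

struct NoteEvent {            // hypothetical, simplified event record
    int channel;              // 1 to 16
    std::uint8_t note;        // MIDI note number, 0 to 127
    std::uint8_t velocity;
};

// Rewrites note numbers on one channel using a source -> target table.
// Notes that are missing from the table are left alone.
void remapNotes(std::vector<NoteEvent>& events, int channel,
                const std::map<std::uint8_t, std::uint8_t>& table) {
    for (NoteEvent& e : events) {
        if (e.channel != channel) continue;
        auto it = table.find(e.note);
        if (it != table.end()) e.note = it->second;
    }
}
```

A call such as remapNotes(events, 9, {{60, 36}}) moves every note 60 on channel 9 to note 36. Those two numbers are placeholders, not the actual DJX II to S950 mapping. You still have to find the real pairs by ear.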

Chord progressions are part of the patterns, so the melody/chord phrases need to be transposed like introductions and endings. Please review Note Transposition Rules (NTR) and Note Transposition Tables (NTT) before forging ahead. Since the channel layout is unconventional, the CASM information must be changed to be consistent with the MIDI channel data. Channels 9 to 12 are configured for rhythm NTT/NTR (root fixed, bypass) and channels 13 to 16 are configured for intro/ending NTT/NTR (root transpose, bypass). The chord root must be changed to match the phrases (53_Soul: Fm7, 59_ClubFunk: Dm7). You’ll need to identify the root (the musical key) either by ear or by analyzing the chord harmony.

Tool-wise, I did most of the editing in Sonar X3. I used Jørgen Sørensen’s CASM editor ( http://www.jososoft.dk/yamaha ) to create the CASM section for the style and to change the NTR, NTT and chord root information. Special thanks go to Jørgen for creating such great and helpful tools!

Oh, yeah, the final results. Here is a link to the ZIP file containing the 53_Soul and 59_ClubFunk styles. Enjoy!

Prototino in progress

This week I got rolling on my next MIDI project — a mini MIDI controller with two knobs (potentiometers) and two buttons. I intend to mount the electronics in a Hammond 1991XXBBK enclosure, also known as an (ABS) stompbox. Plastic is OK because the box will reside by the pitch and modulation wheels on the Yamaha PSR-S950 arranger workstation. The plastic is less likely to mar the finish of the keyboard. (I hate scratches.) The ultimate goal is to augment the real-time control provided by the wheels.

The area next to the wheels is fairly small and a stompbox fits into it neatly. A stompbox is a fairly small, shallow box, so I needed an Arduino-based prototyping board that fits into a small enclosure. I first considered an Arduino plus prototyping shield combo, but rejected that solution. The Arduino + shield stack fits into a standard Arduino enclosure about the same size as the 1991; however, the pad-per-hole layout would have made soldering a bear.

Enter the Spikenzielabs Prototino. The Prototino is roughly 2 1/8 inches by 2 7/8 inches in size, compatible with the 1991. Its prototyping area uses the more standard DIP layout with two and three hole pads. This layout is soldering friendly. About one third of the surface area is taken by a minimal Arduino implementation: an ATmega328P, crystal, power regulation and ICSP/FTDI connections. The voltage regulator is optional and I elected to leave it off in favor of an external 5V center positive power adapter. Here’s a picture of the assembled Prototino before pots and switches.

Prototino

The connector at the end of the long tail is a 2.1mm power connector. This will eventually be mounted through a hole in the side of the 1991 enclosure along with a 3.5mm stereo jack for the MIDI OUT port.

The Spikenzielabs’ directions are decent enough, but here are a few more tips. The directions identify the optional power components to be omitted during assembly. The directions do not mention where to make the +5VDC and ground connections, however. As you can see in the picture, power and ground are connected to the +5V and GND pads in the prototyping area.

The directions also describe how to connect the FTDI cable. I have a Sparkfun 5V FTDI cable and decided to go that route for programming. The directions are a little sketchy (no pun intended) on how to configure the IDE for the Prototino. This led to the usual scrambling around in the Device Manager, etc. when the IDE wouldn’t communicate with the Prototino. Yes, you do need to select the correct COM port. You also need to select the appropriate board. With the Sparkfun cable, choose “Arduino Pro or Pro Mini” from the list of boards. This always seems to be a hassle and probably puts off a lot of beginning makers.

Finally, now that the power light comes on and the sketch is downloaded, how do we really know that the Prototino is operating normally? A stock Arduino UNO, for example, has an LED tied to one of the pins and comes preloaded with the blink sketch to turn the LED ON and OFF. The Prototino just sits there. Fortunately, one can easily whip up a sketch that uses the serial port and serial port monitor to see if the Prototino is genuinely alive. The setup() function needs to turn on the serial port and display a message:

    Serial.begin(9600);
    Serial.println("Hello world.");

The loop function can do something playful, if you wish. Compile and download the sketch, then look for the output in the IDE’s serial port monitor.

Experience with the Prototino has been positive so far. I plan to mount pots and switches on the back side of the Prototino and to mount the Prototino to the lid of the 1991 enclosure. This will let me connect the FTDI cable to the Prototino and program the device in situ. Stay tuned!

MOX construction kits update: version 2

I hope that you have downloaded and are using the MOX construction kits.

It’s no secret that many of the Motif/MOX arpeggios are taken from Tyros/PSR workstation styles. If you scan through the MOX data list, you’ll notice that many arpeggios share a similar name. These musical phrases belong to the same family. A construction kit is an MOX performance consisting of arpeggios in the same family — a kind of “mini-style.” You can use a construction kit as the basis for a new original performance. Or, just play the arps for fun! This is a great way to get a feel for the musical groove within a family and to dive into the thousands of arpeggios built into the MOX.

I released the first set of construction kits in January 2014. Since then, I have fixed a few divots in my programming. Unfortunately, some sonic glitches remain here and there. Please think of these minor bugs as “exercises left to the reader.” At least the tedious work of arranging arpeggios into performances by family has been done for you.

I have also spent a lot of time translating MOX performances back to PSR/Tyros styles. I have focused on the new combinations programmed by Yamaha since it is kind of pointless to duplicate the pre-existing PSR/Tyros styles which were the original source for the phrases! In my search for additional performances, I stumbled across two MOX ALL files:

    “XSpand Your World” voices and performances translated to MOX format, and
    Motif XS user bank 2 and 3 performances translated to MOX format.

These files are available at this site.

“XSpand Your World” was a promotional package put together by Yamaha to drive sales of the Motif XS. Yamaha distributed new voices and performances through “XSpand Your World” in the form of an “ALL” file. The MOX is based on the Motif XS, so Motif XS voices and performances will play on the MOX when imported into the MOX. The Motif XS and MOX “ALL” files have different internal binary formats. (Yamaha strikes again.) Fortunately for us, Moessieurs translated the XSpand Your World ALL file to MOX format.

The Motif XS has three user performance banks and the MOX has two. When it’s factory fresh, the MOX USER 1 bank contains the same performances as the Motif XS USER 1 bank. The MOX USER 2 bank, however, is a “best of” collection from the Motif XS USER 2 and USER 3 banks. Thus, there are 128 (give or take) Motif XS performances that do not ship with the MOX. Moessieurs translated the Motif XS USER 2 and USER 3 banks to a single MOX ALL file. You may import both banks all at once (save your data first!) or you may import one performance at a time into a performance location specified by you. Please see the “FILE” section of the reference manual for further information about file formats, saving and loading.

The Motif XS USER 2 and 3 banks, in particular, are a rich resource for new sonic material. I immediately got to work and imported the funky and jazzy performances into my MOX workstation. Then, I saved everything into a MOX ALL file. The new ALL file (CKITS_V2.X4A) contains construction kits and the Motif XS jazz/funk performances. I prepared two tables (mox_perf_table_v2.txt) listing the performances in MOX USER banks 1 and 2. I’m calling this whole package “Construction Kits Version 2.” Download the ZIP file and have at it!

One final word. The Motif XS has only five arpeggio types per performance. The MOX has six. So, the Motif XS performances have only five arpeggios even though they are playing on the MOX.

SA and SA2: Is Motif up to the task?

Every now and again, the subject of Super Articulation and Super Articulation 2 voices comes up on the Motifator site. Here are some rather lengthy comments that I posted in response to a recent inquiry.

First, here is some background information from the S950 and Tyros 5 manual. The descriptions of Super Articulation (SA) and Super Articulation 2 (SA2) are quoted from the Tyros 5 manual. The voice descriptions (e.g., JazzArtist guitar voice) are taken from the PSR-S950 itself — when you press [INFO] in the voice selection screen, the S950 displays a description of the selected voice. These descriptions show the kind of SA effects supported by the S950. The S950 does not have front panel articulation buttons; a foot pedal can be assigned to trigger SA effects.

The description of Articulation Element Modeling (AEM) is from the Tyros 5 manual. It is a pretty good concise description of what AEM (SA2) does, but is a gross simplification with respect to Yamaha’s patents. AEM does a lot of cross-fading and sample whacking. Plus, the concise description downplays the timing analysis that AEM performs in order to avoid unwanted latency effects and to detect releases.

Super Articulation voices

These Voices provide many benefits with great playability and expressive control in real time. For example, with the Saxophone Voice, if you play a C and then a D in a very legato way, you will hear the note change seamlessly, as though a saxophone player played it in a single breath. Similarly, with the Concert Guitar Voice, if you play the D note strongly, the D note would sound as a “hammer on,” without the string being plucked again. Depending on how you play, other effects such as “shaking” or breath noises (for the Trumpet Voice), or finger noises (for the Guitar Voice) are produced.

JazzArtist: Super Articulation provides realistic guitar phrasing: Legato notes played within an interval of a 4th sound as a hammer on, pull off or slide. The last note has a release noise. Fret noise is added randomly and the Foot pedal 2 [controller] adds a cutting noise.

NylonGuitar: Play normally and the voice is expressive and dynamic. The Foot pedal 2 [controller] changes the sounds to harmonics.

SmoothBrass: When brass instruments play legato, there is no attack sound on the legato notes. Super Articulation recreates this. Play legato and the notes join together, changing with velocity.

ConcertStrings: Strings can play legato, where each phrase is one continuous sound. Play legato and Super Articulation strings work in the same way. There are also three dynamic levels.

TrumpetFall: Jazz Trumpeters often use a fall or doit. Super Articulation recreates this with a velocity switch: Play harder to create the effect, change between fall and doit with the Modulation wheel. (Pushing forward changes to a doit.) Use the Foot pedal 2 [controller] to add breath noise.

Super Articulation 2 voices

For wind instrument Voices and Violin Voices, a special technology called AEM (see below) has been used, which features detailed samples of special expressive techniques used on those specific instruments — to bend or slide into notes, to “join” different notes together, or to add expressive nuances at the end of a note, etc. You can add these articulations by playing legato or non-legato, or by jumping in pitch by around an octave. For example, using the Clarinet Voice, if you hold a C note and play the Bb above, you’ll hear a glissando up to the Bb. Some “note off” effects are also produced automatically when you hold a note for over a certain time. Each S.Art2! Voice has its own default vibrato setting, so that when you select a S.Art2! Voice, the appropriate vibrato is applied regardless of the Modulation wheel position. You can adjust the vibrato by moving the Modulation wheel.

AEM Technology

When you play the piano, pressing a “C” key produces a definite and relatively fixed C note. When you play a wind instrument, however, a single fingering may produce several different sounds depending on the breath strength, the note length, the adding of trills or bend effects, and other performance techniques. Also, when playing two notes continuously (for example, “C” and “D”), these two notes will be smoothly joined, and not sound independent as they would on a piano.

AEM (Articulation Element Modeling) is the technology for simulating this characteristic of instruments. During performance, the most appropriate sound samples are selected in sequence in real time, from huge quantities of sampled data. They are smoothly joined and sounded — as would naturally occur on an actual acoustic instrument.

This technology to smoothly join different samples enables the application of realistic vibrato. Conventionally on electronic musical instruments, vibrato is applied by moving the pitch periodically. AEM technology goes much further by analyzing and disaggregating the sampled vibrato waves, and smoothly joins the disaggregated data in real time during your performance. If you move the Modulation wheel when you play the S.Art2! Voice (using AEM technology), you can also control the depth of the vibrato, and still maintain remarkable realism.

Motif and MOX

Starting with the Motif XS, Yamaha added Expanded Articulation (XA). Without diving into too much detail, XA allows control over articulations using the assignable function buttons. XA also detects and triggers samples to handle legato technique. The Motif/MOX player has precise control over when an articulation is sounded and the Motif/MOX programmer can construct new voices using XA (or tweak existing voices).

The S950 (and Tyros) monitor and analyze the notes played by the musician. The Tyros, in addition, has two panel buttons to control articulation. The workstation software determines which articulation to sound and when based upon what the musician has played on the keyboard or (optional) controllers.

Both the S950 and Tyros implement Super Articulation (SA) voices. SA voices and XA voices use roughly comparable sample playback technology (AWM). New samples can only be installed onto an S950 through an expansion pack (proprietary format). Yamaha has not released an expansion pack editor. S950 voice editing is limited to “quick edit” envelope tweaks; you cannot get to the element level on the S950. Motif/MOX voice editing is vastly deeper.

Super Articulation 2 (SA2) voices on the Tyros are a whole other beast. SA2 uses Articulation Element Modeling (AEM) to “stitch” samples together in real-time in response to what the musician plays. The Motif XS (and later) does not have the software to analyze the musician’s playing/gestures, and it does not have the AEM sound engine. SA2 is not implemented on the S950. SA2 is a very complicated critter because it takes note timing into consideration. (See Yamaha’s patents on AEM.)

So, voices/samples cannot simply be ported from S950 (or Tyros) to Motif. You can, however, use XA to make your own SA-style voices without any of the front-end analysis of musical gestures/control.

Thoughts and speculation

Sometimes, I think SA is a different front-end for Mega Voices. A guitar Mega Voice, for example, uses velocity switching to trigger (one of) an open soft, open medium, open hard, dead soft, dead hard, hammer on or slide waveform for a given MIDI note played on the keyboard. Effects such as strum noise and fret noise are triggered by MIDI note numbers above C6 and C8, respectively.

An SA voice based upon the same waveforms might use velocity switching for open soft, open medium, open hard, dead soft and dead hard, while using legato notes within an interval of a fourth to trigger hammer on and slide. An articulation control button or pedal triggers strum noise. Fret noise is added randomly. Thus, the SA voice uses the same basic waveforms as the Mega Voice, but the SA voice uses different means and analysis to select, enable and render the waveforms.
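
To make the “legato within a fourth” idea concrete, here is a guess at the kind of test such a front-end might apply. It is pure speculation on my part, not Yamaha’s algorithm, and it ignores note timing entirely. The function name is made up for illustration.

```cpp
#include <cassert>
#include <cstdlib>

// heldNote: MIDI note number still held when the new note arrives,
// or -1 if no note is held. A perfect fourth spans 5 semitones.
bool isLegatoWithinFourth(int heldNote, int newNote) {
    if (heldNote < 0) return false;           // nothing held: not legato
    const int interval = std::abs(newNote - heldNote);
    return interval >= 1 && interval <= 5;
}
```

When the test passes, the voice would pick a hammer on or slide waveform. Otherwise it would fall back to ordinary velocity switching.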

Motif XS (and later) have Mega Voices. The MOX Mega Nylon voice, for example, uses seven elements:

       Elem#  Waveform                Low  High Velocity
       -----  ----------------------  ---  ---- --------
       Elem1  Nylon Open Sw St        C-2  B5   1-60
       Elem2  Nylon Dead Notes St     C-2  B5   61-75
       Elem3  Nylon Mute St           C-2  B5   76-90
       Elem4  Nylon Hammer St         C-2  B5   91-105
       Elem5  Nylon Slide St          C-2  B5   106-120
       Elem6  Nylon Harmonics St      C-2  B5   121-127
       Elem7  Nylon FX St             C6   G8   1-127

that select and play an internal waveform based upon MIDI note number and velocity. One could build a different voice that triggers the same waveforms under different conditions such as AF1 ON, AF2 ON, etc. Indeed, some of the other Mega Voices respond to AF1/2 and AS1/2. Thus, I believe that a stock Motif/MOX with XA could emulate an SA voice within certain limitations. Specific conditions like “legato within an interval of a fourth” are not supported in Motif/MOX. XA detects legato without regard for interval.
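
The element table amounts to a lookup on note number and velocity. Here is a sketch of that lookup, assuming the Yamaha note naming in which C-2 is note 0 and C3 is note 60 (so B5 is 95, C6 is 96 and G8 is 127). The function name is hypothetical, and the real tone generator does far more than pick a waveform name.

```cpp
#include <cassert>
#include <string>

// Returns the waveform that the Mega Nylon table selects for a note,
// or an empty string when the note or velocity is out of range.
std::string megaNylonElement(int note, int velocity) {
    if (note < 0 || note > 127 || velocity < 1 || velocity > 127) return "";
    if (note >= 96) return "Nylon FX St";               // Elem7: C6 to G8
    if (velocity <= 60)  return "Nylon Open Sw St";     // Elem1: 1-60
    if (velocity <= 75)  return "Nylon Dead Notes St";  // Elem2: 61-75
    if (velocity <= 90)  return "Nylon Mute St";        // Elem3: 76-90
    if (velocity <= 105) return "Nylon Hammer St";      // Elem4: 91-105
    if (velocity <= 120) return "Nylon Slide St";       // Elem5: 106-120
    return "Nylon Harmonics St";                        // Elem6: 121-127
}
```

An XA-based variant would replace some of the velocity tests with conditions such as AF1 ON or legato detected, while keeping the same waveforms.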

SA2 voices are based on AEM and I believe that the AWM tone generation model in the stock Motif/MOX is not enough. In AWM, each note is independent and follows the familiar attack, decay, sustain and release life-cycle. Legato based on XA merely changes the waveform that is used to render the attack of an independent note. An AEM note, on the other hand, evolves and morphs into the next note. The AEM tone generator behaves more like physical modeling than AWM’s ADSR note life-cycle. As mentioned in Yamaha’s description of AEM, the AEM tone generator does some fancy computation to correctly render vibrato through note transitions. Further, a stock Motif/MOX does not perform the timing analysis and control functions that drive AEM tone generation.

I would love to see Yamaha add AEM-based voices to future members of the Motif family!