Casio Lyric Creator: Pronunciation

You might decide to fine-tune the Casio CT-S1000V’s pronunciation of certain words or syllables. Crack open Casio Lyric Creator, select a lyric phrase, and long-press a word or syllable. Lyric Creator opens the “Edit Phonemes” display.

Casio Lyric Creator phoneme editing

If you’re working in English, Lyric Creator shows the word’s phonetic translation above a peculiar-looking on-screen keyboard. The keys correspond to the major phonemes in the English language.

If you want to spend (waste) an entire afternoon, query English phonics. At the very least, you’ll find articles about phonics for first readers. If you dive deeper, you will be down the infinite rabbit hole of linguistics. Beware!

Sometimes it’s worth it to check up on Lyric Creator. In one version of “Amazing Grace,” Lyric Creator produced the ‘G’ at the beginning of “Grace.” In another version, it didn’t produce the ‘G’ at all. YMMV. The C-MU Pocket Sphinx list of 10,000 frequent words is a good resource when checking pronunciations. [Hope they didn’t ask anyone from Pittsburgh to make this list.]

There are roughly 44 common phonemes in English separated into vowels and consonants. Looking at the phoneme keyboard, most of the consonants make sense. There are a few two-letter combinations like “ng”, “tt”, “th”, “jh”, and “zh”. These cover sounds like the “ng” in “sing”.

The vowels, however, are not what they appear to be! The phonemes follow the International Phonetic Alphabet (IPA). [Not “India Pale Ale”.] The IPA has more special characters (glyphs) than Iverson’s APL (A Programming Language). The glyphs are mapped to two- and three-character sequences.

So, what reference phonetic alphabet does Lyric Creator use? CT-S1000V Vocal Synthesis was trained on the Carnegie-Mellon University (C-MU) Pronouncing Dictionary. The C-MU Pronouncing Dictionary employs the ARPAbet, which was first used in speech recognition research.

Here’s a quick and dirty correspondence table between symbols in the ARPAbet and English sounds:

Vowels           Consonants        Consonants
-----------      -----------       -------------
aa   bOt         b    Buy          ng   siNG
ae   bAt         ch   CHin         p    Pie
ah   bUt         d    Die          r    Rye
ax   oracLE      dd   miDDle       s    Sigh
ao   OUght       dh   THy          sh   SHy
aw   bOUt        f    Fight        t    Tie
axr  lettER      g    Guy          tt   auTumn
ay   bIte        hh   High         th   THigh
eh   bEt         jh   Jive         v    Vie
er   bIRd        k    Kite         w    Wise
ey   bAIt        l    Lie          y    Yacht
ih   bIt         m    My           z    Zoo
iy   bEAt        mm                zh   diviSion
ow   bOAt        n    Night        cl
oy   bOY         nn   wiNNer
uh   bOOk
uw   bOOt

You’ll notice several oddities among the English vowels, e.g., “ay” producing the long ‘i’ sound in “bite”. “ay”, really? No doubt, you saw the “ey” in “Grace” and thought that was strange, too.

The 44 common phonemes aside, there are a few phonemes in the Lyric Creator keyboard that required investigation. There are a few phonemes for which I have no clue! Next are a few special cases for your consideration.

There are two “th” sounds: voiced and unvoiced. A voiced sound is produced with the vocal cords; an unvoiced sound is produced solely by other components of the vocal tract. In Lyric Creator:

  • “dh” is the voiced “th” sound.
  • “th” is the unvoiced “th” sound.

Some vowel sounds are influenced by the letter ‘r’. The phoneme “axr” is a vowel influenced by ‘r’ as in “creator” or “letter”.

“ax” turns up in some interesting cases like “autumn” (ao,tt,ax,m), “middle” (m,ih,dd,ax,l) and “kindle” (k,ih,n,d,ax,l).
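If you type phonetic spellings by hand, a quick sanity check can catch a stray symbol before you transfer a phrase. Here is a minimal Python sketch that checks a comma-separated spelling against the symbols in the correspondence table above. The function names are my own invention, and the symbol sets are copied from the table, not from any Casio documentation.

```python
# Symbols copied from the correspondence table in this post.
# "mm" and "cl" are on the keyboard but have no example word in the table.
VOWELS = {"aa", "ae", "ah", "ax", "ao", "aw", "axr", "ay", "eh",
          "er", "ey", "ih", "iy", "ow", "oy", "uh", "uw"}
CONSONANTS = {"b", "ch", "d", "dd", "dh", "f", "g", "hh", "jh", "k", "l",
              "m", "mm", "n", "nn", "ng", "p", "r", "s", "sh", "t", "tt",
              "th", "v", "w", "y", "z", "zh", "cl"}


def split_phonemes(spelling):
    """Split a comma-separated phonetic spelling into a list of symbols."""
    return [p.strip() for p in spelling.split(",") if p.strip()]


def unknown_phonemes(spelling):
    """Return the symbols that do not appear in the table."""
    known = VOWELS | CONSONANTS
    return [p for p in split_phonemes(spelling) if p not in known]


def vowel_count(spelling):
    """Count the vowel symbols in a spelling."""
    return sum(1 for p in split_phonemes(spelling) if p in VOWELS)


for word, spelling in [("autumn", "ao,tt,ax,m"),
                       ("middle", "m,ih,dd,ax,l"),
                       ("kindle", "k,ih,n,d,ax,l")]:
    print(word, split_phonemes(spelling), unknown_phonemes(spelling))
```

An empty list from unknown_phonemes means every symbol is at least in the table; it says nothing about whether the spelling sounds right.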

The words “middle” and “autumn” contain “dd” and “tt”, respectively. These phonemes lead to a discussion of alveolar flapping in English. [Yes, “flapping”.] We now stand at the maw of the rabbit hole and I will take my leave.

Have fun!

Copyright © 2022 Paul J. Drongowski

Casio Lyric Creator: First experience

Here are some observations after getting my feet wet with Casio Lyric Creator and Casio CT-S1000V.

Connection

As several other folks have mentioned, you must connect your iPad (or Android device) to the CT-S1000V with a USB cable. Right now, Casio Lyric Creator cannot communicate with the CT-S1000V over Bluetooth.

It’s a little weird. Casio Music Space — recently released — does communicate with the CT-S1000V over Bluetooth. Casio Music Space could be a useful educational tool. However, it doesn’t fill a need for me.

Casio Music Space has a pairing dialog when initiating a Bluetooth connection. Perhaps this is all that Lyric Creator needs? The iOS Bluetooth settings page does not find or show the CT-S1000V; I guess pairing is up to the app.

This is all new and perhaps we should give the Casio developers a little more time. At the moment, the WU-BT10 Bluetooth dongle is not much help. Glad it was included with the CT-S1000V. If I laid out $80USD for the dongle on top of the CT-S1000V, I’d be disappointed.

Memory capacity

The CT-S1000V comes factory-loaded with 100 dance floor (EDM) phrases. It’s fun to mess with these although they are not my cup of tea.

Casio Lyric Creator has an “Instrument Data Management” button at the bottom of the main screen. IDM displays the phrases loaded into a connected CT-S1000V.

There are 48MBytes of internal memory allocated for phrases. Nearly all of the factory phrases are about 300KBytes, occupying roughly 30MBytes total. That leaves about 18MBytes free and available.
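For the budget-minded, the arithmetic is easy to sketch in Python. The sizes are the rounded figures from this post, the helper names are made up for illustration, and I’m assuming 1,000 KBytes per MByte to keep the numbers round.

```python
TOTAL_MB = 48            # phrase memory, per this post
FACTORY_PHRASES = 100    # factory-loaded phrases
FACTORY_PHRASE_KB = 300  # "about 300KBytes" each


def free_mb(total_mb, phrase_count, phrase_kb, kb_per_mb=1000):
    """Memory left after storing phrase_count phrases of phrase_kb each."""
    used_mb = phrase_count * phrase_kb / kb_per_mb
    return total_mb - used_mb


def phrases_that_fit(free_mb_value, phrase_kb, kb_per_mb=1000):
    """How many phrases of a given size fit in the free space."""
    return int(free_mb_value * kb_per_mb // phrase_kb)


remaining = free_mb(TOTAL_MB, FACTORY_PHRASES, FACTORY_PHRASE_KB)
print(remaining)                         # free space in MBytes
print(phrases_that_fit(remaining, 600))  # room for 600KByte phrases
```

The 600KByte figure is the size of the short test phrase described later in this post; real phrase sizes vary with length.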

At some point, I will zap the factory phrases. Fortunately, Casio provide a file with the CT-S1000V Preset Lyric Tone Data. The file is in the Electronic Musical Instruments support area. You will need to scroll down to the “Digital Keyboards” section to find the file.

It’s a DAL file which resets all user data. Yikes! Be sure to save your own content before loading this file! It’s a factory reset.

The Lyric Creator User Guide describes how to restore individual lyric tones (phrases). Check out the Data Management section of the Guide for more details.

Just a phrase

Starting simple, I created a phrase from the first line of “Amazing Grace:”

Am -az -ing grace how sweet the sound

Then I added note values (durations), saved the lyrics, and transferred the phrase to the CT-S1000V. Pretty smooth although I missed the need to SAVE the lyric file (with the cryptic name) to the CT-S1000V. The keyboard sat there in the “Preparing” state until the transfer request timed out.

Casio Lyric Creator

I played the phrase note by note. All good. Then I tried to find the most natural sounding Vocalist, settling on the Bossa Nova Vocalist.

This short phrase, BTW, took roughly 600KBytes of storage.

Prying eyes

Of course, the next thing is to save the lyrics file (extension lyj) on the PC and open the file with a text editor (Emacs). Sure enough, it’s all text. A lyrics file contains:

  • Header information about file and path names.
  • The lyric text as one continuous string.
  • A sequence of syllable data items.
  • Lyric Creator option settings, e.g., input language mode, auto split, auto conversion, etc.
  • More file paths including the Music XML file path.

It’s all one big string with delimiters dividing fields, attributes and values. Not the easiest thing for a human to read and we probably weren’t expected to poke around inside. (Ha!)

The sequence of syllable data items is the most interesting part. This is how Lyric Creator subdivided your lyric phrase and its note values. Here is a typical syllable data item:

{"text":"grace","phoneme":"g,r,ey,s","length":960,"note":60}

“grace” is the syllable. Normally, the phoneme property is empty. I chose to enter my own phonetic spelling for “grace”. [I’ll have a lot more to say about phonetic spelling in a future post.] If you don’t spell out your own phonemes, Casio will use the default spelling and leave the phoneme property blank. Length is the note value (duration) specified in tick units (480 ticks per quarter note). The note property is a MIDI note number (default 60 or middle C).
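The syllable item shown above happens to be valid JSON, so it is easy to pick apart. The following Python sketch parses one item and converts the tick length to quarter notes using the 480 ticks-per-quarter figure. It assumes that every item looks like the example; I have not verified the rest of the lyj file layout, and parse_syllable is a name I made up.

```python
import json

TICKS_PER_QUARTER = 480  # stated in this post


def parse_syllable(item_text):
    """Parse one syllable data item, assuming it is JSON as in the example."""
    item = json.loads(item_text)
    phoneme = item.get("phoneme", "")
    return {
        "text": item["text"],
        # An empty phoneme property means "use the default spelling".
        "phonemes": phoneme.split(",") if phoneme else [],
        "quarter_notes": item["length"] / TICKS_PER_QUARTER,
        "note": item["note"],
    }


example = '{"text":"grace","phoneme":"g,r,ey,s","length":960,"note":60}'
info = parse_syllable(example)
print(info["text"], info["phonemes"], info["quarter_notes"], info["note"])
```

A length of 960 ticks works out to two quarter notes, i.e., a half note.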

Importing Music XML

That was too easy. Let’s go for all the marbles and import a Music XML file. Now the road gets rougher. I think Casio need to do more testing.

I installed the latest version of MuseScore and created a simple chart for “Amazing Grace.” The Lyric Creator User Guide says: “If the file being imported has multiple parts, only the first part will be imported.” I take this to mean “Lyric Creator will only import the first verse.” Thus, I only created one verse.

I exported an uncompressed Music XML file from MuseScore. Lyric Creator cannot import compressed Music XML; only .xml or .musicxml extensions are allowed.

OK, I should have saved the first attempt to show you. [I didn’t.] Lyric Creator brought in the lyrics — sort of. It missed the pick-up syllable (note) and it inserted a few extra syllables. On reflection, the extra syllables may be Do Re Mi Fa So La Ti Do, corresponding to the pitches of the C scale. Surprise! Lyric Creator also inserted a few rests where rests occurred in the score. Hyphen placement seemed a little random.

Overall, if the text is short, I wouldn’t bother with Music XML right now. Pressing on…

I cleaned up the lyric text: deleting Do Re Mi etc., adding the missing pick-up syllable, and so forth. The changed text saved OK.

Next, I edited the note values. Lyric Creator had generated note values based on the Music XML score. However, when I added the pick-up syllable, it looked like everything was now off by one place. Diligently, I changed the note values to match the original score. Went to save, and uh-oh, the text plus note values are too big to save. I deleted the rests and that was enough to trim the lyrics and make Lyric Creator happy again.

The Lyric Creator User Guide has a caveat: “You can enter lyrics for up to 100 syllables when the note value (note length) is eighth note. The number of syllables you can input depends on the note value.” Looks like I tripped the limit.

I transferred the lyrics for the first verse to CT-S1000V and successfully played them note by note. Hurray!

The entire first verse with note values required 1.6MBytes of storage. Yes, some of those factory phrases are gonna go!

My final observation has to do with the MIDI note numbers. Are they really MIDI note numbers and, if not, what are they? I need to investigate the note property as the generated note numbers don’t match the melody. I question my own conjecture…

Summary

Well, Lyric Creator works within its limitations. I’m not sure that Music XML is ready for prime time. Unless I really, really wanted or needed to import Music XML, I would start simply with Lyric Creator’s editor.


Casio singing synthesis in pictures

I’m slowly immersing myself in the singing synthesis technology behind the Casio CT-S1000V. Heck, ya need somethin’ to do during TV advertisements while watching sports. 🙂

There are two major approaches to speech (singing) synthesis: unit-selection and statistical parametric.

Most people are familiar with unit-selection systems like Texas Instruments’ old Speak and Spell or the much more advanced Yamaha Vocaloid™. Unit-selection relies upon a large database of short waveform units (AKA phonemes) which are concatenated during synthesis. The real trick behind natural sounding singing (and speech) is the connective “tissue” between units. Vocaloid creates waveform data that connects individual phonetic units.

If you are familiar with Yamaha’s Articulation Element Modeling (AEM), a light should have lit in your mind. The two technologies have similarities, i.e., joining note heads, bodies, and tails. The Yamaha NSX-1 chip implements a stripped down Vocaloid engine and Real Acoustic Sound (AEM).

The content and size of the unit waveform database is a significant practical problem. The developers must record, organize and store a huge number of sampled phrases (waveform units). The Vocaloid 2 Tonio database (male, operatic English singer) occupies 750MBytes on my hard drive — not small and was a real challenge to collect, no doubt.

Statistical parametric systems effectively encode the source phonetic sounds into a model such as a hidden Markov model (HMM). During training, the source speech is subdivided into temporal frames and the individual frames are reduced to acoustic parameters. The model learns to associate specific text with the corresponding acoustic parameters. During synthesis, the model is fed text and acoustic parameters are recalled by the model. The acoustic parameters drive some form of vocoding. (“Vocoding” is used broadly here.)

Deep neural networks (DNN) improve on HMM. Sinsy is a DNN-based singing voice synthesis (SVS) system from Nagoya Institute of Technology. It is the culmination of many years of research by sensei Professor Keiichi Tokuda, his students and colleagues. It was partially supported by the Casio Science Promotion Foundation. Thus, adoption by Casio is hardly accidental!

Sinsy Singing Voice Synthesis System

The Sinsy block diagram is taken from their paper: Sinsy: A Deep Neural Network-Based Singing Voice Synthesis System, by Yukiya Hono, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 29, pp. 2803-2815, 2021. The method is quite complex and consists of several models. It’s not clear (to me, yet) if the Casio approach has all elements of the Sinsy approach. I recommend reading the paper, BTW; it’s well-written and highly technical.

Casio U.S. Patent 10,789,922 vocal synthesis

The next block diagram is taken from Casio’s U.S. Patent number 10,789,922 awarded September 29, 2020. Their approach is separated into a training phase and a synthesis (playing) phase. You’ll notice that Casio employ only an acoustic model. The patent discloses a “Voice synthesis LSI” unit, so their software may have a hardware assist. We’ll need to take a screwdriver to the CT-S1000V to find out for sure!

A picture is worth a thousand words. A technical diagram, however, requires a little interpretive context. 😉 Paraphrasing the Casio patent:

The text analysis unit produces phonemes, parts of speech, words and pitches. This information is sent to the acoustic model. The acoustic model unit estimates and outputs an acoustic feature sequence. The acoustic model represents a correspondence between the input linguistic feature sequence and the output acoustic feature sequence. Acoustic feature sequence includes:

  • Spectral information modeling the vocal tract (cepstrum MEL coefficients, line spectral pairs, or similar).
  • Sound source information modeling vocal cords (fundamental pitch frequency (F0) and power value).

The vocalization model unit receives the acoustic feature sequence. It generates singing voice inference data for a given singer. The singing voice inference data is output through a digital-to-analog converter (DAC). The vocalization model unit consists of:

  • A sound source generator:
    • Generates a pulse train for voiced phonemes.
    • Generates white noise for unvoiced phonemes.
  • A synthesis filter:
    • Uses the output signal from the sound source generator.
    • Is a digital filter that models the vocal tract based on spectral information.
    • Generates singing voice inference data (AKA “samples”).

Casio U.S. Patent 10,789,922 vocal synthesis process detail

This rather complicated diagram from U.S. Patent 10,789,922 shows the synthesis phase in more detail. It shows the lyric string decomposed into phoneme and frame sequences. Each frame is sent to an acoustic model which generates an acoustic feature sequence, that is, the acoustic parameters that were learned during training. The acoustic parameters are synthesized (vocoded) into 255 samples. Each frame is about 5.1 msec long.
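To make the sound source and synthesis filter idea concrete, here is a toy Python sketch of one frame: a pulse train (voiced) or white noise (unvoiced) feeding a simple two-pole resonator. This is a generic source-filter illustration, not Casio’s implementation. The resonator stands in for the real spectral filter, and every name and parameter value is an assumption of mine. The frame figures come straight from the paragraph above; note that 255 samples in 5.1 msec implies a 50 kHz sample rate, whereas at 44.1 kHz the same 5.1 msec would be about 225 samples, so treat the frame size with some caution.

```python
import math
import random

FRAME_SAMPLES = 255      # samples per frame, as quoted above
FRAME_SECONDS = 0.0051   # "about 5.1 msec"
SAMPLE_RATE = round(FRAME_SAMPLES / FRAME_SECONDS)  # rate implied by the two figures


def excitation(voiced, f0_hz, n=FRAME_SAMPLES, rate=SAMPLE_RATE, seed=0):
    """Pulse train for a voiced frame, white noise for an unvoiced frame."""
    if voiced:
        period = max(1, round(rate / f0_hz))
        return [1.0 if i % period == 0 else 0.0 for i in range(n)]
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


def resonator(signal, center_hz, bandwidth_hz, rate=SAMPLE_RATE):
    """Two-pole resonator: a crude stand-in for the vocal tract filter."""
    r = math.exp(-math.pi * bandwidth_hz / rate)
    a1 = 2.0 * r * math.cos(2.0 * math.pi * center_hz / rate)
    a2 = -r * r
    out, y1, y2 = [], 0.0, 0.0
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y1, y2 = y, y1
    return out


voiced_frame = resonator(excitation(True, 220.0), 700.0, 100.0)
unvoiced_frame = resonator(excitation(False, 0.0), 3000.0, 500.0)
print(SAMPLE_RATE, len(voiced_frame), len(unvoiced_frame))
```

In the patent’s scheme, the filter is driven by the spectral information predicted for each frame and the pitch comes from the F0 estimate; here both are just hard-coded numbers.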

Casio CT-S1000V Vocal Synthesis (User Guide)

Well, if the second patent diagram was TMI, here is the block diagram from the Casio CT-S1000V user guide. The simplified diagram is quite concise and accurate! You should be able to relate these blocks directly back to the patent.

I hope this discussion is informative. In a later post, I’ll take a look at a few practical details related to Casio CT-S1000V Vocal Synthesis.


Casio CT-S1000V: Quick tips

One feels “all thumbs” when starting in with a new keyboard. The Casio CT-S1000V has a lot of functionality and customization below the MENU button and within the SETTINGS item in the main MENU. I made a map to help me get around:

MENU                      SETTINGS
----------------------    ---------------------------------------
My Set-up                 Transpose
Active DSP                Touch Off Velocity
Balance                   Split Point
Octave Shift              Rhythm Auto Set
Sustain                   Chord Finger Mode
Portamento                Rhythm Controller Type
Pedal                     SUS/UPPER PORT Button
Pitch Bend                ARP/AH Button
Knob                      Rhythm Volume
Arpeggio                  Song Volume
Auto Harmony              Tuning
Sampling                  Surround
Song                      Audio In Center Cancel
Metronome                 MIDI OUT Channel UPPER1, UPPER2, LOWER
System Effects            Local Control
EQ                        MIDI SYNC Mode
Scale                     Auto Power Off
MIDI Control              Battery
Wireless                  LCD Contrast
Media                     Button Long Press Time
Home Customization        Speaker
Settings >>>>>>>>>>>>     Phone Speaker
Demo                      Setting Initialize
Exit                      All Initialize
                          Version

The Casio CT-S500 is probably organized in the same way.

MY SETUP Power On Recall

Yesterday, I mentioned MY SETUP and how useful it is for establishing a global set-up for a given playing situation. It’s also useful for establishing an initial set-up during power-up. Simply enter MY SETUP, select one of the four set-up entries, and press the AT PW-ON soft button. The CT-S1000V will recall the selected set-up during power-ups.

Even though it’s cool to get kicked into Vocal Synthesis at power-up — a nice marketing/sales ploy — I’d rather have a B-3 at my fingertips. 🙂

Active DSP HOLD

Active DSP assigns effect parameters to the three front panel knobs: K1, K2 and K3. The “Amp Organ 1” tone, for example, assigns the knobs this way:

  • K1: M1 Speed
  • K2: M1 OD Gain
  • K3: M1 Brake

K1 is the rotary speaker speed, K2 is the overdrive gain, and K3 is the speaker brake, which stops the simulated rotor/horn.

That’s great until you hit HOME or MENU and — what the??? The knobs are re-assigned to cut-off, resonance and modulation. That’s when Active DSP HOLD comes into play.

If you press the Active DSP HOLD soft button before leaving the Active DSP screen, the CT-S1000V will remember the DSP knob assignments when you go HOME or whatever. Save this in your set-up, too.

Slow your roll, Sparky

The first time you spin up the rotary in “Amp Organ 1,” you’ll be appalled at the short ramp-up time (acceleration) and the final rotor/horn speed. The Active DSP screen is also your way into the DSP parameters. I have the Drive Rotary effect applied to the organ. It has the following parameters:

  • Rotary speaker type
  • Overdrive gain
  • Overdrive level
  • Speed
  • Brake
  • Fall acceleration (ramp down)
  • Rise acceleration (ramp up)
  • Slow rate
  • Fast rate
  • Vibrato/Chorus
  • Wet level
  • Dry level

I like the sound of a beat-up Leslie with slow motors and slipping belts. Feel free to adjust the acceleration and slow/fast rates down.

Nice to see the chorus/vibrato simulation (V1, C1, V2, C2, V3, C3). The Hammond had a unique chorus/vibrato scanner unit which is a necessary component of gospel organ registrations. I’d love to see more details about Casio’s rotary speaker emulation including the scanner and speaker types.

Oh, yeah, please let us assign rotary speaker speed to the foot pedal. Thanks.

Boing

I wish I could see more details about the reverb, chorus and delay effects, too.

CT-S1000V has three system-wide effects: reverb, chorus and delay. Usually you get only reverb and chorus, and don’t always see a separate delay unit. Cool.

Unlike the DSP effects, you do not get to tweak system-wide effect parameters. All you get are presets with rather uninformative names like Room 1, Hall 2, etc. I listened to the room reverbs and settled on Room 2 for church registrations. Although ears should be the final judge, I wish I could see the parameter values behind the presets in order to make good choices.

I recommend publishing an effect routing diagram like the one I found in the CT-X5000 manual (below). Thanks.

Casio CT-X3000/CT-X5000 effect routing diagram [Casio]


Casio CT-S1000V: Observations

Gonna post a few notes while I take an ear break.

I’m rather pleased with the sound and play-ability of the Casio CT-S1000V. For now, I’m focused on sound design and playing, having only dipped into the auto-accompaniment rhythms and vocal synthesis.

Patches

Sounds, not the song. (Sorry, Clarence Carter.)

My first order of business is building a bunch of sound combinations that are suitable for the contemporary and traditional church music that I play. Jazz, funk and pop will have to wait a little while…

The CT-S1000V provides two different means of storing a patch: My Set-up and registrations. My Set-up is accessed through the main MENU button. Up to four set-ups can be stored. As I quickly discovered, My Set-up stores everything but the kitchen sink including settings like speaker ON/OFF. I sacrificed the fourth location (SAX) and created a LINE OUT entry with the internal speaker turned off. This seems like a good use for My-Setup, namely, saving global configurations for different playing situations, e.g., home, gig, etc.

Registrations are more appropriate for tone (voice) programming. There are more registrations than My-Setup locations: 16 banks with four registrations per bank, 64 registrations total. That may seem stingy by today’s standards, but I don’t need more than 8 to 16 locations to cover most of my gig needs. Plus, one can always save registrations to a USB flash drive and load them as playing situations arise.

Tone programming

Registrations can save most everything related to tone programming: split, layer, effects, and much more. Yes, you can edit CT-S1000V tones — one reason why I passed on the CT-S1 and waited.

Tone editing is similar to “quick edit” that you might find on a synth. You can tweak 21 parameters including cutoff, resonance, attack time, release, vibrato, volume, pan, effect sends and 4-band EQ. You don’t get synthesizer-level deep editing. Cutoff, etc. are offsets (-64:+64) from the preset value. If you want it all in front of you like MODX or Montage, this isn’t the droid you’re looking for.

With only three front panel knobs, you need to assign a tone parameter to a knob first, then tweak. The changed value is saved along with everything else in a registration (including the knob assignment).

DSP editing, on the other hand, is deep. Using a feature called “Active DSP”, you can choose an effect type, assign a parameter to a knob, and tweak effect parameters. I’m still experimenting with Active DSP, especially for controlling rotary speaker speed. I’ll have more to say when I have a better grip on Active DSP.

Splits and layers

The CT-S1000V works logically and supports two split zones: Lower and Upper. When Split is turned off, all’s you get is Upper. Upper supports two layers: Upper 1 and Upper 2; Lower cannot be layered. The split point is configurable and you can adjust the balance (level) between tones. This is just enough to be dangerous. If you’re looking for a pile of layers, move along.

Starting with my Roland days (circa 1995), I’ve kept notes about the most useful tone combinations for contemporary church music. Here are my favorite combinations:

               Tone 1        Tone 2        Tone 3        Tone 4
-------------  ------------  ------------  ------------  ------------
High School    Tuba A        Celesta       Flute 1A      Clarinet A
Warm Tp Sect   Tbs mp C      Tbs mp B      Tbs mf C      Brs LipNzl
CTp + Tb Sect  C Tps mp A    Tbs mp A      C Tps f A     Tbs f A
Horn+Wood      Flute 1A      Clarinet C    Oboe mf A     Horns mf A
NobleHornPop   Horns f A     Flugel C      Tb Sect B     Trumpet 1C
NobleHornPop   French 1C     Flugel C      Tb Sect B     Trumpet 1C

Orch Reeds     Oboe mf A     E.Horn C      Oboe f A
Wind&Str1      Oboe mf A     Flute 1B      DolceStr.A    JV Strings A
Wood Sect      Oboe mf A     Flute 1A      Clarinet A    E.Horn A
Flute/Clari    Flute 1C      Clarinet C

ChamberWinds   Oboe mf B     Oboe mf A     Sop.Sax mf A  Flute 1A
ChamberWoods   Clarinet A    Flute 1C      Flute 1A

Warm Strings   Soft Pad A    F.Str mp A    JP Strings2C  JP Strings1A
ChmbrQuartet   Violin C      Violin 2 A    Cello A       Cello 2 A
ViolinCello    Vc mp B       Bassoon A     Va mp A       Oboe mf A

These combinations date back to my old JV-90, XP-60 and XV-5050! You’ll find equivalents on my MODX and Genos. My task now is to build similar combinations (registrations) on CT-S1000V.

BTW, I’m also dialing back reverb where necessary. I try for a happy balance — not too dry for practice at home, but not so much as to murk up the sound in a reverberant church hall.

If I had one wish, I would like to give each registration a name. I have a running map of “which registration does what,” but wish the names appeared on the CT-S1000V screen, not “Bank 1-1”.

Tones for old bones

By and large, the CT-S1000V orchestral tones are decent; most of them are musically useful and sound good through both the built-in speaker and an external monitor. The tone parameters are enough to cure overly bright tones or sharp attacks. Although I haven’t worked with them yet, the chromatic percussion (celeste, glockenspiel, etc.) don’t have any obvious tuning issues and are musical.

Two layers aren’t much. Fortunately, CT-S1000V has a few preset combi tones — Brass & String, Violin Section, Chamber (orchestra), Flute & Oboe, Pipe Section — which provide another “layer” or two on the cheap.

Don’t forget about the ethnic voices. CT-S1000V has accordions, fiddle, and harmoniums. Harmoniums! Jazzers who want to get their Jon Batiste on should look to these for melodica.

Organs

The CT-S1000V pipe organs are decent. Yes, there is the usual over-done reedy sound, but there are three tone presets that are suitable for hymn-playing and congregational singing. Even though the keybed is squared-off and similar to piano keys, it has a nice resistance and allows the legato-like gestures one uses when playing a pipe organ.

The drawbar organ tones are serviceable. No, we’re not in clone territory here and you can’t change the drawbar settings. There are rotary speaker effect algorithms, but again, we’re not in Vent or clone territory quality-wise. The only way to change rotary speaker speed is via Active DSP and turning the knob to which rotary speed has been assigned. I wish there was a way to assign rotary speed to the foot pedal. (OK, I need two wishes from the Genie.)

On the up-side, the rotary effect has a brake setting. One could brake the rotary and put the LINE OUT into a Lester K, Vent, or whatever. I will be giving this a try and will post notes. With Lester on the floor, I could stomp on a foot switch and change rotary speed, too.

Summary

Well, I hope these observations are helpful! The Casio CT-S1000V has a lot of sound-making value for very little money. So many tones to try…


Casio CT-S1000V: First impressions

After test driving the Casio CT-S1 and CT-S410, I took the plunge and bought a Casio CT-S1000V (AiX Sound Source with Vocal Synthesis, $450USD street). The price was irresistible after making a trade-in. (The Yamaha SHS-500 Sonogenic retired.)

In terms of build quality, the Casio CT-S1000V is robust enough for light to moderate gigging. It feels solid. I miss the fabric speaker covering (Casio CT-S1) as it is a touch of class. I suspect that fabric would get dirty on gigs, however. I wouldn’t park any drinks on this keyboard (or any keyboard) with everything exposed! Yep, it weighs ten pounds, not bad for a keyboard with in-built speakers.

Casio CT-S1000V

The power supply is a small lump-in-the-middle brick. The mains lead is rather short with one of those “figure 8” IEC 60320 C7 plugs. Other accessories include a music stand and a Casio WU-BT10 Bluetooth dongle — don’t lose that tiny little bugger! The music stand isn’t super-robust and I’m not sure that I want to park a heavy binder o’tunes on it. It’s also too low for my reading glasses and I will probably stick to my usual tripod music stand.

The CT-S1000V keybed is rather nice for a keyboard in this price range. The keys are squared off and piano-like although there’s no hammer simulation, of course. The keys are evenly spaced, are level, and don’t wobble too much. The keys have a textured surface similar to the Roland GO:KEYS. The throw is a little bit light and soft, not unpleasant. (BTW, I couldn’t stand the Roland GO:KEYS and returned it due to keybed issues.)

I can hand-swipe without cutting my hands. I don’t know how the keys will stand up to this kind of abuse in the long run. Plus, this board is so light, I’m afraid of throwing it off the keyboard stand when swiping!

The speaker sound is OK. I regard the speakers as “courtesy speakers.” Sometimes it’s convenient to push only one switch and start playing. They’re loud enough for my studio room, maybe loud enough for the church gig where we don’t generate a lot of stage volume. They don’t get buzzy at loud volume. Since I don’t play at very loud volume at home, I’m good with that. Casio wisely blessed the CT-S1000V with 1/4″ stereo output jacks so I can send the CT to the church PA.

I read just enough of the manual to enable Active DSP, which assigns DSP parameters to the knobs. With an organ tone selected, turning knob 1 (K1) switched between slow and fast rotary speaker speed. Wish there was a way to assign rotary speed to a button or the foot switch… I need to experiment more with Active DSP. Gotta experiment with splits and layers, too. I guess everything is saved to a registration, but we’ll find out!

I played with Vocal Synthesis enough to know there are multiple Vocalists. Some Vocalists are more natural than others. One of the Vocalists is “Death Voice” and I would like to uncork that one in church. 🙂

Quite a playable instrument. I haven’t listened to any of the rhythms yet because I’m mostly interested in flat-out playing. Switching sections (intro, main, fill, etc.) with the buttons below the display reminds me of switching arpeggios on the Yamaha MOX/MOXF.

Hope these impressions help!


Ye olde Yamaha Dance Kit

Ya learn somethin’ every day. Thanks to Mark, my neighbor to the north in Vancouver, who looped me in.

As one might expect, Yamaha have updated their drum kit samples over the years. Who knew — the DanceKit circa 2000 is more heavy, punchy and analog than present-day DanceKit. According to Mark (and Musicnik), the Standard Kit had more punch back in the day.

The table below summarizes the instruments in the Yamaha Standard Kit and Dance Kit:

Keyboard   MIDI       Standard Kit       Dance Kit
                      127/000/001        127/000/28
--------   --------   ----------------   ------------------------
40  E 1    28  E 0    Brush Tap Swirl    Reverse Cymbal *
41  F 1    29  F 0    Snare Roll         Snare Roll
42  F# 1   30  F# 0   Castanet           Hi Q 2 *
43  G 1    31  G 0    Snare Soft         Snare Techno *
44  G# 1   32  G# 0   Sticks             Sticks
45  A 1    33  A 0    Bass Drum Soft     Kick Techno Q *
46  A# 1   34  A# 0   Open Rim Shot      Rim Gate *
47  B 1    35  B 0    Bass Drum Hard     Kick Techno L *
48  C 2    36  C 1    Bass Drum          Kick Techno 2 *
49  C# 2   37  C# 1   Side Stick         Side Stick Analog *
50  D 2    38  D 1    Snare              Snare Clap *
51  D# 2   39  D# 1   Hand Clap          Hand Clap
52  E 2    40  E 1    Snare Tight        Snare Dry *
53  F 2    41  F 1    Floor Tom L        Tom Analog 1 *
54  F# 2   42  F# 1   Hi-Hat Closed      Hi-Hat Close Analog 1 *
55  G 2    43  G 1    Floor Tom H        Tom Analog 2 *
56  G# 2   44  G# 1   Hi-Hat Pedal       Hi-Hat Close Analog 2 *
57  A 2    45  A 1    Low Tom            Tom Analog 3 *
58  A# 2   46  A# 1   Hi-Hat Open        Hi-Hat Open Analog *
59  B 2    47  B 1    Mid Tom L          Tom Analog 4 *
60  C 3    48  C 2    Mid Tom H          Tom Analog 5 *
61  C# 3   49  C# 2   Crash Cymbal 1     Cymbal Analog *
62  D 3    50  D 2    High Tom           Tom Analog 6 *
63  D# 3   51  D# 2   Ride Cymbal 1      Ride Cymbal 1
64  E 3    52  E 2    Chinese Cymbal     Chinese Cymbal
65  F 3    53  F 2    Ride Cymbal Cup    Ride Cymbal Cup
66  F# 3   54  F# 2   Tambourine         Tambourine
67  G 3    55  G 2    Splash Cymbal      Splash Cymbal
68  G# 3   56  G# 2   Cowbell            Cowbell Analog *
69  A 3    57  A 2    Crash Cymbal 2     Crash Cymbal 2
70  A# 3   58  A# 2   Vibraslap          Vibraslap
71  B 3    59  B 2    Ride Cymbal 2      Ride Cymbal 2
72  C 4    60  C 3    Bongo H            Bongo H
73  C# 4   61  C# 3   Bongo L            Bongo L
74  D 4    62  D 3    Conga H Mute       Conga Analog H *
75  D# 4   63  D# 3   Conga H Open       Conga Analog M *
76  E 4    64  E 3    Conga L            Conga Analog L *
77  F 4    65  F 3    Timbale H          Timbale H
78  F# 4   66  F# 3   Timbale L          Timbale L
79  G 4    67  G 3    Agogo H            Agogo H
80  G# 4   68  G# 3   Agogo L            Agogo L
81  A 4    69  A 3    Cabasa             Cabasa
82  A# 4   70  A# 3   Maracas            Maracas 2 *
83  B 4    71  B 3    Samba Whistle H    Samba Whistle H
84  C 5    72  C 4    Samba Whistle L    Samba Whistle L
85  C# 5   73  C# 4   Guiro Short        Guiro Short
86  D 5    74  D 4    Guiro Long         Guiro Long
87  D# 5   75  D# 4   Claves             Claves 2 *
88  E 5    76  E 4    Wood Block H       Wood Block H
89  F 5    77  F 4    Wood Block L       Wood Block L
90  F# 5   78  F# 4   Cuica Mute         Scratch H *
91  G 5    79  G 4    Cuica Open         Scratch L *

The starred (“*”) entries denote analog drum machine samples.
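If you want to call up either kit from a DAW or a script, here is a minimal Python sketch that builds the raw MIDI bytes from the bank/program numbers in the table header. Two things are my assumptions and not stated above: the drum part lives on MIDI channel 10, and the program numbers (1 and 28) are 1-based, so they go out on the wire as 0 and 27. The function names are made up for illustration. The helper at the end reflects the fact that the two number columns in the table differ by 12 in every row.

```python
# Sketch: raw MIDI bytes to select a drum kit by bank and program.
# Assumptions (not from the table): drum part on MIDI channel 10, and
# the program numbers in the table header (1 and 28) are 1-based.

DRUM_CHANNEL = 10  # 1-based channel number; an assumption


def select_kit(bank_msb, bank_lsb, program, channel=DRUM_CHANNEL):
    """Return Bank Select MSB/LSB plus Program Change as a bytes object."""
    ch = channel - 1                    # channels are 0-based on the wire
    return bytes([
        0xB0 | ch, 0x00, bank_msb,      # CC#0  Bank Select MSB
        0xB0 | ch, 0x20, bank_lsb,      # CC#32 Bank Select LSB
        0xC0 | ch, program - 1,         # Program Change, 0-based on the wire
    ])


def keyboard_to_midi(keyboard_number):
    """Convert the table's 'Keyboard' number to its 'MIDI' number."""
    return keyboard_number - 12         # e.g. 40 (E 1) -> 28 (E 0)


STANDARD_KIT = select_kit(127, 0, 1)
DANCE_KIT = select_kit(127, 0, 28)

print(STANDARD_KIT.hex(" "))
print(DANCE_KIT.hex(" "))
print(keyboard_to_midi(48))
```

Check the data list for your own instrument before trusting the channel and program offsets.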

I decided to do a side-by-side comparison. I first recorded the DanceKit samples as dry as possible on the Yamaha PSS-A50 and the Yamaha QY-70 (circa 1997). Then I matched everything up, ignoring the toms and a few extraneous instruments.

You’ll hear all the PSS-A50 examples first followed by all of the QY-70 examples. I’ll let you decide which you prefer. Although I tried to get the A50 dry, there seems to be a hint of reverb remaining.

Without further ado, here is a ZIP file containing the WAV files for all of the Yamaha QY-70 Dance Kit instruments starting from the bottom of the keyboard to the top. Have fun! Slice and dice everything into audio mirepoix.

If a drum machine plays in the forest and no one is around, does it still make a sound? 🙂

Casio speech synthesis technology

The voice synthesis in Casio’s new CT-S1000V keyboard raised quite a bit of interest on the Web, including my own curiosity.

I installed the Casio Lyric Creator app on my iPad just to see what I can see. Lo and behold, there is a long list of open source licensing statements which identify some of the voice synthesis technology in the app and the keyboard itself. Let’s take a look starting with the top of the list.

HMM-based speech synthesis engine, HTS_engine, developed by the HTS Working Group. That’s a lot of acronyms and shoulders to stand on:

  • HMM: Hidden Markov model
  • HTS: An HMM-based speech synthesis system
  • SPTK: Speech Signal Processing Toolkit

The HTS Working Group is a voluntary group developing the HMM-based speech synthesis system HTS. The software bears a joint copyright from two institutions:

  • Nagoya Institute of Technology, Department of Computer Science, and
  • Tokyo Institute of Technology, Interdisciplinary Graduate School of Science and Engineering

The HTS_engine API is released under the Modified BSD license. I won’t quote such chapter and verse everywhere, but it gives you a sense of the distribution terms and conditions. Read about HTS version 2 in “The HMM-based Speech Synthesis System (HTS) Version 2.0”, by Heiga Zen, et al., Sixth ISCA Workshop on Speech Synthesis, 2007.

HMM-based singing voice synthesis system, Sinsy, developed by the Sinsy Working Group. This software bears the copyright of Nagoya Institute of Technology, Department of Computer Science.

Speech Signal Processing Toolkit, SPTK, developed by the SPTK Working Group. Again, the toolkit has a joint copyright:

  • Nagoya Institute of Technology, Department of Computer Science, and
  • Tokyo Institute of Technology, Interdisciplinary Graduate School of Science and Engineering

CRF++ by Taku Kudo. “CRF” is an acronym for “conditional random fields”. CRFs are a class of statistical modeling methods that are used in pattern recognition and machine learning.

The developers also acknowledge other work which was used during speech analysis:

  • WORLD: A high-quality speech analysis and synthesis system based on vocoding.
  • CMUdict: The CMU Pronouncing Dictionary from Carnegie-Mellon University, Pittsburgh, PA (my old school)
  • Festival Speech Synthesis System, Centre for Speech Technology Research, University of Edinburgh, UK.

For (more than) an introduction to HMM-based speech synthesis, try: “An Introduction to HMM-Based Speech Synthesis” by Junichi Yamagishi, October 2006. That should be enough math for you. 🙂 This presentation is super helpful, too.

Casio’s voice synthesis technology is not Yamaha Vocaloid™. Vocaloid™, by the way, is a registered trademark belonging to Yamaha. I have seen punters on the Web attribute the technology to Vocaloid or Yamaha. “Oh, they must have licensed it.” Wrong. Please do not refer to Casio’s tech as “Vocaloid” as this is technically incorrect and a misuse of Yamaha’s trademark.

Plus, we want to give credit where credit is due. Casio have staked out their IP territory in a series of patents filed on their behalf.

Want more information? See Casio singing synthesis in pictures.

Yamaha PSR-E473 and PSR-EW425

The PSR-E473 and PSR-EW425 continue the evolution of the Yamaha E-series arranger keyboards.

Yamaha PSR-E473 and PSR-EW425 arranger keyboards

Main features are:

  • PSR-E473: 61 keys, PSR-EW425: 76 keys
  • Super Articulation Lite voices and articulation button
  • 820 voices (including 43 Super Articulation Lite)
  • Category access buttons to select voices
  • 290 auto-accompaniment styles
  • Two DSP effect channels (DSP1 and DSP2)
    • DSP1: 41 types of DSP insertion effects
    • DSP2: 12 effect types
  • New quick sampling user interface (44.1kHz, 16-bit, stereo, 9.6 sec)
  • Motion effects (57 types) and motion effect button
  • Mega Boost (adds +6dB to the apparent volume)
  • Two live control knobs
  • 1/4″ main audio out (R, L/L+R)
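Just for a sense of scale, here is a back-of-the-envelope calculation of how much memory the quick sampling spec works out to. It assumes plain uncompressed linear PCM, which is my assumption; Yamaha’s actual storage format isn’t stated here. The function name is mine.

```python
# Rough size of a sample at the quoted spec, assuming uncompressed PCM.

def pcm_bytes(sample_rate_hz, bits_per_sample, channels, seconds):
    """Bytes needed for linear PCM audio of the given length."""
    bytes_per_frame = (bits_per_sample // 8) * channels
    frames = round(sample_rate_hz * seconds)
    return frames * bytes_per_frame


size = pcm_bytes(44_100, 16, 2, 9.6)   # 44.1kHz, 16-bit, stereo, 9.6 sec
print(f"{size} bytes (about {size / (1024 * 1024):.2f} MiB)")
```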

Pricing has not been announced as of this writing.

The PSR-EW425 has an exclusive organ sound from the YC stage keyboards. Although the E473 and EW425 share ten new drawbar organ voices, the EW425 has some extra tricks. Quoting Yamaha’s documentation, “On the PSR-EW425, a percussive click sound at key-on/key-off and a leakage sound are added, providing more realistic vintage organ sounds.”

DSP1 is automatically assigned to the main voice. DSP2 can be assigned to any part. DSP2 is assigned to all parts (including the keyboard and backing) by default. There is a dedicated DSP2 button on the front panel which provides direct access to DSP2 and turns it ON and OFF. You can choose the effect type for each DSP unit. Effect parameter editing is limited to that available through the Live Control knobs.

PSR-E473 and PSR-EW425 effect routing [Yamaha]

With reverb, chorus and two DSP effect units, effect routing (above) is more sophisticated than in earlier E-series models. The routing adheres to the XG architecture. The MIDI implementation does not provide SysEx for effect selection and routing. (Well, at least it’s not documented…)

Motion effects are implemented via MIDI pitch bend and continuous control messages. (The approach is similar to the Yamaha PSS-A50.) Message-heavy effects will cut into song size when recording into MIDI.
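To see why message-heavy effects eat into song memory, here is a small Python sketch that encodes a pitch bend sweep and counts the bytes. The sweep shape and step count are hypothetical; I haven’t captured what the keyboard actually sends. The byte layout (status 0xE0, then the 14-bit value split into LSB and MSB, center at 8192) is standard MIDI.

```python
# Sketch: byte cost of a pitch bend sweep (hypothetical motion effect).

def pitch_bend(value, channel=1):
    """Encode one Pitch Bend message; value is 14-bit, 8192 = center."""
    if not 0 <= value <= 0x3FFF:
        raise ValueError("pitch bend value must be 0..16383")
    return bytes([0xE0 | (channel - 1), value & 0x7F, value >> 7])


def bend_sweep(steps, channel=1):
    """Ramp from center to full up-bend in `steps` messages (steps >= 2)."""
    center, top = 8192, 16383
    return [pitch_bend(center + (top - center) * i // (steps - 1), channel)
            for i in range(steps)]


sweep = bend_sweep(64)
total = sum(len(message) for message in sweep)
print(f"{len(sweep)} messages, {total} bytes before delta-times")
```

A sequencer also stores timing for every event, so the real cost per message is higher than the three bytes counted here.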

The PSR-EW425 has two 12cm speakers and its amplifiers produce 12W per channel. The PSR-EW425 requires six D size batteries, which will affect final weight. The PSR-EW425 weighs 8.3kg (18 pounds, 5 ounces) without batteries.

The PSR-E473 requires six AA size batteries. The PSR-E473 weighs 7.0kg (15 pounds, 7 ounces) without batteries.

Live control knobs can be assigned to:

  • Keyboard:
    • Filter cutoff and resonance
    • Reverb and chorus level
    • DSP1 parameters A and B
  • Backing:
    • Filter cutoff and resonance
    • Reverb and chorus level
    • Volume balance and retrigger rate
  • System:
    • DSP2 parameter A and B

Check out my pre-announcement post. See how well I did. 🙂

New Casio portable keyboards

Casio CT-S1000V

Casio have announced the new CT-S1000V keyboard with vocal synthesis:

  • 61 full-size touch response keys plus pitch bend wheel
  • 64 voice polyphony
  • 3 assignable knobs for controlling modulation, effects, filters, and more
  • 800 AiX-powered Tones and 243 full accompaniment Rhythms
  • Advanced Tones (including vintage keyboards)
  • Editable DSP effects (100 effects)
  • Split and layer (Upper 1/2, Lower 1)
  • Powerful bass-reflex stereo speaker system with surround effect
  • Two 13cm by 6cm speakers, 2.5W per channel
  • Audio sampler and 6-track MIDI recorder (sequencer)
  • Audio sample format: WAV, 44kHz, 16-bit, stereo
  • Vocal synthesis with personalized lyrics via the free Lyric Creator app
  • Vocal format: 44kHz, 16-bit, mono
  • Bright backlit LCD display with easy, intuitive interface
  • Strap pins for playing anywhere
  • 1/4″ line outputs to connect to mixers, PA systems, etc.
  • Class-compliant USB-MIDI connects to the free Casio Music Space iOS/Android app
  • Includes WU-BT01 Bluetooth MIDI/Audio adapter
  • Optional 6xAA battery power (AC adapter and music rest included)
  • Weight: 10 pounds
  • $449.99 USD (street)

Casio CT-S1000V portable keyboard with vocal synthesis

Quoting the Casio web site:

The CT-S1000V does what no other keyboard can do: Speak or type your lyrics into the free Lyric Creator app for iOS/Android, transfer them to the CT-S1000V, and play the keys to hear your words come alive. Choose from multiple vocalist models, and adjust age, vibrato, portamento and other parameters in real time. It can produce choirs, robotic sounds, vocoder-like textures, and more. You can even create a custom vocalist based on an audio recording.

Availability — “Coming soon”.

Please see my pre-announcement post for more pictures and information. I also have posted a list of recent Casio patents related to sound synthesis and vocal synthesis.

Vocal Synthesis

According to Casio, you can create lyrics using their tablet-/phone-based Lyric Creator app, transfer them to the CT-S1000V, and play them using the keys. You can dynamically change characteristics like age, gender, portamento and vibrato in real time. Of course, you can mangle the sound with DSP effects, too. The front panel knobs are assignable for real time control.

NOTE mode chooses how the lyrics play back when keys are pressed. You can play a word or syllable with each key press or you can play choral harmonies. PHRASE mode follows your timing. Legato gestures change note (pitch) while the phrase is playing. You can also select a syllable with your left hand and use your right hand to play it.
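Here is a toy Python model of the one-syllable-per-key-press idea in NOTE mode. It is only my mental picture of the behavior described above, not Casio’s implementation: the class name is made up, and wrapping back to the first syllable at the end of the phrase is an assumption.

```python
# Toy model of NOTE mode: each key press sings the next syllable.
# Not Casio's code; wrap-around at the end of the phrase is assumed.

class NoteModePlayer:
    def __init__(self, syllables):
        self.syllables = list(syllables)
        self.index = 0

    def key_press(self, midi_note):
        """Return the (syllable, pitch) pair sung for this key press."""
        syllable = self.syllables[self.index]
        self.index = (self.index + 1) % len(self.syllables)
        return syllable, midi_note


player = NoteModePlayer(["a", "ma", "zing", "grace"])
sung = [player.key_press(note) for note in (60, 64, 67, 72, 60)]
print(sung)
```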

Casio Lyric Creator [Casio]

Lyric Creator lets you edit, save and share lyrics. Lyrics can be imported from MusicXML files.

Casio CT-S500

Casio have announced the new CT-S500:

  • 61 full-size touch response keys plus pitch bend wheel
  • 64 voice polyphony
  • 3 assignable knobs for controlling modulation, effects, filters, and more
  • Editable DSP effects (100 effects)
  • Audio sampler and 6-track MIDI recorder (sequencer)
  • 1/4″ line outputs to connect to mixers, PA systems, etc.
  • Includes WU-BT01 Bluetooth MIDI/Audio adapter
  • Bright backlit LCD display with easy, intuitive interface
  • 800 AiX-powered Tones and 243 full accompaniment Rhythms
  • Advanced Tones (including vintage keyboards)
  • Splits and layers (Upper 1/2, Lower 1)
  • Powerful bass-reflex stereo speaker system with surround effect
  • Two 13cm by 6cm speakers, 2.5W per channel
  • Strap pins for playing anywhere
  • Class-compliant USB-MIDI connects to the free Casio Music Space iOS/Android app
  • Optional 6xAA battery power (AC adapter and music rest included)
  • 10 pounds (4.7kg)
  • $379.99 USD (street)

Available now!

Casio CT-S500 portable keyboard [Casio]

Quick reaction

Watching the Casio release video stream, the artist demonstrations are exciting. At these price points, Casio are going to sell a ton of these. I am so glad they included the “Advanced Tones”, that is, all the pianos, vintage keys and other instruments which created so much interest in the CT-S1. Hope they slash street prices on the CT-X series because the new S-series models blow them away.