Casio’s pre-NAMM 2022 press release mentions a few art projects to be released and shown during NAMM 2022, June 3-5.
Music Tapestry creates pictures from musical performances: a modern-day color organ, for you old-timers like me. Music Tapestry is triggered by musical pitches and keyboard touch. Casio Sound Developer Hiroko Okuda, who helped develop Music Tapestry, will demonstrate it at the Casio booth.
Casio’s U.S. Patent 10,803,844 (October 2020) discloses a process to visualize musical performance. Hiroko Okuda is one of the inventors.
If you think the Casio CT-S1 is too plain, try the “Flowers & Hearts” fabric by Brazilian pop artist Romero Britto. Casio will be selling a Limited Edition CT-S1 FH model (limited to 200 units at $500 USD).
Check out more of Britto’s work on-line!
Of course, Casio will be demonstrating their latest products including the Casio CT-S1000V with vocal synthesis. I’ll bet that the CT-S500 will be there, too. 🙂
I spent a little time with the Casio CT-S1000V this morning, trying to dial in a mellow Rhodes EP with tremolo. The Stage E.Piano tone is nice, but has auto pan instead of tremolo. I like tremolo since I usually go MONO into the live sound system.
Studying presets is always informative. The DSP tones, by default, have an Active DSP chain pre-configured. The Active DSP chain for Stage E.Piano is:
Amp Cab          -> Auto Pan         -> Auto Pan
----------------    ----------------    ----------------
Type: RD-MK2-PRE    Rate: 68            Rate: 62
Vari: 1             Depth: 80           Depth: 80
Wet Level: 127      Waveform: Sine      Waveform: Sine
Dry Level: 0        Manual: 0           Manual: 0
Bypass: OFF         Wet Level: 70       Wet Level: 70
                    Dry Level: 100      Dry Level: 118
                    Bypass: OFF         Bypass: OFF
Two Auto Pan stages? Well, let’s find this chain in the list of DSP combinations. What the? The default “Tone” DSP chain doesn’t appear in the DSP List!
The Trem 60’s EP has tremolo, so let’s take a look at its default Active DSP chain:
Amp Cab          -> Tremolo          -> Tremolo
----------------    ----------------    ----------------
Type: WR-200-PRE    Rate: 92            Rate: 92
Vari: 3             Depth: 64           Depth: 64
Wet Level: 112      Waveform: Sine      Waveform: Sine
Dry Level: 0        Wet Level: 100      Wet Level: 100
Bypass: OFF         Dry Level: 100      Dry Level: 100
                    Bypass: OFF         Bypass: OFF
Two Tremolo stages and, once again, such a DSP combination is not listed in the User Guide DSP List!
Well, DSP combi number 33, Drive Amp 2, is close to what we need. Starting with the Stage E.Piano tone, I changed its Active DSP programming to:
Drive          -> Tone Control   -> Amp Cab          -> Tremolo
--------------    --------------    ----------------    ----------------
Type: Crunch3     Low Freq: 400     Type: RD-MK2-PRE    Rate: 82
Gain: 70          Low Gain: +3      Vari: 1             Depth: 120
Out Level: 70     Mid Freq: 2.5k    Wet Level: 100      Waveform: Sine
Wet Level: 127    Mid Gain: +5      Dry Level: 0        Wet Level: 70
Dry Level: 0      High Freq: 5k     Bypass: OFF         Dry Level: 60
Bypass: ON        High Gain: 0                          Bypass: OFF
                  In Level: 127
                  Wet Level: 68
                  Dry Level: 0
                  Bypass: OFF
The Bypass parameter comes to the rescue. I didn’t like any of the Clean drive types, so I disabled (bypassed) the drive stage.
The Tone Control boosts the MIDs adding warmth. The Amp Cab model is a Rhodes Mk2 preamp — the same model in the stock Stage E.Piano tone. These Tremolo settings just sound right to me. Of course, you’re welcome to play with any of these settings.
I uploaded updated registration banks, including the Stage E.Piano tremolo. Please see the CT-S free registrations page for a link to the ZIP file.
I hope today’s post will help liturgical musicians who want to play the Casio CT-S1000V and CT-S500 at church services. I invested a fair amount of effort building patches and registrations which fit contemporary and traditional church music. The sounds would also be compatible with soft pop and gospel-tinged genres, too.
I’ve gig-tested these sounds, having played them at services. So, if you would like to try them yourself, please download the ZIP file. The ZIP contains six CT-S registration files:
RegBank01.RBK: Woodwinds
RegBank02.RBK: Strings
RegBank03.RBK: Horns / Brass
RegBank04.RBK: Drawbar organs
RegBank05.RBK: Pipe organs
RegBank06.RBK: Miscellaneous
The sixth bank is a work in progress. The first five banks cover most of my needs, but there are always a few miscellaneous sounds that pop up.
Each CT-S1000V and CT-S500 registration has four slots (patches). The following table summarizes the registration and patch layout.
        1               2               3               4
        --------------  --------------  --------------  --------------
Bank 1  Horn+Wood       Flute+Cla       Wood Sect       ChamberWinds
Bank 2  MellowStrings   StereoStrings   SoloViolin      ChamberStrings
Bank 3  FrenchHorns     NobleHorns      HighSchool      Tp + Tb
Bank 4  MellowGospel    SoftGospel      BrightChurch    Simmering
Bank 5  Pipe Organ 3    Chapel Organ    Organ Flute     Bandoneon
Bank 6  SoftPad         VoiceEnsemble   StageE.Piano    StageE.Piano Trem
I usually pre-select a bank and patch before each musical piece. Then I switch to a different patch within the same bank in order to add a different color. I wish it was a little easier to change registration bank on the fly. Maybe I’ll get better with practice.
B-3 Organ      Upper1              Upper2              Lower
               ----------------    -----------------   ----------------
MellowGospel   GospelOrgan2 127                        Organ Bass 100
SoftGospel     Rock Organ 2 127                        Organ Bass 110
BrightChurch   Elec.Organ 1 100                        Organ Bass 127
Simmering      Elec.Organ 6 110                        Organ Bass 127

Pipe Organ     Upper1              Upper2              Lower
               ----------------    -----------------   ----------------
Pipe Organ 3   Pipe Organ 3 100
Chapel Organ   Chapel Organ 100
Organ Flute    Organ Flute 120
Bandoneon      Bandoneon 120
I dialed down the reverb in all cases and settled on the ROOM2 reverb type. These patches are intended for live playing in a reverberant church hall, so additional reverb is unnecessary. You might find the pipe organ patches to be waaay dry when compared with the factory tones. I removed the initial reflections and delay which create the impression of a large space — totally unwanted in a live church.
I added 3-band EQ (ACTIVE DSP) to the woodwind patches to add warmth and to reduce harshness. Feel free to tweak away!
For string patches, Knob 1 and 2 are assigned to attack time and release time, respectively. I had to decrease the release time to reduce a simulated reverb tail. Knob 3 is usually modulation.
For drawbar organ patches, Knob 1 is rotary speaker speed, Knob 2 is scanner vibrato/chorus and Knob 3 is rotary speaker brake. Drive Rotary (ACTIVE DSP) is enabled with ACTIVE DSP HOLD. Here are the Drive Rotary DSP parameters:
BrightChurch
Param      MellowGospel   SoftGospel   Simmering
-------    ------------   ----------   ------------
Type       2              2            2
OD Gain    30             42           42
OD Level   30             42           42
Speed      SLOW           SLOW         SLOW
Brake      ROTATE         ROTATE       ROTATE
FallAcel   35             35           20
RiseAcel   40             40           35
SlowRate   45             45           65
FastRate   95             95           100
Vib/Cho    OFF            OFF          OFF
WetLevel   100            110          110
DryLevel   0              0            0
Bypass     OFF            OFF          OFF
I programmed Organ Bass in the left hand because I didn’t care for the sound of the rotary speaker on notes below middle C (or so). Drive Rotary does not have a parameter for the horn/rotor balance — maybe that would help.
Beware, this post is going to bury you in numbers. 🙂
I’ve been investigating master equalization in the Casio CT-S1000V. The CT-S500 has the same master EQ, so everything discussed here applies to the CT-S500, too.
The CT-S1000V master EQ is a four-band, semi-parametric equalizer. The four bands are: LOW, MID1, MID2, and HIGH. It’s possible to create and store a USER setting. The edit page lets you set the center frequency and gain for each of the four bands. You cannot set the band quality factor, Q, which determines the bandwidth spread.
The CT-S1000V provides ten master EQ presets with suggestive names. Casio, unfortunately, do not publish the center frequencies and gains for the presets. Listening to each preset, one thinks “Yeah, that’s bright,” or whatever. Details are missing in action, however.
One can assign LOW, MID1, MID2, and HIGH gain levels to a knob. Thanks to the knob edit function, it’s possible to suss out the gain level for each band within a preset. After much button pushing and knob twiddling, here are the gain levels (dB) for each preset:
As to the band frequencies, we turn to the published table of master EQ frequencies:
LOW   frequency range   50Hz to 800Hz
MID1  frequency range   100Hz to 8.0kHz
MID2  frequency range   100Hz to 8.0kHz
HIGH  frequency range   2.0kHz to 16.0kHz
That’s enough to get into the right ballpark.
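For anyone scripting around these numbers, here is a minimal Python sketch that encodes the published ranges and checks which bands can reach a given frequency. The function and variable names are my own invention for illustration; nothing here talks to the keyboard.

```python
# Frequency ranges (Hz) copied from the published master EQ table.
MASTER_EQ_RANGES_HZ = {
    "LOW": (50, 800),
    "MID1": (100, 8000),
    "MID2": (100, 8000),
    "HIGH": (2000, 16000),
}


def clamp_center_frequency(band: str, freq_hz: float) -> float:
    """Clamp a requested center frequency into the band's published range."""
    low, high = MASTER_EQ_RANGES_HZ[band]
    return max(low, min(high, freq_hz))


def bands_covering(freq_hz: float) -> list[str]:
    """List the bands whose published range includes freq_hz."""
    return [
        band
        for band, (low, high) in MASTER_EQ_RANGES_HZ.items()
        if low <= freq_hz <= high
    ]


if __name__ == "__main__":
    print(bands_covering(400))                   # bands that can sit at 400 Hz
    print(clamp_center_frequency("LOW", 1000))   # LOW tops out at 800 Hz
```

The overlap between bands is the useful part: a frequency like 400 Hz can be handled by LOW, MID1 or MID2, which is why the ranges alone only get you into the ballpark.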
Yamaha XG Multi EQ
Never content, I worked out a table for Yamaha XG Multi EQ. Multi EQ is an optional master EQ in the Yamaha XG effects chain. Multi EQ is fully parametric and has five bands: LOW, LOW-MID, MID, HIGH-MID, and HIGH. The LOW and HIGH bands support a peak mode, but are usually configured for shelving.
Multi EQ has five presets: Flat, Jazz, Pops, Rock and Concert (AKA “Classic”).
The settings match the names. Mellow knocks down the highs. Bright cuts the lows and boosts the highs. Loudness is a bathtub boosting both lows and highs. Powerful kicks all bands up a notch.
If I find a way to discover the CT-S1000V band frequencies, I will update its table. In the meantime, have fun!
The CT-S1000V (CT-S500) has portamento. Control over portamento is quite flexible. Be prepared to experiment, however, as the interaction between portamento settings is not immediately obvious. Unfortunately, the User Guide is not super helpful as it refers to several terms with “portamento” in the name, e.g., “Upper Portamento,” “Part Portamento”, etc.
There are two different ways to access portamento-related settings: through the MENU button and through the Settings sub-menu.
The MENU parameters allow the following adjustments:
UPPER PORT: Turn on Upper Portamento.
PART PORT: Turn on Part Portamento for each part (Upper1, Upper2, Lower).
TIME: Change portamento time for Upper1, Upper2 and Lower, individually. Each part has its own time.
When Upper Portamento is enabled, you can enable/disable portamento on the Upper part using a front panel button.
Ah, so which panel button is that? If you press the INSTRUMENT button, the CT-S1000V displays five soft buttons: SPLIT, LAYER, TOUCH, SUS and ARP. The SUS button controls sustain. If you want to control Upper Portamento instead, dive into the Settings sub-menu and scroll to “SUS/UPPER PORT button”. Change the value from “SUS” to “UPPER PORT”. Now the INSTRUMENT button shows “UPPER PORT” instead of “SUS”. Pressing the “UPPER PORT” soft button applies portamento to the Upper part. This feature allows you to apply portamento selectively during a solo line.
I hope this brief overview helps when reading the User Guide. I recommend reading the fine print about Upper Portamento because Upper Portamento can override Part Portamento. (Surprise!)
Expression pedal
For some crazy reason, I didn’t hook up and configure an expression pedal on Day 1. In retrospect, one should probably tangle with pedal set-up early just in case pedal settings are saved in user memory locations like CT-S1000V registrations.
The User Guide gives good step-by-step directions concerning pedal set-up. The CT-S1000V is vendor agnostic, thankfully. I have three (!) Yamaha FC-7 pedals and didn’t want to buy another pedal. The CT-S1000V supports two TRS wiring schemes as shown in the diagrams below:
To make a long story short, the Yamaha FC-7 is polarity type 1. The FC-7 resistance is 50K ohms. Be sure to go through the simple calibration steps in the User Guide. For reference, the FC-7 TRS signals are:
Tip: Reference voltage
Ring: Wiper
Sleeve: Ground
You wouldn’t believe how many forum posts get this wrong!
Roland, Kurzweil and Fatar are polarity type 2. The User Guide confirms operation for the Roland EV-5, Kurzweil CC-1, Fatar VP-25 and Fatar VP-26. Type 2 TRS signals are:
Tip: Wiper
Ring: Reference voltage
Sleeve: Ground
The Roland FV-500L should work, too. Be aware that the EV-5 and FV-500L have a “minimum volume” potentiometer (variable resistor) in series with the main control potentiometer. Turn the minimum volume control to 0 before calibrating. The main control potentiometer resistance is 10K ohms; the minimum volume potentiometer resistance is 50K ohms.
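If you keep notes on several pedals, the polarity information above fits in a small lookup table. This is only a sketch of the facts stated in this post; the names `PEDAL_POLARITY`, `TRS_WIRING` and `wiring_for` are hypothetical, and the FV-500L is deliberately left out because I have only guessed that it should work.

```python
# Polarity types as described in this post (FC-7 is type 1; the pedals
# confirmed by the User Guide are type 2).
PEDAL_POLARITY = {
    "Yamaha FC-7": 1,
    "Roland EV-5": 2,
    "Kurzweil CC-1": 2,
    "Fatar VP-25": 2,
    "Fatar VP-26": 2,
}

# TRS signal assignment for each polarity type.
TRS_WIRING = {
    1: {"tip": "reference voltage", "ring": "wiper", "sleeve": "ground"},
    2: {"tip": "wiper", "ring": "reference voltage", "sleeve": "ground"},
}


def wiring_for(pedal: str):
    """Return the TRS wiring for a known pedal, or None if it is not listed."""
    polarity = PEDAL_POLARITY.get(pedal)
    if polarity is None:
        return None
    return TRS_WIRING[polarity]


if __name__ == "__main__":
    print(wiring_for("Yamaha FC-7"))
    print(wiring_for("Roland EV-5"))
```

Note that the two types differ only in which of tip and ring carries the wiper; sleeve is ground in both.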
Rant of the day: I have a nice, light-weight Boss EV-1-WL expression pedal. Wish I could use it with the CT-S1000V (and others). No device to device BLE, no 5-pin MIDI, no host compatibility. Arg.
Are pedal settings really stored?
I posted this question on the Casio Music Forums site. Even though the User Guide claims the pedal settings are stored in MY SETUP and registrations, I haven’t seen evidence. If you change the pedal settings and load a registration (or MY SETUP), the changed settings remain.
I hope that Casio will clarify.
Drawbar organ tones
As mentioned earlier, I pulled together a bank of drawbar organ registrations. I settled on the following tones:
Gospel Organ 2   Mellow
Rock Organ 2     Mellow
Elec.Organ 1     Bright church-y for hymns
Elec.Organ 6     Simmerin' for grease
In all four cases, I split the keyboard putting the “Organ Bass” tone in the left hand. I like the way Organ Bass holds down the low-end and doesn’t sound swirly. Split point is E4.
I’m choosing and dialing in the Hammond B-3 organ tones for church tunes. The Casio CT-S1000V (and CT-S500) provides over thirty electric organ tones. Most of the organ tones incorporate an Active DSP effect.
Choosing tones is a process of selection and elimination. Some sounds just aren’t suitable for church, e.g., distorted rock organ, combo organ, or theatre organ. As to personal taste, I usually avoid voices with hard percussion. I often play in a group with piano, guitar or drums. There is already enough going on percussively that I don’t need to add another element to the on-going rhythm. In a typical situation, it’s more important to hold down or emphasize a left-hand bass.
Tone name        Description                     DSP effect
---------------  ------------------------------  ------------
JS Organ         Jimmy Smith, percussion         AMP cab->Trem->3EQ
AMP Organ 1      First four, no percussion       Phaser->Drive->AMP cab
Rock Organ 1     No percussion                   Phaser->Drive->AMP cab
Hard Rock Organ                                  Chorus->Drive->AMP cab
Gospel Organ 1   Mellow-ish                      Rotary
Velo Organ       Velocity brings in bars         Drive rotary
F.Organ          Farfisa combo organ             Trem->AMP cab
V.Organ          Vox combo organ                 Trem->AMP cab
RTF FD Organ     Rotary fast, full drawbars      Drive rotary
Rock OD Organ                                    Drive rotary
Tremolo Organ    Mellow, C3 sampled-in           Auto pan
DP Organ         Total Jon Lord                  Dist
Jazz Organ 1     Bright hard perc jazz           Rotary
Jazz Organ 2     Darker hard perc jazz           Rotary
Elec.Organ 1     Bright, church-y                Rotary
Elec.Organ 2     Mellow-ish, key click           Rotary
Elec.Organ 3     First four                      Rotary
Elec.Organ 4     Full drawbars                   Rotary
Elec.Organ 5     Mellow, soft perc               Phaser
Perc.Organ 1     Bright hard perc                Rotary
Perc.Organ 2     Darker soft perc                Rotary
Gospel Organ 2   Mellow                          Rotary
Full Drawbar     Full drawbar                    Rotary
Rock Organ 2     Church-y (swimming rotary)      Drive rotary
Rock Organ 3     Mellow-ish very fast            Drive rotary
Click Organ      Low drawbars, key click         Drive rotary
70's Organ       Farfisa-like, key click         Rotary
Organ Pad        Weird                           No DSP effect
Theatre Organ                                    No DSP effect
Perc.Organ 3     Hard perc, low drawbars         Rotary
Elec.Organ 6     Low+high (16+1), click          Rotary
AMP Organ 2      Rocky bars                      Tone->Drive->AMP cab->Trem
AMP Organ 3      Heavy OD                        Tone->Drive->AMP cab->Trem
Organ Flute      Pipe organ flute                No DSP effect
Puff Organ       Chiffy organ flute              No DSP effect
Reed Organ       Portable reed organ             Mono 3-band EQ
Rotary F-Organ   Farfisa organ, rotary speaker   Rotary
Rotary V-Organ   Vox organ, rotary speaker       Rotary
The table above contains notes taken while auditioning CT-S1000V drawbar organ tones. The CT-S500 has the same complement of organ voices, so any comments made here apply to the CT-S500, too.
My goal is a registration bank of four tones that provide a range of timbres. I want to push a registration button while playing in order to match the current musical situation. So far, a few candidates stand out:
Elec.Organ 6     Simmering 16+1
Gospel Organ 2   Mellow, ballad-like sound
Gospel Organ 1   Brighter, hymn foundation
Rock Organ 2     Brighter still, hymn playing
I listened to all organ tones with Active DSP enabled and bypassed. “Rock Organ 2” has a church-y sound when the overdrive and rotary speaker are removed. I can knock the rock out of the preset by re-programming the Active DSP effect.
Most organ tones have a preset Active DSP effect as shown in the table. A few of the tones — like AMP Organ 1 — don’t use the rotary effect at all. (Surprise!)
As I mentioned in an earlier post, a few of the tones have a swirl-y or swimming fast rotary setting. Sometimes the rise acceleration and fall deceleration times are too short. The settings below are ballpark.
Drive Rotary       PJD        Velo Org
----------------   --------   --------
Type:              2          2
Overdrive Gain:    42         48
Overdrive Level:   42         31
Speed:             Fast       Fast
Brake:             Rotate     Rotate
Fall Accel:        40         10
Rise Accel:        60         18
Slow Rate:         40         17
Fast Rate:         100        111
Vib/chorus:        C3         C2
Wet Level:         100        100
Dry Level:         0          44
Bypass:            Off        Off
I will use these parameter values as my starting point moving ahead. The wet/dry parameters could be helpful in a stereo mix avoiding hard left/right channel throb. BTW, the sim doesn’t have parameters for horn/rotor balance.
The Casio rotary speaker sims include a scanner vibrato/chorus option: Off, V1, V2, V3, C1, C2, C3. The presets assign overdrive to knob 2 (K2). I don’t usually adjust overdrive when I’m playing and will assign vibrato/chorus control instead. That way I can add or subtract chorus on the fly. Unfortunately, scanner off is at the knob’s full left position and C3 is at the full right position. That will make for a big gesture knob-spin when playing. Wish this could be assigned to a button…
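To picture why this is a big knob gesture, here is a rough Python sketch of seven settings spread across one knob sweep. Big assumptions here: I am guessing the knob behaves like a 0 to 127 controller divided into equal zones. Casio does not document the actual mapping, and `knob_to_setting` is a made-up name.

```python
# The seven scanner vibrato/chorus options, in knob order (left to right).
VIB_CHORUS_SETTINGS = ["Off", "V1", "V2", "V3", "C1", "C2", "C3"]


def knob_to_setting(value: int, max_value: int = 127) -> str:
    """Map a knob position to a setting, assuming equal-width zones.

    The 0..max_value range and the equal zones are assumptions, not
    documented Casio behavior.
    """
    if not 0 <= value <= max_value:
        raise ValueError("knob value out of range")
    zones = len(VIB_CHORUS_SETTINGS)
    index = min(value * zones // (max_value + 1), zones - 1)
    return VIB_CHORUS_SETTINGS[index]


if __name__ == "__main__":
    for position in (0, 32, 64, 96, 127):
        print(position, knob_to_setting(position))
```

Under these assumptions each setting owns only about one seventh of the sweep, so going from Off to C3 means turning the knob nearly end to end.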
As to reverb, I’ve settled on ROOM 2 with a send level of 20. This is just enough to add a little space when practicing, but it won’t muddy the sound too much when playing live.
Casio gave the CT-S Casiotone series a major boost with the new CT-S500 and CT-S1000V models. The new models include the acclaimed CT-S1 voices. Even better, Casio added a wide range of editable DSP effects. The DSP effects are in addition to the three system effect units: reverb, chorus and delay. Up to two DSP effect chains can be applied to tones (voice) and auto-accompaniment parts.
Casio have their own way of organizing effects. DSP effects are organized into chains, where each chain consists of one to four DSP modules. A DSP module is a unit executing a particular effect algorithm like EQ, tremolo, compression, etc. There are 29 DSP module types.
As a user, you don’t work directly in terms of DSP modules. Instead, you select a chain from one of 100 preset DSP chains. The first 25 or so preset chains have only a single DSP module, just about covering the 29 DSP module types individually. The remaining 75 presets have two, three or four DSP modules. Many chains are obviously guitar-oriented while others are slanted toward keyboards, percussion and ambience. When it comes to effect applications, who am I to judge? 🙂
The CT-S approach may appear to be too restrictive, but each DSP module has a BYPASS parameter which lets you turn off its stage. If you can find a preset chain with effects in the desired order, you’re good to go, thanks to DSP parameter editing.
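One way to think about it: a preset chain is a fixed, ordered list of modules, and BYPASS simply drops a stage out of the signal path. The Python sketch below models that idea using a Drive, Tone Control, Amp Cab, Tremolo chain with the drive stage bypassed. The `DspModule` class and `active_stages` function are purely illustrative and have nothing to do with Casio's firmware.

```python
from dataclasses import dataclass, field


@dataclass
class DspModule:
    """One stage in a DSP chain (illustrative model only)."""
    name: str
    bypass: bool = False
    params: dict = field(default_factory=dict)


def active_stages(chain):
    """Return the names of stages that still process audio, in chain order."""
    return [module.name for module in chain if not module.bypass]


# A four-module chain with the first stage bypassed.
chain = [
    DspModule("Drive", bypass=True, params={"Type": "Crunch3", "Gain": 70}),
    DspModule("Tone Control", params={"Mid Freq": "2.5k", "Mid Gain": 5}),
    DspModule("Amp Cab", params={"Type": "RD-MK2-PRE", "Vari": 1}),
    DspModule("Tremolo", params={"Rate": 82, "Depth": 120}),
]

if __name__ == "__main__":
    print(" -> ".join(active_stages(chain)))
```

The order of the list never changes; only the bypass flags do. That is why finding a preset with the right module order matters more than finding one with the right settings.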
Thank heavens for DSP parameter editing! The rotary speaker simulations are too fast right out of the box, for example. In addition to modifying effect parameters, you can assign a DSP parameter to each of the three front panel knobs (e.g., rotary speaker speed). Everything gets saved into a registration slot from which tone and effect settings are recalled.
Oh, yes, you have full access to Active DSP parameters. Unfortunately, this is not true for system effects (reverb, chorus, and delay). In the case of system effects, only the effect type (e.g., ROOM 3, HALL 2, etc.) can be changed. Minor bummer.
If I’m a little vague about effect routing, it’s because Casio have not published an effect routing diagram for the CT-S500 or CT-S1000V. I don’t feel like I’ve been held back in practice, but I would love to see a detailed signal flow. A diagram would clear up people’s questions about effect assignment (how many and where).
I’ve been pulling together a dozen of my most frequently used patches (layers and splits). I like to detune layers in order to add a bit more motion and interest. Individual tones, however, do not have a fine tuning parameter (i.e., plus or minus cents).
The orchestral voices are not pre-programmed with a DSP effect (thank goodness). Lacking a fine tuning control, an alternative method is to add a pitch shift DSP module to the tone. There are four DSP preset chains that contain pitch shift:
#21 Pitch:        --> Pitch -->
#87 Pitch Delay:  --> Delay --> Pitch --> Phaser --> Auto Pan -->
#90 Pitch Mod 1:  --> Tone --> Phaser --> Delay --> Pitch -->
#91 Pitch Mod 2:  --> Pitch --> Delay --> Phaser --> Tone -->
The multi-module chains are fun, but waaay more complicated than necessary. Nonetheless, I jotted down the pitch-related parameters for reference:
Turns out, the stock pitch parameters (column 1) are ready to go, adding a pleasant chorus-like effect to layered tones.
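For a feel of what a few cents of detune actually does, here is a small Python sketch of the standard cents-to-frequency-ratio formula, ratio = 2^(cents/1200). It is general tuning math, not anything taken from the Casio pitch module, and the function names are my own.

```python
def cents_to_ratio(cents: float) -> float:
    """Frequency ratio for a pitch offset in cents (1200 cents = 1 octave)."""
    return 2.0 ** (cents / 1200.0)


def detuned_frequency(freq_hz: float, cents: float) -> float:
    """Frequency after shifting freq_hz by the given number of cents."""
    return freq_hz * cents_to_ratio(cents)


def beat_rate_hz(freq_hz: float, cents: float) -> float:
    """Beat rate heard when a tone is layered with a detuned copy of itself."""
    return abs(detuned_frequency(freq_hz, cents) - freq_hz)


if __name__ == "__main__":
    for cents in (2, 5, 10):
        print(cents, "cents at A440 beats at", round(beat_rate_hz(440.0, cents), 2), "Hz")
```

A detune of 5 cents at A440 beats at a little over 1 Hz, which is in the territory of a slow, chorus-like shimmer. The beat rate doubles with each octave up, so layered high notes move faster than low ones.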
Of course, I can’t let it rest and tried the six ambient enhancement presets. I got a nice, subtle feel with “AmbientEnh 4”. The chain breaks down like this:
Piano Effect    --> Delay           --> Tone Control
--------------      --------------      --------------
Lid: Full Open      Time: 90            Low Freq: 50
Ref Level: 60       TmRatioL: 28        Low Gain: -12
                    TmRatioR: 64        Mid Freq: 100
                    Level L: 110        Mid Gain: -6
                    Level R: 110        High Freq: 16k
                    Fdbk: Cross         High Gain: 0
                    Fdbk Level: 30
                    High Damp: 120
                    Tempo Syn: Off
In Level: 127       In Level: 100       In Level: 90
Dry Level: 110      Dry Level: 110      Dry Level: 0
Wet Level: 90       Wet Level: 110      Wet Level: 90
Bypass: Off         Bypass: Off         Bypass: Off
I used my ears first before diving into the details. Yep, I was surprised to see “Piano Effect” in the chain. Frankly, I don’t care what it’s called as long as it sounds good! The chain takes a little harshness off woodwinds while adding early reflections and cross delay. Cross delay isn’t too important to me as I will be playing in MONO at the job. Still, why not?
Hopefully, this look at CT-S1000V (CT-S500) DSP effects is helpful.
You might decide to fine tune the Casio CT-S1000V’s pronunciation of certain words or syllables. Crack open Casio Lyric Creator, select a lyric phrase, and long-press a word or syllable. Lyric Creator opens the “Edit Phonemes” display.
If you’re working in English, Lyric Creator shows the word’s phonetic translation above a peculiar looking on-screen keyboard. The keys correspond to the major phonemes in the English language.
If you want to spend (waste) an entire afternoon, search for English phonics. At the very least, you’ll find articles about phonics for first readers. If you dive deeper, you will be down the infinite rabbit hole of linguistics. Beware!
Sometimes it’s worth it to check up on Lyric Creator. In one version of “Amazing Grace,” Lyric Creator produced the ‘G’ at the beginning of “Grace.” In another version, it didn’t produce the ‘G’ at all. YMMV. The CMU Pocket Sphinx list of 10,000 frequent words is a good resource when checking pronunciations. [Hope they didn’t ask anyone from Pittsburgh to make this list.]
There are roughly 44 common phonemes in English separated into vowels and consonants. Looking at the phoneme keyboard, most of the consonants make sense. There are a few two-letter combinations like “ng”, “tt”, “th”, “jh”, and “zh”. These cover sounds like the “ng” in “sing”.
The vowels, however, are not what they appear to be! The phonemes follow the International Phonetic Alphabet (IPA). [Not “India Pale Ale”.] The IPA has more special characters (glyphs) than Iverson’s APL (A Programming Language). The glyphs are mapped to two- and three-character sequences.
Here’s a quick and dirty correspondence table between symbols in the ARPAbet and English sounds:
Vowels         Consonants     Consonants
----------     ----------     ----------
aa  bOt        b   Buy        ng  siNG
ae  bAt        ch  CHin       p   Pie
ah  bUt        d   Die        r   Rye
ax  oracLE     dd  miDDle     s   Sigh
ao  OUght      dh  THy        sh  SHy
aw  bOUt       f   Fight      t   Tie
axr lettER     g   Guy        tt  auTumn
ay  bIte       hh  High       th  THigh
eh  bEt        jh  Jive       v   Vie
er  bIRd       k   Kite       w   Wise
ey  bAIt       l   Lie        y   Yacht
ih  bIt        m   My         z   Zoo
iy  bEAt       mm             zh  diviSion
ow  bOAt       n   Night      cl
oy  bOY        nn  wiNNer
uh  bOOk
uw  bOOt
You’ll notice several oddities among the English vowels, e.g., “ay” producing the long ‘i’ sound in “bite”. “ay”, really? No doubt, you saw the “ey” in “Grace” and thought that was strange, too.
The 44 common phonemes aside, there are a few phonemes in the Lyric Creator keyboard that required investigation. There are a few phonemes for which I have no clue! Next are a few special cases for your consideration.
There are two “th” sounds: voiced and unvoiced. A voiced sound is produced with the vocal cords; an unvoiced sound is produced solely by other components of the vocal tract. In Lyric Creator:
“dh” is the voiced “th” sound.
“th” is the unvoiced “th” sound.
Some vowel sounds are influenced by the letter ‘r’. The phoneme “axr” is a vowel influenced by ‘r’ as in “creator” or “letter”.
“ax” turns up in some interesting cases like “autumn” (ao,tt,ax,m), “middle” (m,ih,dd,ax,l) and “kindle” (k,ih,n,d,ax,l).
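If you collect phoneme spellings while experimenting, a plain dictionary is enough to keep them searchable. This Python sketch holds only the three spellings quoted above; the names `PHONEME_SPELLINGS` and `words_using` are hypothetical, and this is not the Lyric Creator file format.

```python
# Phoneme spellings exactly as quoted in this post.
PHONEME_SPELLINGS = {
    "autumn": ["ao", "tt", "ax", "m"],
    "middle": ["m", "ih", "dd", "ax", "l"],
    "kindle": ["k", "ih", "n", "d", "ax", "l"],
}


def words_using(phoneme: str) -> list[str]:
    """Return the collected words whose spelling contains the given phoneme."""
    return sorted(
        word
        for word, spelling in PHONEME_SPELLINGS.items()
        if phoneme in spelling
    )


if __name__ == "__main__":
    print(words_using("ax"))   # every word collected so far uses "ax"
    print(words_using("dd"))
    print(words_using("d"))    # plain "d" is a different symbol from "dd"
```

Storing each spelling as a list of symbols, rather than one string, keeps “d” and “dd” from being confused during a search.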
The words “middle” and “autumn” contain “dd” and “tt”, respectively. These phonemes lead to a discussion of alveolar flapping in English. [Yes, “flapping”.] We now stand at the maw of the rabbit hole and I will take my leave.
Here are some observations after getting my feet wet with Casio Lyric Creator and Casio CT-S1000V.
Connection
As several other folks have mentioned, you must connect your iPad (Android device) to the CT-S1000V with a USB cable. Right now, Casio Lyric Creator cannot communicate with the CT-S1000V over Bluetooth.
It’s a little weird. Casio Music Space — recently released — does communicate with the CT-S1000V over Bluetooth. Casio Music Space could be a useful educational tool. However, it doesn’t fill a need for me.
Casio Music Space has a pairing dialog when initiating Bluetooth connection. Perhaps this is all that Lyric Creator needs? The iOS Bluetooth settings page does not find or show the CT-S1000V; I guess pairing is up to the app.
This is all new and perhaps we should give the Casio developers a little more time. At the moment, the WU-BT10 Bluetooth dongle is not much help. Glad it was included with the CT-S1000V. If I laid out $80USD for the dongle on top of the CT-S1000V, I’d be disappointed.
Memory capacity
The CT-S1000V comes factory-loaded with 100 dance floor (EDM) phrases. It’s fun to mess with these although they are not my cup of tea.
Casio Lyric Creator has an “Instrument Data Management” button at the bottom of the main screen. IDM displays the phrases loaded into a connected CT-S1000V.
There are 48MBytes of internal memory allocated for phrases. Nearly all of the factory phrases are about 300KBytes, occupying roughly 30MBytes total. That leaves about 18MBytes free and available.
At some point, I will zap the factory phrases. Fortunately, Casio provide a file with the CT-S1000V Preset Lyric Tone Data. The file is in the Electronic Musical Instruments support area. You will need to scroll down to the “Digital Keyboards” section to find the file.
It’s a DAL file which resets all user data. Yikes! Be sure to save your own content before loading this file! It’s a factory reset.
The Lyric Creator User Guide describes how to restore individual lyric tones (phrases). Check out the Data Management section of the Guide for more details.
Just a phrase
Starting simple, I created a phrase from the first line of “Amazing Grace:”
Am -az -ing grace how sweet the sound
Then I added note values (durations), saved the lyrics, and transferred the phrase to the CT-S1000V. Pretty smooth although I missed the need to SAVE the lyric file (with the cryptic name) to the CT-S1000V. The keyboard sat there in the “Preparing” state until the transfer request timed out.
I played the phrase note by note. All good. Then I tried to find the most natural sounding Vocalist, settling on the Bossa Nova Vocalist.
This short phrase, BTW, took roughly 600KBytes of storage.
Prying eyes
Of course, the next thing is to save the lyrics file (extension lyj) on the PC and open the file with a text editor (Emacs). Sure enough, it’s all text. A lyrics file contains:
Header information about file and path names.
The lyric text as one continuous string.
A sequence of syllable data items.
Lyric Creator option settings, e.g., input language mode, auto split, auto conversion, etc.
More file paths including the Music XML file path.
It’s all one big string with delimiters dividing fields, attributes and values. Not the easiest thing for a human to read and we probably weren’t expected to poke around inside. (Ha!)
The sequence of syllable data items is the most interesting part. This is how Lyric Creator subdivided your lyric phrase and its note values. Here is a typical syllable data item:
“grace” is the syllable. Normally, the phoneme property is empty. I choose to enter my own phonetic spelling for “grace”. [I’ll have a lot more to say about phonetic spelling in a future post.] If you don’t spell out your own phonemes, Casio will use the default spelling and leave the phoneme property blank. Length is the note value (duration) specified in tick units (480 ticks per quarter note). The note property is a MIDI note number (default 60 or middle C).
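Given 480 ticks per quarter note, converting a note value to a length is simple arithmetic. Here is a minimal Python sketch; the function name and the way note values are expressed (as fractions of a whole note) are my own choices for illustration, not part of the lyrics file format.

```python
from fractions import Fraction

TICKS_PER_QUARTER = 480  # resolution stated above


def note_value_to_ticks(note_value, dotted: bool = False) -> int:
    """Convert a note value (fraction of a whole note) to ticks.

    note_value is e.g. Fraction(1, 4) for a quarter note.
    A dotted note is 1.5 times as long as the plain note.
    """
    ticks = Fraction(note_value) * 4 * TICKS_PER_QUARTER
    if dotted:
        ticks *= Fraction(3, 2)
    if ticks.denominator != 1:
        raise ValueError("note value does not land on a whole tick")
    return int(ticks)


if __name__ == "__main__":
    print(note_value_to_ticks(Fraction(1, 4)))                # quarter note
    print(note_value_to_ticks(Fraction(1, 8)))                # eighth note
    print(note_value_to_ticks(Fraction(1, 2), dotted=True))   # dotted half
```

Using `Fraction` instead of floats keeps triplets and dotted values exact, so a bad note value is caught rather than silently rounded.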
Importing Music XML
That was too easy. Let’s go for all the marbles and import a Music XML file. Now the road gets rougher. I think Casio need to do more testing.
I installed the latest version of MuseScore and created a simple chart for “Amazing Grace.” The Lyric Creator Music Guide says: “If the file being imported has multiple parts, only the first part will be imported.” I take this to mean “Lyric Creator will only import the first verse.” Thus, I only created one verse.
I exported an uncompressed Music XML file from MuseScore. Lyric Creator cannot import compressed Music XML; only .xml or .musicxml extensions are allowed.
OK, I should have saved the first attempt to show you. [I didn’t.] Lyric Creator brought in the lyrics — sort of. It missed the pick-up syllable (note) and it inserted a few extra syllables. On reflection, the extra syllables may be Do Re Mi Fa So La Ti Do, corresponding to the pitches of the C scale. Surprise! Lyric Creator also inserted a few rests where rests occurred in the score. Hyphen placement seemed a little random.
Overall, if the text is short, I wouldn’t bother with Music XML right now. Pressing on…
I cleaned up the lyric text: deleting Do Re Mi etc., adding the missing pick-up syllable, and so forth. The changed text saved OK.
Next, I edited the note values. Lyric Creator had generated note values based on the Music XML score. However, when I added the pick-up syllable, it looked like everything was now off by one place. Diligently, I changed the note values to match the original score. Went to save, and uh-oh, the text plus note values are too big to save. I deleted the rests and that was enough to trim the lyrics and make Lyric Creator happy again.
The Lyric Creator User Guide has a caveat: “You can enter lyrics for up to 100 syllables when the note value (note length) is eighth note. The number of syllables you can input depends on the note value.” Looks like I tripped the limit.
I transferred the lyrics for the first verse to CT-S1000V and successfully played them note by note. Hurray!
The entire first verse with note values required 1.6MBytes of storage. Yes, some of those factory phrases are gonna go!
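To see how far the phrase memory stretches, here is some back-of-the-envelope arithmetic in Python using the figures from this post: 48MBytes total, 100 factory phrases at roughly 300KBytes each, about 600KBytes for the one-line phrase, and about 1.6MBytes for the full verse. I am assuming 1MByte = 1000KBytes to match the round numbers, and all names are made up for the sketch.

```python
KB_PER_MB = 1000  # assumption: decimal units, matching the round numbers above

PHRASE_MEMORY_KB = 48 * KB_PER_MB
FACTORY_PHRASE_COUNT = 100
FACTORY_PHRASE_KB = 300  # "about 300KBytes" each, so this is approximate


def free_kb(factory_phrases_kept: int = FACTORY_PHRASE_COUNT) -> int:
    """Approximate free phrase memory after keeping N factory phrases."""
    return PHRASE_MEMORY_KB - factory_phrases_kept * FACTORY_PHRASE_KB


def phrases_that_fit(phrase_kb: int, available_kb: int) -> int:
    """How many phrases of a given size fit into the available space."""
    return available_kb // phrase_kb


if __name__ == "__main__":
    print("Free with all factory phrases:", free_kb(), "KB")
    print("Full verses (1600 KB) that fit:", phrases_that_fit(1600, free_kb()))
    print("Full verses after deleting everything:", phrases_that_fit(1600, free_kb(0)))
```

With the factory phrases intact there is room for roughly eleven verse-sized phrases; wiping them all raises that to about thirty. These are estimates only, since the per-phrase sizes are approximate.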
My final observation has to do with the MIDI note numbers. Are they or what are they? I need to investigate the note property as the generated note numbers don’t match the melody. I question my own conjecture…
Summary
Well, Lyric Creator works within its limitations. I’m not sure that Music XML is ready for prime time. Unless I really, really wanted or needed to import Music XML, I would start simply with Lyric Creator’s editor.
There are two major approaches to speech (singing) synthesis: unit-selection and statistical parametric.
Most people are familiar with unit-selection systems like Texas Instruments’ old Speak & Spell or the much more advanced Yamaha Vocaloid™. Unit-selection relies upon a large database of short waveform units (AKA phonemes) which are concatenated during synthesis. The real trick behind natural-sounding singing (and speech) is the connective “tissue” between units. Vocaloid creates waveform data that connects individual phonetic units.
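To make the "connective tissue" idea concrete, here is a minimal sketch that joins waveform units with a plain linear crossfade. This is my own toy illustration, not Vocaloid's method, which is far more sophisticated. The function name, the overlap length and the constant-level "units" are all made up for the example.

```python
def crossfade_concat(units, overlap):
    """Join waveform units, blending `overlap` samples at each seam."""
    if not units:
        return []
    out = list(units[0])
    for unit in units[1:]:
        # Never blend more samples than either side actually has.
        n = min(overlap, len(out), len(unit))
        start = len(out) - n
        for i in range(n):
            # Linear ramp: the tail of `out` fades down, the head of `unit` fades up.
            w = (i + 1) / (n + 1)
            out[start + i] = (1.0 - w) * out[start + i] + w * unit[i]
        out.extend(unit[n:])
    return out


# Hypothetical units: constant levels stand in for recorded waveforms.
unit_a = [1.0] * 6
unit_b = [0.0] * 6
joined = crossfade_concat([unit_a, unit_b], overlap=3)
print(joined)
```

With an overlap of 0 the units are simply butted together, which is where the audible clicks and unnatural joins come from.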
If you are familiar with Yamaha’s Articulation Element Modeling (AEM), a light should have lit in your mind. The two technologies have similarities, i.e., joining note heads, bodies, and tails. The Yamaha NSX-1 chip implements a stripped down Vocaloid engine and Real Acoustic Sound (AEM).
The content and size of the unit waveform database is a significant practical problem. The developers must record, organize and store a huge number of sampled phrases (waveform units). The Vocaloid 2 Tonio database (male, operatic English singer) occupies 750MBytes on my hard drive. That is not small, and it was no doubt a real challenge to collect.
Statistical parametric systems effectively encode the source phonetic sounds into a model such as a hidden Markov model (HMM). During training, the source speech is subdivided into temporal frames and the individual frames are reduced to acoustic parameters. The model learns to associate specific text with the corresponding acoustic parameters. During synthesis, the model is fed text and acoustic parameters are recalled by the model. The acoustic parameters drive some form of vocoding. (“Vocoding” is used broadly here.)
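The "subdivide into frames, reduce to parameters" step can be sketched in a few lines. This is only a toy under my own assumptions: real systems extract features like spectral coefficients and F0, while this sketch computes frame power and a crude zero-crossing count. The 8 kHz rate, the 80-sample frame and the function names are all hypothetical.

```python
import math


def frame_signal(samples, frame_len):
    """Split samples into consecutive, non-overlapping frames (a partial last frame is dropped)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def frame_power(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossings(frame):
    """Count sign changes within a frame; a very rough voiced/unvoiced cue."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))


# Toy "speech": 400 samples of a 100 Hz sine at an assumed 8 kHz sample rate.
rate = 8000
signal = [math.sin(2 * math.pi * 100 * n / rate) for n in range(400)]

frames = frame_signal(signal, frame_len=80)
features = [(frame_power(f), zero_crossings(f)) for f in frames]
print(len(frames), features[0])
```

In a real trainer, each per-frame feature vector would be paired with the text (phoneme, pitch, and so on) that was being sung during that frame.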
Deep neural networks (DNN) improve on HMM. Sinsy is a DNN-based singing voice synthesis (SVS) system from Nagoya Institute of Technology. It is the culmination of many years of research by sensei Professor Keiichi Tokuda, his students and colleagues. It was partially supported by the Casio Science Promotion Foundation. Thus, adoption by Casio is hardly accidental!
The Sinsy block diagram is taken from their paper: Sinsy: A Deep Neural Network-Based Singing Voice Synthesis System, by Yukiya Hono, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 29, pp. 2803-2815, 2021. The method is quite complex and consists of several models. It’s not clear (to me, yet) if the Casio approach has all elements of the Sinsy approach. I recommend reading the paper, BTW; it’s well-written and highly technical.
The next block diagram is taken from Casio’s U.S. Patent number 10,789,922 awarded September 29, 2020. Their approach is separated into a training phase and a synthesis (playing) phase. You’ll notice that Casio employ only an acoustic model. The patent discloses a “Voice synthesis LSI” unit, so their software may have a hardware assist. We’ll need to take a screwdriver to the CT-S1000V to find out for sure!
A picture is worth a thousand words. A technical diagram, however, requires a little interpretive context. 😉 Paraphrasing the Casio patent:
The text analysis unit produces phonemes, parts of speech, words and pitches. This information is sent to the acoustic model. The acoustic model unit estimates and outputs an acoustic feature sequence. The acoustic model represents a correspondence between the input linguistic feature sequence and the output acoustic feature sequence. Acoustic feature sequence includes:
Spectral information modeling the vocal tract (mel-cepstrum coefficients, line spectral pairs, or similar).
Sound source information modeling the vocal cords (fundamental pitch frequency (F0) and power value).
The vocalization model unit receives the acoustic feature sequence. It generates singing voice inference data for a given singer. The singing voice inference data is output through a digital-to-analog converter (DAC). The vocalization model unit consists of:
A sound source generator:
Generates a pulse train for voiced phonemes.
Generates white noise for unvoiced phonemes.
A synthesis filter:
Uses the output signal from the sound source generator.
Is a digital filter that models the vocal tract based on spectral information.
Generates singing voice inference data (AKA “samples”).
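The sound source plus synthesis filter arrangement is the classic source-filter idea, and a bare-bones version is easy to sketch. This is emphatically not Casio's implementation. The fixed two-pole filter, its coefficients, the pulse period and the 200-sample frame are all my own placeholder choices. In the patent, the filter follows the spectral information frame by frame.

```python
import random


def excitation(n_samples, voiced, period=100, seed=0):
    """Pulse train for voiced frames, uniform white noise for unvoiced frames."""
    if voiced:
        # One unit pulse every `period` samples; the period sets the pitch.
        return [1.0 if i % period == 0 else 0.0 for i in range(n_samples)]
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]


def synthesis_filter(source, a1=-1.6, a2=0.8):
    """Two-pole resonator: y[n] = x[n] - a1*y[n-1] - a2*y[n-2].

    The default coefficients put the poles at 0.8 +/- 0.4j (inside the
    unit circle), so the filter is stable. A real vocal-tract filter has
    many more poles and is updated from the spectral parameters.
    """
    y1 = y2 = 0.0
    out = []
    for x in source:
        y = x - a1 * y1 - a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


# One hypothetical 200-sample frame of each kind.
voiced_frame = synthesis_filter(excitation(200, voiced=True))
unvoiced_frame = synthesis_filter(excitation(200, voiced=False))
print(voiced_frame[:3])
```

The voiced frame rings at the filter's resonance each time a pulse arrives; the unvoiced frame is the same resonance colored by noise, which is roughly how a vowel differs from an "s" or "sh".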
This rather complicated diagram from U.S. Patent 10,789,922 shows the synthesis phase in more detail. It shows the lyric string decomposed into phoneme and frame sequences. Each frame is sent to an acoustic model which generates an acoustic feature sequence, that is, the acoustic parameters that were learned during training. The acoustic parameters are synthesized (vocoded) into 225 samples per frame. Each frame is about 5.1 msec long, which is consistent with a 44.1 kHz sample rate.
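Here is a rough sketch of what "decomposed into phoneme and frame sequences" might look like in code. Only the 5.1 msec frame length comes from the discussion above. The phoneme labels, their durations and the rounding rule are hypothetical, chosen just to show the expansion.

```python
FRAME_MS = 5.1  # frame length quoted above


def frames_per_phoneme(phonemes):
    """Expand (phoneme, duration_ms) pairs into one label per frame."""
    sequence = []
    for label, duration_ms in phonemes:
        # Every phoneme gets at least one frame, however short it is.
        count = max(1, round(duration_ms / FRAME_MS))
        sequence.extend([label] * count)
    return sequence


# Hypothetical phonemes and durations for one sung syllable.
syllable = [("h", 51.0), ("e", 204.0), ("y", 102.0)]
frame_labels = frames_per_phoneme(syllable)
print(len(frame_labels))
```

Each entry in the resulting sequence corresponds to one trip through the acoustic model, so a long held vowel is simply many consecutive frames carrying the same phoneme label.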
Well, if the second patent diagram was TMI, here is the block diagram from the Casio CT-S1000V user guide. The simplified diagram is quite concise and accurate! You should be able to relate these blocks directly back to the patent.
I hope this discussion is informative. In a later post, I’ll take a look at a few practical details related to Casio CT-S1000V Vocal Synthesis.