Wednesday, May 4, 2016

Building a Fantasy Language 2B: Sound System - Syllable Structure

Syllable structure and syllable structure rules are an important part of any language’s sound system.  They are also a big part of what makes languages different from one another.  Why do Russian (zdrasvuytche!) and Hawaiian (aloha!) sound so different?  Why do English speakers find Hawaiian easy to pronounce, and find Russian to be hard?  It’s not just that Russian has a few sounds that we don’t have in English—it’s Russian’s syllable structure rules.  It allows more complicated syllables and syllable structures than English does, which makes it harder to pronounce.  Hawaiian only allows very simple syllables, which makes it easier to pronounce.

I remember how the teachers used to make us count out syllables in elementary school (“How many syllables in education?”).  I was pretty good at that, but I was terrible at dividing words into syllables; at some point I think that I mixed up the rules for dividing words into syllables with the rules for dividing words in your handwritten reports when you came to the end of a line.

Note: If you want the short version of this post, read “The Parts of a Syllable,” then jump down to “Syllable Structure Rules.”

The Parts of a Syllable
Before we can talk about syllable structure rules, we have to know the different parts of the syllable: the onset, nucleus, and coda.

The nucleus is the easiest part of the syllable to identify.  In most cases, the nucleus consists of a vowel or vowels in the middle of the syllable.  Thus, in e-du-ca-tion, the e, u, a, and o are the nuclei of the word’s four syllables.  (A few consonants can also act as nuclei, in rare cases: l, r, m, and n.  This is why we can say mmhm and other such useful things.)  Vowels will NOT be part of the onset or coda, so if your syllable has a vowel, it will be in the nucleus.

The onset consists of any consonant or consonants in your syllable that come before the nucleus.  In English, not all syllables NEED to have an onset, although they like to have them. In e-du-ca-tion, the first syllable, e-, does not have an onset.  In the rest of the syllables, d-, c-, and t- are the onsets.

The coda consists of any consonant or consonants in your syllable that come after the nucleus.  I don’t know of any language that requires syllables to have codas; some languages specifically do not allow syllables to have codas.  In e-du-ca-tion, only the last syllable has a coda, -n.
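For readers who like to tinker, here is a minimal Python sketch of that three-way split.  It works on spelling rather than on actual sounds, and it assumes the syllable has already been separated from its word and that its vowel letters sit next to each other.  The name split_syllable and the five-letter vowel list are made up for illustration.

```python
# Assumption: spelling stands in for sound, and only a, e, i, o, u count as vowels.
VOWELS = set("aeiou")


def split_syllable(syllable):
    """Split one syllable into (onset, nucleus, coda), judged by spelling."""
    letters = syllable.lower()
    # Positions of the vowel letters; they are assumed to be contiguous.
    vowel_positions = [i for i, ch in enumerate(letters) if ch in VOWELS]
    if not vowel_positions:
        raise ValueError(f"no vowel nucleus found in {syllable!r}")
    start, end = vowel_positions[0], vowel_positions[-1] + 1
    # Everything before the vowels is the onset, everything after is the coda.
    return letters[:start], letters[start:end], letters[end:]


for syl in ["e", "du", "ca", "prank", "ark"]:
    print(syl, split_syllable(syl))
```

Syllabic consonants (the mmhm case) are deliberately left out, so a vowel-less syllable raises an error instead of guessing.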

Dividing Syllables
Okay, we know about onsets, nuclei, and codas now; but what do you do with words like mirthful or utensils or escort?  Which syllables do all those consonants in the middle belong to?  mirthful is two syllables, but do the r, th, and f all go with the first syllable, or all with the second, or are they divided somehow?  What about that ns in the middle of utensils, and the sc in the middle of escort?

There are a few rules that can help us in this situation.  The first one is called Maximize Onset, which basically means “shoehorn as many of the consonants into a syllable onset as you can.”  How does that work?

Let’s look at escort.  Obviously, it consists of 2 syllables, but is it divided es-cort, e-scort, or esc-ort?  Your gut reaction might be to divide it as es-cort, which is how you would hyphenate it at the end of a line in a report… but this would not divide the word correctly in terms of its syllable structure.  We want to maximize the onset of syllable 2, which gives us a division e-scort.

Now look at utensils.  We know there are three syllables, the first of which is u-.  The t- drops nicely into the onset position for syllable 2.  The –ls is plainly the coda of syllable 3.  Using the principle of Maximize Onset, we might divide the word u-te-nsils.  “But wait!  This isn’t right!” says my college linguistics teacher.

“Wait, what?”

“u-te-nsils is not right.”

“But I maximized the onset just like you told me!”

*linguistics teacher strokes mustache*  “Yes, but—there was one thing I should have mentioned…”

The combination of consonants in a syllable onset has to be a combination that can begin a word in whatever language you are working in.  So, we can divide up a word e-scort, because there are plenty of English words that begin with the /sk/ sound: skirt, scar, skirmish…  However, we can’t divide u-te-nsils because no English words begin with ns-.

So now what do we do with utensils?  Well, we still want to maximize the onset, but we now know that we can’t include the –n as part of that onset, because no English word begins with ns-.  So we divide it as u-ten-sils.  Hurrah!

Now our last example, mirthful.  We have two syllables; the m- is the onset of syllable 1, and the –l is the coda of syllable 2.  What do we do with rthf?  After our experience with utensils, let’s start with a small onset and work our way up to as large an onset as we can.  -ful is perfectly fine—lots of English words start with f-.  But -thful is a problem—no English words start with thf-.  -rthful is even worse!  So, even though we tried to maximize the onset, only the f- can be part of syllable 2; -rth will just have to go with syllable 1.  mirth-ful.

(You may be tempted to make up a rule something like this: the combination of consonants in a syllable coda has to be a combination that can end a word.  Alas, this rule frequently does not work.  It is true that, if a combo of consonants can end a word, that combo of consonants can be a coda; but not all codas can end a word.  What would you do with a language where all the word-final consonants get dropped, for example?)
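The Maximize Onset procedure from this section can be sketched in a few lines of Python.  This is only a rough illustration under some loud assumptions: it works on spelling instead of sounds, it treats every run of vowel letters as a single nucleus, and LEGAL_ONSETS is a tiny hand-picked sample of word-initial clusters, not a real inventory of English onsets.  The function name syllabify is hypothetical.

```python
VOWELS = set("aeiou")
# Assumption: a tiny sample of spellings that can begin an English word.
LEGAL_ONSETS = {"c", "f", "m", "s", "t", "sc"}


def syllabify(word, legal_onsets=LEGAL_ONSETS):
    """Divide a word by Maximize Onset, limited to onsets that can start a word."""
    word = word.lower()
    # Group the letters into alternating runs of consonants and vowels.
    runs = []
    for ch in word:
        is_vowel = ch in VOWELS
        if runs and runs[-1][0] == is_vowel:
            runs[-1][1] += ch
        else:
            runs.append([is_vowel, ch])

    syllables, current = [], ""
    for i, (is_vowel, text) in enumerate(runs):
        if is_vowel or i == 0 or i == len(runs) - 1:
            # Nuclei, the word-initial onset, and the word-final coda
            # simply attach to the syllable being built.
            current += text
            continue
        # Medial cluster: find the longest tail that is a legal onset.
        split = len(text)  # fall back to "no onset at all"
        for candidate in range(len(text)):
            if text[candidate:] in legal_onsets:
                split = candidate
                break
        syllables.append(current + text[:split])  # the rest becomes a coda
        current = text[split:]
    syllables.append(current)
    return "-".join(syllables)


for w in ["escort", "utensils", "mirthful"]:
    print(w, "->", syllabify(w))
```

With this sample onset list the three example words come out as e-scort, u-ten-sils, and mirth-ful.  Swapping in a different set of legal onsets is all it takes to try the same procedure on a made-up language.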

Predicting Syllable Division in Unknown and Fantasy Languages
Our Maximize Onset rule is all very well, but so far, the only way we have to know if Language X will allow a certain combo of consonants to be an onset is to know if there are words starting with that combo in Language X.  So how do we divide syllables if we A) don’t know the language that well or B) are in the process of making the language up from scratch?

To PREDICT what combinations of consonants are possible in syllable onsets and codas, we need to know the Sonority Sequencing Principle.  (More hard-to-understand linguistics jargon!  But I’ll explain.)

Sonority is a fancy linguistics word that has to do with how vowel-like a certain sound is.  A more vowel-like sound is more sonorous.  Obviously, the most vowel-like (and therefore most sonorous) sounds are… vowels.  Different consonants have different degrees of vowel-like-ness.  For example, y and w are extremely vowel-like. (Think about it.  In English, y is so vowel-like that kindergartners are taught that the vowels are a, e, i, o, u and sometimes y.)  On the other hand, p, t, and k are not vowel-like at all.

(The /th/ in f/th/h is th as in thatch.  The /th/ in v/th is th as in that.  English used to have two different letters for these two sounds, but neither letter is used in English any more.)

(For an explanation of why certain sounds are classed together in the chart above, look for my next post in this series.  Be aware that some languages are less discerning in terms of sonority than others.  They might class p/t/k/b/d/g all together, or f/th/h/v/th all together, or p/t/k/b/d/g/f/th/h/v/th all together, or m/n/l/r all together… you get the idea.)

(s and z sometimes behave strangely.  They ought to fall right into some of the classes that are already on the chart (f/th/s/h and v/th/z), but they don’t always comply.  When they’re misbehaving, they usually do it at the edges of a syllable.  So s may be the first element of a three-consonant onset where you didn’t expect more than two consonants, or the last element of a three-consonant coda where you didn’t expect more than two consonants.)

But back to the Sonority Sequencing Principle.  The Sonority Sequencing Principle says that sounds should be arranged in an order such that the sounds get more sonorous as they get closer to the nucleus, and get less sonorous as they get farther from the nucleus.

For example, look at the following diagram, which shows the sonority of the sounds in the one-syllable word prank.

The sonority starts low with p, gets higher with r, peaks with the nucleus vowel a, drops to n, and drops even more to k.  The sounds in the syllable that are closer to the nucleus have higher sonority than the sounds that are farther away.
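The same rise-and-fall can be checked mechanically.  The sketch below assumes an eight-step sonority scale that runs from p/t/k at the bottom to the vowels at the top.  The exact ranking is my assumption for illustration (in particular, placing l/r above m/n), and it only knows single letters, so th, c, and other spellings are not covered.  The names sonority_profile and obeys_ssp are hypothetical.

```python
# Assumed scale, least sonorous first.  One letter = one sound in this sketch.
SONORITY_CLASSES = ["ptk", "bdg", "fsh", "vz", "mn", "lr", "yw", "aeiou"]
SONORITY = {
    ch: rank
    for rank, group in enumerate(SONORITY_CLASSES, start=1)
    for ch in group
}


def sonority_profile(syllable):
    """Return the sonority rank of each letter in the syllable."""
    return [SONORITY[ch] for ch in syllable.lower()]


def obeys_ssp(syllable):
    """True if sonority strictly rises to a single peak and then strictly falls."""
    profile = sonority_profile(syllable)
    peak = profile.index(max(profile))
    rising = all(a < b for a, b in zip(profile[:peak], profile[1:peak + 1]))
    falling = all(a > b for a, b in zip(profile[peak:], profile[peak + 1:]))
    return rising and falling


print(sonority_profile("prank"))  # low, higher, peak, lower, lowest
print(obeys_ssp("prank"), obeys_ssp("rpank"), obeys_ssp("prakn"))
```

Because the check is strict, a syllable with two vowel letters in a row would fail it; this version is meant for single-vowel syllables only.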

How does this help us to predict how syllables will divide?  Look at the following sonority diagram for the word pran-king.

To divide the syllables, we want to maximize the onset of the second syllable.  k is less sonorous than i, and can drop into the onset just fine.  But n is more sonorous than k, so it can’t fit into the onset of syllable 2.  We want a one-syllable sonority curve that looks like this:

Not like this or this:

So that’s how you can predict syllable division in unknown languages!

Obviously, there are complications and things that languages do differently… like, can you have two consonants in an onset that have the SAME sonority? In English, no (No be-dbugs here!) but in some other languages, yes.  And what consonants count as having the same sonority, exactly?  You can determine these things for your fantasy language as you wish.
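Here is a hedged Python sketch of sonority-based division, including a switch for the “same sonority” question.  The sonority ranking is an assumed one (single letters only, l/r ranked above m/n), every vowel letter is treated as its own nucleus, and the names divide_by_sonority and allow_equal are invented for this example.

```python
# Assumed scale, least sonorous first.  One letter = one sound in this sketch.
SONORITY_CLASSES = ["ptk", "bdg", "fsh", "vz", "mn", "lr", "yw", "aeiou"]
SONORITY = {
    ch: rank
    for rank, group in enumerate(SONORITY_CLASSES, start=1)
    for ch in group
}
VOWEL_RANK = len(SONORITY_CLASSES)


def divide_by_sonority(word, allow_equal=False):
    """Divide a word by growing each onset leftward while sonority keeps falling."""
    word = word.lower()
    son = [SONORITY[ch] for ch in word]
    nuclei = [i for i, rank in enumerate(son) if rank == VOWEL_RANK]

    pieces, last_cut = [], 0
    for prev, nxt in zip(nuclei, nuclei[1:]):
        start = nxt
        # Pull consonants into the onset as long as each one is less
        # sonorous than its right-hand neighbor (or equal, if allowed).
        while start - 1 > prev:
            left, right = son[start - 1], son[start]
            if left < right or (allow_equal and left == right):
                start -= 1
            else:
                break
        pieces.append(word[last_cut:start])
        last_cut = start
    pieces.append(word[last_cut:])
    return "-".join(pieces)


print(divide_by_sonority("pranking"))
print(divide_by_sonority("bedbugs"))
print(divide_by_sonority("bedbugs", allow_equal=True))
```

Under these assumptions pranking divides as pran-king, and bedbugs divides as bed-bugs unless equal-sonority onsets are switched on, in which case it becomes be-dbugs.  A spelling like eskort comes out as es-kort, because pure sonority does not know about the misbehaving s mentioned earlier; a fuller version would need a special case for it.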

Syllable Structure Rules,
or, the short version of “How to make your language consistent and distinctive.”

In what follows, please note:
C = a consonant
V = a vowel
CC = 2 consonants (and CCC is 3 consonants, and so on)
VV = a long vowel (ee, uu) or a diphthong (au, oi, ay)
An onset that contains more than one consonant (CC, CCC, etc.) is a complex onset.  A coda that contains more than one consonant (CC, CCC, etc.) is a complex coda.

So what is it about syllable structure rules that makes a language distinctive?  Different languages allow for different groups of syllable types, which make the languages look and sound different.

A list of syllable types:

V – Nucleus only.  The a in a-bove and in re-a-li-za-tion.
CV – Simple onset and nucleus only.  The re, li, and za in re-a-li-za-tion.  MOST COMMON syllable type in almost all (if not all) languages.
VC – Nucleus and simple coda only.  The al in al-ge-bra, the im in im-po-ssi-ble. (See how the l is acting as the nucleus of the last syllable in impossible?  Cool, huh?)
CVC – Simple onset, nucleus, and simple coda.  cat, mod, fur, lip.
VV – Heavy nucleus, either a long vowel or a diphthong.  aah!  oy!  ay!  ow!
CVV – Onset and heavy nucleus.  The koi in koi pond.
VVC, CVVC – These are less common in the world’s languages.  English still has them though: oil, boil.
CCV, CCVV – Complex two-C onset plus nucleus.  Some languages (like Biblical Hebrew) do not allow complex onsets.  English does: spa, spay.
VCC, CVCC – With complex two-C coda.  Many languages do not allow complex codas.  English does: ark, park.
VVCC, CVVCC – With heavy nucleus AND complex two-C coda!  Man, this syllable is just getting too complicated!  A few languages will allow it… but I don’t think English does, at least not standard English.
CCVCC – With complex onset and coda.  English is okay with this one, although it’s not terribly common: prank, blurb.
CCVVCC – Not in English, but yes elsewhere.
CCCV, CCCVV, CCCVC, CCCVCC, CCCVCCC – Yep, these exist.  English only manages a three-consonant onset when the first consonant is s (spring is CCCVC); at least some of the others occur in Russian.  (Scary!)
CCCCV, CCCCVV, CCCCVC, CCCCVVC, CCCCVCC, CCCCVCCC – At least some of these exist, although even in the languages where they are possible they are extremely rare.
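If you want to label syllables with these types automatically, a small Python sketch like the following can help.  It is spelling-based, so it is only an approximation: digraphs such as th or ng count as two consonants, and a spelling like spay comes out as CCVC because y is counted as a consonant.  The function names cv_pattern and describe are hypothetical.

```python
import re

# Assumption: only a, e, i, o, u count as vowels, judged by spelling.
VOWELS = set("aeiou")


def cv_pattern(syllable):
    """Turn a syllable's spelling into a C/V skeleton such as 'CCVCC'."""
    return "".join("V" if ch in VOWELS else "C" for ch in syllable.lower())


def describe(syllable):
    """Report the syllable type and whether each part is complex or heavy."""
    pattern = cv_pattern(syllable)
    match = re.fullmatch(r"(C*)(V+)(C*)", pattern)
    if match is None:
        raise ValueError(f"{syllable!r} does not look like a single syllable")
    onset, nucleus, coda = match.groups()
    return {
        "type": pattern,
        "complex_onset": len(onset) > 1,
        "heavy_nucleus": len(nucleus) > 1,
        "complex_coda": len(coda) > 1,
    }


for syl in ["cat", "koi", "boil", "spa", "park", "prank"]:
    print(syl, describe(syl))
```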

All languages have limits on the types of syllables that are allowed in that language.  

For example, to the best of my knowledge Hawaiian only allows V, CV, VV, and CVV syllables.  No complex onsets, and no codas at all—so Hawaiian is easy for almost anyone to pronounce.
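A limit like this is easy to test in code.  The sketch below checks a word that has already been divided into syllables against a set of allowed types, using the four Hawaiian types listed above.  It judges by spelling only, and the names HAWAIIAN_TYPES and disallowed_syllables are made up for illustration.

```python
# Assumption: only a, e, i, o, u count as vowels, judged by spelling.
VOWELS = set("aeiou")
# The four syllable types named in the paragraph above.
HAWAIIAN_TYPES = {"V", "CV", "VV", "CVV"}


def cv_pattern(syllable):
    """Turn a syllable's spelling into a C/V skeleton such as 'CVV'."""
    return "".join("V" if ch in VOWELS else "C" for ch in syllable.lower())


def disallowed_syllables(syllables, allowed_types):
    """Return the syllables whose type is not in the allowed set."""
    return [s for s in syllables if cv_pattern(s) not in allowed_types]


print(disallowed_syllables(["a", "lo", "ha"], HAWAIIAN_TYPES))
print(disallowed_syllables(["prank"], HAWAIIAN_TYPES))
```

An empty list means every syllable fits the rules, so a-lo-ha passes while prank (CCVCC) is flagged.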

Mandarin Chinese is almost as simple in its syllable structure rules.  As far as I know, it allows only V, CV, CVn, VV, CVV, and CVVn syllables.  What are CVn syllables?  This means that Mandarin only allows syllables to have codas if the codas are nasal sounds (m?, n, or ng).

English allows a much greater variety of syllable structures: V, CV, VC, CVC, VV, CVV, VVC, CVVC, CCV, CCVV, VCC, CVCC, CCVCC...  What are the limiting factors?  First, no more than 2 consonants in an onset or a coda unless one of them is s: spring, parks.  Second, you can’t add a complex coda to a heavy nucleus.

Russian allows more than two consonants in onsets and codas.

Biblical Hebrew, as it developed in the two thousand years between the period of the Judges (before 1050 BC or so) and the time of the Tiberian Masoretes who wrote the vowels in (around 800 AD or so), gives us a nice example of how a language’s syllable structure rules can change over time.

“Proto-Hebrew” – All syllables MUST have an onset (so no V or VC syllables).  NO complex codas or complex onsets are allowed (so no CCV or CVCC syllables).  Thus, only CV, CVV, CVVC, and CVC syllables are possible.

Early Hebrew (say around 1000 BC) – All syllables MUST still have an onset.  NO complex onsets are allowed, but complex codas ARE allowed at the end of a word.  Thus, CV, CVV, CVC, CVVC, and CVCC (word-final) syllables are possible.

Late Kingdom Hebrew (say around 750 BC) – Two consonants have gone silent, so not all syllables have an onset anymore, but they are still spelled as if they do. (Fun, right?)  No complex onsets.  Complex codas are okay at the end of a word.  Thus, CV, CVV, CVC, CVCC (word-final), (?)V, (?)VV, (?)VC, and (?)VCC (word-final) syllables are possible.

Tiberian Masoretic Hebrew (around 800 AD) – Those two consonants are still silent.  NO complex onsets OR complex codas, even at the end of a word!  Thus, CV, CVV, CVC, (?)V, (?)VV, and (?)VC syllables are possible.

Modern Hebrew (2016 AD) – Those two consonants are still silent… and some borrowed words have syllables without onsets.  Complex onsets occur in both ordinary and borrowed words.  Complex codas appear only in borrowed words.  Thus, CV, CVV, CVC, CVCC (borrowed), CVVC, CCV, CCVV, CCVC, CCVCC (borrowed), V, VV, VC, VVC, and VCC (borrowed) syllables are possible.
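One way to keep track of changes like these in your own language’s history is to store each stage as a set of syllable types and compare neighbors.  The Python sketch below copies the five lists above as written, with two simplifications that are my own: the (?) syllables are recorded as plain V-initial types, and the “word-final” and “borrowed” restrictions are dropped.  The names STAGES and changes are hypothetical.

```python
# Syllable types per stage, copied from the lists above (qualifiers dropped).
STAGES = {
    "Proto-Hebrew": {"CV", "CVV", "CVVC", "CVC"},
    "Early Hebrew": {"CV", "CVV", "CVC", "CVVC", "CVCC"},
    "Late Kingdom Hebrew": {"CV", "CVV", "CVC", "CVCC", "V", "VV", "VC", "VCC"},
    "Tiberian Masoretic Hebrew": {"CV", "CVV", "CVC", "V", "VV", "VC"},
    "Modern Hebrew": {
        "CV", "CVV", "CVC", "CVCC", "CVVC", "CCV", "CCVV", "CCVC",
        "CCVCC", "V", "VV", "VC", "VVC", "VCC",
    },
}


def changes(earlier, later):
    """Return (gained, lost) syllable types between two named stages."""
    gained = sorted(STAGES[later] - STAGES[earlier])
    lost = sorted(STAGES[earlier] - STAGES[later])
    return gained, lost


names = list(STAGES)  # dicts keep insertion order, so this is chronological
for earlier, later in zip(names, names[1:]):
    gained, lost = changes(earlier, later)
    print(f"{earlier} -> {later}: gained {gained}, lost {lost}")
```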

Syllable Structure Rules for Your Fantasy Language
When you create your fantasy language, choose your syllable structure rules, and make sure your words obey them.

1. Do you have to have an onset, or not?
2. Can you have a coda, or not?
3. Can you have a heavy nucleus, or not?
4. Can you have a complex onset?  If so, how many consonants can it be? Don’t say more than four… unless you are writing a comedy. :)
5. If you can have a coda, how many consonants can it be?  Probably don’t allow more than three.
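
The five questions above map neatly onto a small settings object.  The Python sketch below is one possible way to write your answers down and list every syllable type they allow; the class name SyllableRules and its field names are invented for this example.  It is deliberately simple, so it cannot express finer restrictions such as “complex codas only at the end of a word.”

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class SyllableRules:
    onset_required: bool  # question 1
    coda_allowed: bool    # question 2
    heavy_nucleus: bool   # question 3
    max_onset: int        # question 4: most consonants in an onset
    max_coda: int         # question 5: most consonants in a coda

    def allowed_types(self):
        """List every C/V syllable type these answers permit."""
        min_onset = 1 if self.onset_required else 0
        onsets = ["C" * n for n in range(min_onset, self.max_onset + 1)]
        nuclei = ["V", "VV"] if self.heavy_nucleus else ["V"]
        max_coda = self.max_coda if self.coda_allowed else 0
        codas = ["C" * n for n in range(max_coda + 1)]
        return {o + n + c for o, n, c in product(onsets, nuclei, codas)}


# Answers that reproduce the Hawaiian-style list from earlier in the post.
simple = SyllableRules(onset_required=False, coda_allowed=False,
                       heavy_nucleus=True, max_onset=1, max_coda=0)
print(sorted(simple.allowed_types()))

# Answers that reproduce the "Proto-Hebrew" list.
strict = SyllableRules(onset_required=True, coda_allowed=True,
                       heavy_nucleus=True, max_onset=1, max_coda=1)
print(sorted(strict.allowed_types()))
```

Once you have the set of allowed types, you can check each new word’s syllables against it before adding the word to your dictionary.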

And that’s all!