Annals of Wu

a sinotibetoburman linguistics blog

Name Changes And Challenging Orthographies discussion - orthography

A recent article in the New York Times, "Tribes See Name on Oregon Maps as Being Out of Bounds", addressed efforts by local native groups to replace certain toponyms in the area. Specifically, the term "squaw" was targeted as being offensive. As a solution, the groups provided a number of alternatives that would better reflect their culture as the indigenous peoples. From the article:

“I really didn’t think it would be this hard,” said Teara Farrow Ferman, manager of cultural resource programs for the Confederated Tribes of the Umatilla Indian Reservation. “I didn’t think that we would still be disputing this after so much time.”

The county agreed to change most of the names, but it would not accept the Indian names proposed by the tribes.

The reason given is that they are seen as too difficult to pronounce. "United States Board on Geographic Names… will not accept new ones without a consensus among interested local groups and state and local officials," the article goes on to say.

Officials protested that some of the name changes proposed by Native Americans — like Sáykiptatpa and Nikéemex — were too hard to pronounce, prompting the tribes to create an interactive pronunciation guide.
“Seriously, can you pronounce them?” asked Mr. Britton, the county commissioner. “It’s a safety issue. Someone making a 911 call has to say the location, and the dispatcher has to understand and repeat it to the sheriff.”

Setting aside the fact that people will probably still say "Squaw Lake" when making a panicked 911 call (just like people tended to dial the more familiar 411 instead), I think there's a different issue here and one which is much easier to address. The issue, I believe, is ultimately with orthography. Or maybe racism, but then racism and orthography. I'll stick to orthography for now.

An example I brought up earlier is the proposed name Weelikéecet Creek. In recordings provided by the supporters of the name changes, this sounds to me like /wɛlikætsɪt/. Without trying to dictate orthographies since this is an incredibly sensitive topic, I would like to argue that there must be some middle ground since /wɛlikætsɪt/ is actually not particularly difficult for the average resident. I think instead people are getting thrown off by the unfamiliar orthography.

I don't know if this is a standard orthography for the language, but if this were happening in the not-too-distant past, I have no doubt that would have been rendered as and no one living there today would think twice about the name of their town, having grown up with it being as familiar a word as Chicago or Topeka or Biloxi.

It seems to me, understanding full well my position as an uninformed outsider, that the issue may not be the names themselves as much as it is about how people are seeing them for the first time.

    Confusing Etymologies In Hakka discussion - orthography

    There are some interesting things going on with characters as used by a large number of Taiwanese Hakka speakers. There's a phrase, dá sóng (here rendered in Hoiliuk dialect). It basically means to waste. The usual way this gets written by a lot of speakers these days is 打爽. Note the second character 爽 would normally mean "refreshing" or "pleasurable". What sense, then, does "hit/do refreshing" make as a phrase meaning "to waste"?

    The obvious answer is that 爽 is not the original character for this phrase. In fact, it should be 喪. However in Hakka 喪 is pronounced sòng, while the second syllable in 打喪 is sóng. The tone is wrong. Meanwhile since 爽 in the same dialect is sóng, people have taken to writing the phrase as 打爽.

    I was speaking with a friend of mine, a native speaker, about how there's no real reason not to use 喪 to write the phrase, since a lot of characters have multiple readings, and many varying on tone alone. After some discussion she conceded that her own teacher, a highly respected Hakka scholar in the area, would also agree that 喪 is the way it should be written, and that he laments people's use of 爽.

    Historically, etymologically, semantically, the "correct" character for this phrase is 喪. So what do you do with 本字 when general usage has otherwise abandoned it? My answer is that you should resist, and that the value in having cognates preserved in the orthography is sufficient enough that it's worth pushing for the use of the "correct" character.

    Interesting, too, is the use in the same dialect of the character 冇, rendered mao in Mandarin and borrowed from a Cantonese simplification of 無. This is sort of the opposite side of the same sort of problem discussed above. In Hakka, the word represented by 冇 isn't pronounced anything like 無, and doesn't share any etymological connection. Instead it's pronounced pǎng and means "hollow". It should be apparent that the borrowing is the result of the graphical comparison between 有 and 冇, which in following the same logic is what many people argue is the origin of the glyph in the first place (though that's not actually true; it really is just a simplification of 無).

    An example of the use of pǎng is the phrase 打冇嘴 dá pǎng zhǒi which means to speak about something about which you don't actually know or haven't considered.

    This actually illustrates for me the problem with sòng. That is, the origin of pǎng isn't clear in the written form since the character doesn't reflect its origins. It's not a simple task to track down how it originated or what the cognate might be, assuming there is one. It might also be the result of some substratum. At this point I don't know.

    I feel pretty strongly that there is real value in having this stuff encoded in the orthography. "But that's prescriptivism" you might say. You're right. But prescriptivism isn't really the problem with prescriptivists, now is it?

      Revisiting Tianweiban discussion - general

      Today I was having a conversation with a friend about placenames in Southern China. I brought up Sawndip and the occasional non-Mandarin characters that show up in placenames coming from languages like Zhuang, with 岜 bya being one of the more common ones as well as one of the few encoded in Unicode.
      In our discussion I did a quick Google search to bring up some visual examples and came across an old post on Language Hat from 2007 which quotes from an AP article.
      Quoting a local resident speaking to Xinhua:
      The character here described is 湴, likely pronounced bàn in MSM. The interesting thing is that the problem which the villagers were facing, and the reason anyone was writing about this at all in 2007, was that people couldn't type the character 湴 for things like legal documents. The reason? The PRC is using an outdated character encoding standard, GB 2312. Newer standards such as GB 18030 or Unicode do support the character. It's not even an issue in the style of Ma Cheng (馬馬馬馬) where even modern systems have problems, but rather just that the systems in use by the State are outdated.
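To make the encoding point concrete, here is a minimal Python sketch that asks whether a character can be represented in a given encoding. The helper name `can_encode` is my own, and the check relies on Python's built-in `gb2312` and `gb18030` codecs as stand-ins for whatever the government systems were actually running, which is an assumption.

```python
def can_encode(text: str, codec: str) -> bool:
    """Return True if every character in `text` can be encoded with `codec`."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    # Compare the older standard with the newer ones mentioned above.
    for codec in ("gb2312", "gb18030", "utf-8"):
        print("湴", codec, can_encode("湴", codec))
```

If the account above is right, the first line should report False and the other two True. GB 18030 covers all of Unicode by design, so the last two results are the safe ones to rely on.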
      I realise I'm a few years late in writing about this. I vaguely recall being aware of it when it was happening, but since it came up and took a fair amount of time from the discussion today I thought it worth revisiting.

        A Possible Tonal Connection To Shanghainese Voiced Implosive Onsets discussion - tone

        Zhengzhang Shangfang has previously written about voiced implosives in Shanghainese and Southern Wú. The very short summary is that in some Wú dialects implosives are found in place of the voiceless unaspirated onsets. So instead of [pʰ] [p] and [b], we see [pʰ] [ɓ] and [b]. They do not have corresponding vowel quality changes, so there is still a clear difference between /ɓa/ and /ba/ where the former lacks the breathy voicing found in the latter.

        An oft-cited reason for this is that there must be some substratum, Tai-Kadai or otherwise, which had implosive obstruents and that's why they're showing up in Modern Wú.

        This never sat well with me, and I know I'm not the only one. I recall talking to a respected scholar about this at one point and was answered with a comment along the lines of "substrates are what scholars point to when they really just don't know the answer".

        There is one other possibility. The following is from a piece on phonemic tone but may apply here:

        We can speculate both on articulatory and perceptual grounds [for a particular tonal phenomenon]. First, a possible explanation is that the voiced consonants went through an implosive stage (b > ɓ) before merging with the voiceless series. Since implosives have a tendency to raise the F0 of the following vowel, it would not be surprising to find lower tonal reflexes on vowels following historically voiceless consonants.

        The significant part is this: Implosives do not do anything significant to fundamental frequency if developing from previously voiceless stops. A change from [b] to [ɓ] would result in a raising of F0, but [p] to [ɓ] wouldn't. No contrast would be lost by this change. It is conceivable that there are phonological motivations for this development. Rather than grasping at the substratum cause, which doesn't itself actually address the issue but rather only gives a convenient "hey look over there", there may be room for analysis on phonological grounds taking tone into consideration.

        I'm not offering that here. But it's not a bad possibility for some future research.
        1. Jean-Marie Hombert. Consonant Types, Vowel Quality, and Tone. In Fromkin, Victoria: Tone. 1978

        Languages, Dialects And Varieties discussion - general

        Very often, people ask what the difference is between a language and a dialect, the idea being that there is some scientifically justified line in the sand. Linguists spend all this time studying language, so clearly they would have an answer.

        The problem is that there isn't one. There simply is no scientifically objective difference between a language and a dialect on any linguistic grounds. The distinction is entirely extra-linguistic, and is instead based on sociopolitical factors. That said, a military force is not a necessary or sufficient condition, and even those who are fans of quoting Weinreich would agree that some languages lack both but still get to be called languages.

        A common follow-up question is about what linguists use. If the terms are based on extralinguistic determiners, then how do linguists talk about speech varieties? One way is simply to call them varieties, since that's a term which lacks the baggage of the other two options. I know a number of people who only use this term. Alternatively, linguists use the terms language and dialect, but with the full understanding that these are sociological labels and not scientific ones and are flexible regarding their referents. Of course, sometimes they are used with some intent, an effort to bring the listener to a certain frame of thought. If I call Cantonese a language, it's possible I mean something very specific by doing so. But even then, it's extra-linguistic. It might be a matter of minority language rights or it might be a point being made about distance to related languages. But in any case, it's not a scientifically meaningful term with static boundaries.

        There's another common response, which is to bring up mutual intelligibility. This is also insufficient for a couple of reasons. To begin with, mutual intelligibility doesn't take into account dialect continua. I can provide two dialects that everyone would agree are Mandarin but which to native speakers would not be mutually intelligible. A person from rural Nantong would need to accommodate pretty significantly to be understood in Beijing, to the point of speaking a whole different variety altogether. Another problem with mutual intelligibility is that it's very difficult to test objectively. I can find a number of Beijing Mandarin speakers who have no experience with Cantonese. It will be harder to find Cantonese speakers without any exposure to Mandarin. And then finally, you cannot account for motivation when determining mutual intelligibility. The two speakers do not have the same sorts of sociolinguistic pressures to understand the other's variety, especially when one speaks something quite close to the prestige variety.

        So it's problematic, as we've established. That's not to say there aren't useful solutions, so long as we can keep in mind that they're simply terms of convention and not of scientific description. The following is a quote from Ferguson & Gumperz regarding the difference between languages, dialects and varieties. It's a set of definitions I find myself often repeating for how simply and intuitively they are defined.
        A variety is any body of human speech patterns which is sufficiently homogeneous to be analyzed by available techniques of synchronic description and which has a sufficiently large repertory with broad enough semantic scope to function in all normal contexts of communication.

        A language consists of all varieties which share a single super-posed variety having substantial similarity in phonology and grammar with the included varieties of which are mutually intelligible or are connected by a series of mutually intelligible varieties.

        A dialect is any set of one or more varieties of a language which share at least one feature or combination of features setting them apart from other varieties of the language, and which may appropriately be treated as a unit on linguistic or non-linguistic grounds.
        Simple as that. I particularly like how it takes into account dialect continua. It's inclusive enough to be fairly unobjectionable, but simple enough to not get bogged down in the details or exceptions to the definitions.
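The "connected by a series of mutually intelligible varieties" clause is what lets the definition cope with dialect continua, and it can be sketched as a grouping problem. The following is only an illustration: the variety labels A to E and the intelligibility judgments are made up, and the sketch ignores the part of the definition about a shared super-posed variety.

```python
from collections import defaultdict


def group_by_chains(varieties, intelligible_pairs):
    """Group varieties linked directly or through a chain of intelligible pairs."""
    parent = {v: v for v in varieties}

    def find(v):
        # Follow parent links to the representative of v's group.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in intelligible_pairs:
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for v in varieties:
        groups[find(v)].add(v)
    return sorted(sorted(group) for group in groups.values())


if __name__ == "__main__":
    # Hypothetical continuum: A~B and B~C are mutually intelligible, A~C is not.
    varieties = ["A", "B", "C", "D", "E"]
    pairs = [("A", "B"), ("B", "C"), ("D", "E")]
    print(group_by_chains(varieties, pairs))
```

A and C end up in the same group even though they are never paired directly, which is the chain the definition allows for.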
        1. Ferguson, Charles A. & Gumperz, John J., Linguistic Diversity in South Asia, International Journal of American Linguistics, 1960

        Fluffy Benches discussion - orthography

        "Sofa" as 沙發 is one of the widely cited examples of words in Mandarin which came from a foreign language by way of Shanghainese, still pronounced /so.fa/. I'd made a comment on this just in passing while speaking to a Hakka friend of mine, only to quickly be reminded that Hakka does not use the word 沙發, despite being widespread elsewhere.

        In Níngbō a hundred years ago, you might have been likely to hear 春凳 instead. But in Hakka, a sofa is referred to as 肨凳. If you're unsure of that first character, it's a variant of 胖. Except that in Hakka it's not. Every non-Hakka character dictionary I've checked has 肨 listed as a variant of 胖, and thus meaning "fat" and pronounced the same. Hakka meanwhile has split these. 肨 is pronounced pong55 – with a meaning of fluffy (or swollen) – while 胖 is pang55 – meaning fat. You can't call a sofa 胖凳, and you won't refer to a fat kid as being 肨. There's an additional meaning to 肨, pang24, which has the meaning "scent".

        As far as I can gather without spending a few hours in the library, the two words 肨 and 胖 may originally have just been character variants, and then through Mandarin influence they became split: 胖, the more commonly used word up north, took on the more focused meaning of "fat" as opposed to "fluffy" or "swollen", and carried a more Mandarin-like pronunciation with it. 肨 meanwhile was left to its original pronunciation with the meaning of "fat but not like that guy over there is fat".

        There's an alternative possibility, which is that 肨凳 could be a fossil and the orthography is just meant to represent the different pronunciation. If that's the case, 肨凳 isn't the only case. Other words listed also use 肨: pong55xien55sam24 in Siyan dialect and pong11sa53sam53 in Hoiliuk dialect both mean "sweater", a.k.a. "fluffy thread shirt".

          Nonstandard IPA – φ And η discussion - phonology

          In general, I'm a big fan of not being too strict about what gets counted in phonetic/phonemic transcriptions. My use of IPA is conditioned by years of Siniticists not doing much better in terms of reaching the standard. You will find /ᴇ/ and /ɿ/ in much of what I've written, and I'll surely continue to use them. Well, /ᴇ/ at least.

          Today I found something confusing. I was reading through 绍兴方言研究 edited by 寿永明 and came across two glyphs that I've never seen in any variation of IPA before today.

          First is Greek φ, not to be confused with IPA ɸ which is also used in transcriptions. In the typeface of the book, they are visually quite distinct. I have no idea what sound this φ is representing. I first thought it might be /ɤ/ since I didn't see that used elsewhere at first, but I quickly found instances of /ɤ/ on the same page as \φ\ so I know it's not that.

          The second glyph is also from the Greek section of the Unicode tables: \η\. This is also visually quite unlike /ŋ/ since the IPA velar nasal has a left hook, while \η\ does not. Just going by Greek, it might be /ɛ/, which I don't see elsewhere. That's my best guess based on the environment it's showing up in. This seems the most likely.

          Phi on the other hand seems to denote breathy voicing. What would normally be written /ɦm/ in other sources corresponds with instances of \φ\ here. That makes some sense. <A> is used in place of <ᴀ> here as well, so my best guess is that the typesetter didn't have proper access to IPA glyphs and had to use alternatives in their place.
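For anyone who runs into the same thing in a digitised source, the Greek stand-ins can be caught mechanically, since Greek φ and η sit at different Unicode code points from IPA ɸ and ŋ. Below is a minimal sketch using Python's standard `unicodedata` module. The function name `suspicious_greek` is my own, and the exemption for β, θ and χ (which the IPA does borrow from Greek) is an assumption about what counts as legitimate.

```python
import unicodedata

# β, θ and χ are Greek-block characters that the IPA officially uses.
IPA_GREEK = {"\u03b2", "\u03b8", "\u03c7"}


def suspicious_greek(transcription: str) -> list[str]:
    """List Greek letters in a transcription that are not standard IPA symbols."""
    found = []
    for ch in transcription:
        if ch in IPA_GREEK:
            continue
        if unicodedata.name(ch, "").startswith("GREEK"):
            found.append(ch)
    return found


if __name__ == "__main__":
    # U+03C6 Greek phi, U+03B7 Greek eta, U+014B IPA eng, U+0278 IPA phi.
    sample = "\u0266m \u03c6 \u03b7 \u014b \u0278"
    for ch in suspicious_greek(sample):
        print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

This only tells you that a non-IPA glyph is present. Working out what sound it was meant to stand for still takes the kind of detective work described above.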
          1. 寿永明. 绍兴方言研究. 2005. 三联书店上海分店. 上海

          Seeking Jingjiang's River-crossing Past discussion

          Some time ago I was reading about Jingjiang 靖江市, the city on the northern bank of the Yangtze, and how the town was once on the southern bank before a shift in the river's course to the south of the town put it where it is today. If you look at modern satellite photography of the area you can see what the contours of the river used to be.

          Thing is, I can't for the life of me remember where I read it, and an afternoon of Google- and Baidu-fu turned up nothing of value.

          I know I've read about this, and I remember at the time thinking it was a trustworthy source. Something like YR Chao's own writings. And now I can't find it.

          If anyone has an idea of where I can find this information, I'd be ever grateful. Otherwise I'm spending the day in the library tomorrow and I'll have to see what I can dig up there.

            Plotting Shanghainese Tone Contours In Praat discussion - tone

            As mentioned in the last post, the excellent Shanghai Dialect: An Introduction to Speaking the Contemporary Language by Lance Eccles gives a good introduction to the language.

            Last week my buddy Qi and I sat down to record the tone contours as given in the book, to set up a comparison between the contours as represented in the book and the same words plotted in Praat.

            Altogether there are 5 groups of phrases given as examples of tone contours, each with a monosyllabic word, a bisyllabic word and a trisyllabic one to show spreading of the initial syllable's tone over the whole word. In the images below, I've grouped the phrases by initial syllable.

            Falling tone

            For falling tone words, the examples given were fi (fly), fici (airplane), and ficizang (airport). The following graph shows those words in that order. You can ignore the extra high mark in the middle word. This is a mistake in Praat where the formants were too weak so the line was drawn in the wrong place.

            Middle tone

            Examples for mid-tone words are given for both checked and non-checked tone initial syllables. For non-checked, the examples are sa (what), saning (who) and sameqzï (also "what"). Again the following shows those words in that order.

            Checked examples are iq (one), iqti (a bit) and iqngenge (also "a bit"). Again, you can ignore the small anomaly at the end of the second word.

            For mid-tone words of 2 or more syllables, the pattern is the same for checked and non-checked tones with the exception of syllable length.

            Low tone

            Low tone is also split between checked and non-checked syllable initials. Examples are mwo (horse), mwozâng (immediately) and mwotongke (washroom) for non-checked.

            For checked tones, example words are liq (stand), liqchi (stand up) and liqchile (also "stand up").

            The isolated utterances shown on Praat are not identical to Eccles' but are certainly close enough to be of substantial value to the Shanghainese learner.

              Sometimes 入声 Isn't 入声 discussion

              I've been working with some data on Changzhou Wu these days. It's interesting because aside from the merger of 上 tones into one and the redistribution of some shang tones, Changzhou preserves the rest of the 8 tones*. This is true for most Northern Wu dialects, including some Pudong varieties of Shanghainese, which has otherwise merged itself into 5 tones which are mostly disregarded anyway.

              In an oversimplification of the relationship between dialects, we can pretty much say that for two dialects of two different Sinitic languages (ignoring Mandarin and maybe Min as well for different but comparably significant reasons) which preserve the two registers of the four tones, what is a yang ru tone in one dialect will be a yang ru in the other. What's more, an entering tone will end in p, t or k in Cantonese, p, l or k in Korean and -ʔ in Shanghainese or Changzhou dialect.

              Except when it doesn't. It's an oversimplification because language contact is a thing, and words get borrowed from neighbouring dialects of dissimilar languages, thus /ŋ/ is quickly becoming /ʋu/ or /wu/ along the shores of the Yangtze.

              Two cases have come up in my recent work that show this, but in a somewhat baffling way.

              1) 昨 zuó is jok3 in Cantonese, 작 jak in Korean, tạc in Vietnamese, and ought to be zɔʔ8 in Changzhou, but instead it's zo2, yang ping, corresponding to Mandarin's zuó, also yang ping.

              2) 幕 mù is mok6 in Cantonese, 막 mak in Korean, mạc in Vietnamese, and I'd expect it to be mɔʔ8 in Changzhou but actually it's mɤʊ6, yang qu, which also corresponds to the tone of the syllable in Mandarin, yin and yang having merged into what is now Mandarin's fourth tone.

              I don't have an answer, except to speculate that it is exactly what I mentioned before: These have been borrowed from across the River. The borrowing preserved the tone, and in the case of the qu sheng, assigned yin/yang based on voicing. I hope this is what happened, because it's downright fascinating if that's the case.
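The expectation being violated here is simple enough to check mechanically across a larger data set. What follows is a rough sketch, not a tool I've actually run over the Changzhou data. The forms are the two cases given above, the function names are my own, and the coda test is a crude heuristic that treats a final p, t, k, c, l or ʔ as the mark of an entering tone (so it would miss Vietnamese -ch, for one).

```python
# Forms for the two cases discussed above; the key names are illustrative.
DATA = {
    "昨": {"cantonese": "jok3", "korean": "jak", "vietnamese": "tạc", "changzhou": "zo2"},
    "幕": {"cantonese": "mok6", "korean": "mak", "vietnamese": "mạc", "changzhou": "mɤʊ6"},
}

ENTERING_CODAS = ("p", "t", "k", "c", "l", "ʔ")


def looks_entering(form: str) -> bool:
    """Crude check: ignore trailing tone digits, then look for a stop-like coda."""
    return form.rstrip("0123456789").endswith(ENTERING_CODAS)


def irregular(data: dict, target: str = "changzhou") -> list[str]:
    """Characters where every other variety has an entering coda but `target` lacks one."""
    flagged = []
    for char, forms in data.items():
        others = [form for lang, form in forms.items() if lang != target]
        if all(looks_entering(f) for f in others) and not looks_entering(forms[target]):
            flagged.append(char)
    return flagged


if __name__ == "__main__":
    print(irregular(DATA))
```

Anything this flags is a candidate for borrowing rather than proof of it. The expected native forms, zɔʔ8 and mɔʔ8, would pass the check.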

              If anyone has some insight into this I'd love to hear it.

              - - -
              * more on shang distribution in a future post

                Every Dialect Is A Creole discussion

                It’s enough to make you pull your hair out. You’re looking for the pronunciation of a single character which should not be a 破音词. It’s a simple one with a single meaning. It’s the character 多, this time.

                You pull out your handy dictionary and check the index, which tells you the entry you want is on page 248… and 290. That’s ok though; lots of entries are duplicated since the dictionary is organised by category, not by stroke or pronunciation.

                Flipping to page 248 you find /tu/. Sounds right. Checking page 290 to be sure, you find… /tɑ/. Hmm. The note says the latter is 代词, and that the reading has been held over from a much earlier pronunciation. You’ve just added a layer. Specifically, you can not count on being able to convert 多 to any transcription without knowing the context and usage. That means your parsing has to be that much more on-the-ball. Or maybe it’s worse than that, and your entire understanding of the situation is off.

                It’s easy when dealing with dialects to get frustrated. It’s especially easy if you have any expectation of things being systematic. To summarise a pretty clear expert on the topic, “every dialect in China is a creole”. It’s not that Spanish and Italian evolved from Latin but on different routes. It’s more like, that happened, but with lots of borrowing from French and Arabic, and from each other in not-so-predictable ways along the way. So 五 is /ŋ/ until /ʋʊ/ is borrowed from neighbouring Mandarin dialects and then /wu/ is borrowed a little bit later.

                Language contact has always been rampant, and things like Hangzhou dialect, with its substantial influence from Song immigrants, are not so much an exception as a more obvious example of the rule.

                It’s not enough to apply sound change rules to Mandarin and expect to get Wu, or even to get an interesting dialect of Mandarin (连云港话 anyone?). Since pretty much all digital setups are based on Mandarin, it pretty much means you have to start from scratch to make a system that’s natively comfortable with Wu, knowing when character X is pronounced Y and when Z, and it’s not going to agree with Mandarin.

                It is frustrating. And it’s time consuming. But it’s the reality. Lots of the work has been done. The only thing that hasn’t is getting it all online in a way that it can be combined and utilised in the best way possible.

                  Preservation Of Entering Tones discussion

                  I've made the somewhat controversial comment before that living in Korea, after learning Mandarin and becoming familiar with Wu, a lot of spoken Korean was much more accessible to me than had I not worked with Wu. The specific example I'd given at the time was actually Cantonese, and how my friend who grew up speaking Cantonese to friends and Mandarin to family had little trouble making sense of spoken Korean in the earliest stages of her first semester in Seoul as a language student.

                  I made this argument based on cognates and the fact that, while producing Korean grammar is incredibly complex for the speaker, day to day conversations between casual acquaintances follow more or less the same pattern. The reason Wu proved useful has to do with the large number of Sinokorean words, the pronunciation of which was borrowed at a time when the entering tone (入聲) was still important. Now of course it's gone from Mandarin, though easily uncovered in Southern dialects.

                  It's not flawless, but a lot of words that were once entering tones now have consonants in the syllable final position. 立 and 李 are both family names, and while they're both "Li" in Mandarin, in Korean the first is Yip and the second is Yi. In Mandarin 立 was reassigned but its origins as 入聲 remain in Korean. For this reason Sinokorean pronunciation has been useful in various reconstructions of Middle Chinese phonology.

                  Shanghainese, despite being far more tonally stripped down than other Wu dialects (Changzhou still has all 8 tones), has managed to preserve all the entering tones, both yin and yang. So 立 and 李 are /liɪʔ/ and /li/.

                  I've once again been hard at work organising the Phonemica database and implementing some features, one of which has to do with how we handle tones in fangyan. I had a handful of characters that were originally entering tones but now (in Mandarin) are not. I thought I'd check them against Korean and Shanghainese and just see how they held up.

                  For reference, the four tones of Mandarin are 阴平, 阳平, 上声, 去声. The pinyin tone marks refer to those, in order.

                  一 七 乐 勿 日 发 白 百 舌 色 节 约
                  一 七 樂 勿 日 發 白 百 舌 色 節 約
                  yī qī lè wù rì fā bái bǎi shé sè jié yuē
                  일 칠 락 물 일 발 백 백 설 색 절 약
                  iɪˀ ʨʰiɪˀ ɦiɑˀ vəˀ ɲiɪˀ fɑˀ bɑˀ pɑˀ zəˀ səˀ ʨiɪˀ iɑˀ

                  Korean 白 百 色 約 and 樂 all have a /k/ ending, while the rest have what we'll call /l/. In Shanghainese, every single one ends in /ʔ/. I actually checked about 500 entering tone characters against my phonetic corpus, and almost all of them checked out.
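The Korean half of this check can be automated, because a precomposed Hangul syllable encodes its final consonant arithmetically: its offset from U+AC00, taken modulo 28, gives the index of the final. Here is a minimal sketch over the twelve characters above. The function name and the small `FINALS` table (covering only the finals relevant here) are my own, and this is not the code behind the 500-character check.

```python
import unicodedata

HANZI = "一 七 樂 勿 日 發 白 百 舌 色 節 約".split()
KOREAN = "일 칠 락 물 일 발 백 백 설 색 절 약".split()
SHANGHAINESE = "iɪˀ ʨʰiɪˀ ɦiɑˀ vəˀ ɲiɪˀ fɑˀ bɑˀ pɑˀ zəˀ səˀ ʨiɪˀ iɑˀ".split()

# Final-consonant indices in the Unicode Hangul syllable block (only those needed here).
FINALS = {0: "", 1: "k", 8: "l", 17: "p"}


def hangul_final(syllable: str) -> str:
    """Return a romanised final for one Hangul syllable ('?' if not in FINALS)."""
    composed = unicodedata.normalize("NFC", syllable)
    if len(composed) != 1 or not 0xAC00 <= ord(composed) <= 0xD7A3:
        raise ValueError(f"not a single Hangul syllable: {syllable!r}")
    return FINALS.get((ord(composed) - 0xAC00) % 28, "?")


if __name__ == "__main__":
    for hanzi, korean, wu in zip(HANZI, KOREAN, SHANGHAINESE):
        print(hanzi, korean, hangul_final(korean), wu, wu.endswith("ˀ"))
```

Run over these twelve, the /k/ finals come out on 樂 白 百 色 約 and /l/ on the rest, matching the description above.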

                    Shanghainese Pitch Contours discussion

                    As requested, here are contours of different sentences with samples by a native speaker. This sentence and the corresponding audio are from Tatoeba.

                    As is always the case, the generalisations dictating what is expected aren't always spot on, and there's a lot of room for variation based on mood or the speaker, regional factors, as well as just the possibility for individual idiolects. This is all one speaker and does not necessarily represent all Shanghainese utterances or speakers.

                    Let's look at the example.

                    gəˀ ʦəˀ ʦɔ ɕiã ʨi ŋu vəˀ huø ɕi / 普通话:我不喜欢这只照相机

                    And even though I said you'd need to click through, for this one since I've split it into its two parts, you can just listen to the parts here:

                    I'm a little worried here that I am trying to fit the reality into the generalisations, but I have to trust Qian Nairong on the validity of the sandhi rules. There are a few steps to dividing this up. First, into [搿只照相机] and [我勿欢喜], which is really just dividing the sentence along the O,SV pattern that we find in Mandarin (but often in Wu). However the contours we'd expect in that case are [][], which doesn't really come close to matching what we hear in the sample. Looking at just the first half then, we need to further divide it as [搿只[照相机]], giving us [11.23[33.55.21]] which is a lot more similar to the pitch contour of the recording.

                    The second part, 我勿欢喜 also needs further division. 欢喜 by itself should be [55.31], consistent with the recording. And what we hear sounds like what you'd expect with [我勿][欢喜], [55.31][55.31].

                    Assuming this sentence is typical and consistent with the contour rules, the phrasing we should expect is [搿只][照相机],[我勿][欢喜].

                    Again, it's entirely possible that this isn't correct, and/or that this speaker's idiolect has some free variation that isn't accounted for in the contour generalisations. So take it all with a grain of salt.

                    In a coming post I'll look at specific examples from Qian Nairong, along with his explanation of each.

                    edit: reworded some bits for clarity.

                      Understanding Tone Sandhi In Shanghainese discussion

                      Unlike Mandarin or Cantonese, spoken Shanghainese tonality operates as a pitch accent system similar to Korean or Japanese. However this does not mean that syllables in Shanghainese do not have tones. They do exist in the traditional sense, and we’ll address their importance in a moment.

                      The thing we have to consider when addressing tones is whether we’re going to be looking at them in terms that are simple and easy to understand and thus immediately useful, or in terms that offer a much more comprehensive but less intuitive understanding of the rules that determine how they manifest in the language of native speakers. In this case we’ll do both, starting with a more simple way of thinking about tone in Shanghainese.

                      There are basically three different contours that you’ll find in Shanghainese phrases. I say basically three because even though you will find some variation, it is essentially minimal and at least for now can be ignored. The contours work across phrases of 2 to 5 syllables and work out to be basically HLL, LHL and LHH, with L as low tone and H as high. We can add a middle tone for longer phrases, thus creating HMML, LHML, LHMM. Five-syllable phrases follow the same pattern by duplicating a middle tone. A phrase/word like bicycle 脚踏车 would be an example of LHL. You may be asking how we know that it’s LHL and not HLL. And that’s the right question to ask.

                      Tones as a speaker of Mandarin or Cantonese would think of them are significant in Shanghainese. When syllables are isolated, you may find them to be as relevant as in any other Sinitic language; each character has a tone and it is always that tone when in isolation. 老 is lǎo across the board. But, as a speaker of Mandarin, you know that 老 isn’t always lǎo, specifically when paired with another 3rd tone, such as in the case of 老鼠. In this case tone sandhi rules come into play, making it change from lǎo shǔ to láo shǔ. In Shanghainese, it’s all a lot easier. Instead of thinking of syllable-to-syllable tone sandhi, think of it as phrasal tone sandhi. The tones show up when a syllable or character is isolated, but when it’s in the middle of a sentence, the tone doesn’t matter. That’s because in Shanghainese tones only really matter at the beginning of phrases, in phrase-initial position, where they determine the contour of the rest of the phrase. So with our bicycle example, 脚踏车 would be one phrase, and since 脚 is what we’ll call tone #8, we can look at the rules for determining the phrasal contour and know that it’s LHL (or MHL if we want to get specific).

                      But before we get to those rules, let’s look at the tones in isolation. The following are the five Shanghainese tones, as well as some examples of characters that have the corresponding contours. The following tables are taken from 《上海話大辭典, 辭海板》 by Qian Nairong 錢乃榮 et al, published in 2008.

                      Tone   Number   Contour   Tone letters   Examples
                      阴平   1        55        ˥˥             刀丁姑风江天
                      阴去   5        334       ˧˧˦            岛到顶订古故
                      阳去   6        113       ˩˩˧            桃导道墙象匠
                      阴入   7        55        ˥˥             雀削滴踢足笔
                      阳入   8        12        ˩˨             嚼笛局读食合

                      The tones have been numbered according to the traditional system, despite 3 of these traditional tones having been lost in Shanghainese. That’s why in the previous example of 脚 I said it was tone #8, even though on the list it’s the fifth tone listed.
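For anyone who wants to play with the table, here is a small Python sketch that stores the five isolation tones under their traditional numbers. The contours and example characters are copied from the table above; the names `TONES` and `tone_of` are just illustrative, and the lookup only knows the handful of example characters listed there.

```python
# Isolation tones keyed by traditional tone number.
# Contours are Chao tone digits (1 = lowest pitch, 5 = highest).
TONES = {
    1: ("阴平", "55", "刀丁姑风江天"),
    5: ("阴去", "334", "岛到顶订古故"),
    6: ("阳去", "113", "桃导道墙象匠"),
    7: ("阴入", "55", "雀削滴踢足笔"),
    8: ("阳入", "12", "嚼笛局读食合"),
}


def tone_of(char: str):
    """Return the traditional tone number of an example character, or None."""
    for number, (_name, _contour, examples) in TONES.items():
        if char in examples:
            return number
    return None


print(tone_of("刀"), tone_of("笛"))  # traditional numbers, not list positions
```

Keying by the traditional number rather than by position in the list keeps the gaps left by the lost tones visible.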

                      Now that we know the basic isolated tones, we can look at the rules to determine phrasal contours. The following is a table of phrases containing between 2 and 5 syllables. The number in the first column corresponds to the tone number of the first syllable in the phrase, which in turn corresponds to the previous table.


                      Why only 2-5 syllables? Because phrases aren’t sentences. They’re more like conceptual units within a sentence. We could say something like this:
                      “The other day I went to the store but they’re closed until tomorrow because of the national holiday.”
                      However this would be quite cumbersome in any Sinitic language and, more importantly, the concepts of the sentence are easily broken down. So instead we could think of this sentence more along the lines of this:
                      “The other day, I went to the store, but, because of the national holiday, they’re closed until tomorrow.”
                      I’ve added more commas than we’d normally see in English in order to more clearly distinguish what might qualify as a phrase in our 2-5 syllable rule set.

                      In keeping with the bicycle example, let’s assume the speaker is talking about a small bicycle. Now we have a 4-syllable phrase beginning with 小, a 阳去 word with a contour of 113. According to the contour rules, we’d expect 小脚踏车 to have a contour of 22+55+33+31 with the stress on the second syllable. Without 小, we’d see 33+55+31.
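The lookup described here can be sketched in a few lines of Python: the tone number of the first syllable plus the phrase length picks the contour. Only the two contours quoted in this post are filled in, so this is a partial illustration and not the full rule table; the names `PHRASE_CONTOURS` and `phrase_contour` are hypothetical, and one character is assumed to equal one syllable.

```python
# Keys: (tone number of the first syllable, number of syllables).
# Only the two entries quoted in the post are included.
PHRASE_CONTOURS = {
    (6, 4): ["22", "55", "33", "31"],  # 小脚踏车, as given in the post
    (8, 3): ["33", "55", "31"],        # 脚踏车, as given in the post
}


def phrase_contour(initial_tone: int, phrase: str):
    """Return the contour for a phrase, or None if the entry is not known.

    Assumes one character per syllable, so len(phrase) is the syllable count.
    """
    return PHRASE_CONTOURS.get((initial_tone, len(phrase)))


print("+".join(phrase_contour(6, "小脚踏车")))
print("+".join(phrase_contour(8, "脚踏车")))
```

Note that only the first syllable's tone is passed in; the tones of the remaining syllables play no part in the lookup, which is the whole point of the phrasal approach.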

                      So far this is fairly straightforward. We’re ignoring the underlined numbers on tones 7 and 8 for now, but we’ll get to them in a little bit. Also, it’s important to remember that in this system phrases are not equivalent to sentences. A pause between phrases would initiate a new pitch contour based on the syllable immediately following the pause. Now that we’ve covered the basic contours, we’ll look at how things can get a little more complicated in the next post.

                        Thirteen O'clock discussion

                        My friend Jason brought this up at the chit-chat at Xindanwei yesterday, having in turn heard it from someone else. It seems the insult 十三点, common in Mandarin, is originally from Shanghainese. In Shanghainese it's said zəˀ sɛ ti, pinyin "se sei di". Sounds a lot like English "society," which, as Jason brought up, is no accident. From some BBS somewhere:

                        Society ,由這個詞演变而来。開埠之初的上海,傳統的上海女人是看不慣那些在交際界(society) 混的女人。洋泾浜英語把這些女人混迹的地方稱為“society”。十三點由此也就慢慢地變成了罵女人的專用詞。往後,上海人就漸漸地淡忘了十三點的本來意思,會把十三與點分開,簡化地罵:“十三伐啦?”幹脆省略去了“點”。在今天,十三作為一個專門人的名詞,已經遠遠離開了它的原來的本意。罵誰都可以用“十三點”。

                        Long story short, in English, like in Mandarin, calling a woman a "society" girl was a way of calling her a prostitute. This carried over into the speech of the Shanghainese during the great foreign adventurer infestation of the 30s. 十三点 was just a convenient way of writing it in the Shanghai dialect. Eventually the original meaning was lost, though not the insulting nature. Now it's common in all Wu dialects, and can be found in Mandarin as well, though certainly less frequently.

                        This explanation may just be folk etymology, and the actual origin of the phrase isn't clear. There's still a fair amount of debate on this.

                          The Invention Of Fiào 覅 discussion - history

                          This is the text of an article from the 12th May 2010 edition of 新民晚报社区版, a free newspaper here in Shanghai. The article talks about some dialectal characters and ends with a quick history of the creation of 覅, the character corresponding to the Wu equivalent of “不要”. Translations are my own and approximate at best. If you read Mandarin I strongly suggest you stick to the 漢字 text.



                           With the advance of the internet, language has become more colourful. We're seeing the outpouring of new words and new characters, many of which are meant as shortened forms¹. "Biao" 表 is one such playful example, intended to express "bu yao" 不要. In the past, Chinese character sounds were given in books using fanqie character pairs, where a first character gave the initial consonant of the syllable and a second gave the rest², thus providing the reading for the original character. If you use the two characters "bu yao" in this way (since "yao" is a -iao ending), the resulting sound is "biao".


                           Of course, here "biao" is really a dialectal word to express 不要 (bu yao). The place where this is used is not far from here; it's just Hangzhou. Hangzhou natives never say "bu yao wan", "bu yao chi", but rather "biao wan" and "biao chi".


                           The character 表 is here only half-jokingly repurposed. It's only a phonetic representation, not an ideographic one. But there is a much earlier character for 不要, 嫑, which is not an invention of mine and can be found in many character dictionaries. I have always felt these 不 characters were pretty niu, for example 不正 as 歪, 不用 as 甭 and 不好 as 孬.


                           There is simply no 不 in Shanghainese. But then if someone were to wish to speak for a very long time, what should they say?


                           Shanghainese does have the character 不, for example in "stainless steel" (不锈钢, not-rust-steel), but this word entered Shanghainese from Mandarin. Otherwise, Shanghainese has 不过 (but), but it's pronounced like "毕过". So in this way, it seems Shanghainese doesn't really have the character 不. Instead, Shanghai locals express negation with 勿, the pronunciation of which is somewhere between Mandarin's 佛 (fó) and 浮 (fú), which if spelled with pinyin would be "fé".


                           What's more, as a result of the Song capital being moved from Kaifeng to Hangzhou³, the Hangzhou dialect has a large number of northern sounds. 嫑 is an example of one. And since Shanghainese uses 勿 instead of 不, we can substitute 勿 for 不 when used, thus changing the 不 in 嫑 to 勿.


                           Can we really do this? My answer is surely we can, and this character is 覅, 不 having been changed to 勿 and moving it from above to the side. How to read this character? According to classical reading or right-to-left, this character is 勿要. Try using the fanqie method. That's right, it's read "fiào", and as such it is printed in character dictionaries.


                           This character is actually an invention of Han Bangqing. A Shanghai local, Han Bangqing wrote China's first periodical novel called A Remarkable Book of Shanghai. In it is a story called Flowers in Shanghai, in which Han Bangqing coined the character 覅.


                           Flowers in Shanghai takes place in Shanghai and has many characters speaking the Suzhou⁴ dialect. This is also evidence of why Shanghainese has many sounds similar to Suzhou dialect, and in Suzhou you'll also hear "fiào". Flowers in Shanghai is written in the vernacular, and for this reason Han Bangqing "invented" 覅.


                           不 is used in many combinations, and therefore we can use 勿 in the same way. 朆 is another such character, coming from 勿曾 and meaning "to not have". Using fanqie for 勿 and 曾, we read it as "fen". So for example if someone asks you if you've eaten yet, you can respond "hai fen chi lai", "I haven't yet eaten".
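The fanqie trick the article leans on (initial from the first character, everything else from the second) is easy to try out on toneless pinyin. The following is a minimal Python sketch under a few assumptions of mine: readings are given as plain pinyin strings, tones are ignored, and the function names `split_syllable` and `fanqie` are made up for the example. It reproduces the article's 不要 → "biao" and 勿要 → "fiao". It does not attempt 朆: the pinyin final of 曾 (ceng) would give "feng", so the article's "fen" presumably follows the Wu reading instead.

```python
# Pinyin initials, two-letter ones first so "zh" is not read as "z".
INITIALS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s"]


def split_syllable(syllable: str):
    """Split a toneless pinyin syllable into (initial, final)."""
    for initial in INITIALS:
        if syllable.startswith(initial):
            return initial, syllable[len(initial):]
    # Zero-initial syllables are spelled with y/w; undo that spelling.
    if syllable.startswith("y"):
        rest = syllable[1:]
        if rest.startswith("i"):
            return "", rest                # yi -> i, ying -> ing
        if rest.startswith("u"):
            return "", "ü" + rest[1:]      # yu -> ü, yuan -> üan
        return "", "i" + rest              # yao -> iao
    if syllable.startswith("w"):
        rest = syllable[1:]
        if rest.startswith("u"):
            return "", rest                # wu -> u
        return "", "u" + rest              # wan -> uan
    return "", syllable                    # a, e, ou, ...


def fanqie(upper: str, lower: str) -> str:
    """Combine the initial of `upper` with the final of `lower`."""
    initial, _ = split_syllable(upper)
    _, final = split_syllable(lower)
    return initial + final


print(fanqie("bu", "yao"))  # 不 + 要
print(fanqie("fe", "yao"))  # 勿 + 要, using the article's "fe" spelling
```

Traditional fanqie also takes the tone from the second character, which this toneless sketch leaves out.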

                          Right, so I kind of resent the tone of the article, but am happy to see these sorts of things get press all the same. However I can't help but feel as though a child is performing a magic trick that we already know all too well, and we are an audience held captive by the fact that it's their freaking birthday.

                          It's worth noting that the original author was Han Bangqing and not Eileen Chang 张爱玲 (of Lust, Caution fame) as is often thought. Ms Chang translated the text into Mandarin, and it's from her Mandarin version that the English translation is taken. She herself was Shanghainese, and you'll find the language used in other works of hers, including a bit of dialogue in the movie version of Lust, Caution (色, 戒).

                          Thanks to Chen for bringing this to my attention.

                          - - -
                          1. See this recent Language Log post for another example.
                          2. …tones included. If you're not familiar with the system, go check it out.
                          3. I've linked the corresponding sentence in the Chinese text to an explanation of the line in quotes, which is a verse from a poem by Lin Sheng written during the Song Dynasty. Long story short, the author of this article is using it to explain that the capital of the Song moved from the North, bringing with it northern sounds.
                          4. For a long time, up until only recently, the Suzhou dialect was the prestige dialect of Wu. For a parallel in English, think of stars of the silver screen speaking Mid-Atlantic English in decades past.

                            Mélange Revisited discussion

                            I spent the better part of my day in Shanghai's book district, also known as Fuzhou Road. I tend to stay out of the giant 7-storey book city, or whatever it's called, but for one reason or another I went in and was not disappointed. They actually had a handful of books on Manchu and a Mandarin to ancient Greek dictionary, not that I have a need for ancient Greek.

                            One book that caught my eye, nudged in between the 外国小说 on the top floor, was Say it Right: A Quick Guide to Mandarin, Cantonese and Shanghainese. What better way to test the Winchester Theory™ than a side-by-side comparison of the three languages?

                            Unfortunately it didn't seem worth the 100 so I don't have an excerpt. Needless to say, it didn't offer much in support of the idea that Wu is a mixture of MSM and Yue and little more.

                            Anyway take a look if you happen to be in the bookstore. It's at least interesting to see the comparison on the level of specific phrases.

                              The Mélange discussion

                              Slightly different from previous ideas on the origin of Wu, the following comes from Simon Winchester's 1996 book The River at the Center of the World.

                              "…her people … [speak] the ugliest of languages, a discordant mélange of Mandarin and Cantonese spoken by no one outside the Yangtze delta…"

                              I've always enjoyed Simon Winchester's books. I've read many of his works, starting with The Professor and the Madman, through Krakatoa, The Map that Changed the World and most recently The Man Who Loved China. Now I'm in the middle of the book quoted here and for the most part I'm still quite enjoying it.

                              But Wu as a mélange, a creole? I guess I can see it. There are a great number of aspects to Wu that are cognate with Cantonese (Yue 粵語) and a great number cognate with Mandarin. But it seems that makes it no more a mixture than Castilian is a mixture of Catalan and Portuguese or Catalan a mixture of Castilian Spanish and Parisian French. One would not be too far off to describe Catalan as such, but I think it severely misses the point.

                              Ugly or not, and I tend to think not, I believe Wu is significantly more than some pidgin or creole made up of scraps of Mandarin and Cantonese, especially considering Shanghainese far outdoes Mandarin Proper in longevity given the comparatively recent creation of Mandarin, the presence of dialects of Northern Chinese aside.

                              But, maybe I'm making something out of nothing, so other than this post, I'll let it go.

                              Completely off the topic of Wu classification, I was a little disappointed to see a number of errors in the maps in the book. I expected the cartography to be spot on given the resources at the author's disposal. I'll chalk it up to an editing error.

                                Language V. Dialect discussion

                                The following is from page 144 of 中国的语言¹. It's one of the few sections written in English, titled "Chinese", and attempts to give a brief introduction to the language/s.

                                 The Chinese dialect situation is complex. Generally, they are divided into seven major regional dialects: Northern, Wu, Xiang (Hunan), Gan (Jiangxi), Kejia (Hakka), Yue (Guangdong), Min (Fujian). Their grammar and basic vocabulary are more or less the same, but the phonological systems are different. These differences manifest different patterns of consistent changes and regular correspondences. If people from two different dialects can decode the corresponding relationship of phonological systems of each other's dialect, they can communicate.

                                Italics added. It amounts to the matrix theory mentioned a long while back both on the site and in comments: If one only could apply a phonetic filter, would the differences between the topolects/dialects/languages be negated? It's been discussed at length before, and I firmly believe the answer is a resounding "no". It baffles me that anyone who's studied languages in China would believe this could be true, so I remind myself I'm reading it in a book that touches neither Wu nor Yue. The section was written by Xíng Gōngwǎn 邢公畹 and it's not clear if it has been published outside of this text.

                                It's a difficult task to champion the cause of "China has a bunch of languages" over "…dialects". I'm not at all sure why that's the case. For all the talk of the diversity of China, it's difficult to say why one wouldn't choose to brag about the number of languages that have developed here, rather than continue to push the idea that everyone is speaking the same thing.

                                The following examples are taken from the same book, page 464. These are from the dialect/language spoken by the Zouzuo, Chinese name Róuruò 柔若, residing in the Nujiang Lisu Autonomous Prefecture in Yunnan.



                                In Mandarin, that would be 你家里有几个人 and 我比你大五岁.
                                - - -
                                1. published 2005, 2007 by 商务印书馆, ISBN 7-100-04363-8


                                  A semi-academic linguistics blog about Sinotibetan, previously focused primarily on Wú, a Sinitic language spoken in the Yangtze Delta region. Topics now include historical linguistics, documentation, language rights, sociolinguistics and learning materials, as well as acting as the dev blog for Phonemica from time to time.

                                  I'm a linguist based in Asia, working on documentation and historical development of Sinotibetan. In addition to academic research, I'm heavily involved in Phonemica, an organisation that promotes crowd-sourced preservation of local languages.


                                  © 2009-2017