吳實錄

Annals of Wu

漢藏緬語々言研究ㄟ博客
a sinotibetoburman linguistics blog
2012-12-19

Every Dialect Is A Creole discussion

It’s enough to make you pull your hair out. You’re looking for the pronunciation of a single character, one that should not be a 破音词 (a character with more than one reading). It’s a simple one with a single meaning. This time, it’s the character 多.

You pull out your handy dictionary and check the index, which tells you the entry you want is on page 248… and 290. That’s OK though; lots of entries are duplicated, since the dictionary is organised by category, not by stroke or pronunciation.

Flipping to page 248 you find /tu/. Sounds right. Checking page 290 to be sure, you find… /tɑ/. Hmm. The note says the latter is the 代词 (pronoun) reading, and that it has been held over from a much earlier pronunciation. You’ve just added a layer. Specifically, you cannot count on being able to convert 多 to any transcription without knowing the context and usage. That means your parsing has to be that much more on-the-ball. Or maybe it’s worse than that, and your entire understanding of the situation is off.
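To make that concrete, here is a minimal Python sketch of a lookup that refuses to guess. The two readings of 多 are the ones from the dictionary entries above; the table layout, the usage tags ("default", "pronoun") and the function name are all made up for illustration, not taken from any existing tool.

```python
# Hypothetical reading table: character -> {usage tag: transcription}.
# The readings of 多 are the two quoted above; the tag names are invented here.
READINGS = {
    "多": {"default": "tu", "pronoun": "tɑ"},
}


def transcribe(char, usage=None):
    """Return a reading for `char`, insisting on a usage tag when it matters."""
    by_usage = READINGS[char]
    if len(by_usage) == 1:
        # Only one reading on record, so context is irrelevant.
        return next(iter(by_usage.values()))
    if usage is None:
        # More than one reading: the caller has to say how the character is used.
        raise ValueError(f"{char} has {len(by_usage)} readings; usage is required")
    # Unknown usage tags fall back to the default reading.
    return by_usage.get(usage, by_usage["default"])
```

Calling transcribe("多", "pronoun") gives the older reading, while transcribe("多") with no usage raises an error instead of silently picking one. That is the extra work the parser now has to do.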

It’s easy when dealing with dialects to get frustrated. It’s especially easy if you have any expectation of things being systematic. To summarise a pretty clear expert on the topic, “every dialect in China is a creole”. It’s not that Spanish and Italian evolved from Latin but on different routes. It’s more like, that happened, but with lots of borrowing from French and Arabic, and from each other in not-so-predictable ways along the way. So 五 is /ŋ/ until /ʋʊ/ is borrowed from neighbouring Mandarin dialects and then /wu/ is borrowed a little bit later.
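One way to picture those layers is as an ordered list per character rather than a single value. The sketch below is only an illustration under that assumption: the three forms of 五 are the ones just mentioned, while the Reading class, the source labels and the function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    ipa: str     # transcription of this layer
    source: str  # informal note on where the layer came from


# Oldest layer first. The forms are from the paragraph above;
# the source labels just paraphrase it and are not a formal classification.
LAYERS = {
    "五": [
        Reading("ŋ", "earlier reading"),
        Reading("ʋʊ", "borrowed from neighbouring Mandarin dialects"),
        Reading("wu", "borrowed a little later"),
    ],
}


def readings(char):
    """List every recorded reading of `char`, oldest layer first."""
    return [r.ipa for r in LAYERS.get(char, [])]
```

Here readings("五") returns all three forms in order, and a character with no entry returns an empty list. The point is that no sound change rule gets you from one layer to the next; each one has to be recorded.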

Language contact has always been rampant, and cases like the Hangzhou dialect, with its substantial influence from Song immigrants, are not so much exceptions as more obvious examples of the rule.

It’s not enough to apply sound change rules to Mandarin and expect to get Wu, or even to get an interesting dialect of Mandarin (连云港话, the Lianyungang dialect, anyone?). Since pretty much all digital setups are based on Mandarin, you more or less have to start from scratch to build a system that’s natively comfortable with Wu, one that knows when character X is pronounced Y and when Z, and whose answers are not going to agree with Mandarin.

It is frustrating. And it’s time-consuming. But it’s the reality. Lots of the work has been done. What hasn’t been done is getting it all online in a form that can be combined and used in the best way possible.





    About

    A semi-academic linguistics blog about Sinotibetan, previously focused primarily on Wú, a Sinitic language spoken in the Yangtze Delta region. Topics now include historical linguistics, documentation, language rights, sociolinguistics and learning materials, as well as acting as the dev blog for Phonemica from time to time.

    I'm a linguist based in Asia, working on documentation and historical development of Sinotibetan. In addition to academic research, I'm heavily involved in Phonemica, an organisation that promotes crowd-sourced preservation of local languages.

    I'm currently in the field, so getting in touch isn't easy. However you can try to email me at the following address and I'll respond as soon as I'm able:

    yhilan.ko@gmail.com
    © 2009-2017