Thursday, July 16, 2009

Digital Data and Japanese Language Research

I listened to this lecture by Tanomura last weekend. He stressed the importance of using a large corpus for studying linguistic phenomena, as a large sampling of data allows us to see patterns in language change more accurately. A corpus is also useful for seeing patterns in collocations. Search engines are not suitable for use as corpora, because the results they give are too variable.

Corpus:

Japanese-language corpus (日本語コーパス)

Friday, July 10, 2009

Rotuman and Phonology

In a recent class, our task was to analyze the Austronesian language Rotuman. The purpose of this task was to familiarize ourselves with using distinctive features to analyze processes of neutralization. The handout our teacher gave us clearly stated that the vowel system of Rotuman is complex and that there may be more than one way to analyze a particular case of neutralization.

I devoted a lot of time and thought to my analysis and was sure that I had come up with good arguments to support my approach. I was therefore surprised to find that my teacher had approached the problem in a very different (and in my opinion, erroneous) way. During the class, I attempted to make a case for my method of analysis, but unfortunately I don't think it got across very well. Later reflection still has me convinced that my approach is superior, and I will try to explain why here.

We were given a list of 92 words from Rotuman, each with a phonetic transcription and a gloss. The vowels and their environments can be summarized as follows:

(1) Ten different vowels appear in closed syllables
(2) Five different vowels appear word-finally
(3) In a disyllabic word with the structure CVCV, a total of seven different vowels appear as the first vowel, and their qualities depend on the quality of the second vowel

My teacher started his analysis with the ten vowels in (1), setting them all up as phonemes. He then went on to describe a process of neutralization for the five vowels in word-final position. In other words, he started with the ten vowels in (1) and then narrowed them down to the five vowels in (2) by explaining this as the result of neutralization. He then went on to explain the seven vowels in (3) through neutralization, where the two "extra" vowels are explained as contextual variants.

The first major flaw in this analysis is that it completely ignores the distinction between the phonemic and the phonetic. Previously in the class, we had been practicing standard approaches to phonological analysis, using concepts such as the phonemic/phonetic distinction, complementary distribution, allophones, etc. However, it seems that my teacher failed to take these approaches into account when analyzing with a system of distinctive features. I questioned him on this point, asking how we should distinguish phonemes from phones when using a distinctive-feature-based approach. He replied that (and maybe this was just for the sake of practice) we take all the sounds in the phonetic transcription to be phonemes! This is what I think makes his analysis flawed. Consider the definition of neutralization:

Neutralization: the identical phonetic realization of distinct phonemes.

Notice that the very definition of neutralization presupposes that we make the phonemic/phonetic distinction. Regardless of whether we use distinctive features or not, we must make this distinction. If we accept that all phonetic realizations are also phonemic, we will have no way of knowing what a speaker's phonemic knowledge entails.
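To make the point concrete, here is a minimal Python sketch. The realization table, the context labels, and the function name find_neutralizations are all invented for illustration; none of this is the Rotuman data from the handout. The idea is only that neutralization can be detected when phonemes and phones are kept as separate levels.

```python
from collections import defaultdict

# Hypothetical realization table: (phoneme, context) -> phone.
# All values are invented for illustration; this is NOT the Rotuman data.
REALIZATIONS = {
    ("i", "word_final"): "i",
    ("e", "word_final"): "e",
    ("i", "before_i"): "i",
    ("e", "before_i"): "i",  # /e/ and /i/ share one phone in this context
}


def find_neutralizations(realizations):
    """Return {(context, phone): [phonemes]} wherever two or more
    distinct phonemes have the same phonetic realization."""
    by_surface = defaultdict(set)
    for (phoneme, context), phone in realizations.items():
        by_surface[(context, phone)].add(phoneme)
    return {
        key: sorted(phonemes)
        for key, phonemes in by_surface.items()
        if len(phonemes) > 1
    }


# If every phone is simply declared a phoneme, the table is an identity
# mapping and no neutralization can ever be found.
IDENTITY = {
    (p, c): p for p in ("i", "e") for c in ("word_final", "before_i")
}
```

With the first table, find_neutralizations reports that /e/ and /i/ fall together before /i/. With the identity table it reports nothing, which is exactly the problem with treating every transcribed sound as a phoneme.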

The advantages to using distinctive features are that they allow us to predict speakers' choices beyond the specific data given, and that they allow us to describe with accuracy exactly what is being neutralized when neutralization occurs.
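Both advantages can be sketched in a few lines of Python. The feature values below are an assumption: a common textbook-style specification of a generic five-vowel system using [high], [low], and [back], not the feature chart from our class handout. The names FEATURES, neutralized_features, and natural_class are likewise hypothetical.

```python
# Assumed feature matrix for a generic five-vowel system.
# Treating /a/ as [+back] is one common convention, not the only one.
FEATURES = {
    "i": {"high": True,  "low": False, "back": False},
    "e": {"high": False, "low": False, "back": False},
    "a": {"high": False, "low": True,  "back": True},
    "o": {"high": False, "low": False, "back": True},
    "u": {"high": True,  "low": False, "back": True},
}


def neutralized_features(v1, v2):
    """Features whose contrast is lost if v1 and v2 are realized identically."""
    return sorted(f for f in FEATURES[v1] if FEATURES[v1][f] != FEATURES[v2][f])


def natural_class(spec):
    """All vowels matching a partial feature specification, e.g. {"high": True}."""
    return sorted(
        v for v, feats in FEATURES.items()
        if all(feats[f] == value for f, value in spec.items())
    )
```

Here neutralized_features("i", "e") returns ["high"], stating exactly which contrast is suspended, and natural_class({"high": True}) returns ["i", "u"], the set of vowels a rule stated in terms of [+high] would be predicted to affect, including vowels not attested in the sample.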

The moral of this story, which I think was lost on most of the students, is that using distinctive features can add to the power of our analysis once we have determined the phonemic inventory of a language and described common allophones, etc. In other words, the methods of the Prague School and "standard" Structuralism aren't mutually exclusive.

In my analysis, I started with the five word-final vowels as phonemes and described other variants as allophones (contextual variants). In a word with the structure CVCV, certain distinctive features for the first vowel are neutralized by the features of the second vowel. In closed syllables, the ten vowels that can appear there undergo some kind of conditioning (I later found out this was metathesis).

One objection that was raised to my approach was that if we started with five vowels, we would have to assume an increased number of contextual variants in closed syllables. But this is the normal state of affairs! It is natural for a language to have many more phones than phonemes.

After completing my analysis, I looked up some information on Rotuman, and it showed that there are indeed a total of five vowel phonemes in the language and that the variety of vowels in closed syllables results from the effects of vowel sandhi after metathesis. Thus the validity of my argument has been confirmed. Why did it have to be so difficult?!