
The Sixth

INTERNATIONAL SYMPOSIUM ON MALAY/INDONESIAN LINGUISTICS

3 - 5 August 2002

Nirwana Resort Hotel, Bintan Island, Riau, Indonesia


Computational Tools for Malay / Indonesian Linguistics
Brad Taylor & David Gil
Max Planck Institute for Evolutionary Anthropology, Leipzig
gil@eva.mpg.de

This talk will provide a brief presentation of MasterFM, a computational tool being developed by the Max Planck Institute Jakarta Field Station for the transcription, coding, and analysis of spoken Malay and Indonesian corpora. Based on FileMaker Pro, MasterFM provides database templates for the transcription and coding of naturalistic data, facilitating sophisticated search procedures for grammatical analysis. At present, these computerised systems have been applied to a corpus of over 500,000 utterances from 8 longitudinal studies of young children acquiring the Jakarta dialect of Indonesian, and to a variety of smaller corpora from various dialects of Malay / Indonesian spoken in Sumatra, Kalimantan and Java.

For data in digital video or audio format, MasterFM provides a system of automatic linkage between the FileMaker database and two media playing applications, QuickTime for Macintosh and Windows Media Player for PC. With such automatic linkage, clicking on an utterance in the database selects the passage of the video containing that utterance, making it easy to inspect such features as the exact pronunciation of the utterance and the situation in which it occurred. Alternatively, with a display of continuous text in MasterFM, it is possible to run the video with the text scrolling in step with playback.
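The two directions of linkage described above (utterance to media passage, and playback position to utterance) can be illustrated with a minimal Python sketch. This is not MasterFM's implementation, which is built on FileMaker Pro; the `Utterance` record, its `start` and `end` fields, and both function names are hypothetical, and the sketch assumes that utterances are sorted by start time and do not overlap.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Utterance:
    """Hypothetical record: transcribed text plus its time span in the media file."""
    text: str
    start: float  # seconds from the beginning of the recording
    end: float    # seconds from the beginning of the recording


def passage_for(utterance):
    """Utterance -> media: the (start, end) span a player would be asked to select."""
    return (utterance.start, utterance.end)


def utterance_at(utterances, t):
    """Media -> utterance: index of the utterance being spoken at time t, or None.

    Assumes `utterances` is sorted by start time and spans do not overlap.
    """
    starts = [u.start for u in utterances]
    i = bisect_right(starts, t) - 1  # last utterance starting at or before t
    if i >= 0 and t < utterances[i].end:
        return i
    return None  # t falls in a pause or outside the transcribed portion


# Placeholder records, not taken from any corpus.
session = [
    Utterance("utterance 1", 0.0, 1.5),
    Utterance("utterance 2", 2.0, 3.2),
    Utterance("utterance 3", 3.2, 5.0),
]
```

With these placeholder records, `passage_for(session[1])` gives the span to select when the second utterance is clicked, and calling `utterance_at(session, t)` repeatedly as playback advances gives the row that a scrolling text display would highlight.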

Another feature of MasterFM is AutoGloss, a tool which performs automatic interlinear glossing, making use of a two-part glossary: one part lists stems, the other affixes. AutoGloss performs morphological analysis in order to decompose complex words into their constituent stems and affixes. In cases where a single form has more than one possible gloss, AutoGloss presents the user with the available choices, e.g. "kaya": "like" or "rich".
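The general idea of glossary-driven decomposition can be sketched in Python as follows. This is an illustrative toy, not AutoGloss itself: the glossary entries other than "kaya", the affix labels, and the function names are all assumptions, and the sketch ignores morphophonemic alternations and handles at most one prefix and one suffix per word.

```python
# Hypothetical toy glossary in two parts: stems and affixes.
# Only the "kaya" entry comes from the abstract; the rest is illustrative.
STEMS = {
    "kaya": ["like", "rich"],
    "makan": ["eat"],
}
# Affix labels are placeholders, not AutoGloss's actual glossing conventions.
PREFIXES = {"di": "DI"}
SUFFIXES = {"nya": "NYA", "an": "AN"}


def analyses(word):
    """Return every (prefix)-stem-(suffix) split whose parts are all in the glossary.

    Each analysis is a list of (morph, gloss) pairs. A form with several
    possible glosses yields several analyses, which a user could choose among.
    """
    results = []
    for prefix in [""] + list(PREFIXES):
        if not word.startswith(prefix):
            continue
        rest = word[len(prefix):]
        for suffix in [""] + list(SUFFIXES):
            if suffix and not rest.endswith(suffix):
                continue
            stem = rest[: len(rest) - len(suffix)]
            for gloss in STEMS.get(stem, []):
                morphs = []
                if prefix:
                    morphs.append((prefix + "-", PREFIXES[prefix]))
                morphs.append((stem, gloss))
                if suffix:
                    morphs.append(("-" + suffix, SUFFIXES[suffix]))
                results.append(morphs)
    return results


def format_gloss(morphs):
    """Join the glosses of one analysis into a single hyphenated gloss string."""
    return "-".join(gloss for _, gloss in morphs)
```

Under these assumptions, `analyses("kaya")` returns two candidate analyses, one per gloss, mirroring the choice the user is offered, while `analyses("dimakan")` returns a single prefix-plus-stem analysis.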

