Proceedings of the International Conference on Spoken Language Processing, ICSLP 2002 (Denver, USA, September 2002)
We describe a newly created broadcast news (BN) corpus built from programs of seven German and Austrian TV stations, and the development of a German BN transcription system based on this corpus. We report on a series of experiments addressing the fact that German is less suited than English to word-based trigram language models. Furthermore, we investigate various phoneme sets and examine how the word error rate differs between a transregional standard (Bavarian dialect spoken in southern Germany and Austria) and standard German (Hochdeutsch).