Proceedings of the International Conference on Spoken Language Processing, ICSLP 2002 (Denver, USA, September 2002)
We describe a newly created broadcast news (BN) corpus drawn from programs of seven German and Austrian TV stations, and the development of a German BN transcription system based on this corpus. We report on a series of experiments addressing the fact that German is less suited than English to word-based trigram language models. Furthermore, we investigate various phoneme sets and examine how the word error rate differs between a transregional standard (the Bavarian dialect spoken in southern Germany and Austria) and standard German (Hochdeutsch).