Fitting German into N-Gram Language Models

Proceedings of the International Conference on Text, Speech, and Dialogue, TSD 2002 (Brno, Czech Republic, September 2002)

In January 2021, SAIL LABS Technology GmbH was acquired by the sensor specialist HENSOLDT and became HENSOLDT Analytics.

We report on a series of experiments addressing the fact that German is less well suited than English to word-based n-gram language models. Several systems were trained at different vocabulary sizes using various sets of lexical units, and were evaluated on a newly created corpus of German and Austrian broadcast news.
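The abstract does not spell out why word-based units fit German poorly. One commonly cited reason is productive compounding: new compounds keep appearing, so a fixed word vocabulary misses more running words than a vocabulary of smaller units would. The sketch below illustrates that idea only. The toy sentences, the `PARTS` inventory, the greedy `split_compound` helper, and the choice of out-of-vocabulary (OOV) rate as the measure are all assumptions made for illustration, not the units or metrics used in the paper.

```python
from collections import Counter

# Hypothetical inventory of compound parts (illustrative, not from the paper).
PARTS = {"bundes", "regierung", "steuer", "reform"}

def split_compound(word, parts):
    """Greedy longest-match split; returns [word] if no full split exists."""
    units, rest = [], word
    while rest:
        for end in range(len(rest), 0, -1):
            if rest[:end] in parts:
                units.append(rest[:end])
                rest = rest[end:]
                break
        else:
            return [word]  # could not segment the remainder; keep word whole
    return units

def top_k_vocab(tokens, k):
    """Vocabulary made of the k most frequent units in the training tokens."""
    return {unit for unit, _ in Counter(tokens).most_common(k)}

def oov_rate(vocab, tokens):
    """Fraction of running tokens that the vocabulary does not cover."""
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t not in vocab) / len(tokens)

# Toy data (made up for this sketch).
train_words = ["die", "bundesregierung", "plant", "eine", "steuerreform"]
test_words = ["die", "bundessteuer"]

# Same text, re-expressed in sub-word units.
train_units = [u for w in train_words for u in split_compound(w, PARTS)]
test_units = [u for w in test_words for u in split_compound(w, PARTS)]

word_vocab = top_k_vocab(train_words, 10)
unit_vocab = top_k_vocab(train_units, 10)

print(oov_rate(word_vocab, test_words))  # 0.5: "bundessteuer" is unseen
print(oov_rate(unit_vocab, test_units))  # 0.0: "bundes" and "steuer" are known
```

In this toy setting the unseen compound is covered once it is split into parts that occurred in training. Varying `k` in `top_k_vocab` mimics the idea of comparing systems at different vocabulary sizes.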
