Development of a Modern Greek Broadcast-News Corpus and Speech Recognition System

submitted to NODALIDA 2007, the 16th Nordic Conference of Computational Linguistics, Tartu, Estonia, May 2007

In January 2021, SAIL LABS Technology GmbH was acquired by the sensor specialist HENSOLDT and became HENSOLDT Analytics.

HENSOLDT Analytics, one of the world’s leading innovators in the field of speech technology, presented a comprehensive research paper on the “Development of a Modern Greek Broadcast-News Corpus and Speech Recognition System” at NODALIDA 2007, the 16th Nordic Conference of Computational Linguistics.

The NODALIDA conference, which took place on 25-26 May 2007 in Tartu, Estonia, represented a high-profile platform for invited speakers, panel discussions and dissemination of research findings covering a broad range of interest areas in language technology (ranging from speech recognition and speech synthesis to evaluation methodology).

Dr. Juergen Riedler, Senior Research Engineer at HENSOLDT Analytics, was invited to present the key findings: “The paper describes a revolutionary approach in the domain of Modern Greek automatic speech recognition: Modern Greek has its peculiarities since it is a highly inflectional language. In order to face this challenge, a large-vocabulary continuous-speech recognition system was created. In contrast to recent efforts in this field focussing on dictation, the main emphasis of our research is in the broadcast news domain,” said Dr. Riedler. “Our projects in this area are concentrated on the development of a Modern Greek automatic speech recognition system”.

The overall influence of this development on the broadcast industry is tremendous: on the basis of automatic speech recognition and indexing technology, the Greek Media Observation market can provide its users with broadcast news transcription in real time.

ABSTRACT: We report on the creation of a Modern Greek broadcast-news corpus as a pre-requisite to build a large-vocabulary continuous-speech recognition system. We discuss lexical modelling with respect to pronunciation generation and examine the effects of the lexicon size on word accuracies. Peculiarities of Modern Greek as a highly inflectional language and their challenges for speech recognition are discussed.

    HENSOLDT Analytics

    HENSOLDT Analytics is a leading global provider of Open Source Intelligence (OSINT) systems and Natural Language Processing technologies, such as Automatic Speech Recognition, which are key elements for media monitoring and analysis.