Creating a European English Broadcast News Transcription Corpus and System


submitted to EuroSpeech’01 (Aalborg, Denmark, September 2001)

In January 2021, SAIL LABS Technology GmbH was acquired by the sensor specialist HENSOLDT and became HENSOLDT Analytics.

We describe the Sail-Labs Media Indexer, a system for processing European English television broadcasts that builds on BBN’s Rough’n’Ready suite of technologies used in the DARPA Hub-4 evaluations. We discuss the development of a European English broadcast news corpus suitable for measuring the performance of system components such as speaker identification and speech recognition. We also report evaluation results on our multi-purpose test set and outline the integration of real-time indexing into a spoken document retrieval system.

This article was presented at EuroSpeech’01 (Aalborg, Denmark, September 2001).

    HENSOLDT Analytics

    HENSOLDT Analytics is a leading global provider of Open Source Intelligence (OSINT) systems and Natural Language Processing technologies, such as Automatic Speech Recognition, which are key elements of media monitoring and analysis.