Summary: | "This file contains documentation on the ACE 2005 Multilingual Training Corpus, Linguistic Data Consortium (LDC) catalog number LDC2006T06 and isbn 1-58563-376-3. This publication contains the complete set of English, Arabic and Chinese training data for the 2005 Automatic Content Extraction (ACE) technology evaluation. The corpus consists of data of various types annotated for entities, relations and events was created by Linguistic Data Consortium with support from the ACE Program, with additional assistance from LDC. This data was previously distributed as an e-corpus (LDC2005E18) to participants in the 2005 ACE evaluation. The objective of the ACE program is to develop automatic content extraction technology to support automatic processing of human language in text form. In November 2005, sites were evaluated on system performance in five primary areas: the recognition of entities, values, temporal expressions, relations, and events. Entity, relation and event mention detection were also offered as diagnostic tasks. All tasks with the exception of event tasks were performed for three languages, English, Chinese and Arabic. Events tasks were evaluated in English and Chinese only. The current publication comprises the official training data for these evaluation tasks. A complete description of the ACE 2005 Evaluation can be found on the ACE Program website maintained by the National Institute of Standards and Technology (NIST). For more information about linguistic resources for the ACE Program, including annotation guidelines, task definitions, free annotation tools and other documentation, please visit LDC's ACE website."--index.html
|