Datasets for generic relations extraction (reACE).

Bibliographic Details
Imprint: [Philadelphia, Pa.] : Linguistic Data Consortium, c2011.
Description: 1 DVD-ROM ; 4 3/4 in.
Language: English
Format: DVD Video E-Resource
URL for this record: http://pi.lib.uchicago.edu/1001/cat/bib/8439132
Hidden Bibliographic Details
Other authors / contributors: Linguistic Data Consortium.
ISBN: 1585635820
9781585635825
Notes: Title from disc label.
"LDC2011T08."
"Developed at The University of Edinburgh ... consists of English broadcast news and newswire data originally annotated for the ACE (Automatic Content Extraction) program to which the Edinburgh Regularized ACE (reACE) mark-up has been applied."--Index.html file.
Also available on the Internet.
Summary: Datasets for Generic Relation Extraction (reACE) was developed at The University of Edinburgh, Edinburgh, Scotland. It consists of English broadcast news and newswire data originally annotated for the ACE (Automatic Content Extraction) program to which the Edinburgh Regularized ACE (reACE) mark-up has been applied.
The Edinburgh relation extraction (RE) task aims to identify useful information in text (e.g., PersonW works for OrganisationX, GeneY encodes ProteinZ) and to recode it in a format such as a relational database or RDF triple store (a database for the storage and retrieval of Resource Description Framework (RDF) metadata) that can be more effectively used for querying and automated reasoning. A number of resources have been developed for training and evaluation of automatic systems for RE in different domains. However, comparative evaluation is impeded by the fact that these corpora use different markup formats and different notions of what constitutes a relation.
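As a rough illustration of the recoding step described above, the two example relations can be held as subject-predicate-object triples and queried by predicate. This is a minimal sketch, not the reACE format or any RDF library: the `Triple` type, the `query` helper, and the predicate names `worksFor` and `encodes` are assumptions made for the example.

```python
from typing import List, NamedTuple, Tuple


class Triple(NamedTuple):
    """One extracted relation as a subject-predicate-object triple (hypothetical type)."""
    subject: str
    predicate: str
    obj: str


# The two example relations from the summary; predicate names are illustrative.
STORE: List[Triple] = [
    Triple("PersonW", "worksFor", "OrganisationX"),
    Triple("GeneY", "encodes", "ProteinZ"),
]


def query(store: List[Triple], predicate: str) -> List[Tuple[str, str]]:
    """Return the (subject, object) pairs whose predicate matches exactly."""
    return [(t.subject, t.obj) for t in store if t.predicate == predicate]


if __name__ == "__main__":
    print(query(STORE, "worksFor"))
```

Holding relations in this uniform shape is what makes querying straightforward: a single filter on the predicate answers "who works for whom" regardless of which document the relation came from.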
reACE solves this problem by converting data to a common document type using token standoff and including detailed linguistic markup while maintaining all information in the original annotation. The subsequent reannotation process normalises the two data sets so that they comply with a notion of relation that is intuitive, simple and informed by the semantic web.
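The idea of token standoff mentioned above can be sketched as follows: the text is stored once as a sequence of identified tokens, and entity and relation annotations refer to token identifiers instead of being embedded in the text. The identifiers, data layout, and helper names here are hypothetical and do not reproduce the actual reACE document type.

```python
# Hypothetical tokenised sentence: (token id, token text) in document order.
TOKENS = [("t1", "PersonW"), ("t2", "works"), ("t3", "for"), ("t4", "OrganisationX")]

# Standoff entities: each entity id maps to its (first token id, last token id).
ENTITIES = {"e1": ("t1", "t1"), "e2": ("t4", "t4")}

# Standoff relations: (first argument entity id, relation label, second argument entity id).
RELATIONS = [("e1", "worksFor", "e2")]


def span_text(first_id, last_id):
    """Recover the text covered by an inclusive token span."""
    ids = [token_id for token_id, _ in TOKENS]
    start, end = ids.index(first_id), ids.index(last_id)
    return " ".join(text for _, text in TOKENS[start:end + 1])


def resolve(relation):
    """Replace entity ids in a relation with the text of their token spans."""
    arg1, label, arg2 = relation
    return (span_text(*ENTITIES[arg1]), label, span_text(*ENTITIES[arg2]))


if __name__ == "__main__":
    print(resolve(RELATIONS[0]))
```

Because the annotations only point at tokens, several annotation layers can coexist over the same text without altering it, which is consistent with the summary's statement that the original annotation is maintained.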
The data in this corpus consists of newswire and broadcast news material from ACE 2004 Multilingual Training Corpus LDC2005T09 and ACE 2005 Multilingual Training Corpus LDC2006T06. This material has been standardised for evaluation of multi-type RE across domains.

MARC

LEADER 00000cmm a2200000Ia 4500
001 8439132
003 ICU
005 20131126112800.0
007 co cg|||||||||
008 110715s2011 pau d eng d
020 |a 1585635820 
020 |a 9781585635825 
035 |a (OCoLC)741339442 
035 |a 8439132 
040 |a CVU  |c CVU 
049 |a CGUA 
090 |a PN4784.B75  |b D376 2011 
245 0 0 |a Datasets for generic relations extraction (reACE). 
260 |a [Philadelphia, Pa.] :  |b Linguistic Data Consortium,  |c c2011. 
300 |a 1 DVD-ROM ;  |c 4 3/4 in. 
336 |a computer dataset  |b cod  |2 rdacontent  |0 http://id.loc.gov/vocabulary/contentTypes/cod 
337 |a computer  |b c  |2 rdamedia  |0 http://id.loc.gov/vocabulary/mediaTypes/c 
338 |a other  |b cz  |2 rdacarrier 
500 |a Title from disc label. 
500 |a "LDC2011T08." 
500 |a "Developed at The University of Edinburgh ... consists of English broadcast news and newswire data originally annotated for the ACE (Automatic Content Extraction) program to which the Edinburgh Regularized ACE (reACE) mark-up has been applied."--Index.html file. 
520 |a Datasets for Generic Relation Extraction (reACE) was developed at The University of Edinburgh, Edinburgh, Scotland. It consists of English broadcast news and newswire data originally annotated for the ACE (Automatic Content Extraction) program to which the Edinburgh Regularized ACE (reACE) mark-up has been applied. 
520 |a The Edinburgh relation extraction (RE) task aims to identify useful information in text (e.g., PersonW works for OrganisationX, GeneY encodes ProteinZ) and to recode it in a format such as a relational database or RDF triple store (a database for the storage and retrieval of Resource Description Framework (RDF) metadata) that can be more effectively used for querying and automated reasoning. A number of resources have been developed for training and evaluation of automatic systems for RE in different domains. However, comparative evaluation is impeded by the fact that these corpora use different markup formats and different notions of what constitutes a relation. 
520 |a reACE solves this problem by converting data to a common document type using token standoff and including detailed linguistic markup while maintaining all information in the original annotation. The subsequent reannotation process normalises the two data sets so that they comply with a notion of relation that is intuitive, simple and informed by the semantic web. 
520 |a The data in this corpus consists of newswire and broadcast news material from ACE 2004 Multilingual Training Corpus LDC2005T09 and ACE 2005 Multilingual Training Corpus LDC2006T06. This material has been standardised for evaluation of multi-type RE across domains. 
530 |a Also available on the Internet. 
650 0 |a Broadcast journalism  |v Databases. 
650 0 |a Content analysis (Communication)  |v Databases. 
650 0 |a Computational linguistics  |v Databases. 
650 0 |a English language  |x Data processing  |v Databases. 
650 0 |a Linguistics  |x Research  |0 http://id.loc.gov/authorities/subjects/sh2008106989 
650 7 |a Broadcast journalism.  |2 fast  |0 http://id.worldcat.org/fast/fst00839167 
650 7 |a Computational linguistics.  |2 fast  |0 http://id.worldcat.org/fast/fst00871998 
650 7 |a Content analysis (Communication)  |2 fast  |0 http://id.worldcat.org/fast/fst00876639 
650 7 |a English language  |x Data processing.  |2 fast  |0 http://id.worldcat.org/fast/fst00911073 
655 7 |a Databases.  |2 fast  |0 http://id.worldcat.org/fast/fst01411643 
710 2 |a Linguistic Data Consortium.  |0 http://id.loc.gov/authorities/names/no2003104537  |1 http://viaf.org/viaf/130534201 
856 4 1 |z For additional information on data files, see the LDC website:  |u http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2011T08 
903 |a HeVa 
929 |a cat 
999 f f |i da812f41-5067-5fcf-861c-91fde2ec1df4  |s 7a87e62d-2bd2-56e5-9277-443407a82306 
928 |t Library of Congress classification  |a PN4784.B75D376 2011  |p DVD  |l ASR  |c ASR-JRLASR  |i 6672705 
928 |t Library of Congress classification  |a PN4784.B75D376 2011  |p DVD  |l ASR  |c ASR-JRLASR  |i 6672706 
928 |t Library of Congress classification  |a PN4784.B75D376 2011  |l Online  |c UC-FullText  |n For additional information on data files, see the LDC website:  |u http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2011T08  |g ebooks  |i 7668054 
927 |t Library of Congress classification  |a PN4784.B75D376 2011  |p DVD  |l ASR  |c ASR-JRLASR  |b 70603442  |i 8940799 
927 |t Library of Congress classification  |a PN4784.B75D376 2011  |p DVD  |l ASR  |c ASR-JRLASR  |b 70603500  |i 8940800