English news text treebank : Penn treebank revised.

Bibliographic Details
Imprint: [Philadelphia, PA] : Linguistic Data Consortium, [2015]
Description: 1 CD-ROM ; 4 3/4 in.
Language: English
Subject:
Format: Unknown
URL for this record: http://pi.lib.uchicago.edu/1001/cat/bib/10802807
Other authors / contributors: Linguistic Data Consortium, issuing body.
ISBN: 1585637246
9781585637249
Notes: Title from disc label.
"LDC2015T13."
Data type: Text.
Data source: Newswire.
Application: Parsing, tagging, part of speech tagging, natural language processing.
Summary: "English News Text Treebank: Penn Treebank Revised was developed by the Linguistic Data Consortium (LDC) with funding through a gift from Google Inc. It consists of a combination of automated and manual revisions of the Penn Treebank annotation of Wall Street Journal (WSJ) stories. The data is comprised of 1,203,648 word-level tokens in 49,191 sentence-level tokens -- in all 2,312 of the original Penn Treebank WSJ files." -- LDC online catalogue.