Arabic treebank - Weblog

Bibliographic Details
Imprint: [Philadelphia, PA] : Linguistic Data Consortium, c2016.
Description: 1 CD-ROM ; 4 3/4 in.
Language: Arabic
Format: Unknown
URL for this record: http://pi.lib.uchicago.edu/1001/cat/bib/10802792
Other authors / contributors: Maamouri, Mohamed.
Linguistic Data Consortium.
ISBN: 1585637416
9781585637416
Notes: Title from disc label.
Data type(s): text.
Data source(s): weblogs.
Application(s): automatic content extraction, cross-lingual information retrieval, information detection.
Author(s): Mohamed Maamouri, Ann Bies, Seth Kulick, Sondos Krouna, Dalila Tabassi, Michael Ciul.
Restricted for use by site license.
Arabic, standard Arabic.
Summary: "The ongoing Penn Arabic Treebank Project (PATB) supports research in Arabic-language natural language processing and human language technology development. ... This release contains 243,117 source tokens before clitics were split, and 308,996 tree tokens after clitics were separated for treebank annotation. The source material is weblogs collected by LDC from various sources."--LDC catalog.