Bitext alignment

Bibliographic Details
Author / Creator:Tiedemann, Jörg.
Imprint:San Rafael, Calif. (1537 Fourth Street, San Rafael, CA 94901 USA) : Morgan & Claypool, c2011.
Description:1 electronic text (153 p.) : ill., digital file.
Language:English
Series:Synthesis lectures on human language technologies, 1947-4059 ; # 14
Format: E-Resource Book
URL for this record:http://pi.lib.uchicago.edu/1001/cat/bib/8512898
Hidden Bibliographic Details
ISBN:9781608455119 (electronic bk.)
9781608455102 (pbk.)
Notes:Series from website.
Includes bibliographical references (p. 129-152).
Abstract freely available; full-text restricted to subscribers or individual document purchasers.
Also available in print.
Mode of access: World Wide Web.
System requirements: Adobe Acrobat Reader.
Summary:This book provides an overview of various techniques for the alignment of bitexts. It describes general concepts and strategies that can be applied to map corresponding parts in parallel documents on various levels of granularity. Bitexts are valuable linguistic resources for many different research fields and practical applications. The predominant application is machine translation, in particular statistical machine translation. However, many other lines of work can draw on the rich linguistic knowledge implicitly stored in parallel resources. Bitexts have been explored in lexicography, word sense disambiguation, terminology extraction, computer-aided language learning, and translation studies, to name just a few. The book covers the essential tasks that have to be carried out when building parallel corpora, starting from the collection of translated documents up to sub-sentential alignments. In particular, it describes various approaches to document alignment, sentence alignment, word alignment, and tree structure alignment. It also includes a list of resources and a comprehensive review of the literature on alignment techniques.
Standard no.:10.2200/S00367ED1V01Y201106HLT014