Building and Using Comparable Corpora

Editors: Sharoff, S., Rapp, R., Zweigenbaum, P., Fung, P. (Eds.)

  • A reference source for researchers and students coming to the field of comparable corpora
  • Identifies the state of the art in the field as well as future trends
  • Written by experts in the field

Buy this book

eBook $99.00
price for USA (gross)
  • ISBN 978-3-642-20128-8
  • Digitally watermarked, DRM-free
  • Included format: PDF, EPUB
  • eBooks can be used on all reading devices
  • Immediate eBook download after purchase
Hardcover $129.00
price for USA
  • ISBN 978-3-642-20127-1
  • Free shipping for individuals worldwide
  • Usually dispatched within 3 to 5 business days.
Softcover $129.00
price for USA
  • ISBN 978-3-662-52006-2
  • Free shipping for individuals worldwide
  • Usually dispatched within 3 to 5 business days.
About this book

The 1990s saw a paradigm change in the use of corpus-driven methods in NLP. In the field of multilingual NLP (such as machine translation and terminology mining) this implied the use of parallel corpora. However, parallel resources are relatively scarce: many more texts are produced daily by native speakers of any given language than translated. This situation resulted in a natural drive towards the use of comparable corpora, i.e. non-parallel texts in the same domain or genre. Nevertheless, this research direction has not produced a single authoritative source suitable for researchers and students coming to the field.

The proposed volume provides a reference source, identifying the state of the art in the field as well as future trends. The book is intended for specialists and students in natural language processing, machine translation and computer-assisted translation.

Reviews

“I would like to recommend ‘Building and Using Comparable … to those who are working with or are interested in multilingual and monolingual comparable corpora. … it is easy to say that the notion of comparable corpora was not only visionary, long-sighted, and productive. It is also easy to say that this volume remains the optimal starting point for any research or for any applications in Language Technology leveraging on comparable corpora.” (Marina Santini, forum.santini.se, February, 2017)


Table of contents (17 chapters)

  • Overviewing Important Aspects of the Last Twenty Years of Research in Comparable Corpora

    Sharoff, Serge (et al.)

    Pages 1-17

  • Mining Parallel Documents Using Low Bandwidth and High Precision CLIR from the Heterogeneous Web

    Shi, Simon (et al.)

    Pages 21-49

  • Automatic Comparable Web Corpora Collection and Bilingual Terminology Extraction for Specialized Dictionary Making

    Gurrutxaga, Antton (et al.)

    Pages 51-75

  • Statistical Comparability: Methodological Caveats

    Köhler, Reinhard

    Pages 77-91

  • Methods for Collection and Evaluation of Comparable Documents

    Paramita, Monica Lestari (et al.)

    Pages 93-112

Bibliographic Information

Book Title
Building and Using Comparable Corpora
Editors
  • Serge Sharoff
  • Reinhard Rapp
  • Pierre Zweigenbaum
  • Pascale Fung
Copyright
2013
Publisher
Springer-Verlag Berlin Heidelberg
Copyright Holder
Springer-Verlag Berlin Heidelberg
eBook ISBN
978-3-642-20128-8
DOI
10.1007/978-3-642-20128-8
Hardcover ISBN
978-3-642-20127-1
Softcover ISBN
978-3-662-52006-2
Edition Number
1
Number of Pages
XII, 335
Number of Illustrations and Tables
56 b/w illustrations, 14 illustrations in colour
Topics