  • Book
  • © 2003

Treebanks

Building and Using Parsed Corpora

Editor: Anne Abeillé

Part of the book series: Text, Speech and Language Technology (TLTB, volume 20)


Table of contents (21 chapters)

  1. Front Matter

    Pages i-xxvi
  2. Building Treebanks

    1. Front Matter

      Pages 1-1
    2. English treebanks

      1. The Penn Treebank: An Overview
        • Ann Taylor, Mitchell Marcus, Beatrice Santorini
        Pages 5-22
      2. Thoughts on Two Decades of Drawing Trees
        • Geoffrey Sampson
        Pages 23-41
      3. Bank of English and Beyond
        • Timo Järvinen
        Pages 43-59
      4. Completing Parsed Corpora
        • Sean Wallis
        Pages 61-71
    3. German treebanks

      1. Syntactic Annotation of a German Newspaper Corpus
        • Thorsten Brants, Wojciech Skut, Hans Uszkoreit
        Pages 73-87
      2. Annotation of Error Types for German Newsgroup Corpus
        • Markus Becker, Andrew Bredenkamp, Berthold Crysmann, Judith Klein
        Pages 89-100
    4. Slavic treebanks

      1. The Prague Dependency Treebank
        • Alena Böhmová, Jan Hajič, Eva Hajičová, Barbora Hladká
        Pages 103-127
      2. An HPSG-Annotated Test Suite for Polish
        • Małgorzata Marciniak, Agnieszka Mykowiecka, Adam Przepiórkowski, Anna Kupść
        Pages 129-146
    5. Treebanks for Romance languages

      1. Developing a Syntactic Annotation Scheme and Tools for a Spanish Treebank
        • Antonio Moreno, Susana López, Fernando Sánchez, Ralph Grishman
        Pages 149-163
      2. Building a Treebank for French
        • Anne Abeillé, Lionel Clément, François Toussenel
        Pages 165-187
      3. Building the Italian Syntactic-Semantic Treebank
        • Simonetta Montemagni, Francesco Barsotti, Marco Battista, Nicoletta Calzolari, Ornella Corazzari, Alessandro Lenci et al.
        Pages 189-210
      4. Automated Creation of a Medieval Portuguese Partial Treebank
        • Vitor Rocio, Mário Amado Alves, J. Gabriel Lopes, Maria Francisca Xavier, Graça Vicente
        Pages 211-227
    6. Treebanks for other languages

      1. Sinica Treebank
        • Keh-Jiann Chen, Chi-Ching Luo, Ming-Chung Chang, Feng-Yi Chen, Chao-Jan Chen, Chu-Ren Huang et al.
        Pages 231-248
      2. Building A Japanese Parsed Corpus
        • Sadao Kurohashi, Makoto Nagao
        Pages 249-260
      3. Building a Turkish Treebank
        • Kemal Oflazer, Bilge Say, Dilek Zeynep Hakkani-Tür, Gökhan Tür
        Pages 261-277
  3. Using Treebanks

    1. Front Matter

      Pages 279-279
    2. Encoding Syntactic Annotation

      • Nancy Ide, Laurent Romary
      Pages 281-296
    3. Evaluation with treebanks

      1. Parser Evaluation
        • John Carroll, Guido Minnen, Ted Briscoe
        Pages 299-316

About this book

Linguists and engineers in Natural Language Processing increasingly rely on electronic corpora. Most research has long been limited to raw (unannotated) texts or to tagged texts (annotated with parts of speech only), but these approaches suffer from a word-by-word perspective. A newer line of research involves corpora with richer annotations such as clauses and major constituents, grammatical functions, and dependency links. The first parsed corpora were the English Lancaster treebank and the Penn Treebank. New ones have recently been developed for other languages.
This book:

  • presents the state of the art in work with parsed corpora;
  • gathers 21 papers on building and using parsed corpora, raising many relevant questions;
  • covers a variety of languages and a variety of corpora;
  • is intended for those working in linguistics, computational linguistics, natural language, syntax, and grammar.

Reviews

From the reviews:

"Anne Abeillé draws together a collection of fifteen short pieces focused primarily on the issues that come up in creating treebanks, demonstrated across an impressive variety of languages, along with six chapters on how treebanks are used. … For computational linguists working on automatic parsing, a pass through this book should be required … . The reader … will be rewarded with a clear sense of the challenge and the promise of systematically applying theoretically motivated linguistic representations to ‘language in the large’." (Philip Resnik, Language, Vol. 83 (4), 2007)

Editors and Affiliations

  • Université Paris 7, Paris, France

    Anne Abeillé
