Handbook of Linguistic Annotation

  • Book
  • © 2017

Overview

  • Leading scientists guide the reader through the process of modeling a phenomenon, creating an annotation language, building a corpus, and evaluating it for correctness
  • Offers a thorough treatment of the science of annotation with clearly defined methodology
  • Aimed at and accessible for both computer scientists and linguistic researchers
  • Includes supplementary material: sn.pub/extras

Table of contents (55 chapters)

  1. The Science of Annotation

  2. Case Studies

About this book

This handbook offers a thorough treatment of the science of linguistic annotation. Leaders in the field guide the reader through the process of modeling, creating an annotation language, building a corpus and evaluating it for correctness. Essential reading for both computer scientists and linguistic researchers.
Linguistic annotation is an increasingly important activity in computational linguistics because of its critical role in the development of language models for natural language processing applications. Part one of this book covers all phases of the linguistic annotation process, from annotation scheme design and choice of representation format through manual and automatic annotation, evaluation, and iterative improvement of annotation accuracy. Part two presents case studies of annotation projects across the spectrum of linguistic annotation types, including morpho-syntactic tagging, syntactic analyses, a range of semantic analyses (semantic roles, named entities, sentiment and opinion), temporal, event, and spatial analyses, and discourse-level analyses such as discourse structure and co-reference. Each case study addresses the phases and processes discussed in the chapters of part one.

Reviews

“In this context, this book is an important effort towards giving linguistic annotation full attention. … Indeed, this handbook will give you all you need to conceive your annotation scheme and assess its quality … . this book undoubtedly finds its place in every linguistics department library as a major reference on linguistic annotation.” (Emmanuel Schang, The Linguist List, linguistlist.org, August, 2018)

“Handbook of Linguistic Annotation is worth reading in that this volume presents a spate of annotation projects … . This book includes a detailed introduction to a wealth of linguistic annotated resources and is worthy of recommendation for researchers of Quantitative Linguistics because these resources can either be used as direct sources for future quantitative studies or offer various choices on the annotation patterns.” (Peng Bi, Journal of Quantitative Linguistics, January, 2018)

Editors and Affiliations

  • Department of Computer Science, Vassar College, Poughkeepsie, USA

    Nancy Ide

  • Department of Computer Science, Volen Center for Complex Systems, Brandeis University, Waltham, USA

    James Pustejovsky

About the editors

Nancy Ide is Professor of Computer Science at Vassar College in Poughkeepsie, New York, USA. She has been in the field of computational linguistics for over 30 years and has made significant contributions to research in word sense disambiguation, computational lexicography, discourse analysis, and the use of semantic web technologies for language data. She is founder of the Text Encoding Initiative (TEI), the first major standard for representing electronic language data, and later developed the XML Corpus Encoding Standard (XCES). More recently, she co-developed the ISO LAF/GrAF representation format for linguistically annotated data. She has also developed major corpora for American English, including the Open American National Corpus (OANC) and the Manually Annotated Sub-Corpus (MASC), and has been a pioneer in efforts to foster open data and resources. Professor Ide is Co-Editor-in-Chief of the journal Language Resources and Evaluation and Editor of the Springer book series Text, Speech, and Language Technology.
 
James Pustejovsky is the TJX Feldberg Professor of Computer Science at Brandeis University in Waltham, Massachusetts, USA. His expertise includes theoretical and computational modeling of language, specifically computational linguistics, lexical semantics, knowledge representation, temporal and spatial reasoning, and extraction. His main research topics are natural language processing in general and, in particular, the computational analysis of linguistic meaning. He proposed Generative Lexicon theory in lexical semantics. His other interests include temporal reasoning, event semantics, spatial language, language annotation, computational linguistics, and machine learning.