  • Book
  • © 2014

XML and Web Technologies for Data Sciences with R

  • Covers important modern Web technologies from the R perspective
  • Describes over 30 R packages developed by the authors
  • Comprehensive examples show how to interface with many applications, Web services and data formats

Part of the book series: Use R! (USE R)

Buying options

eBook USD 84.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book USD 109.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Table of contents (18 chapters)

  1. Front Matter

    Pages i-xxiv
  2. Data Formats: XML and JSON

    1. Front Matter

      Pages 1-3
    2. Getting Started with XML and JSON

      • Deborah Nolan, Duncan Temple Lang
      Pages 5-18
    3. An Introduction to XML

      • Deborah Nolan, Duncan Temple Lang
      Pages 19-52
    4. Parsing XML Content

      • Deborah Nolan, Duncan Temple Lang
      Pages 53-74
    5. XPath, XPointer, and XInclude

      • Deborah Nolan, Duncan Temple Lang
      Pages 75-113
    6. Strategies for Extracting Data from HTML and XML Content

      • Deborah Nolan, Duncan Temple Lang
      Pages 115-182
    7. Generating XML

      • Deborah Nolan, Duncan Temple Lang
      Pages 183-225
    8. JavaScript Object Notation

      • Deborah Nolan, Duncan Temple Lang
      Pages 227-253
  3. Web Technologies: Getting Data from the Web

    1. Front Matter

      Pages 255-258
    2. HTTP Requests

      • Deborah Nolan, Duncan Temple Lang
      Pages 259-313
    3. Scraping Data from HTML Forms

      • Deborah Nolan, Duncan Temple Lang
      Pages 315-338
    4. REST-based Web Services

      • Deborah Nolan, Duncan Temple Lang
      Pages 339-379
    5. Simple Web Services and Remote Method Calls with XML-RPC

      • Deborah Nolan, Duncan Temple Lang
      Pages 381-401
    6. Accessing SOAP Web Services

      • Deborah Nolan, Duncan Temple Lang
      Pages 403-439
    7. Authentication for Web Services via OAuth

      • Deborah Nolan, Duncan Temple Lang
      Pages 441-461
  4. General XML Application Areas

    1. Front Matter

      Pages 463-466
    2. Meta-Programming with XML Schema

      • Deborah Nolan, Duncan Temple Lang
      Pages 467-500
    3. Spreadsheets

      • Deborah Nolan, Duncan Temple Lang
      Pages 501-535
    4. Scalable Vector Graphics

      • Deborah Nolan, Duncan Temple Lang
      Pages 537-580

About this book

Web technologies are increasingly relevant to scientists working with data, for both accessing data and creating rich dynamic and interactive displays.  The XML and JSON data formats are widely used in Web services, regular Web pages and JavaScript code, and visualization formats such as SVG and KML for Google Earth and Google Maps.  In addition, scientists use HTTP and other network protocols to scrape data from Web pages, access REST and SOAP Web Services, and interact with NoSQL databases and text search applications.  This book provides a practical hands-on introduction to these technologies, including high-level functions the authors have developed for data scientists.  It describes strategies and approaches for extracting data from HTML, XML, and JSON formats and how to programmatically access data from the Web. 

Along with these general skills, the authors illustrate several applications that are relevant to data scientists, such as reading and writing spreadsheet documents both locally and via Google Docs, creating interactive and dynamic visualizations, building spatial-temporal displays with Google Earth, and generating code from descriptions of data structures to read and write data.  These topics demonstrate the rich possibilities and opportunities to do new things with these modern technologies.  The book contains many examples and case studies that readers can use directly and adapt to their own work.  The authors have focused on the integration of these technologies with the R statistical computing environment.  However, the ideas and skills presented here are more general, and statisticians who use other computing environments will also find them relevant to their work.

Deborah Nolan is Professor of Statistics at the University of California, Berkeley.

Duncan Temple Lang is Associate Professor of Statistics at the University of California, Davis, and has been a member of both the S and R development teams.

Authors and Affiliations

  • University of California, Berkeley, USA

    Deborah Nolan

  • University of California, Davis, USA

    Duncan Temple Lang
