Kowalkiewicz, M., Orlowska, M.E., Kaczmarek, T., Abramowicz, W.
2012, Approx. 255 p.
The eBook version of this title will be available soon
Extensive analysis of existing systems in Web Content Extraction and Deep Web Data Extraction
A comprehensive overview of the subject, both theoretical and practical
Presentation and discussion of original content and data extraction methods designed for non-specialist users (all approaches to date require a computer science degree to use)
The number of reliable information sources on the Internet is growing at an overwhelming rate, while the time and cognitive resources of human users remain unchanged. To curb the information overload that results from this mismatch, recent research has sought methods and tools for web content extraction and aggregation. Success in these areas will greatly enhance business processes and give information seekers new tools that reduce the time and cost of finding information.
This book focuses on web content extraction and deep web data integration, covering the methods and tools used and analyzing the limitations of existing technology and solutions. The volume presents an accessible, well-organized, and comprehensive survey of the discipline. Professionals, researchers, and academics involved in information technology will find this book a timely and essential reference.
Content Level: Professional/practitioner
Keywords: XPath, content aggregation, deep web data integration, information integration, web content extraction
Introduction
Section I. Web Content Extraction
Information Extraction and Aggregation in Contemporary Economy
Web Content Extraction and Aggregation
Practical Aspects of Content Aggregation
Using XPath in Web Content Extraction
Sample Proof-of-concept Prototype of a Web Content Extraction System
Empirical Analysis of Robustness of Content Extraction Systems
Section II. Deep Web Data Extraction
Information Integration
Deep Web in a Context of Information Integration
A Method for Deep Web Information Integration. Model of a Deep Web Information Integration System
Empirical Analysis of a Proposed Deep Web Information Integration System
Summary
Index