Studies in Computational Intelligence

Feature Selection and Enhanced Krill Herd Algorithm for Text Document Clustering

Authors: Abualigah, Laith Mohammad Qasim

  • Presents a new method for solving the text document clustering problem and demonstrates that it can outperform other comparable methods
  • Covers the main text clustering preprocessing steps and the metaheuristics needed to address text document clustering problems
  • Proposes methods that can be applied to a broad range of text documents (e.g. newsgroup documents appearing on newswires, Internet web pages, and hospital information), modern applications (technical reports and university data), and the biomedical sciences (large biomedical datasets)

Buy this book

eBook $129.00
price for USA in USD (gross)
  • ISBN 978-3-030-10674-4
  • Digitally watermarked, DRM-free
  • Included format: PDF, EPUB
  • ebooks can be used on all reading devices
  • Immediate eBook download after purchase
Hardcover $169.99
price for USA in USD
  • ISBN 978-3-030-10673-7
  • Free shipping for individuals worldwide
  • Immediate ebook access, if available*, with your print order
  • Usually dispatched within 3 to 5 business days.
About this book

This book puts forward a new method for solving the text document (TD) clustering problem. The method comprises two main stages:

(i) A new feature selection method based on a particle swarm optimization algorithm with a novel weighting scheme is proposed, together with a detailed dimension reduction technique, to obtain a new subset of more informative features in a low-dimensional space. This subset is then used to improve the performance of the text clustering (TC) algorithm and reduce its computation time. The k-means clustering algorithm is used to evaluate the effectiveness of the obtained subsets.

(ii) Four krill herd algorithms (KHAs), namely the (a) basic KHA, (b) modified KHA, (c) hybrid KHA, and (d) multi-objective hybrid KHA, are proposed to solve the TC problem; each algorithm represents an incremental improvement on its predecessor.

For the evaluation process, seven benchmark text datasets with different characteristics and complexities are used.

Text document (TD) clustering is a recent trend in text mining in which TDs are partitioned into several coherent clusters such that documents in the same cluster are similar to one another. The findings presented here confirm that the proposed methods and algorithms delivered the best results in comparison with similar methods found in the literature.
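
To make the two-stage idea concrete, here is a minimal Python sketch of a "select features, then cluster" pipeline. It is not the book's method. As labeled stand-ins, it uses a simple document-frequency threshold in place of the particle swarm optimization feature selection, and a plain k-means variant with cosine similarity in place of the krill herd algorithms. The toy corpus and the names `select_by_document_frequency`, `tfidf_vectors`, and `kmeans_cosine` are assumptions made for illustration only.

```python
import math
from collections import Counter

def select_by_document_frequency(tokenized, min_df):
    """Stand-in feature selection: keep terms found in at least min_df documents."""
    df = Counter(term for tokens in tokenized for term in set(tokens))
    return sorted(term for term, count in df.items() if count >= min_df)

def tfidf_vectors(tokenized, vocab):
    """Build one TF-IDF weight vector per document over the selected vocabulary."""
    n_docs = len(tokenized)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append([tf[t] * math.log(n_docs / df[t]) for t in vocab])
    return vectors

def cosine(a, b):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def kmeans_cosine(vectors, k, max_iterations=20):
    """k-means with cosine similarity, seeded with the first k documents."""
    centroids = [list(v) for v in vectors[:k]]
    labels = None
    for _ in range(max_iterations):
        # Assign each document to its most similar centroid.
        new_labels = [
            max(range(k), key=lambda c: cosine(v, centroids[c])) for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each centroid to the mean of its assigned documents.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical toy corpus with two obvious topics.
docs = [
    "krill swarm optimization krill herd",
    "document clustering text mining",
    "krill herd swarm search",
    "text document clustering features",
]
tokenized = [doc.lower().split() for doc in docs]
selected = select_by_document_frequency(tokenized, min_df=2)
vectors = tfidf_vectors(tokenized, selected)
labels = kmeans_cosine(vectors, k=2)
print(selected)
print(labels)
```

Dropping terms that occur in only one document shrinks the vocabulary from ten terms to six before clustering. This mirrors, in a very simplified way, the goal of clustering in a lower-dimensional feature space.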

Reviews

“The book is well written, with high-quality tables and graphs. Each chapter ends with a collection of references, including the most recent work in the area. The book should be very useful for scholars who want to study the general field of text document clustering. It is also a good reference for those who work in text document clustering and use genetic algorithms.” (Xiannong Meng, Computing Reviews, May 10, 2019)



Table of contents (6 chapters)

Bibliographic Information

Book Title
Feature Selection and Enhanced Krill Herd Algorithm for Text Document Clustering
Authors
Abualigah, Laith Mohammad Qasim
Series Title
Studies in Computational Intelligence
Series Volume
816
Copyright
2019
Publisher
Springer International Publishing
Copyright Holder
Springer Nature Switzerland AG
eBook ISBN
978-3-030-10674-4
DOI
10.1007/978-3-030-10674-4
Hardcover ISBN
978-3-030-10673-7
Series ISSN
1860-949X
Edition Number
1
Number of Pages
XXVII, 165
Number of Illustrations
2 b/w illustrations, 21 illustrations in colour

*Immediately available upon purchase, as print book shipments may be delayed due to the COVID-19 crisis. Ebook access is temporary and does not include ownership of the ebook. Only valid for books with an ebook version. Springer Reference Works are not included.