Scalable Big Data Analytics for Protein Bioinformatics

Efficient Computational Solutions for Protein Structures

  • Book
  • © 2018

Overview

  • Highlights the potential held by new computational techniques, such as cloud computing and big data technologies, in connection with protein bioinformatics
  • Chiefly focuses on protein structure, which remains poorly understood and is not effectively used in medicine
  • Describes methods for applying structural bioinformatics in medical diagnostics

Part of the book series: Computational Biology (COBO, volume 28)

Table of contents (11 chapters)

  1. Cloud Services for Scalable Computations

  2. Multi-threaded Solutions for Protein Bioinformatics

About this book

This book focuses on proteins and their structures. It describes various scalable solutions for protein structure similarity searching, carried out at the main representation levels, and for the prediction of 3D protein structures. Emphasis is placed on techniques that can be used to accelerate similarity searches and protein structure modeling processes.


The content of the book is divided into four parts. The first part provides background information on proteins and their representation levels, including a formal model of a 3D protein structure used in computational processes, and a brief overview of the technologies used in the solutions presented in the book. The second part discusses cloud services that are used to develop scalable and reliable cloud applications for 3D protein structure similarity searching and protein structure prediction. The third part shows how scalable Big Data computational frameworks, such as Hadoop and Spark, are used in massive 3D protein structure alignments and in the identification of intrinsically disordered regions in protein structures. The fourth part focuses on finding 3D protein structure similarities, accelerated with GPUs, and on the use of multithreading and relational databases for efficient approximate searching on protein secondary structures.


The book introduces advanced techniques and computational architectures that benefit from recent achievements in the field of computing and parallelism. Recent developments in computer science have made it possible to efficiently use algorithms previously considered too time-consuming for applications in bioinformatics and the life sciences. Given its depth of coverage, the book will be of interest to researchers and software developers working in the fields of structural bioinformatics and biomedical databases.



Reviews

“In this book, the author deals with various techniques that can be used for data handling and efficient analysis related to computational processes that require a great deal of time and effort, for example, structure similarity searching, protein structure modeling, protein structure alignment, and superposition.” (Jasbir Kaur, zbMath 1411.92002, 2019)

“This excellent and practically oriented text can benefit researchers seeking to establish a cloud-based bioinformatics HPC facility. Note that most of the solutions are implemented as embarrassingly parallel processes and not as distributed parallel processes. The book will be of interest to researchers and scientific software developers of bioinformatics and biomedical databases.” (Alexander Tzanov, Computing Reviews, June 06, 2019)

Authors and Affiliations

  • Silesian University of Technology, Gliwice, Poland

    Dariusz Mrozek

About the author

Dariusz Mrozek is currently an Associate Professor and Head of the Division of Theory of Informatics in the Institute of Informatics at the Silesian University of Technology (SUT) in Gliwice, Poland. He received his PhD degree from SUT in 2006. His research interests cover bioinformatics, information systems, parallel and cloud computing, databases, and Big Data. He is now focused on the analysis of protein structures, functions, and activities, and on the use of novel computation techniques to gain insights from biological data, including NGS and proteomics data. He is the author of 90+ papers published in conference proceedings and international journals, co-editor of thirteen books devoted to databases and data processing, and editor of two special issues in reputable scientific journals. He is a member of the IEEE Engineering in Medicine and Biology Society (EMBS), the IEEE Systems, Man, and Cybernetics Society (SMCS), and the IEEE Cloud Computing Community. In various research projects, he has cooperated with institutions including Imperial College London (on the Chernobyl Tissue Bank); the V P Komisarenko Institute of Endocrinology and Metabolism, Academy of Medical Sciences of the Ukraine; the Medical Radiological Research Centre, Russian Academy of Medical Sciences; Helmholtz Zentrum Muenchen Deutsches Forschungszentrum Fuer Gesundheit und Umwelt GmbH; Microsoft Research in the USA; the Institute of Oncology in Gliwice, Poland; and the Medical University of Silesia, Katowice, Poland.
