Name: Synthetic Datasets for Statistical Disclosure Control
ISBN: 978-1-4614-0326-5

Overview

Authors:

Jörg Drechsler
Department for Statistical Methods, Institute for Employment Research, Nürnberg, Germany

The first book to fully cover all approaches to generating multiply imputed synthetic datasets
Combines theory with practical implementation issues, making it appealing to researchers and practitioners alike
Obtaining access to real datasets, which is already burdensome for researchers, will become increasingly complicated as confidentiality concerns rise. Dissemination strategies such as generating synthetic datasets, which allow the release of data with high analytical validity while guaranteeing the confidentiality of respondents, are greatly needed
Addresses the problem of benefiting society by releasing microdata that preserve individuals’ confidential information yet still allow valid inferences at some level of detail
Includes supplementary material: sn.pub/extras

Part of the book series: Lecture Notes in Statistics (LNS, volume 201)

10k Accesses
69 Citations
2 Altmetric

Table of contents (10 chapters)

Front Matter (pages i-xx)
Introduction (pages 1-5)
Background on Multiply Imputed Synthetic Datasets (pages 7-11)
Background on Multiple Imputation (pages 13-21)
The IAB Establishment Panel (pages 23-25)
Multiple Imputation for Nonresponse (pages 27-37)
Fully Synthetic Datasets (pages 39-51)
Partially Synthetic Datasets (pages 53-63)
Multiple Imputation for Nonresponse and Statistical Disclosure Control (pages 65-85)
A Two-Stage Imputation Procedure to Balance the Risk–Utility Trade-Off (pages 87-97)
Chances and Obstacles for Multiply Imputed Synthetic Datasets (pages 99-102)
Back Matter (pages 103-138)

About this book

The aim of this book is to give the reader a detailed introduction to the different approaches to generating multiply imputed synthetic datasets. It describes all approaches that have been developed so far, provides a brief history of synthetic datasets, and gives useful hints on how to deal with real data problems like nonresponse, skip patterns, or logical constraints.

Each chapter is dedicated to one approach: it first describes the general concept and then presents a detailed application to a real dataset, providing useful guidelines on how to implement the theory in practice.

The discussed multiple imputation approaches include imputation for nonresponse, generating fully synthetic datasets, generating partially synthetic datasets, generating synthetic datasets when the original data is subject to nonresponse, and a two-stage imputation approach that helps to better address the omnipresent trade-off between analytical validity and the risk of disclosure.

The book concludes with a glimpse into the future of synthetic datasets. It discusses the potential benefits and possible obstacles of the approach, as well as ways to address the concerns of data users and their understandable discomfort with using data that do not consist only of the originally collected values.

The book is intended for researchers and practitioners alike. For the researcher, it summarizes the state of the art in synthetic data in one book, with full references to all relevant papers on the topic. It is also useful for the practitioner at a statistical agency who is considering the synthetic data approach for future data dissemination and wants to become familiar with the topic.

Reviews

From the reviews:

“This book explores … the generation of synthetic data with the same statistical properties as the original data, but containing different records. … It also addresses complications which arise with real data, such as missing values. The book provides an excellent overview of the area, and would make ideal reading for anyone new to the issues, as well as serving as a good source book describing technical details.” (David J. Hand, International Statistical Review, Vol. 80 (3), 2012)

Authors and Affiliations

Jörg Drechsler
Department for Statistical Methods, Institute for Employment Research, Nürnberg, Germany

About the author

Jörg Drechsler is a Research Scientist at the German Institute for Employment Research, Department for Statistical Methods. His main areas of research are statistical disclosure control and imputation, with papers published in JASA, Statistica Sinica, JOS, and Survey Methodology.

Bibliographic Information

Book Title: Synthetic Datasets for Statistical Disclosure Control
Book Subtitle: Theory and Implementation
Authors: Jörg Drechsler
Series Title: Lecture Notes in Statistics
DOI: https://doi.org/10.1007/978-1-4614-0326-5
Publisher: Springer New York, NY
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
Copyright Information: Springer Science+Business Media, LLC 2011
Softcover ISBN: 978-1-4614-0325-8 (published 29 June 2011)
eBook ISBN: 978-1-4614-0326-5 (published 24 June 2011)
Series ISSN: 0930-0325
Series E-ISSN: 2197-7186
Edition Number: 1
Number of Pages: XX, 138
Number of Illustrations: 19 b/w illustrations
Topics: Statistics for Social Sciences, Humanities, Law; Statistics for Business, Management, Economics, Finance, Insurance; Statistics for Life Sciences, Medicine, Health Sciences
