Free

Generating synthetic data with the synthpop package for R

Event Information

Location

Room 03.017 on the third floor

Peter Froggatt Centre

Queen's University Belfast

Belfast

BT7 1PS

United Kingdom

Event description

When concern about disclosure restricts access to a data set, a synthesized version, with no records that correspond to those in the original, can help. Ideally, a researcher will draw the same conclusions from an analysis of the synthetic data as they would from the original. The synthpop package for R has been developed to facilitate the creation of synthetic data and it now includes routines to evaluate the utility of the synthetic data.

This workshop will provide participants with an introduction to synthetic data concepts and methods. A key focus will be on practical issues of generating and analysing synthetic data using the R package synthpop.

A basic knowledge of R is required. Participants not familiar with R are strongly encouraged to learn the basics from the free textbook Introductory Statistics with R by Dalgaard (Chapters 1-2), or from other online resources.

Participants will need to bring their own laptops with the latest version of R and RStudio installed.

Course details

Session 1 Introducing data synthesis and synthpop

This session will offer a brief overview of the history of proposals for synthetic data generation and how these have been used in practice. In particular, we will explain how synthetic data sets are being made available to users of the Scottish Longitudinal Study. After a brief introduction to synthpop, participants will carry out a simple example of data synthesis.

Session 2 Using synthpop

This session will cover the main functions of the synthpop package for R in detail. Through real data examples you will learn how to run default and customised syntheses and how to evaluate the quality of synthetic data using visualisation, formal utility measures, and comparisons of analysis results obtained from the original observed data with those obtained from the synthesised version. Some practical advice on synthesising problematic variables will also be given.


Presenters: Gillian Raab and Beata Nowok

Contact: gillian.raab@ed.ac.uk & beata.nowok@ed.ac.uk
