Using pointblank Package to Ensure Maximal Data Quality: Pfizer Case Study

Pfizer Case Study: Enhancing Data Quality Assessments: User-Friendly Shiny Application for Reproducible and Flexible Validation Workflows

By R/Pharma

Date and time

Mon, Oct 16, 2023 6:00 AM - 10:00 AM PDT

Location

Online

About this event

Description:

Staying on top of data quality in an organization can be tough. The data may change constantly, there might be lots of it, and ad hoc queries just aren’t going to be effective. With the R package pointblank, we have a tool to run data quality analyses as often as we need them and get actionable insights from the results. In this workshop, Rich will introduce the package and demonstrate how it can be used to assess data quality issues and identify their root causes. Together we will learn how to use all of the tooling that pointblank provides to generate useful data quality reports. We’ll also learn about a new Shiny application that makes pointblank even easier to use. Because when it comes to data testing, easier is always the preferred choice.

Modern data analysis often involves complex workflows for Data Quality Assessments (DQA). The well-known package {pointblank} has proven effective for constructing complex data validation workflows; however, a challenge arises when users lack R skills. To address this, we propose building a Shiny application around the {pointblank} package to provide a user-friendly interface. But what if validation needs to be performed regularly? Our solution involves developing two complementary Shiny apps. The first app guides users through data import, selection, and validation test construction while capturing user inputs as a YAML file. The second app uses the YAML file to "replay" the validation process on new data, ensuring reproducibility and flexibility. With this procedure, DQA becomes accessible to a wider audience and data quality assurance becomes more efficient.
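As a rough illustration of the capture-and-replay idea described above, the sketch below uses pointblank's own YAML helpers (`yaml_write()`, `yaml_read_agent()`, `set_tbl()`). It is not the code of the Shiny apps discussed in the session, and whether those apps call these functions is an assumption. The bundled `small_table` dataset, the two validation steps, the temporary file, and the `new_data` subset are stand-ins chosen only for illustration.

```r
library(pointblank)

# Step 1 (roughly what a "builder" app might do): define a validation plan.
# The table is supplied as a formula so the plan can be serialized to YAML.
agent <-
  create_agent(
    tbl = ~ pointblank::small_table,
    tbl_name = "small_table",
    label = "Illustrative validation plan"
  ) %>%
  col_vals_not_null(columns = vars(a)) %>%
  col_vals_gt(columns = vars(d), value = 0)

# Capture the plan as a YAML file (hypothetical location: a temporary file).
yaml_file <- tempfile(fileext = ".yml")
yaml_write(agent, filename = yaml_file)

# Step 2 (roughly what a "replay" app might do): read the saved plan,
# point it at new data, and run the same checks again.
new_data <- pointblank::small_table[1:5, ]  # stand-in for a fresh data delivery

replayed <-
  yaml_read_agent(filename = yaml_file) %>%
  set_tbl(tbl = new_data) %>%
  interrogate()

# TRUE only if every validation step passed on the new data.
all_passed(replayed)
```

Keeping the plan in a YAML file means the checks can be reviewed and version-controlled separately from the data they are applied to.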

Bio:

Rich Iannone

Rich works at Posit PBC and is the author of the gt and pointblank packages.

Ramanathan Perumal

Dr.-Ing. Ramanathan Perumal is a Lead Data Scientist at Pfizer's R CoE, with over 13 years of hands-on experience in Data Analytics. Holding a Ph.D. in Computational Engineering & Data Processing, his expertise lies in scientific computing and data science. Dr. Perumal excels in leveraging tools like R, Python, and C/C++, and possesses a profound grasp of software development principles.

He has authored six peer-reviewed international scientific research articles and has an h-index of 5. His strength lies in using statistical analysis and data visualization to interpret complex datasets.

Within Pfizer, Dr. Perumal collaborates with leadership and project teams to devise technical solutions for business challenges. He leads the creation and deployment of data science pipelines, R packages, Shiny applications, and modules. He also contributes to the open-source development community and shares his knowledge of R, Python, and ML methods to support Pfizer's business lines.

Organized by

Zoom info will be sent via email closer to the event. Please contact R/Pharma via Eventbrite if you do not see the Zoom info 1 day before the workshop.

R in Pharma Free Workshops run Oct 16-20th, Oct 23rd & Oct 27th! The full list is here: https://rinpharma.com/workshop/2023conference/

The gathering is Oct 24-26, 2023. Be sure to register here: https://hopin.com/events/r-pharma-2023/registration