SANDS Meeting - October 13, 2022

Our next meeting is in October and we are excited to announce that we will have two workshops and three presentations!

When and where

Date and time

October 13, 2022

Location

Pfizer La Jolla Campus
10770 Science Center Drive
Building CB2, Room 1110
San Diego, CA 92121

Refund Policy

Contact the organizer to request a refund.
Eventbrite's fee is nonrefundable.

About this event

*** Fee-based Workshop - Kirk Paul Lafler ***

Transform Your Organization to a Data-driven Culture using Data, Analytics and SAS® Software

Abstract

The digital world we live in has data everywhere around us. Represented as facts, figures, and information, data has been translated into a form that is efficient for movement and processing by computing resources. We find data stored in files, data sets, relational database management systems (RDBMS), spreadsheets, emails, text messages, images, pictures, videos, animations, sounds, and many other forms. Data is not only important but serves as the lifeblood of an organization, on which growth, expansion, and survival often depend. As a result, more organizations are finding ways to leverage data resources to help derive better insights and improve decision-making activities.

So, what does a data-driven culture mean and how can an organization become one? Data-driven cultures are best found in organizations whose people understand the driver metrics and how those metrics impact key performance indicators (KPIs). To help achieve a data-driven culture, an organization must utilize data governance and master data management to ensure the consistency, accuracy, usability, and security of its data resources, and provide data democratization, where these data resources are accessible to users throughout the organization.

This workshop will present an approach that can be used to transform an organization to a data-driven culture using data, analytics, and SAS software. We will explore techniques on how to make your data simple and accessible, utilize an analytics foundation to read and process structured and unstructured data, and scale the discovery of new insights using SAS software. Specific topics will include identifying business problems; data cleaning, validation, and organization; performing exploratory data analysis (EDA) and the presentation of trends and relationships; formulating predictions based on emerging trends; conducting analysis to determine if your predictions are true; and communicating results in the context of the original business problems.

Workshop Start and End Time

3:00 PM – 4:30 PM PST

Workshop Cost

$39.95

Workshop Presenter Bio

Kirk Paul Lafler is a SAS consultant, application developer, and programmer; a lecturer and adjunct professor at San Diego State University; and an advisor and adjunct professor at the University of California San Diego Extension. He teaches SAS, SQL, Python, and Excel courses, seminars, workshops, and webinars to users around the world. Kirk has been a SAS user since 1979 and is the author of several books, including PROC SQL: Beyond the Basics Using SAS, Third Edition (SAS Press, 2019), along with papers and articles on a variety of SAS topics. Kirk has also been selected as an invited speaker, educator, keynote, and section leader at SAS conferences, and is the recipient of 26 “Best” contributed paper, hands-on workshop (HOW), and poster awards.

*** 2nd Workshop - Jeff Cao ***

Modern Biostatistical Programming Tools

Abstract

Two innovative tools related to SAS programming will be introduced:

  • An application that can potentially boost your programming team's productivity by 30% or more while ensuring compliance with applicable regulations.
  • An application that allows biostatisticians to create a Shell document in 10 minutes, and allows programmers to keep Shell and TOC in sync in minutes.

Workshop Start and End Time

4:35 PM – 5:35 PM PST

Jeff Cao, Speaker’s bio

Jeff is passionate about designing and developing breakthrough professional software applications specifically for the life science industry. He is co-founder and CEO of RealtimeCRO. He is also President of Viitai, an innovative software company focused on developing efficient and compliant applications to streamline the drug development process. Previously, Jeff served as the head of IT at Ultragenyx Pharmaceutical, where he reported to the CEO and built the IT organization from scratch.

Jeff joined BioMarin Pharmaceutical in 2007, where he was responsible for information technology for all functions in DevOps. Together with the CMO, he developed an advanced application that Google asked to feature as its first case study for the biopharmaceutical industry.

Jeff started his career in life science at Parexel International in 2001. He developed specialized systems that hosted clinical trial information for hundreds of sponsors. From 2005-2007, he was the technical lead for Parexel’s biggest IT project ever, PMED. Before Parexel, Jeff was at Compaq developing software applications for internal content search/indexing.

Jeff holds a master’s in computer science and a Ph.D. in chemistry from the University of Memphis.

SANDS Business Meeting Start and End Time

5:45 PM – 6:00 PM PST

*** SAS Institute Featured Speaker - Ari Zitin ***

Introduction to Machine Learning using SAS Software: From Basic Concepts to Advanced Algorithms

Abstract

Machine learning models are increasingly used in a wide variety of business and scientific applications. Complex nonlinear algorithms can enable you to make accurate predictions on diverse kinds of data, while simpler, more interpretable algorithms can provide insights that might not be obvious from looking at the data. In this session, we learn how to build machine learning pipelines using a SAS Viya graphical interface. We start by introducing core machine learning concepts like the training/validation split and talk in detail about two popular machine learning algorithms: the decision tree model (an interpretable nonlinear model) and the neural network model (an uninterpretable nonlinear model). We compare our nonlinear models to a simple regression model and discuss some of the core ideas around model comparison and evaluation in the field of machine learning. Although most of the examples will be shown using a graphical interface, all the machine learning models and data processing routines are also available in SAS code, and we will also cover a simple machine learning example using code tools available to SAS 9 users.

Presentation Start and End Time

6:00 PM – 7:00 PM PST

Presenter Bio

Ari Zitin holds bachelor’s degrees in both physics and mathematics from UNC-Chapel Hill. His research focused on collecting and analyzing low energy physics data to better understand the neutrino. Ari taught introductory and advanced physics and scientific programming courses at UC-Berkeley while working on a master’s in physics with a focus on nonlinear dynamics. While at SAS, Ari has worked to develop courses that teach how to use Python code to control SAS analytical procedures.

*** 20-minute Presentation - Curtis Smith ***

Navigating the Time Continuum Using SAS to Predict the Future Using Time Series Data

Abstract

When attempting to forecast future values from historical data, some traditional methods (such as regression analysis) may not suffice when the passage of time impacts the dependent variable. For example, if the analyst is considering monthly interest rates, “January” versus “February” as an independent variable is not suitable for predicting the interest rate that will occur in “March.” Complicating the analysis of time series data is the possible existence of cycles, seasons, and/or trends, and the potential lagging impact of human intervention. Statisticians and data scientists have developed time series analyses to analyze time series data; adjust the data for cycles, seasons, and trends; and forecast future values. These analyses include many statistical measures to assess the statistical significance of their results. Fortunately, SAS® includes procedures for performing time series analyses. In this paper, the author will describe a few typical analysis scenarios using time series data, and will then provide a survey of the SAS® procedures TIMEDATA, TIMESERIES, ESM, and ARIMA. All SAS® code used for this presentation was run from SAS® Studio.
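
As a rough illustration of the forecasting idea in this abstract, here is a minimal sketch of simple exponential smoothing, one of the model families that exponential smoothing procedures cover. It is written in Python rather than SAS and is not taken from the presentation. The function name, the smoothing weight alpha, and the monthly interest rates are all made up for the example.

```python
def simple_exponential_smoothing(series, alpha):
    """Return a one-step-ahead forecast for the next period.

    Each new observation pulls the running level toward itself by a
    fraction alpha, so recent months weigh more than older ones.
    """
    if not series:
        raise ValueError("series must contain at least one observation")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in the interval (0, 1]")
    level = series[0]  # start the level at the first observation
    for observed in series[1:]:
        level = alpha * observed + (1 - alpha) * level
    return level


# Hypothetical monthly interest rates (percent), oldest first.
rates = [3.1, 3.3, 3.2, 3.6, 3.8, 4.0]
forecast = simple_exponential_smoothing(rates, alpha=0.5)
print(f"Forecast for next month: {forecast:.2f}%")
```

The forecast depends on the order of the observations, not on a month label, which is the distinction the abstract draws between time series methods and ordinary regression on "January" versus "February".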

Presentation Start and End Time

7:05 PM – 7:25 PM PST

Presenter Bio

Mr. Curtis Smith worked for the Department of Defense for 38.5 years as an Auditor, an IT Technical Specialist, a Supervisory Auditor, a Field Audit Office Manager, a Program Manager for Headquarters’ Operations division, and a Program Manager for Headquarters’ Policy division. Mr. Smith frequently writes technical papers and teaches on data analysis, internal controls, and other audit-related subjects, and has had more than 50 technical papers published in professional publications.

Mr. Smith has been a frequent presenter at professional conferences, in addition to internal presentations within the Department of Defense.

Mr. Smith graduated from California State University Long Beach in 1982 with a Bachelor’s Degree in Accounting and graduated from Georgetown University in 2013 with a Master’s Degree in Policy Management.

Mr. Smith is also an amateur photographer and has self-published two books on Amazon.com.

*** 20-minute Presentation - Ryan Lafler ***

Model Evaluation using Cross-Validation in SAS

Abstract

Model building is a tricky process. Samples from populations are limited in size, and with enough variability between samples, a model (a function of independent variables) can change drastically from one sample to the next, even though every sample comes from the exact same population! How so? The model fails to generalize beyond the scope of the sampled data, which results in overfitting.

Cross-Validation on a sampled dataset provides analysts and researchers with diagnostic tools to reduce model overfitting, design models utilizing optimal specifications of parameters and hyperparameters, and compare model parameterizations to determine which is best fitting.

This presentation introduces cross-validation techniques for building, comparing, and optimizing Generalized Linear Models (GLMs) in SAS.
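
To make the idea concrete, here is a minimal sketch of k-fold cross-validation. It is written in Python rather than SAS and is not the presenter's code. It assumes the simplest possible GLM, an ordinary least squares line, and uses made-up data. The helper names fit_line and k_fold_mse are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Fit y = a + b*x by ordinary least squares and return (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def k_fold_mse(xs, ys, k=5, seed=0):
    """Average held-out mean squared error across k folds."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k disjoint index groups
    fold_errors = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        # Score only on observations the model never saw during fitting.
        squared = [(ys[i] - (a + b * xs[i])) ** 2 for i in held_out]
        fold_errors.append(sum(squared) / len(squared))
    return sum(fold_errors) / k


# Made-up sample: y = 2x + 1 plus Gaussian noise.
rng = random.Random(42)
xs = [float(i) for i in range(30)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 1) for x in xs]
cv_mse = k_fold_mse(xs, ys, k=5)
print(f"5-fold cross-validated MSE: {cv_mse:.3f}")
```

Because every fold is scored on data held out from fitting, the averaged error estimates how well the model generalizes. Comparing this number across candidate models is one way to choose among parameterizations.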

Presentation Start and End Time

7:30 PM – 7:50 PM PST

Presenter Bio

Ryan Lafler is a second-year graduate student at San Diego State University pursuing a Master of Science in Big Data Analytics. He is a Graduate Teaching Associate for STAT-250, Introductory Statistics, and is currently working on his thesis.

Ryan is passionate about statistical modeling, inferential testing, data science, machine learning, and big data analytics using Python, R, SAS, and SQL.

Raffle

7:50 PM – 8:00 PM PST