$1,500 – $2,500

Learn Bayesian Data Analysis and Stan with Stan developer Jonah Gabry

Event Information

Location

eBay NYC

625 6th Avenue

Floor 3

New York, NY 10011

Refund Policy

Refunds up to 30 days before event

Event description

Learn Bayesian Data Analysis (BDA) and Markov chain Monte Carlo (MCMC) computation using Stan with Stan developer Jonah Gabry.

This three-day workshop will be taught by Jonah Gabry. Jonah is a Stan developer based at Columbia University and the developer of many R packages for applied Bayesian data analysis (rstan, rstanarm, rstantools, bayesplot, shinystan, loo). Jonah will be joined by fellow Stan developer Rob Trangucci, and other members of the Stan Development Team will make guest appearances throughout the course.

The course covers three main themes: Bayesian inference and computation; the Stan modeling language; and applied statistics/Bayesian data analysis in practice. There will be some lectures to cover important concepts, but the course will also be heavily interactive, with much of the time dedicated to hands-on examples. We will be interfacing with Stan from R, but users of Python and other languages/platforms can still benefit from the course, because all of the code we write in the Stan language (and all of the modeling techniques and concepts covered in the course) can be used with any of the Stan interfaces.

Participants will receive a copy of Andrew Gelman's landmark book Bayesian Data Analysis. Proceeds from the class support further development of Stan and the New York Open Statistical Programming Meetup.

Before class, everyone should install R, RStudio, and RStan on their computers. If problems occur, please join the stan-users group and post your questions there. It is important that all participants get Stan running and bring their laptops to the course.

Example topics and the course structure are listed below. Actual coverage will be determined during the class based on the participants.

Section 1: Bayesian Inference in Theory and Practice

This section covers (or reviews, depending on the audience) the most essential concepts that form the foundations of Bayesian inference. The focus is on the necessary background required to successfully apply Bayesian statistics to real world problems.

  • What is Bayesian inference and how does it differ from other forms of statistical inference?
    • Advantages and disadvantages compared to frequentist inference and approximate forms of Bayesian inference
    • Generative models
    • The role of prior distributions in practice
    • Important properties of posterior distributions
  • The Bayesian data analysis workflow
    • Iterative model building, checking, and refinement
  • Bayesian computation with Markov chain Monte Carlo
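
As a rough illustration of the last topic above, here is a minimal Python sketch of a random-walk Metropolis sampler for a binomial success rate with a Beta prior. This is not how Stan samples (Stan uses Hamiltonian Monte Carlo), and the data (12 successes in 40 trials), the function names, and the tuning values are all assumptions chosen for illustration. The conjugate Beta posterior gives an exact mean to compare against.

```python
import math
import random


def log_posterior(theta, successes, trials, a=1.0, b=1.0):
    """Unnormalized log posterior: binomial likelihood times Beta(a, b) prior."""
    if not 0.0 < theta < 1.0:
        return -math.inf  # zero posterior density outside (0, 1)
    return ((successes + a - 1.0) * math.log(theta)
            + (trials - successes + b - 1.0) * math.log(1.0 - theta))


def metropolis(successes, trials, n_draws=20000, step=0.1, seed=1):
    """Random-walk Metropolis; returns the second half of the chain."""
    rng = random.Random(seed)
    theta = 0.5  # arbitrary starting value inside (0, 1)
    current = log_posterior(theta, successes, trials)
    draws = []
    for _ in range(n_draws):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, successes, trials)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        draws.append(theta)
    return draws[n_draws // 2:]  # discard the first half as warmup


# Hypothetical data: 12 successes in 40 trials, uniform Beta(1, 1) prior.
draws = metropolis(successes=12, trials=40)
mcmc_mean = sum(draws) / len(draws)
exact_mean = (12 + 1) / (40 + 2)  # mean of the conjugate Beta(13, 29) posterior
print(round(mcmc_mean, 3), round(exact_mean, 3))
```

The two printed numbers should agree closely; the gap between them is Monte Carlo error, which the diagnostics discussed later in the course are designed to assess.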

Section 2: Intro to Stan

This section introduces the Stan modeling language; RStan, the interface for fitting Stan models from R; and the rest of the Stan ecosystem (e.g., the many supporting R packages).

  • What is Stan and why is it an important tool for Bayesian data analysis?
  • Programming statistical models in the Stan language
    • Understanding the structure of a Stan program
    • Key similarities and differences between the Stan language and other common programming languages
  • Using the Stan interfaces to fit models
    • Introduction to RStan, the R interface to Stan
    • Estimating models in Stan using data from an R session
    • Working with the fitted model objects returned by RStan
    • Brief intro to the Stan Development Team’s packages for diagnostics and visualization (bayesplot, shinystan). These packages are covered in more detail in later sections.

Section 3: Linear and Generalized Linear Models in Stan

This section covers how to program the most commonly used regression models in Stan and fit them using the RStan interface. In subsequent sections we will add hierarchical structure to these models.

  • Review of regression and generalized linear models (GLMs) from a Bayesian perspective
  • Programming and fitting GLMs in Stan
    • Models for continuous data, binary data, count data
    • Examples will be drawn from various domains including A/B testing, political polling, clinical trials, sports and more
  • How to think about specifying prior distributions for parameters in GLMs
    • Weakly informative defaults
    • When are non-informative priors appropriate?
    • Translating prior knowledge into mathematical form

Section 4: Model Checking and Model Comparison

This section covers methods and tools for checking the fit of a model to data and comparing (or combining) multiple competing models.

  • Understanding the role of the posterior predictive distribution for model checking
  • Graphical and numerical posterior predictive checking using the bayesplot and shinystan R packages:
    • How to use visualizations of the posterior predictive distribution to identify important features of the data not captured by a model
    • Using posterior predictive checks to motivate improvements to the A/B testing model from Section 3
  • Comparing multiple models on estimated predictive performance
    • When are techniques like cross-validation appropriate?
    • The importance of predictive power and explanatory power and the frequent tension between them
    • Introduction to the loo R package for model comparison and model averaging
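
To make the idea of a numerical posterior predictive check concrete, here is a minimal Python sketch. It assumes a made-up binary sequence, a model that treats the observations as independent draws with a single success rate and a uniform prior, and a test statistic (the number of switches between 0 and 1) chosen because that model has no way to reproduce clustering. The course itself uses the bayesplot and shinystan R packages for this; the function names below are hypothetical.

```python
import random


def count_switches(sequence):
    """Number of adjacent positions where the value changes."""
    return sum(1 for a, b in zip(sequence, sequence[1:]) if a != b)


def ppc_switches(y, n_sims=4000, seed=1):
    """Share of replicated datasets with at most as many switches as y."""
    rng = random.Random(seed)
    n, ones = len(y), sum(y)
    observed = count_switches(y)
    as_extreme = 0
    for _ in range(n_sims):
        # Draw the success rate from its Beta posterior (uniform prior).
        theta = rng.betavariate(1 + ones, 1 + n - ones)
        # Simulate a replicated dataset of the same length from the model.
        y_rep = [1 if rng.random() < theta else 0 for _ in range(n)]
        if count_switches(y_rep) <= observed:
            as_extreme += 1
    return observed, as_extreme / n_sims


# Made-up data with long runs of identical values.
y = [1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
observed, tail_prob = ppc_switches(y)
print(observed, tail_prob)
```

A small tail probability means the replicated datasets rarely show as few switches as the observed data, which points to dependence between neighboring observations that the independence model does not capture.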

Section 5: Hierarchical/Multilevel Models (Part 1)

In this section we focus on more advanced models that incorporate hierarchical structures unique to the particular application. These models are more difficult computationally and require paying more attention to diagnostics that motivate changes to the models.

  • Review of hierarchical models from a Bayesian perspective
    • Bias/variance tradeoff
    • Partial pooling, shrinkage, borrowing strength, regularization
    • Hyperpriors and hyperparameters
  • Implementing hierarchical models in Stan
    • Adding hierarchical structure to the GLMs from Section 3
    • Coding tips for balancing ease of programming, code clarity, and computational efficiency
  • Diagnosing and fixing computational problems when fitting hierarchical models
    • Visual and numerical Markov chain Monte Carlo diagnostics
    • Sampler tuning parameters
    • Reparameterization
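
Partial pooling and shrinkage can be sketched with the simplest case, a normal model for group means. The Python example below assumes the population mean `mu`, the between-group standard deviation `tau`, and the within-group standard deviation `sigma` are known; in a full hierarchical model fit with Stan these would be estimated together with the group effects. The function name and the numbers are hypothetical.

```python
def partial_pooling(group_means, group_sizes, sigma, mu, tau):
    """Posterior means of group effects in a normal-normal model.

    Each estimate is a precision-weighted average of the group's own
    sample mean and the population mean mu.
    """
    estimates = []
    for ybar, n in zip(group_means, group_sizes):
        data_precision = n / sigma ** 2    # information from the group's data
        prior_precision = 1.0 / tau ** 2   # information from the population
        weight = data_precision / (data_precision + prior_precision)
        estimates.append(weight * ybar + (1.0 - weight) * mu)
    return estimates


# Two groups with the same sample mean but very different sample sizes.
group_means = [2.0, 2.0]
group_sizes = [1, 100]
estimates = partial_pooling(group_means, group_sizes, sigma=1.0, mu=0.0, tau=1.0)
print(estimates)
```

The group with a single observation is pulled halfway toward the population mean, while the group with 100 observations barely moves. This is the sense in which small groups borrow strength from the rest of the data.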

Section 6: Hierarchical/Multilevel Models (Part 2)

In this section we continue with the topic of hierarchical models and introduce techniques for decision making using inferences from the models.

  • Intro to more advanced hierarchical modeling techniques
    • Temporal variation
    • Spatial correlation structures
    • Splines and Gaussian processes
  • Forecasting and out-of-sample prediction with hierarchical models
    • Out-of-sample prediction using the generated quantities block of a Stan program
    • Decision analysis (e.g., setting prices to maximize expected revenue, cost/benefit analysis in healthcare)

Section 7: Wrapping Up

  • Review essential concepts from previous sections
  • Time for discussing additional topics of interest to participants
  • Various tips and tricks for becoming an advanced Stan user
  • Q&A session with other Stan developers

For more information, please contact us.

