Managing Data with R
Friday, December 7, 2012 from 10:00 AM to 2:00 PM (PST)
Before you can analyze data, it must be in a form you can analyze. That is where we typically spend most of our time. This 4-hour workshop shows how to perform the most commonly used data management tasks in R. For many topics, we will cover the standard built-in approach and then perform the same task using add-on packages, which are often much easier to use.
We will devote most of our time to working through examples that you may run simultaneously on your computer. However, handouts will include each step and its output if you prefer to just relax and take notes instead. Most are drawn from the extensive data management examples in R for SAS and SPSS Users, R for Stata Users, and http://r4stats.com. That makes it easy to review what we did later with full explanations, or to learn more about a particular subject by extending an example you have already learned. (No coverage of SAS, SPSS or Stata is included in this workshop.)
At the end of the workshop, you will receive a set of practice exercises to do on your own time, along with solutions to the problems. The instructor will be available via email to answer questions about these problems or any other topic in the workshop.
Attendees should know basic R programming, including how to read data files and call functions.
When finished, you will be able to prepare most data sets for analysis.
Robert A. Muenchen is the creator of the web site http://r4stats.com and the author of R for SAS and SPSS Users and, with Joseph Hilbe, R for Stata Users. A consulting statistician with 30 years of experience, Bob is currently the manager of Research Computing Support (formerly the Statistical Consulting Center) at the University of Tennessee. Bob has served on the advisory boards of SPSS Inc., the Statistical Graphics Corporation and PC Week Magazine. His suggested improvements and programming code have been incorporated into SAS, SPSS, JMP, STATGRAPHICS and several R packages.
Course Outline (Data management topics have been moved to a separate workshop)
1. Missing values
2. Mean substitution
3. Why the attach function is dangerous
4. Transforming variables – within, transform, mutate
5. Mathematical and statistical operators
6. The “apply” family of functions and the plyr package
7. Variable length vs. n of valid observations
8. Adding function results to data frames
9. Conditional transformations
10. Sorting data (sort, order, arrange functions)
11. Recoding variables
12. Renaming variables
13. Keeping and dropping variables
14. Stacking/concatenating/adding cases to data sets
15. Joining/merging/adding variables to data sets
16. Creating summary/aggregate data sets to analyze further
17. Reshaping data (wide to long and vice versa)
18. “Casting” aggregates with the reshape package
19. Character string manipulations using the stringr package
20. Date / time manipulations using the lubridate package
Disclaimer: We have the right to cancel the event for any reason at any time. Revolution Analytics will refund all monies paid for ticket sales in full in the event of a cancellation. We are not responsible for any travel-related expenses incurred by attendees for this event. This includes, but is not limited to, transportation, hotel accommodations or any other travel-related expenses secured by the attendee, in the event of a cancellation on our part.
30 days from event date: Full refund less 10% of the paid ticket price
21 days from event date: 50% of paid ticket price
Within 15 days of event date: Non-refundable