Introduction to Data Mining and Predictive Analytics
Tuesday, March 18, 2014 at 8:30 AM - Wednesday, March 19, 2014 at 5:00 PM (EDT)
Introduction to Data Mining and Predictive Analytics with RapidMiner
Tuesday, March 18th and Wednesday, March 19th from 8:30 am to 5 pm
A light breakfast, lunch and afternoon snacks will be provided.
Be sure to check out our Advanced Text and Web Mining Techniques with RapidMiner and RapidAnalytics course being held in the same location March 20 and 21. Register for both at the same time to get a special discount.
Class size for this event is limited to 12 students. If the class is sold out and you wish to be added to a waiting list, contact the event organizer.
This course is a two-day introduction to the foundations of data mining, business analytics, and RapidMiner software. After this training course, participants will understand how RapidMiner Studio and RapidMiner Server work and how they are used. They will be able to create predictive models in the standard data environments found in most analyst positions. A large number of practical exercises ensures that participants can transfer what they learn to their own data mining problems and solve them quickly and easily.
After the training course, participants will be able to:
- perform basic data preparations
- build initial predictive models
- evaluate model quality
- score new data sets
Target audience: users, analysts, developers, administrators
Previous knowledge: basic knowledge of computer programs and mathematics
Methods: lectures, discussions, individual and group work, exercises on realistic data
Participants may introduce their own work and project specific questions in order to find particular solutions together with the trainer and other participants. The training course addresses beginners and intermediate learners.
- Basic Usage
  - User Interface
  - Creating and handling RapidMiner repositories
  - Starting a new RapidMiner project
  - Operators and processes
  - Loading data from flat files
  - Loading data from databases
  - Storing data, processes and results
- Data Preparation
  - Normalization and standardization
  - Basic transformations of value types
  - Handling missing values
  - Filtering examples and attributes
  - Handling attribute roles
- Predictive Models
  - Explorative data analysis including data visualization
  - Linear Regression
  - Naïve Bayes
  - Decision Trees
  - Importance of attributes
- Model Evaluation
  - Splitting data
  - Evaluation methods
  - Performance criteria
  - Lift charts
  - ROC plots
  - Applying models
You must bring a laptop to class (Windows, Mac, or Linux is fine). The laptop should have Java Runtime Environment (JRE) version 1.6 (officially Java 6.0) or later installed. Students will be provided with links to install the Community Version of RapidMiner prior to the class.
David Weisman is a data science consultant with over 35 years of experience in the software field. In addition to consulting, he is a researcher at the University of Massachusetts Boston, working at the intersection of molecular biology and data mining. David searches for cancer biomarkers in enormous volumes of DNA sequence data, identifies biosensors of environmental pollutants in bacterial and plant transcriptomic data, and teaches bioinformatics courses. Prior to obtaining his recent Ph.D. in molecular biology, David ran a successful, long-running software consulting firm specializing in distributed system development, compiler design, operating system development, quantitative finance, network security, and health care informatics.
Training Facility Logistics
Classes require a minimum of 3 students by March 10 to be held. If there are insufficient registrants, the class may be cancelled and all students will be refunded the full registration fee. Students should plan their travel arrangements with this proviso in mind. RapidMiner will promptly notify registered users as soon as the three-student minimum is met.
Can't make it? Sign up for our newsletter to stay in the loop on future events and classes by clicking the Subscribe button at the top of any page on www.rapidminer.com.
Our Refund Policy: Plans change? We get it. If you can't make it to the class, please email us at firstname.lastname@example.org no later than March 10. No refunds will be given after that date.
Pioneering advanced analytics vendor RapidMiner is redefining how business analysts use data to predict the future. With an open source heritage, RapidMiner is one of today’s most widely known and used predictive analytics platforms, providing powerful solutions for a wide variety of industries.
RapidMiner focuses on the fields of predictive analytics, data mining, and text mining. Discovering and leveraging unused business intelligence in existing data enables better-informed decisions and allows for process optimization.
RapidMiner serves customers globally from offices in the United States, Germany, and the United Kingdom. Furthermore, our network of partners can support your data analysis projects using RapidMiner software products.