Introduction to Data Mining and Predictive Analytics with RapidMiner
Tuesday, September 16th and Wednesday, September 17th from 8:30 am to 5 pm
Light snacks in the morning and afternoon will be provided, as well as lunch.
Be sure to check out our Intermediate Data Mining and Predictive Analytics course being held in the same location on September 18 and 19. Register for both at the same time to get a special discount.
Class size for this event is limited to 12 students. If the class is sold out and you wish to be added to a waiting list, please contact the event organizer.
This course is a two-day introduction to the foundations of data mining, predictive analytics, and RapidMiner software. To give these topics a business context, we will develop a specific business scenario as a through line for the course. The class follows a learn-do model: students have time to focus on new material as it is explained, then apply that understanding in a lab exercise on their own.
After successfully completing this training, participants will have an understanding of how RapidMiner Studio and RapidMiner Server work and are used. They will be able to create predictive models in the standard data environments found within most analyst positions.
Practical exercises during class prepare the participants to transfer the knowledge gained and apply it to their own data mining problems, solving them more quickly and easily. Since the class labs are hands-on and performed on the students' own laptops, students take their actual classwork home with them to jump-start applying it to real-world problems.
After this course, participants will be able to:
- perform basic data preparations
- build initial predictive models
- evaluate model quality
- score new data sets
- Target audience: newcomers, analysts, developers, administrators
- Previous knowledge: basic knowledge of computer programs and mathematics
- Methods: lectures, discussions, individual and group work, exercises on realistic data.
Participants may introduce their own work and project-specific questions in order to find particular solutions together with the trainer and other participants. The training course addresses beginners and intermediate learners.
- Business Scenario
- Data Mining in the Enterprise
- Basic Usage
- User Interface
- Creating and handling RapidMiner repositories
- Starting a new RapidMiner project
- Operators and processes
- Loading data
- Storing data, processes, and results
- EDA: Exploratory Data Analysis
- Data Types
- Data Hierarchy
- Quick Summary Statistics
- Visualizing Data
- Data Preparation
- Normalization and standardization
- Basic transformations of value types
- Handling missing values
- Filtering examples and attributes
- Handling attribute roles
- Building Better Processes
- Relative Path
- Flow Control
- Building Blocks
- Predictive Models
- k-Nearest Neighbor
- Naïve Bayes
- Linear Regression
- Decision Trees
- Importance of attributes
- Model Evaluation
- Applying models
- Splitting data
- Evaluation methods
- Performance criteria
- Sharing and Collaboration
- Exporting images
- RapidMiner Server
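For readers who want a feel for the workflow the outline covers (prepare data, build a model, evaluate it, score new data), here is a minimal Python sketch. It is not RapidMiner code and is not taken from the course materials; the customer data, the column meanings, and every function name are hypothetical, chosen only to illustrate min-max normalization, a k-Nearest Neighbor model, a holdout split, and accuracy as a performance criterion.

```python
import math
from collections import Counter

def scale_row(row, bounds):
    """Rescale one row to [0, 1] using per-column (min, max) bounds."""
    return [(v - lo) / (hi - lo) if hi > lo else 0.0
            for v, (lo, hi) in zip(row, bounds)]

def fit_bounds(rows):
    """Compute (min, max) for each column of the training rows."""
    return [(min(col), max(col)) for col in zip(*rows)]

def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training rows nearest to query."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical data: (monthly_spend, support_calls) -> label
train_rows = [(20, 5), (25, 4), (30, 6), (80, 1), (90, 0), (85, 2)]
train_labels = ["churn", "churn", "churn", "stay", "stay", "stay"]
holdout_rows = [(22, 5), (88, 1)]
holdout_labels = ["churn", "stay"]

# Data preparation: normalize with bounds learned from the training split only
bounds = fit_bounds(train_rows)
train_x = [scale_row(r, bounds) for r in train_rows]

# Model evaluation: accuracy on the holdout split
predictions = [knn_predict(train_x, train_labels, scale_row(r, bounds))
               for r in holdout_rows]
accuracy = sum(p == y for p, y in zip(predictions, holdout_labels)) / len(holdout_labels)

# Scoring: apply the same preparation and model to a new, unlabeled example
new_prediction = knn_predict(train_x, train_labels, scale_row((27, 3), bounds))

print("holdout accuracy:", accuracy)
print("prediction for new customer:", new_prediction)
```

The one design point worth noting is that the scaling bounds come from the training rows alone and are then reused for the holdout and new data, so the evaluation is not influenced by the examples it is measured on.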
You must bring a laptop to class (Windows, Mac or Linux OS). For Windows, Java Runtime Environment (JRE) version 7 is required. For Mac and Linux, Java Development Kit (JDK) version 7 is needed. Students will be provided with links to install RapidMiner Studio 6 prior to the class.
Todd is the Director of RapidMiner University at RapidMiner, a leader in Predictive Analytics providing an easy-to-use desktop-to-cloud solution designed for data scientists and business leaders. As a strong advocate for training and certification, he combines his experience in technology and education to impart real-world use cases to students and users of analytics solutions across multiple industries.
For more than 20 years, Todd has been highly respected as both a technologist and a trainer. As a tech, he has seen that world from many perspectives: “data guy” and developer; architect, analyst, and consultant. As a trainer, he has designed and covered subject matter from operating systems to end-user applications, with an emphasis on data and programming. He is a regular contributor to the community of analytics and technology user groups in the Boston area, writes and teaches on many topics, and looks forward to the next time he can strap on a dive mask and get wet.
Training Facility Logistics
The training will be held at the MicroTek - Washington, D.C. training facility.
There is public parking next to the building. Map and directions.
Public transportation options can be found here.
For lodging options, MicroTek has arranged rates at a number of nearby hotels.
For local information, contact the facility at 202-289-3811.
Classes require a minimum of 3 students by September 2 to be held. If there are insufficient registrants, the class may be cancelled and all students will be refunded the full registration fee. Students should plan their travel arrangements with this proviso in mind.
Can't make it? Sign up for our newsletter to stay in the loop on future events and classes by clicking on the Subscribe button at the top of any page on www.rapidminer.com.
Our Refund Policy: Plans change? We get it. But if you can't make it to the class, please email us at firstname.lastname@example.org no later than September 2. No refunds will be given after this date.
RapidMiner offers a variety of ways to learn and develop your skills with the RapidMiner product suite. Our training courses are the most efficient and effective way for data analysts, data scientists, and administrators to get started with RapidMiner. They are also the perfect preparation for qualifying as a Certified RapidMiner Analyst and Certified RapidMiner Expert.