Intermediate Data Mining and Predictive Analytics
Thursday, June 12, 2014 at 8:30 AM - Friday, June 13, 2014 at 5:00 PM (EDT)
Intermediate Data Mining and Predictive Analytics with RapidMiner
Thursday, June 12th and Friday, June 13th from 8:30 am to 5 pm
A light breakfast, lunch, and afternoon snacks will be provided.
Be sure to check out our Introduction to Data Mining and Predictive Analytics course being held in the same location on June 10 and 11. Register for both at the same time to get a special discount.
Class size for this event is limited to 12 students. If the class is sold out and you wish to be added to a waiting list, please contact the event organizer.
This training is a second two-day course that explores additional ways of performing data mining and business analytics with RapidMiner Studio and RapidMiner Server. After successfully completing this course, participants will have a deeper understanding of how RapidMiner software works and how it is used. They will be able to prepare data and create predictive models in the standard data environments found in most analyst positions, as well as in many less common environments.
Practical exercises during class prepare the participants to transfer the knowledge gained and apply it to their own data mining problems, solving them more quickly and easily. Since the class labs are hands-on, performed on the students' own laptops, the students will be taking their actual classwork home with them to jumpstart their application to the real world.
After the training, students will have the ability to:
- perform the most necessary and common data preparations
- build sophisticated predictive models
- evaluate model quality with respect to different criteria
- deploy data mining models
- Target audience: users, analysts, developers, administrators
- Previous knowledge: Introduction to Data Mining and Predictive Analytics or equivalent
- Methods: lectures, discussions, individual and group work, exercises on realistic data.
Participants may introduce their own work and project-specific questions in order to find particular solutions together with the trainer and other participants. The training course addresses beginners and intermediate learners.
- Data Preparation
- Changing value types by discretization and dichotomization
- Balancing data
- Detection and removal of outliers
- Dimensionality reduction
- Predictive Models
- Neural Networks
- Logistic Regression
- Meta Learning: Bagging and Boosting
- Model Evaluation
- Advanced performance criteria
- Comparison between models
- Significance tests
- Validation of preprocessing and preprocessing models
- Scaling confidences
- Filtering uncertain predictions
- Sharing data, models, and processes
- Exporting processes as web service
- Basics of report creation
- Managing processes and services
You must bring a laptop to class (Windows, Mac or Linux OS). For Windows, Java Runtime Environment (JRE) version 7 is required. For Mac and Linux, Java Development Kit (JDK) version 7 is needed. Students will be provided with links to install RapidMiner Studio 6 prior to the class.
David Weisman, PhD
David Weisman is a data science consultant with over 35 years of experience in the software field. In addition to consulting, he is a researcher at the University of Massachusetts Boston, working at the intersection of molecular biology and data mining. David is searching for cancer biomarkers in enormous volumes of DNA sequence data, identifying biosensors of environmental pollutants in bacterial and plant transcriptomic data, and teaching bioinformatics courses. Prior to obtaining his recent Ph.D. in molecular biology, David ran a successful, long-running software consulting firm specializing in distributed system development, compiler design, operating system development, quantitative finance, network security, and health care informatics.
Training Facility Logistics
The training will be held in Cambridge, MA at the RapidMiner Headquarters training center. The facility is near the Alewife T (metro) stop and other public transportation.
There are a variety of lodging options nearby and in other parts of Cambridge and Boston from which the facility can be reached on foot, by bike, or by public transportation.
The closest reliable public parking is at the Alewife T station. A free public shuttle runs from Alewife to RapidMiner HQ, though the walk is short (perhaps ten minutes).
Classes require a minimum of 3 students by June 4 to be held. If there are insufficient registrants, the class may be cancelled and all students will be refunded the full registration fee. Students should plan their travel arrangements with this proviso in mind. RapidMiner will promptly notify registered students as soon as the 3-student minimum is met.
Can't make it? Sign up for our newsletter to stay in the loop on future events and classes by clicking on the Subscribe button at the top of any page on www.rapidminer.com.
Our Refund Policy: Plans change? We get it. If you can't make it to the class, please email us at firstname.lastname@example.org no later than June 4. No refunds will be given after that date.
Pioneering advanced analytics vendor RapidMiner is redefining how business analysts use data to predict the future. With an open source heritage, RapidMiner is one of today’s most widely known and used predictive analytics platforms, providing powerful solutions for a wide variety of industries.
RapidMiner focuses on the fields of predictive analytics, data mining, and text mining. Discovering and leveraging unused business intelligence in existing data enables better-informed decisions and allows for process optimization.
RapidMiner serves customers globally from offices in the United States, Germany, Hungary and the United Kingdom. Furthermore, our network of partners can support your data analysis projects using RapidMiner software products.