Heterogeneity and Rarity of Data with Dr. He Jingrui
Wednesday, March 27, 2013 from 4:00 PM to 6:00 PM (EDT)
San Francisco, California
London, United Kingdom
Many real-world problems exhibit both heterogeneity and rarity. Take insider threat detection across various social contexts as an example. While the targets, malicious insiders, may be only a very small portion of the entire population (i.e., rarity), each person can be characterized by rich features, such as social friendships, emails, instant messages, etc. (i.e., feature heterogeneity). Moreover, different types of insiders, though correlated, may exhibit different statistical characteristics (i.e., task heterogeneity). For such problems, how can we quickly identify an example from a new rare category? How can we leverage both feature heterogeneity and task heterogeneity to maximally boost learning performance?
In this talk, Dr. He will present her recent work on addressing these two challenges. For the challenge of rarity, she will introduce rare category analysis, e.g., how to detect rare examples with the help of a labeling oracle. For the challenge of heterogeneity, she will present a graph-based approach that takes into consideration both feature heterogeneity and task heterogeneity. Dr. He will also talk about how these techniques can be used in applications such as insider threat detection.
Dr. Jingrui He is currently an assistant professor in the Computer Science Department at Stevens Institute of Technology. She received her M.Sc. and Ph.D. degrees from Carnegie Mellon University in 2008 and 2010, respectively, both in Machine Learning. Her research interests include developing scalable algorithms for heterogeneous learning, rare category analysis, and semi-supervised learning, with applications in social network analysis, semiconductor manufacturing, traffic analysis, etc. She has published over 30 refereed articles and served as an organizing committee member for ICML, KDD, etc.