Text Mining with the HathiTrust: Empowering Librarians to Support Digital Scholarship Research - IFLA

By HathiTrust Research Center

When and where

Date and time

Thursday, August 23, 2018 · 9am - 4pm +08


Perpustakaan Dr. Abdul Latiff, Universiti Kebangsaan Malaysia, Seminar Room 2, Level 3, Block K, Jalan Raja Muda Abdul Aziz, Kuala Lumpur, Federal Territory of Kuala Lumpur 50300, Malaysia


This free workshop for librarians and LIS professionals will introduce attendees to text analysis research and the common methods and tools used in this emerging area of scholarship, with particular attention to the HathiTrust Research Center and HathiTrust Digital Library. The workshop curriculum and discussions will provide a framework for how librarians can support text data mining, as well as teach attendees about transferable skills useful for many other areas of digital scholarly inquiry.

Topics include:

  • Introduction to gathering, managing, analyzing, and visualizing textual data;

  • Hands-on experience with text analysis tools, including HTRC Analytics' off-the-shelf algorithms and datasets such as the HTRC Extracted Features;

  • Using the command line to run basic text analysis processes.

No experience necessary! Attendees do NOT need to be from HathiTrust member institutions. All attendees must bring a laptop.

The workshop will run from 9:00am to 4:00pm, with a one-hour break for lunch (on your own).

Please contact htrc_workshop@library.illinois.edu if you have questions.

Funded by IMLS RE-00-15-0112-15.

About the organizer

The HathiTrust Research Center (HTRC) gives nonprofit and educational users computational access to published works in the public domain and, in the future, will offer access on limited terms to in-copyright works from the HathiTrust.

The HTRC is a collaborative research center launched jointly by Indiana University and the University of Illinois, along with the HathiTrust Digital Library. It helps researchers meet the technical challenges of working with massive amounts of digital text by developing cutting-edge software tools and cyberinfrastructure that enable advanced computational access to the growing digital record of human knowledge.

Leveraging data storage and computational infrastructure at Indiana University and the University of Illinois at Urbana-Champaign, the HTRC will provision a secure computational and data environment for scholars to perform research using the HathiTrust Digital Library. The center will break new ground in the areas of text mining and non-consumptive research, allowing scholars to make full use of the content of the HathiTrust Digital Library while preventing intellectual property misuse, within the confines of current U.S. copyright law.
