Text Mining with the HathiTrust: Empowering Librarians to Support Digital Scholarship Research - Library of Congress

By HathiTrust Research Center

Date and time

Thursday, August 9, 2018 · 9am - 4pm EDT

Location

James Madison Building

Mumford Room (LM-649) 101 Independence Ave SE Washington, DC 20540

Description

This free workshop for librarians and LIS professionals will introduce attendees to text analysis research and the common methods and tools used in this emerging area of scholarship, with particular attention to the HathiTrust Research Center. The workshop's "train the trainer" curriculum will provide a framework for how librarians can support text data mining, as well as teach transferable skills useful for many other areas of digital scholarly inquiry.

Topics include:

  • Introduction to gathering, managing, analyzing, and visualizing textual data;

  • Hands-on experience with text analysis tools, including the HTRC's off-the-shelf algorithms and datasets, such as the HTRC Extracted Features;

  • Using the command line to run basic text analysis processes.

No experience necessary! Attendees must bring a laptop.


The workshop will run from 9:00am to 4:00pm with a one-hour break for lunch (on your own).


NOTE: The Library of Congress's James Madison Building does not open to the public until *8:30am*. Be prepared for a security screening at the entrance.


Please contact htrc_workshop@library.illinois.edu if you have questions.

Funded in part by IMLS award # RE-00-15-0112-15.

Organized by

The HathiTrust Research Center (HTRC) gives nonprofit and educational users computational access to published works in the public domain and, in the future, access on limited terms to in-copyright works from the HathiTrust.

The HTRC is a collaborative research center launched jointly by Indiana University and the University of Illinois, along with the HathiTrust Digital Library. It helps researchers meet the technical challenges of working with massive amounts of digital text by developing cutting-edge software tools and cyberinfrastructure that enable advanced computational access to the growing digital record of human knowledge.

Leveraging data storage and computational infrastructure at Indiana University and the University of Illinois at Urbana-Champaign, the HTRC will provision a secure computational and data environment for scholars to perform research using the HathiTrust Digital Library. The center will break new ground in the areas of text mining and non-consumptive research, allowing scholars to make full use of the content of the HathiTrust Digital Library while staying within current U.S. copyright law and preventing misuse of intellectual property.
