DHAsia @ Stanford Presents: Situating Knowledge with Large Corpus Analysis:...
Thu, April 13, 2017, 3:00 PM – 5:30 PM PDT
In this workshop, Michael Stanley-Baker (Max Planck Institute for the History of Science) will demonstrate the use of three toolsets as a foundation for building a trans-disciplinary research project.
The first tool enables users to analyse large corpora of documents against a user-defined vocabulary set, showing where those terms appear and how frequently. By analysing these results, users can home in on useful texts, or “hot spots” within the larger corpus, for further research. The results can also be used to show the overlap or distance between vocabulary sets by drawing simple network relationship graphs.
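For readers who want a feel for this kind of analysis before the workshop, here is a minimal Python sketch of the general idea. It is not the tool demonstrated in the workshop. The function names, the sample texts and the two-term vocabulary are all hypothetical, and plain substring matching is assumed because it works on unsegmented Chinese text.

```python
from collections import Counter


def term_frequencies(documents, vocabulary):
    """Count how often each vocabulary term occurs in each document.

    documents: dict mapping a document name to its full text.
    vocabulary: iterable of terms to look for.
    Returns a dict mapping each document name to a Counter of term hits.
    """
    results = {}
    for name, text in documents.items():
        counts = Counter()
        for term in vocabulary:
            hits = text.count(term)  # non-overlapping substring matches
            if hits:
                counts[term] = hits
        results[name] = counts
    return results


def hot_spots(results, top_n=3):
    """Rank documents by total vocabulary hits, highest first."""
    totals = {name: sum(counts.values()) for name, counts in results.items()}
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


def shared_documents(results, vocab_a, vocab_b):
    """List documents containing at least one term from each vocabulary set.

    The size of this list is one simple measure of overlap between two
    vocabulary sets, and could serve as an edge weight in a network graph.
    """
    shared = []
    for name, counts in results.items():
        has_a = any(counts[term] > 0 for term in vocab_a)
        has_b = any(counts[term] > 0 for term in vocab_b)
        if has_a and has_b:
            shared.append(name)
    return sorted(shared)


# Hypothetical sample data, for illustration only.
documents = {
    "doc1": "人參與甘草同煎，人參為君",
    "doc2": "甘草一兩",
    "doc3": "無相關詞",
}
vocabulary = ["人參", "甘草"]

results = term_frequencies(documents, vocabulary)
print(hot_spots(results, top_n=2))
print(shared_documents(results, ["人參"], ["甘草"]))
```

Ranking by raw hit count favours long documents; dividing each total by the document length would be one way to compare texts of very different sizes.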
The second stage uses MARKUS to mark up selected texts.
Finally, we export the MARKUS results into DocuSky for analysis, comparison between paragraphs, and visualisation of their distribution across time, space and genre. We will also learn how to use the term-equivalence feature in DocuSky.
LAPTOPS ARE REQUIRED & WILL NOT BE PROVIDED.
Files will be available for use during the workshop, but participants are also invited to bring sample corpora in any language. Chinese is preferred, but you are welcome to experiment with other languages. If you bring your own corpus, please also prepare a list of terms to help you identify your topic within your texts. The longer the list, the better.
DHAsia gratefully acknowledges support from the Stanford Center for Spatial and Textual Analysis, the Stanford Center for Interdisciplinary Digital Research, the Center for East Asian Studies, and the Stanford Confucius Institute, among other supporters.