DHAsia @ Stanford Presents: Chinese Text Project: Historical Texts in a Dig...
Thu, April 27, 2017, 3:00 PM – 5:30 PM PDT
In this workshop, Donald Sturgeon (Harvard University) demonstrates the use of digital methods, which offer increasingly powerful tools to aid in the study and analysis of historical written works. In the majority of cases, a prerequisite to applying these techniques is digitization of the material: most fundamentally, the creation of and access to reliable machine-readable editions of the texts in question. As digital tools – particularly those for distant reading – become increasingly sophisticated, organization of the ever-larger body of digitized material becomes an important first step in applying these techniques.
Since its creation in 2005, the Chinese Text Project (http://ctext.org) has grown to become the largest full-text digital library of pre-modern Chinese. As it continues to develop, it presents an ever greater range of opportunities for use in both close and distant reading. On the one hand, the website offers a simple means to access commonly used functions such as full-text search for a wide range of pre-modern Chinese sources; on the other, it also provides users with much more sophisticated mechanisms that make possible more open-ended use of its contents, as well as the ability to contribute directly to the digitization of entirely new materials. This workshop will cover a range of topics including basic usage, editing mechanisms, search patterns, the plugin system, and principles of API access for the Chinese Text Project.
Donald Sturgeon is Postdoctoral Fellow in Chinese Digital Humanities and Social Sciences at the Fairbank Center for Chinese Studies, Harvard University. He holds postgraduate degrees from Soochow University (Taipei) and the University of Hong Kong. His research interests include issues of language, mind and knowledge in classical Chinese thought, and the application of digital methods to the study of pre-modern Chinese language and literature.
Since 2005, he has managed the Chinese Text Project (http://ctext.org), an online digital library of pre-modern Chinese which is now the largest such library in the world and attracts tens of thousands of visitors and large numbers of crowd-sourced contributions every day. His current projects include large-scale Optical Character Recognition (OCR) of historical Chinese documents, the application of machine learning to the dating of pre-modern Chinese texts, and development and evaluation of automated methods for analyzing pre-modern Chinese documents and their relationship to the wider corpus of pre-modern Chinese writing.