Technical Report Number
Domain knowledge expressed in structured citation formats can be exploited in data mining. We propose four structural properties of canonically cited texts, then look at to two classic problems in the study of the scholia, or ancient scholarly commentary, found in the manuscripts of the Iliad. We cluster citations of scholia to analyze their distribution in different manuscripts; this leads to a revised view of how the manuscripts' scribes drew on their source material. Correlated frequencies of named entities suggest that one group of manuscripts had access to material more closely based on the work of the greatest Hellenistic editor of Homer, Aristarchus of Samothrace.
Dartmouth Digital Commons Citation
Smith, D Neel and Weaver, Gabriel A., "Applying Domain Knowledge from Structured Citation Formats to Text and Data Mining: Examples Using the CITE Architecture" (2009). Computer Science Technical Report TR2009-649. https://digitalcommons.dartmouth.edu/cs_tr/326