Ansel MacLaughlin

Applied Scientist at Amazon

"Content-based Models of Quotation: Datasets" contains two of the datasets (KJB-CA and LAT-EJC) from our paper "Content-based Models of Quotation" (EACL-21). Unfortunately, we are unable to share any of the data from the JSTOR Understanding Series (KJB-JA, SHAK-JA, ABL-JA). Please contact JSTOR to discuss access to that data.

If you use either of our datasets, please cite our paper:

  author={Ansel MacLaughlin and David A. Smith},
  title={Content-based Models of Quotation},

kjb-ca.jsonl: King James Bible - Chronicling America:

lat-ejc.jsonl: Latin Text - JSTOR Early Journal Content:
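
Both files use the .jsonl extension, which conventionally means one JSON object per line. The following is a minimal Python sketch for loading them, assuming that convention holds here. The helper name read_jsonl is hypothetical, and no field names are assumed, since the record schema is not described above.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Read a JSON Lines file into a list of parsed records.

    Assumes one JSON value per non-empty line (the usual .jsonl convention).
    """
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:  # tolerate blank lines between records
                continue
            records.append(json.loads(line))
    return records


if __name__ == "__main__":
    # File names are taken from the list above; each is skipped if not present locally.
    for name in ("kjb-ca.jsonl", "lat-ejc.jsonl"):
        if Path(name).exists():
            records = read_jsonl(name)
            print(name, len(records), "records")
```

Inspecting the first record (for example, its keys if it is a JSON object) is a reasonable way to discover the schema before writing any further processing code.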