Dataset Open Access
This study uses the Stanford Named Entity Recognizer (NER), a machine-learning tagger, to compare the relative rates at which William James and John Dewey mention other persons in their books. The NER attempts to tag words and phrases in a corpus as PERSON, ORGANIZATION, or LOCATION. I built two corpora, one of major books published by James and one of major books published by Dewey, and ran the NER on each. I then created databases collecting every entity tagged PERSON in each book of each corpus.
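The collection step can be illustrated with a minimal sketch. It assumes the tagger's output is in the inline `word/TAG` form (Stanford NER's `slashTags` output format); the record does not say which format or tooling was actually used. The function names `person_entities` and `mention_rate`, the sample sentence, and the per-1,000-token normalization are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def person_entities(tagged_text, label="PERSON"):
    """Group consecutive tokens carrying `label` into multi-word entities.

    Assumes whitespace-separated tokens of the form word/TAG.
    """
    entities = []
    current = []
    for token in tagged_text.split():
        # Split on the LAST slash so words containing "/" keep their text.
        word, sep, tag = token.rpartition("/")
        if sep and tag == label:
            current.append(word)
        elif current:
            # A differently tagged token closes the entity being built.
            entities.append(" ".join(current))
            current = []
    if current:
        entities.append(" ".join(current))
    return entities


def mention_rate(tagged_text, label="PERSON"):
    """Entities per 1,000 tokens (an assumed normalization, not the study's)."""
    tokens = tagged_text.split()
    if not tokens:
        return 0.0
    return 1000 * len(person_entities(tagged_text, label)) / len(tokens)


# Hypothetical tagged sentence, not taken from the dataset.
sample = (
    "William/PERSON James/PERSON admired/O Charles/PERSON Peirce/PERSON "
    "in/O Cambridge/LOCATION ./O"
)
counts = Counter(person_entities(sample))
```

Running `person_entities` over the tagged text of each book and storing the resulting counts per title would give a table of the kind described above. Note that adjacent names with no intervening token would be merged into one entity under this simple grouping rule.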