The Wilford Woodruff Papers is a collection of documents from the life of Wilford Woodruff, the fourth president of the Mormon Church. The collection contains over 100,000 documents, including journals, letters, and other records.
The Challenge and Solution
The challenge was to create a tool that matches passages from the entire corpus of Mormon scripture against the Wilford Woodruff Papers.
My team delivered an interactive web application that cross-references the religious texts with published 19th-century journal entries. It uses a Bag-of-Words model to vectorize n-grams from both sources, then lists the textual matches it finds, sorted by confidence.
Packages Used:
Streamlit, Pandas, NLTK, Scikit-Learn, Numpy, Plotly… plus a few others
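The matching step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes scikit-learn's `CountVectorizer` for the Bag-of-Words n-gram counts and cosine similarity as the confidence score. The function name `match_passages`, the `min_score` threshold, and the sample strings are all hypothetical placeholders, not text from the collection.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def match_passages(verses, entries, ngram_range=(1, 3), min_score=0.1):
    """Score every (verse, entry) pair by cosine similarity of n-gram counts.

    Hypothetical helper: the n-gram range and threshold are assumptions.
    """
    vectorizer = CountVectorizer(ngram_range=ngram_range, lowercase=True)
    # Fit one shared vocabulary so both sources live in the same vector space.
    vectorizer.fit(list(verses) + list(entries))
    verse_vecs = vectorizer.transform(verses)
    entry_vecs = vectorizer.transform(entries)

    # scores[i, j] is the similarity between verse i and entry j.
    scores = cosine_similarity(verse_vecs, entry_vecs)

    rows = [
        {"verse": verses[i], "entry": entries[j], "score": float(scores[i, j])}
        for i in range(len(verses))
        for j in range(len(entries))
        if scores[i, j] >= min_score
    ]
    # Highest-confidence matches first.
    return (
        pd.DataFrame(rows, columns=["verse", "entry", "score"])
        .sort_values("score", ascending=False)
        .reset_index(drop=True)
    )


# Invented placeholder strings, used only to exercise the function.
verses = [
    "seek ye first the kingdom of God",
    "faith without works is dead",
]
entries = [
    "I preached that we must seek first the kingdom of God",
    "We traveled twelve miles and camped by the river",
]

matches = match_passages(verses, entries)
print(matches)
```

Comparing every verse against every entry is fine for a toy example, but a corpus of this size would likely need batching or a sparse top-k search. A table like `matches` is the kind of result a Streamlit app could display and filter by score.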
The Result