Steph Buongiorno, PhD, is a researcher and a teacher. She designs computational methods for knowledge production in the digital humanities. Her work reconfigures traditional academic boundaries and opens up new approaches to interpreting texts, data, and culture.

She enjoys poetry, all sorts of sensory things, and quiet underwater worlds. Read more at She Dives Caves.

Her work has been supported by the National Science Foundation (NSF), the National Endowment for the Humanities (NEH), the National Institute of Justice (NIJ), and the Department of Education (ED).


Text Mining for Historical Analysis


Text Mining for Historical Analysis offers a critical intervention into the evolving field of digital history. It introduces "computational historical thinking," a mode of thinking that explores the epistemological entanglements between computation, theory, and historical analysis and emphasizes how computational procedures actively shape the questions we ask and the meanings we derive from data. Through sustained engagement with historical corpora, such as the 19th-century Hansard debates and contemporary U.S. Congressional Records, this book demonstrates how to attend to both structure and semantics, reimagining the relationship between computation and historical knowledge in the digital age.

Democracy Viewer


Democracy Viewer is an open-source text mining application that enables analysts to explore and interpret humanities texts using techniques like word counts, TF-IDF, and word embeddings. It supports both distant and close reading. Analysts can upload their own datasets or work with curated collections available on the platform. Democracy Viewer also provides access to open government data, including U.S. Congressional records, making public texts more accessible for research and civic engagement.
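As a rough illustration of one technique named above, TF-IDF can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions, not Democracy Viewer's implementation: the `tf_idf` function, the whitespace tokenizer, and the three sample sentences are all invented for the example.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: score} dict per document.

    Assumptions for this sketch: lowercase whitespace tokenization,
    relative term frequency, and an unsmoothed log inverse document frequency.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores

# Invented sample sentences, not drawn from any real corpus.
docs = [
    "the house debated the corn laws",
    "the senate debated the tariff bill",
    "the committee reported the bill",
]
scores = tf_idf(docs)
```

A word that appears in every document, such as "the", scores zero, while a word confined to one document, such as "corn", scores highest within it. That contrast is what makes TF-IDF useful for surfacing the distinctive vocabulary of a text.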

Beyond the Black Box: Toward Transparent AI for Computational Text Analysis in the Digital Humanities


This article introduces Critical Generative Interpretation, a method that supports humanist inquiry, and close reading in particular, by making AI-generated insights traceable and grounded in textual evidence. By linking large language model (LLM) outputs to structured knowledge graphs derived from source texts, the method enables scholars to critically assess where generated interpretations come from and how they relate to the original material. Through a case study of Harold and the Purple Crayon, the article shows how this approach fosters interpretive engagement and makes AI a method for humanistic knowledge production.

Foundations and Applications of Humanities Analytics


Computational methods allow researchers to systematically analyze and interpret large volumes of social, political, and cultural data, uncovering underlying patterns and insights at scale. These course materials, made for the Santa Fe Institute, are designed to equip humanities researchers with computational and quantitative tools. The course aims to foster a supportive community, build practical skills, and diversify the field of humanities analytics by welcoming participants from various backgrounds and stages of their academic careers.

Database Escrituras Protocolos 1640 a 44 y 50 and 1730 a 1733


Transcripts of notarial records preserved from an endangered colonial archive that documents the selling and pawnship practices involving enslaved people in Havana, Cuba, during the 17th and 18th centuries.

Word Embeddings as a Key to the Study of Bias, Race, and Gender in Congress, 1880-2010


Word embeddings reveal how Congressional language around bias, race, and gender shifted from 1880 to 2010. From 1880 to 1970, “bias” was linked to personal emotion and partisanship; after 1975, it became associated with systemic issues like racism, sexism, and gerrymandering. Vector subtraction techniques show that early references to women emphasized suffrage and labor, while post-1970 discourse focused on reproduction and sexuality, with terms like “unwed,” “contraceptives,” and “clinics.” These changes reflect a broader shift toward identity-based and structural understandings of inequality in political speech.
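One common form of vector subtraction can be sketched as follows: subtract one word's vector from another's, then rank the rest of the vocabulary by cosine similarity to the difference. This is a minimal sketch and may not match the exact procedure used in the study. The function names are hypothetical, and the three-dimensional toy vectors are invented for illustration; real embeddings would be trained on the Congressional corpus for each period.

```python
import math

def subtract(a, b):
    """Element-wise difference of two equal-length vectors."""
    return [x - y for x, y in zip(a, b)]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def nearest_to_difference(embeddings, plus, minus, top_n=3):
    """Rank words by similarity to embeddings[plus] - embeddings[minus]."""
    diff = subtract(embeddings[plus], embeddings[minus])
    candidates = [w for w in embeddings if w not in (plus, minus)]
    ranked = sorted(
        candidates,
        key=lambda w: cosine(embeddings[w], diff),
        reverse=True,
    )
    return ranked[:top_n]

# Toy vectors invented for illustration only; they carry no empirical meaning.
toy_embeddings = {
    "women": [0.9, 0.8, 0.1],
    "men": [0.9, 0.1, 0.1],
    "suffrage": [0.2, 0.9, 0.0],
    "labor": [0.5, 0.6, 0.2],
    "tariff": [0.7, 0.0, 0.3],
}
result = nearest_to_difference(toy_embeddings, "women", "men", top_n=2)
```

Running the same query against embeddings trained on different periods, and comparing the ranked lists, is one way the kind of shift described above could be observed.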

PANGeA


PANGeA is a system that uses large language models (LLMs) to create narrative content for turn-based RPGs based on game designers' high-level criteria. It introduces a novel validation system for handling free-form text input during development and gameplay; the system employs "self-reflection" techniques that enable small/local LLMs to perform comparably to foundational models. PANGeA enriches player interactions with non-playable characters (NPCs) by generating personality-biased NPCs, and it improves AI accuracy through crowdsourcing mechanics. Its server includes a custom memory system that provides context for LLM generation, and the server's REST interface enables integration with any game engine.

GAME-KG


Knowledge graphs (KGs) can augment large language models (LLMs) while also providing an explainable set of facts that can be inspected by a human. Explainability is valuable for fields that may otherwise avoid LLMs due to hallucinations, such as human trafficking analysis. Creating KGs poses challenges, however. KGs parsed from documents may include explicit connections (those directly stated in a document) but miss implicit connections (those evident to a human, but not directly stated). This research introduces GAME-KG, an approach to modifying explicit and implicit KG connections by crowdsourcing feedback through video games.
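The general idea of revising a graph from crowd feedback can be sketched as follows. This is an illustrative sketch only, not the GAME-KG implementation: the `apply_votes` function, the majority rule, the `min_votes` threshold, and the toy triples are all assumptions made for the example.

```python
from collections import defaultdict

def apply_votes(explicit_edges, votes, min_votes=3):
    """Return the edge set after applying majority player feedback.

    explicit_edges: set of (subject, relation, object) triples parsed from text.
    votes: iterable of (triple, accepted) pairs; accepted is True when a
           player judges the connection to be correct.
    min_votes: hypothetical threshold below which feedback is ignored.
    """
    tally = defaultdict(lambda: [0, 0])  # triple -> [accept count, reject count]
    for triple, accepted in votes:
        tally[triple][0 if accepted else 1] += 1

    edges = set(explicit_edges)
    for triple, (accept, reject) in tally.items():
        if accept + reject < min_votes:
            continue  # too little feedback to change the graph
        if accept > reject:
            edges.add(triple)      # confirms an explicit edge or adds an implicit one
        elif reject > accept:
            edges.discard(triple)  # removes an edge that players judged wrong
    return edges

# Toy data invented for illustration.
explicit = {("Alice", "works_for", "Acme"), ("Bob", "owns", "Acme")}
implicit = ("Alice", "knows", "Bob")  # not stated in the text, but evident to a reader
votes = (
    [(implicit, True)] * 3
    + [(("Bob", "owns", "Acme"), False)] * 3
    + [(("Alice", "works_for", "Acme"), False)]  # a single vote, below the threshold
)
revised = apply_votes(explicit, votes)
```

In this toy run, the implicit connection is added, the rejected explicit edge is removed, and the edge with a single vote is left unchanged. Every edge in the revised set can still be listed and inspected by a human, which is the explainability property described above.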

Dark Shadows


Dark Shadows is a film noir-style detective thriller that serves as a test bed for proof-of-concept and prototype system components, frameworks, and models that contribute to research in AI and machine learning. The gameplay focuses on social scenarios where players provide natural language input to progress the narrative. Dark Shadows includes PANGeA’s novel validation system, which leverages self-reflection to draw on the intelligence of a large language model (LLM) when evaluating and responding to user input. Narrative and artwork are procedurally generated.