Steph Buongiorno, PhD
Southern Methodist University
Ford Hall 306E
3100 McFarlin Blvd, Dallas, TX 75205
sbuongiorno@smu.edu
Professional Employment
I co-direct a 10-person lab while designing intelligent agents and multi-agent systems that autonomously learn new subjects (by parsing plain-language text into knowledge graphs), retrieve information that encodes domain knowledge, and self-validate (using techniques based on self-reflection/self-prompting).
Publications
(peer reviewed = *)
Book Projects
*Steph Buongiorno [Corresponding Author] and Jo Guldi. Text Mining for Historical Analysis. Under Review. Cambridge University Press.
*Steph Buongiorno [Corresponding Author] and Jo Guldi. Text Mining for Historical Analysis. eBook Edition. Under Review. Cambridge University Press.
Articles
In my field, it is customary to list co-authors who do not contribute writing. As corresponding author, I performed all experimental research design, evaluation, and writing. I directed the student co-authors listed after my name in the conceptualization, design, and development of their respective artifacts (e.g., a video game mechanic, artwork, story narrative) while supporting their creative agency in our shared project. Co-authorship is therefore used here for transparency and to credit their contributions to parts of a larger project. When listed as second or later author, I made an intellectual contribution and produced writing.
*Steph Buongiorno [Corresponding Author], Jake Klinkert, Tanishq Chawla, Zixin Zhuang, and Corey Clark. "PANGeA: Procedural Artificial Narrative Using Generative AI for Turn-Based Video Games." Proceedings of the 2024 AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE), Lexington, KY, USA, 2024.
*Jake Klinkert, Steph Buongiorno [Second Author], Corey Clark. "Evaluating the Efficacy of LLMs to Emulate Realistic Human Personalities." Proceedings of the 2024 AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE), Lexington, KY, USA, 2024.
*Steph Buongiorno [Corresponding Author], and Corey Clark. "A Framework for Leveraging Human Computation Gaming to Enhance Knowledge Graphs for Accuracy-Critical Generative AI Applications." Proceedings of the 2024 IEEE Conference on Games (CoG), Milan, Italy, 2024.
Article Projects
*Steph Buongiorno [Corresponding Author], Alexander Cerpa, Jo Guldi. "Disambiguating Speakers in the Hansard 19th-Century British Parliamentary Debates." Under review at Journal of Cultural Analytics.
*Steph Buongiorno [Corresponding Author], Rob Kalescky, Jo Guldi. "The Hansard 19th-Century British Parliamentary Debates: Discovering Lost Records and the Creation of an Analysis Ready Data Set." Under review at Journal of Cultural Analytics.
Steph Buongiorno [Corresponding Author], Ananya Das Manolyl, Corey Clark. "Hierarchies of Thought: A Development Methodology for Explainable Multi-Agent Planning Systems Driven by Generative AI and Specialized Knowledge Graphs."
Steph Buongiorno [Corresponding Author], Aiyou Tan, Ryan Schaefer, Jo Guldi. "Democratizing Text-Based Data Analytics and Data Sharing Across the Humanities and Social Sciences." Target Journal: International Journal of Digital Humanities.
Steph Buongiorno. "North and South American Cave Diving Fatalities and Comorbid Factors (1970-2021)." Collected from the archives of the National Speleological Society accident reports and public news reports. Target Journal: International Journal of Aquatic Research and Education.
Video Games
Steph Buongiorno, Jake Klinkert, Zixin Zhuang, Tanishq Chawla, and Corey Clark. Dark Shadows, Southern Methodist University, Guildhall.
Counter the real-world problem of human trafficking while playing a video game.
I conceived of Dark Shadows as a film noir-style "document thriller" (inspired by Papers, Please and Night Call). Prior to the advent of generative AI, I designed and developed human-in-the-loop mechanics to gather player feedback and train a machine learning model capable of disambiguating speakers, locations, and events in real human trafficking data. Now, Dark Shadows pushes the boundaries of generative AI in video games through a novel validation system that enables dynamic, free-form interactions with the player that stay aligned with a procedural game narrative (see PANGeA). The validation system uses self-reflection to evaluate the player's text input and generate a response that follows the expected rules. To enrich players’ experiences, NPCs express traits from the Big Five personality model in their responses.
Future Work: Dark Shadows will include a social engineering scene in which the player must investigate the NPCs’ personalities and use rhetorical devices to gain clues to the mystery.
Instructional Materials
Steph Buongiorno. "Foundations and Applications of Humanities Analytics 2023." Santa Fe Institute. GitHub.
I was the lead instructor for SFI’s "Humanities Analytics" summer camp two years running. My code formed the basis for all instruction and activities.
Steph Buongiorno. "Foundations and Applications of Humanities Analytics 2022." Santa Fe Institute. GitHub.
Steph Buongiorno. "Digital History." Southern Methodist University. GitHub.
Digital Projects
Steph Buongiorno, Ryan Schaefer, Aiyou Tan, Wes Anderson, Chris Miller, and Matt Swigart. Democracy Viewer, Emory University (forthcoming Summer 2024).
I lead the development of a public-facing web app for exploring, text mining, and visualizing humanities and social sciences data sets in English, German, Spanish, and French. To be deployed on Amazon Web Services (AWS) in Summer 2024.
Steph Buongiorno. The Hansard Viewer. Southern Methodist University, 2022.
Description: A Shiny app for text mining and visualizing the 19th-century British parliamentary debates using data science metrics.
Steph Buongiorno. The Congress Viewer. Southern Methodist University, 2022.
Description: A Shiny app for text mining and visualizing the U.S. Congressional Record using data science metrics.
*Steph Buongiorno. usdoj. ROpenGov.
Description: An R package for creating a structured version of the U.S. Department of Justice press releases, blogs, and records.
*Steph Buongiorno. oldbailey. ROpenGov.
Description: An R package for creating a structured version of the Old Bailey criminal trials. Handles broken tags and messy data and returns an analysis-ready dataset.
*Steph Buongiorno. hansardr. GitHub.
Description: An R package for querying a clean version of the 19th-century Hansard Corpus.
*Steph Buongiorno and Omar Alexander Cerpa. hansard-speakers. GitHub.
Description: Code for disambiguating speakers in the 19th-century Hansard Corpus using Levenshtein distances and parallel computing.
Steph Buongiorno. noaa. GitHub.
Description: An R package for querying a clean version of NOAA climate and weather data.
Steph Buongiorno and Omar Alexander Cerpa. posextract. GitHub.
Description: A Python package for extracting grammatical subject-predicate triples from data. Tailored for the analysis of agency in text.
Steph Buongiorno. posextractr. GitHub.
Description: An R package for extracting grammatical subject-predicate triples from data. Tailored for the analysis of agency in text.
Steph Buongiorno and Jo Guldi. democracy-lab. GitHub.
Description: A code repository for text mining techniques for the Digital Humanities.
Steph Buongiorno and Ryan Schaefer. dhmeasures. GitHub.
Description: Optimized, "white-box" statistical functions for textual analysis.
Data Sets
Steph Buongiorno; Robert Kalescky; Omar Alexander Cerpa; Jo Guldi, 2022, "The Hansard 19th-Century British Parliamentary Debates with Improved Speaker Names: Parsed Debates, N-Gram Counts, Special Vocabulary, Collocates, and Topics", https://doi.org/10.7910/DVN/ZCYJH8, Harvard Dataverse.
Steph Buongiorno; Omar Alexander Cerpa; Jo Guldi, 2022, "The Hansard 19th-Century British Parliamentary Debates with Improved Speaker Names: Speaker Metadata", https://doi.org/10.7910/DVN/Z3LTVV, Harvard Dataverse, FORTHCOMING
Other Contributions
30 visualizations in Jo Guldi's The Dangerous Art of Text Mining.
Grant Proposals
"Integrating Human Computer Interaction, Machine Learning, Game Design, and Educational Assessment in a STEM+C Curriculum." Topic: To support the development of personalized, educational agents in Minecraft. We propose the integration of an advanced set of agent abilities across three design types: A) Personalized Educational Agents that autonomously monitor student progression and dynamically generate customized curriculum for addressing individuals' needs based on their existing knowledge, personalities, and interests; B) Bridging Agents that demonstrate the "bigger picture" behind computational thinking by creating connections between STEM topics and interdisciplinary topics, such as the Language Arts; and C) Analyst Agents that translate student progression and learning outcomes to teachers for their easy assessment of student progress.
"Developing Autonomous Agents to Improve Information Flow on Human Trafficking." Topic: To make the response to human trafficking more resilient, we propose a new, complex data ecosystem of open knowledge networks along with information retrieval technology that improves data accessibility. Our research will focus on the state of Texas and confront the problems surrounding the development and use of a distributed, open knowledge network that holds sensitive information that cannot be shared directly. Our aim is to improve information flow between stakeholders (e.g., law enforcement, DHS, travel intermediaries) through a scalable architecture while enabling users to both ingest and retrieve meaningful knowledge artifacts. Key to our research is a novel agent layer that advances the state of the art in privacy-preserving data sharing and analytics (PPDSA) technology by serving a dual purpose: a) ensuring privacy protection and compliance with data sharing regulations; and b) maintaining integrity by preventing unauthorized data injections and data leaks. In addition, agents will be able to interface directly with the network, making private and secure connections between entities.
"Gen-AI Distributed Autonomous Agents and Deployment Infrastructure to Accelerate Research, Discovery and Development." Topic: To support completing the design and development of intelligent agents and hierarchical, multi-agent systems for deployment in a distributed environment. To be applied to research problems with collaborators in psychology, chemistry, education, and economics.