Steph Buongiorno, PhD
Southern Methodist University
Ford Hall 306E
3100 McFarlin Blvd, Dallas, TX 75205
sbuongiorno@smu.edu
Professional Employment
I co-direct a 10-person lab while designing intelligent agents and multi-agent systems that autonomously learn new subjects (by parsing plain-language text into knowledge graphs), retrieve information that encodes domain knowledge, and self-validate (using techniques based on self-reflection/self-prompting).
Publications
(peer reviewed = *)
Book Projects
*Steph Buongiorno [Corresponding Author] and Jo Guldi. Text Mining for Historical Analysis. Under Contract. Cambridge University Press.
*Steph Buongiorno [Corresponding Author] and Jo Guldi. Text Mining for Historical Analysis. eBook Edition. Under Contract. Cambridge University Press.
Articles
In my field, it is customary to list co-authors who did not contribute writing. As corresponding author, I performed all experimental research design, evaluation, and writing. I directed the student co-authors listed after my name in the conceptualization, design, and development of their respective artifacts (e.g., a video game mechanic, artwork, story narrative) while supporting their creative agency in our shared project. Co-authorship here therefore provides transparency and credits their contributions to parts of a larger project. When listed as second or later author, I made an intellectual contribution and produced writing.
*Steph Buongiorno [Corresponding Author], Jake Klinkert, Tanishq Chawla, Zixin Zhuang, and Corey Clark. "PANGeA: Procedural Artificial Narrative Using Generative AI for Turn-Based Video Games." Proceedings of the 2024 AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE), Lexington, KY, USA, 2024.
*Jake Klinkert, Steph Buongiorno [Second Author], Corey Clark. "Evaluating the Efficacy of LLMs to Emulate Realistic Human Personalities." Proceedings of the 2024 AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE), Lexington, KY, USA, 2024.
*Steph Buongiorno [Corresponding Author] and Corey Clark. "A Framework for Leveraging Human Computation Gaming to Enhance Knowledge Graphs for Accuracy-Critical Generative AI Applications." Proceedings of the 2024 IEEE Conference on Games (CoG), Milan, Italy, 2024.
Articles Under Review
*Steph Buongiorno [Corresponding Author], Alexander Cerpa, Jo Guldi. “Hansard 2.0: Discovering Lost Records and the Creation of an Analysis Ready Data Set.” Revise and resubmit at Journal of Cultural Analytics.
Article Projects
Steph Buongiorno [Corresponding Author] and Corey Clark. "Critical Thinking Agents for Solving the ARC AGI Puzzle." Target publication venues: AAAI, IEEE, or ACM.
Steph Buongiorno [Corresponding Author] and Ryan Schaefer. "Democratizing Text-Based Data Analytics and Data Sharing Across the Humanities and Social Sciences." Target publication venues: AAAI, IEEE, or ACM.
Steph Buongiorno [Corresponding Author]. "North and South American Cave Diving Fatalities and Comorbid Factors (1970-2021)." Collected from the archives of the National Speleological Society accident reports and public news reports. Target Journal: International Journal of Aquatic Research and Education.
Video Games
Steph Buongiorno, Jake Klinkert, Zixin Zhuang, Tanishq Chawla, and Corey Clark. Dark Shadows, Southern Methodist University, Guildhall.
Counter the real-world problem of human trafficking while playing a video game.
I conceived of Dark Shadows as a film noir-style "document thriller" (inspired by Papers, Please and Night Call). Prior to the advent of generative AI, I designed and developed human-in-the-loop mechanics to gather player feedback and train a machine learning model capable of disambiguating speakers, locations, and events in real human trafficking data. Now, Dark Shadows pushes the boundaries of generative AI in video games through a novel validation system that enables dynamic, free-form interactions with the player that stay aligned with a procedural game narrative (see PANGeA). The validation system uses self-reflection to evaluate the text input and generate a response in alignment with expected rules. To enrich players’ experiences, NPCs express traits from the Big 5 Personality Model in their responses.
Future Work: Dark Shadows will include a social engineering scene, where the player must investigate the NPCs’ personalities and use rhetorical devices to gain clues on the mystery.
Instructional Materials
Steph Buongiorno. "Foundations and Applications of Humanities Analytics 2023." Santa Fe Institute. GitHub.
I was the lead instructor for the SFI’s "Humanities Analytics" summer camp two years running. My code formed the basis for all instruction and activity.
Steph Buongiorno. "Foundations and Applications of Humanities Analytics 2022." Santa Fe Institute. GitHub.
Steph Buongiorno. "Digital History." Southern Methodist University. GitHub.
Web Apps
Steph Buongiorno, Ryan Schaefer, Wes Anderson, Chris Miller, and Matt Swigart. Democracy Viewer, Emory University.
I lead students in developing a public-facing web app for exploring, text mining, and visualizing humanities and social sciences data sets in English, German, Spanish, and French. To be deployed on Amazon Web Services (AWS) in Summer 2024.
Steph Buongiorno. The Hansard Viewer. Southern Methodist University, 2022.
Description: A Shiny app for text mining and visualizing the 19th-century British parliamentary debates using data science metrics.
Steph Buongiorno. The Congress Viewer. Southern Methodist University, 2022.
Description: A Shiny app for text mining and visualizing the U.S. Congressional Record using data science metrics.
Software Packages
*Steph Buongiorno. usdoj. ROpenGov.
Description: An R package for creating a structured version of the U.S. Department of Justice press releases, blogs, and records.
*Steph Buongiorno. oldbailey. ROpenGov.
Description: An R package for creating a structured version of the Old Bailey criminal trials. Handles broken tags and messy data and returns an analysis-ready dataset.
*Steph Buongiorno. hansardr. GitHub.
Description: An R package for querying a clean version of the 19th-century Hansard Corpus.
*Steph Buongiorno and Omar Alexander Cerpa. hansard-speakers. GitHub.
Description: Code for disambiguating speakers in the 19th-century Hansard Corpus using Levenshtein distances and parallel computing.
Steph Buongiorno. noaa. GitHub.
Description: An R package for querying a clean version of NOAA climate and weather data.
Steph Buongiorno and Omar Alexander Cerpa. posextract. GitHub.
Description: A Python package for extracting grammatical subject-predicate triples from data. Tailored for the analysis of agency in text.
Steph Buongiorno. posextractr. GitHub.
Description: An R package for extracting grammatical subject-predicate triples from data. Tailored for the analysis of agency in text.
Steph Buongiorno and Jo Guldi. democracy-lab. GitHub.
Description: A code repository for text mining techniques for the Digital Humanities.
Steph Buongiorno and Ryan Schaefer. dhmeasures. GitHub.
Description: Optimized, "white-box" statistical functions for textual analysis.
Datasets
Steph Buongiorno; Robert Kalescky; Omar Alexander Cerpa; Jo Guldi, 2022, "The Hansard 19th-Century British Parliamentary Debates with Improved Speaker Names: Parsed Debates, N-Gram Counts, Special Vocabulary, Collocates, and Topics", https://doi.org/10.7910/DVN/ZCYJH8, Harvard Dataverse.
Steph Buongiorno; Omar Alexander Cerpa; Jo Guldi, 2022, "The Hansard 19th-Century British Parliamentary Debates with Improved Speaker Names: Speaker Metadata", https://doi.org/10.7910/DVN/Z3LTVV, Harvard Dataverse. Forthcoming.
Other Contributions
30 visualizations in Jo Guldi's The Dangerous Art of Text Mining.
Grant Proposals
"Inclusive Game-Based Learning with AI Agents: Personalizing Computational Thinking Instruction for Students with Reading Disabilities." Topic: To support the development of personalized, educational agents in Minecraft. We propose the integration of an advanced set of agent abilities across three design types: A) Personalized Educational Agents that autonomously monitor student progression and dynamically generate customized curricula that address individuals' needs based on their existing knowledge, personalities, and interests; B) Bridging Agents that demonstrate the "bigger picture" behind computational thinking by creating connections between STEM topics and interdisciplinary topics, such as the Language Arts; and C) Analyst Agents that report student progression and learning outcomes to teachers so they can easily assess student progress.
"Developing Autonomous Agents to Improve Information Flow on Human Trafficking." Topic: To make the response to human trafficking more resilient, we propose a new, complex data ecosystem of open knowledge networks along with information retrieval technology that improves data accessibility. Our research will focus on the state of Texas and confront the problems surrounding the development and use of a distributed, open knowledge network that holds sensitive information that cannot be shared directly. Our aim is to improve information flow between stakeholders (e.g., law enforcement, DHS, travel intermediaries) through a scalable architecture while enabling users to both ingest and retrieve meaningful knowledge artifacts. Key to our research is a novel agent layer that advances the state of the art in privacy-preserving data sharing and analytics (PPDSA) technology by serving the dual purpose of: a. ensuring privacy protection and compliance with data sharing regulations; and b. maintaining integrity by preventing unauthorized data injections and data leaks. In addition, agents will be able to interface directly with the network, making private and secure connections between entities.
"Gen-AI Distributed Autonomous Agents and Deployment Infrastructure to Accelerate Research, Discovery and Development." Topic: To support completing the design and development of intelligent agents and hierarchical, multi-agent systems for deployment in a distributed environment. To be applied to research problems with collaborators in psychology, chemistry, education, and economics.