IEEE VIS 2024 Content: Steering LLM Summarization with Visual Workspaces for Sensemaking


Xuxin Tang - Department of Computer Science, Virginia Tech, Blacksburg, United States

Eric Krokos - Department of Defense, Laurel, United States

Kirsten Whitley - Department of Defense, College Park, United States

Can Liu - City University of Hong Kong, Hong Kong, China

Naren Ramakrishnan - Virginia Tech, Blacksburg, United States

Chris North - Virginia Tech, Blacksburg, United States

Room: Bayshore II

2024-10-14T16:00:00Z
Exemplar figure, described by caption below
We created an intermediate workspace based on the ground truth of an intelligence analysis dataset to better understand the enhancements in LLM summarization achieved by integrating the workspace. We then conducted proof-of-concept experiments to assess how the workspace and each type of information impact LLM summarization. The experiment pipeline and simulated workspace are shown in the image.
Abstract

Large Language Models (LLMs) have been widely applied to summarization due to their fast, high-quality text generation. Summarization for sensemaking involves information compression and insight extraction. Human guidance in sensemaking tasks can prioritize and cluster relevant information for LLMs. However, users must translate their thinking into natural language to communicate with LLMs. Can we instead use more readable and operable visual representations to guide the summarization process for sensemaking? We propose introducing an intermediate step, a schematic visual workspace for human sensemaking, before LLM generation to steer and refine the summarization process. We conduct a series of proof-of-concept experiments to investigate the potential for enhancing GPT-4 summarization through visual workspaces. Leveraging a textual sensemaking dataset with a ground truth summary, we evaluate the impact of a human-generated visual workspace on LLM-generated summarization of the dataset and assess the effectiveness of space-steered summarization. We categorize several types of extractable information from typical human workspaces that can be injected into engineered prompts to steer the LLM summarization. The results demonstrate how such workspaces can help align an LLM with the ground truth, leading to more accurate summarization results than without the workspaces.
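To make the idea of "injecting workspace information into engineered prompts" concrete, here is a minimal, hypothetical sketch of how workspace-derived structure (e.g., analyst-defined clusters and priorities) might be serialized into a steering prompt. The function name, the workspace schema, and the prompt wording are illustrative assumptions, not the authors' actual implementation.

```python
def build_steered_prompt(documents, workspace):
    """Serialize workspace structure (clusters, priorities) into a
    summarization prompt that steers the LLM toward the analyst's
    organization of the evidence. All field names are hypothetical."""
    lines = ["Summarize the documents below for an intelligence analysis task."]

    # Inject analyst-defined clusters, if present in the workspace.
    clusters = workspace.get("clusters", {})
    if clusters:
        lines.append("The analyst grouped the documents into these clusters:")
        for label, doc_ids in clusters.items():
            lines.append(f"- {label}: documents {', '.join(doc_ids)}")

    # Inject document priorities, if present.
    priorities = workspace.get("priorities", [])
    if priorities:
        lines.append("Give the most weight to: " + ", ".join(priorities))

    # Append the raw documents last.
    lines.append("Documents:")
    for doc_id, text in documents.items():
        lines.append(f"[{doc_id}] {text}")
    return "\n".join(lines)


# Example usage with toy data:
workspace = {
    "clusters": {
        "Suspicious travel": ["d1", "d3"],
        "Financial activity": ["d2"],
    },
    "priorities": ["d1"],
}
documents = {"d1": "Report on flight bookings.",
             "d2": "Bank transfer records.",
             "d3": "Border crossing log."}
prompt = build_steered_prompt(documents, workspace)
```

The resulting `prompt` string would then be sent to an LLM such as GPT-4; the key point is that the spatial organization of the workspace is flattened into explicit textual guidance rather than requiring the user to phrase it in natural language themselves.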