There's an old saying: “Sticks and stones may break my bones, but words will never hurt me.”
Tell that to Eugenia Rho, assistant professor in the Department of Computer Science, and she'll show you plenty of data that proves otherwise.
Her Society + AI & Language Lab has shown that police language can predict how traffic stops with Black drivers escalate, and that broadcast media bias and social media echo chambers are putting American democracy at risk.
Now Rho's research team in the College of Engineering has turned its attention to another question: how social media rhetoric has affected COVID-19 infection and death rates across the United States, and what policymakers and public health officials can learn from it.
“Many studies only describe what happens online and, in many cases, never demonstrate a direct relationship with offline behavior. But there are concrete ways to connect online behavior with offline decisions.”
Eugenia Rho, Assistant Professor, Department of Computer Science, Virginia Tech
Cause and effect
During the COVID-19 pandemic, social media has become a gathering place for people who oppose public health guidance such as mask-wearing, social distancing and vaccines. The spread of misinformation has fueled widespread disregard for preventive measures, causing soaring infection rates, strained hospitals, health worker shortages, preventable deaths, and economic losses.
According to a 2022 study published in the Yale Journal of Biology and Medicine, there were 692,000 preventable hospitalizations among unvaccinated patients in November and December 2021 alone. These hospitalizations cost a staggering $13.8 billion.
In this study, Rho's team, including Ph.D. student Xiaohan Ding, developed a technique to train the chatbot GPT-4, through prompting, to analyze posts from a discussion group on a banned subreddit that opposed COVID-19 prevention measures. Rho said the research team focused on Reddit because the data was available; many other social media platforms prohibit outside researchers from using their data.
Rho's research builds on a social science framework called fuzzy-trace theory, pioneered by Valerie Reyna, a professor of psychology at Cornell University and a co-investigator on the Virginia Tech project. Reyna has shown that people learn and recall information better when it is expressed as a causal relationship rather than as rote facts, even when the information is inaccurate or the implied relationship is weak. Reyna calls this bottom-line, cause-and-effect representation the “gist.”
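To make the “gist” idea concrete, here is a toy, rule-based sketch (my illustration, not the study's method): a causal gist pairs a cause clause with an effect clause, which a crude detector can approximate by looking for causal connectives.

```python
import re

# Toy list of causal connectives; a crude stand-in for the study's
# LLM-based gist extraction, used here only to illustrate the concept.
CAUSAL_CONNECTIVES = ["because", "since", "ever since", "therefore", "led to"]

def has_causal_gist(post: str) -> bool:
    """Heuristically flag a post as containing a cause-effect 'gist'."""
    text = post.lower()
    return any(re.search(rf"\b{re.escape(c)}\b", text) for c in CAUSAL_CONNECTIVES)

posts = [
    "I got the vaccine and ever since I've felt awful.",  # causal framing
    "Nice weather today.",                                # no causal claim
]
flags = [has_causal_gist(p) for p in posts]  # → [True, False]
```

A real gist extractor must handle implicit causation (no connective at all), which is exactly why the researchers turned to a large language model instead of rules like these.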
The researchers worked to answer four basic questions about the nuts and bolts of social media: How can gists be efficiently predicted across an entire national-scale social media discourse? What gists characterize how and why people oppose COVID-19 public health practices? How do these gists evolve over time through key events among users in banned subreddits opposing COVID-19 health practices? And do gist patterns significantly predict trends in national health outcomes?
Missing link
Rho's team used prompting techniques with large language models (LLMs), a type of artificial intelligence (AI) program, along with advanced statistics, to detect and track these gists across banned subreddit groups. The output was then compared against COVID-19 milestones, including infection rates, hospitalizations, deaths, and related public policy announcements.
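The article does not reproduce the team's prompts, so the following is a hypothetical sketch of what a prompt-based gist-labeling request to a chat model might look like; the prompt wording, model settings, and labels here are my assumptions, and the real ones are in the CHI '24 paper. No API call is made.

```python
# Hypothetical prompt template in the spirit of prompt-based gist extraction.
GIST_PROMPT = (
    "You will read a social media post from a community opposed to COVID-19 "
    "health measures. Identify the causal 'gist': state the CAUSE and EFFECT "
    "the author implies, or answer NONE if there is no causal claim.\n\n"
    "Post: {post}\nGist:"
)

def build_gist_request(post: str, model: str = "gpt-4") -> dict:
    """Assemble a chat-completion-style request payload (not sent anywhere here)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": GIST_PROMPT.format(post=post)}],
        "temperature": 0,  # deterministic labeling across thousands of posts
    }

req = build_gist_request("I got the vaccine and I've felt like death ever since.")
```

Running such a template over every post in a subreddit, then counting the non-NONE answers per day, yields the kind of gist time series the team tracked against health milestones.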
The results showed that social media posts linking a cause, such as “I got the COVID-19 vaccine,” with an effect, such as “I've felt like death ever since,” quickly shaped people's beliefs and influenced their offline health decisions. In fact, the volume of gists in banned subreddit groups largely predicted the daily totals of new COVID-19 cases in the United States.
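The shape of that finding can be sketched with a toy time series (the numbers below are made up for illustration, not the study's data): if daily gist volume tracks daily new cases, the two series correlate strongly.

```python
# Synthetic illustration only: correlate daily gist volume with daily new
# cases. The study's actual modeling is far more sophisticated than this.

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

daily_gists = [12, 30, 45, 80, 60, 95]              # made-up gist counts per day
daily_cases = [1000, 2400, 3900, 7100, 5200, 8800]  # made-up new cases per day
r = pearson_r(daily_gists, daily_cases)  # strongly positive for this toy data
```

Correlation alone would not establish the predictive claim, of course; the paper's contribution is showing that gist patterns carry real predictive signal at national scale.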
This is the first AI study to empirically link social media language patterns with real-world public health trends. By identifying important patterns in online discussion and connecting them to outcomes at scale, it highlights the potential of large language models to inform more effective public health communication strategies.
“This research tackles the daunting question of how to connect the cognitive representations of meaning that people actually use to the flow of information and health outcomes on social media,” Reyna said. “This prompt-based LLM framework for identifying gists at scale has many potential applications that can promote better health and well-being.”
Big data, big impact
Rho said she hopes this study will encourage other researchers to use these techniques to address important questions. To that end, the code used in the project will be made freely available when the paper is published in the Proceedings of the Association for Computing Machinery Conference on Human Factors in Computing Systems. The paper also compares the costs of different methods, helping researchers analyze big data sets and draw meaningful conclusions at lower cost. The research team plans to present its findings in Honolulu, Hawaii, May 11-16.
Beyond academia, Rho said she hopes the work will encourage social media platforms and other stakeholders to find alternatives to removing or banning groups that discuss controversial topics.
“Simply banning people outright from online communities, especially spaces where they already exchange and learn health information, risks driving them deeper into conspiracy theories and onto platforms that don't moderate content at all,” Rho said. “I hope this research shows how social media companies can work with public health officials and organizations to better understand what's going on in the public's mind during a public health crisis.”
Reference:
Ding, X., et al. (2024). “Leveraging Prompt-Based Large Language Models: Predicting Pandemic Health Decisions and Outcomes Through Social Media Language.” CHI '24: Proceedings of the CHI Conference on Human Factors in Computing Systems. doi.org/10.1145/3613904.3642117