People around the world compose and send more than 500 million Tweets every day, and in the middle of a global pandemic, there’s a lot to be learned from those geotags and hashtags. The massive amounts of data created by social media sites like Twitter are helping researchers gain powerful insights into the COVID-19 outbreak. One of the researchers at the forefront of this work is Georgia State computer scientist Juan M. Banda, who specializes in machine learning and natural language processing.
“Each time we pick up our smartphones and share a Tweet or update our status, we’re providing information about ourselves in the form of data,” says Banda. “All those bits and bytes provide a snapshot of our lives. On a macro scale they can provide insights into society and populations on a number of fronts.”
Banda, assistant professor in the Department of Computer Science, began a project in March to collect and analyze Twitter data related to COVID-19. To date, his lab has compiled more than 700 million Tweets, which have yielded insights on the spread of misinformation and how human mobility has driven the pandemic’s next move. He has also studied the reported symptoms of so-called long-haulers—patients who continue to suffer from long-term health problems as a result of SARS-CoV-2 infection.
The team has made the dataset publicly available as a resource for the global research community, and it has already been downloaded more than 30,000 times. The data have also been used in several multinational studies.
Banda says his passion for big data began during graduate school at Montana State University, where he worked on the huge volume of data being generated by NASA’s Solar Dynamics Observatory mission. That led to work in astroinformatics and a Ph.D. in computer science. It