Big Data, Social Technology and Politics

Rapidly organizing large-scale political networks are challenging both our theories of collective action and our methods for analyzing “big data” sets that may contain tens or hundreds of millions of digital records such as tweets. We have partnered with the UW Information School Social Media Lab (SoMe Lab) to study these large data sets. Our aim is to ask social science questions about their organization and political effects while developing the methods and tools to answer those questions.

This work will be enabled through a grant from the National Science Foundation INSPIRE program, which is an interdisciplinary collaboration among different NSF programs: Socio-computational Systems, Human Centered Computing, Information Technology Research, and Political Science. The grant is titled “Tools, Models, and Innovation Platforms for Research on Social Media.”

The abstract from the grant describes the project as follows:

This project seeks 1) to develop an open-source toolkit of affordable methods and approaches that enables researchers in the information and social sciences to describe, analyze, and visualize how information flows within and between social media platforms and across geographic locations; and 2) to apply these methods and tools to create and analyze a dataset of tweets, posts, related messages and sites. The initial focus is on Twitter data associated with the social movement that began as “Occupy Wall Street” but broadens to include other venues and even other topics. This research will both develop tools for social media research and demonstrate how these tools improve our capabilities for research in this social media space.

Social media platforms are transforming how we work, live together, and govern ourselves. The many modalities offered by social media platforms such as Twitter, YouTube, and Facebook create a complex infrastructure, resulting in a dynamic ecosystem of information flows within and across platforms and among individuals and groups. The resulting socio-technical system enables both rapid and emergent organizational and relational transformations, but its scope and dynamic structure present conceptual and practical challenges for researchers studying the mechanisms of the transformations. The combination of volume, scope, complexity, and ephemeral nature of message flows requires an interdisciplinary approach to developing research methods and tools for data curation, processing, storage, and analysis that go beyond the typical relational database management approaches.

The intellectual merit of the research is two-fold. First, the research will combine knowledge of social science, political science, communications, geography, information science, and emerging computing techniques to create a scalable and affordable approach to the selective collection and analysis of data from the huge flows of tweets, “likes,” and links associated with the Occupy movement. The approach will be documented and made available as an open-source toolkit for other social media researchers, thus enabling a wider scope of inquiry. Second, the research will contribute to knowledge about grass-roots movements, using the Occupy movement as a currently active example that can serve as a test case, by examining the relationships of tweets, geography, and the possible linkages to other groups. The emerging information ecosystem of networks and linkages enabled by social media supplements traditional media such as newspapers and television, and messages may “go viral,” crossing community and geographic boundaries with unprecedented speed and quickly reaching remote and previously unconnected audiences. This work will provide new insights into the mechanisms by which these viral events occur and will help us understand the impact viral events can have on public participation in political discourse.

The broader impact of the research is both immediate and cumulative. The toolkit for data curation, analysis, and visualization has the potential to transform the research questions social and information scientists can investigate using large datasets resulting from social media. These tools can enable us to envision different frameworks within which to understand how new social media communities form and engage in the broader public discourse. In this way, the research can transform how we think about and study the role of grass-roots endeavors in community formation and change. The techniques developed through this research will lower the technical barriers and costs encountered by other researchers and observers who conduct their own research on such datasets. The research engages a new generation of researchers at both the doctoral and undergraduate levels. Graduate students working on the project will be developing the approach and methods and will be gaining experience in managing undergraduate students who work on the team. The experience of working on a truly transformative effort can inspire them and give them confidence to try risky and emergent efforts in the future, and they can be leaders in what may become an interdisciplinary area of study of large social media datasets.