Monday, April 1, 2019

2019-04-01: Creating a data set for 116th Congress Twitter handles

Senators from Alabama in the 115th Congress

Any researcher studying Twitter and the US Congress might think, "How hard could it be to create a data set of Twitter handles for the members of Congress?" At any given time, we know how many members the US Congress has and who they are, so compiling their Twitter handles seems like an easy task. It turns out to be a lot more challenging than expected. We present the challenges involved in creating a data set of Twitter handles for the members of the 116th US Congress and provide such a data set.

Brief about the US Congress


The US Congress is a bicameral legislature comprising the Senate and the House of Representatives. The Congress consists of:

  • 100 senators, two from each of the fifty states.
  • 435 representatives, with seats distributed by population across the fifty states.
  • 6 non-voting members from the District of Columbia and US territories which include American Samoa, Guam, Northern Mariana Islands, Puerto Rico, and US Virgin Islands.
Every US Congress is consecutively numbered and has a term of two years. The current US Congress is the 116th Congress which began on 2019-01-03 and will end on 2021-01-03.       
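The numbering and seat counts above reduce to simple arithmetic. The following is a minimal sketch, not part of our tooling: the function and constant names are illustrative, and the only outside fact it relies on is that the 1st Congress convened in 1789.

```python
def congress_start_year(number: int) -> int:
    """Return the year in which the given Congress convened.

    Congresses are numbered consecutively and last two years,
    starting from the 1st Congress in 1789.
    """
    if number < 1:
        raise ValueError("Congress numbers start at 1")
    return 1789 + 2 * (number - 1)


# Seat counts as listed above.
SEATS = {"senators": 100, "representatives": 435, "non_voting": 6}
TOTAL_SEATS = sum(SEATS.values())
```

For the 116th Congress this gives 2019, and the three groups add up to 541 seats.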

Previous Work on Congressional Twitter


Since the inception of social media, members of Congress have aggressively used it as a medium of communication with the rest of the world. Previous researchers have built their data sets of Congressional Twitter handles by starting from existing lists and manually adding to them.

Jennifer Golbeck et al., in their papers "Twitter Use by the US Congress" (2010) and "Congressional twitter use revisited on the platform's 10-year anniversary" (2018), used Tweet Congress to build their data set of Twitter handles for the members of Congress. An important highlight from their 2018 paper is that every member of Congress has a Twitter account. Libby Hemphill, in "What's congress doing on twitter?", describes the manual compilation of 380 Twitter handles for the US Congress, which were used for collecting tweets in the winter of 2012. Theresa Loraine Cardenas, in "The Tweet Delete of Congress: Congress and Deleted Posts on Twitter" (2013), used Politwoops to create the list of Twitter handles for members of Congress. Jihui Lee et al., in their paper "Detecting Changes in Congressional Twitter Networks over Time", used the community-maintained GitHub repository from @unitedstates to collect Twitter data for 369 of the 435 representatives of the 114th US Congress. Libby Hemphill and Matthew A. Shapiro, in their paper "Appealing to the Base or to the Moveable Middle? Incumbents’ Partisan Messaging Before the 2016 U.S. Congressional Elections" (2018), also used the community-maintained GitHub repository from @unitedstates.
Screenshot from Tweet Congress

Twitter Handles of the 116th Congress 


January 3, 2019 marked the beginning of the 116th United States Congress, with 99 freshman members. Two months have passed since the new Congress was sworn in. Let us review Tweet Congress and the @unitedstates GitHub repository to check how up to date these sources are with the Twitter handles of the current members of Congress. We also include the CSPAN Twitter lists for the members of Congress in our analysis.

Tweet Congress 

Tweet Congress is an initiative from the Sunlight Foundation with help from Twitter to create a transparent environment which allows easy conversation between lawmakers and voters in real time. It was launched in 2011. It lists all the members of Congress and their contact information. The service also provides visualizations and analytics for Congressional accounts.     

@unitedstates (GitHub Repository)

It is a community-maintained GitHub repository with a list of members of the United States Congress from 1789 to the present, congressional committees from 1973 to the present, current committee memberships, and information about all the presidents and vice presidents of the United States. The data is available in YAML, JSON, and CSV formats.

CSPAN (Twitter List)

CSPAN maintains Twitter lists for the 116th US Representatives and US Senators. The Representatives list has 482 Twitter accounts while the Senators list has 114 Twitter accounts. 

Combining Lists  


We used the Wikipedia page on the 116th Congress as our gold-standard data for the current members of Congress. The data from Wikipedia was collected on 2019-03-01. Correspondingly, the data from CSPAN, @unitedstates (GitHub Repository), and Tweet Congress was also collected on 2019-03-01. We then manually compiled a CSV file listing the members of Congress and whether their Twitter handles were present in each source. We compiled the list manually largely because the names of the members of Congress differ across the sources under consideration.
  • Some of the members of Congress use diacritic characters. For example, Wikipedia and Tweet Congress have the name of a representative from New York as Nydia Velázquez, while Twitter and the @unitedstates repository have her name as Nydia Velazquez.
Screenshot from Wikipedia showing Nydia Velazquez, representative from New York using diacritic characters

Screenshot from Twitter for Rep. Nydia Velazquez from New York without diacritic characters
  • Some of the members of Congress have abbreviated middle names or suffixes in their names. For example, Wikipedia has the name of a representative from Tennessee as Mark E. Green while Tweet Congress has his name as Mark Green.
Screenshot from Wikipedia for Rep. Mark Green from Tennessee with his middle name


Screenshot from Twitter for Rep. Mark Green from Tennessee without his middle name
Screenshot from Tweet Congress for Rep. Mark Green from Tennessee without his middle name
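  • Some of the members of Congress go by nicknames. For example, Wikipedia and Twitter list a representative from Tennessee as Chuck Fleischmann, while Tweet Congress lists him under his given name.
Screenshot from Wikipedia for Rep. Chuck Fleischmann from Tennessee using his nickname
Screenshot from Twitter for Rep. Chuck Fleischmann from Tennessee using his nickname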
Screenshot from Tweet Congress for Rep. Chuck Fleischmann from Tennessee using his given name
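Part of the name matching described above could be automated. The sketch below is not what we ran (our compilation was manual); it is one possible way to reduce names to a comparable key by stripping diacritics, single-letter initials, and common suffixes. The function name and the suffix list are assumptions made for illustration, and nicknames would still need a hand-built alias table.

```python
import re
import unicodedata

# Suffixes to drop; an illustrative, incomplete list.
SUFFIXES = {"jr", "sr", "ii", "iii", "iv"}


def normalize_name(name: str) -> str:
    """Reduce a member's name to a lowercase key for cross-source matching."""
    # Treat underscores (common in page titles and URLs) as spaces.
    name = name.replace("_", " ")
    # Decompose accented characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_name = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Keep alphabetic tokens only; this also removes periods and commas.
    tokens = re.findall(r"[a-z]+", ascii_name.lower())
    # Drop single-letter initials and generational suffixes.
    tokens = [t for t in tokens if len(t) > 1 and t not in SUFFIXES]
    return " ".join(tokens)
```

With this key, "Nydia Velázquez" and "Nydia Velazquez" compare equal, as do "Mark E. Green" and "Mark Green". Dropping one-letter tokens is a blunt rule, so names containing apostrophes or genuine one-letter parts would need special handling.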

What did we learn from our analysis?


As of 2019-03-01, the US Congress had 538 members out of 541 seats, with three representative positions vacant: the third and ninth Congressional Districts of North Carolina and the twelfth Congressional District of Pennsylvania. Of the 538 members of Congress, 537 have Twitter accounts; the non-voting member from Guam, Michael San Nicolas, has none.


Name             Position  Joined Congress  CSPAN  @unitedstates  TweetCongress  Remark
Collin Peterson  Rep.      1991-01-03       F      F              F              @collinpeterson
Greg Gianforte   Rep.      2017-06-21       F      F              F              @GregForMontana
Gregorio Sablan  Del.      2019-01-03       F      T              T
Rick Scott       Sen.      2019-01-08       T      !T             F
Tim Kaine        Sen.      2013-01-03       T      !T             F
James Comer      Rep.      2016-11-08       T      !T             F
Justin Amash     Rep.      2011-01-03       T      !T             F
Lacy Clay        Rep.      2001-01-03       T      !T             F
Bill Cassidy     Sen.      2015-01-03       T      !T             T
Members of the 116th Congress whose Twitter handles are missing from either one or all of the sources. T represents both name and Twitter handle present, !T represents name present but Twitter handle missing, and F represents both the name and Twitter handle missing.
  • CSPAN has Twitter handles for 534 of the 537 members of Congress, with two representatives and a non-voting member missing from its list. The absentees from the list are Rep. Collin Peterson (@collinpeterson), Rep. Greg Gianforte (@GregForMontana), and Delegate Gregorio Sablan (@Kilili_Sablan).
  • The GitHub repository @unitedstates has Twitter handles for 529 of the 537 members of Congress, with five representatives and three senators missing from its data set. The absentees from the repository are Rep. Collin Peterson (@collinpeterson), Rep. Greg Gianforte (@GregForMontana), Sen. Rick Scott (@SenRickScott), Sen. Tim Kaine (@timkaine), Rep. James Comer (@KYComer), Rep. Justin Amash (@justinamash), Rep. Lacy Clay (@LacyClayMO1), and Sen. Bill Cassidy (@SenBillCassidy).
  • Tweet Congress has Twitter handles for 530 of the 537 members of Congress, with five representatives and two senators missing. The absentees are Rep. Collin Peterson (@collinpeterson), Rep. Greg Gianforte (@GregForMontana), Sen. Rick Scott (@SenRickScott), Sen. Tim Kaine (@timkaine), Rep. James Comer (@KYComer), Rep. Justin Amash (@justinamash), and Rep. Lacy Clay (@LacyClayMO1).
The combined list of Twitter handles from all the sources is missing two representatives: Collin Peterson, who has been a representative from Minnesota since 1991-01-03, and Greg Gianforte, who has been a representative from Montana since 2017-06-21. The combined list also contains six members of Congress whose Twitter handles differ between sources.


Name             Position  Joined Congress  CSPAN           @unitedstates + TweetCongress
Chris Murphy     Sen.      2013-01-03       @ChrisMurphyCT  @senmurphyoffice
Marco Rubio      Sen.      2011-01-03       @marcorubio     @SenRubioPress
James Inhofe     Sen.      1994-11-16       @JimInhofe      @InhofePress
Julia Brownley   Rep.      2013-01-03       @RepBrownley    @JuliaBrownley26
Seth Moulton     Rep.      2015-01-03       @Sethmoulton    @teammoulton
Earl Blumenauer  Rep.      1996-05-21       @repblumenauer  @BlumenauerMedia
Members of the 116th Congress who have different Twitter handles in different sources
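The two comparisons in this section, which members a source is missing and where sources disagree on a handle, can be expressed as simple dictionary checks. This is a hedged sketch rather than the procedure we followed: the data layout (member name mapped to a handle, or to None when the name is listed without a handle) and the function names are assumptions, and the sample rows are a small excerpt of the tables above.

```python
from typing import Dict, Iterable, List, Optional

# member name -> handle; None means the name is listed without a handle,
# and an absent key means the member is not listed at all.
Source = Dict[str, Optional[str]]


def missing_members(members: Iterable[str], source: Source) -> List[str]:
    """Members for whom the source provides no Twitter handle."""
    return sorted(m for m in members if not source.get(m))


def conflicting_handles(sources: Dict[str, Source]) -> Dict[str, Dict[str, str]]:
    """Members whose handle differs between sources (compared case-insensitively)."""
    members = set().union(*(s.keys() for s in sources.values()))
    conflicts = {}
    for member in sorted(members):
        found = {name: src[member] for name, src in sources.items() if src.get(member)}
        if len({handle.lower() for handle in found.values()}) > 1:
            conflicts[member] = found
    return conflicts


# Excerpt of the data reported above.
cspan: Source = {"Marco Rubio": "@marcorubio", "Tim Kaine": "@timkaine"}
unitedstates: Source = {"Marco Rubio": "@SenRubioPress", "Tim Kaine": None}
gold = ["Collin Peterson", "Marco Rubio", "Tim Kaine"]
```

Here `missing_members(gold, unitedstates)` reports Collin Peterson and Tim Kaine, and `conflicting_handles({"CSPAN": cspan, "@unitedstates": unitedstates})` reports Marco Rubio with both handles.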

Possible reasons for disagreement in creating a Members of Congress Twitter handles data set


Scenarios involved in creating Twitter handles for members of Congress when done over a period of time

One Seat - One Member - One Twitter Handle: When creating a data set of Twitter handles for members of Congress over a period of time, the ideal situation is a seat held by one member for the entire tenure of the Congress, where that member has a single Twitter account. For example, Amy Klobuchar, senator from Minnesota, has only one Twitter account, @amyklobuchar.

Google search screenshot for Sen. Amy Klobuchar's Twitter account
Twitter screenshot for Sen. Amy Klobuchar's Twitter account

One Seat - One Member - No Twitter Handle: Here a seat is held by one member for the entire tenure of the Congress, but that member does not have a Twitter account. For example, Michael San Nicolas, delegate from Guam, has no Twitter account.

Screenshot from Congressman Michael San Nicolas page showing a Twitter link for HouseDems Twitter account while the rest of the social media icons are linked to his personal accounts

One Seat - One Member - Multiple Twitter Handles: Here a seat is held by one member for the entire tenure of the Congress, but that member has more than one Twitter account. Based on their purpose, these Twitter accounts can be classified as Personal, Official, and Campaign accounts.

  • Personal Account: A Twitter account used by a member of Congress to tweet their personal thoughts can be referred to as a personal account. A majority of these accounts might have a creation date prior to the member's election to Congress. For example, Marco Rubio, a senator from Florida, created his Twitter account @marcorubio in August 2008, while he was sworn in to Congress on 2011-01-03.
Screenshot for the Personal Twitter account of Sen. Marco Rubio from Florida. The account was created in August 2008 while he was sworn in to Congress on 2011-01-03
  • Official Account: A Twitter account used by a member of Congress or their staff to tweet official information about the member's activity to the general public is referred to as an official account. For a majority of these accounts, the creation date will be close to the date on which the member of Congress was elected. For example, Marco Rubio, a senator from Florida, has a Twitter account @senrubiopress with a creation date of December 2010, while he was sworn in to Congress on 2011-01-03.
Screenshot for the Official Twitter account of Sen. Marco Rubio from Florida. The account was created in December 2010 while he was sworn in to Congress on 2011-01-03.
  • Campaign Account: A Twitter account used by a member of Congress for election campaigning is referred to as a campaign account. For example, Rep. Greg Gianforte from Montana has a Twitter account @gregformontana which contains tweets related to his re-election campaigns.
Twitter Screenshot for the Campaign account of Rep. Greg Gianforte from Montana which contains tweets related to his re-election campaigns.
Twitter Screenshot for the Personal account of Rep. Greg Gianforte from Montana which has personal tweets from him. 
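The creation-date pattern described above suggests a rough first-pass heuristic for telling personal and official accounts apart. The sketch below is only that: the function name, the labels, and the 90-day window are assumptions, campaign accounts cannot be recognized from dates alone, and any result would still need manual review. Because only the creation month is known for the example accounts, the first of the month stands in for the exact day.

```python
from datetime import date


def guess_account_type(created: date, sworn_in: date, window_days: int = 90) -> str:
    """Guess an account type from how its creation date relates to the swearing-in date.

    The 90-day window is an arbitrary assumption, not a measured threshold.
    """
    days_before = (sworn_in - created).days
    if abs(days_before) <= window_days:
        return "likely official"   # created around the time of taking office
    if days_before > window_days:
        return "likely personal"   # created well before taking office
    return "undetermined"          # created long after taking office


# Sen. Marco Rubio's two accounts; day of month is a placeholder.
RUBIO_SWORN_IN = date(2011, 1, 3)
personal_guess = guess_account_type(date(2008, 8, 1), RUBIO_SWORN_IN)
official_guess = guess_account_type(date(2010, 12, 1), RUBIO_SWORN_IN)
```

For the two Rubio accounts the guesses come out as "likely personal" for @marcorubio and "likely official" for @senrubiopress, matching the classification above.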

One Seat - Multiple Members - Multiple Twitter Handles: Over a period of time, a seat in Congress can be held by different members at different points in the tenure of the Congress, each with a different Twitter account. An example from the 115th Congress is the Alabama Senate seat between January 2017 and July 2018. On February 9, 2017, Jeff Sessions resigned as senator and was succeeded by the Alabama Governor's appointee, Luther Strange. Following a special election, Luther Strange left office on January 3, 2018 to make way for Doug Jones as senator from Alabama. Now, who do we include as the senator from Alabama for the 115th Congress? We might decide to include all of them, based on the dates they joined or left office. But when the analysis covers a whole year, who will provide all the historical information for the Congress in session? As of now, all the sources we analyzed try to provide the most recent information rather than historical information about the current Congress and its members over the entire tenure.
  
Alabama Senate seat situation between January 2017 and July 2018. It highlights the issue in context of Social Feed Manager's 115th Congress tweet dataset.  
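One way to keep the historical information that the sources drop is to store each seat as a list of dated tenures instead of a single current member. The following is a minimal sketch under that assumption: the class, the function names, and the seat label are hypothetical, the dates are the ones given above, and 2017-01-03 is used as the start of the 115th Congress. A Twitter handle field could be added to each tenure in the same way.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Tenure:
    seat: str
    member: str
    start: date            # first day in office (inclusive)
    end: Optional[date]    # day the seat passed on (exclusive); None = still serving


def holders_on(tenures: List[Tenure], seat: str, day: date) -> List[str]:
    """Members holding the seat on a given day."""
    return [t.member for t in tenures
            if t.seat == seat and t.start <= day and (t.end is None or day < t.end)]


def holders_between(tenures: List[Tenure], seat: str, start: date, end: date) -> List[str]:
    """Members who held the seat at any point in the half-open interval [start, end)."""
    return [t.member for t in tenures
            if t.seat == seat and t.start < end and (t.end is None or t.end > start)]


SEAT = "Alabama Senate seat"  # hypothetical label
tenures = [
    Tenure(SEAT, "Jeff Sessions", date(2017, 1, 3), date(2017, 2, 9)),
    Tenure(SEAT, "Luther Strange", date(2017, 2, 9), date(2018, 1, 3)),
    Tenure(SEAT, "Doug Jones", date(2018, 1, 3), None),
]
```

Querying a single day returns one senator, while `holders_between(tenures, SEAT, date(2017, 1, 3), date(2018, 7, 31))` returns all three, which is the set of accounts a tweet collection over that period would need to cover.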
Another issue worth mentioning arises when members of Congress change their Twitter handles. For example, Rep. Alexandria Ocasio-Cortez from New York tweeted on 2018-12-28 about changing her Twitter handle from @ocasio2018 to @aoc. For popular Twitter accounts, a change of handle is easy to discover, but for a member of Congress who is not popular on Twitter, the change might go unnoticed for quite some time.

Screenshot of memento for @Ocasio2018
Screenshot of memento which shows the announcement for change of Twitter handle from @Ocasio2018 to @aoc 
Screenshot of @aoc
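Handle changes become easier to catch if the data set records each account's numeric Twitter user ID next to its handle, since the ID stays the same when the handle is changed. The sketch below compares two snapshots keyed by user ID. It assumes the IDs have already been collected by some means, the function name is illustrative, and the IDs in the example are made-up placeholders rather than real account IDs.

```python
from typing import Dict, Tuple


def handle_changes(old: Dict[int, str], new: Dict[int, str]) -> Dict[int, Tuple[str, str]]:
    """Map user ID -> (old handle, new handle) for accounts whose handle changed.

    Handles are compared case-insensitively; accounts missing from the newer
    snapshot are ignored here and would need a separate check.
    """
    changes = {}
    for user_id, old_handle in old.items():
        new_handle = new.get(user_id)
        if new_handle is not None and new_handle.lower() != old_handle.lower():
            changes[user_id] = (old_handle, new_handle)
    return changes


# Placeholder IDs, for illustration only.
snapshot_2018 = {1001: "@Ocasio2018", 1002: "@amyklobuchar"}
snapshot_2019 = {1001: "@aoc", 1002: "@AmyKlobuchar"}
```

Comparing the two snapshots flags only the first account, since the second differs in letter case alone.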

Twitter handle data set for the 116th Congress

  • We have created a data set of Twitter handles for the 116th Congress which resolves the issues of CSPAN, Tweet Congress, and @unitedstates (GitHub repository).
  • We have Twitter handles for all 537 current members of Congress who are on Twitter. The only member not covered is the delegate from Guam, who does not have a Twitter account.
  • Unlike other sources, our data set does not include any member of Congress who is not a part of the 116th Congress.
  • When the sources under investigation disagreed on a member's Twitter handle, we chose the account personally managed by the member of Congress (Personal Twitter Account) over accounts managed by their team or used for campaign purposes (Official or Campaign Accounts). We preferred personal accounts because some members of Congress explicitly state in the Twitter bio of their personal account that all the tweets are their own, which is not the case in the bios of their official or campaign accounts.
Twitter Screenshot of the Personal account for Rep. Seth Moulton where he states that all the tweets are his own in his Twitter bio.

Name             Position  WSDL Data set   CSPAN           @unitedstates + TweetCongress
Chris Murphy     Sen.      @ChrisMurphyCT  @ChrisMurphyCT  @senmurphyoffice
Marco Rubio      Sen.      @marcorubio     @marcorubio     @SenRubioPress
James Inhofe     Sen.      @JimInhofe      @JimInhofe      @InhofePress
Julia Brownley   Rep.      @RepBrownley    @RepBrownley    @JuliaBrownley26
Seth Moulton     Rep.      @Sethmoulton    @Sethmoulton    @teammoulton
Earl Blumenauer  Rep.      @repblumenauer  @repblumenauer  @BlumenauerMedia
Members of the 116th Congress who have different Twitter handles in different sources. The WSDL data set has personal Twitter handles over official Twitter handles
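The preference for personal accounts can be written down as a small priority rule. This is a sketch of the idea, not the process we used (the account types were judged by hand): the function name and data layout are assumptions, and while personal accounts ranking first follows the rule stated above, the relative order of official and campaign accounts is an arbitrary choice made for the example.

```python
from typing import Dict, Optional

# Lower number wins. Personal-first follows the rule described above;
# ranking official ahead of campaign is an assumption.
PRIORITY = {"personal": 0, "official": 1, "campaign": 2}


def choose_handle(candidates: Dict[str, str]) -> Optional[str]:
    """Pick one handle from a mapping of handle -> account type.

    Unknown account types rank last; returns None when there are no candidates.
    """
    ranked = sorted(candidates.items(),
                    key=lambda item: PRIORITY.get(item[1], len(PRIORITY)))
    return ranked[0][0] if ranked else None


rubio = {"@SenRubioPress": "official", "@marcorubio": "personal"}
chosen = choose_handle(rubio)
```

For Sen. Marco Rubio this selects @marcorubio, the handle that appears in the WSDL column of the table above.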

Conclusion


None of the three sources (Tweet Congress, the @unitedstates GitHub repository, and CSPAN) has full coverage of the Twitter handles for the members of the 116th Congress. One member of Congress does not have a Twitter account, and two more members do not have their Twitter handles present in any of the sources. No source provides historical information about the members of Congress over the entire tenure of a Congress, as all of them focus on recency. Creating a data set of Twitter handles for members of Congress seems an easy task at first glance, but it is a lot more difficult, owing to multiple reasons for disagreement, when the study covers a period of time. We share a data set of Twitter handles for the 116th Congress, built by combining all the lists.

https://github.com/oduwsdl/US-Congress

----
Mohammed Nauman Siddique
@m_nsiddique
