everything
An index of all our open-source data, analysis, libraries, tools, and guides.
Top Related Projects
Data and code behind the articles and graphics at FiveThirtyEight
The Washington Post is compiling a database of every fatal shooting in the United States by a police officer in the line of duty since 2015.
A repository of data on coronavirus cases and deaths in the U.S.
Quick Overview
The BuzzFeedNews/everything repository is an index of the open-source data, analyses, libraries, tools, and guides that BuzzFeed News has published alongside its stories. Each entry points to a separate repository and to the article(s) it supports, giving journalists, researchers, and the public a single place to find and verify the data behind BuzzFeed News's investigative and data-driven reporting.
Pros
- Promotes transparency in journalism by sharing raw data and methodologies
- Provides valuable datasets for researchers and data enthusiasts
- Encourages reproducibility of analyses and findings
- Serves as an educational resource for aspiring data journalists
Cons
- May require technical knowledge to fully utilize some datasets
- Not all BuzzFeed News stories have corresponding data in the repository
- Some datasets may become outdated over time
- Lack of standardized format across different projects
Getting Started
Because this repository is an index rather than a code library, there is nothing to install. To explore it:
- Visit the GitHub repository: https://github.com/BuzzFeedNews/everything
- Browse the tables in the README to find a project or dataset of interest
- Follow the entry's Repo(s) link and read that repository's README for context and instructions
- Download or clone the linked repository to access its data locally
- Use your preferred data analysis tools (e.g., R, Python, Excel) to explore the datasets
Note: Some projects may have specific requirements or dependencies, so be sure to check the individual project documentation for detailed instructions.
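As a rough illustration of the last step, the sketch below loads a CSV or JSON file from a locally cloned project repository using only the Python standard library. The function name `load_records` and the example path in the trailing comment are hypothetical and do not come from any BuzzFeed News repository; for CSV files, pandas' `read_csv` would serve equally well.

```python
import csv
import json
from pathlib import Path


def load_records(path):
    """Load a CSV file (as a list of row dicts) or a JSON file (as parsed)."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".csv":
        # DictReader maps each row to a dict keyed by the header row.
        with path.open(newline="", encoding="utf-8") as f:
            return list(csv.DictReader(f))
    if suffix == ".json":
        with path.open(encoding="utf-8") as f:
            return json.load(f)
    raise ValueError(f"Unsupported file type: {suffix!r}")


# Example (hypothetical path inside a cloned project repository):
# records = load_records("some-project/data/example.csv")
```

Note that CSV values come back as strings, so numeric columns need converting before analysis.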
Competitor Comparisons
Data and code behind the articles and graphics at FiveThirtyEight
Pros of data
- More frequently updated with new datasets
- Better organized directory structure
- Includes detailed README files for each dataset
Cons of data
- Narrower focus on political and sports data
- Less diverse range of topics covered
- Fewer total datasets available
Code comparison
data:
import pandas as pd
df = pd.read_csv('datasets/nfl-elo/nfl_elo.csv')
print(df.head())
everything:
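import json
# Illustrative path; adjust it to the JSON file in the project repository you cloned
with open('data/2016-10-presidential-campaign-donors/data.json') as f:
    data = json.load(f)  # parse the whole file while it is open
# Preview the first five records (assumes the top-level JSON value is a list)
print(data[:5])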
Summary
Both repositories provide valuable datasets for analysis, but they cater to different needs. data offers a more structured and frequently updated collection, focusing on political and sports data with detailed documentation. everything covers a broader range of topics but with less organization and fewer updates. The code examples illustrate loading a CSV file from data and a JSON file from a project listed in everything; actual file formats vary from project to project.
The Washington Post is compiling a database of every fatal shooting in the United States by a police officer in the line of duty since 2015.
Pros of data-police-shootings
- Focused dataset on a specific topic (police shootings)
- Regularly updated with new incidents
- Clear documentation and methodology
Cons of data-police-shootings
- Limited scope compared to the broader range of topics in everything
- Less diverse data formats and analysis tools
- Fewer supplementary resources and explanatory materials
Code comparison
data-police-shootings:
import pandas as pd
df = pd.read_csv('fatal-police-shootings-data.csv')
print(df.head())
everything:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('data/some_dataset.csv')
df.plot(x='date', y='value')
plt.show()
The data-police-shootings repository provides a single CSV file for analysis, while everything indexes a wider range of datasets and analyses. The code examples reflect this difference: the first loads the CSV directly, and the second loads a placeholder dataset and plots one column against its date.
A repository of data on coronavirus cases and deaths in the U.S.
Pros of covid-19-data
- Focused and specific dataset, making it easier to navigate and use for COVID-19 related analysis
- Regularly updated with current data, ensuring relevance for ongoing research
- Well-documented data sources and methodologies
Cons of covid-19-data
- Limited scope, only covering COVID-19 data
- Less diverse in terms of data types and topics covered
- May require additional datasets for comprehensive analysis
Code comparison
covid-19-data:
import pandas as pd
df = pd.read_csv('us-states.csv')
df['date'] = pd.to_datetime(df['date'])
df = df.sort_values(['state', 'date'])
everything:
import pandas as pd
df = pd.read_csv('data/some_dataset.csv')
df['date'] = pd.to_datetime(df['date'])
df = df.groupby('category').agg({'value': 'sum'})
Both repositories use pandas for data manipulation, but covid-19-data focuses on COVID-specific data processing, while everything demonstrates more general-purpose data analysis techniques across various datasets.
README
BuzzFeedNews/everything
An index of all our open-source data, analysis, libraries, tools, and guides.
Data and Analyses
Date | Description | Repo(s) | Article(s) |
---|---|---|---|
2022-04-27 | Data and analysis of state child abuse and neglect registries and appeals | :link: | :link: |
2022-04-25 | Data and analysis of intermediate care facilities | :link: | :link: |
2021-09-17 | Data and analysis re. US adult guardianship filing counts | :link: | :link: |
2021-05-26 | Analysis of excess deaths caused by the February 2021 winter storm and power outages in Texas | :link: | :link: |
2020-11-11 | Analysis of county-level COVID-19 deaths and presidential voter preference | :link: | :link: |
2020-10-28 | Analysis of 2020's "Electoral College effect" by demographic | :link: | :link: |
2020-06-04 | Analysis of "1033" program transfers since Ferguson | :link: | :link: |
2020-05-07 | Analysis of ZIP code–level COVID-19 cases in five major cities | :link: | :link: |
2020-02-27 | Analysis of Census tract–level gentrification in five major cities | :link: | :link: |
2019-11-11 | Analysis of U.S. Census Survey of Income and Program Participation (SIPP), re. generational trends in support providers | :link: | :link: |
2019-10-31 | Analysis for "Your Dumb Tweets Are Getting Flagged To People Trying To Stop School Shootings" | :link: | :link: |
2019-10-17 | Analysis for "Donald Trump’s Campaign Is Cashing In On Impeachment" | :link: | :link: |
2019-10-03 | Analysis of FCC comments and data breaches | :link: | :link: |
2019-08-03 | Analysis of ActBlue's 2019 mid-year FEC report | :link: | :link: |
2019-07-17 | Analysis of contributions to presidential campaigns, based on 2019 Q2 filings | :link: | :link: |
2019-04-22 | Data and code to make maps and animation depicting current realities of climate change | :link: | :link: |
2019-04-16 | Analysis of donors giving $200+ to multiple Democratic candidates early in 2020 election cycle | :link: | :link: |
2019-01-24 | Data and code for "Shoot Someone In A Major US City, And Odds Are You’ll Get Away With It" | :link: | :link: :link: |
2018-12-28 | Year-end analysis of fake news sites and viral posts, 2016–2018 | :link: | :link: |
2018-12-19 | Analysis of WeChat posts re. VP Pence | :link: | :link: |
2018-10-25 | Analysis and graphics for "How Russia’s Online Trolls Engaged Unsuspecting American Voters – And Sometimes Duped The Media" | :link: | :link: |
2018-10-18 | Analysis of 2018 midterm election demographics | :link: | :link: |
2018-09-29 | Analysis of 'immigration services'-related FTC complaints | :link: | :link: |
2018-08-10 | Data, analysis, and graphics for "Russian Trolls Swarmed The Charlottesville March – Then Twitter Cracked Down" | :link: | :link: |
2018-07-28 | Analysis of wildfire trends (with graphics) | :link: | :link: |
2018-07-26 | Analysis of children's home inspection data from the UK's Office for Standards in Education, Children's Services and Skills ("Ofsted") | :link: | :link::link: |
2018-06-29 | Analysis of NYC 311 complaints and gentrification | :link: | :link: |
2018-05-01 | Analysis of fentanyl and cocaine overdose deaths | :link: | :link: |
2018-03-02 | Analysis of diversity in the dialogue of Best Picture–nominated films | :link: | :link: |
2018-02-23 | Analysis of Olympic figure skating scores | :link: | :link: |
2018-02-08 | Data and analysis for "The Edge" (re. figure skating) | :link: | :link: |
2018-01-31 | Analysis of the text of every State of the Union address | :link: | :link: |
2018-01-24 | Data and analysis for "An Inside Look At The Accounts Twitter Has Censored In Countries Around The World" | :link: | :link: |
2018-01-23 | Data and analysis for "How Trump’s Tweets Shaped A Year In Politics" | :link: | :link: |
2017-12-28 | Data and analysis for "These Are 50 Of The Biggest Fake News Hits On Facebook In 2017" | :link: | :link: |
2017-12-10 | Data, analysis, and charts for "What Sexual Misconduct Allegations Are Getting The Most Attention On Cable News?" | :link: | :link: |
2017-12-05 | Data and analysis for "We Got Government Data On 20 Years Of Workplace Sexual Harassment Claims. These Charts Break It Down." | :link: | :link: |
2017-11-15 | Data on, and analysis of, federal employee diversity | :link: | :link: |
2017-11-03 | Data and code for "Under Trump, Gun Sales Did Not Spike After The Las Vegas Shooting" | :link: | :link: |
2017-09-19 | Updated analysis of Harvey-related industrial emissions in Texas | :link: | :link: |
2017-09-11 | Federal employee departure rates, for "Trump’s Election Didn’t Spark An Immediate Exodus From The Federal Government" | :link: | :link: |
2017-09-02 | FOIA logs referenced in "These Scientists Got To See Their Competitors’ Research Through Public Records Requests." | :link: | :link: |
2017-08-31 | Data and analysis on Harvey-related industrial emissions in Texas | :link: | :link: |
2017-08-08 | Data and analysis for "Inside The Partisan Fight For Your News Feed" | :link: | :link: |
2017-08-07 | Data and analysis for "BuzzFeed News Trained A Computer To Search For Hidden Spy Planes. This Is What We Found." | :link: | :link: |
2017-07-25 | Data and analysis for "If Jeff Sessions Exits, Trump Could Choose An Acting Attorney General From Among Thousands Of People" | :link: | :link: |
2017-05-24 | R code to recreate the graphics in "Why Americans Are So Damn Unhealthy, In 4 Shocking Charts" | :link: | :link: |
2017-04-04 | Data and analysis supporting portions of "Fake News, Real Ads" | :link: | :link: |
2017-01-31 | R code to recreate the graphics in "These Nobel Prizewinners Show Why Immigration Is So Important For American Science" | :link: | :link: |
2017-01-19 | Data and analysis supporting "Most American Adults Get News From Facebook – But They Don’t Really Trust It, A New Survey Says" | :link: | :link: |
2017-01-18 | Data and R code to reproduce the graphics in "2016 Was The Hottest Year. Yes, Greenhouse Gases Are To Blame." | :link: | :link: |
2016-12-29 | Data and analysis re. transgender rights survey | :link: | :link::link: |
2016-12-20 | Data and code to reproduce the graphics from "2016 Will Be The Warmest Year, But This Is How Deniers Will Spin It" | :link: | :link: |
2016-12-07 | Data, methodologies, and analyses supporting "Intake" | :link::link::link: | :link: |
2016-12-06 | Data and analysis supporting "Most Americans Who See Fake News Believe It, New Survey Says" | :link: | :link: |
2016-11-28 | Data and code supporting evaluation of forecasters' 2016 election forecast | :link: | :link: |
2016-11-07 | Data and analysis supporting "How The Electoral College Screws Hispanic And Asian Voters" | :link: | :link: |
2016-11-03 | Analysis of "bellwether" counties in U.S. presidential elections | :link: | :link: |
2016-10-27 | Data and analysis supporting "Clinton Receives Thirty Times As Much Tech Cash As Trump" | :link: | :link: |
2016-10-20 | Data and analysis supporting "Hyperpartisan Facebook Pages Are Publishing False And Misleading Information At An Alarming Rate" | :link: | :link: |
2016-10-09 | Data and analysis re. White ancestry and Trump support | :link: | :link: |
2016-09-16 | Code supporting "Why 'Shy Trumpers' Probably Won't Decide The Election" | :link: | :link: |
2016-09-08 | Data and analysis supporting "When Detectives Dismiss Rape Reports Before Investigating Them" | :link: | :link: |
2016-08-22 | Data and analysis supporting "How Katie Ledecky Stacks Up Against Male Swimmers" | :link: | :link: |
2016-07-30 | Data and code supporting "Why Track-And-Field Stars Don’t Set World Records Like They Used To (But Swimmers Do)" | :link: | :link: |
2016-07-24 | Data supporting "The Republican Convention Was Secretly Watched From Above" and "Government Spy Planes Circled Over The Democratic Convention More Intensely Than GOP Event" | :link: | :link: :link: |
2016-05-12 | Analysis of H-2 debarments, violations, and certifications ("The Pushovers") | :link: | :link: |
2016-04-26 | Analysis of GOP donor movements post-Bush and post-Rubio | :link: | :link: |
2016-04-20 | Analysis of Bernie Sanders's ActBlue donors | :link: | :link: |
2016-04-06 | Data and code for "Spies In The Skies" | :link: | :link: |
2016-02-01 | Bush/Rubio/Cruz donor movement analysis | :link: | :link: |
2016-01-29 | Data and code for "America's Quiet Crackdown On Indian Immigrants" | :link: | :link: |
2016-01-26 | Analysis of Jefferson County (TX) jail data | :link: | :link: |
2016-01-26 | Analysis of criminal case dispositions in Texas municipal courts | :link: | :link: |
2016-01-17 | Methodology and code for "The Tennis Racket" | :link: | :link: |
2015-12-29 | Data and analysis for "The Coyote" | :link: | :link: |
2015-12-09 | How long will the Warriors' win streak last? | :link: | :link: |
2015-12-07 | Race and fatal police shootings | :link: | :link: |
2015-12-02 | Time elapsed between mass shootings in the U.S. | :link: | :link: |
2015-12-01 | H-2 visa certifications and experience requirements | :link: | :link: |
2015-11-24 | Simulated lottery odds | :link: | :link: |
2015-11-19 | Refugee arrivals in the United States | :link: | :link: |
2015-10-16 | Data and analysis re. Scott Walker's donors post-dropout | :link: | :link: |
2015-08-25 | Data and analysis re. immigrant detention rates | :link: | :link: |
2015-07-24 | Data and analysis re. H-2 visa certifications and enforcement | :link: | :link: |
2015-07-07 | Data and analysis re. the use of primates in biodefense research | :link: | :link: |
2015-06-04 | Data and analysis of BuzzFeed/Ipsos poll on same-sex marriage and abortion views | :link: | :link: :link: |
2015-05-03 | Analyzing #talkpay tweets | :link: | :link: |
2015-03-06 | Analyzing state-by-state changes in earthquake frequency | :link: | :link: |
2015-02-20 | Analyzing deficiencies among Texas foster care child placing agencies | :link: | :link: |
2015-02-20 | Analyzing performance scores of Georgia child placing agencies | :link: | :link: |
2014-10-19 | Debunking the Obama-pronoun myth – data and code | :link: | :link: |
2014-09-05 | Detecting Sunday morning show guests whose "stars are rising" – data and code | :link: | :link: |
2014-09-04 | Comparing college costs to minimum-wage earnings – data, sourcing notes, and analysis | :link: | :link: |
2014-08-20 | Quantifying racial segregation in St. Louis County – code | :link: | :link: |
2014-08-13 | NBA owners' winning percentages – data | :link: | :link: |
2014-08-07 | FTC complaints re. IRS impersonators – data and analysis | :link: | :link: |
2014-06-30 | Firework-related injuries – data | :link: | :link: |
2014-06-16 | Mapping the gender divide in bikeshare programs – data and code | :link: | :link: |
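The index above is laid out as pipe-delimited rows (Date | Description | Repo(s) | Article(s)), so it can be filtered programmatically. The following is a minimal sketch, assuming the rows have been copied into a list of strings; the helper name `parse_index_rows` is hypothetical, and the two sample rows are taken from the table above.

```python
def parse_index_rows(lines):
    """Parse 'Date | Description | Repo(s) | Article(s) |' rows into dicts."""
    rows = []
    header = None
    for line in lines:
        # Drop the trailing pipe, then split the remaining text into cells.
        cells = [cell.strip() for cell in line.strip().rstrip("|").split("|")]
        if header is None:
            header = cells  # the first line names the columns
            continue
        # Skip the '---|---|---|---|' separator row.
        if all(set(cell) <= {"-", ":"} for cell in cells):
            continue
        rows.append(dict(zip(header, cells)))
    return rows


sample = [
    "Date | Description | Repo(s) | Article(s) |",
    "---|---|---|---|",
    "2016-01-26 | Analysis of Jefferson County (TX) jail data | :link: | :link: |",
    "2015-11-19 | Refugee arrivals in the United States | :link: | :link: |",
]
entries = parse_index_rows(sample)
# ISO-formatted dates sort correctly as plain strings.
recent = [e for e in entries if e["Date"] >= "2016-01-01"]
print([e["Description"] for e in recent])
```

The sketch assumes no cell contains a literal pipe character, which holds for the rows listed here.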
Standalone Datasets
Repo | Description |
---|---|
trumpworld | Data from TrumpWorld |
presidential-campaign-contributions | Contributions, transfers, and refunds from recent U.S. presidential candidates' principal campaign committees. |
nics-firearm-background-checks | Monthly data from the FBI's National Instant Criminal Background Check System, converted from PDF to CSV. |
H-2-certification-data | H-2 visa certification data & data-standardization. |
opm-federal-employment-data | 40+ years of federal employment data from the Office of Personnel Management |
Libraries and Tools
Repo | Description |
---|---|
whtranscripts | Fetch and parse the American Presidency Project's press-briefing and presidential-news-conference transcripts. |
bikeshares | Standardized parsers for data published by bicycle-sharing programs. Currently supporting: NYC's Citi Bike, Chicago's Divvy, and Boston's Hubway. |
twick | Twitter, quick. Fetch and store tweets on short notice. |
Guides
Repo | Description |
---|---|
zika-data | Data – and pointers to data – related to the 2015–16 Zika virus outbreak. |
bikeshare-data-sources | Guide for getting trip history and station data from various bicycle-sharing programs. |