Google Sheets is a browser-based app that allows users to analyze data and is a great free alternative to programs like Excel.
Previously known as Google Spreadsheets, users can import CSV or Excel files and access it on any computer with an internet connection. Other perks include sharing options and Google-driven search functions.
It came out several years ago, but some have hesitated to make the switch because it lacked the powerful functions that Excel had. However, that changed recently when Google introduced add-ons that allowed programmers to expand on its functionality, like getting rid of duplicate rows.
This tutorial will show you how make pivot tables in Google Sheets. Pivot tables is a way to summarize and explore data interactively. It helps extract what might be significant from a large, detailed data set. I’ve also written about how to do the same thing in R.
We’ll be working with data from the National Institute of Health. We’re going to investigate how many research grants were awarded in the state of Connecticut.
First, import the research_grants.csv data into Google Sheets.
Select all the rows and columns with values (ctrl + a, cmd + a, or click the cell where the rows and column headers intersect), and then click on Data > Pivot tables in the menu.
A new sheet will pop up with a blank grid.
Here’s where you can explore the data interactively. Let’s start out with a basic question: Where does Connecticut rank in number of NIH grants in the country?
On the right, in the window called Report Editor, click the Add field link next to Rows. Scroll down the pull down menu and click recipient_state_code. Everything from the recipient_state_code column in the original spreadsheet will pop up on the left (You can double check the original data by clicking Sheet 1 at the bottom left of the window).
Skip Columns for now and select Add field next to Values. We want to count how many times a state occurs up in the original data set. You could theoretically select anything in the values pulldown, but let’s go ahead and select recipient_zip.
The default shows up as SUM, which adds up all the zipcodes. That’s not what we want. Go to the pulldown next to Summarize by: and select COUNTA.
OK, there were 12 grants awarded to organizations in Connecticut in 2013. Where does that rank?
We need to sort the new data. But we can’t sort this particular sheet where the pivot table was created. We need to copy and paste the data into a new sheet. Copy and right-click when pasting to select Paste special > Values only. This will make no extra formulas or pivot table code was picked up.
Go ahead and add the column headers. Select the header of the column with the grant counts and then right click the arrow button in the header and select Sort A-Z
It looks like Connecticut ranks 13th in the country. Meh.
But what about in total grant money? Where does Connecticut rank there?
Go back to the Pivot Table 1 sheet at the bottom
Get rid of the Display: recipient_zip in the Values window and instead add total_funding_amount. This time in Summarize by: select SUM in the pulldown menu.
Copy and paste the results into another sheet so you can analyze it. Sort it and you’ll see that CT is ranked sixth in total NIH grants for 2013. That’s interesting.
Let’s do another.
This time let’s take a look at which specific entities are getting the grants, how much total money they’re getting, as well as the average.
Change the Group by: to recipient_name and change Values: to Display: recipient_zip and Summarize by: COUNTA.
Then click Add field next to Values: to add another column in the pivot table. We have a grant count column but now we want to see how much total money each entity is getting.
Select total_funding_amount in Display: . Set Summarize to SUM
Repeat the process again for total_funding_amount to add a column that calculates the average of total_funding_amount.
Copy and paste the data (values only) into another sheet so it can be sorted. It looks like Yale has got all 12 grants that were awarded in Connecticut. And after sorting the total grant figures, it appears Yale received $10 million in NIH research grants. That’s fifth out of nearly 200 other organizations that got similar funding in 2013. Averaged out, each Yale grant amounted to about $850,000.
Last one. This time to show an example of what happens when you Add field in the Column: window of the Report Editor.
Go back to sorting the column by recipient_state_code.
In Columns click Add field and select recipient_type from the pulldown menu.
In Values select Summarize total_funding_amount by SUM
This broke out the data into even more specific figures than before. Last time we looked at how much total money each state got from NIH grants. But this extra variable in the Column field broke out the amount by more specific factors.
This new table we created took the original data of 577 rows and 40 columns and summarized it to tell us how much grant money was going to each state and, more specifically, how much was going to different types of groups.
Before, we saw that Connecticut was ranked 7th in total NIH federal grants. But after drilling into the data with the additional type column, we see that Connecticut is ranked 5th in NIH grants given to private higher education in 2013.
Pivot tables is a powerful and easy way to make sense of big data. Have fun.