Google Cloud Platform
New York City public datasets now available on Google BigQuery
This rich dataset makes it easy to learn how to explore and visualize data using BigQuery.
New York City is home to 8.5 million residents, and more than 50 million people visit this vibrant and dynamic city each year. With so many sights and sounds, it’s easy to get lost in the details and lose sight of the big picture: How do New Yorkers actually survive in the “concrete jungle”?
The New York City public datasets now available on BigQuery include:
- Over 8 million 311 service requests from 2012-2016 (updated daily)
- More than 1 million motor vehicle collisions 2012-present (updated regularly)
- Citi Bike stations and 30 million Citi Bike trips 2013-present (updated regularly)
- Over 1 billion Yellow and Green Taxi rides from 2009-present (updated regularly)
- Over 500,000 sidewalk trees surveyed decennially in 1995, 2005, and 2015
On which New York City streets are you most likely to find a loud party?
If there's something strange in your neighborhood, the right number to call is 311, a line created specifically for non-emergency municipal inquiries and non-urgent community concerns. What does that include?
The graph below shows the top five reasons New Yorkers have called 311 over the past four years.
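-- Assumed: the table name and the complaint_type column (not shown in the original snippet)
SELECT
  EXTRACT(YEAR FROM created_date) AS year,
  REPLACE(UPPER(complaint_type), "HEATING", "HEAT/HOT WATER") AS complaint,
  COUNT(*) AS count
FROM `bigquery-public-data.new_york.311_service_requests`
GROUP BY complaint, year
ORDER BY count DESC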
Call volume tells us that it gets noisy in New York, and it also gets very cold. By joining the 311 calls to the NOAA GSOD weather table, we confirm that most calls about faulty heat and hot water happen when the temperature drops — while noise remains a constant annoyance.
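To make the shape of that join concrete, here is a minimal sketch that runs locally with Python's built-in `sqlite3` module rather than against BigQuery. The table names (`service_requests`, `daily_weather`), the column names, the 40 °F threshold, and every row are made up for illustration; the real 311 and NOAA GSOD tables have different schemas.

```python
import sqlite3

# Toy stand-ins for the 311 and weather tables; every row is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE service_requests (created_date TEXT, complaint TEXT);
    CREATE TABLE daily_weather (day TEXT, mean_temp_f REAL);
    INSERT INTO service_requests VALUES
        ('2016-01-05', 'HEAT/HOT WATER'),
        ('2016-01-05', 'HEAT/HOT WATER'),
        ('2016-01-05', 'NOISE'),
        ('2016-07-10', 'NOISE'),
        ('2016-07-10', 'NOISE');
    INSERT INTO daily_weather VALUES ('2016-01-05', 20.0), ('2016-07-10', 80.0);
""")

# Join each call to the weather on the day it was made, then count calls
# per complaint on cold versus warm days (the threshold is arbitrary).
rows = conn.execute("""
    SELECT r.complaint,
           CASE WHEN w.mean_temp_f < 40 THEN 'cold' ELSE 'warm' END AS band,
           COUNT(*) AS calls
    FROM service_requests AS r
    JOIN daily_weather AS w ON w.day = r.created_date
    GROUP BY r.complaint, band
    ORDER BY r.complaint, band
""").fetchall()

for complaint, band, calls in rows:
    print(complaint, band, calls)
```

The same pattern (join on the calendar day, bucket the temperature, count per complaint) should carry over to BigQuery once the real table and column names are substituted.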
There’s a lot of traffic in New York, and while the number of accidents has slowly increased each year, the number of injuries has remained fairly consistent. Fortunately, the number of deaths has dropped by an average of 9% each year.
As you can see below, “Driver Inattention/Distraction” is the most likely cause of accident and injury, but disregarding traffic control (such as running a red light) is the most common cause of death.
The following graphs show that most traffic accidents happen in Brooklyn, but it’s Midtown and Downtown Manhattan that have the highest concentration of collisions — and Staten Island the highest proportion of deaths per accident.
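The per-borough comparison comes down to a grouped count and a ratio. The sketch below shows that computation with Python's built-in `sqlite3` module; the `collisions` table, its columns, and all rows are hypothetical placeholders and do not reproduce the real collision figures.

```python
import sqlite3

# Hypothetical collision records: one row per collision, invented values.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE collisions (borough TEXT, persons_killed INTEGER);
    INSERT INTO collisions VALUES
        ('BROOKLYN', 0), ('BROOKLYN', 0), ('BROOKLYN', 0), ('BROOKLYN', 1),
        ('STATEN ISLAND', 0), ('STATEN ISLAND', 1);
""")

# Count collisions and deaths per borough, then divide to get the
# proportion of deaths per collision (1.0 * forces real division).
per_borough = conn.execute("""
    SELECT borough,
           COUNT(*) AS collisions,
           SUM(persons_killed) AS deaths,
           1.0 * SUM(persons_killed) / COUNT(*) AS deaths_per_collision
    FROM collisions
    GROUP BY borough
    ORDER BY collisions DESC
""").fetchall()

for borough, collisions, deaths, ratio in per_borough:
    print(borough, collisions, deaths, ratio)
```

Ranking by the raw count and ranking by the ratio can give different orderings, which is why the two measures are reported separately.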
Comparing the average duration of five of the most popular Citi Bike routes with that of taxi journeys beginning and ending within an approximately 50-meter radius of the corresponding Citi Bike stations, we see that for trips under 10 minutes there’s little difference between taking a taxi and riding a bike.
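One way to match taxi trip endpoints to stations within a 50-meter radius is a great-circle distance check. The sketch below uses the haversine formula. This is an assumption, since the post does not say how the radius was computed, and the helper names and coordinates are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def near_station(point, station, radius_m=50.0):
    """True if a taxi pickup or dropoff point lies within radius_m of a station."""
    return haversine_m(*point, *station) <= radius_m


# Hypothetical coordinates, chosen only to exercise the check.
station = (40.7500, -73.9900)
print(near_station((40.7503, -73.9900), station))
print(near_station((40.7510, -73.9900), station))
```

A taxi trip would count as comparable to a Citi Bike route only when both its pickup and its dropoff pass this check against the route's start and end stations.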
If you’re new to BigQuery, here are some concepts to keep in mind while working with the New York City datasets:
- With BigQuery, everyone gets one terabyte at no charge every month to run queries. If you've never tried BigQuery before, follow these getting started instructions.