This dataset is collected by the NYC Taxi and Limousine Commission (TLC) and includes trip records from all trips completed in yellow and green taxis in NYC from 2009 to 2015. Records include fields capturing pick-up and drop-off dates/times, pick-up and drop-off locations, trip distances, itemized fares, rate types, payment types, and driver-reported passenger counts. For detailed information about this dataset, go to TOC Trip Record Data
You can start exploring this data for yellow and green taxis in the BigQuery console:
Here are some examples of SQL queries you can run on this data in BigQuery.
2014 trips by month (yellow cabs)
SELECT LEFT(STRING(pickup_datetime), 7) month, COUNT(*) trips FROM [nyc-tlc:yellow.trips] WHERE YEAR(pickup_datetime) = 2014 GROUP BY 1 ORDER BY 1
Average speed per hour of day (yellow cabs)
SELECT HOUR(pickup_datetime) hour, INTEGER(100*AVG(trip_distance/((dropoff_datetime-pickup_datetime)/3600000000)))/100 speed FROM [nyc-tlc:yellow.trips] WHERE fare_amount/trip_distance BETWEEN 2 AND 10 GROUP BY 1 ORDER BY 1
During the day the average speed is around 11-12 MPH, but at 5 AM the average speed almost doubles to 22 MPH.
Average speed by day of the week
SELECT DAYOFWEEK(pickup_datetime) dayofweek, INTEGER(100*AVG(trip_distance/((dropoff_datetime-pickup_datetime)/3600000000)))/100 speed FROM [nyc-tlc:yellow.trips] WHERE fare_amount/trip_distance BETWEEN 2 AND 10 AND HOUR(pickup_datetime) BETWEEN 8 AND 18 GROUP BY 1 ORDER BY 1
Sundays are the fastest days of the week, followed by Saturday and Monday. Why are Mondays faster than all the other week days?
About the data
Dataset Source: NYC OpenData
Category: Geographic, Transportation
Use: This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source — https://data.cityofnewyork.us/ — and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.