NYC Taxi and Limousine Trips

This dataset is collected by the NYC Taxi and Limousine Commission (TLC) and includes trip records from all trips completed in Yellow and Green taxis in NYC from 2009 to present, and all trips in for-hire vehicles (FHV) from 2015 to present. Records include fields capturing pick-up and drop-off dates/times, pick-up and drop-off locations, trip distances, itemized fares, rate types, payment types, and driver-reported passenger counts. For detailed information about this dataset, go to TOC Trip Record Data

You can start exploring this data for taxis and FHVs in the BigQuery console:

Go to New York City dataset

Sample queries

Here are some examples of SQL queries you can run on this data in BigQuery.

These samples use BigQuery’s support for standard SQL. Use the #standardSQL tag to let BigQuery know you want to use standard SQL. For more information about the #standardSQL prefix, see Setting a query prefix.

How many trips did Yellow taxis take each month in 2015?

This query returns monthly trip totals for all Yellow taxis in 2015.

Web UI

Open the following query in the Web UI

#standardSQL
SELECT
  TIMESTAMP_TRUNC(pickup_datetime,
    MONTH) month,
  COUNT(*) trips
FROM
  `bigquery-public-data.new_york.tlc_yellow_trips_2015`
GROUP BY
  1
ORDER BY
  1

Command-line

bq query --use_legacy_sql=false '
SELECT
  TIMESTAMP_TRUNC(pickup_datetime,
    MONTH) month,
  COUNT(*) trips
FROM
  `bigquery-public-data.new_york.tlc_yellow_trips_2015`
GROUP BY
  1
ORDER BY
  1'

The results are shown here:

+---------------------+----------+
|        month        |  trips   |
+---------------------+----------+
| 2015-01-01 00:00:00 | 12748986 |
| 2015-02-01 00:00:00 | 12450521 |
| 2015-03-01 00:00:00 | 13351609 |
| 2015-04-01 00:00:00 | 13071789 |
| 2015-05-01 00:00:00 | 13158262 |
| 2015-06-01 00:00:00 | 12324935 |
| 2015-07-01 00:00:00 | 11562783 |
| 2015-08-01 00:00:00 | 11130304 |
| 2015-09-01 00:00:00 | 11225063 |
| 2015-10-01 00:00:00 | 12315488 |
| 2015-11-01 00:00:00 | 11312676 |
| 2015-12-01 00:00:00 | 11460573 |
+---------------------+----------+

What was the average speed of Yellow taxi trips in 2015?

This query returns the average speed during each hour of the day for Yellow taxi trips in 2015.

Web UI

Open the following query in the Web UI

#standardSQL
SELECT
  EXTRACT(HOUR
  FROM
    pickup_datetime) hour,
  ROUND(AVG(trip_distance / TIMESTAMP_DIFF(dropoff_datetime,
        pickup_datetime,
        SECOND))*3600, 1) speed
FROM
  bigquery-public-data.new_york.tlc_yellow_trips_2015
WHERE
  trip_distance > 0
  AND fare_amount/trip_distance BETWEEN 2
  AND 10
  AND dropoff_datetime > pickup_datetime
GROUP BY
  1
ORDER BY
  1

Command-line

bq query --use_legacy_sql=false '
SELECT
  EXTRACT(HOUR
  FROM
    pickup_datetime) hour,
  ROUND(AVG(trip_distance / TIMESTAMP_DIFF(dropoff_datetime,
        pickup_datetime,
        SECOND))*3600, 1) speed
FROM
  `bigquery-public-data.new_york.tlc_yellow_trips_2015`
WHERE
  trip_distance > 0
  AND fare_amount/trip_distance BETWEEN 2
  AND 10
  AND dropoff_datetime > pickup_datetime
GROUP BY
  1
ORDER BY
  1'

During the day the average speed is around 11-12 MPH, but at 5 AM the average speed almost doubles to 21 MPH.

The results are shown here:

+------+-------+
| hour | speed |
+------+-------+
|    0 |  15.8 |
|    1 |  16.3 |
|    2 |  16.8 |
|    3 |  17.5 |
|    4 |  20.0 |
|    5 |  21.6 |
|    6 |  17.6 |
|    7 |  13.7 |
|    8 |  11.6 |
|    9 |  11.4 |
|   10 |  11.5 |
|   11 |  11.3 |
|   12 |  11.2 |
|   13 |  11.3 |
|   14 |  11.2 |
|   15 |  11.0 |
|   16 |  11.5 |
|   17 |  11.2 |
|   18 |  11.1 |
|   19 |  11.8 |
|   20 |  12.9 |
|   21 |  13.6 |
|   22 |  14.1 |
|   23 |  14.9 |
+------+-------+

What was the average speed of Yellow taxi trips in 2015 by day of week?

This query shows the average speed by the day of the week, for all Yellow taxi trips in 2015.

Web UI

Open the following query in the Web UI

#standardSQL
SELECT
  EXTRACT(DAYOFWEEK
  FROM
    pickup_datetime) DAYOFWEEK,
  ROUND(AVG(trip_distance / TIMESTAMP_DIFF(dropoff_datetime,
        pickup_datetime,
        SECOND))*3600, 1) speed
FROM
  bigquery-public-data.new_york.tlc_yellow_trips_2015
WHERE
  trip_distance > 0
  AND fare_amount/trip_distance BETWEEN 2
  AND 10
  AND dropoff_datetime > pickup_datetime
GROUP BY
  1
ORDER BY
  1

Command-line

bq query --use_legacy_sql=false '
SELECT
  EXTRACT(DAYOFWEEK
  FROM
    pickup_datetime) DAYOFWEEK,
  ROUND(AVG(trip_distance / TIMESTAMP_DIFF(dropoff_datetime,
        pickup_datetime,
        SECOND))*3600, 1) speed
FROM
  `bigquery-public-data.new_york.tlc_yellow_trips_2015`
WHERE
  trip_distance > 0
  AND fare_amount/trip_distance BETWEEN 2
  AND 10
  AND dropoff_datetime > pickup_datetime
GROUP BY
  1
ORDER BY
  1'

Sundays are the fastest days of the week, followed by Saturday and Monday. Why are Mondays faster than all the other week days?

The results are shown here:

+-----------+-------+
| DAYOFWEEK | speed |
+-----------+-------+
|         1 |  14.7 |
|         2 |  13.4 |
|         3 |  12.3 |
|         4 |  12.2 |
|         5 |  12.2 |
|         6 |  12.3 |
|         7 |  13.1 |
+-----------+-------+

About the data

Dataset Source: NYC OpenData

Category: Geographic, Transportation

Use: This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source — https://data.cityofnewyork.us/ — and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.

Update Frequency: Weekly

View in BigQuery: Go to New York City dataset

Send feedback about...

BigQuery Documentation