Core GTFS
These are recommended practices for describing public transportation services in the General Transit Feed Specification (GTFS). These practices have been synthesized from the experience of the GTFS Best Practices working group members and application-specific GTFS practice recommendations. For further background, see the Frequently Asked Questions.
Document Structure
Practices are organized into three primary sections:
- Dataset Publishing & General Practices: These practices relate to the overall structure of the GTFS dataset and to the manner in which GTFS datasets are published.
- Practice Recommendations Organized by File: Recommendations are organized by file and field in the GTFS to facilitate mapping practices back to the official GTFS reference.
- Practice Recommendations Organized by Case: With particular cases, such as loop routes, practices may need to be applied across several files and fields. Such recommendations are consolidated in this section.
Frequently Asked Questions (FAQ)
Why are these GTFS Best Practices important?
The objectives of GTFS Best Practices are:
- To improve end-user customer experience in public transportation apps
- To support broad data interoperability, making it easier for software developers to deploy and scale applications, products, and services
- To facilitate the use of GTFS in various application categories (beyond its original focus on trip planning)
Without coordinated GTFS Best Practices, various GTFS-consuming applications may establish requirements and expectations in an uncoordinated way, which leads to diverging requirements and application-specific datasets and less interoperability. Prior to the release of the Best Practices, there was greater ambiguity and disagreement in what constitutes correctly-formed GTFS data.
How were they developed? Who developed them?
These Best Practices were developed by a working group of 17 organizations involved in GTFS, including app providers and data consumers, transit providers, and consultants with extensive involvement in GTFS. The working group was convened and facilitated by Rocky Mountain Institute.
Working Group members voted on each Best Practice. Most Best Practices were approved by a unanimous vote. In a minority of cases, Best Practices were approved by a large majority of organizations.
Why not just change the GTFS reference?
Good question! The process of examining the Specification, data usage, and needs did indeed trigger some changes to the Specification (see closed pull requests in GitHub). Specification reference amendments are subject to a higher bar of scrutiny and comment than the Best Practices. However, there was still a need to agree on a clear set of Best Practice recommendations.
The working group anticipates that some GTFS Best Practices will eventually become part of the core GTFS reference.
Do GTFS validator tools check for conformance with these Best Practices?
No validator tool currently checks for conformance with all Best Practices. Various validator tools check for conformance with some of these best practices. For a list of GTFS validator tools, see Testing Feeds. If you write a GTFS validator tool that references these Best Practices, please email gtfs@rmi.org.
I represent a transit agency. What steps can I take so that our software service providers and vendors follow these Best Practices?
Refer your vendor or software service provider to these Best Practices. We recommend referencing the GTFS Best Practices URL, as well as the core Spec Reference, in procurement for GTFS-producing software.
What should I do if I notice a GTFS data feed does not conform to these Best Practices?
Identify the contact for the feed, using the proposed feed_contact_email or feed_contact_url fields in feed_info.txt if they exist, or by looking up contact information on the transit agency or feed producer website. When communicating the issue to the feed producer, link to the specific GTFS Best Practice under discussion. See Linking to this Document.
I would like to propose a modification/addition to the Best Practices. How do I do this?
Email gtfs@rmi.org or open an issue or pull request in the GitHub GTFS Best Practices repo.
How do I get involved?
Email gtfs@rmi.org.
Dataset Publishing & General Practices
General Recommendations |
---|
Datasets should be published at a public, permanent URL, including the zip file name (e.g., www.agency.org/gtfs/gtfs.zip). Ideally, the URL should be directly downloadable without requiring a login, to facilitate download by consuming software applications. While it is recommended (and the most common practice) to make a GTFS dataset openly downloadable, if a data provider does need to control access to GTFS for licensing or other reasons, it is recommended to control access to the GTFS dataset using API keys, which will facilitate automatic downloads. |
GTFS data is published in iterations so that a single file at a stable location always contains the latest official description of service for a transit agency (or agencies). |
Maintain persistent identifiers (id fields) for stop_id, route_id, and agency_id across data iterations whenever possible. |
One GTFS dataset should contain current and upcoming service (sometimes called a “merged” dataset). The merge function of Google's transitfeed tool can be used to create a merged dataset from two different GTFS feeds. |
Remove old services (expired calendars) from the feed. |
If a service modification will go into effect in 7 days or fewer, express this service change through a GTFS-realtime feed (service advisories or trip updates) rather than a static GTFS dataset. |
The web-server hosting GTFS data should be configured to correctly report the file modification date (see HTTP/1.1 - Request for Comments 2616, under Section 14.29). |
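The recommendation to remove expired calendars can be checked mechanically before publishing. The following is a minimal sketch, assuming services are defined in calendar.txt with the standard service_id and end_date columns; the function name and sample data are hypothetical, and services defined only through calendar_dates.txt are not covered.

```python
import csv
import io
from datetime import date, datetime


def expired_service_ids(calendar_csv: str, today: date) -> list[str]:
    """Return service_id values whose end_date falls before `today`."""
    expired = []
    for row in csv.DictReader(io.StringIO(calendar_csv)):
        # GTFS dates use the YYYYMMDD format.
        end = datetime.strptime(row["end_date"], "%Y%m%d").date()
        if end < today:
            expired.append(row["service_id"])
    return expired


# Hypothetical calendar.txt content for illustration only.
SAMPLE = """service_id,monday,tuesday,wednesday,thursday,friday,saturday,sunday,start_date,end_date
WKDY_OLD,1,1,1,1,1,0,0,20230101,20230630
WKDY_NEW,1,1,1,1,1,0,0,20240101,20241231
"""

print(expired_service_ids(SAMPLE, date(2024, 3, 1)))
```

Trips that reference any returned service_id are candidates for removal along with the calendar row itself.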
Practice Recommendations Organized by File
This section shows practices organized by file and field, aligning with the GTFS reference.
All Files
Field Name | Recommendation |
---|---|
Mixed Case | All customer-facing text strings (including stop names, route names, and headsigns) should use Mixed Case (not ALL CAPS), following local conventions for capitalization of place names on displays capable of displaying lower case characters. |
Examples: | |
Brighton Churchill Square | |
Villiers-sur-Marne | |
Market Street | |
Abbreviations | Avoid use of abbreviations throughout the feed for names and other text (e.g. St. for Street) unless a location is called by its abbreviated name (e.g. “JFK Airport”). Abbreviations may be problematic for accessibility by screen reader software and voice user interfaces. Consuming software can be engineered to reliably convert full words to abbreviations for display, but converting from abbreviations to full words is prone to more risk of error. |
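A feed producer could flag ALL CAPS strings automatically. The sketch below is one possible heuristic, not part of these practices: the function name and the three-letter threshold (meant to let short abbreviations such as “JFK” through) are assumptions.

```python
def looks_all_caps(text: str) -> bool:
    """Heuristically flag customer-facing text written entirely in capitals."""
    letters = [c for c in text if c.isalpha()]
    # Assumption: strings with three or fewer letters are likely legitimate
    # abbreviations (e.g. "JFK") and are not flagged.
    return len(letters) > 3 and all(c.isupper() for c in letters)


for name in ["MARKET STREET", "Market Street", "JFK Airport", "JFK"]:
    print(name, looks_all_caps(name))
```

Only "MARKET STREET" is flagged in this example; the other three strings pass.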
agency.txt
Field Name | Recommendation |
---|---|
agency_id | Should be included, even if there is only one agency in the feed. (See also recommendation to include agency_id in routes.txt and fare_attributes.txt.) |
agency_lang | Should be included. |
agency_phone | Should be included unless no such customer service phone exists. |
agency_email | Should be included unless no such customer service email exists. |
agency_fare_url | Should be included unless the agency is fully fare-free. |
Examples:
- Bus services are run by several small bus agencies, but one big agency is responsible for scheduling and ticketing and, from a user’s perspective, is responsible for the bus services. The one big agency should be defined as the agency within the feed. Even if the data is split internally by different small bus operators, there should only be one agency defined in the feed.
- The feed provider runs the ticketing portal, but there are different agencies that actually operate the services and are known by users to be responsible. The agencies actually operating the services should be defined as agencies within the feed.
stops.txt
Field Name | Recommendation |
---|---|
stop_id | Stops that are in different physical locations (i.e., different designated precise locations for vehicles on designated routes to stop, potentially distinguished by signs, shelters, or other such public information, located on different street corners or representing a different boarding facility such as a platform or bus bay, even if near each other) should have different stop_id values. |
| | stop_id is an internal ID, not intended to be shown to passengers. |
| | Maintain consistent stop_id for the same stops across data iterations (see Dataset Publishing & General Practices). |
stop_name | The stop_name should match the agency's public name for the stop, station, or boarding facility, e.g. what is printed on a timetable, published online, and/or presented at the location. |
| | When there is not a published stop name, follow consistent stop naming conventions throughout the feed. |
| | Avoid use of abbreviations other than for places that are most commonly called by an abbreviated name. See Abbreviations (#2) under All Files. |
| | Provide stop names in mixed case, following local conventions, as per recommendation for all customer-facing text fields. |
| | By default, stop_name should not contain generic or redundant words like “Station” or “Stop”, but some edge cases are allowed. |
stop_lat & stop_lon | Stop locations should be as accurate as possible. Stop locations should have an error of no more than four meters when compared to the actual stop position. |
| | Stop locations should be placed very near to the pedestrian right of way where a passenger will board (i.e. correct side of the street). |
| | If a stop location is shared across separate data feeds (i.e. two agencies use exactly the same stop / boarding facility), indicate the stop is shared by using the exact same stop_lat and stop_lon for both stops. |
stop_code | stop_code should be included in GTFS if there are passenger-facing stop numbers or short identifiers. |
parent_station & location_type | Many stations or terminals have multiple boarding facilities (depending on mode, they might be called a bus bay, platform, wharf, gate, or another term). In such cases, feed producers should describe stations, boarding facilities (also called child stops), and their relation. |
| | When naming the station and child stops, set names that are well-recognized by riders, and can help riders to identify the station and boarding facility (bus bay, platform, wharf, gate, etc.). |
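The four-meter accuracy recommendation for stop_lat and stop_lon can be tested against surveyed coordinates. The sketch below uses the haversine great-circle formula with a mean Earth radius, which is an approximation; the function names and the surveyed reference position are assumptions for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (approximation)


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_tolerance(feed_pos, surveyed_pos, tolerance_m: float = 4.0) -> bool:
    """True if the feed's (lat, lon) is within `tolerance_m` of the surveyed (lat, lon)."""
    return haversine_m(*feed_pos, *surveyed_pos) <= tolerance_m


# Hypothetical coordinates: 0.001 degrees of latitude is roughly 111 m.
print(within_tolerance((45.0, -122.0), (45.001, -122.0)))
```

The same distance function can also help detect stops in separate feeds that are meant to be shared but do not use exactly the same coordinates.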
routes.txt
Field Name | Recommendation | ||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
agency_id |
Must be included if it is defined in agency.txt . |
||||||||||||||||||||
route_short_name |
Include route_short_name if there is a brief service designation. This should
be the commonly-known passenger name of the service, no longer than 12 characters. |
||||||||||||||||||||
route_long_name |
The definition from the Specification reference:
This name is generally more descriptive than the route_short_name. Examples of types of long names are below:
|
||||||||||||||||||||
route_long_name should not contain the route_short_name . |
|||||||||||||||||||||
Include the full designation including a service identity when populating
route_long_name . Examples:
|
|||||||||||||||||||||
route_id |
All trips on a given named route should reference the same route_id .
|
||||||||||||||||||||
If a route group includes distinctly named branches (e.g. 1A and 1B), follow the recommendations in the route branches case to determine route_short_name and route_long_name. |
|||||||||||||||||||||
route_color & route_text_color |
Should be consistent with signage and printed and online customer information (and thus not included if they do not exist in other places). |
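Two of the routes.txt recommendations (a route_short_name of at most 12 characters, and a route_long_name that does not repeat the route_short_name) lend themselves to a simple check. This is a minimal sketch with a hypothetical function name; the case-insensitive substring test is an assumption and can produce false positives for very short names (for example a short name of “1” inside “1st Avenue”).

```python
def route_name_issues(short_name: str, long_name: str) -> list[str]:
    """Return a list of naming problems found for one routes.txt record."""
    issues = []
    if len(short_name) > 12:
        issues.append("route_short_name longer than 12 characters")
    # Naive containment test; review hits manually before changing data.
    if short_name and short_name.lower() in long_name.lower():
        issues.append("route_long_name contains route_short_name")
    return issues


print(route_name_issues("42", "Route 42 Downtown"))
print(route_name_issues("5", "Mission"))
```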
trips.txt
- See special case for loop routes: Loop routes are cases where trips start and end at the same stop, as opposed to linear routes, which have two distinct termini. Loop routes must be described following specific practices. See Loop route case.
- See special case for lasso routes: Lasso routes are a hybrid of linear and loop geometries, in which vehicles travel on a loop for only a portion of the route. Lasso routes must be described following specific practices. See Lasso route case.
Field Name | Recommendation | ||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
trip_headsign |
Do not provide route names (matching route_short_name and
route_long_name ) in the trip_headsign or
stop_headsign fields. |
||||||||||||||
Should contain destination, direction, and/or other trip designation text shown on the
headsign of the vehicle which may be used to distinguish amongst trips in a route.
Consistency with direction information shown on the vehicle is the primary and overriding
goal for determining headsigns supplied in GTFS datasets. Other information should be
included only if it does not compromise this primary goal. If headsigns change during a
trip, override trip_headsign with stop_times.stop_headsign . Below
are recommendations for some possible cases: |
|||||||||||||||
example_table:
|
|||||||||||||||
Do not begin a headsign with the words “To” or “Towards”. | |||||||||||||||
direction_id |
If trips on a route service opposite directions, distinguish these groups of trips with
the direction_id field, using values 0 and 1 . |
||||||||||||||
Use values 0 and 1 consistently throughout the dataset. i.e.
|
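The headsign recommendations above (no leading “To” or “Towards”, and no route names used as the headsign) can be sketched as a small validation helper. The function name is hypothetical, and treating an exact match against route_short_name or route_long_name as a violation is an assumption about how strictly to read the recommendation.

```python
def headsign_issues(headsign: str, route_short_name: str, route_long_name: str) -> list[str]:
    """Return problems found in a trip_headsign or stop_headsign value."""
    issues = []
    if not headsign:
        return issues
    first_word = headsign.split(" ", 1)[0].lower()
    if first_word in {"to", "towards"}:
        issues.append("headsign begins with 'To' or 'Towards'")
    if headsign in {route_short_name, route_long_name}:
        issues.append("headsign repeats a route name")
    return issues


print(headsign_issues("To Downtown", "10", "Harbour Line"))
print(headsign_issues("Toronto", "10", "Harbour Line"))
```

Comparing the first whole word, rather than a prefix, avoids flagging destinations such as “Toronto”.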
stop_times.txt
Loop routes: Loop routes require special stop_times considerations. (See: Loop route case)
Field Name | Recommendation | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
pickup_type & drop_off_type |
Non-revenue (deadhead) trips that do not provide passenger service should be marked with
pickup_type and drop_off_type value of 1 for all
stop_times rows. |
||||||||||||
On revenue trips, internal “timing points” for monitoring operational performance and
other places such as garages that a passenger cannot board should be marked with
pickup_type = 1 (no pickup available) and drop_off_type = 1 (no
drop off available) |
|||||||||||||
timepoint |
The timepoint field should be provided. It specifies which
stop_times the operator will attempt to strictly adhere to
(timepoint=1 ), and that other stop times are estimates
(timepoint=0 ). |
||||||||||||
arrival_time & departure_time |
arrival_time and departure_time fields should specify time
values whenever possible, including non-binding estimated or interpolated times between
timepoints. |
||||||||||||
stop_headsign |
In the cases below, “Southbound” would mislead customers because it is not used in the station signs. |
||||||||||||
|
|||||||||||||
|
|||||||||||||
shape_dist_traveled | shape_dist_traveled must be provided for routes that have looping or inlining
(the vehicle crosses or travels over the same portion of alignment in one trip). See the
shapes.shape_dist_traveled recommendation. |
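Where arrival_time is only known at timepoints, estimated times for the stops in between can be interpolated. The sketch below shows one possible approach, linear interpolation by shape_dist_traveled; this method is an assumption, not something these practices prescribe, and the function names are hypothetical. It expects the first and last rows to carry a time and distances to increase between timepoints.

```python
def to_seconds(hms: str) -> int:
    """Convert a GTFS HH:MM:SS string to seconds since the start of the service day."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s


def to_hms(total: int) -> str:
    """Convert seconds back to a GTFS HH:MM:SS string."""
    h, rem = divmod(total, 3600)
    m, s = divmod(rem, 60)
    return f"{h:02d}:{m:02d}:{s:02d}"


def interpolate_times(rows: list[dict]) -> list[str]:
    """Fill empty arrival_time values between rows that already have a time.

    Each row needs 'shape_dist_traveled' (number) and 'arrival_time' (str, may be "").
    """
    known = [i for i, r in enumerate(rows) if r["arrival_time"]]
    if not known or known[0] != 0 or known[-1] != len(rows) - 1:
        raise ValueError("first and last stop must have an arrival_time")
    out = [r["arrival_time"] for r in rows]
    for a, b in zip(known, known[1:]):
        d0, d1 = rows[a]["shape_dist_traveled"], rows[b]["shape_dist_traveled"]
        t0, t1 = to_seconds(out[a]), to_seconds(out[b])
        for i in range(a + 1, b):
            # Share of the distance covered between the two surrounding timepoints.
            frac = (rows[i]["shape_dist_traveled"] - d0) / (d1 - d0)
            out[i] = to_hms(round(t0 + frac * (t1 - t0)))
    return out


rows = [
    {"shape_dist_traveled": 0.0, "arrival_time": "06:00:00"},
    {"shape_dist_traveled": 1.0, "arrival_time": ""},
    {"shape_dist_traveled": 4.0, "arrival_time": "06:08:00"},
]
print(interpolate_times(rows))
```

Rows filled this way would carry timepoint=0, since they are estimates rather than times the operator strictly adheres to.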
calendar.txt
Field Name | Recommendation |
---|---|
All Fields | calendar_dates.txt should only contain a limited number of exceptions to the
schedule. Regularly-scheduled service should be configured using calendar.txt .
|
Including a calendar.service_name field can also increase the human
readability of GTFS, although this is not adopted in the spec. |
calendar_dates.txt
Field Name | Recommendation |
---|---|
All Fields | calendar_dates.txt should only contain a limited number of exceptions to the
schedule. Regularly-scheduled service should be configured using calendar.txt .
|
Including a calendar.service_name field can also increase the human
readability of GTFS, although this is not adopted in the spec. |
fare_attributes.txt
Field Name | Recommendation |
---|---|
All Fields | agency_id should be included in fare_attributes.txt if the field is included in agency.txt. |
If a fare system cannot be accurately modeled, avoid further confusion and leave it blank. | |
Include fares (fare_attributes.txt and fare_rules.txt ) and model
them as accurately as possible. In edge cases where fares cannot be accurately modeled, the
fare should be represented as more expensive rather than less expensive so customers will
not attempt to board with insufficient fare. If the vast majority of fares cannot be modeled
correctly, do not include fare information in the feed. |
fare_rules.txt
Field Name | Recommendation |
---|---|
All Fields | agency_id should be included in fare_attributes.txt if the field is included in agency.txt. |
If a fare system cannot be accurately modeled, avoid further confusion and leave it blank. | |
Include fares (fare_attributes.txt and fare_rules.txt ) and model
them as accurately as possible. In edge cases where fares cannot be accurately modeled, the
fare should be represented as more expensive rather than less expensive so customers will
not attempt to board with insufficient fare. If the vast majority of fares cannot be modeled
correctly, do not include fare information in the feed. |
shapes.txt
Field Name | Recommendation |
---|---|
All Fields | Ideally, for alignments that are shared (i.e. in a case where Routes 1 and 2 operate on the same segment of roadway or track) then the shared portion of alignment should match exactly. This helps to facilitate high-quality transit cartography. |
Alignments should follow the centerline of the right of way on which the vehicle travels.
This could be either the centerline of the street if there are no designated lanes, or the
centerline of the side of the roadway that travels in the direction the vehicle moves. Alignments should not “jag” to a curb stop, platform, or boarding location. |
|
shape_dist_traveled |
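Must be provided in both shapes.txt and stop_times.txt if an alignment includes looping or inlining (the vehicle crosses or travels over the same portion of alignment in one trip). If a vehicle retraces or crosses the route alignment at points in the course of a trip, shape_dist_traveled is needed to clarify how points in shapes.txt correspond with records in stop_times.txt.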
|
The shape_dist_traveled field allows the agency to specify exactly how the
stops in the stop_times.txt file fit into their respective shape. A common
value to use for the shape_dist_traveled field is the distance from the
beginning of the shape as traveled by the vehicle (think something like an odometer
reading).
|
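The odometer analogy can be made concrete by accumulating the distance between consecutive shape points. This is a minimal sketch, assuming meters as the unit and a haversine approximation of distance; the function names are hypothetical, and any consistent unit could be used as long as shapes.txt and stop_times.txt agree.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (approximation)


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def cumulative_distances(points: list[tuple[float, float]]) -> list[float]:
    """Return a shape_dist_traveled value for each (lat, lon) in shape_pt_sequence order."""
    if not points:
        return []
    dists = [0.0]
    for (lat1, lon1), (lat2, lon2) in zip(points, points[1:]):
        dists.append(dists[-1] + haversine_m(lat1, lon1, lat2, lon2))
    return dists


# An out-and-back alignment: the vehicle returns to its starting point,
# yet the distance keeps increasing, like an odometer.
print(cumulative_distances([(0.0, 0.0), (0.0, 1.0), (0.0, 0.0)]))
```

Because the value never decreases, two visits to the same physical location receive different shape_dist_traveled values, which is what lets a consumer tell them apart on looping or retracing alignments.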
feed_info.txt
feed_info.txt should be included, with all fields below.
Field Name | Recommendation |
---|---|
feed_start_date & feed_end_date | Should be included |
feed_version | Should be included |
feed_contact_email & feed_contact_url | Provide at least one |
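The feed_info.txt recommendations above can be expressed as a short completeness check. This is a minimal sketch; the function name is hypothetical, and the row is assumed to be a dictionary such as one produced by csv.DictReader.

```python
def missing_feed_info(row: dict) -> list[str]:
    """Return the recommended feed_info.txt fields that are absent or empty."""
    missing = [
        field
        for field in ("feed_start_date", "feed_end_date", "feed_version")
        if not row.get(field)
    ]
    # At least one of the two contact fields should be provided.
    if not (row.get("feed_contact_email") or row.get("feed_contact_url")):
        missing.append("feed_contact_email or feed_contact_url")
    return missing


# Hypothetical feed_info.txt record for illustration only.
print(missing_feed_info({"feed_start_date": "20240101", "feed_version": "2024-01"}))
```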
frequencies.txt
Field Name | Recommendation |
---|---|
All Fields | Actual stop times are ignored for trips referenced by frequencies.txt ; only
travel time intervals between stops are significant for frequency-based trips.
For clarity/human readability, it is recommended that the first stop time of a trip
referenced in frequencies.txt should begin at 00:00:00 (first
arrival_time value of 00:00:00). |
block_id |
Can be provided for frequency-based trips. |
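Since only the intervals between stops matter for frequency-based trips, an existing trip can be rebased so that its first stop time is 00:00:00. The following is a minimal sketch with hypothetical function names; the same offset would need to be applied to both arrival_time and departure_time.

```python
def to_seconds(hms: str) -> int:
    """Convert a GTFS HH:MM:SS string to seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s


def to_hms(total: int) -> str:
    """Convert seconds to a GTFS HH:MM:SS string."""
    h, rem = divmod(total, 3600)
    m, s = divmod(rem, 60)
    return f"{h:02d}:{m:02d}:{s:02d}"


def rebase_to_midnight(times: list[str]) -> list[str]:
    """Shift a trip's stop times so the first one is 00:00:00, keeping the intervals."""
    if not times:
        return []
    offset = to_seconds(times[0])
    return [to_hms(to_seconds(t) - offset) for t in times]


print(rebase_to_midnight(["06:10:00", "06:15:00", "06:22:30"]))
```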
transfers.txt
transfers.transfer_type can be one of four values defined in the GTFS. These transfer_type definitions are quoted from the GTFS Specification below, with additional practice recommendations.
Field Name | Recommendation |
---|---|
transfer_type |
If there are multiple transfer opportunities that include a superior option (i.e. a transit center with additional amenities or a station with adjacent or connected boarding facilities/platforms), specify a recommended transfer point. |
Each data consumer calculates the amount of time they need for this interval through their
own algorithm. If this value is insufficient, or if there are other conditions the consumer
didn’t consider, you can override the time calculations after you set the
|
|
Specify minimum transfer time if there are obstructions or other factors which increase the time to travel between stops. |
|
Specify this value if transfers are not possible because of physical barriers, or if they are made unsafe or complicated by difficult road crossings or gaps in the pedestrian network. |
|
If in-seat (block) transfers are allowed between trips, then the last stop of the arriving trip must be the same as the first stop of the departing trip. |
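The four transfer_type values referred to in this section can be written out explicitly. The sketch below encodes them as an enumeration (0 recommended transfer point, 1 timed transfer, 2 minimum transfer time required, 3 transfer not possible) and adds one consistency check; the class, member, and function names are hypothetical, and later revisions of the specification may define further values.

```python
from enum import IntEnum
from typing import Optional


class TransferType(IntEnum):
    RECOMMENDED = 0        # recommended transfer point between routes
    TIMED = 1              # timed transfer; departing vehicle waits for the arriving one
    MIN_TIME_REQUIRED = 2  # transfer needs a minimum time, given in min_transfer_time
    NOT_POSSIBLE = 3       # transfers are not possible at this location


def transfer_issues(transfer_type: int, min_transfer_time: Optional[int]) -> list[str]:
    """Return problems found in one transfers.txt record."""
    if transfer_type not in {t.value for t in TransferType}:
        return ["transfer_type is not one of the four values covered here"]
    issues = []
    if transfer_type == TransferType.MIN_TIME_REQUIRED and min_transfer_time is None:
        issues.append("transfer_type 2 without min_transfer_time")
    return issues


print(transfer_issues(2, None))
print(transfer_issues(3, None))
```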
Practice Recommendations Organized by Case
This section covers particular cases with implications across files and fields.
Loop Routes
On loop routes, vehicles’ trips begin and end at the same location (sometimes a transit or transfer center). Vehicles usually operate continuously and allow passengers to stay onboard as the vehicle continues its loop.
Headsign recommendations should therefore be applied in order to show riders the direction in which the vehicle is going.
To indicate the changing direction of travel, provide stop_headsigns in the stop_times.txt file. The stop_headsign describes the direction for trips departing from the stop for which it's defined. Adding stop_headsigns to each stop of a trip allows you to change the headsign information along a trip.
Don’t define one single circular trip in the stop_times.txt file for a route that operates between two endpoints (such as when the same bus goes back and forth). Instead, split the trip into two separate trip directions.
Examples of circular trip modeling:
Circular trip with changing headsign for each stop:
trip_id | arrival_time | departure_time | stop_id | stop_sequence | stop_headsign |
---|---|---|---|---|---|
trip_1 | 06:10:00 | 06:10:00 | stop_A | 1 | "B" |
trip_1 | 06:15:00 | 06:15:00 | stop_B | 2 | "C" |
trip_1 | 06:20:00 | 06:20:00 | stop_C | 3 | "D" |
trip_1 | 06:25:00 | 06:25:00 | stop_D | 4 | "E" |
trip_1 | 06:30:00 | 06:30:00 | stop_E | 5 | "A" |
trip_1 | 06:35:00 | 06:35:00 | stop_A | 6 | "" |
Circular trip with two headsigns:
trip_id | arrival_time | departure_time | stop_id | stop_sequence | stop_headsign |
---|---|---|---|---|---|
trip_1 | 06:10:00 | 06:10:00 | stop_A | 1 | "outbound" |
trip_1 | 06:15:00 | 06:15:00 | stop_B | 2 | "outbound" |
trip_1 | 06:20:00 | 06:20:00 | stop_C | 3 | "outbound" |
trip_1 | 06:25:00 | 06:25:00 | stop_D | 4 | "inbound" |
trip_1 | 06:30:00 | 06:30:00 | stop_E | 5 | "inbound" |
trip_1 | 06:35:00 | 06:35:00 | stop_F | 6 | "inbound" |
trip_1 | 06:40:00 | 06:40:00 | stop_A | 7 | "" |
Field Name | Recommendation | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
trips.trip_id |
Model the complete round-trip for the loop with a single trip. | |||||||||||||||
stop_times.stop_id |
Include the first/last stop twice in stop_times.txt for the trip that is a
loop. Example below. Often, a loop route may include first and last trips that do not travel
the entire loop. Include these trips as well.
|
|||||||||||||||
trips.direction_id |
If a loop operates in opposite directions (i.e. clockwise and counterclockwise), then designate direction_id as 0 or 1. |
|||||||||||||||
trips.block_id |
Indicate continuous loop trips with the same block_id . |
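The loop modeling described in this section (a single trip that lists the first/last stop twice, with a stop_headsign that changes along the way) can be sketched in a few lines. The function names are hypothetical, and using the name of the next stop as the headsign mirrors only the first example table above; an agency's actual vehicle signage should take precedence.

```python
def is_full_loop(stop_ids: list[str]) -> bool:
    """True if the trip starts and ends at the same stop, i.e. that stop is listed twice."""
    return len(stop_ids) >= 3 and stop_ids[0] == stop_ids[-1]


def next_stop_headsigns(stop_names: list[str]) -> list[str]:
    """Give each stop the name of the following stop as its stop_headsign.

    The final stop gets an empty headsign because no travel follows it.
    """
    return stop_names[1:] + [""] if stop_names else []


stops = ["A", "B", "C", "D", "E", "A"]
print(is_full_loop(stops))
print(next_stop_headsigns(stops))
```

Applied to the stops A through E and back to A, this reproduces the stop_headsign column of the first example table.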
Lasso Routes
Lasso routes combine aspects of a loop route and directional route.
- straight section from A to B;
- loop from and to B;
- straight section from B to A.
Examples: |
---|
Subway Routes (Chicago) |
Bus Suburb to Downtown Routes (St. Albert or Edmonton) |
CTA Brown Line (CTA Website and TransitFeeds) |
Field Name | Recommendation | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
trips.trip_id |
The full extent of a “vehicle round-trip” (see illustration above) consists of travel from A to B to B and back to A. An entire vehicle round-trip may be expressed by:
- A single trip_id value/record in trips.txt
- Multiple trip_id values/records in trips.txt, with continuous travel indicated by block_id. |
||||||||||
stop_times.stop_headsign |
The stops along the A-B section will be passed through in both directions.
stop_headsign facilitates distinguishing travel direction. Therefore, providing
stop_headsign is recommended for these trips.
|
||||||||||
trips.trip_headsign |
The trip headsign should be a global description of the trip, as displayed in the schedules. It could be “Linden to Linden via Loop” (Chicago example), or “A to A via B” (generic example). |
Branches
Some routes may include branches. Alignment and stops are shared amongst these branches, but each also serves distinct stops and alignment sections. The relationship among branches may be indicated by route name(s), headsigns, and trip short name using the further guidelines below.
Field Name | Recommendation |
---|---|
All Fields | In naming branch routes, it is recommended to follow other passenger information materials. Below are descriptions and examples of two cases: |
If timetables and on-street signage represent two distinctly named routes (e.g. 1A and 1B), then present this as such in the GTFS, using the route_short_name and/or route_long_name fields. For example: GoDurham Transit routes 2, 2A, and 2B share a common alignment throughout the majority of the route, but they vary in several different aspects.
|
|
If agency-provided information describes branches as the same named route, then utilize
the trips.trip_headsign , stop_times.stop_headsign , and/or
trips.trip_short_name fields. For example: GoTriangle route 300 travels to
different locations depending on the time of day. During peak commuter hours extra legs are
added onto the standard route to accommodate workers entering and leaving the
city. |
About This Document
Objectives
The objectives of maintaining GTFS Best Practices are to:
- Support greater interoperability of transit data
- Improve end-user customer experience in public transportation apps
- Make it easier for software developers to deploy and scale applications, products, and services
- Facilitate the use of GTFS in various application categories (beyond its original focus on trip planning)
How to propose or amend published GTFS Best Practices
GTFS applications and practice evolve, and so this document may need to be amended from time to time. To propose an amendment to this document, open a pull request in the GTFS Best Practices GitHub repository and advocate for the change. You can also email any comments to specifications@mobilitydata.org.
Linking to This Document
Please link here in order to provide feed producers with guidance for correct formation of GTFS data. Each individual recommendation has an anchor link. Click the recommendation to get the URL for the in-page anchor link.
If a GTFS-consuming application makes requirements or recommendations for GTFS data practices that are not described here, it is recommended to publish a document with those requirements or recommendations to supplement these common best practices.
GTFS Best Practices Working Group
The GTFS Best Practices Working Group was convened by Rocky Mountain Institute in 2016-17, consisting of public transportation providers, developers of GTFS-consuming applications, consultants, and academic organizations to define common practices and expectations for GTFS data. Members of this working group included:
- Cambridge Systematics
- Capital Metro
- Center for Urban Transportation Research at University of South Florida
- Conveyal
- IBI Group
- Mapzen
- Microsoft
- Moovel
- Oregon Department of Transportation
- Swiftly
- Transit
- Trillium
- TriMet
- World Bank
Today, this document is maintained by MobilityData International Organization.