Applying Archived Operations Data in Transportation Planning: A Primer
6. Getting Started
Any transportation organization can and should begin to make use of its data. Even with extremely limited data, staff can begin to make better planning and programming decisions. This chapter provides concrete steps and information resources that transportation planners and their planning partners can use to begin making better use of existing archived operations data.
Transportation planners and their partners should review the following checklist when they decide to begin using archived operations data:
Table 12. Checklist for getting started.
Step # | Action | Explanation
1 | Define your needs/questions. | What is it that you're trying to do? What questions do you need to answer?
2 | Inventory your data. | This can be the hardest step. Ask each department in your agency if they have any of the data sources listed in Table 1. Create a list of all of the data that you believe you can get, from whom the data will come, and any other relevant attributes (metadata) describing what is available within each dataset.
3 | Did you find the data you needed? | If yes, skip to step 6; if not, continue through this list.
4 | If not, look elsewhere. | If your own agency does not seem to have data (or there are political reasons why it cannot be accessed), consider reaching out to partner agencies (local departments of transportation (DOTs), metropolitan planning organizations (MPOs), and university partners). There are also a number of free or third-party data sets available, as shown in Tables 15, 16, and 17. Consider exploring your regional intelligent transportation system (ITS) architecture to identify opportunities for obtaining data in the region from other systems or agencies. This is one of the benefits of a maintained ITS architecture.
5 | Purchase data. | If free data sets are not available, you may need to procure data from a third party. Consult Chapter 4, section "Obtaining Third-party Data and Tools," for important considerations.
6 | Is the data usable? | The fact that data exists does not mean that it is usable. Is it a scan of a paper document, or is it truly electronic, machine-readable data? Make sure it is of the right spatial and temporal resolution and in a format that you, your team, and your systems can use.
7 | Do you trust the data? | While no dataset is perfect, it is important to understand where the data came from, how it may have been massaged or filtered, and whether there are any data stewards who can attest to the overall quality of the dataset.
8 | Determine your analysis process, and get to work! | Will you analyze the data with in-house personnel? Will you purchase data analysis tools to help with the analysis? Will you use consultant support? Later sections of this chapter give guidance on how to work with third-party data analysts and discuss the types of skills you might need internally for successful data analysis.
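Steps 6 and 7 of the checklist can be partly automated. The following is a minimal Python sketch, not a prescribed method, that checks how complete a timestamped data set is at a chosen temporal resolution. The function name, the 5-minute interval, and the sample timestamps are illustrative assumptions rather than anything specified in this primer.

```python
from datetime import datetime, timedelta


def coverage_report(timestamps, start, end, step_minutes=5):
    """Return (fraction of intervals with data, list of empty interval starts)."""
    step = timedelta(minutes=step_minutes)

    # Build every interval start expected from start (inclusive) to end (exclusive).
    expected = []
    t = start
    while t < end:
        expected.append(t)
        t += step

    # Floor each in-range observation to the start of its interval.
    observed = set()
    for ts in timestamps:
        if start <= ts < end:
            offset = (ts - start) // step  # whole intervals elapsed since start
            observed.add(start + offset * step)

    missing = [t for t in expected if t not in observed]
    fraction = 1 - len(missing) / len(expected) if expected else 0.0
    return fraction, missing


# Hypothetical speed-record timestamps for one road segment, 7:00 to 8:00 a.m.
start = datetime(2024, 1, 1, 7, 0)
end = datetime(2024, 1, 1, 8, 0)
records = [
    datetime(2024, 1, 1, 7, 0),
    datetime(2024, 1, 1, 7, 5),
    datetime(2024, 1, 1, 7, 12),
    datetime(2024, 1, 1, 7, 30),
    datetime(2024, 1, 1, 7, 55),
]

fraction, missing = coverage_report(records, start, end)
print(f"{fraction:.0%} of 5-minute intervals have data; {len(missing)} are empty")
```

A low coverage fraction, or empty intervals clustered in peak periods, suggests the data may need gap handling before it is used for analysis.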
My Agency Does Not Have Any Data
If you followed the checklist above and have realized that your agency is not producing any of its own data (or there are political or technical impediments that make it too difficult to access that data), there are still other resources at your disposal. Some of these are free, while others are not. Some data are considered "raw," while other data are accompanied by readily available tools for analysis.
Examples of datasets include:
- National Performance Management Research Data Set (NPMRDS).
- National Weather Service Data.
- Google/Waze Speed and Incident Data.
- Third Party Probe-based Speed Data.
- Third Party Origin-Destination Data.
- Partner Agency Data.
- Fatality Analysis Reporting System (FARS) Data.
These data are described in greater detail in the following tables.
Table 13. The National Performance Management Research Data Set (NPMRDS).
Overview | The NPMRDS was purchased by the Federal Highway Administration (FHWA) originally for internal purposes, but it has been made available to State and local transportation agencies at no cost. It will likely be used by agencies to develop Fixing America's Surface Transportation (FAST) Act congestion-related performance measures. However, there are other potential uses, including problem identification and after-action review. The first iteration of NPMRDS is an unfiltered, unsmoothed data set that has gaps in temporal coverage and only covers the National Highway System (NHS). It is only available as a historic data set, made available at the end of each month. Therefore, it has few real-time applications.
Table 14. The National Weather Service Data.
Overview | Historical weather data can be used to identify causes of congestion as well as areas for improvement in road weather management activities.
What is it | A host of weather products including radar, images, cloud cover, road weather information systems (RWIS), forecasts, alerts, and much more.
What it Costs | Free
Access | https://www.nws.noaa.gov/tg
Tool Availability | Yes. The National Weather Service has its own host of tools available for looking at real-time and historic data. Many third-party companies also provide tools to analyze weather data. Some free tools exist (e.g., https://weatherspark.com/). Many other historic weather data tools are also available.
Table 15. Google/Waze speed deviation and incident data.
Overview | In addition to real-time operations uses, the archived data can be used for performance measures generation, safety data analysis, various congestion studies, and problem identification. It is especially useful on roads where coverage by the agency may be limited (such as arterials).
What is it | Crowd-sourced incident and event data plus speed variation data on segments of the roadway.
What it Costs | Free (with the Connected Citizens Partnership Program).
Access | Google/Waze has a relatively new public sector data sharing partnership. At the time of publication, Google/Waze was brokering deals to share its crowd-sourced data with public sector agencies that were willing to provide their own operations data back to Google and/or agreed to provide attribution to Waze or promote Waze on agency 511 websites or other materials.
Tool Availability | University and private sector data fusion specialists are working on tools (similar to RITIS) that will allow agencies to look at trends in congestion and incidents from the Google data sets.
Table 16. Third party probe-based speed data.
Overview | Probe-based speed data can be used to: develop system performance reports; identify problems; prioritize projects; perform after-action incident review; conduct before and after studies; make informed, real-time operations decisions; analyze travel time and reliability; monitor work zones; develop and publish press releases for public and media consumption; and measure the economic and environmental impacts of passenger and commercial vehicle user delay.
What is it | Speed and travel time data on Interstates and major and minor arterials from HERE, INRIX, TomTom, and others.
What it Costs | Varies depending on the provider. Usually a per-mile cost; however, some companies charge only for major roads and include smaller arterials for free.
Access | Directly from data providers, through application program interfaces (APIs), or through third-party data analytics like Iteris Performance Monitoring System (iPeMS) or Regional Integrated Transportation Information System (RITIS) Probe Data Analytics Suite.
Tool Availability | Yes. Some of the probe data providers offer their own analytics tools. RITIS and iPeMS also offer analytics suites for probe data.
Table 17. Third party origin-destination data (O-D).
Overview | This data usually comes from third-party probe data providers and includes everything from basic turning movements at intersections all the way to direct origins and destinations of vehicles moving throughout a network. O-D data can be used to track travel trends and calibrate models to support informed investment decisions.
What is it | Information related to the movement of people through the trips they take, including their origin and destination for each trip.
What it Costs | Cost depends heavily on geography and date range.
Access | From third-party data providers like INRIX, Airsage, and others.
Tool Availability | Yes. The companies that sell these data usually offer analytics tools. Some universities are also building tools along with the RITIS platform.
Table 18. Partner agency data.
Overview | Partner agency data is any and all data that you can get from outside of your own agency. For example, if you are a metropolitan planning organization (MPO) and do not have any of your own data, you may be able to request data directly from State or local departments of transportation (DOTs), law enforcement, or other agencies in your region. Partner agency data could include any of the data mentioned previously in this document (speeds, volumes, incidents, weather, etc.) and can be used for any of the purposes described in the case studies, plus any of the other relevant uses listed for the data sets above.
What is it | Anything. Incident data, speeds and volumes, computer-aided dispatch, signal phasing and timing, transit schedules and on-time performance, etc.
What it Costs | Usually free, but it depends on the agency or organization you are asking.
Access | Work with other agencies in a region (MPOs, State and local DOTs, law enforcement, etc.) to request copies of databases. In rare instances, these agencies may have already invested in tools that they may be willing to share.
Tool Availability | Maybe. It depends on each agency and the investments that they have already made in data and analytics.
Table 19. Fatality Analysis Reporting System (FARS) data.
Overview | This dataset can be used to analyze fatality data from motor vehicle traffic crashes. It can be used to identify safety issues on roadways.
What is it | Database of all fatal collisions in the United States.
What it Costs | Free.
Access | https://www-fars.nhtsa.dot.gov/Main/index.aspx
Tool Availability | Yes. The FARS website has basic tools available for querying. Universities and private-sector companies have developed additional web-based tools for analyzing FARS data. Some of these tools also incorporate non-FARS data.
Partnering with other organizations is also a great way to get access to additional data sets and tools that enable personnel to analyze the data. If an agency is able to leverage investments that others have already made in data collection and analysis, it is advisable to do so.
My Agency Has Limited Data or the Data is in Silos
Prior chapters of this report have provided examples of how various types of operations data are being used. Table 20 below gives an overview of what can be done with limited data sets (or data that exists in silos).
Table 20. Options for agencies with limited amounts and types of data.
Data | What You Can Do With It
Traffic volume, speed, class, and occupancy from point sensors | Travel time analysis, model calibration.
Event, work zone, and incident information from an ATMS | Safety analytics, support of before and after studies, identification of effective operations strategies.
Weather data | N/A when used by itself.
Traffic speeds from probe-based data sources | System performance reports; problem identification/project development; before and after studies of construction projects, signal retiming, etc.; travel time analysis including reliability measurements.
Transit data | Schedule adherence and fare collection data can be used to identify transit accessibility in communities and to plan new stops, schedules, etc.
Signal data | Intersection and ramp meter signal data (timing plans, maintenance, etc.) can be used for arterial systems performance measures, signal retiming studies, and more.
Police crash reports | Safety analytics and model calibration.
Freight movements | Model calibration.
Origin-Destination data | Useful for many planning activities that require an understanding of travel patterns for freight and passenger vehicles. Also useful for real-time operations when trying to measure the impact of various traveler information strategies and the impact of incidents on arterials and other secondary roads.
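As a rough illustration of the travel time and reliability analysis listed for probe-based speed data, the sketch below converts segment speeds to travel times and computes two commonly used measures: the travel time index (mean travel time divided by free-flow travel time) and the planning time index (95th percentile travel time divided by free-flow travel time). The segment length, free-flow speed, sample speeds, and the nearest-rank percentile method are assumptions chosen for illustration.

```python
import math


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of values at or below it."""
    ordered = sorted(values)
    rank = max(math.ceil(pct / 100 * len(ordered)), 1)
    return ordered[rank - 1]


def reliability_measures(speeds_mph, length_miles, free_flow_mph):
    """Compute the travel time index and planning time index for one segment."""
    # Travel time in minutes for each speed observation (speeds must be > 0).
    travel_times = [length_miles / s * 60 for s in speeds_mph]
    free_flow_time = length_miles / free_flow_mph * 60
    mean_time = sum(travel_times) / len(travel_times)
    p95_time = percentile_nearest_rank(travel_times, 95)
    return {
        "travel_time_index": mean_time / free_flow_time,
        "planning_time_index": p95_time / free_flow_time,
    }


# Hypothetical peak-period speeds (mph) on a 2-mile segment with a 60 mph free-flow speed.
speeds = [60, 60, 40, 30, 60, 20, 60, 60, 40, 60]
result = reliability_measures(speeds, length_miles=2.0, free_flow_mph=60.0)
print(result)
```

A planning time index well above the travel time index indicates that a small number of very slow periods dominate the traveler's experience, which is the kind of finding a reliability analysis is meant to surface.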
Advanced Transportation Management System (ATMS) Data Example
Most State DOTs collect some form of incident or event data through their ATMS or traveler information platforms (511). These data usually include (at a minimum) the location of an event, the type of the event, and the lane status (e.g., the number of lanes closed). In other agencies, these data might include information about which responders are on the scene, how long they took to get to the scene, which variable message signs were used to provide information, and much more.
At a minimum, these data can and should be leveraged to improve operations and, more importantly, to justify budgets and help effect change within the organization. For example, a quick geospatial and temporal query can help an agency to understand when and where certain types of incidents are most likely to occur. The agency can then use that information to make better decisions about how many safety service patrols are needed for each shift, where they should be positioned, and what types of equipment might be needed on the vehicles to better tend to the types of events they respond to on a daily basis.
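A query of this kind can be sketched in a few lines of Python. The example assumes incident records have been exported from the ATMS as simple records with a location label, an event type, and a start time; the field names and sample values are hypothetical and will differ by system.

```python
from collections import Counter
from datetime import datetime

# Hypothetical ATMS export: one record per incident.
incidents = [
    {"segment": "Segment A", "type": "crash", "start": "2024-03-04T07:42"},
    {"segment": "Segment A", "type": "disabled vehicle", "start": "2024-03-04T07:10"},
    {"segment": "Segment A", "type": "crash", "start": "2024-03-05T07:55"},
    {"segment": "Segment B", "type": "debris", "start": "2024-03-04T16:20"},
    {"segment": "Segment B", "type": "crash", "start": "2024-03-05T17:05"},
    {"segment": "Segment A", "type": "crash", "start": "2024-03-06T17:30"},
]


def count_by_segment_and_hour(records, incident_type=None):
    """Count incidents per (segment, hour of day), optionally for one incident type."""
    counts = Counter()
    for rec in records:
        if incident_type is not None and rec["type"] != incident_type:
            continue
        hour = datetime.fromisoformat(rec["start"]).hour
        counts[(rec["segment"], hour)] += 1
    return counts


all_counts = count_by_segment_and_hour(incidents)
crash_counts = count_by_segment_and_hour(incidents, incident_type="crash")

# The most frequent location/hour combinations are candidates for patrol coverage.
hot_spots = all_counts.most_common(3)
for (segment, hour), n in hot_spots:
    print(f"{segment}, {hour:02d}:00 hour: {n} incidents")
```

Grouping by hour of day is only one choice; the same pattern works for day of week, shift, or incident type.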
Going a few steps further, planners and operations agencies can use these data to:
- Identify which responders are arriving on the scene more quickly.
- Identify problem locations that might need countermeasures to reduce the occurrence of incidents and events.
- Justify the need for additional responders or operators during certain shifts.
- Justify budgets.
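As one illustration of the first item above, the sketch below computes the median time between notification and arrival for each responder type. It assumes the ATMS export includes notification and arrival timestamps per responder; the field names and sample records are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median

# Hypothetical responder log: one record per responder per incident.
responses = [
    {"responder": "safety patrol", "notified": "2024-03-04T07:40", "arrived": "2024-03-04T07:48"},
    {"responder": "safety patrol", "notified": "2024-03-04T16:05", "arrived": "2024-03-04T16:17"},
    {"responder": "safety patrol", "notified": "2024-03-05T08:00", "arrived": "2024-03-05T08:10"},
    {"responder": "tow", "notified": "2024-03-04T07:40", "arrived": "2024-03-04T08:05"},
    {"responder": "tow", "notified": "2024-03-04T16:05", "arrived": "2024-03-04T16:40"},
]


def median_response_minutes(records):
    """Return the median notification-to-arrival time, in minutes, per responder type."""
    minutes_by_responder = defaultdict(list)
    for rec in records:
        notified = datetime.fromisoformat(rec["notified"])
        arrived = datetime.fromisoformat(rec["arrived"])
        minutes = (arrived - notified).total_seconds() / 60
        minutes_by_responder[rec["responder"]].append(minutes)
    return {name: median(times) for name, times in minutes_by_responder.items()}


response_times = median_response_minutes(responses)
for name, minutes in sorted(response_times.items(), key=lambda item: item[1]):
    print(f"{name}: median {minutes:.0f} minutes to arrive")
```

The median is used here rather than the mean so that a single unusually long response does not dominate the comparison.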
My Agency Has Lots of Data
For agencies that are fortunate enough to have many disparate data sources at their disposal, blending and fusion concepts can and should be explored to attempt to get more out of the archived operations data sets. Creating links between data sets so that they can function together as a much larger database can make possible the use of more sophisticated analytics—searching for cause and effect, correlation of weather to safety issues, and others.
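A minimal sketch of such a link, assuming hourly weather observations and incident records that both carry ISO-format timestamps: each incident is tagged with the weather condition observed in the same hour. The field names and sample values are hypothetical, and a real link would usually also match on location (for example, the nearest weather station).

```python
from collections import Counter
from datetime import datetime


def hour_key(timestamp):
    """Truncate an ISO-format timestamp string to the start of its hour."""
    return datetime.fromisoformat(timestamp).replace(minute=0, second=0, microsecond=0)


# Hypothetical hourly weather observations and incident records.
weather = [
    {"observed": "2024-03-04T07:00", "condition": "rain"},
    {"observed": "2024-03-04T08:00", "condition": "rain"},
    {"observed": "2024-03-04T16:00", "condition": "clear"},
]
incidents = [
    {"id": 1, "type": "crash", "start": "2024-03-04T07:42"},
    {"id": 2, "type": "crash", "start": "2024-03-04T08:15"},
    {"id": 3, "type": "debris", "start": "2024-03-04T16:20"},
    {"id": 4, "type": "crash", "start": "2024-03-04T21:05"},
]


def link_incidents_to_weather(incident_records, weather_records):
    """Return copies of the incident records with a 'weather' field added."""
    weather_by_hour = {hour_key(w["observed"]): w["condition"] for w in weather_records}
    linked_records = []
    for inc in incident_records:
        condition = weather_by_hour.get(hour_key(inc["start"]), "unknown")
        linked_records.append({**inc, "weather": condition})
    return linked_records


linked = link_incidents_to_weather(incidents, weather)
crashes_by_weather = Counter(r["weather"] for r in linked if r["type"] == "crash")
print(crashes_by_weather)
```

Counts like these only suggest where to look; establishing a relationship between weather and safety would also require exposure data such as traffic volumes.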
If a vast array of data is indeed available, big data techniques can allow an agency to enter the world of multi-variate data analytics, correlations, what-if scenarios, prediction, and more. Big data analytics are less about the size of the data sets than they are about a process for searching through multi-variate data sets for context, correlation, and answers to questions that the agency may not have previously thought to ask. The goal is to extract value from the data more quickly and easily than with traditional data analysis.
Numerous business and university groups specialize in big data concepts and data science. One example of applying big data analytics is the problem of determining the contribution of underlying factors to total congestion. Congestion and unreliable travel are caused by the interaction of several factors: physical capacity, demand, incidents, weather, work zones, traffic control devices, and operating policies. Assigning how much each of the factors contributes to measures like total delay is problematic, but big data analytics can help.
Other examples of what can be derived from the fusion of multiple data sets are shown in Table 21 below. Ultimately, as the size, quality, and diversity of data continue to grow, so will the potential for new and innovative uses of the data.
Table 21. Options for agencies with extensive amounts and types of data.
Data | What You Can Do With It
Volumes and probe-based speed data. | User delay cost, emissions, fuel consumption, model calibration.
Advanced transportation management systems (ATMS), crash reports, speeds and travel times, and weather. | Bottleneck analytics, causality, correlation of incident management practices with reductions in delay and incident severity, and the benefit/costs of various strategies.
Signal data (SPaT and sensor actuation), Bluetooth re-identification or origin-destination data, volumes. | True arterial performance measures including travel times between intersections, travel time reliability, capacity utilization at intersections, true integrated corridor management, safety applications, dynamic signal retiming, what-if scenario planning, and more.
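To illustrate the first row of Table 21, the sketch below estimates user delay cost by combining hourly volumes with probe-based speeds on one segment. It is a simplified calculation under stated assumptions: delay is measured against a reference speed, speeds above the reference count as zero delay, and the value of time is a placeholder figure, not an official value.

```python
def delay_cost(hours, length_miles, reference_mph, value_of_time):
    """Return (vehicle-hours of delay, delay cost) for one segment.

    hours is a list of (volume, observed_speed_mph) pairs, one per hour.
    value_of_time is an assumed cost per vehicle-hour of delay.
    """
    total_delay = 0.0
    for volume, speed_mph in hours:
        # Extra travel time per vehicle, in hours, compared with the reference speed.
        per_vehicle = length_miles / speed_mph - length_miles / reference_mph
        # Hours faster than the reference speed contribute no delay.
        total_delay += volume * max(per_vehicle, 0.0)
    return total_delay, total_delay * value_of_time


# Hypothetical inputs: hourly volume (vehicles) fused with probe speed (mph).
hourly = [(1800, 30.0), (1200, 60.0), (1500, 40.0), (900, 65.0)]
vehicle_hours, cost = delay_cost(
    hourly,
    length_miles=2.0,
    reference_mph=60.0,
    value_of_time=20.0,  # placeholder dollars per vehicle-hour
)
print(f"{vehicle_hours:.1f} vehicle-hours of delay, estimated cost ${cost:,.0f}")
```

In practice the reference speed, the treatment of trucks versus passenger vehicles, and the value of time would each come from agency policy or published guidance rather than the placeholders used here.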
Before starting your analysis, consider the capabilities and availability of your in-house resources. Is there a need to hire new staff? What is the expected lead time for them to accomplish your goal? If it is lengthy, consider the costs associated with delays and consider purchasing existing tools or services. Leverage the investments that others have already made. Whether you are looking to hire in-house staff or are looking to hire university/consultant support, the following skills and job descriptions should be considered:
- Database administrator.
- Data scientist/data analyst.
- Information visualization.
Ultimately, if resources are lacking and there is neither the time nor ability to hire new staff, consider outsourcing to meet these needs. Alternatively, you can talk to other agencies to determine what tools they are using—they might be willing to share or offer guidance. Try to leverage their investments, if possible.
Guidance on Procuring Data & Analytics Services
Many agencies will struggle to analyze complex, disparate data from operations groups. When this is the case, external support from universities, consultants, and/or the purchase of tools may be necessary.
Whenever an agency procures either a system or a set of services, a certain amount of rigor is needed to ensure the agency's needs are communicated properly to bidders. When developing requirements for consultant support and/or tools, two approaches are typically used:
- Option 1: A detailed set of system requirements is issued at the beginning of the contract to identify the functions and top-level design of the system. This approach requires a major effort on the part of the agency, and can become expensive and frustrating if new requirements are identified midway through the initiative or the requirements were not thought through entirely.
- Option 2: The second approach is similar to the "design-build" methodology in construction. In this option, the requirements are developed as part of the contract before development begins. This method can allow for greater dialogue between the bidders and the agency, which ultimately can lead to a better end product.
Both of the options have their advantages; however, the design-build approach typically results in an end-state that is more favorable to the agency. Evaluate external service providers (consultants in the private sector or universities). Check their references and project portfolios.
If the decision is made to hire a consultant or a university, distinguish between a one-time study using available data and the development of a system for ongoing use by the agency. For one-time analysis projects, make sure to write up a detailed scope of work that describes:
- The data available (be as detailed as possible and be ready to provide samples for evaluation).
- The questions that need to be answered.
- Expected deliverables (broken down into phases that ensure success).
- Desired skills.
- Avoid specifying technologies or techniques. Allow the consultant team to suggest an approach to solving the problem.
- Any other expectations.
- If possible, allow the consultant to suggest alternative approaches and value-added services. This will ensure that the best that each consultant has to offer will be available.
Purchasing Services
For agencies that are not comfortable using analytic tools, and are not interested in doing in-house data analysis, hiring outside consultant support may prove to be a viable option. Purchasing support from universities is a similar option. Consultants and universities frequently have access to scientists, statisticians, database programmers, economists, and other analysts that would otherwise be difficult to hire at State and local agencies. When seeking out-of-agency services, it is wise to review product and project portfolios for examples of prior work to ensure an agency's needs match the skills of the consultant or university personnel being proposed on a project.
When hiring outside support (consultants or universities), consider a phased approach to projects. Start small, and ensure the consultant is able to perform basic analysis and fusion tasks with the data available. If the consultants are successful, then work can progress on bigger analysis tasks, adding layers of complexity and building off of prior work and available data sets. Initiating extremely large analysis tasks that are not easily broken down into smaller deliverables can be a recipe for confusion, cost overruns, disappointment, and waste.
Regardless of who does the work, it is advisable to avoid demanding that consultants use specific tools, technologies, or techniques to deliver a solution. New technologies, methodologies, and tools are developed quickly and often, and requiring outdated technologies can result in unnecessarily limiting the agency and the consultant in performing analytical tasks. Allow the consultants to drive these decisions based on what they perceive to be the most efficient and effective tools and methods.