Context
Transport decisions shape climate, development, and investment outcomes. The data that supports them is fragmented across many sources, uneven in coverage and quality, and difficult to convert into the cross-source insights those decisions need. Analysts spend weeks pulling sources together by hand. Policymakers often receive single numbers without knowing which source they came from.
This page sets out the transport data situation as it stands today, the questions that go unanswered because of it, and the window that makes now a useful moment to act.
What exists today
Transport data isn't starting from nothing. It has real foundations that a broader intelligence system can build on.
The Transport Data Commons (TDC), coordinated through UNECE and supported by the FCDO-funded Climate Compatible Growth programme and GIZ, is building a shared metadata and standards layer for global transport data. It uses SDMX, the international statistical-exchange standard that already underpins the data infrastructure of the World Bank, IMF, OECD, and Eurostat. In September 2025 the eight major SDMX sponsors issued a joint statement on AI-readiness, committing to make official statistics discoverable and queryable by AI agents; the IMF's StatGPT 2.0 already demonstrates the pattern. For structured statistical data, TDC and SDMX are the right foundation.
Alongside that foundation, a set of specialised platforms covers specific slices of the picture. Oxford's OPSIS and the Global Resilience Index map infrastructure networks against climate risk. The IMF's PortWatch tracks port disruptions in near-real time. The Asian Transport Outlook aggregates 450+ indicators across 51 economies. The Africa Transport Systems Database brings together continental-scale multi-modal geospatial infrastructure. Digital Transport for Africa, OpenStreetMap and OpenCycleMap cover public-transport and active-mobility data. ITDP's Atlas of Sustainable City Transport provides urban-mobility benchmarks. The SLOCAT NDC Tracker maps national climate commitments to transport. Each platform works within its own remit, and each is hard to use in combination with any other.
Three structural gaps separate these foundations from what transport decisions actually need.
- Fragmentation across data types. Transport information spans structured statistics (country indicators, time series), geospatial infrastructure (networks, climate layers, port locations), and unstructured knowledge (policy reports, NDC commitments, project evaluations, research findings). Each lives in its own access pattern. SDMX handles the first; GIS formats handle the second; nothing standardises the third.
- Coverage and representativeness. Informal transport (minibus and motorcycle taxis, traditional coastal vessels, walking and cycling) is systematically under-documented. Rural deaths enter records less reliably than urban ones. Household travel surveys in many countries are years out of date. WHO modelled estimates of road deaths in sub-Saharan Africa are typically several multiples of police-reported figures; a trend in one series can diverge from the trend in the other.
- The synthesis problem. Even where data exists, cross-source questions require joining things that were never designed to connect: spending data with infrastructure networks, policy commitments with outcome indicators, research findings with live statistics. Today, that takes specialists weeks per question.
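The coverage gap described above can be made concrete with a minimal sketch. The Python below assumes two series for the same country, one police-reported and one modelled, each with at least two years of data. It computes the ratio between them for shared years and checks whether their trends point in opposite directions. All figures are invented for illustration, and the names `Series`, `trend`, and `compare` are hypothetical, not part of any existing tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Series:
    source: str                # label for where the numbers came from
    values: dict[int, float]   # year -> road deaths


def trend(series: Series) -> float:
    """Average change per year between the first and last observation."""
    years = sorted(series.values)
    first, last = years[0], years[-1]
    return (series.values[last] - series.values[first]) / (last - first)


def compare(reported: Series, modelled: Series) -> dict:
    """Ratio of modelled to reported deaths, plus a trend-direction check."""
    shared = sorted(set(reported.values) & set(modelled.values))
    ratios = {y: modelled.values[y] / reported.values[y] for y in shared}
    return {
        "ratio_by_year": ratios,
        "reported_trend": trend(reported),
        "modelled_trend": trend(modelled),
        # True when one series falls while the other rises or stays flat.
        "trends_diverge": (trend(reported) < 0) != (trend(modelled) < 0),
    }


# Invented figures for illustration only; not real data for any country.
reported = Series("police records (hypothetical)", {2016: 3000, 2021: 2800})
modelled = Series("modelled estimate (hypothetical)", {2016: 12000, 2021: 14000})
result = compare(reported, modelled)
print(result)
```

In this invented case the reported series falls while the modelled series rises, so a reader shown only the reported figures would draw the opposite conclusion from one shown only the modelled ones.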
The questions that go unanswered
Two kinds of question matter most to transport decision-makers, and both are badly served today.
- Questions about the data itself. Does this information exist for my country, mode, and period? Is it current? Is it comparable across sources? Can I trust it? Who does it miss? Existing dashboards answer questions silently, showing numbers without exposing the constraints of the data beneath them. A minister asking "are our roads safer?" is really asking two questions at once: the real-world question (are fatalities actually declining?) and the data-availability question underneath (does the data we have let us answer the first?). Where reported figures systematically under-count deaths, the two can diverge. A credible answer addresses both.
- Questions about real-world transport. What it costs people, how fast and reliable it is, whether it's safe, sustainable, inclusive. Examples that currently take a research team weeks to assemble:
  - Is Kenya on track for its transport NDC commitments, and what does the evidence say about the interventions it's chosen? Combines policy commitments, spending data, infrastructure outcomes, and research findings.
  - Which East African road corridors have the highest climate risk relative to their trade importance? Joins infrastructure networks, climate exposure, and trade indicators.
  - How does transport investment per kilometre of road compare across priority countries? Links aid-flow data with road-network extent.
Dashboards don't answer the data-availability questions at all; they present real-world numbers without declaring what lies beneath them. The cross-source real-world questions are answerable in principle, but at a cost (expert time measured in weeks) that rules them out for most of the decisions they would inform.
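As a rough illustration of the investment-per-kilometre question, the sketch below joins two small tables and reports what is missing alongside each result, so that the data-availability question is answered together with the real-world one. The country names, figures, and source labels are all invented, and `investment_per_km` is a hypothetical helper, not a description of how any existing system works.

```python
# Invented figures for illustration only; country names are placeholders.
AID_FLOWS_USD = {   # hypothetical aid-flow source: transport commitments in USD
    "Country A": 120_000_000,
    "Country B": 45_000_000,
    "Country C": 80_000_000,
}
ROAD_NETWORK_KM = {  # hypothetical road-network source: network length in km
    "Country A": 60_000,
    "Country B": 15_000,
    # Country C is deliberately absent to show how a gap is reported.
}


def investment_per_km(country: str) -> dict:
    """Join the two sources and declare any gap instead of hiding it."""
    spend = AID_FLOWS_USD.get(country)
    length = ROAD_NETWORK_KM.get(country)
    inputs = (("aid_flows", spend), ("road_network", length))
    missing = [name for name, value in inputs if value is None]
    return {
        "country": country,
        "usd_per_km": None if missing else spend / length,
        "missing": missing,                       # which inputs were unavailable
        "sources": [name for name, value in inputs if value is not None],
    }


rows = [investment_per_km(c) for c in sorted(AID_FLOWS_USD)]
for row in rows:
    print(row)
```

The design choice worth noting is that a missing input yields an explicit gap rather than a silently dropped row, which is the behaviour the dashboards described above lack.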
The window
Three forces make now a useful moment.
The UN Decade of Sustainable Transport (2026–2035) launched in December 2025 with an implementation plan calling for progress monitoring across six priority areas. Proposals for a global tracking framework exist (from SuM4All, TDC, SLOCAT, and others), but the data infrastructure to underpin the Decade's accountability goals remains largely unbuilt.
AI capabilities have reached the point where natural-language reasoning across heterogeneous data is technically feasible and affordable, which was not the case three years ago. Large language models can now draw on disparate sources, attribute their claims, and show their working, but only when they are embedded in a system engineered for evidence integrity rather than open-ended chat.
And the data already exists. An audit of twenty-five transport data sources found fourteen with open REST or SDMX APIs requiring no authentication, and another six with free access. Fragmentation is the bottleneck, not availability.
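To give a sense of what an open SDMX API looks like in practice, the sketch below builds a data-query URL following the general SDMX REST pattern (`/data/{flow}/{key}` with optional `startPeriod` and `endPeriod` parameters). The base URL, dataflow identifier, and series key are placeholders, not real endpoints or codes, and individual providers differ in the versions and options they support. The sketch makes no network call.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def sdmx_data_url(
    base_url: str,
    flow: str,
    key: str,
    start_period: Optional[str] = None,
    end_period: Optional[str] = None,
) -> str:
    """Build an SDMX REST data-query URL; it does not send any request."""
    # Commas separate agency, flow id and version; dots separate dimension
    # values in the key and "+" joins alternatives, so keep them unescaped.
    path = (
        f"{base_url.rstrip('/')}/data/"
        f"{quote(flow, safe=',')}/{quote(key, safe='.+')}"
    )
    params = {}
    if start_period:
        params["startPeriod"] = start_period
    if end_period:
        params["endPeriod"] = end_period
    return f"{path}?{urlencode(params)}" if params else path


# Placeholder endpoint, dataflow and key; none of these are real identifiers.
url = sdmx_data_url(
    "https://sdmx.example.org/rest",
    flow="TRANSPORT_DEMO",
    key="A.KEN.ROAD_DEATHS",
    start_period="2015",
    end_period="2020",
)
print(url)
```

Because the query is a plain URL with no credentials, the same request shape can be issued by an analyst, a script, or an automated agent, which is what makes unauthenticated APIs useful for cross-source work.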
Who is in the space
The coalition around TGIS reflects the organisations already working in this space, not a single owner. The RIDE programme (Research on Infrastructure in Developing Economies), funded by UK International Development / FCDO, is leading the development, with technical delivery supported by the Frontier Tech Hub. Multilateral development banks (ADB, AfDB, EBRD, EIB, CAF, IDB), UN agencies (UNECE, UNDESA), TDC, SLOCAT, Oxford, WRI, GIZ, TRL, and partner governments are part of the conversation. The aim is a system built with the sector, not around it.
TDC is the foundation, not a competitor. It is building the trusted metadata and SDMX-based standards layer above individual data sources: the data commons. TGIS uses agentic AI to accelerate that standards work, extend reach into geospatial and unstructured data that SDMX doesn't yet cover, and enable cross-source queries. The detailed relationship is described in Concept.
Context sets the stage. The project itself (what TGIS is, how it works, what it does) is described in Concept.