The Power of Big Data Within the Transport Industry

In 2012, the Department for Transport set up the Transport Sector Transparency Board to oversee the opening up of transport data, hoping to encourage the transport industry to embrace the ‘Open Data’ revolution. Five years later, there are 777 data sets, both published and unpublished, of which nearly half (364) relate to transport. But while this proliferation of open data, combined with increasingly advanced algorithms, real-time processing capability and advances in data storage, is revolutionising sectors such as online advertising and e-commerce, is the transport industry really keeping up?

Operational transport data (e.g. real-time train/bus arrival times) is becoming increasingly commoditised, with a relatively mature ecosystem of suppliers and developers creating a huge range of useful solutions to improve the customer experience. However, the data is not always easily accessible, and is often incomplete or inaccurate because of a fragmented industry approach to sharing data across multiple operators and service providers. There also remains huge potential in sharing additional datasets covering detailed historical and predicted performance, more accurate live positioning, vehicle loading and ticket barrier data, especially if this data can be made available in real time. Despite the relative maturity of data availability, the transport industry has valuable lessons to learn from other sectors about maximising the value of both the data itself and the arguably even more valuable metadata generated through the deployment and use of big data.

Transport data presents its own set of unique challenges (and opportunities), but the incumbent technology providers in the transport sector do not have the right skills to exploit many of these opportunities. We have had conversations with large global engineering firms about how often we have to delete our data, and with legacy technology providers about how we structure our data in a way that allows systems to scale. The idea of deleting data, or of using traditional relational database technology (such as SQL Server or Oracle), would be laughable in other sectors working with ‘big data’. Lacking experience and understanding of cloud computing, open-source NoSQL databases and real-time stream processing technology, these large, slow-moving suppliers are simply incapable of creating the technology platforms required to deliver a modern big data processing capability.

There is currently an enormous amount of excitement around the use of Machine Learning (ML) and Artificial Intelligence (AI) across industry as a whole. In the transport sector, ML and AI are already being used to identify patterns in existing operational data sets: predicting when trains will be delayed, estimating the impact of disruption, or even predicting infrastructure failure. The real impact of these technologies, though, is being felt in other sectors through their application to human behavioural ‘big data’. Personalisation, customer sentiment analysis and targeted messaging are just some of the areas in which giants such as Facebook are leading the way.

The transport sector has historically been relatively unsophisticated in its (digital) interactions with its customers, with the communication of operational data such as delays, cancellations and other disruption seen as an obligation rather than an opportunity. The data generated by how customers interact with information, when they interact, and even how long they take to make a decision based on the information presented to them, is incredibly valuable, if somewhat unstructured. This sort of data has been used in online advertising technology for many years (we are all familiar with those adverts that follow us around the Internet!), and underpins the ability to both understand and engage more meaningfully with your customer.

For the transport sector, this human behavioural ‘big data’ has incredible value. For example, data on when individuals plan a journey, or check a train/bus time prior to travelling, can be used to predict aggregated demand on services over the next few hours. At an individual level, these sorts of interactions can help to build a unique profile for each and every customer, and ultimately deliver an enhanced, personalised customer experience. More importantly, aggregating this data at a network level, with potentially hundreds of millions of interactions every day, provides a unique opportunity for the transport sector to optimise capacity and to influence behaviour across the network as a whole.
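As a rough illustration of the demand example above, the sketch below counts journey-planning interactions per service and hour over a short look-ahead window. It is a minimal sketch under stated assumptions, not a description of any real system: the event shape (a service identifier plus the departure time the customer asked about), the function name `demand_signal` and the three-hour horizon are all hypothetical, and a real demand forecast would need far more than raw counts.

```python
from collections import Counter
from datetime import datetime, timedelta


def demand_signal(events, now, horizon_hours=3):
    """Count planning interactions per (service, hour) within the horizon.

    `events` is an iterable of (service_id, planned_departure) pairs.
    This event shape is an assumption made for illustration only.
    """
    horizon_end = now + timedelta(hours=horizon_hours)
    counts = Counter()
    for service_id, planned_departure in events:
        # Only keep interactions about departures in the look-ahead window.
        if now <= planned_departure < horizon_end:
            # Truncate the departure time to the start of its hour.
            hour_bucket = planned_departure.replace(
                minute=0, second=0, microsecond=0
            )
            counts[(service_id, hour_bucket)] += 1
    return counts


# Hypothetical sample data: three queries inside the window, one outside it.
now = datetime(2017, 6, 1, 9, 0)
events = [
    ("service-A", datetime(2017, 6, 1, 9, 15)),
    ("service-A", datetime(2017, 6, 1, 9, 40)),
    ("service-B", datetime(2017, 6, 1, 10, 5)),
    ("service-A", datetime(2017, 6, 1, 13, 0)),
]
signal = demand_signal(events, now)
for (service_id, hour), count in sorted(signal.items()):
    print(service_id, hour.strftime("%H:%M"), count)
```

Counts like these are only a proxy for demand, since not everyone who checks a time goes on to travel, but aggregated across a network they give an early indication of where load is likely to build.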



“The TIMON project has received funding from the European Union's Horizon 2020 research and innovation programme under Grant Agreement No. 636220”
