HAL24K wants to make the world a better place by making cities and organisations smarter through the application of advanced data science techniques, including machine learning and AI. Many cities struggle to tackle traffic congestion, maximise the efficiency of public transport and minimise air pollution. One way to address these challenges is through public bicycle hire schemes; HAL24K is interested in how people engage with these schemes and how they can be made smarter.
Santander Cycles is a public bicycle hire scheme operating across central London. Transport for London (TfL) provides access to open data records of approximately 37 million bicycle journeys made over a four-year period, along with a live public data API. Using these data, HAL24K asked the team to characterise the efficiency and usage of the system and to investigate ways to improve it.
There were three primary aims of the project:
Over a five-week period, a team of four PhD scientists used a Python analysis stack combined with an SQL database to perform initial queries and exploration of the data. Using these technologies, the team were able to store the journey data and use it to compute the flow data for their model. In addition, the team explored a graph database to enable them to make more complex queries about the network.
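As an illustration of how flow data can be derived from stored journey records, the following is a minimal sketch using Python's built-in sqlite3 module. It is not the team's actual pipeline: the table name, the column names and the sample rows are all assumptions made up for this example. It counts departures and arrivals per docking station per hour, from which a net flow follows.

```python
import sqlite3
from collections import defaultdict

# Hypothetical schema (an assumption, not the TfL format):
# one row per journey, timestamps stored as ISO-8601 text.
SCHEMA = """
CREATE TABLE journeys (
    bike_id INTEGER,
    start_station TEXT,
    end_station TEXT,
    start_time TEXT,
    end_time TEXT
)
"""


def hourly_flow(conn):
    """Return {(station, hour): (departures, arrivals)} from the journeys table."""
    counts = defaultdict(lambda: [0, 0])
    # Slot 0 counts departures (journey starts), slot 1 counts arrivals (journey ends).
    for slot, station_col, time_col in (
        (0, "start_station", "start_time"),
        (1, "end_station", "end_time"),
    ):
        sql = (
            f"SELECT {station_col}, "
            f"strftime('%Y-%m-%d %H:00', {time_col}) AS hour, COUNT(*) "
            f"FROM journeys GROUP BY {station_col}, hour"
        )
        for station, hour, n in conn.execute(sql):
            counts[(station, hour)][slot] = n
    return {key: tuple(value) for key, value in counts.items()}


# Toy data, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany(
    "INSERT INTO journeys VALUES (?, ?, ?, ?, ?)",
    [
        (1, "A", "B", "2016-08-01 08:05:00", "2016-08-01 08:25:00"),
        (2, "A", "B", "2016-08-01 08:40:00", "2016-08-01 09:10:00"),
        (1, "B", "A", "2016-08-01 09:15:00", "2016-08-01 09:30:00"),
    ],
)

flow = hourly_flow(conn)
# Net flow per station and hour: arrivals minus departures.
net_flow = {key: arrivals - departures for key, (departures, arrivals) in flow.items()}
```

In this toy example, station A loses two bicycles in the 08:00 hour, which is the kind of imbalance a flow model would need to anticipate.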
The initial results of the analysis indicated that the bicycle network is heavily under-utilised: each bicycle is used for an average of only 2.5 journeys per day, totalling 55 minutes. This suggests that there is a large amount of spare capacity in the system, provided the bicycles are in the right locations when travellers want to use them. Using Google's TensorFlow machine learning (ML) library, the team created an artificial neural network that predicts the flow in and out of each of the docking stations.
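To make the utilisation figures concrete, the sketch below shows one way such averages could be computed from a list of journey durations. The function name, its parameters and the sample numbers are hypothetical; the team's own definition of utilisation may differ, for example in how idle bicycles are counted. Here every bicycle in the fleet counts towards the average on every day, whether or not it was used.

```python
MINUTES_PER_DAY = 24 * 60


def daily_utilisation(durations_minutes, fleet_size, n_days):
    """Average journeys and minutes in use per bicycle per day.

    durations_minutes: duration of every journey in the period, in minutes.
    fleet_size, n_days: assumed fleet size and length of the period; idle
    bicycles count towards the average.
    """
    bike_days = fleet_size * n_days
    journeys_per_day = len(durations_minutes) / bike_days
    minutes_per_day = sum(durations_minutes) / bike_days
    fraction_in_use = minutes_per_day / MINUTES_PER_DAY
    return journeys_per_day, minutes_per_day, fraction_in_use


# Toy numbers, invented for illustration: 2 bicycles over 2 days, 5 journeys.
journeys, minutes, fraction = daily_utilisation([20, 30, 10, 25, 25], fleet_size=2, n_days=2)

# The figure quoted in the text: 55 minutes of use out of a 1440-minute day.
reported_fraction = 55 / MINUTES_PER_DAY
```

On this definition, 55 minutes of use per day means a bicycle is in use for under 4% of the day, which is consistent with the spare capacity described above.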
The S2DS team produced a model that predicts the flow of bicycles at each of the 750 bicycle docking stations in London one hour in the future, based on the live API data and recent flow history, to an accuracy of almost 90%.
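Predicting flow one hour ahead from recent flow history is commonly framed as a supervised learning problem, in which each training example pairs a window of past hourly values with the value to be predicted. The sketch below shows that windowing step only, in plain Python. The function name, the window length of three hours and the sample series are assumptions for illustration; the source does not describe the team's feature construction or network architecture, and the live API inputs are omitted here.

```python
def make_windows(series, history=3, horizon=1):
    """Split an hourly series into (past window, future target) pairs.

    history: number of past hours used as input (assumed value).
    horizon: how many hours ahead the target lies (1 = one hour ahead).
    """
    inputs, targets = [], []
    for t in range(history, len(series) - horizon + 1):
        inputs.append(series[t - history:t])      # the `history` hours before t
        targets.append(series[t + horizon - 1])   # the value `horizon` hours ahead
    return inputs, targets


# Toy hourly net flow for one docking station, invented for illustration.
station_net_flow = [2, -1, 0, 3, -2, 1]
X, y = make_windows(station_net_flow, history=3, horizon=1)
```

Each row of X could then be fed, together with any other features, to a model such as a neural network, with the matching entry of y as the training target.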
Furthermore, knowledge of how the bicycles move around the network could be used to create efficient maintenance schedules. This has the potential to reduce the cost of the maintenance programme while at the same time providing safety warnings for bicycles that are in need of maintenance. A next step to improve the power of the model would be to extend it to incorporate knowledge and data from more components of the London transport network, for example the Underground, train and bus services.
This would enable the team to better predict changes in demand in response to events such as train driver strikes, so that action can be taken in advance, ultimately creating a better customer experience and a more efficient transportation network.
"The S2DS team has proven that combining cutting edge technologies can provide valuable insights from extremely complex problems which will be beneficial not only for the shared bike users in London, but worldwide."