Accelerating One of the Most Sophisticated Automated Railway Scheduling Systems in Europe
How the Swiss Federal Railways (SBB CFF FFS) Uses In-Memory Technology for Faster Train Scheduling
Train schedule planning is an extremely complicated process. There are many issues to consider: the time it takes to travel from station A to station B, the availability of train tracks, whether a train with a given number of cars fits at a station, and what must be done when any train is delayed. When new trains are added to the system, the challenge grows significantly. The largest railway systems are particularly difficult to plan for, and the many academic papers devoted to the problem illustrate just how complex planning large railway timetables is.
In the past, timetable planners relied on their expertise to produce a working timetable, but more powerful computers have made it possible to tackle the scheduling problem algorithmically, in a fully automated fashion. Timetable planning is essentially a complex mathematical optimization problem, making it a natural fit for high-performance computing. Automation not only speeds up schedule planning but also helps reduce the risk of errors, and there is ample room to optimize the computations themselves to deliver updated schedules even faster.
The Swiss Federal Railways (also known as SBB CFF FFS, the acronym for the company name in German, French, and Italian, respectively, and referred to hereafter as SBB) is the national railway company of Switzerland, headquartered in Berne. It runs about 3,000 passenger/commercial trains each day (not including cargo) in an extremely efficient and popular system. In fact, in the 2017 European Railway Performance Index, which assesses utilization, quality of service, and safety, SBB was rated first among national European railways.
The IT professionals at SBB are organized as a team of teams that work together to help plan the operations of the railway. Each team has its own Hazelcast deployment for its own distinct needs, including use as a distributed cache to enable extremely fast access and processing of data that originates in a separate data store. One major responsibility is contributing to the workflow that develops the updated train schedules, and teams have specific outputs for their stage of the workflow which they hand to the next team. Tasks include calculating travel time, identifying routing alternatives, and creating the final schedule. The result is a conflict-free train schedule for all of Switzerland that can be implemented by the railway operations team.
Railway timetabling is a well-studied science, but it is not an easy one. Despite much research and many discussions with other railways and academics, the problem has not been completely solved at the scale to which SBB aspires. Automating it with high-performance computation promises far greater efficiency, but as operator of one of the densest railway networks in the world, SBB has an enormous number of route combinations to consider, making its planning effort a very large number-crunching problem.
The SBB IT team currently uses its computer systems to incorporate extra trains into the daily timetable, and it plans to calculate timetables for an entire day algorithmically. This is a huge step forward, as constructing a valid timetable is a time-consuming effort, even for seasoned timetable planners. Manual construction of the timetable is also an error-prone task, leading to plans that often do not work out in practice, causing delays and additional work for the train operations teams.
Now that the construction of the timetable can be automated, planners can concentrate on evaluating the different timetable options the computer produces. Fast access to the computed timetable data is key to this process, enabling planners to arrive at the commercially best timetable quickly.
SBB uses the Hazelcast Platform as a high-performance layer to accelerate data access in its planning system. The system runs in a private-cloud deployment based on OpenShift, leveraging a microservices architecture built on Spring Boot, with RabbitMQ as the communications layer between microservices. SBB also uses various data storage technologies, including on-premises S3 buckets, and can quickly populate Hazelcast with that data for use by other parts of the system. Since Hazelcast stores the data in memory, access is extremely fast.
The decision to use Hazelcast was an easy one, as SBB had previously used another in-memory technology that proved to be too complicated and didn’t offer the ability to scale. The company has since standardized on Hazelcast.
The final output is essentially a dashboard that shows trains and stations mapped against temporal data. Planners can drill down into specifics in a very comprehensive visual representation of the timetable.
“There were a few lessons learned early on. But we did not have many issues to ask for support, because most of it we could figure out ourselves with the documentation.”— Adrian Burri, Software Engineer, Swiss Federal Railways (SBB CFF FFS)
The data used for the automated solution was created as a greenfield project, but computations in an early prototype were too slow. The small-scale proof-of-concept ran sufficiently but did not scale to the level that would support the entire SBB schedule.
Hazelcast dramatically sped up data access, which increased overall system performance. Optimizations such as Near Cache, which copies frequently accessed data to local nodes to eliminate unnecessary network hops, helped achieve even faster performance.
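As a rough illustration (the map name and size limits below are hypothetical, not SBB's actual configuration), a Near Cache can be enabled declaratively in a Hazelcast client's `hazelcast-client.yaml`:

```yaml
# hazelcast-client.yaml -- illustrative sketch; map name and limits are assumptions
hazelcast-client:
  near-cache:
    timetable-data:                # cache entries of the "timetable-data" map locally
      in-memory-format: OBJECT     # keep deserialized objects for the fastest reads
      invalidate-on-change: true   # drop local copies when the cluster entry changes
      eviction:
        eviction-policy: LRU       # evict least-recently-used entries first
        max-size-policy: ENTRY_COUNT
        size: 10000                # cap local copies at 10,000 entries
```

Reads of entries already held in the Near Cache are then served from the client's local memory without a network round trip; writes still go to the cluster.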
The generated dashboard lets planners see various situations, such as why a train is blocked while waiting at a given station. They get feedback immediately, instead of having to wait several minutes for an update. As a result of all this work, SBB believes its automated timetabling effort may be the most ambitious in all of Europe.
There were some lessons learned when incorporating Hazelcast into the infrastructure. Most notably, the cache eviction strategy needed more attention early on: the team had to understand the details to get it right, or else memory-related errors would occur. In some instances they kept adding data to Hazelcast without releasing older data, resulting in unexpected errors. Fortunately, the documentation was good enough that they could work through most issues on their own. Once they learned how to use Hazelcast optimally, the ongoing effort became easier.
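The eviction pitfall described above is typically addressed declaratively. As a sketch (the map name, expiry, and limits here are hypothetical, not SBB's actual settings), a member-side map can be configured in `hazelcast.yaml` so that old entries are released instead of accumulating until memory runs out:

```yaml
# hazelcast.yaml -- illustrative sketch, not SBB's actual configuration
hazelcast:
  map:
    planning-results:              # hypothetical map holding computed results
      time-to-live-seconds: 3600   # expire entries an hour after they are written
      eviction:
        eviction-policy: LRU       # when the cap is reached, evict least-recently-used entries
        max-size-policy: PER_NODE  # the size limit applies per cluster member
        size: 50000                # keep at most 50,000 entries per member
```

Combining a time-to-live with an eviction cap bounds memory use in two ways: stale data expires on its own, and a hard size limit protects each member even under sustained write load.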
The integration with OpenShift also took some extra effort, but SBB had OpenShift experts who were able to help. Hazelcast engineers have since done more work on seamless OpenShift integration, which should simplify SBB's future Hazelcast work on that platform.
As with many Hazelcast implementations, there are future possibilities for expansion because of the broad applicability of Hazelcast's capabilities. For example, its speed advantages can support deeper analysis of data, yielding richer insights and broadening what SBB can offer its customers as a service. With the Hazelcast Platform evolving into a digital integration hub, SBB has the opportunity to integrate easily with various other systems, including customer reward and compensation applications, as well as GPS-based solutions that can notify users whether they can catch a particular train on time. Because Hazelcast is cloud-ready and extremely portable thanks to its small application footprint, applications can be hosted on anything from handheld devices up to multi-instance cloud-based clusters. The opportunities to further engage customers and improve on an already comprehensive service are extensive.