Trivector specializes in innovating next-generation transportation systems and transport with a focus on sustainability, climate impact, and social responsibility.
TravelVu is a Trivector product. It is one of the leading digital travel survey tools in Europe. End-users activate the TravelVu mobile app to automatically log travel data. This data is shared with municipalities to understand and improve infrastructure, public transport, and commute patterns.
Each user manually corrects each travel day in TravelVu to ensure that the correct data is gathered. This was causing frustration for users, specifically because correcting a day sometimes meant multiple clicks of adding, shortening, and modifying trips. The frustration sometimes caused users to stop caring about making good corrections and reduced user activity, resulting in worse data and fewer corrected travel days.
Reduce overall frustration by predicting better travel days using machine learning and deep learning models.
Backtick collaborates closely with Trivector on multiple projects. We provide data analysis and machine learning capabilities to Trivector's teams. We also initiate new projects together where we combine traffic domain knowledge with software and machine learning expertise. In this project, we explored the usage of machine learning and deep learning in Trivector's product, TravelVu.
The app gathers data from volunteers to understand city transportation patterns with the aim of improving public transportation systems in Sweden. By anonymously monitoring user travel patterns, the app automatically recognizes different transportation modes and splits the trip into multiple segments. The trip segmentation has proven challenging as it relies on a rule-based engine developed internally.
One key benefit of TravelVu compared to other survey tools is that it has a lot more variety in the different activity types. In addition to the ones you would expect (like bus, biking, car, walking), TravelVu has activity types for "Home", "Work", and "Waiting" (as in, waiting for the bus at a bus stop).
The rule-based engine gives suggestions on how a day should be split between different activities. Of course, this isn't perfect, so in order to get correct and meaningful data, TravelVu asks users to correct each day. Corrections may require a user to:
Change travel mode (for example bus to train)
Change trip order or update travel times
Edit GPS trail
Merge or split trips
Add or remove trips
Of course, performing these corrections causes frustration for users. Some modifications require more attention and clicks than others, such as changing the GPS trail of where you moved or splitting an activity into two.
A fairly large dataset with user corrected days existed. Our idea was to use this dataset to automatically predict trip segments using machine learning and deep learning methods. Ultimately, we wanted to consider the user's entire travel day rather than just individual trips in order to reduce the amount of overall corrections need in the app.
To measure the performance of our models, we decided to create two evaluation metrics:
Frustration Score. This was based on a table we created with Trivector by asking them: How costly is each of these corrections to carry out from a user perspective? A full day was analyzed, and a score was given based on a weighted sum of each inaccuracy according to the frustration table. Ultimately, this score told us how much frustration one predicted day would generate for a user. Our goal was to decrease this metric compared to the rule-based engine.
Accuracy. Simply the mean score of correctly predicted activity types during a full day. Our goal was to increase this metric compared to the rule-based engine. Of course, the models used multiple metrics (fscore, recall) for the individual trip segment predictions, while the higher level accuracy score focused on a full day.
We trained multiple models and iterated with new ideas during the project to improve the results. We had two approaches. A machine learning model tried to correct the predictions from the rule-based system while a deep learning model predicted trip types based on the raw GPS data as input. These models differ primarily in what data they work with but also in the amount of effort to be integrated inside the existing production system.
Without going into detail, the deep learning model was inspired by concepts from text translation (where multiple words may be translated to fewer words, or vice versa) since we had similar challenges. The model produced embeddings for different token inputs and had multiple LSTM and attention layers to help with time features and relevance.
45% less frustration for users (based on frustration score, see below), improved trip segmentation accuracy from 82% to 89%.
Ever wondered how city transportation is planned? See how we helped Trivector improve their transportation data with clever use of data science.
IoT devices in the water industry are currently rolling out at scale. Multiple systems, different protocols, a variety of data formats and other challenges lies ahead.
Preventing leaks, monitoring flows, preparing for the unforeseeable. We evaluated the potential for machine learning in the water industry.