Abstract
This project aims to enhance the performance of a future HMI system for semi-autonomous trains. Through a European study, it identifies key interactions such as driver gestures and voice commands, and develops an HMI sound design relevant to multiple cultural user profiles, facilitating the understanding and execution of new actions.
Description
The CARBODIN project is part of Shift2Rail, the first European rail initiative to seek market-oriented research and innovation (R&I) solutions. It has brought together multiple stakeholders around the future train, its design, and its manufacture, with a particular focus on the semi-autonomous, technologically rich driver cabin. CARBODIN aims, among other things, to improve the performance of the future HMI system through a European survey that identifies key interactions such as driver gestures and voice commands. It also aims to develop a culturally relevant HMI sound design that aids the understanding and execution of new actions, particularly those related to future trains' semi-autonomous capabilities.
Sound To Sight, in collaboration with SNCF's CIM, has been tasked with creating a clear, universally accepted sound language for future train cockpits across Europe. The project team first carried out an exhaustive benchmarking of sounds from five countries (Poland, France, Italy, Sweden, and Germany), collecting interface sounds, sounds from connected objects, tramways, and other rolling stock, as well as popular advertisements and music. A selection of these sounds (two per country per alert category) was then tested using a protocol developed by Sound To Sight and Simon ENJALBERT's team from UPHF (Polytechnic University of Hauts-de-France), whose research focuses on developing a unified model of driver behavior in critical situations in road and rail driving.
The test protocol involved 1,762 train drivers and instructors from these five countries, assessing aspects such as the aesthetic quality of the sounds and the perceived link between each sound and its use case. Each sound was evaluated blindly, with no information about its origin or context. The data were then analyzed to identify clusters of sound preferences and understandings by alert category, age, gender, nationality, and participant role (driver or instructor).
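The kind of aggregation described above can be sketched as follows. The data, field names, and scores below are hypothetical illustrations, not the project's actual results:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical blind-test ratings:
# (participant nationality, alert category, sound id, score on a 1-5 scale).
ratings = [
    ("FR", "alarm", "alarm_fr_1", 4),
    ("FR", "alarm", "alarm_de_1", 3),
    ("DE", "alarm", "alarm_fr_1", 5),
    ("DE", "alarm", "alarm_de_1", 2),
    ("PL", "notification", "notif_se_1", 4),
    ("PL", "notification", "notif_it_1", 5),
]

def mean_score_per_sound(ratings):
    """Average blind-test score for each (alert category, sound) pair."""
    buckets = defaultdict(list)
    for _nationality, category, sound, score in ratings:
        buckets[(category, sound)].append(score)
    return {key: mean(scores) for key, scores in buckets.items()}

def best_per_category(ratings):
    """Highest-rated sound within each alert category."""
    best = {}
    for (category, sound), avg in mean_score_per_sound(ratings).items():
        if category not in best or avg > best[category][1]:
            best[category] = (sound, avg)
    return best

best = best_per_category(ratings)
```

The same grouping key can be swapped for nationality, age band, or participant role to surface the demographic clusters mentioned above.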
Each sound was assigned a global preference level according to predefined categories: action, alarm, driving mode, greeting, notification. The preferred sounds were then analyzed from an ergonomic and acoustic perspective (associated use case, sound texture, rhythm, pitch, tempo, duration, presence of effects). This was followed by the creation of two groups of sounds, aesthetically coherent with each other (sound families), inspired by the best-perceived sounds from the panel (in both ergonomics and aesthetics) during the initial test.
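As a rough sketch of what an "aesthetically coherent" family might mean in code, one could require that a family covers all five alert categories and keeps its acoustic features within a narrow band. The fields, threshold, and example values below are assumptions for illustration, not the project's actual criteria:

```python
from dataclasses import dataclass

# The five predefined alert categories from the study.
CATEGORIES = {"action", "alarm", "driving mode", "greeting", "notification"}

@dataclass
class SoundProfile:
    # Hypothetical acoustic descriptors for one candidate sound.
    sound_id: str
    category: str
    pitch_hz: float
    tempo_bpm: float
    duration_s: float

def is_coherent_family(family, max_tempo_spread=30.0):
    """A family must cover all five categories and keep tempos close together."""
    categories = {s.category for s in family}
    tempos = [s.tempo_bpm for s in family]
    return categories == CATEGORIES and max(tempos) - min(tempos) <= max_tempo_spread

# Example candidate family (illustrative values only).
family_a = [
    SoundProfile("greet_a", "greeting", 440.0, 100.0, 1.2),
    SoundProfile("act_a", "action", 523.0, 110.0, 0.4),
    SoundProfile("alarm_a", "alarm", 880.0, 120.0, 2.0),
    SoundProfile("notif_a", "notification", 660.0, 105.0, 0.6),
    SoundProfile("mode_a", "driving mode", 494.0, 115.0, 0.8),
]
```

In practice the coherence judgment was made by designers from the panel's ergonomic and aesthetic results; a rule like the one above only captures the structural constraint that each family spans every alert category.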
Each group of sounds includes a greeting sound, an action sound, an alarm sound, a notification sound, and a sound signaling a change of driving mode. These sounds, now more coherent with one another, were then tested by the same panel. The results of this second study show that the new sounds are better appreciated and better suited to their use cases.