User Requirements for Autonomous Vehicles – a Comparative Analysis of Expert and Non-expert-based Approach

Given the rapid progress in the design and development of autonomous vehicles, customers will soon be able to access a range of semi-autonomous vehicles. These vehicles can drive autonomously in certain circumstances with minimal input from the driver, except when a Request to Intervene is issued. Because user requirements differ across and between types of users, no unified set of user requirements will be acceptable to all drivers. Motivated by the recent surge of interest in autonomous mobility, the authors attempted to extract, rank and compare the requirements that should be met according to two types of users: experts and non-experts. An initial set of user requirements was obtained, recognizing that drivers will have different priorities and preferences in this most critical of handover scenarios.


I. INTRODUCTION
Autonomous vehicles are seen as a way to reduce motor vehicle crashes by eliminating human error. In 2010, the European Union set a target of reducing deaths in traffic accidents by 50% within a decade. Since then, EU members have achieved a 20.7% reduction. Meeting the target would require a similar percentage reduction between 2019 and 2020, which is extremely challenging. Going further, in May 2018 the European Commission adopted a new Strategic Action Plan on Road Safety, setting a new target for the 2020-2030 period [1]. The action plan assumes further policy change, new vehicle safety standards and a strategy for automated driving.
To reduce human errors, the European Parliament adopted new measures to improve road safety. The foreseen technological changes comprise a number of updated mandatory minimum safety requirements for new vehicles. From 2022, when the measures come into force, all new models must be equipped with safety features such as Automated Emergency Braking (AEB) and an overridable Intelligent Speed Assistance (ISA). These features will be standard on all existing models sold on the EU market by 2024. Moreover, as of 2028 new heavy goods vehicles will have to comply with direct vision requirements [2].

A. Automation levels
Technology development in the field of advanced sensors, software and artificial intelligence has encouraged car companies to develop self-driving vehicles (e.g. [3] [4]). Automated Driving Systems (ADS) [5], in which perception and decision-making are performed by machine/artificial components, are expected to become a reality in the coming decades. Automation requires drivers to relinquish control of the vehicle, whilst maintaining awareness to enable safe performance in case the system reaches its limits. ADS should have both automated and manual modes. According to the Society of Automotive Engineers (SAE 2018 [6]) there is a six-level scale of driving automation, from 0 (no automation) to 5 (full automation):
• Level 0 - no automation: the driver performs the entire Dynamic Driving Task (DDT), even when enhanced by active safety systems.
• Level 1 - driver assistance: the driver controls the vehicle, and the system makes adjustments to speed or direction (e.g. Adaptive Cruise Control, Lane Keep Assist). The system executes only one subtask, not both simultaneously.
• Level 2 - partial automation: the driver controls the vehicle, and the system makes simultaneous adjustments to both speed and direction (e.g. Park Assist, Traffic Jam Assist).
• Level 3 - conditional automation: the system has full control over the vehicle (speed, direction, environment monitoring, e.g. Steering Collision Avoidance) but only under specific conditions (e.g. motorway, limited speed, no crossings). The driver has to be constantly ready to intervene if a dangerous situation is detected.
• Level 4 - high automation: the system has full control over the vehicle and driver presence is not necessary, but only up to the system's limits. The driver can undertake non-driving related tasks. If the actual driving conditions exceed the system performance limits, it may ask the driver to intervene or decide to stop the journey. The system works under specific conditions.
• Level 5 - full automation: a human driver is not necessary and is considered a passenger. The system works unconditionally. An ADS performs the entire Dynamic Driving Task (DDT) and DDT fallback without any expectation that a user will respond to a Request to Intervene.
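As an illustration only, the six levels listed above can be encoded as a small enumeration. The helper names below are hypothetical and simply restate the distinctions drawn in the list (who controls the vehicle, and at which levels a Request to Intervene may be issued); they are not part of the SAE taxonomy itself.

```python
from enum import IntEnum


class SAELevel(IntEnum):
    """The six levels of driving automation as summarized in the list above."""
    NO_AUTOMATION = 0
    DRIVER_ASSISTANCE = 1
    PARTIAL_AUTOMATION = 2
    CONDITIONAL_AUTOMATION = 3
    HIGH_AUTOMATION = 4
    FULL_AUTOMATION = 5


def driver_controls_vehicle(level: SAELevel) -> bool:
    """Levels 0-2: the driver controls the vehicle and the system only assists."""
    return level <= SAELevel.PARTIAL_AUTOMATION


def may_issue_request_to_intervene(level: SAELevel) -> bool:
    """Levels 3-4: the system drives but may ask the driver to take over.

    At Level 5 there is no expectation that a user responds to such a request.
    """
    return level in (SAELevel.CONDITIONAL_AUTOMATION, SAELevel.HIGH_AUTOMATION)
```

For example, `may_issue_request_to_intervene(SAELevel.CONDITIONAL_AUTOMATION)` is true, while the same call for `SAELevel.FULL_AUTOMATION` is false.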
No newly developed technology will gain users' trust and acceptance if its features are not balanced with their requirements and expectations. With this in mind, every development process should be undertaken with a user-centered approach.

B. User-centered design
The concept of user-centered design was originated by Donald Norman in the 1980s [7][8] and offered four basic recommendations concerning design:
• Make it easy to determine what actions are possible at any moment.
• Make things visible, including the conceptual model of the system, the alternative actions, and the results of actions.
• Make it easy to evaluate the current state of the system.
• Follow natural mappings between intentions and the required actions; between actions and the resulting effect; and between the information that is visible and the interpretation of the system state.
One approach to ensuring the broad appeal of a technology such as automation is personalization, that is, the tailoring of systems and services to individual consumer tastes and preferences. The concept has been used in the health domain, where it helps people be in charge of their own destiny [9] and such personalized approaches are hence more effective [10] [11]. Beyond comfort features (e.g. choosing a preferred color option on the dashboard), vehicle manufacturers offer little in the way of personalization of driver support systems.

C. Extracting requirements
User requirements can be extracted in a number of ways, including questionnaires, interviews, focus groups and field observations, ranging from qualitative methods to those which afford the researcher the opportunity to collect vast amounts of quantitative data. Each has merits and drawbacks, although in the first stages of a design process it is generally agreed that the greatest benefits come from a deep understanding of the customer, whether that be attitudes or motivations. Thus, many initial phases of product design involve qualitative research such as focus groups or unstructured interviews.
Motivational models of acceptance predict that users' engagement with a new technology is influenced by their perception of how effective it will be in helping them attain their personal goals [12] [13]. Thus a support system that clearly meets a driver's personal goals will be more acceptable and hence more likely to engage the user. Some research has explored this concept, using adaptive warnings for a forward collision warning system [14] [15]. The results were generally positive, with the adaptive system being more acceptable (subjectively) and effective (objectively) than one which was more rigid in its design. One of the studies personalized the system design such that the auditory alert in a forward collision warning system was tailored to a driver's brake reaction time, and this adaptiveness was particularly appreciated by drivers who demonstrated a more aggressive driving style [16].
Personalization could therefore perhaps address the often observed weak or unsustained changes in driving behaviors [17] [18]. We therefore aimed to categorize drivers according to their motivations regarding automation. By doing this, the Trustonomy framework could be more effective by being more (personally) goal oriented. User requirements will differ across and between types of user. There will not be one set of user requirements which will be acceptable to all car drivers.

A. Participants
The research was conducted in two iterations using different target groups.
The first group of participants consisted of 85 people. Experts in the automotive industry from several different domains were invited to take part:
• passenger car drivers,
• driver trainers,
• risk analysts,
• automotive/bus industry,
• certification/legislation.
In the second round, 50 participants took part at the Trustonomy Workshop, which was held on October 17th 2019 within the VIII Congress of Public Transport and Smart City in Warsaw (Poland). This group was not intended to be composed of domain experts, although the congress participants may have included people with experience in R&I or students of transport departments.

B. The Q methodology
The "Q-methodology" [19] was devised and developed in the 1930s and used extensively across different subject domains. In essence, the key to this approach is to consider data in terms of the individual's whole pattern of responses, a self-reference rather than looking for patterns among people.
Participants are asked to decide what is meaningful and significant from their perspective, via a Q-sort experiment. The data from several people are then factor analyzed to reveal groups of individuals who have ranked characteristics (in this case, characteristics of a possible Trustonomy framework) in the same order.
Q-methodology has become increasingly popular in other fields such as health, education and environmental sciences, and in transport, where it has been used to explore the relationship between social participation and transport [20] or to explain the difference in mode choice between walking and cycling [21].
There are five distinct phases in carrying out a Q-methodology study [22]. First, a concourse is devised, defined as the attitudes or opinions of a set of individuals towards a specific topic. There is a varied array of methods for building a concourse. The most common approach is interviewing potential participants and recording their opinions; other methods include the use of newspaper articles, existing literature and television shows. This process results in a set of statements about the target system or product. Once the concourse has been constructed, the researcher categorizes the statements to ensure that all aspects of design have been considered. The researcher then condenses each of the statements, such that participants can respond to them on an "agree or disagree" basis. The final number of statements is debatable: some researchers use up to 80, whilst others claim 10 is sufficient. It largely depends on the homogeneity of the sample. Selecting participants who are diverse in their experience, and whom the researcher expects to have contrasting ideas and behaviors, will offer a wider range of opinions.
The statements are then presented to participants, who first categorize them into those they agree with, disagree with or feel fairly neutral about. Following this, participants place each of the statements on the Q-grid, and their location is recorded by the researcher. Thus, an individual "Q-sort" is obtained for each participant. Quantitative data analysis is then undertaken to establish patterns using correlation and factor analysis. The statistical analysis is not performed by variable, trait, or statement, but rather by person. Individuals with similar opinions show high correlations.
The Q-methodology assumes that opinions are subjective and can be shared, measured, and compared. Via the Q-grid, a quasi-normal distribution of statements is obtained, as there are fewer statements that can be placed at the extreme ends and more that are placed into the middle area which represents the neutral zone. Both the symmetry and predetermined numbers of statements in each category facilitate the quantitative methods of correlation and factor analysis.
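The by-person analysis described above can be sketched as follows. This is a minimal illustration with synthetic data, not the study's own procedure: the shape of the forced distribution (`GRID`) and the number of participants are assumptions made for the example, since the source does not give the actual Q-grid layout.

```python
import numpy as np

# Hypothetical forced quasi-normal distribution: grid score -> number of slots.
# Fewer slots at the extremes, more in the neutral middle (32 slots in total).
GRID = {-4: 2, -3: 3, -2: 4, -1: 4, 0: 6, 1: 4, 2: 4, 3: 3, 4: 2}


def random_q_sort(rng: np.random.Generator) -> np.ndarray:
    """Return one synthetic Q-sort: the grid score assigned to each statement."""
    scores = np.repeat(list(GRID.keys()), list(GRID.values()))
    return rng.permutation(scores)


rng = np.random.default_rng(0)
n_participants = 5  # illustrative only
# Rows are participants, columns are statements.
q_sorts = np.stack([random_q_sort(rng) for _ in range(n_participants)])

# np.corrcoef treats each row as one variable, so correlating the rows
# yields a person-by-person (not statement-by-statement) matrix.
person_corr = np.corrcoef(q_sorts)
```

Because every participant fills the same grid, all Q-sorts share the same mean and variance, which is one reason the symmetric, predetermined distribution makes the correlations directly comparable.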

C. Requirements ranking
Requirement elicitation methods, such as the MoSCoW method [23], underline the importance of developing a clear understanding of the customers' requirements and prioritizing them by ranking. This ranking helps everyone (customer, project manager, designer, developers) understand the most important requirements, in what order to develop them, and what not to deliver if there is pressure on resources. The MoSCoW method can be summarized as follows:
• M - must have this requirement to meet the business needs,
• S - should have this requirement if possible, but project success does not rely on it,
• C - could have this requirement if it does not affect anything else on the project,
• W - would like to have this requirement later, but delivery won't be this time.
It is thus a prioritization method used to decide which requirements to complete first, which must come later and which to exclude. The Must requirements are non-negotiable. Failure to deliver even one of them will likely mean the project has failed. The project team should aim to deliver as many of the Should requirements as possible. Could and Would requirements are nice to have and do not affect the overall success of the project. Could requirements are the first to be omitted if the project timeline or budget comes under pressure. This prioritization was carried out during the first iteration.
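The prioritization logic described above can be sketched in a few lines. This is an illustrative reading of the MoSCoW rules, not a tool used in the study: the `effort` values and the `budget` are hypothetical, and the identifiers in the usage note are used only as labels.

```python
from dataclasses import dataclass


@dataclass
class Requirement:
    req_id: str
    priority: str  # one of "M", "S", "C", "W"
    effort: int    # hypothetical effort estimate, arbitrary units


def plan(requirements: list[Requirement], budget: int) -> list[Requirement]:
    """Select requirements for delivery in MoSCoW order within a budget.

    Must requirements are non-negotiable, so the plan fails if they do not fit.
    Should requirements are added next, then Could; Would is never scheduled
    because it is explicitly deferred to a later delivery.
    """
    musts = [r for r in requirements if r.priority == "M"]
    must_effort = sum(r.effort for r in musts)
    if must_effort > budget:
        raise ValueError("Must requirements exceed the available budget")

    selected = list(musts)
    remaining = budget - must_effort
    for level in ("S", "C"):  # Could is considered last, so it is dropped first
        for r in requirements:
            if r.priority == level and r.effort <= remaining:
                selected.append(r)
                remaining -= r.effort
    return selected
```

With three requirements of priority M, S and C and efforts 3, 2 and 2, a budget of 5 keeps the Must and Should items and omits the Could item; raising the budget to 7 keeps all three.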
In the second iteration, participants were asked to assess 25 requirements, chosen from those identified during the first iteration. The selection was made keeping in mind that Congress attendees do not necessarily have specific knowledge in the field of autonomous mobility, so each question should be easily understandable by non-experts. A group of 50 participants were asked to fill in a questionnaire using a simple Likert-like scale from 1 to 5, evaluating each requirement.

A. Statements and factoring
In the first iteration of research, a concourse of thirty-two statements was derived using expert knowledge of the consortium and reference to the current literature pertaining to resumption of control (e.g. [24]). The statements covered a number of themes, such as Driver State Monitoring, Human Machine Interface, Risk assessment and Driver training. They were assessed using Q-sort methodology.
With the obtained individual sorts from each participant, a factor analysis was performed. The process consists of a Principal Components Analysis (PCA) and a Varimax rotation to identify potential factors that can represent similarities between the recorded points of view of the participants. Three factors explaining 65% of the total variance were extracted. The highest and lowest rated statements from each factor were then selected to formulate a general perspective of the different points of view represented within each of the three factors.
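A minimal sketch of this kind of by-person factor extraction is given below, assuming synthetic data. The source does not state which software or settings were used, so the data generation, the eigen-decomposition route to the principal components and the `varimax` helper (a commonly used SVD-based formulation) are all illustrative assumptions.

```python
import numpy as np


def varimax(loadings: np.ndarray, max_iter: int = 100, tol: float = 1e-6) -> np.ndarray:
    """Orthogonal varimax rotation of an (n_persons, n_factors) loading matrix."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        col_ss = np.diag((rotated ** 2).sum(axis=0))  # column sums of squares
        u, s, vt = np.linalg.svd(loadings.T @ (rotated ** 3 - rotated @ col_ss / p))
        rotation = u @ vt
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):  # no meaningful improvement
            break
        criterion = new_criterion
    return loadings @ rotation


# Synthetic Q-sorts: 3 latent viewpoints, 4 noisy participants per viewpoint.
rng = np.random.default_rng(42)
n_statements, n_factors = 32, 3
viewpoints = rng.normal(size=(n_factors, n_statements))
q_sorts = np.vstack([vp + 0.5 * rng.normal(size=(4, n_statements)) for vp in viewpoints])

# PCA on the person-by-person correlation matrix (rows of q_sorts are persons).
corr = np.corrcoef(q_sorts)
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1]  # largest eigenvalue first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

loadings = eigvecs[:, :n_factors] * np.sqrt(eigvals[:n_factors])
explained = eigvals[:n_factors].sum() / eigvals.sum()  # share of total variance
rotated = varimax(loadings)
```

Each row of `rotated` gives one participant's loading on each factor; participants who load highly on the same factor share a point of view. Since the rotation is orthogonal, it redistributes variance among the retained factors without changing the total share they explain.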
Factor 1 reflected higher scores given to statements related to ensuring that drivers were not undertaking behaviors viewed as risky or perhaps taking advantage of automation, such as being out of position, asleep or impaired by alcohol or drugs. In addition, this factor features a requirement that drivers should ultimately be able to take control of the vehicle at any time. This factor was therefore named "Suspicious Controllers". This group also expressed the opinion that the driver, not the vehicle, is the key actor, and indeed held that neither the vehicle nor passengers or other drivers should be involved in either knowing about the automation mode or creating learning algorithms.
Factor 2 also reflects an interest in driver state monitoring; however, this time the user requirements are more concerned with alertness and distraction, as well as a requirement that drivers should be able to resume control at any time. A training requirement also featured, as did the notion that, because drivers are presumably prone to distraction, a haptic warning (which has no visual or auditory element) should be used to alert them. This factor was therefore named "Cognitively concerned". Interestingly, the group demonstrated high trust in the automation by disagreeing with the statement regarding drivers being able to over-ride the decision to resume control. This group also disagreed with the statement that a visual display would effectively alert drivers to an impending need to resume control, which supports the cognitive concerns of this group.
Factor 3 has a heavy focus on statements related to HMI, with particular reference to the information flow between the vehicle and the driver with regard to mode, urgency and confirmation. In contrast, this group was disinclined to think there was a need for driver state monitoring, such as of glance behavior and distraction, and was instead more concerned with the need to derive information and remain task focused, having had the opportunity to practice the RtI scenario in a safe environment beforehand. This group was named "Information seekers"; its members also think that the information or training they require would not be sufficiently derived from an instruction manual.

B. The requirements
The extracted and factored statements were mapped to Trustonomy pillars and functionalities, to put them in a form suitable for further technical analysis (functional/non-functional) and to integrate them with those operational needs that could not be captured by the user, each from its own perspective. Table 1 provides a summary of system requirements, which were further sub-divided according to the following classification:
<pillar> identifies the main functionality the requirement refers to:
• DSM - Assessment of Driver State Monitoring systems,
• HMI - Assessment of Human-Machine Interface designs,
• ARA - Adversarial Risk Analysis and ethical decision support,
• DTR - Driver Training,
• DIP - Assessment of Driver Intervention Performance,
• TRU - Driver Trust,
<type> identifies the type of requirement:
• FUNC - functional requirement,
• PERF - non-functional performance requirement,
• SU - non-functional safety & usability requirement,
• SP - non-functional security & privacy requirement,
• INT - non-functional interoperability requirement,
• NFUNC - generic non-functional requirement, used when it cannot be classified in one of the above-listed categories,
<number> is an incremental number uniquely identifying the requirement within its pillar.

DSM-FUNC-02
The DSM technologies assessed must detect the position of the driver inside the vehicle.

DSM-FUNC-03
The DSM technologies assessed must detect driver status, behaviors and actions through visual, auditory and kinesthetic/mechanical information.

DSM-FUNC-04
The DSM technologies assessed should detect passengers' status, behaviors and actions.

DSM-FUNC-05
The DSM technologies assessed must measure the sensory state, the motoric state, the cognitive state, the arousal level and the emotional level of the driver to estimate his/her state.

DSM-FUNC-06
The DSM technologies assessed could take into consideration contextual data such as vehicle type, road context, vehicle speed, etc.

DSM-FUNC-07
The DSM technologies assessed should detect if the driver is under the influence of drugs or alcohol before passing control back.

DSM-FUNC-12
The Trustonomy framework could provide an estimation of the mean driver response time based on the collected information of the current and historical driver state.

DSM-FUNC-13
The DSM technology assessed could provide an assessment of where the driver's gaze was focused before and during the Request to Intervene (RtI) and the process of regaining control.

DSM-SU-01
The DSM technologies assessed must not interfere with the manual driving activity.

DSM-SP-01
The DSM technologies assessed must guarantee the protection of the data collected preventing unauthorized access.

HMI-FUNC-02
The HMIs assessed must provide auditory and vibrating warnings (or a combination of visual and auditory and/or vibrating warnings).

HMI-FUNC-04
The HMIs assessed must show whether the vehicle is in automation or manual mode.

HMI-FUNC-05
The HMIs assessed must enable the driver to confirm that he/she is ready to resume control of the vehicle.

HMI-FUNC-09
The Trustonomy framework must enable the driver to regain control of the vehicle at any time.

HMI-PERF-02
At least one HMI design assessed or a combination of some of them must guarantee that in 99.7% of cases actual take-over time is shorter than take-over time budget.

HMI-NFUNC-01
The HMIs assessed should be easily understandable and usable by the driver.

HMI-SU-01
The HMIs assessed must not distract the driver by requiring complex interactions for correct use.

HMI-INT-01
Some of the HMI designs assessed should be interoperable and customizable according to the user characteristics.

ARA-FUNC-09
Assessed (emergency trajectory planning) algorithms should be based on multi-objective cost functions.

DTR-FUNC-04
The Trustonomy framework must assess driver understanding of training content.

DTR-FUNC-06
The Trustonomy framework must link the configuration of simulated environment to the training session.

DTR-FUNC-07
The Trustonomy framework must assess driver's simulated/real action in driving practice after conceptual training.

DIP-FUNC-01
The Trustonomy framework must assess the quality of driver intervention.

TRU-FUNC-02
The Trustonomy framework must assess and record the driver's level of trust along a simulation.

TRU-FUNC-04
The Trustonomy framework could be capable of evaluating ADS reliability and predicting the driver's trust.
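The identifiers listed above follow the pillar, type and number classification, so they can be validated and split mechanically. The sketch below is a hypothetical helper written for illustration; the function and class names are assumptions and not part of the Trustonomy framework.

```python
import re
from typing import NamedTuple

# Codes taken from the classification described in this section.
PILLARS = {"DSM", "HMI", "ARA", "DTR", "DIP", "TRU"}
TYPES = {"FUNC", "PERF", "SU", "SP", "INT", "NFUNC"}

_ID_PATTERN = re.compile(r"^([A-Z]+)-([A-Z]+)-(\d+)$")


class RequirementId(NamedTuple):
    pillar: str
    req_type: str
    number: int


def parse_requirement_id(text: str) -> RequirementId:
    """Split an identifier such as 'HMI-PERF-02' into its three parts."""
    match = _ID_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a requirement identifier: {text!r}")
    pillar, req_type, number = match.groups()
    if pillar not in PILLARS:
        raise ValueError(f"unknown pillar: {pillar!r}")
    if req_type not in TYPES:
        raise ValueError(f"unknown requirement type: {req_type!r}")
    return RequirementId(pillar, req_type, int(number))
```

For instance, `parse_requirement_id("DSM-SP-01")` yields the pillar `DSM`, the type `SP` (security & privacy) and the number 1, which makes it straightforward to group requirements by pillar or by type.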

C. Comparison
For the second iteration, participants ranked the selected 25 requirements from 1 (not important) to 5 (extremely important). The "Priority" field, corresponding to the level of importance of fulfilling the relevant requirement, was again identified according to the MoSCoW notation [23]. To make the data easier to analyze, conditional formatting with a heatmap color scale was applied. The results of the non-expert assessment are combined with the expert-based evaluation of requirements and presented in Table 2.
The highest value in the row, represented by a black bold letter, determines the priority result (the highest possible score is 50). The red bold letters in the "Expert-based priority" column indicate requirements which were assessed differently by the two research groups (that is, achieved different results in the two iterations).
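The rule described above (the most frequent answer in a row determines the priority, and rows where the two groups disagree are flagged) can be sketched as follows. The identifiers and counts in the usage note are invented placeholders, not values from Table 2, and the sketch starts from per-letter counts because the source does not specify how the 1 to 5 scale was mapped to MoSCoW letters. Resolving ties in M, S, C, W order is also an assumption.

```python
def priority_from_counts(counts: dict[str, int]) -> str:
    """Return the MoSCoW letter with the highest answer count.

    Ties are resolved in M, S, C, W order (an assumption; the source does
    not say how ties were handled).
    """
    return max("MSCW", key=lambda letter: counts.get(letter, 0))


def compare_priorities(
    expert: dict[str, str],
    non_expert_counts: dict[str, dict[str, int]],
) -> tuple[dict[str, str], list[str]]:
    """Derive non-expert priorities and list requirements where groups differ."""
    non_expert = {rid: priority_from_counts(c) for rid, c in non_expert_counts.items()}
    differing = sorted(rid for rid in expert if non_expert[rid] != expert[rid])
    return non_expert, differing
```

As a usage example with placeholder data for 50 respondents: given expert priorities `{"REQ-1": "S", "REQ-2": "M"}` and non-expert counts `{"REQ-1": {"M": 30, "S": 15, "C": 4, "W": 1}, "REQ-2": {"M": 28, "S": 20, "C": 2, "W": 0}}`, both requirements come out as M for non-experts, and only `REQ-1` is flagged as assessed differently.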
About half of the questions (13) were answered by non-experts in the same way as by industry experts. The answers given by the non-experts show that drivers unrelated to the automotive industry, without specific knowledge of autonomous vehicles, consider driver monitoring systems and training less important than interacting with on-board systems. For the DSM category, only 3 requirements were assessed identically by both groups of respondents. Although both groups agree on the importance of data protection and the need for estimating driver state, non-experts claim that DSM must make use of historical data, while experts think Could is sufficient.
An interesting finding emerged from the DUI-related question. Both research groups consider driving under the influence a dangerous factor. However, whereas experts are willing to postpone the introduction of DUI-assessing technologies somewhat, laypeople state that DSM must detect if the driver is under the influence of drugs or alcohol before passing control back, meaning that this capability should not be put off.
Almost all questions referring to the HMI design received an M priority in the non-expert-based assessment. The results from both iterations indicate that HMI should be interoperable and customizable according to the user characteristics. Whilst according to experts HMIs should be understandable and usable by the driver, non-professional participants gave a Must answer. Apart from the question about the guaranteed time budget, it is the question with the most M answers (39).
An extremely interesting finding is that, despite being strict in evaluating the other requirement groups (with a predominance of Must and Should answers), non-experts take a surprisingly liberal approach to Driver Performance Assessment. This relatively new concept is considered highly important by specialists, while laypeople's answers suggest that the quality assessment of driver intervention is not so crucial.

IV. CONCLUSIONS
User requirements were extracted by undertaking a qualitative study. This allowed the production of rankings of different statements according to whether they were agreed or disagreed with from the drivers' point of view. The rankings were analyzed quantitatively and factors or groups of user requirements were established. Three distinct groups were found to exist, each characterized by a different set of positive and negative statements. For the Trustonomy project, this provides the first set of user requirements, whilst recognizing that drivers will have different priorities and preferences in this most critical of handover scenarios.
To summarize, users state the importance of driver state monitoring with regard to both physical (alcohol, drugs etc.) and cognitive (distraction) impairment. Retaining ultimate control in Request-to-Intervene situations was also deemed essential. The same applies to low-interference information to inform the driver about upcoming handover situations.