Interacting in mixed reality: exploring behavioral differences between children and adults

With the development of intelligent interfaces and emerging technologies, children and adult users are provided with exciting interaction approaches in several applications. Holographic applications were until recently available only to a few people, mostly experts and researchers. In this work, we investigate the differences between children and adult users in their interaction behavior in mixed reality when asked to perform a task. Analysis of the results demonstrates that children can be more efficient during their interaction in these environments, while adults are more confident and their experience and knowledge are an advantage in achieving a task.


INTRODUCTION
Today's children grow up deeply engaged with technology, which gives them an advantage in acquiring technological skills compared to previous generations. Depending on the family's socio-economic situation, many children own advanced equipment such as personal computers, tablets, game consoles and mobile phones, through which they can use online games, social networking sites and other applications. Hence, more and more software and technology manufacturers are developing interfaces and personalized interaction for this growing market. Although children are avid users of everyday interactive media and "smart technologies", these are usually not designed, developed, or evaluated with their participation [6]. With the development of Virtual Reality (VR) and Mixed Reality (MR), children and adult users are provided with exciting interaction approaches. Until recently, these technologies were available only to a few people, mostly experts and researchers.
However, the now easy access to VR and MR devices and applications calls for further studies to assess the experiences and challenges that these technologies have brought to the human-computer interaction spectrum, specifically for children [10].
IUI '20, March 17-20, 2020, Cagliari, Italy. © 2020 Copyright held by the owner/author(s). Publication rights licensed to ACM. ACM ISBN 978-1-4503-7118-6/20/03. https://doi.org/10.1145/3377325.3377532
MR is defined by three properties [2]: i) combining real and virtual content, so that both can be seen at the same time; ii) interacting in real time, allowing virtual content to interact with each other; and iii) being represented in 3D, so that virtual objects can be firmly displayed in place within the space. The development of highly interactive applications in 3D settings opens new possibilities for natural interaction with the user's virtual and physical surroundings through a heavy flow of visual information [4]. The user experience of interacting in these visually rich environments using gestures and body movement [1] depends highly on the strategies for task completion (e.g. time needed, successful/unsuccessful interaction with tools/objects). These strategies might differ between adult and child users (e.g. [5]).
The contribution of this work lies primarily in providing initial insight into potential differences between adults and children when interacting in an MR environment to achieve a task. By identifying these differences, intelligent user interface and personalization researchers will be able to understand the needs of both groups of users and develop adaptations that accommodate them.
Research in human-computer interaction has been looking into the differences in behaviour between children and adult users of emerging technologies and smart interfaces. Here we discuss only the work that is most relevant to ours, focusing on studies that compare adults and children interacting with technology. Gossen et al. [5] performed an eye-tracking experiment to compare usability and perception differences between adults and children when performing a search task. The task was performed in search engines designed specifically for the respective audience. They found significant differences between adults and children in their search effectiveness (finding the right information) as well as their efficiency (time needed for task completion) when using regular personal computers, with children performing worse than adults. A different perspective is given in [1], where children were found to be more likely than adults to try new gestures when interacting with touchscreen devices. Furthermore, in [1], children were found to perform the "pinch-out" gesture far more often than adults, and were significantly more likely to use both fingers when interacting. This behaviour demonstrates how comfortable children are with modern ways of interaction. Pretorius et al. [9] looked into the differences in visual focus between adults and children when exploring a new game without instructions. Children appeared to choose a trial-and-error strategy within the game, whereas adults preferred reading instructions.
Based on the above reports, we can see that children have different abilities and different needs than adults when interacting with different interfaces. On the one hand, children seem much more comfortable with new technologies and interaction techniques; on the other hand, they seem to lack the experience and knowledge to locate the right information, and thus appear less effective [5]. Although a number of studies focus on how children benefit when interacting with smart interfaces [7,11], few studies compare adults and children using smart applications and devices when performing exactly the same task, with the exception of [3]. Such studies will allow us to understand the different needs of the two groups and how to design intelligent interfaces that allow adults and children to perform their best given their unique abilities; by extension, their engagement with the technology and their user experience will improve.
Along these lines, we conducted a study with adult and child participants, using MS HoloLens 1st generation headsets, to identify whether there are any distinct differences in the way the two groups of users interact. We asked both groups to perform the same task and focused on their efficiency in terms of the time spent performing the task, the number of taps they used, and the number of objects gazed at before selecting a target. In addition to the tracking data, we collected in-app videos and used structured interviews with adult and child users to understand their experience of interacting with MS HoloLens and of the task performed. The qualitative data helped us interpret the statistical results and, on some occasions, triangulate the outcomes of the statistical analysis.

USER STUDY
The aim of this study is to compare the interaction behaviour of adults and children within an MS HoloLens application. Hence, the main research question examined is: What are the differences in the behaviour of adults and children when performing a task in a mixed reality app?
Differences in the participants' behaviour are measured in terms of i) the time required to perform the task; ii) the number of taps used before selecting a target object; and iii) the number of objects gazed at before selecting a target object. Gaze behavior through MS HoloLens 1st generation was the primary source of data in this work. MS HoloLens 1st generation uses head orientation as the primary method for interacting in MR applications, while a gaze-controlled dot allows the user to "point" at objects.
Participants and Procedure: Adult participants (31 in total; 17 Male, 14 Female; Age: 20-29, Mean: 22.35, SD: 1.74) were recruited through announcements on social media and mailing lists. Child participants (20 in total; 11 Male, 9 Female; Age: 9-15, Mean: 11.80, SD: 1.85) were recruited through a collaborating private institute. The minimum requirements for taking part in the study were to: i) have no vision problems that would encumber their interaction and ii) have no previous experience with HoloLens or any other type of MR/VR headset. This condition allowed us to analyze the data without concern about prior skills and knowledge of the interaction methods used. All adult participants signed an informed consent form. For children, their parents were informed of the purpose and procedures of the study and asked to sign the form if they wanted their child to participate.
Participants were asked to use Microsoft HoloLens MR smart glasses, which allow the user to interact with 3D objects. Gaze data were collected by our logger as points of interest. Thus, we logged the number of items the user gazed at (e.g. how many times the user gazed at Uranus during his/her search session, prior to selecting Uranus). Furthermore, in-app video of the user's interaction allowed us to view exactly what the user was seeing and provided supplementary qualitative data on their interaction behavior, which we used for evaluating the logged data. As mentioned above, after each participant executed the given task, we asked them questions regarding their experience. In addition, the authors acted as observers during task execution; thus, we were able to use our observations to explain some of the quantitative results.
We selected the Galaxy Explorer HoloLens app, which allows the user to explore the Milky Way Galaxy in a full 3D model via HoloLens, along with Earth's solar system, and to listen to interesting information and facts about each object in our galaxy. This is an open-source application, which allowed us to integrate the logger we developed for collecting interaction data through the app (gestures, gaze, speech, timestamps of events, hand location).
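As an illustration of how logged events of this kind could be reduced to the measures used in this study, the sketch below derives the time to selection, the number of taps, the number of objects gazed at, and the number of gazes on the target itself from a list of timestamped events. The event schema (`LogEvent`, with `kind` values `"gaze"`, `"tap"` and `"select"`), the function name and the sample data are hypothetical and chosen only for illustration; they are not the format of the actual logger.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LogEvent:
    """One logged interaction event (hypothetical schema, for illustration only)."""
    t: float             # time since the search for this target started
    kind: str            # "gaze", "tap" or "select"
    obj: Optional[str]   # object under the gaze cursor, if any


@dataclass
class TargetMetrics:
    time_to_select: float  # time until the target was selected
    taps: int              # tap gestures performed before the selection event
    objects_gazed: int     # gaze events on any object before selection
    target_gazes: int      # gaze events on the target itself before selection


def metrics_for_target(events: List[LogEvent], target: str) -> Optional[TargetMetrics]:
    """Summarise one search; returns None if the target was never selected."""
    taps = objects_gazed = target_gazes = 0
    for ev in sorted(events, key=lambda e: e.t):
        if ev.kind == "tap":
            taps += 1
        elif ev.kind == "gaze" and ev.obj is not None:
            objects_gazed += 1
            if ev.obj == target:
                target_gazes += 1
        elif ev.kind == "select" and ev.obj == target:
            return TargetMetrics(ev.t, taps, objects_gazed, target_gazes)
    return None


# Made-up session: the participant gazes at Uranus twice and taps twice.
sample = [
    LogEvent(1.0, "gaze", "Saturn"),
    LogEvent(3.5, "gaze", "Uranus"),
    LogEvent(4.0, "tap", None),        # tap that did not select anything
    LogEvent(6.0, "gaze", "Neptune"),
    LogEvent(9.0, "gaze", "Uranus"),
    LogEvent(11.8, "tap", "Uranus"),
    LogEvent(12.0, "select", "Uranus"),
]
print(metrics_for_target(sample, "Uranus"))
```

Returning `None` for a target that was never selected mirrors the per-case handling of participants who did not complete the search for a given object.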
Task Execution: At this point participants had to follow certain steps to locate three target objects within the Galaxy Explorer application environment (Earth, Sagittarius A* and Uranus). Each target required a different level of effort to search for and locate (e.g. body movement within the physical space and head movement). Uranus required the most effort, since the user had to move his/her head or even body in the physical space and search for it in the 3D model of the planetary system; Sagittarius A* required no effort, since it was the first object the user could see upon entering the Galactic Centre; and the Earth required some effort, since the users needed to search for it in the planetary system but did not need any movement of the body. We selected this task since it is a straightforward search, locate, select task (Figure 1), without any major distractions in the environment (e.g. multiple items moving). There was no time restriction for the participants to complete this task. The users could use any MS HoloLens input method from voice, tap and tilt. We did not provide the clicker, since we were interested only in the input methods that required less restricted physical movement.

STUDY FINDINGS
Beginning with some overall findings, adults and children complained about how heavy MS HoloLens is, and some of the children in particular were getting tired and distracted by this. Specifically, two children were tired and could not complete the task of locating Sagittarius A*, and one child could not complete the task of locating Earth. These participants were excluded on a per-case basis during the statistical analysis. All children managed to locate Uranus and all adults located all objects. Children were overall very excited and willing to try the new technology, but at the same time they felt stressed about making mistakes during task execution. Adults seemed more comfortable when interacting with the 3D models; however, participants from both groups asked questions and requested clarifications throughout the task. Nevertheless, adults took longer on average than children to complete the task.
Figure 1: Adult participant interacting with 3D AR objects
The statistical analysis focused on: i) the time that each participant needed to reach each target object, ii) the gaze behavior they followed to locate the item (how many objects the participants viewed overall while searching for the target objects during the task), and iii) the tap gestures they used. These were analyzed based on the participants' age group. In order to answer our research question, we ran a series of independent-samples t-tests between adults and children. Prior to the analysis, we ensured the data were normally distributed. The results can be found in Tables 1 and 2.
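As a worked illustration of the test statistic, the sketch below computes a pooled-variance (equal variances assumed) independent-samples t value from group means, standard deviations and sizes. It is not the analysis script used in the study; the function name `pooled_t` is hypothetical. The inputs are the overall task-time and overall gaze summary statistics reported in this section; the child group size of 18 for the gaze comparison is an assumption inferred from the reported 47 degrees of freedom.

```python
import math
from typing import Tuple


def pooled_t(m1: float, sd1: float, n1: int,
             m2: float, sd2: float, n2: int) -> Tuple[float, int]:
    """Student's independent-samples t statistic (equal variances assumed),
    computed from group means, standard deviations and sizes."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (m1 - m2) / se, df


# Overall task time: adults (n=31) vs. children (n=20).
t_time, df_time = pooled_t(523.45, 186.43, 31, 485.95, 103.12, 20)
print(f"time: t({df_time}) = {t_time:.3f}")

# Overall number of objects gazed at: adults (n=31) vs. children.
# n=18 for children is an assumption inferred from the reported df of 47.
t_gaze, df_gaze = pooled_t(19.39, 17.26, 31, 39.89, 43.89, 18)
print(f"gaze: t({df_gaze}) = {t_gaze:.2f}")
```

With these inputs the sketch yields t(49) = 0.820 and t(47) = -2.32, in line with the values reported for the two overall comparisons. Obtaining p-values additionally requires the t distribution, which the Python standard library does not provide; a statistics package such as SciPy (`scipy.stats.ttest_ind_from_stats`) could be used for that step.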
The t-tests revealed no significant difference in the overall time required for the participants to complete the task (t(49)=0.820, p=0.416). However, descriptive statistics show that adults (M: 523.45, SD: 186.43) took more time to complete the overall task than children (M: 485.95, SD: 103.12). Interestingly, there was a statistically significant difference between the two groups in the total number of items they viewed during their overall search (t(47)=-2.32, p=0.025). Adults viewed notably fewer objects before selecting a target overall (M: 19.39, SD: 17.26) compared to children (M: 39.89, SD: 43.89). When locating Uranus, which required the most effort of all objects in the task, the mean time needed by adults (M: 47.23, SD: 29.96) was higher than that of children (M: 45.55, SD: 30.35), but the difference was not statistically significant (Table 1). After the participants located their target, they had to select it (Figure 2). None of the participants used voice as an input method, so we analyzed only tap gesture data.
The t-test revealed no statistically significant difference in the number of tap gestures used for selecting Uranus between adults (M: 1.45, SD: 1.18) and children (M: 2.3, SD: 2.72), even though the means show that children performed more taps (Table 2). Both groups required time and effort to locate Uranus. However, children (M: 62.1, SD: 29.8) gazed at fewer objects than adults did (M: 79.5, SD: 54.7) before selecting Uranus, making them slightly more effective at locating their target. In addition, children (M: 3.30, SD: 1.86) gazed at Uranus itself significantly fewer times (t(49)=6.76, p=0.007) before realizing that this was their target and selecting it, compared to adults (M: 6.03, SD: 4.04). In-app videos show adults gazing at the required object but failing to realize that they had found their target. This was probably because adults needed to move their body more within the physical environment than children did in order to have a clear view of Uranus. Although children had located the target object, they found it difficult to select it (they performed a series of unsuccessful taps for selection). Furthermore, children had difficulty evaluating the correctness of the object they located. Since Sagittarius A* is not as "popular" with children as the planets are, they were undecided and thus slower than adults. This was obvious from the observation data and also from further analysis of the logged data. For Sagittarius A*, children viewed more objects (M: 23.44, SD: 32.64) before locating the target than adults (M: 17.26, SD: 16.89), but the difference was not statistically significant.
Table 1: T-Tests performed on the time required by adults and children to locate and select an object (left) and number of objects gazed at before selecting a target object (right).
During the selection of a target, participants needed to enter their hands within the area that MS HoloLens is using for reading their input gestures (e.g. tap). However, when participants were not performing a selection, their hands did not need to be in that area.
What is interesting is that adults were entering their hands into that area significantly more often than children (t(49)=2.681, p=0.010), so children were aiming for a tap only when it was needed. This was confirmed through the observation notes collected during task execution. This finding might explain why adults appeared to be less efficient on some occasions: their hand appeared within their area of interaction and could have been distracting or, on some occasions, was blocking their view of a target object (in-app video data). We could not find any differences between genders in our sample.

DISCUSSION AND CONCLUSION
What are the differences in the behaviour of adults and children when performing a task in a mixed reality app? This study shows that adults and children behave differently in some aspects and show no significant differences in others. According to the results, children appear to be more adaptable when interacting in these demanding interfaces and follow different strategies compared to adults. Children appeared to be slightly more efficient in terms of the time they required to identify the target objects, with the exception of Sagittarius A*, where adults were much quicker in making a selection. When asked to locate Uranus, which was considered the most difficult of the three objects, and the Earth, children appeared more confident and quicker, and gazed at fewer objects before the selection. In terms of efficiency, adults appeared less efficient than children, who needed less time to search for, locate and select an object. Children performed significantly worse than adults only for Sagittarius A*, mainly because they needed to know where Sagittarius A* is located in the galaxy. The majority of children stated that, before and during the task, they felt stressed and afraid of failing the task. However, after the task ended they commented that they enjoyed the interaction in MR and that they could use it mainly for gaming. Adults viewed the task as an interesting experience and, similar to children, imagined using MS HoloLens mainly for gaming.
The in-app videos and observations revealed that adults and children have very different search strategies. Children have a tendency to explore [5]; hence they gazed at more objects during their interaction with the galaxy and planetary systems (see Table 2). Adults, on the contrary, were more focused on achieving the task and as a result gazed at fewer objects. We observed that adults handle the use of gestures for performing an action in MR much better, consistent with how they interact with other technologies [1]. Children performed a number of unsuccessful taps when selecting two of the three objects; however, they performed significantly better than adults when selecting the Earth. This was the last object they were asked to locate, so we assume that they got better at using the tap gesture by the end of the task. Consistent with previous work, adults were more talkative throughout task execution, while children were quieter and asked questions only when they really had a problem [9]. In addition, adults asked for reassurance and confirmation on several occasions during the task, a behaviour that we did not observe in children.
According to the results, there are some differences in the behavior of the two groups when interacting in MR, and the intelligent user interfaces community should take these into account when designing and developing technologies for both adult and child users. For example, the support that is needed in terms of instructions is clearly different. Adults would prefer to read a guide before they start interacting, while children would rather explore on their own and request help as they go along. Hence, the adaptation designed into intelligent interfaces should differ based on the user model, which should take into consideration the age group of the user. In addition, the lack of knowledge observed in children (in our study and in related work) should be taken into account when the user needs to achieve a target. That is, relevant adaptive help should be provided or made available to the user if it is required. Had this kind of support been available in the MR application used in this study, children could have taken advantage of it when asked to locate Sagittarius A*. Of course, further work is needed in order to generalize these results and assess the skills of both adults and children in a more controlled approach.