Control-Display Affordances in Simulation Based Education

Mixed reality opens new ways of connecting users to virtual content. In simulation-based education and training (SBET), mixed reality offers an enriched environment for experiencing digital learning. In turn, learners can develop their mental models to process and connect 2D/3D information in real-world settings. This paper reports on the use of the Microsoft HoloLens to create a mixed reality SBET environment. The challenges of this investigation lie in harmonising augmented and real-world content, including real-time, low-latency tracking of tangible objects and their interaction with the augmented content. The research emphasis is on technology-mediated affordances: for example, what affordance does the HoloLens provide the learner in terms of interactive manipulation or navigation in the virtual environment? We examine this through control-display (CD) gain in conjunction with cyber-physical systems (CPS) approaches. This work builds on knowledge gained from the creation of an AR application for vocational education and training (VET) of stonemasonry.


INTRODUCTION
Delivering simulation-based education and training (SBET) requires a closed-loop system that tracks content interaction and connects it to users' technology-mediated affordances; that is, one that gives visceral and somatosensory feedback about the correctness of the activity. This is particularly important in vocational education and training (VET), especially in craft-based trades such as stonemasonry.
Stonemasonry brings together practical knowledge and manual creativity; it is a trade that tests core functional skills and competencies in design, mathematics, and engineering problem-solving.
While virtual reality (VR) [1,2] has been used to study masonry, the emphasis has been on visualisation, design exploration, and 3D-space perception. To date, only a very limited set of digital resources has been developed to complement and address some of the traditionally more challenging theory-based elements of stonemasonry vocational courses and to develop employability skills [3,4]. An important finding from BEACONING's [5] small-scale pilot studies on VET digital learning was that there is currently no formal training provision for power tools in VET colleges. One tool in particular, the portable power grinder, is responsible for major incidents on construction sites. Machine tools are implicated in an alarming 80% of machine-related injuries in the UK construction sector [6], while OSHA [7] cites some 400,000 emergency room visits per year.
Mixed reality (MR) is viewed as offering a power tool training solution by bringing physical and virtual elements closer together. For VET, mixed reality is potentially a more pragmatic route to formal training than purely computer-based learning. With the support of tangible devices, the Internet of Things, and CPS, this new provision of interactivity not only has profound effects on the way SBET users can develop but also draws research into how mixed reality can become pervasive.
Expanding on previous work [3,4], the power tool training will use the Microsoft HoloLens to provide a virtual environment that can be interacted with using tangible objects. Cyber-physical systems (CPS) are enablers for mixed reality; they are where the embedded world meets the virtual world, and they provide the necessary interface for a truly immersive SBET. The ability to use real-world artefacts to interact with the virtual world allows real power tools (with cutting implements removed and substituted with a prop) to act as proxy interactive devices. Such interfaces for carrying out physical tasks are preferable to purely virtual ones, given the nature of stonemasonry and the need to provide a safe but real connection to the vocation. This paper reports on initial work to establish the requirements that will fulfil power tool training under MR.
One key challenge in any real-virtual continuum is the dynamics and mobility afforded. Understanding the affordances of both the mixed reality technology and its environment requires human-in-the-loop (HIL) and control-display (CD) gains to be examined. A CPS-based approach supports the HIL-CD study of navigation and manipulation protocols.

MIXED REALITY ENVIRONMENT
Mixed reality environments cover a variety of domains, from education to entertainment and training [8]. A mixed reality SBET is essentially a technological integration that combines Kolb's [9] experiential learning theory with Dewey's [9] notion of continuity of experience and interaction. A good example is MARVEL [11], a European project aimed at creating an MR learning environment for mechatronics in vocational education; students access real laboratories through virtual worlds under the MR paradigm. Bosche et al. [12] developed a VET mixed reality HMD system to train construction workers on bricklaying and on health and safety aspects of working on high scaffolds (see CyberBuild, http://cyberbuild.hw.ac.uk/). More recent VET work by the authors [3,4] researched CPS-based game approaches to train and assess workers in realistic construction settings.
Immersion in any virtual environment depends on users experiencing a strong sense of ownership and agency. Agency refers to a learner's sense of being capable of taking actions and making differences [13]. This becomes more relevant in MR environments because the augmentation of the real world and the physically based interaction are intended to enhance sensorial stimuli and bring about a more compelling immersive experience. Touch-Space [14], for example, explores embodied interaction within an MR collaborative setting. Its game space combines different interaction modes (natural human-to-human, human-to-physical world, and human-to-virtual world) to provide a novel game experience ranging from physical reality through augmented reality to virtual reality.
Notable from state-of-the-art surveys of virtual and mixed reality technology is the fidelity of environment required to support reasonable user interaction [15]. In SBET, the navigation and manipulation modalities of content can have implications for learner performance [12,14,15]. If every aspect of interaction in the scenario can be observed in real time, it provides information about the layered connections within the MR environment. Here, CPS-supported HIL and CD-gain analysis can assist in validating the cause and effect of interaction affordances as more complex, compounded actions increase. This allows early identification of problems and prioritisation of solutions, rather than leaving them until after the full MR experience has been created.

CPS-HIL AND CD-GAINS
The rules of interaction among content govern the dynamics of a real-virtual environment. Minimising fluctuations in hand-arm forces is desirable during interactive manipulation of real-virtual objects, as with most everyday motor tasks. Visual (and haptic) feedback is key to minimising force fluctuations and holding the arm steady.
CD gain is the scale by which an input device's movement is mapped to the output display [21]. A CD gain of 1 indicates that the movement of the control device is reproduced exactly on the output; this can be found on peripherals such as a haptic pen. A computer mouse usually has a CD gain of more than 1, meaning that the movement of the device is amplified on the output display, whilst a CD gain of less than 1 produces the opposite effect.
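In its simplest form this mapping is a scale factor, as the following illustrative Python sketch shows (the function name and millimetre units are our own, not part of any cited implementation):

```python
def apply_cd_gain(device_delta_mm: float, cd_gain: float) -> float:
    """Map a control-device movement onto the display.

    A CD gain of 1 reproduces the movement exactly; a gain above 1
    amplifies it (typical for a mouse); a gain below 1 attenuates it.
    """
    return cd_gain * device_delta_mm

# 10 mm of hand movement under different gains:
print(apply_cd_gain(10.0, 1.0))  # 10.0 -> one-to-one, e.g. a haptic pen
print(apply_cd_gain(10.0, 2.0))  # 20.0 -> amplified, e.g. a desktop mouse
print(apply_cd_gain(10.0, 0.5))  # 5.0  -> attenuated, finer on-screen control
```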
Due to the way gesture controls operate on the HoloLens, it is not necessarily optimal to have 1:1 tracking similar to that of immersive headsets with controller devices, such as the HTC Vive or the Oculus Rift. This is because the hand acts as a proxy between the user and the virtual object, and there is usually no direct grabbing of an object as in immersive applications. However, when using applications with image- or object-based tracking, or with sensors mounted on a real object, it is not necessary to account for CD gain, since the real-world object is visible and interacts with the virtual object as if both were in the same environment.
Kroemer [16] showed that, while a 15° gaze angle is most comfortable when looking at distant objects, a much more downward gaze angle is preferred for close objects. According to the International Standards Organisation [17], the optimum position for the most important visual display is 20-50° below the horizontal line of sight (Figure 1). For see-through mixed reality displays, however, this remains unknown. The purpose of manipulating the CD gain in this environment is to optimise the efficiency of moving the virtual objects, but it is also important to consider the HIL aspect of the interaction: the navigation and manipulation of virtual objects need to be effective as well as efficient. A CD gain that is efficient but produces an uncomfortable experience is not enough.
It is not yet known how CD gain will affect interactions in MR experiences. In immersive VR, previous work has shown that changing CD gain values can produce a sensation of increased control [18], and that affordances and perception can be manipulated in an immersive virtual environment [19,20]. Similarly, the effect of CD gain adjustments on mouse operation is well documented [21,22], with a consensus that a certain degree of amplification of mouse movement is beneficial to operation.

METHOD
Addressing the research question (what affordance does the HoloLens provide the learner in terms of interactive manipulation or navigation in the virtual environment?) starts with establishing the extent of CD gains for navigation and manipulation in mixed reality without diminishing immersion. How useful is this cyber-physical mixed reality SBET environment in interfacing virtual content such that users can intuitively work with real-world objects?
Two measures of CD gain for the HoloLens were considered in this initial study: angle/angle (A/A) gain and displacement/angle (D/A) gain. A/A gain refers to the ratio between the angular movement corresponding to the displacement of a cursor (a virtual pen in this case) and the visual angle. The visual angle is calculated from the distance between the eye and a fixed object in space as viewed in the HoloLens, and is given by:

θ = 2 tan⁻¹(H / 2D)

where θ is the visual angle, D is the distance between the user and the hologram, and H is the object height relative to the user. D/A gain refers to the ratio between the linear displacement of the cursor in space (in this case, scribing a line with a virtual pen) and the angle of elbow flexion and extension, or angle and reach.
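The visual angle formula can be checked numerically; this Python sketch (function name is ours) reproduces the 3.58° figure quoted later for the top of the Task 1 canvas (125mm tall at 2000mm viewing distance):

```python
import math

def visual_angle_deg(height_mm: float, distance_mm: float) -> float:
    """Visual angle theta = 2 * atan(H / (2 * D)), returned in degrees."""
    return math.degrees(2.0 * math.atan(height_mm / (2.0 * distance_mm)))

# Task 1 canvas: 125 mm tall, viewed at the 2000 mm focal plane.
print(round(visual_angle_deg(125, 2000), 2))  # 3.58
```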
The visual angle accounts for the height of the force fluctuations viewed on the computer display (Figure 2). A/A and D/A gain are generally captured internally within the application, as demonstrated in Figure 3. While algorithmically deducing the user's field of view and virtual object distance is useful, capturing these parameters with an external, independent CPS will provide more accurate results.

Experimental procedure

Task 1: Participants draw a line across a blank canvas using a virtual pen, representing a stonemason's scriber or glaschrome pencil. The canvas has a fixed size of 2,500 pixels, equivalent to a real canvas of 500mm x 125mm. The view distance is fixed at 2 units within the HoloLens display, representing 2000mm in the real world, which is the focal plane of the HoloLens displays (Figure 11). CD gain scores are computed on how straight a line is drawn; for a canvas 100 units long, for example, the score decreases for every extra unit that is drawn. The reach and motion of the participant are measured using sensors attached to the forearm and bicep. The CD gain of the pen can be adjusted from a control interface as required. The visual angle ranges from 0° at the bottom of the canvas to 3.58° at the top. The setup is shown in Figure 4. For this test, the z-axis (depth) movement of the pen is disabled to prevent the pen passing through the canvas. An additional measure examines the effect on CD gain of the latency induced by the extra time required to render a frame: once the user is halfway along the canvas, the frame rate is reduced from 60 FPS to 20 FPS.

Task 2: The objective is to control and manipulate a virtual object so as to realise direct correspondence between the manipulation of 3D virtual objects and real-world props.
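The Task 1 scoring rule is only outlined above; one plausible reading (our assumption, not the authors' exact metric) is that the score starts at the canvas length and loses one point per extra unit of path length beyond a perfectly straight stroke:

```python
import math

def straightness_score(points, canvas_length=100.0):
    """Score a drawn polyline of (x, y) cursor samples: start from the
    canvas length and deduct the extra path length beyond a straight line.
    """
    path = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    straight = math.dist(points[0], points[-1])
    return max(0.0, canvas_length - (path - straight))

# A perfectly straight stroke keeps the full score:
print(straightness_score([(0, 0), (50, 0), (100, 0)]))   # 100.0
# A wobbly stroke is penalised for its extra length:
print(straightness_score([(0, 0), (50, 30), (100, 0)]))  # ~83.4
```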
Abstracted from clinical balance tests and instrumented posturography, the task is to balance a real table tennis ball on a real table tennis bat, and then to balance a virtual ball on the real bat. A ball balance test is a classic example of minute hand movements in correspondence with spatial control and manipulation affordances. The balance experiment is also representative of controlling a small hand-held power tool as used in stonemasonry, such as a straight grinder or compact polisher. Figure 5 illustrates the setup for which CD gain will be measured. As in Task 1, the reach and motion of the participant are measured through sensor units attached to the forearm, bicep, and bat. This allows a direct comparison between real and virtual object manipulation.

IMPLEMENTATION
Microsoft's HoloLens is well suited as the core MR technology in this study. It is self-contained in that it does not need any other equipment to operate; all functionality is contained within the headset itself. This is important when it is used in VET workshop settings. Figure 6 shows the workspace in which a study was conducted with a cohort of 20 students. The device currently runs a 32-bit modified version of Windows 10 and runs UWP applications.

[Figure annotations: HoloLens; virtual pen; "Drag and Hold" gesture]

Vuforia image tracking was implemented to track the bat in real time. The image target has transparent collision boxes attached according to the shape of the bat, since the position of the target on the bat is always known. The test environment applications were built in Unity v2017.0.3p1 in C#, using the Mixed Reality Toolkit (MRTK) provided by Microsoft to assist with HoloLens development. The applications are built and deployed over a LAN as 32-bit UWP applications. Unity was chosen as the engine for its availability and ease of use, and because it is Microsoft's development environment of choice for holographic applications. Table 1 lists the HoloLens specifics of interest to this study, in particular its field of view.

A set of wireless inertial measurement units (IMUs), each with nine degrees of freedom, was developed. These are attached to the arm and bat to analyse motions attributed to CD gain in the upper arm, lower arm, and bat (Figure 7). The relative motion between the lower arm and the bat is taken as equivalent to simplified wrist motion. Wrist motions were selected for this experiment because they are inherently coupled and vital to the ball balancing task; nevertheless, other motions such as elbow angles can be similarly captured and analysed with respect to CD gain. These angles are directly associated with the D/A gain, which is influenced by the A/A gain.

Figure 7: Experiment setup with IMU axes (blue-x, red-y, and yellow-z).

Figure 8 shows the results of the Unity-based CD gain measurement for the line drawing experiment. The effectiveness of straight-line drawing was maximised with a CD gain of 2 and decreased linearly as the CD gain increased. The frame rate change had a small impact on the results due to the increased frame latency: participants reported that manipulating the pen was less comfortable in this state, but the results showed only minor impacts on usability.
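One way to derive the simplified wrist angle from two IMU orientations is to take the rotation between the forearm frame and the bat frame. This is an illustrative sketch under our own assumptions (unit quaternions in (w, x, y, z) order; the paper does not specify its sensor fusion pipeline):

```python
import math

def quat_conj(q):
    """Conjugate of a unit quaternion (w, x, y, z)."""
    w, x, y, z = q
    return (w, -x, -y, -z)

def quat_mul(a, b):
    """Hamilton product of two quaternions."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw*bw - ax*bx - ay*by - az*bz,
            aw*bx + ax*bw + ay*bz - az*by,
            aw*by - ax*bz + ay*bw + az*bx,
            aw*bz + ax*by - ay*bx + az*bw)

def relative_angle_deg(q_forearm, q_bat):
    """Angle of the rotation taking the forearm IMU frame to the bat
    IMU frame -- a simplified stand-in for the wrist angle."""
    w = quat_mul(quat_conj(q_forearm), q_bat)[0]
    return math.degrees(2.0 * math.acos(max(-1.0, min(1.0, abs(w)))))

# Forearm level, bat pitched 30 degrees about x: wrist angle ~30 degrees.
identity = (1.0, 0.0, 0.0, 0.0)
pitch30 = (math.cos(math.radians(15)), math.sin(math.radians(15)), 0.0, 0.0)
print(round(relative_angle_deg(identity, pitch30), 1))  # 30.0
```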
Figure 9 shows the acceleration of the bat and the wrist angles in both the virtual and real scenarios for the Task 2 ball balance. Small rotations are evident about the y axis, whereas the x and z axes are rotated by large angles (angles are measured in the bat's coordinate system). Additionally, the x and z rotations are approximately mirrored, implying that the predominant biomechanical rotation takes place between these axes as a combination of D/A and A/A, not aligned to any single axis. The wrist-angle histograms, in which the trivial y rotation is ignored as it is generally constant, show that although the same task was given to the participants in both the real and virtual cases, the distributions are not identical. Only a small overlap is observed around the idle angle, where the time spent differs significantly. This indicates a CD problem in the overall MR interaction, which can potentially be caused by various real-time factors: inaccuracy in the physics modelling, inadequate update frequency, tracking inaccuracies, limited field of view, interaction latencies, and so on.

RESULTS AND DISCUSSION
Exhaustively measuring and analysing the decoupled effects of these individual components is an enormous technical challenge and difficult to generalise across application scenarios; a full CD gain analysis in this context is beyond the scope of this paper. However, this study shows that a CPS setup allows a more revealing and sensitive study of CD gains under MR.

Effect and mapping of CD gain for HoloLens MR Environments
While the HoloLens is currently perhaps the most suitable technology for workshop MR SBET, it has some limitations that make development and use challenging.
Near-field clipping affects all VR/MR development. The HoloLens displays are fixed to a focal plane 2.0m away from the user. When holograms are closer than 2.0m, they cannot be displayed correctly across the two screens due to the convergence of the image, as the hologram begins to disappear from one eye; this makes holograms difficult to focus on. To accommodate for this effect, the recommended optimal zone for hologram placement is between 1.25m and 5m, as shown in Figure 11.

Figure 11: Optimal hologram placement for HoloLens [25]

The vergence-accommodation conflict can degrade spatial interactions, causing fatigue and discomfort [1], not least affecting the user's CD response. Task 1 is an example of CD gain with respect to vergence and accommodation responses in near vision. In trying to achieve as straight a line as possible, participants tended to control the virtual pen with an extended arm posture. A CD gain setting of 2.0 was found to be most appealing to users, showing that despite the extra layer of abstraction in the manipulation, a certain degree of acceleration is desirable for comfortable operation.
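A simple mitigation applications can apply is to keep hologram placement inside the comfortable zone; the following sketch (our own helper, using the 1.25m/5m bounds quoted above) clamps a requested distance:

```python
def clamp_hologram_distance(d_m: float, near: float = 1.25, far: float = 5.0) -> float:
    """Clamp a hologram's requested distance (metres) into the
    recommended 1.25-5 m placement zone around the 2.0 m focal plane,
    limiting vergence-accommodation discomfort."""
    return min(max(d_m, near), far)

print(clamp_hologram_distance(0.6))  # 1.25 -> pushed out of the clipping zone
print(clamp_hologram_distance(2.0))  # 2.0  -> on the focal plane, unchanged
print(clamp_hologram_distance(8.0))  # 5.0  -> pulled into the optimal zone
```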
The relationship between frame rate and latency and their effects on target control and manipulation was investigated in both Tasks 1 and 2. Frame rate influences target perception in the visual and motor space, and thus affects the dimensions of CD gain; hand-eye coordination is directly linked with movement and posture during interactive control and manipulation of virtual objects in the MR environment. A baseline condition of 60 FPS with no added latency was taken for this study, meaning that 20 FPS induces an extra 33ms of latency due to increased frame persistence. The results indicate that while frame rates below the baseline can impact results, the degree to which they impacted users' performance was inconsistent. Drops below 60 FPS can also induce a "swimming" effect, where holograms appear to move around the room slightly, causing discomfort. This can be mitigated with a "stabilisation plane", whereby Unity chooses the best focal point for the scene and passes it into a function that stabilises the objects in the user's gaze. It is therefore important that applications are developed with a 60 FPS target, which can prove limiting given the HoloLens' limited graphical processing power.
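The 33ms figure follows directly from the frame times; a quick sketch (function names are ours) makes the arithmetic explicit:

```python
def frame_time_ms(fps: float) -> float:
    """Persistence of a single frame at the given frame rate."""
    return 1000.0 / fps

def induced_latency_ms(fps: float, baseline_fps: float = 60.0) -> float:
    """Extra latency relative to the 60 FPS baseline used in the study."""
    return frame_time_ms(fps) - frame_time_ms(baseline_fps)

# Dropping from 60 FPS (16.7 ms/frame) to 20 FPS (50 ms/frame):
print(round(induced_latency_ms(20), 1))  # 33.3
```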
The final relevant issue with HoloLens development concerns the field of view (FOV) of the displays and sensors on the device.
Firstly, the displays have a horizontal FOV of approximately 30 degrees and a vertical FOV of approximately 17 degrees. By the standards of immersive headsets this is very limited (the HTC Vive has a 110 x 110-degree FOV). As a result, the borders of the display become obvious when looking at a hologram too closely or when looking around a room at holograms. The only mitigations are to make holograms, especially text, small enough to be viewed in their entirety at a comfortable distance (nominally as close as possible to 2m, as discussed above) and to ensure that the positions of any necessary off-screen holograms are clearly indicated.

Figure 12: The gesture frame [25]

For gestures to be detected, the user must perform them within a "gesture frame", shown in Figure 12; gestures outside this frame are not detected by the device. Despite the limited FOV of the displays, the Kinect-derived depth sensors used in the device have a 120 x 120-degree FOV.
A related issue is the position in which gestures are performed. Gestures anywhere outside the frame are not detected, as discussed above, but it is not necessary to perform the gesture in the centre of the field of view. A common action on first use is to outstretch the arm as far as possible and perform the gesture in the centre of the frame. This leads to arm fatigue and can also affect the user's view of the hologram, as the hand will occlude it. This is demonstrated in Figure 13.
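The geometry of the detection region can be sketched as a simple cone test. This is an illustrative approximation under our own assumptions (head-relative coordinates with z forward, and the 120° x 120° sensor FOV quoted above; the actual gesture frame is not published as a simple cone):

```python
import math

def in_sensor_fov(x: float, y: float, z: float,
                  h_fov_deg: float = 120.0, v_fov_deg: float = 120.0) -> bool:
    """Rough test of whether a hand at (x, y, z) metres in the head frame
    (x right, y up, z forward) falls inside the depth sensors' field of
    view -- a necessary condition for a gesture to be detected."""
    if z <= 0:
        return False  # behind the head, never visible
    h_angle = math.degrees(math.atan2(abs(x), z))
    v_angle = math.degrees(math.atan2(abs(y), z))
    return h_angle <= h_fov_deg / 2 and v_angle <= v_fov_deg / 2

print(in_sensor_fov(0.0, -0.2, 0.4))  # True  -> hand slightly below, in front
print(in_sensor_fov(0.9, 0.0, 0.3))   # False -> hand too far to the side
```

Because the detectable region is far wider than the display FOV, gestures can comfortably be performed low and close to the body rather than at arm's length in the centre of view.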

CONCLUSION
The application of SBET varies considerably between industry and educational institutions. The common thread, however, is to educate and support the learning of commonly performed skills and procedures, to orientate learners to practices, and to assess skills. The experiments have indicated that there is a certain amount of inherent unevenness in the MR interaction, created as a result of the diverse interconnected components. They have also indicated the plasticity of the human mind and motor response in working around uneven environments, which makes the interaction challenging to quantify. Hence, the use of CPS warrants future work in establishing human-computer interactions under a mixed reality setting.
The experiment with wrist motions as a fundamental element of the mixed reality control task can be viewed as demonstrating the variability and effect of CD gain. Synchronisation and interoperation of CPS in mixed media present interesting problems, none more so than the specialised human-in-the-loop interactions that require careful configuration of the default parameters of each respective engine. The experiments suggest that a CD index of 2.0 can improve the human-in-the-loop (MR) experience. In relation to the demands of target-distance control, the method (Figure 10) of assessing simultaneous vergence and accommodation responses shows a high probability of a linear, cross-linked relationship within our binocularly normal population. A larger population is planned for the coming studies.
Both tasks were specifically designed to be abstract and lightweight to ensure the virtual environment effects are minimal. Even under such circumstances the CD artefacts caused by the Unity Engine, Vuforia and HoloLens hardware were evident.
The long-term vision is for CPS-based interfacing to provide greater technological affordances to users under dynamic interaction requirements, such as using a real power tool in the mixed reality environment. It is anticipated that the HoloLens, in conjunction with CPS-based interactive devices, will provide apprentice stonemasons with a more realistic connection to real-world practices. The system can nonetheless be tuned to characterise and adjust to various complexity levels in line with the required training standards.