Towards User Generated AR Experiences: Enable consumers to generate their own AR experiences for planning indoor spaces

Communication with a customer or future user during the planning and design phase is crucial in applications such as interior design and furniture retailing. Augmented Reality (AR) has the potential to make these communication processes highly effective and provide a better experience for the customer. Current AR authoring solutions are quite complex and require manually creating scenes or rely on objects prepared with even more complex applications such as CAD tools. However, both design experts and their customers often lack the IT skills to use these tools. In addition, many practical cases involve changing reality rather than just adding to it, thus requiring the use of Diminished Reality (DR) technologies. This paper presents a comprehensive analysis of the requirements of both professionals and consumers (gathered using user surveys and individual interviews) for a lightweight and automated authoring process of AR and DR experiences, deriving a set of requirements that can be aligned with state of the art technologies and identifying a number of challenges for AR research.


INTRODUCTION
In many application areas (e.g. interior design, furniture retailing or renovation), communication with a customer or future user during the planning and design phase is crucial to select the right products and configurations. Making this communication process effective saves costs, avoids later modifications, and results in providing tailored solutions and higher customer satisfaction. Augmented Reality (AR) has the potential to achieve this, and becomes a suitable technology with sensor-rich mobile devices and headsets being more widely available and affordable. However, the scene content needs to be created by experts from the respective domains, who often lack IT and media skills. At the same time, it is necessary to provide a lightweight AR experience for the customer.
Current AR authoring solutions are quite complex and require manually creating scenes, or rely on objects prepared with even more complex applications (e.g. CAD). There are some general and domain-independent authoring tools targeting expert users (e.g., Reality Composer 1), but they provide a lower degree of automation. VimAI 2 provides services for reconstructing and analysing indoor spaces using video. Vuforia Studio 3 is an environment for industrial AR (targeting Microsoft HoloLens 4), providing interfaces to CAD models and IoT/systems data. interiAR 5 is an app based on ARCore to insert floors and furniture selected from manufacturer catalogues into captured interior scenes. Marxent 6 develops an AR platform and provides solutions for furniture visualization. ViewAR 7 provides an AR SDK and an application to build AR applications. Consumer AR applications for interior design and furniture retail include Amikasa 8, Cylindo 9, DecorMatters 10, FloorPlanner 11, HomeStyler 12, Housecraft 13, Houzz 14, IKEA Place 15, Matterport 16, Roomle 17, Roomsketcher 18, and RoOomy 19. However, these apps typically rely on plans imported from CAD applications or entered by the user, and do not yet provide DR support.
1 https://developer.apple.com/augmented-reality/reality-composer/
2 https://vim.ai
3 https://www.ptc.com/en/products/vuforia/vuforia-studio
4 https://www.microsoft.com/en-us/hololens/
5 https://interiar.co
6 https://www.marxentlabs.com/
We argue that a democratization of AR authoring is needed, so that domain experts as well as consumers are able to author and modify AR experiences as a means of communication. The goal of our work is to enable the authoring of AR experiences directly within real-world scenes, where content authors will pre-define placement options for computer-generated (CG) 3D elements within the real-world, allowing their users to experience the authored AR content and interact with it. The solution shall offer true authoring of AR experiences while abstracting and hiding the underlying complexity. It shall be lightweight and usable on mobile devices, offering a straightforward and easy to use authoring workflow for non-experts. In order to reach this goal, three important aspects need to be addressed.
• User as creator. Putting every user in the role of a creator will blur the line between authoring and consuming an AR scene, which are currently done in specific tools. The authoring functionalities require adding more powerful interaction options, which makes UX design a challenging task.
• Automation. Authoring a scene consists of a number of complex tasks, some of which require expert knowledge. In order to enable users without these specific skills to create AR scenes, a high degree of automation is needed. Recent advances in Artificial Intelligence (AI)-based visual scene understanding enable this automation for constrained environments (such as private or office indoor scenes). These technologies enable detecting the layout of the captured room as well as the objects present, together with their boundaries and dimensions.
• Replacing real objects. The realism of AR is severely degraded when the 3D objects added to the scene clash with real objects. However, many applications will require changing reality rather than just adding to it. This requires the use of Diminished Reality (DR) technologies, which enable visually concealing real objects. In current AR authoring and presentation solutions, the support for DR is still very limited. In order to diminish real objects with minimal user interaction, existing objects in the scene must be segmented, so that the region to be removed can be chosen with a single selection, and the background must be reconstructed with methods such as inpainting.
Earlier work on analyzing user requirements for AR focuses on outdoor applications, such as in the cultural heritage domain [6], or on indoor applications such as augmenting media consumption [8] or visualizing data [7]. A paper discussing the use of AR for consumer engagement [9] addresses some relevant aspects related to consumer groups and their motivations, but does not derive requirements from them.
The main contributions of this paper are a comprehensive analysis of the requirements of both professionals and consumers for a lightweight and automated authoring process of AR and DR experiences, the derivation of a set of requirements that can be aligned with state-of-the-art technologies, and the identification of a number of challenges for AR research. The rest of this paper is organized as follows. Section 2 presents the methodology and results of the user requirements analysis. Section 3 discusses the derived technical requirements and their alignment with technologies such as AI-based scene understanding. Section 4 concludes the paper by identifying requirements beyond the current state of the art and future research directions.

ANALYSING USER REQUIREMENTS
We aim to develop an AR authoring solution that will truly benefit from pre-authored content creation and the reality removal possibilities offered by DR for the domain of interior design. For this domain the authored AR experiences are very beneficial as they effectively convey the designers' concepts to their clients, as well as enabling high-quality feedback to be efficiently obtained from clients, thereby providing a more cost-effective and fruitful overall information exchange mechanism. We thus address two main target user categories: (1) professionals working with interior design and (2) consumers improving their homes.
Users in category 1 are professionals working with selling home or office furnishings, those working with selling/renting private or commercial properties, and interior designers assisting either professional clients (e.g. architects, estate agents) or consumers (e.g. homeowners/renters). Users in category 2 are those redecorating/renovating their homes and/or looking to buy new furniture.

Methodology
Our user-centred design process is based on [1] and covers the following four phases:
1. Understanding and specifying the context of use
2. Specifying the user requirements
3. Producing design solutions
4. Evaluating designs against requirements
These phases are carried out in an iterative fashion, with the cycle being repeated until usability objectives have been attained. This is illustrated in Figure 1.
The project has so far conducted extensive work in analysing the context of use (target users and usage situations) and investigating user requirements. However, it is natural in a user-centred process that these are also revisited as design and development work progresses: new user requirements often emerge once users understand more about what different design solutions may offer and see how they could use them. There will be a high degree of iterative development, especially between phases (3) and (4), as feedback from evaluating designs with users will be important in steering the design work.

Table 1. User research activities.
User surveys.
  Purpose: [...] identify and recruit potential participants to forthcoming user research and testing activities.
Individual interviews.
  Description: Individual interviews of between 40 and 60 minutes in length and conducted over an online video conferencing tool. Interviews were semi-structured, allowing the exploration of relevant questions and needs as appropriate to the user being interviewed.
  Participants: 30 users of different planning tools. Most were based in Europe (including Austrian, British, Dutch, German, Swedish and Swiss users) and several from the USA. In addition, a few experts in the area (with insights into user needs) were also referred to.
  Purpose: Obtain an in-depth understanding of user requirements in relation to the different ways in which they go about planning interior spaces today.

User requirements gathering activities
Two main types of user research activities were conducted and are summarized in Table 1. Furthermore, to better understand the 'state of the art' of interior design and planning apps (including how AR is currently utilized), partners in the project identified potentially interesting existing solutions ('competitors'). These were reviewed and walked through to understand the extent of functionality available and, from an end-user perspective, the pros and cons of different function/feature implementations. The review included 15 apps that were installed and tested and 6 other tools that could be read about or demoed online.

Highlights from the user surveys
The data from these surveys gave a variety of insights into the situations in which both Roomle's app and other commonly used tools are used. The most common room that was worked with in a tool was the living room (57%), closely followed by kitchen (50%) and bedroom (50%). Respondents mentioned experiencing the following types of weaknesses with existing tools:
• Problems entering correct measurements (particularly for rooms that weren't completely square).
• That such tools generally give a poor feeling of how the result would actually look in the room, i.e. the room doesn't feel like their own.
• Difficulties in placing furniture accurately.
• Limitations or difficulties in moving around the room / seeing different views.
• Limitations in the furnishings/decorations that can be added to the room (e.g. when trying to find furniture to represent what is already there / that will be kept).

Highlights from the user interviews
The three main outputs presented in this subsection were created based on the findings from the user interviews. Details about the analysis of users can be found in [2].

Personas.
Personas [4] have been used in our project to summarize and communicate key user characteristics in a manageable, memorable and easy-to-overview format for all project partners, and to ensure that design decisions are made with users and their needs in focus. Three personas for each user category (professionals and home users) were documented. Each persona represents a portion of the target user population and is created based on data collected from multiple individuals.

User journey maps.
User journey maps [5] were created for each of the personas, illustrating the different touchpoints that they may have with the planning tool and the actions they may take in order to understand the overall picture of interactions from a user's perspective. These give a picture of how different product interactions fit together in addressing user needs.

Functional requirements.
This analysis of requirements resulted in a detailed list of functional requirements, assessed by importance for different user types, technical feasibility and alignment with the project scope. General requirements address the planning of single or multiple rooms, proposals for layouts and furniture, as well as exporting and sharing, including collaborative editing. Users are interested in measuring rooms with little effort, with the option to get higher accuracy in specific cases, and also in measuring separate objects (e.g., existing furniture in other rooms). In the AR experience, users expect a high degree of realism, including alignment of virtual objects with real objects and avoidance of overlaps, responsiveness and the ability to handle virtual and real objects in the same way. Users expect to continue work in VR when offsite, with similar functionality and experience. In both modes users expect to change furniture objects as well as wall, floor or ceiling decorations, or even remove walls. For DR, users expect to select objects (as well as fixtures and fittings) with one click, or with as little interaction as possible if objects were not correctly segmented, and smooth updates of diminished objects when they move around.

TECHNOLOGIES ADDRESSING REQUIREMENTS
The identified user requirements lay the ground for the definition of the envisaged system's technical requirements. The process of extracting them first required the technical team to perform a technical feasibility review, crossing out components and functionalities that the current technological stack could not easily, or robustly, support. Then, after preliminary technical implementation descriptions were produced for the remaining requirements, the user experience team reviewed them from an ease-of-use point of view, similarly crossing out those not passing the usability bar (e.g. scanning processes which are user-unfriendly). Finally, both teams analysed the remaining user requirements and their technical mappings to develop a homogeneous work plan, removing components that would disrupt a consistent user workflow, or that would be disconnected from the core workflow. As a result, the technical requirements were decomposed into three core components (see [3] for more details):
• i) the backend AI services that will support the authoring experience,
• ii) the authoring view, where users (either consumers or professionals) can manipulate the authored scenes, and
• iii) the mobile AR view, where the users can inspect the modified scene.

AI services
The AI services run in an on-demand manner, offering higher level information about the scene, and comprise the innovation driving next-generation AR experiences.
In the context of interior scene manipulation, metric-scale, accurate 3D scene reconstruction is an important requirement as it relates to both AR and the use case itself. While AR technology implicitly reconstructs the scene to achieve the spatial emplacement of virtual elements, this reconstruction is limited to what the sensor has observed. Covering a whole room therefore means scanning it with the sensor, a tedious process identified as one of the 'pain points' of the planning process, given that planning also relies heavily on reliable measurements.
Another important advantage of a scene's preemptive 3D reconstruction is the automatic emplacement of AR objects, with only minor manual placement activities, which is expected to greatly improve the user experience and authoring workflow. This additionally opens up the possibility of authoring the scene in 3D.
Furthermore, user interactions for planning and refurbishing revolve mostly around objects and object manipulation. Therefore, the system's second technical requirement is the support for object-level scene understanding with a two-fold goal: i) to drive the authoring aspect of replacing objects by offering rich metadata on top of the 3D reconstructed scene (e.g., for snapping), and ii) to drive the diminishing aspects by marking the areas to be removed before adding AR content.
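To make the second goal more concrete, the following is a minimal sketch of how a per-pixel instance segmentation could turn a single click into the area to be removed. It is an illustration only, not the system's implementation: the label-map representation, the function name `select_object_mask` and the toy data are assumptions made for this example.

```python
from typing import List, Set, Tuple

# Hypothetical representation: 0 = background, k > 0 = id of a segmented object.
LabelMap = List[List[int]]
Pixel = Tuple[int, int]  # (row, col)


def select_object_mask(labels: LabelMap, click: Pixel) -> Set[Pixel]:
    """Return all pixels of the segmented object under a single click."""
    row, col = click
    object_id = labels[row][col]
    if object_id == 0:
        return set()  # the click hit the background, nothing to diminish
    return {
        (r, c)
        for r, line in enumerate(labels)
        for c, value in enumerate(line)
        if value == object_id
    }


# Toy 4x5 label map with two segmented objects (ids 1 and 2).
labels = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 2],
    [0, 1, 1, 0, 2],
    [0, 0, 0, 0, 0],
]
mask = select_object_mask(labels, click=(2, 1))  # pixels of object 1
```

If the segmentation is wrong, the returned mask is what a user would have to correct manually, which is why the quality of the object-level understanding directly determines how little interaction is needed.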
This leads us to the final technical requirement, which is AI-based layout recommendation, aiming to provide the layout author (i.e. interior designer, renovator, user) with automatic suggestions for updating the current scene's layout. Based on the different user requirements, this functionality should be conditioned on two levels: a coarse level, based on the type of the scene or room, and a fine level, which relies on the relative object layout. From a technical standpoint, this requires the extraction of the current scene graph, and a conditional generation of recommended variations.
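As an illustration of the two conditioning levels, the sketch below builds a very small scene graph from floor-plane object positions. The class names, the single "near" relation and the 1.5 m threshold are assumptions chosen for this example; the generative step that would propose layout variations from such a graph is not shown.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class SceneObject:
    name: str
    x: float  # floor-plane position in metres
    y: float


@dataclass
class SceneGraph:
    room_type: str  # coarse conditioning signal
    objects: List[SceneObject]
    relations: List[Tuple[str, str, str]]  # fine level: (subject, relation, object)


def build_scene_graph(
    room_type: str,
    objects: Sequence[SceneObject],
    near_threshold: float = 1.5,  # assumed value, in metres
) -> SceneGraph:
    """Derive pairwise 'near' relations from the relative object layout."""
    relations = []
    for i, a in enumerate(objects):
        for b in objects[i + 1:]:
            if hypot(a.x - b.x, a.y - b.y) <= near_threshold:
                relations.append((a.name, "near", b.name))
    return SceneGraph(room_type, list(objects), relations)


graph = build_scene_graph(
    "living room",
    [
        SceneObject("sofa", 1.0, 0.5),
        SceneObject("coffee table", 1.0, 1.6),
        SceneObject("bookshelf", 4.0, 3.0),
    ],
)
```

A recommender conditioned on `graph.room_type` alone would operate at the coarse level, while one that also consumes `graph.relations` would operate at the fine level.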

Authoring view
The results of AI services will be consolidated in an annotated 3D view that will offer an elevated understanding of each scene/room compared to traditional photos and will facilitate the interactions required to process the scene by adding and removing elements.
The first associated technical requirement relates to the different (3D) views it will offer, which aim at aiding the planning and recommendation process. Given that the content will be 3D reconstructed, axis-aligned views should be supported (i.e. a floorplan view, or other planning-related ones). Apart from different 3D views, the authoring tool should also offer different layout proposals to ease the design and planning workflow, while at the same time offering the necessary pluralism. Further, users should also experience a connected workflow, with product-related information and annotations (i.e., other similar products). This translates to an integration with product catalogues, which can be extended to generic asset importing as well. The latter will allow for the embedding of real-sized humans, improving the perception of the recommendations' scale. Capitalizing on the 3D nature of the authoring tool and the object-level information, 3D snapping should also be supported to further smooth the authoring workflow. Finally, while the AI services will support automation, speeding up authoring, it is also important to provide human-in-the-loop features that will increase the robustness of the authoring experience, by allowing users to manually annotate or correct the results of the AI services.
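The axis-aligned floorplan view and the snapping behaviour can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the tool's actual logic: the room is taken to be an axis-aligned rectangle with one corner at the origin, the up axis is assumed to be y, and the function names and the 0.2 m tolerance are hypothetical.

```python
from typing import Tuple

Point3 = Tuple[float, float, float]  # (x, y, z) in metres, y is up (assumed)
Point2 = Tuple[float, float]


def to_floorplan(point: Point3) -> Point2:
    """Project a 3D point to a top-down floorplan view by dropping the up axis."""
    x, _, z = point
    return (x, z)


def snap_to_walls(
    center: Point2,
    half_size: Point2,
    room_size: Point2,
    tolerance: float = 0.2,  # assumed snapping distance in metres
) -> Point2:
    """Move an object's footprint flush against any wall closer than `tolerance`."""
    snapped = []
    for c, h, extent in zip(center, half_size, room_size):
        low_gap = c - h              # gap between the object and the wall at 0
        high_gap = extent - (c + h)  # gap between the object and the far wall
        if abs(low_gap) <= tolerance and abs(low_gap) <= abs(high_gap):
            c = h
        elif abs(high_gap) <= tolerance:
            c = extent - h
        snapped.append(c)
    return (snapped[0], snapped[1])


# A 0.8 m x 0.6 m cabinet dropped 15 cm away from the wall of a 4 m x 3 m room.
position = to_floorplan((0.55, 0.0, 1.5))
snapped = snap_to_walls(position, half_size=(0.4, 0.3), room_size=(4.0, 3.0))
```

In a full system the snapping targets would come from the object-level scene understanding (walls, floors, other furniture) rather than from a fixed rectangle.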

Mobile AR view
Through the mobile AR view users will be able to view the recommendations within the real-world scenes that will be renovated/refurnished. One of the core technical requirements is the capability of manipulating the augmented elements, which has been a recurring theme during the user requirement collection phase. To achieve this, a combined UI/UX design approach will need to identify the best options for such a functionality. The goal is to minimize manual user activities and allow a 'fire-and-forget' AR experience. However, in cases where users want to explore options, the selection of different alternative objects and/or configuration objects, as well as changing objects' positions, shall be supported. The other important core technical requirement is the removal of objects. This functionality is the differentiator between an enhanced, planning-oriented AR experience and traditional AR. Whereas the latter only superimposes synthetic elements onto real-world scenes, with DR it is possible to manipulate the real-scene view as given from the camera in order to provide a more realistic depiction of object replacement. The goal will be to initially remove an object and replace it with background information, before augmenting the scene with its replacement.
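The order of operations described above (first diminish, then augment) can be illustrated on a toy grayscale image. The fill used here is a deliberately naive stand-in for real inpainting: masked pixels are filled from the outside in with the mean of their already-known 4-neighbours. The function names and the image representation are assumptions for this sketch.

```python
from typing import Dict, List, Set, Tuple

Image = List[List[float]]  # grayscale intensities, row-major
Pixel = Tuple[int, int]    # (row, col)


def diminish(image: Image, mask: Set[Pixel]) -> Image:
    """Fill masked pixels from the outside in with the mean of known 4-neighbours."""
    result = [row[:] for row in image]
    unknown = set(mask)
    height, width = len(image), len(image[0])
    while unknown:
        filled: Dict[Pixel, float] = {}
        for r, c in unknown:
            known = [
                result[nr][nc]
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= nr < height and 0 <= nc < width and (nr, nc) not in unknown
            ]
            if known:
                filled[(r, c)] = sum(known) / len(known)
        if not filled:
            break  # the mask covers the whole image, nothing to propagate from
        for (r, c), value in filled.items():
            result[r][c] = value
        unknown.difference_update(filled)
    return result


def augment(image: Image, virtual_pixels: Dict[Pixel, float]) -> Image:
    """Draw the rendered pixels of the virtual replacement over the image."""
    result = [row[:] for row in image]
    for (r, c), value in virtual_pixels.items():
        result[r][c] = value
    return result


# A bright real object (99.0) on a uniform background (10.0).
image = [
    [10.0, 10.0, 10.0],
    [10.0, 99.0, 10.0],
    [10.0, 10.0, 10.0],
]
diminished = diminish(image, mask={(1, 1)})           # step 1: remove the real object
final = augment(diminished, {(1, 1): 50.0})           # step 2: add its replacement
```

Doing the removal first matters whenever the replacement is smaller than, or placed differently from, the removed object: any pixel not covered by the virtual element then shows plausible background instead of the old object.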
Moreover, multi-device support, a standard feature, is also essential for reaching wider audiences. The AR experience should also be delivered to and stored on the users' devices to allow for offline AR experience playback, without requiring internet access. At the same time, 6 degrees of freedom (6DOF) viewing, a straightforward feature for AR, becomes considerably more complex due to the employment of DR. Thus, the removal of real-world elements should support 6DOF viewing. Finally, while AR technology has a real-time aspect, it can also be delivered on pre-captured images. This addresses the user requirement to make modifications to the scene when being off-location, or to discuss the scene with people off-location. This allows for pre-emptive AR experiences, which, when also considering the 360° nature of the input, can accommodate augmented 360° panorama experiences.

CONCLUSION AND FUTURE WORK
In this work we have presented the progress in designing a system for novel user experiences related to indoor (re-)planning and refurbishing. The goal is to leverage emerging interactive technologies to immersively engage the user in this process, and allow for efficient and effective remote communication of the user with professionals. This way, the traditionally highly physical process of updating a room's layout and arrangement can be automated. To elevate the resulting experience, truly novel features enabled by the combination of AR and DR are found to be instrumental, paving the way for next-generation applications.
We follow a user-driven design process and, as partly expected, the expectations were very high when viewed from a technical perspective. Some of these are related to future challenges where, even though progress is being made, the required level of maturity has not yet been reached. The need for realism in AR is mainly hindered by the complexity of image formation with respect to lighting, a very challenging and ill-posed problem, which is nonetheless very relevant for interior design. Another prominently requested feature is the capability of extracting objects from within the scene and re-positioning them. Technically this translates to monocular 3D reconstruction of objects, another challenging and under-constrained problem, which, again, is very important for these types of applications.