Snapwear: A Snapchat AR filter for the virtual tryon of real clothes

In this paper


INTRODUCTION
Developments in Augmented Reality (AR) and its availability on a variety of devices have produced numerous examples of innovation in its use, particularly in mobile, handheld environments. In addition to mobile phones becoming powerful enough to handle the intense computational needs of AR applications, we are seeing new frameworks [3] being built on top of processors or even existing applications that utilize this power efficiently. As a result, the newest user experiences with AR have been seamless and have been capturing the attention of their audience at a fast-growing pace. Most existing AR experiences are targeted towards changing facial characteristics. Our objective in this work is to develop an AR experience which allows the user to visualize real-world clothing (i.e. clothing designed for production by fashion brands) on their body. As a result, the proposed work is closely related to social media platforms and the available AR experiences, and to the recent technological shifts of the fashion industry.

Social Media and AR
The field of social media marketing has been utilizing tools such as posts and stories for promoting advertisers' products and services. However, there is an evident shift of interest towards AR as a new tool for targeted marketing [6]. AR experiences have been identified as a marketing technique with a major impact on consumer behavior as they provide an immediate interaction between the prospective client and the product or service that is being advertised. The aspects of interactivity and vividness in the display of the products have been shown to lead to immersion and, ultimately, to media usefulness and enjoyment.
Social Media companies have implemented their own software tools for creators to develop such AR experiences, which can be viewed through their respective platforms. Already existing applications such as Snapchat and Instagram are exposing their users to the magic of AR via filters, which can also be referred to as "effects" or "lenses". AR filters are computer-generated effects layered over the real-life image displayed by a mobile camera. Pictures or videos captured can then be shared via stories, posts, or direct messages.
Facebook Inc. has developed Spark AR Studio, the software needed to create AR experiences for the Instagram and Facebook mobile applications. Snap Inc., in turn, has developed Lens Studio, which targets their Snapchat app. These development frameworks have been widely used to create AR experiences that can capitalize on these platforms' user base. Despite sharing the same goal of helping the creator make a filter, these tools differ in important ways in how they are used and in the features they implement.
Since the aim of this work is to explore how garment try-on could be implemented in a social media AR environment, body tracking and cloth simulation are the key features for comparing these frameworks. Face tracking is generally more advanced than body tracking and has good support in both Spark AR and Lens Studio. This is in line with the fact that the vast majority of existing AR filters on both platforms are face-related (e.g. adding make-up, glasses and other effects to one's face). On the other hand, three-dimensional body tracking has only recently become available in Snap's Lens Studio 3.4, released 2021-02-02, whilst Facebook Inc. has not yet included support for full-body tracking in their Spark AR framework at the time of writing. Lens Studio also provides out-of-the-box components for physics simulation (in particular, chain physics), which can be used to enable basic cloth simulation by offering the illusion that the garment is affected by gravity. Although such features could potentially be implemented for Spark AR as well, they would need to be developed from scratch and added using its Scripting API.

Fashion Industry and Technology
The fashion industry has been transformed by social media: advertisements are now targeted toward a specific audience that is defined by the marketing specialist in great detail. Moreover, product information is more readily available to the consumer since they can preview a collection on the respective mobile application. Last but not least, it has been shown that more intricate characteristics of a brand's philosophy, such as minimalism and luxury, can be sensed by the user by looking at the brand's Instagram profile.
It is also important to consider the fashion industry's turn to technology in light of the current hygiene crisis. The COVID-19 pandemic has caused major transformations in human behavior and, more importantly, in everyday aspects of life. It is natural that physical try-ons will be minimized and that people will be more willing to try on clothes virtually with the current health concerns in mind.
With respect to the internal operations of the fashion industry, in the past years there has been a shift from the traditional design process (i.e. on paper) towards a more contemporary design process, which relies heavily on 3D Fashion Design software. There are currently different options available as far as design software is concerned. The two state-of-the-art systems that are used widely by a variety of clothing companies are Clo3D by CLO Virtual Fashion Inc. and VStitcher by Browzwear. The two systems offer very similar features such as modular design, 2D pattern design, 3D simulation, 3D garment editing, fabric simulations, etc.
The use of such software speeds up the design process, as digital iterations can happen at a much faster rate than physical design iterations. Additionally, this creates the opportunity for various stakeholders, such as product managers and consumers, to get a first glimpse or give feedback without the need for physical production of the garment. However, such systems are used by corporations to create garment designs that will later be used for production, and this results in production-ready designs that are not appropriate for visualization in an AR environment such as Lens Studio.
The purpose of this paper is to document the steps needed to prepare a garment for use in an AR environment. Moreover, the project aims to explore both user interactions with the garment in an AR experience setting and the degree of realism that can be achieved as far as the simulation of physical forces is concerned. This is achieved through the development of a Snapchat effect as a promotional tool for the fashion sector. The user will be able to overlay a selection of garments on themselves through the use of their phone's camera, as well as interact with these garments by moving around in space and changing their colorway. Chain physics is also applied to provide the effect of gravity on the garment and add to the realism of the simulation.

Creative Utilization of 3D Garments
Advancements in 3D design led to the rise of digital-only clothing companies such as The Fabricant, a digital fashion house that is leading the fashion industry towards a new sector of digital-only clothing. The Fabricant has made available a variety of digital garments that have been developed by artists and cloth designers. This digital-only fashion can be used and traded in VR and AR environments.
Louis Vuitton, a luxury fashion brand, has utilized augmented reality to raise awareness of its handbag collaboration with Japanese artist Yayoi Kusama through a mobile application. Moreover, they have launched their Objets Nomades limited-edition collection of design objects and furniture. These 3D objects can be visualized in real space using augmented reality and native frameworks for mobile devices.
Such fashion companies are sponsoring competitions for the creative rendering of digital garments. For example, the Making Strides competition was sponsored by The Fabricant in collaboration with Adidas and Karlie Kloss. It involved the usage of a 3D iteration of the Wind.RDY Parka Jacket and asked the world's emerging digital creators to let their imagination run wild and create, share and submit their designs.
It is important to note that all the garments mentioned above have been designed in 3D fashion software with the intent of 3D visualization and not actual production. This hints at the challenges posed by the utilization of production-ready garments in the field of AR visualization.

Virtual Try-On Software
There have been previous applications of AR in the fields of cosmetics and eyewear. Through Sephora's "Virtual Artist" mobile application, the user can get a virtual makeover by previewing different styles of makeup on their face, including lipstick, eyeshadows, and false lashes. Moreover, Warby Parker includes an AR feature in their application that enables users to virtually wear several glasses models. These mobile applications showcase the implementation of face tracking and face segmentation features of AR frameworks. The absence of body tracking features hints at the intricacies of its implementation.

AR Filters
It is important to note a submission to the Making Strides competition by The Fabricant which entailed the development of a Snapchat filter, very similar to the one showcased in this paper. Egidijus Uckuronis, the creator, used the 3D design of the jacket provided by The Fabricant and visualized it on the user's body using the 3D body tracking feature. However, the garment, as mentioned above, was intended for 3D use and not production. In contrast, the main research question explored by this paper is: "What are the required steps to turn a production-ready 3D garment into a format that is optimal for use in a virtual try-on scenario, especially in an AR environment such as Snap's Lens Studio?"

METHODOLOGY
The process of preparing a garment for use in an AR experience framework is described in the diagram of Figure 1. In the proposed work, our aim is to transform a garment designed for production using Browzwear's VStitcher to a Snapchat Lens compatible form. The majority of work is centered around the rigging and skinning process, as those are what enable the 3D model to deform (rotate/twist/bend/etc.) according to the movements performed by the user. Rigging refers to the process of creating the bone structure of a 3D model. This bone structure is used to manipulate the 3D object with a series of deformations, like a puppet for animation. Hence, the rig is a series of connected joints used to describe an animation. Skinning, also known as vertex blending, enveloping, or skeleton-subspace deformation, is the process of transforming a mesh's vertices according to the rig that was created earlier.
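To make the skinning step concrete, the following is a minimal sketch of linear blend skinning in 2D, written in plain Python. The bone names, the elbow position and the per-bone transforms are illustrative assumptions, not values taken from the pipeline; a real rig would use 4x4 matrices combining the current pose with the inverse bind pose.

```python
import math

def rotate_about(point, pivot, angle):
    """Rotate a 2D point around a pivot by `angle` radians."""
    x, y = point[0] - pivot[0], point[1] - pivot[1]
    c, s = math.cos(angle), math.sin(angle)
    return (pivot[0] + c * x - s * y, pivot[1] + s * x + c * y)

def skin_vertex(vertex, weights, bone_transforms):
    """Linear blend skinning: weighted sum of the vertex as moved by each bone."""
    x = y = 0.0
    for bone, w in weights.items():
        px, py = bone_transforms[bone](vertex)
        x += w * px
        y += w * py
    return (x, y)

# Hypothetical two-bone arm: the upper arm stays fixed,
# the forearm bends 90 degrees at the elbow.
elbow = (1.0, 0.0)
bone_transforms = {
    "upper_arm": lambda p: p,
    "forearm": lambda p: rotate_about(p, elbow, math.pi / 2),
}

# A vertex fully bound to the forearm follows the rotation rigidly.
wrist = skin_vertex((2.0, 0.0), {"forearm": 1.0}, bone_transforms)

# A vertex near the elbow is shared by both bones and lands in between.
near_elbow = skin_vertex(
    (1.2, 0.0), {"upper_arm": 0.5, "forearm": 0.5}, bone_transforms
)
```

The vertex near the joint ends up halfway between its two rigid positions, which is what lets a sleeve bend smoothly instead of breaking at the elbow.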
Following the preparation of the garment, the process of creating the AR experience can be summarized in the diagram of Figure 2.

Avatar Generation and Modification
Lens Studio provides a body mesh that is already rigged [5], so it can be used for draping purposes in the garment design software (see Figure 3).

Figure 3: Lens Studio body mesh dressed in VStitcher
However, if we want to represent better draping, we need a more realistic body mesh. In this section we describe the process of obtaining such a mesh, which can be used both in VStitcher and Lens Studio.

Body mesh from tape measurements
If we know the tape measurements of the person that we want to represent, we can generate a statistical body model based on a PCA model computed from CAESAR body scans [1]. However, since we want the filter to generalize to every person trying it out, we use the average body shape extracted from the CAESAR dataset, expecting it to match the average body shape of our filter's user base. Table 1 shows the tape measurements of the average body shapes per gender from the CAESAR dataset, while Figure 4 shows the front view of the corresponding average shape avatars.

Rig restructuring
This step is necessary to make the avatar compatible with the body tracking mechanism offered by Lens Studio. The average shape avatar that is generated from the CAESAR dataset has a complex rig structure that allows a wide variety of pose modifications, but it does not correspond to the joints used by Lens Studio in body tracking. The skeleton structure and naming convention expected by Lens Studio are described in their 3D Body Tracking documentation [4], while the much more complex generated avatar and its structure can be seen in Figure 5. The average shape avatar skeleton is then modified through the use of custom tools to match the one required by Lens Studio. The tools use the Asset Importer library to import an avatar in COLLADA or FBX format and apply a series of operations to it. These operations are stored in a text file. Three different types of operations are performed: removal of unneeded nodes and their children, removal of unneeded bones while reparenting their children, and node renaming to match the naming conventions used by the AR framework.
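As an illustration of the three operation types, the sketch below applies a small operation script to a skeleton stored as a child-to-parent map. The script syntax (`remove_subtree`, `remove_bone`, `rename`) and all joint names are hypothetical; the text format used by the actual tools is not described here.

```python
def apply_operations(parent_of, script):
    """Apply one operation per line to a skeleton stored as {node: parent}."""
    skel = dict(parent_of)
    for line in script.splitlines():
        parts = line.split()
        if not parts or parts[0].startswith("#"):
            continue
        op, args = parts[0], parts[1:]
        if op == "remove_subtree":
            # Collect the node and all of its descendants, then drop them.
            doomed = {args[0]}
            changed = True
            while changed:
                changed = False
                for node, parent in skel.items():
                    if parent in doomed and node not in doomed:
                        doomed.add(node)
                        changed = True
            skel = {n: p for n, p in skel.items() if n not in doomed}
        elif op == "remove_bone":
            # Drop a single bone and attach its children to its parent.
            grandparent = skel.pop(args[0])
            for node, parent in skel.items():
                if parent == args[0]:
                    skel[node] = grandparent
        elif op == "rename":
            old, new = args
            skel = {
                (new if n == old else n): (new if p == old else p)
                for n, p in skel.items()
            }
        else:
            raise ValueError(f"unknown operation: {op}")
    return skel

# Hypothetical source rig and operation script.
skeleton = {
    "root": None,
    "spine_a": "root",
    "spine_twist": "spine_a",
    "chest": "spine_twist",
    "hand_l": "chest",
    "thumb_l": "hand_l",
}
script = """
remove_subtree thumb_l
remove_bone spine_twist
rename chest TargetChest
"""
result = apply_operations(skeleton, script)
```

Keeping the operations in a separate script means the same restructuring can be replayed on every avatar generated with the same source rig.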
When a node is removed from the skeleton structure, the skinning weights for the vertices it affected must be recalculated. The skinning weights associated with the removed node are assigned to its parent node. When recomputing the skinning weights, we need to make sure the skinning weights for any given vertex add up to 1, so that the linear blend skinning equation behaves correctly during rendering. The result is an avatar with a correctly modified rig structure. Figure 6 shows the average male avatar after applying those transformations. The body mesh stays the same, but the joints and skinning weights have been updated.
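A minimal sketch of this weight update is given below, assuming each vertex stores its weights as a dictionary from bone name to weight. The bone names and the sample weights are made up for illustration.

```python
def remove_bone_weights(vertex_weights, removed, parent):
    """Move the weights of `removed` onto `parent`, then renormalise per vertex."""
    updated = []
    for weights in vertex_weights:
        w = dict(weights)
        if removed in w:
            w[parent] = w.get(parent, 0.0) + w.pop(removed)
        total = sum(w.values())
        if total > 0:
            # Ensure the weights of this vertex add up to 1.
            w = {bone: value / total for bone, value in w.items()}
        updated.append(w)
    return updated

# Hypothetical weights for three vertices; the second one does not sum to 1.
vertex_weights = [
    {"spine_twist": 0.5, "spine_a": 0.25, "chest": 0.25},
    {"spine_twist": 0.3, "chest": 0.6},
    {"chest": 1.0},
]
updated = remove_bone_weights(vertex_weights, removed="spine_twist", parent="spine_a")
```

After the call, no vertex references the removed bone and every vertex has weights that sum to 1.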

Draping points and poses
The Fashion Design software (i.e. Browzwear in our case) needs to be able to use this avatar to dress it in the garment. Hence, the avatar needs to include a list of anchor points, as described in the documentation of the fashion software. These anchor points are automatically generated when we generate a body mesh from tape measurements. The white dots in Figure 5 are those anchor points.
We also need to embed a few poses that VStitcher will use to drape the garment. The default pose of our avatars, or bind pose, is a T-pose. But if we drape the garment on a T-pose, the wrinkles would look unnatural for most poses. So we add a generic A-pose, i.e. a pose with the arms down, that we will use for draping garments. It is important that the arms are not too close to the body to avoid artefacts during garment rigging. That will be explained in more detail in Section 3.3.

Figure 5: Visualisation of the default avatar rig. The bones of the skeleton are represented by green lines that connect each joint, represented as a circle. The color on the mesh represents the skinning weights for a particular joint, that is, its contribution to vertex deformation for skeletal animation. The white points and labels correspond to the anchor points for draping.
The avatar service automatically creates an FBX file which contains the anchor points as locators and the poses as keyframes. That file is ready to use in VStitcher, and we can then simulate the fit of the designed garment on the avatar.

Garments in Fashion Design Software
There is a variety of options regarding the choice of fashion design software. For the design of our garments, we have used VStitcher 3D Apparel Design Software by Browzwear. It is assumed that the garment has already been created by designers and pattern makers.
The avatar created in the previous step is imported into VStitcher and the garment is simulated on it. Figure 7 shows the garment simulated and ray-traced in VStitcher, and how it looks in Blender after exporting and skinning (see Section 3.3).
The simulated garment is exported using specific export properties that are offered by VStitcher:
• FBX file format is selected, as it is needed for use in the AR framework.
• Inside and Thickness meshes of the garment's geometry, which are generated by VStitcher, have to be excluded because they are not visible in the result and could cause a glitched simulation (Z-fighting artefacts) if present.
• A single UV map is used for the garment, for simplicity and to save memory. It is possible to generate a different UV map per piece, resulting in at least 4 textures per piece of the garment. A regular outfit can have around 20 pieces, resulting in 80 textures. We don't have an automatic way of importing those textures in Lens Studio, so we want to avoid having to set that many textures. Also, the maximum size of a project is 4 MB, so we want to make the project as compact as possible.

Figure 6: Average male avatar after applying the rig restructuring.
The result is an FBX file containing the garment simulated on the modified avatar. This file contains only the mesh, with the UVs, and the materials. It does not have a rig, and thus cannot be deformed to match the user's body pose. The next section explains how to rig the garment.

Garment rigging
For the garment to be able to move along with the user's real-life bones and follow their body pose, it needs to be rigged and skinned. The rigging can be done using 3D content creation software such as Autodesk® Maya®, or its open-source counterpart Blender®. There are tools to automatically skin the garment, but they often result in broken sleeves and other artefacts.
Since we already have an avatar with the correct bone structure, we opted for transferring the skeleton and the skinning weights from the avatar into the garment. The asset importing and exporting is done using the FBX SDK [2]. The process is as follows:
1. The skeleton is copied over from the avatar.
2. Any submeshes in the garment that share a material are fused together into a single mesh.
3. The avatar is placed in the same pose that was used for simulation.
4. The geometry of the garment is translated to match the reference from VStitcher (VStitcher automatically recenters all poses around the crotch).
5. A K-nearest-neighbour algorithm is used to match every vertex in the garment to the nearest K vertices in the posed avatar. The skinning weights of those K vertices are averaged and transferred to the garment.
6. Once we have the garment rigged, we undo the pose to place the garment back into the T-pose.
7. Bones are re-oriented so the up axis of every joint with a single child points towards that child. This is important because Lens Studio does not understand arbitrary rotation axes in the rig.
The reposing step is necessary because we simulate the garment on an A-pose so the draping looks more natural, but Lens Studio expects the garment to be on a T-pose. Figure 7 (right) shows the output of this garment rigging process (displayed in Blender).
The result is a properly rigged and skinned garment, ready to import and use in the AR experience studio, where the Snapchat filter will be set up.
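The weight transfer in the K-nearest-neighbour step can be sketched as follows. This is a brute-force illustration in plain Python with made-up vertices, bone names and a plain (unweighted) average; the value of K and any spatial acceleration structure used in the actual tool are not specified in this paper.

```python
import math

def transfer_weights(garment_vertices, avatar_vertices, avatar_weights, k=3):
    """For each garment vertex, average the skinning weights of its k nearest avatar vertices."""
    result = []
    for gv in garment_vertices:
        # Brute-force search: sort avatar vertex indices by distance to gv.
        nearest = sorted(
            range(len(avatar_vertices)),
            key=lambda i: math.dist(gv, avatar_vertices[i]),
        )[:k]
        blended = {}
        for i in nearest:
            for bone, w in avatar_weights[i].items():
                blended[bone] = blended.get(bone, 0.0) + w / len(nearest)
        result.append(blended)
    return result

# Hypothetical posed avatar: four vertices along an arm, with their weights.
avatar_vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
avatar_weights = [
    {"arm": 1.0},
    {"arm": 0.5, "forearm": 0.5},
    {"forearm": 1.0},
    {"hand": 1.0},
]

# One sleeve vertex floating slightly above the arm.
garment_weights = transfer_weights(
    [(0.9, 0.1, 0.0)], avatar_vertices, avatar_weights, k=2
)
```

Because each avatar vertex has weights that sum to 1, the averaged weights of the garment vertex also sum to 1, so no extra normalisation is needed in this sketch.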

AR Experience Studio
For the creation of the Snapchat filter, Lens Studio is the tool that is used as the AR experience studio. This is provided by Snap Inc. on their website and is regularly updated to include new features.
The Snapchat filter that is developed for this project features three main components: the body tracking component and the rendering of the garment onto the user's body through the use of their phone camera, the approach for dealing with body occlusions of the garment mesh, and chain physics to simulate the effect of gravity on the garment.

Body Tracking and garment rendering
A 3D Object Tracking component is used to attach the garment to the body features found in the camera. This is a component that takes in a Tracking Asset that provides information to the component as to what it should track. The choice in this use case is a Body Tracking Asset, which is available in the Lens Studio Asset Library as a resource. The 3D Object Tracking component has some properties that can be modified.
By changing the Object Index property, we can specify which object this component affects when multiple objects are identified in the camera. The Tracking Mode property takes one of two values: Pose Only or Proportions and Pose. In the first case, the object is modified by the user's pose only. In the second case, the attached object's proportions are also modified according to the tracked object's size. Lastly, the Boolean property Track Position dictates whether or not to change the position transform of the attached scene object based on the position of the tracked object.
The garment is imported into Lens Studio as an FBX file, and the skeleton structure is visible in the file overview. With the use of the body tracking asset, the list of available tracked points is presented in the inspector panel, and the joints of the skeleton are connected to the respective fields. This can be done manually by dragging and dropping the different joints into the tracked points value fields, or automatically by choosing the Match Hierarchy option. For the latter method to work, a specific naming convention should be applied to the joints when the skeleton is created in the 3D content creation software.
The last important step is to set the material textures of the mesh. These are PNG files that are exported from the Fashion Design Software and should be added to the lens's resources and then linked to the mesh's material. Both a base texture and a normal texture are applied. This process needs to be repeated for all the garments that are going to be part of the lens.
In Figure 8a, we can see the result of the lens created by this process on one of Snapchat's test videos. We show one indicative frame of the video.

Body Mesh Occlusion
In order to occlude the garment when the user's body overlaps it, Lens Studio's body mesh was used. The body mesh is a component offered in the latest release of Lens Studio that creates a custom 3D mesh mimicking the user's body in real time. It is available through Lens Studio's Asset Library. The Body Mesh component was imported into the project and a graph occluder material was assigned to it. The 3D position of the Body Mesh had to be manually aligned with the garment, so that the garment looks as if it is worn by the body. The result, which can be seen in Figure 8b, is that when the user's limbs go over the garment, the garment's mesh is occluded in real time.

Physics Simulations
The final step is to introduce the garment physics so as to emulate the garment movements relative to the body movements tracked by the Body Tracker component. To this end, we want to simulate the effect of gravity on a garment, which is visible as a "wobble" when the user moves around. For this, Position-Based Dynamics is used, which is a stable and controllable type of simulation that omits velocity and works directly with positions. It is widely used in joint, chain, or cloth simulations. Lens Studio provides a component that implements this method and makes it available through their Asset Library. The component is connected to the garment's spine joints and the stiffness is adjusted for the movement to look natural.
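The idea of working directly with positions can be illustrated with a simplified 2D chain. This is a generic sketch of a position-based scheme (implicit velocity from the previous position, followed by iterative distance-constraint projection), not the implementation of the Lens Studio component; the time step, damping, iteration count and stiffness values are assumptions chosen for the example.

```python
import math

def step_chain(points, previous, rest_length, gravity=(0.0, -9.8),
               dt=1 / 30, iterations=10, damping=0.95, stiffness=1.0):
    """Advance a hanging chain by one time step; points[0] is pinned to the body."""
    # 1. Predict positions from the previous frame (implicit velocity) plus gravity.
    predicted = [points[0]]
    for (x, y), (px, py) in zip(points[1:], previous[1:]):
        predicted.append((
            x + (x - px) * damping + gravity[0] * dt * dt,
            y + (y - py) * damping + gravity[1] * dt * dt,
        ))
    # 2. Repeatedly project the distance constraints between neighbours.
    for _ in range(iterations):
        for i in range(len(predicted) - 1):
            (ax, ay), (bx, by) = predicted[i], predicted[i + 1]
            dx, dy = bx - ax, by - ay
            dist = math.hypot(dx, dy) or 1e-9
            corr = stiffness * (dist - rest_length) / dist
            if i == 0:
                # The anchor is pinned, so only the child moves.
                predicted[i + 1] = (bx - dx * corr, by - dy * corr)
            else:
                predicted[i] = (ax + 0.5 * dx * corr, ay + 0.5 * dy * corr)
                predicted[i + 1] = (bx - 0.5 * dx * corr, by - 0.5 * dy * corr)
    # The current points become the "previous" positions of the next step.
    return predicted, points

# A three-joint chain that starts horizontal and is released under gravity.
points = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0)]
previous = list(points)
for _ in range(300):
    points, previous = step_chain(points, previous, rest_length=0.1)
```

Lowering `stiffness` in this sketch lets the segments stretch more before being pulled back, which gives a looser, more fabric-like swing; this mirrors the kind of tuning described above, although the parameter of the actual component may behave differently.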

Social Media Platform
Lens Studio is directly connected to the developer's Snapchat account so that a filter can be published directly from within the software. The developer may upload a filter icon and a preview video for the filter and then specify the filter's title. These will all be visible to end users via the Snapchat app in the capture screen.
After the filter is uploaded, it is reviewed by Snap to ensure it follows the guidelines for submitting lenses. If the developer has an Ad account linked to their Snapchat account, they can submit their lens as a Business lens instead of a Community lens. A sponsored lens can be used as a media object in the Ad account, which can be turned into a creative that can be attached to a new campaign. An Ad creative is an object that contains all the data for visually rendering a targeted ad on the social media platform.

FASHION INDUSTRY
The objective of this work is to define a pipeline for turning a production-ready garment into an AR Experience and to provide multiple ways for the user to interact with the garment in the Augmented Reality space. Initially, the difficulty level of preparing a garment for use in an AR context will be discussed.
First of all, the ability to create a well-defined pipeline for this process is very important. This pipeline discussed in the methodology section of this paper can be used to process any garment that was created using fashion design software and produce an AR-ready version of it.
On the other hand, this pipeline might be too complex for someone who is not familiar with 3D modeling. The process of rigging and skinning the garment requires specialized knowledge. Consequently, unless a fashion firm has dedicated software engineers with experience in the 3D modeling space, it would be hard to create such an experience.
Moreover, the installation and use of specialized software are needed to create the AR experience. This can be perceived as a disadvantage since it is a diversion from the fashion firm's main sector of involvement. Again, this implies that an expert would need to be present to create the AR experience.
Thankfully, the learning curve for AR experience software is gentle: one can get started quite easily, although the software is more difficult to master. The process of creating the lens is not as difficult as the preparation of the garment's 3D model.
It is also important to consider the fact that this way of promotion offers relatively easy and cost-effective exposure to a vast user base, which is defined by how many people are using Snapchat (this number was around 280 million during the first quarter of 2021).
However, some initial advertisement might be required for a Snapchat Lens (or any AR experience) to be viewed and interacted with by a large number of people. It all depends on the initial traction that the experience gets, and on how the social media platform's algorithm goes about displaying it to more users. It is possible to reach more users by creating a business lens and including it as part of a paid promotional campaign through the platform's ad console.
To track the traction of a Snapchat Lens, we can observe the following variables. There are four main ways for the user to interact with filters: Opens, Captures, Saves, and Shares. Opens are defined as the number of times people have opened an effect through the application cameras. Captures are the number of times someone took a photo or video that featured the effect in the application cameras. Saves are the number of times people took a photo or video while using an effect and saved it to their device. Lastly, Shares describe the number of times someone took a photo or video featuring an effect and shared it; shares are counted in stories, posts, and messages. These interactions are tracked and are available to the creator of the filter to view and analyze through an insights dashboard, in order to assess the performance and effectiveness of the filter with its audience.

CONCLUSION & FUTURE WORK
Taking all the above into consideration, we conclude that the solution proposed by this paper is suitable for its purpose. There are many possibilities when it comes to connecting the fashion world with emerging technologies such as Augmented Reality. This particular use case is only one example of many.
It is evident that the industry follows this movement towards more innovative ways to promote and disseminate products. This can be seen through the implementation of body tracking and cloth simulation features by companies such as Snapchat, Apple, Facebook and Google, through their respective platforms.
In this paper, we have shown how we can transform a garment designed for production into a form that is compatible with an AR framework (Snapchat Lens) without interrupting the design workflow of the brand. As future work, we plan to gather and analyze the interactions of users with our filter in order to gain useful insights into its usability.
Another future endeavour that will be considered is the level of user interaction, that is, to what extent the user can become immersed in the AR experience and how realistic the simulation is. Firstly, User Interface components can be included in the Snapchat Lens experience (e.g. to change the colour of the garment or to switch between different garments), which can complement the functionality of the AR experience. In practice, one does not need a User Interface to be able to view an overlay of a garment on oneself. User Interfaces can range from non-existent to very complex structures. However, it is important to consider the short interaction time a user has with a particular Lens. Snapchat lenses are previewed through a carousel slideshow component in the app's user interface. It is very easy for the user to swipe to the next lens without thoroughly exploring the capabilities of a Lens with a complex UI structure.