Analysis of Intrapersonal Synchronization in Full-Body Movements Displaying Different Expressive Qualities

Intrapersonal synchronization of limb movements is a relevant feature for assessing the coordination of motor behavior. In this paper, we show that it can also distinguish between full-body movements performed with different expressive qualities, namely rigidity, fluidity, and impulsivity. For this purpose, we collected a dataset of movements performed by professional dancers and annotated the perceived movement qualities with the help of a group of experts in expressive movement analysis. We computed intra-personal synchronization by applying the Event Synchronization algorithm to the time-series of the speed of arms and hands. Results show that movements performed with different qualities display a significantly different amount of intra-personal synchronization: impulsive movements are the most synchronized, fluid movements show the lowest values of synchronization, and rigid movements lie in between.


INTRODUCTION
In this paper we introduce an approach to dance movement analysis in which a set of distinct dance performances are analyzed and classified according to the dancer's movement qualities. Our work stems from the analysis of synchronization of body joints' velocity in a dance performance. Despite being a very intuitive quantity, the velocity of joints can be a very descriptive feature: for example, Bernhardt and Robinson [1] propose to detect emotional states from simple movement features, such as hands' speed and acceleration. Castellano et al. [7] show that emotional states can be inferred from the user's hand velocity and acceleration, quantity of motion, contraction index, and directness of movement. Gross et al. [12] analyze the kinematics (by extracting range of motion and velocity) and expressive qualities of simple movements (e.g., knocking at a door) charged with an emotional intention.

In this paper we propose a method to distinguish between different movement qualities using a multilayered framework approach. In such a framework, low-level features (e.g., single-joint velocity) are used to compute high-level features (such as fluidity, or emotion). Our goal, in particular, is to demonstrate that intra-personal synchronization (i.e., the level of synchronization of the joints composing the kinematic chains of a single person) can help to automatically distinguish movements displaying the following expressive qualities: Fluidity, Rigidity, and Impulsivity.
The presented work is part of a more general framework, the EU ICT H2020 DANCE Project, which aims at developing techniques and models for the analysis of human body movement quality, with a focus on the expressive component of non-verbal communication and, in particular, on how the movement qualities of a dancer are perceived by an external observer. The importance of this challenging scenario is evident in several domains and applications, such as the diagnosis of psycho-pathological disorders [31], therapy and rehabilitation [26], expressive and natural interfaces [4], and affective computing [2], [10].
Future practical applications of our models and algorithms, besides the main goals of the DANCE project mentioned above, are numerous and span many fields. Automated detection of impulsivity in a video surveillance system could, for example, allow one to identify dangerous events during which people produce impulsive, unusual movements. HCI systems could improve the user experience by automatically detecting the user's rigidity, which could be an indicator of stress reflected in muscle tension. In the same way, fluid movements could indicate relaxation in experienced users.

Figure 1: The multilayered framework. At each layer a number of movement features are extracted, based on the features extracted in the lower ones. Moving from bottom to top, we start from the physical level (e.g., body joints' positions and rotations) toward more complex features, models, and concepts; at the same time, we pass from instantaneous measurements to larger time scales, that is, the higher we move in the stack, the longer the time needed to extract the movement features. The movement features extracted at each level are relative to the user modeled by the stack of layers. A subset of features (i.e., synchronization, entrainment, cohesion, leadership) can be computed both on a single user (intra-personal features) and on the features of multiple users (inter-personal features). For example, synchronization can be computed between the joints' velocities of a single user or between the emotional states of multiple users. The features described in the paper are highlighted in bold.
The paper is organized as follows: Section 3 introduces a framework for multimodal analysis of expressive features, which includes definitions and descriptions of the features we take into account; Section 4 describes the dataset we created, containing dance performances characterized by the qualities under study; Section 5 introduces the techniques we employed to analyze the data; Section 6 presents the obtained results; and finally Section 7 concludes the paper.

RELATED WORK
Movement synchronization has been exploited in various scenarios. Repp [24] studied synchronization in a musical context, investigating how musicians coordinate their movements in order to follow a common rhythm. In [28], the authors created an interactive system in which inter-personal synchronization, derived from data captured by mobile phones, controlled the activation of various audio tracks, and in [29] they used motor synchronization for social interaction purposes, i.e., to identify a possible dominant person in a group. The authors of [18] introduced a rehabilitation system based on limb synchronization that proved effective in stabilizing the walking of patients affected by Parkinson's disease and hemiplegia. In [13], Leman et al. show how music can be an excellent domain to explore non-verbal communication and how synchronization and entrainment can be used to measure collaboration and coalition between users. In [14], the authors concentrate on the effects of beat-synchronized walking on movement timing and vigor in humans.

MULTILAYERED FRAMEWORK
The research work we present in this paper is part of a more general scenario: modelling human body movement communication. This is the research context of the ongoing EU ICT H2020 Project DANCE. We are not interested in physical space occupation, movement direction, or "functional" physical movements of a user: our focus is on the implications at the expressive level. For example, let us consider the movement "knocking at a door". We do not aim to analyze the functional action of the gesture itself (i.e., hitting the surface of the door with a closed hand), but the intention behind the performed action (e.g., the emotional state guiding the lover who knocks at the door of her beloved) [22]. In particular, our perspective is on how movement qualities are perceived by an external observer.
We conceive the quality of movement of single and multiple users as a multilayered framework [3], illustrated in Figure 1. From bottom to top, the physical layer mainly concerns kinematics, e.g., trajectories and velocities of joints, or the shape of the silhouette of the body. Biomechanical features of single joints at a small time scale (observable frame by frame) are defined at the low-level features layer: for example, "smoothness", defined in the literature in terms of minimum jerk [30, 19, 21] or in terms of the curvature of velocity trajectories [16, 17]. A third layer, the mid-level features layer, addresses more complex qualities, usually extracted on groups of joints or on the whole body, which require significantly longer temporal intervals to be observed (e.g., rhythmic patterns typically require a range of 0.5 s to 5 s to be detected [9]). The highest layer encompasses models and techniques that, based on the user's quality of movement detected in the lower layers, represent, for example, the user's emotional states and social attitudes.
The framework does not only define models for the movement qualities of a single user; it can also be extended to a multi-user scenario. To this aim, a particular subset of features deals with intra- and inter-personal movement qualities. This is the case, for example, of intra- and inter-personal synchronization: this feature can be computed between the movements of the joints of a single user (e.g., to determine whether the user's movement is coordinated [17]) or between the movements of the joints of a group of users (e.g., to measure the level of entrainment in a group of musicians [11]). Another example is the computation of leadership, both on a single user (to find out which joint is leading the movement of the whole body) and on a group of users (to discriminate between group leaders and followers). The implementation of the framework is one of the main research activities in the H2020 EU ICT DANCE project. In the following sections we describe the three mid-level features we address in this paper: Fluidity, Impulsivity, and Rigidity.

Fluidity
Fluidity is often considered a synonym of "good" movement (e.g., in certain dance styles) and is one of the properties that seem to contribute significantly to the perception of emotions [3]. Fluidity was investigated by Caridakis et al. [6] on hand trajectories, where it was computed as the sum of the variances of the norms of the hands' motion vectors. Piana et al. [21] studied human motion trajectories and defined a fluidity index based on the minimum jerk law.
We propose the following definition of Fluidity of movement. A Fluid movement can be performed by a part of the body or by the whole body and is characterized by the following properties: (I) the movement of each involved joint of the (part of the) body is smooth, following the standard definitions in the biomechanics literature [30]; (II) the energy of movement (energy of muscles) is free to propagate along the kinematic chains of (parts of) the body (e.g., from head to trunk, from shoulders to arms) according to a coordinated wave-like propagation.
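As an illustration of property (I), the sketch below computes a minimum-jerk-inspired smoothness estimate for a single joint trajectory. It is not the fluidity index of [21] nor a feature used later in this paper; the function name, the log dimensionless-jerk normalization (in the style of Hogan and Sternad), and the 100 Hz default are assumptions made for the example.

```python
import numpy as np

def smoothness_index(positions, fps=100.0):
    """Hypothetical minimum-jerk-style smoothness estimate for one joint.

    positions: (N, 3) array of 3D joint positions sampled at `fps` Hz.
    Returns the negative log of the dimensionless squared-jerk integral
    of the speed profile; higher values indicate smoother motion.
    """
    dt = 1.0 / fps
    vel = np.gradient(positions, dt, axis=0)        # per-frame velocity (N, 3)
    speed = np.linalg.norm(vel, axis=1)             # speed profile (N,)
    jerk = np.gradient(np.gradient(speed, dt), dt)  # 2nd derivative of speed
    duration = len(speed) * dt
    peak_speed = speed.max() + 1e-12                # guard against division by zero
    dimless_jerk = np.trapz(jerk ** 2, dx=dt) * duration ** 3 / peak_speed ** 2
    return -np.log(dimless_jerk + 1e-12)
```

Under this convention, smoother (and thus, per property (I), more fluid) joint trajectories yield higher values.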

Rigidity
Rigidity is a movement quality strictly linked to the internal emotional state of a user. Rigidity while performing a movement can be a consequence of stress, fear, or tension. For example, a stressed person tends to increase the tension in her muscles, producing rigid movements [5]. A better understanding and automatic detection of rigidity could therefore greatly improve the adaptability of human-computer interfaces.
In [25], rigidity is considered one of the motor cues for recognizing emotions and mental states of children with Autism Spectrum Conditions, and it is measured in terms of the relative movement of different parts of the body. Moreover, rigidity is one of the few movement qualities addressed in credibility assessment in the information systems area. In [27], the authors developed automated interviewing systems based on kinetic rigidity detection, in order to detect the amount of non-credible information given during an interview.

Impulsivity
As reported by [20], impulsivity is an important component of emotion expression. According to Loewenstein and Lerner [15], "people commonly display impulsive behavior when they are hungry, thirsty, sexually aroused, or in elevated emotional states such as anger or fear". In psychology, impulsivity is an important component of various disorders, including substance use disorders, bipolar disorder, and antisocial personality disorder. In dance, an impulsive movement can be characterized according to Bishko: "A movement of increasing intensity ending with an accent is considered impactive".
In physics, the impulse is defined as the variation of an object's momentum over time; the momentum depends on the object's mass and velocity. An impulse therefore corresponds to a large variation of the object's speed or, in other words, to a high acceleration or deceleration. A similar concept can be found in psychological studies, for example in [8]: "actions that are poorly conceived, prematurely expressed". That is, an impulse can be considered a movement with high acceleration performed without premeditation.
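Stated compactly (a textbook restatement, not a formula from the paper):

$$\vec{J} \;=\; \int_{t_1}^{t_2} \vec{F}(t)\,dt \;=\; \Delta\vec{p} \;=\; m\,\Delta\vec{v},$$

so, for fixed mass $m$, the ratio $\lVert\Delta\vec{v}\rVert / (t_2 - t_1)$ is large when a sizeable impulse acts over a short interval, i.e., when the movement shows high acceleration or deceleration.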

RECORDINGS AND SEGMENTATION
We recorded short performances of professional dancers who were asked to perform full-body movements with one of the following expressive qualities: Fluidity, Rigidity, or Impulsivity. Two professional female dancers participated in the recording sessions. At the beginning of each session, the dancers were given the definitions of the expressive qualities (see Section 3). For each expressive quality, the instructions we provided to the dancers were the following:
1. to perform several repetitions of predefined movements (e.g., avoiding an imaginary and sudden danger, throwing an object with a wave-like arm movement), focusing on the expressive quality;
2. to perform an improvised choreography containing movements that, in the opinion of the dancer, best expressed the quality.
For the recordings we used a Qualisys motion capture system sampling the dancers' movements at 100 Hz, synchronized with a video recording system (1280x720, 50 fps). We placed 6 single markers and 11 rigid-body plates on the dancer's body, as illustrated in Figure 4. The resulting data consist of the 3D positions of 19 markers: 6 corresponding to the single markers, plus 11 corresponding to the rigid bodies' barycenters and 3 corresponding to the markers attached to the rigid body placed on the dancer's head (see Figure 2).
Two experts in the domain of expressive movement analysis segmented the recorded data. They were instructed to select segments that exhibited each expressive quality in a consistent way. The segments were not validated in a formal manner: the identification of the most representative segments resulted from a subsequent discussion with the domain experts. Segmentation was thus based on the observers' perception of the dancer's expressive quality, and not on the dancer's expressive intention. We obtained a dataset of 60 segments: 10 highly impulsive, 10 highly fluid, and 10 highly rigid segments for each dancer. The mean segment duration is 5.85 seconds (SD = 3.76) and the total duration is 5 minutes 51 seconds.

EVENT SYNCHRONIZATION EXTRACTION
Our goal is to show that the amount of intra-personal synchronization, that is, the synchronization between the limbs' joints, can significantly contribute to the detection of expressive qualities.
The synchronization technique we chose for our analysis is the Event Synchronization (ES) algorithm [23]. It is based on time-delay patterns between a pair of time-series containing event occurrences. The ES extraction process is summarized in Figure 3: events are first detected in each time-series, and the degree of synchronization is then computed from the number of quasi-simultaneous event occurrences.
In this study, events were defined as abrupt changes of limb velocity during the performances. We extracted events by detecting peaks of the velocity module (velocity is computed as the derivative of position, given a joint's 3D coordinates frame by frame). For each segment $S$ of $N$ frames in the dataset, we selected three joints of the right arm, $J_w$, $J_e$, and $J_s$ (wrist, elbow, and shoulder, respectively), and extracted the corresponding velocity modules $v_w$, $v_e$, and $v_s$. We then applied a supervised event detection algorithm (i.e., a parameterized peak detector) to the velocity signals $v_w$, $v_e$, and $v_s$ in $S$ to extract significant events. The output of this process consists of three time-series $ts_{v_w}$, $ts_{v_e}$, and $ts_{v_s}$ containing all the event occurrences coupled with the exact time of each occurrence (e.g., $t_i$ is the time of the $i$-th event in the time-series). In these time-series, a "1" corresponds to an event occurrence and a "0" to no occurrence. Each time-series has the same number $N$ of items.
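A minimal sketch of this event-extraction step is given below, assuming 3D joint trajectories sampled at 100 Hz. SciPy's find_peaks stands in for the supervised, parameterized peak detector we actually used, and the height/distance parameters are illustrative placeholders rather than the values from our pipeline.

```python
import numpy as np
from scipy.signal import find_peaks  # stand-in for the supervised peak detector

def velocity_module(positions, fps=100.0):
    """Speed profile of one joint from its (N, 3) trajectory sampled at `fps` Hz."""
    vel = np.gradient(positions, 1.0 / fps, axis=0)  # frame-by-frame velocity
    return np.linalg.norm(vel, axis=1)

def event_series(speed, height=None, distance=10):
    """Binary event time-series: 1 at detected speed peaks, 0 elsewhere."""
    peaks, _ = find_peaks(speed, height=height, distance=distance)
    ts = np.zeros(len(speed), dtype=int)
    ts[peaks] = 1
    return ts

# Hypothetical usage for the three right-arm joints of one segment:
# ts_vw = event_series(velocity_module(wrist_xyz))
# ts_ve = event_series(velocity_module(elbow_xyz))
# ts_vs = event_series(velocity_module(shoulder_xyz))
```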
We computed ES on the following pairs of time-series: ($ts_{v_w}$, $ts_{v_e}$), ($ts_{v_e}$, $ts_{v_s}$), and ($ts_{v_w}$, $ts_{v_s}$). As an example, consider the pair ($ts_{v_w}$, $ts_{v_e}$).
Let $t_i^x$ be the time of the $i$-th event in the time-series $ts_x$. In our case, the event occurrences are $t_i^{v_w}$ and $t_j^{v_e}$, with $i = 1, \ldots, m_{v_w}$ and $j = 1, \ldots, m_{v_e}$, where $m_x$ is the total number of detected events in the time-series $ts_x$.
The number of times an event appears in $ts_{v_w}$ "shortly" after it occurs in $ts_{v_e}$, and vice versa, are respectively computed as:

$$c^\tau(ts_{v_w}\,|\,ts_{v_e}) = \sum_{i=1}^{m_{v_w}} \sum_{j=1}^{m_{v_e}} K^\tau_{ij}, \qquad c^\tau(ts_{v_e}\,|\,ts_{v_w}) = \sum_{j=1}^{m_{v_e}} \sum_{i=1}^{m_{v_w}} K^\tau_{ji},$$

where $\tau$ represents the time lag allowed between two events for them to be considered synchronized. We set a value of $\tau = 20$ frames (corresponding to 200 ms at 100 Hz), which in our context is a reasonable time for abrupt changes in movement velocity to be perceived as synchronous by an observer. $K^\tau_{ij}$ (and, symmetrically, $K^\tau_{ji}$) is computed as follows:

$$K^\tau_{ij} = \begin{cases} 1 & \text{if } 0 < t_i^{v_w} - t_j^{v_e} \le \tau, \\ \tfrac{1}{2} & \text{if } t_i^{v_w} = t_j^{v_e}, \\ 0 & \text{otherwise,} \end{cases}$$

and the overall degree of synchronization $Q^\tau$ of the pair of time-series is given by:

$$Q^\tau = \frac{c^\tau(ts_{v_w}\,|\,ts_{v_e}) + c^\tau(ts_{v_e}\,|\,ts_{v_w})}{\sqrt{m_{v_w}\, m_{v_e}}}.$$
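These formulas translate directly into code. The following is a minimal sketch written against the binary event series of the previous section; the function name and the use of frame indices as event times are my own choices, not the project's implementation.

```python
import numpy as np

def event_synchronization(ts_x, ts_y, tau=20):
    """Event Synchronization degree Q^tau between two binary event series.

    Counts events in one series occurring within `tau` frames after events
    in the other (simultaneous events count 1/2 in each direction), then
    normalizes by sqrt(m_x * m_y).
    """
    t_x = np.flatnonzero(ts_x)  # event times (frame indices) in ts_x
    t_y = np.flatnonzero(ts_y)
    m_x, m_y = len(t_x), len(t_y)
    if m_x == 0 or m_y == 0:
        return 0.0

    def c(a, b):
        # c^tau(a|b): events in `a` that follow an event in `b` within tau frames
        count = 0.0
        for ti in a:
            for tj in b:
                d = ti - tj
                if 0 < d <= tau:
                    count += 1.0
                elif d == 0:
                    count += 0.5
        return count

    return (c(t_x, t_y) + c(t_y, t_x)) / np.sqrt(m_x * m_y)

# e.g., Q^tau for the wrist-elbow pair: event_synchronization(ts_vw, ts_ve, tau=20)
```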

DATA ANALYSIS AND DISCUSSION
We computed the average $Q^\tau$ values (defined in the previous section) for three pairs of joints, elbow-shoulder, elbow-wrist, and wrist-shoulder, over the 60 segments described in Section 4. Detailed results are presented in Table 1 and Figure 5.
For the elbow-wrist pair (see Figure 6), post hoc comparisons using the LSD test with Bonferroni correction indicated that the synchronization indexes of Fluidity were significantly lower than those of Rigidity (p < .01) and Impulsivity (p < .001). Additionally, the synchronization indexes of Rigidity were significantly lower than those of Impulsivity (p < .01). For the shoulder-wrist pair (see Figure 6), post hoc comparisons using the LSD test with Bonferroni correction indicated that only the synchronization indexes of Fluidity were significantly lower than those of Impulsivity (p < .01). There was no significant difference either between the synchronization indexes of Fluidity and Rigidity (p = .108) or between Rigidity and Impulsivity (p = .320).
These results indicate that it is possible to distinguish between movements performed with different qualities by analyzing the synchronization of the arm joints' velocities. Figure 6 shows that impulsive movements are the most synchronized, fluid movements show the lowest values of synchronization, and rigid movements lie in between. Although the results show differences between the three qualities, the differences between dancers were also significant for one pair of joints (elbow-wrist). Significant differences in synchronization between all three considered expressive qualities occur for the most external joints of the arm (elbow-wrist) (see Figure 5). Weaker differences in synchronization occur for the elbow-shoulder and wrist-shoulder pairs. This might be due to the fact that not all movements necessarily involve shoulder motion; indeed, the average synchronization indexes $Q^\tau$ for the two joint pairs that include the shoulder are lower than for the elbow-wrist pair.
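As a rough sketch of how such comparisons could be reproduced on per-segment $Q^\tau$ values: SciPy has no LSD test, so pairwise independent-samples t-tests with a Bonferroni adjustment are used below as a stand-in for the reported post hoc procedure, and the dictionary layout of the input is hypothetical.

```python
from itertools import combinations
from scipy import stats

def pairwise_bonferroni(q_tau_by_quality):
    """q_tau_by_quality: dict mapping quality name -> list of per-segment Q^tau values.

    Runs all pairwise t-tests and reports Bonferroni-corrected p-values;
    an approximation of the post hoc analysis described above, not the
    exact statistical pipeline used in the paper.
    """
    pairs = list(combinations(q_tau_by_quality, 2))
    for a, b in pairs:
        t, p = stats.ttest_ind(q_tau_by_quality[a], q_tau_by_quality[b])
        p_corr = min(p * len(pairs), 1.0)  # Bonferroni correction
        print(f"{a} vs {b}: t = {t:.2f}, corrected p = {p_corr:.3f}")
```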

CONCLUSION
In this paper we showed that intra-personal synchronization can help distinguish movements performed with different expressive qualities, namely rigidity, fluidity, and impulsivity. By applying the Event Synchronization algorithm to the velocities of the right-arm joints, we found that different synchronization patterns characterize each considered quality. The next step will be an evaluation conducted on a larger dataset of segments involving more than two dancers. We will also address a larger set of movement features. Finally, we plan to fully automate the event detection step of our algorithm by taking into account the characteristics of the input signal (e.g., a slowly varying signal will require a more sensitive analysis to detect peaks).
Techniques and models deriving from the work presented in this paper can find a variety of applications in HCI. For example, the possibility of detecting movement features like fluidity and rigidity with non-invasive sensors will allow the development of stress-aware visual interfaces, capable of adapting their behavior to different users with, for example, different levels of expertise. More precisely, the possibility of recognizing and measuring such movement qualities can lead to a significant improvement of future human-computer interfaces: they will have to be efficient, enjoyable (i.e., user-friendly), and able to capture information about the user's internal state (e.g., mood). During the interaction, interfaces will "tune" their behavior according to both the user's needs and multimodal feedback.