Published November 1, 2021 | Version v1
Dataset Open

PE-HRI-temporal: A Multimodal Temporal Dataset in a robot mediated Collaborative Educational Setting

  • 1. École Polytechnique Fédérale de Lausanne (EPFL)

Description

This dataset consists of multi-modal temporal team behaviors as well as learning outcomes collected in the context of a robot-mediated collaborative and constructivist learning activity called JUSThink [1,2]. The dataset can be useful for those looking to explore the evolution of log actions, speech behavior, affective states, and gaze patterns of students in order to model constructs such as engagement, motivation, and collaboration in educational settings. 

In this dataset, team-level data is collected from 34 teams of two (68 children), with the children aged between 9 and 12. There are two files:  

PE-HRI_learning_and_performance.csv: This file consists of the team-level performance and learning metrics, which are defined below: 

  • last_error: The error of the last submitted solution. Note that if a team finds an optimal solution (error = 0), the game stops, so the last error is 0 for such teams. This is a metric for performance in the task. 

  • T_LG_absolute: A team-level learning outcome that we calculate by taking the average of the two team members' individual absolute learning gains. The individual absolute gain is the difference between a participant's post-test and pre-test scores, divided by the maximum score that can be achieved (10), which captures how much of all the available knowledge the participant learned (see the worked sketch after this list).

  • T_LG_relative: A team-level learning outcome that we calculate by taking the average of the two team members' individual relative learning gains. The individual relative gain is the difference between a participant's post-test and pre-test scores, divided by the difference between the maximum score that can be achieved and the pre-test score. This captures how much the participant learned of the knowledge that they did not possess before the activity. 

  • T_LG_joint_abs: A team-level learning outcome defined as the difference between the number of questions that both team members answer correctly in the post-test and in the pre-test, which captures the amount of knowledge acquired together by the team members during the activity.
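As a worked illustration of the three learning-gain definitions above, here is a minimal Python sketch. The individual pre-test and post-test scores shown are hypothetical examples; the released CSV only contains the aggregated team-level values.

```python
# Minimal sketch of the team-level learning-gain definitions above.
# Individual scores below are hypothetical; the dataset ships team-level values only.
MAX_SCORE = 10  # maximum achievable test score

def absolute_gain(pre, post):
    # Fraction of all available knowledge acquired during the activity.
    return (post - pre) / MAX_SCORE

def relative_gain(pre, post):
    # Fraction of the knowledge the participant did not possess before the activity.
    return (post - pre) / (MAX_SCORE - pre) if pre < MAX_SCORE else 0.0

# Example team: (pre, post) scores of the two members.
member1_pre, member1_post = 4, 7
member2_pre, member2_post = 6, 8

T_LG_absolute = (absolute_gain(member1_pre, member1_post) +
                 absolute_gain(member2_pre, member2_post)) / 2
T_LG_relative = (relative_gain(member1_pre, member1_post) +
                 relative_gain(member2_pre, member2_post)) / 2

# T_LG_joint_abs: questions answered correctly by *both* members,
# post-test count minus pre-test count (hypothetical numbers).
both_correct_pre, both_correct_post = 3, 6
T_LG_joint_abs = both_correct_post - both_correct_pre

print(T_LG_absolute, T_LG_relative, T_LG_joint_abs)
```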

PE-HRI_behavioral_timeseries.csv: In this file, each team's interaction of around 20-25 minutes is organized into windows of 10 seconds, giving a total of 5048 windows. We report team-level log actions, speech behavior, affective states, and gaze patterns for each window. More specifically, within each window, 26 features are generated in two ways: 

  1. non-incremental
  2. incremental

A non-incremental feature holds the value computed within that particular time window, while an incremental feature holds the cumulative value up to and including that window. The incremental type is indicated by an "_inc" suffix at the end of the feature name. Hence, within each window, we have 52 values: 
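For instance, the incremental variant of a count feature such as T_add can be reproduced as a per-team cumulative sum of its non-incremental counterpart. Below is a minimal pandas sketch; the column names (e.g. "T_add", "T_add_inc") follow the naming convention described here and should be checked against the actual CSV header, and the sketch is illustrative rather than the exact pipeline used to generate the files.

```python
import pandas as pd

# Load the behavioral time series (one row per 10-second window).
df = pd.read_csv("PE-HRI_behavioral_timeseries.csv")

# Sort windows chronologically within each team, then accumulate a
# non-incremental count feature to obtain its "_inc" counterpart.
df = df.sort_values(["team", "window"])
df["T_add_inc_check"] = df.groupby("team")["T_add"].cumsum()

# For count features the cumulative sum should match the released incremental
# column (averaged or ratio features are accumulated differently).
print((df["T_add_inc_check"] == df["T_add_inc"]).mean())
```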

  • T_add/(_inc): The number of times a team added an edge on the map in that window/(until that window).

  • T_remove/(_inc): The number of times a team removed an edge from the map in that window/(until that window).

  • T_ratio_add_rem/(_inc): The ratio of edge additions to edge deletions by a team in that window/(until that window).

  • T_action/(_inc): The total number of actions taken by a team (add, delete, submit, presses on the screen) in that window/(until that window).

  • T_hist/(_inc): The number of times a team opened the sub-window with the history of their previous solutions in that window/(until that window).

  • T_help/(_inc): The number of times a team opened the instructions manual in that window/(until that window). Note that the robot gives all the instructions before game-play, while a video demonstrating the functionality of the game is played. 

  • T1_T1_rem/(_inc): The number of times either of the two team members consecutively followed the pattern: I add an edge, then I delete it, in that window/(until that window).

  • T1_T1_add/(_inc): The number of times either of the two team members consecutively followed the pattern: I delete an edge, then I add it back, in that window/(until that window).

  • T1_T2_rem/(_inc): The number of times the team members consecutively followed the pattern: I add an edge, then you delete it, in that window/(until that window).

  • T1_T2_add/(_inc): The number of times the team members consecutively followed the pattern: I delete an edge, then you add it back, in that window/(until that window).

  • redundant_exist/(_inc): The number of times the team had redundant edges in their map in that window/(until that window).

  • positive_valence/(_inc): The average value of positive valence for the team in that window/(until that window).

  • negative_valence/(_inc): The average value of negative valence for the team in that window/(until that window).

  • difference_in_valence/(_inc): The difference between the average values of positive and negative valence for the team in that window/(until that window).

  • arousal/(_inc): The average value of arousal for the team in that window/(until that window).

  • gaze_at_partner/(_inc): The average of the two team members' gaze when looking at their partner in that window/(until that window). Each individual member's gaze is calculated as a percentage of time in that window/(until that window). 

  • gaze_at_robot/(_inc): The average of the two team members' gaze when looking at the robot in that window/(until that window). Each individual member's gaze is calculated as a percentage of time in that window/(until that window). 

  • gaze_other/(_inc): The average of the two team members' gaze when looking in the direction opposite to the robot in that window/(until that window). Each individual member's gaze is calculated as a percentage of time in that window/(until that window). 

  • gaze_at_screen_left/(_inc): The average of the two team members' gaze when looking at the left side of the screen in that window/(until that window). Each individual member's gaze is calculated as a percentage of time in that window/(until that window). 

  • gaze_at_screen_right/(_inc): The average of the two team members' gaze when looking at the right side of the screen in that window/(until that window). Each individual member's gaze is calculated as a percentage of time in that window/(until that window). 

  • T_speech_activity/(_inc): The average of the two team members' speech activity in that window/(until that window). Each individual member's speech activity is calculated as the percentage of time that they are speaking in that window/(until that window). 

  • T_silence/(_inc): The average of the two team members' silence in that window/(until that window). Each individual member's silence is calculated as a percentage of time in that window/(until that window). 

  • T_short_pauses/(_inc): The average of the two team members' short pauses over their speech activity in that window/(until that window). A short pause refers to a brief pause of 0.15 seconds and is calculated as a percentage of time in that window/(until that window). 

  • T_long_pauses/(_inc): The average of the two team members' long pauses over their speech activity in that window/(until that window). A long pause refers to a pause of 1.5 seconds and is calculated as a percentage of time in that window/(until that window). 

  • T_overlap/(_inc): The average percentage of time the speech of the team members overlaps in that window/(until that window).

  • T_overlap_to_speech_ratio/(_inc): The ratio of the speech overlap over the speech activity of the team in that window/(until that window).
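The gaze and speech features above all follow the same aggregation scheme: each member's behavior is expressed as a percentage of time within the window, and the team value is the mean of the two. The following minimal sketch illustrates that scheme; the per-member percentages are hypothetical (they are not released in this dataset), and the assumption that the overlap ratio is taken over the team-level speech average is ours.

```python
# Hypothetical per-member percentages of time within one 10-second window;
# the released data only contains the team-level averages.
member1_speech_pct = 42.0   # member 1 speaking 4.2 s out of 10 s
member2_speech_pct = 18.0   # member 2 speaking 1.8 s out of 10 s
overlap_pct = 6.0           # both members speaking simultaneously 0.6 s out of 10 s

# Team-level speech activity is the mean of the two members' percentages.
T_speech_activity = (member1_speech_pct + member2_speech_pct) / 2

# Overlap-to-speech ratio relates simultaneous speech to overall team speech activity.
T_overlap_to_speech_ratio = overlap_pct / T_speech_activity if T_speech_activity else 0.0

print(T_speech_activity, T_overlap_to_speech_ratio)
```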

Apart from these 52 values, within each window, we also indicate: 

  • team: The team to which the window belongs.
  • time_in_secs: Time in seconds until that window.
  • window: The window number.
  • normalized_time: The time at which this window occurred, normalized by the total duration of the task for that particular team. 
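To get a feel for the structure of PE-HRI_behavioral_timeseries.csv, a short pandas sketch that loads the file and summarizes the windows is given below. The column names follow the descriptions above; in particular, "T_speech_activity_inc" is assumed from the "_inc" naming convention and should be verified against the CSV header.

```python
import pandas as pd

df = pd.read_csv("PE-HRI_behavioral_timeseries.csv")

# One row per 10-second window; 'team', 'window', 'time_in_secs' and
# 'normalized_time' index the windows, the remaining columns are features.
print(df.shape)                      # expected: 5048 rows
print(df["team"].nunique())          # expected: 34 teams

# Windows per team (interactions of ~20-25 minutes, so roughly 120-150 windows).
print(df.groupby("team")["window"].max().describe())

# Example: cumulative speech activity over time for a single team.
one_team = df[df["team"] == df["team"].iloc[0]]
print(one_team[["normalized_time", "T_speech_activity_inc"]].tail())
```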

Lastly, we briefly elaborate on how the features are operationalised. We extract log behaviors from the recorded rosbags, while the behaviors related to gaze and affective states are computed with the open-source library OpenFace [6], which returns both facial action units (AUs) and gaze angles. For voice activity detection (VAD), which classifies whether a piece of audio is voiced or unvoiced, we used the Python wrapper for the open-source Google WebRTC VAD. The literature that inspired our log, audio, and video features, as well as the tools used to extract them, is described in more detail in [3,4]. However, in those papers, we make use of only the aggregated version of this data [5].
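As an illustration of the voice-activity step, the sketch below uses the webrtcvad Python wrapper to estimate the fraction of voiced audio in a chunk of 16-bit mono PCM (e.g. one 10-second window). The frame size, aggressiveness mode, and audio format are illustrative choices and not necessarily those used to build this dataset.

```python
import webrtcvad

def voiced_fraction(pcm_bytes, sample_rate=16000, frame_ms=30, mode=2):
    """Fraction of frames classified as voiced in a chunk of 16-bit mono PCM.

    WebRTC VAD accepts 10/20/30 ms frames at 8/16/32/48 kHz.
    """
    vad = webrtcvad.Vad(mode)  # 0 = least aggressive, 3 = most aggressive
    bytes_per_frame = int(sample_rate * frame_ms / 1000) * 2  # 2 bytes per sample
    frames = [pcm_bytes[i:i + bytes_per_frame]
              for i in range(0, len(pcm_bytes) - bytes_per_frame + 1, bytes_per_frame)]
    if not frames:
        return 0.0
    voiced = sum(vad.is_speech(frame, sample_rate) for frame in frames)
    return voiced / len(frames)
```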

Files (3.3 MB)

  • PE-HRI_behavioral_timeseries.csv: 3.2 MB (md5:b3a7a5ddd1173b60350f9a6446b16644)
  • PE-HRI_learning_and_performance.csv: 2.3 kB (md5:527144cd9d158ee9062eb685a0e23ed8)

Additional details

Related works

References
Conference paper: 10.1109/RO-MAN47096.2020.9223343 (DOI)
Dataset: 10.5281/zenodo.4633092 (DOI)
Journal article: 10.1007/s12369-021-00766-w (DOI)

Funding

ANIMATAS – Advancing intuitive human-machine interaction with human-like social capabilities for education in schools 765955
European Commission

References

  • [1] J. Nasir, U. Norman, B. Bruno, and P. Dillenbourg, "You Tell, I Do, and We Swap until we Connect All the Gold Mines!," ERCIM News, vol. 2020, no. 120, 2020. [Online]. Available: https://ercim-news.ercim.eu/en120/special/you-tell-i-do-and-we-swap-until-we-connect-all-the-gold-mines
  • [2] J. Nasir*, U. Norman*, B. Bruno, and P. Dillenbourg, "When Positive Perception of the Robot Has No Effect on Learning," in 2020 29th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), Aug. 2020, pp. 313–320, doi: 10.1109/RO-MAN47096.2020.9223343.
  • [3] J. Nasir, B. Bruno, M. Chetouani, P. Dillenbourg, "What if social robots look for productive engagement?", in International Journal of Social Robotics. DOI https://doi.org/10.1007/s12369-021-00766-w
  • [4] J. Nasir, A. Kothiyal, B. Bruno, P. Dillenbourg, "Many Are The Ways to Learn: Identifying multi-modal behavioral profiles of collaborative learning in constructivist activities," accepted in International Journal of Computer-Supported Collaborative Learning (2021). https://infoscience.epfl.ch/record/289454?&ln=en
  • [5] Nasir, Jauwairia, Norman, Utku, Bruno, Barbara, Chetouani, Mohamed, & Dillenbourg, Pierre. (2021). PE-HRI: A Multimodal Dataset for the study of Productive Engagement in a robot mediated Collaborative Educational Setting [Data set]. Zenodo. https://doi.org/10.5281/zenodo.4633092
  • [6] Baltrusaitis, T., Robinson, P., & Morency, L.-P. (2016). "OpenFace: An open source facial behavior analysis toolkit," in 2016 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 1–10. doi: 10.1109/WACV.2016.7477553