Providing Behaviour Awareness in Collaborative Project Courses

Abstract: Several studies show that awareness mechanisms can enhance the collaboration process among students and their learning experience during collaborative project courses. However, it is not clear what awareness information should be provided to whom, when it should be provided, and how to obtain and represent such information in an accurate and understandable way. Regardless of the research efforts made in this area, the problem remains open. By recognizing the diversity of work scenarios (contexts) where the collaboration may occur, this research proposes a behaviour awareness mechanism to support collaborative work in undergraduate project courses. Based on the authors' previous experience and the literature in the area, the proposed mechanism considers personal and social awareness components, which represent metrics in a visual way, helping students realize their performance and lecturers intervene when needed. The trustworthiness of the mechanisms for determining the metrics was verified using empirical data, and the usability and usefulness of these metrics were evaluated with undergraduate students. Experimental results show that this awareness mechanism is useful, understandable and representative of the observed scenarios.


Introduction
Learning by doing is one of the most used instructional paradigms to promote meaningful learning in engineering education [Freeman et al. 2014]. Lecturers usually implement this paradigm in their courses by making students work in teams to address particular tasks or projects [Felder et al. 2003]. Typically, the course lecturers and the students have low visibility of individual and team performance during collaboration processes, which limits their capability to react in time to take corrective pedagogical measures or to rectify and improve the students' behaviour patterns. This situation shows the need for automatic mechanisms that monitor team members' activities and provide feedback accordingly to support students and lecturers. However, implementing this feedback mechanism is a complex task due to the large diversity of contexts where the collaboration may occur.
To help address that challenge, this research work explored the feasibility of defining an awareness mechanism to support the collaborative learning activities in undergraduate project courses. The result of this work is a Behaviour Awareness Mechanism (BAM) that provides visual feedback to students and lecturers. The feedback provided by the BAM is aimed at promoting reflection and encouraging social interactions between students. In addition, the BAM is intended to be used across different Computer-Supported Collaborative Learning (CSCL) systems and contexts, providing dynamic and comprehensive feedback to the users. Therefore, this awareness mechanism involves several metrics that should be captured as automatically as possible. Consequently, we consider courses that are supported by software tools that record information about the students' activities, such as learning management platforms, software repositories with version control, project management systems and discussion forums.
The effectiveness of BAM was evaluated using empirical data from courses of the Polytechnic University of Catalonia (UPC), in Spain. This evaluation provided evidence that the proposed awareness mechanism can potentially be computerized and automated, while allowing the intervention of expert users (i.e., lecturers) for the validation or customization of the awareness provision. Therefore, the BAM could be embedded as a service in collaborative learning applications.
The next section discusses the related work. [Section 3] describes the design of the prototype developed to provide visual feedback to students, and [Section 4] its implementation. [Section 5] reports the results of the evaluation of the accuracy of the awareness provided by BAM. [Section 6] shows and discusses the results obtained in the evaluation of the usability and usefulness of BAM. Finally, [Section 7] presents the conclusions and the future work.

Related Work
A significant body of research has focused on studying the factors that contribute to the effectiveness and quality of collaboration. For instance, in [O'Dea et al. 2007] the authors conduct a literature review to define seven general dimensions that affect the success of a collaboration process (e.g., communication, coordination and knowledge management). Similarly, in [Lin et al. 2008] the authors identify some other factors that influence the effectiveness of virtual teams (e.g., relationship building, cohesion and satisfaction). These dimensions and factors are typically used to develop indicators that help determine the usefulness of collaborative systems and also monitor and assess collaboration processes. For instance, in [de Melo et al. 2014] the authors propose a metric to measure the productivity and engagement of the members of a software team, based on the information provided by the version control system. Similarly, in [Cosentino et al. 2014] information from the GitHub version control system is used to assess the success level of open source software projects. There are also similar works in collaborative learning scenarios.

Determining Collaboration in Learning Scenarios
In [Daradoumis et al. 2003], the asynchronous interactions among members of 60 virtual teams were evaluated using information recorded by the BSCW system. Similarly, in [Chounta et al. 2013] the authors used log files, collected from the interactions of student teams with handheld devices, to assess the teams' performance with regard to the results of a location-based learning game. Results from the study showed that the teams with the highest performance in the game were those with the highest activity levels and with the lowest delays between actions. The study presented in [Chounta et al. 2014] uses properties of network graphs as metrics to assess the quality of collaboration in synchronous collaborative learning activities mediated by a shared workspace application. The study considered four factors affecting the collaboration (communication, joint information processing, coordination and interpersonal relationship), and the results showed that the number of nodes of the network had a high correlation with those factors.

Representing Collaboration in Learning Scenarios
It is recognized that awareness is a valuable feature that affects motivation [Wu et al. 2014] and group coordination [Kwon et al. 2013], and therefore the quality of any collaboration process. Consequently, some interesting works in CSCL have been prompted by the need to provide appropriate awareness support to promote active learning and coordinate students' activities [Kwon et al. 2013, Fransen et al. 2011]. In that line, feedback has been regarded as an extremely important awareness mechanism, which positively influences the learning process by providing students with information that allows them to improve their performance and learning behaviour [Schneider et al. 2015].
Many studies have addressed the feasibility of providing visual feedback functions in software systems supporting collaborative learning activities. Some of these studies aim at providing awareness and feedback in e-Learning environments. For instance, in [Kwon et al. 2013] the authors propose a Web-based group coordination tool that visually shows the assessments of the team members' performance and allows comparing these values with those from other teams. A different approach is proposed in [Lambropoulos et al. 2012], in which indicators of presence, participation and interactions among students are used to provide awareness of the teamwork.
Awareness mechanisms have also been used to support face-to-face collaborative learning activities. For instance, in [Melero et al. 2015] the authors propose several task-specific visualizations designed to provide awareness during a gamified location-based learning activity. In [Ogata and Yano 2004] the authors represent collaboration through a knowledge awareness map that shows the relationship between the shared knowledge and the current and past interactions of learners. This representation is used by the students to find potential collaborators and helpers. In a later work these authors present a system that not only recommends educational materials to learners, but also potential peer learners sharing similar interests and experiences [El-Bishouty et al. 2007]. In [Govaerts et al. 2012] the authors propose the Student Activity Meter, a visual representation of the students' actions, designed to increase the awareness of learners and teachers and also to support self-reflection. Janssen et al. [Janssen et al. 2007] study the effects of this type of visualization on the students' participation during computer-supported collaborative learning processes.
The previous works differ from the BAM mainly in two aspects: (i) they focus on a particular type of awareness, such as individual contributions, conflict or peer feedback, whereas our proposal includes a wider range of sources for feedback, providing both subjective and objective information; and (ii) their methods of feedback provision are restricted to particular contexts; i.e., the previously mentioned solutions provide awareness only within the context of a specific collaborative activity, and they are linked to a particular collaborative application. By contrast, BAM is intended to be used across different CSCL systems and contexts. As a result, it uses diverse sources for the calculation of metrics of effectiveness of collaboration, which allows the provision of dynamic and comprehensive feedback to the users.

Making Recommendations
Typically, determining and representing collaboration levels in an instructional scenario are required steps for intervening in the learning process. This intervention can be done by providing awareness to the people involved, or by making particular recommendations. Concerning the second approach, in [González-Ibáñez et al. 2015] the authors propose a method to determine the feasibility of transforming, based on an analysis of benefits and costs, collaboration opportunities into explicit recommendations for collaboration. In [Zheng and Yano 2007] the authors propose a framework for peer-recommendation based on context awareness in e-learning scenarios.
There are important research efforts to tackle the problem of link prediction in co-authorship networks [Benchettara et al. 2010, Brandão et al. 2013], which can be adapted for making recommendations in collaborative learning scenarios. These proposals typically use Social Network Analysis techniques for predicting new links.
The BAM provides recommendations about potential collaborators for conducting team learning activities, based on people's previous actions and features. In particular, this proposal considers the students' collaborative behaviour: communication, coordination, motivation, performance and satisfaction. Next we explain the proposal in detail.

Design of the Behaviour Awareness Mechanism
The Behaviour Awareness Mechanism (BAM) requires a computer-supported environment that provides information about the students' activities for generating visual feedback to students and lecturers. The course lecturer and students use regular software tools to support their activities (including their projects and assignments). Typically, these tools record information about the user activity, which can be automatically retrieved through an application programming interface (API) or by processing the data source. Having this information is mandatory to use BAM; therefore, the first operation of the supporting environment is the Data Capture [Fig.1], which must determine and ensure that the information required to illustrate the users' activities is recorded by the data source and accessible through an external software component.
During the second step, known as Data Processing, the users' information is retrieved and processed by information extractors to obtain the metrics that will be used to instantiate the awareness components. Finally, in the third step, the visual feedback is provided to the user using the metrics obtained in the previous stage. This feedback is intended to promote behavioural changes that impact positively on the activities of the students. Such changes are reflected in the students' interactions with the supporting tools. The lecturer uses this feedback to understand the students' individual and team performance, intervening when necessary.
The visual awareness was designed considering two basic facts: (i) any awareness mechanism must provide an understanding of the activities of others as a context for the activities of the individual [Antunes et al. 2014], and (ii) the feedback provision must ensure that the students are able to relate their current state of learning and performance with specific targets or standards [Nicol et

Personal Awareness Component
The Personal Awareness Component (PAC) provides information about the collaborative patterns of a specific student, which is represented using several features of the students' collaborative behaviour. We conducted a literature review based on previous research work about quality assessment of computer-supported collaboration processes to determine the features that should be represented in the PAC. From this study, we found several basic dimensions related to the effectiveness of collaboration. These dimensions included aspects related to the collaboration processes and the way in which the students interact within different teams (e.g., participation and coordination). We also considered personal features affecting collaboration and learning (e.g., motivation, satisfaction, and individual performance), characteristics of the students' social interactions (e.g., social presence and connectedness) and elements associated with the collaboration outcomes (e.g., productivity, solution quality, and team performance). For the sake of simplicity and to facilitate the visual representation of the PAC, we classified and condensed these dimensions into the following types: communication, coordination, motivation, performance and satisfaction.

Social Awareness Component
The Social Awareness Component (SAC) provides social (collective) awareness and proposes possible suitable collaborators to the user. We use the Multi-Dimensional Scaling (MDS) method to represent students as points in a 2D space [Buja et al. 2008]. Using MDS, the values of the five collaboration features that compose the Collaborative Behaviour Index (CBI) can be mapped into a point in a 2D space, in such a way that the distances between points are preserved. Thus, we can represent two students with similar behaviour as two points located at a short distance from each other. However, students with a similar CBI could still have very dissimilar values of the collaboration features that compose this index. In that case, the MDS allows us to represent such students as two distant points in the SAC; therefore, these students will not be suggested as potential collaborators.
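The mapping from five features to a 2D point can be illustrated with a short Python sketch. It uses classical (Torgerson) MDS with a power-iteration eigen-solver, which is an assumption: the text does not state which MDS variant or library was used. The student identifiers and feature values are hypothetical. In general a 2D embedding only approximates the original distances; the example data are chosen to vary along two directions only, so the distances are reproduced exactly here.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_eigenpair(matrix: List[Vector], iterations: int = 500) -> Tuple[float, Vector]:
    """Dominant eigenvalue and unit eigenvector of a symmetric PSD matrix (power iteration)."""
    n = len(matrix)
    vec = [1.0 + 0.1 * i for i in range(n)]  # fixed, non-constant start: deterministic output
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm < 1e-12:  # nothing left to extract
            return 0.0, [0.0] * n
        vec = [v / norm for v in nxt]
    image = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
    return sum(v * w for v, w in zip(vec, image)), vec


def classical_mds(points: List[Vector], dims: int = 2) -> List[Vector]:
    """Embed the points in `dims` dimensions from their pairwise distances."""
    n = len(points)
    sq = [[euclidean(p, q) ** 2 for q in points] for p in points]
    row_mean = [sum(row) / n for row in sq]
    total_mean = sum(row_mean) / n
    # Double centring turns squared distances into an inner-product matrix.
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]
    coords = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        value, vec = top_eigenpair(b)
        scale = math.sqrt(max(value, 0.0))
        for i in range(n):
            coords[i][d] = scale * vec[i]
        # Deflation removes the component that was just extracted.
        b = [[b[i][j] - value * vec[i] * vec[j] for j in range(n)] for i in range(n)]
    return coords


# Hypothetical students: communication, coordination, motivation, performance, satisfaction
students = {
    "s1": [10.0, 10.0, 50.0, 50.0, 50.0],
    "s2": [50.0, 10.0, 50.0, 50.0, 50.0],
    "s3": [90.0, 20.0, 50.0, 50.0, 50.0],
    "s4": [30.0, 40.0, 50.0, 50.0, 50.0],
}
coords = classical_mds(list(students.values()))
```

Each entry of `coords` is the 2D position of the corresponding student; students with similar feature vectors end up close together.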

Implementation of the Behaviour Awareness Mechanism
The next subsections present a use case that describes the three steps required for the implementation of the proposed awareness mechanism. This use case provides details about the tools and methods used for the implementation of the personal and social awareness components of the BAM.

Data Capture
In this use case we used real data traces collected from the behaviour of a group of students during an academic semester as the information required to provide awareness. The data traces correspond to 42 third-year students enrolled in the course "Design of Applications and Services (DSA)", delivered at the Castelldefels School of Telecommunications and Aerospace Engineering of the Polytechnic University of Catalonia. In this course the students had to carry out a software development project in teams, using several software systems that allowed the organization and coordination of the team activities as well as the submission of assignments and tests.
This dataset included information from the students' activities recorded in log files and opinions collected from online surveys while working within the formal learning context. This also included the interactions among students through collaborative learning tools and software systems used to support their learning activities. The analysis of the surveys was intended to identify the students' feelings, opinions and behaviour during the course (both inside and outside the classroom), as well as the lecturers' observations about the state and progress of the students' collaboration activities. On the other hand, the log files collected from the software platforms provided information about the students' activities and performance while working on the course project. These data traces provided diverse qualitative and quantitative metrics that were used for the calculation of the five collaborative behaviour features represented in BAM. Table 1 shows the data sources considered in this study and a detailed list of the metrics collected using those sources.

Data Processing
To calculate the five collaborative behaviour features considered in the BAM, we used different combinations of metrics from the data traces collected from different sources. Hence, each feature is calculated using specific metrics, normalizing (from 0 to 100) the values that each metric provides, assigning weights as multiplying factors, aggregating the resulting values, and applying a corrective factor. This process can be represented through the following equation:
$$\mathit{Feature}_x(s) = \sum_{i=1}^{n} \alpha_i \, \mathit{Metric}_i(s) + \beta$$
where $\mathit{Feature}_x$ represents the collaborative feature to be calculated, $x$ is the identification number of this feature, $n$ is the number of metrics used, $\alpha_i$ corresponds to the multiplying factors (from 0 to 1) that weigh each metric, $\mathit{Metric}_i$ is a function of a variable $s$ (the student) with a range from 0 to 100, and $\beta$ is a corrective term. The sum of the $\alpha_i$ and $\beta$ is 1. From the previous equation and for the considered feature, we obtain a value within the range 0-100.
As an example, let us consider that we use three metrics to calculate the "Performance" of a specific student. These metrics are the individual and group grades of Moodle assignments and the coding frequency as calculated by GitHub. In this case, the resulting equation for Performance is the following:
$$\mathit{Performance}(s) = \alpha_1 \, \mathit{Grade}_{ind}(s) + \alpha_2 \, \mathit{Grade}_{grp}(s) + \alpha_3 \, \mathit{CodFreq}(s)$$
It is important to take into account that the metrics from each data trace can lie within any possible range of values; therefore, we must normalize the values of these metrics. For instance, the "GitHub coding frequency" indicates the number of items added by a particular student to the software project repository within a certain time period. We can normalize the GitHub coding frequency by assigning the values of 0 and 100 respectively to the theoretical minimum and maximum number of expected additions for a specific period. Thus, 0 and 100 could correspond to coding frequencies of 1 and 5 additions per week respectively. Also, notice that in this case the corrective factor is zero and that we assigned different weights to the multiplying factors, giving more importance to some measurements than to others.
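The normalization and weighted aggregation described above can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function names, the example grades and the weights 0.4, 0.4 and 0.2 are hypothetical, and adding the corrective term directly to the weighted sum is an assumption.

```python
from typing import Sequence


def normalize(value: float, lower: float, upper: float) -> float:
    """Map a raw metric onto 0-100, clamping values outside [lower, upper]."""
    if upper <= lower:
        raise ValueError("upper must be greater than lower")
    scaled = (value - lower) / (upper - lower) * 100.0
    return max(0.0, min(100.0, scaled))


def feature_value(metrics: Sequence[float], weights: Sequence[float],
                  beta: float = 0.0) -> float:
    """Weighted sum of normalized metrics (each 0-100) plus a corrective term."""
    if len(metrics) != len(weights):
        raise ValueError("one weight per metric is required")
    if abs(sum(weights) + beta - 1.0) > 1e-9:
        raise ValueError("weights and beta must sum to 1")
    return sum(w * m for w, m in zip(weights, metrics)) + beta


# Hypothetical student: grades are already on a 0-100 scale.
individual_grade = 80.0
group_grade = 70.0
# 3 additions per week, with 1 and 5 additions mapped to 0 and 100.
coding_frequency = normalize(3, lower=1, upper=5)

# Hypothetical weights; the corrective term is zero, as in the example in the text.
performance = feature_value(
    [individual_grade, group_grade, coding_frequency],
    weights=[0.4, 0.4, 0.2],
    beta=0.0,
)
```

Clamping in `normalize` keeps a student who exceeds the expected maximum (for example, 9 additions per week) at 100 instead of leaving the 0-100 range.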
Following the previous considerations, we can automatically generate the five collaborative features considered in the BAM, and also determine the weight that should be given to each metric. The use of machine learning techniques allows the system to learn, over a period of time, how the students collaborate and interact using the software systems considered as data sources. Based on the learned information, the system can recalculate the collaborative behaviour features using new data. The machine learning algorithms ease the interpretation of the models used for the calculation of the features, allowing monitoring by human experts (e.g., lecturers) and enabling the validation and fine-tuning of the generated models.
The first step to generate a model that automates the calculation of the features is to filter the metrics collected from the data sources to discard those that are not significant for the results. To do so, we first measured the values of the collaborative features for each individual student using the ratings provided by human observers; e.g., assessments of the lecturers and self-reports of the students about the quality of collaboration during the course.
Once we had obtained the "real" values of these features, we used three different methods to select the most relevant metrics, as shown in Table 2. The Correlation analysis is used to discard metrics that had either a very high correlation between them or a very low correlation with the features. The Correlation Feature Selection Subset Evaluator [Hall 1998] allows us to make a fine-grained selection of subsets of metrics that are highly correlated with the features while having low intercorrelation. Finally, the Wrapper method [Kohavi et al. 1997] allows us to select sets of metrics that are specifically significant for a particular learning scheme. In our case, we considered a Linear Regression scheme. Using these methods, we reduced the initial set of 44 metrics to a subset of only 14. The correlation values between the subset of selected metrics and the collaborative behaviour features are shown in Table 3.
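The first of these methods, the correlation analysis, can be illustrated with a small Python sketch. It is a simplified stand-in, not the Correlation Feature Selection or Wrapper procedures themselves: the greedy strategy, the thresholds `min_relevance` and `max_redundancy`, the metric names and the data are all hypothetical.

```python
import math
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / math.sqrt(var_x * var_y)


def select_metrics(metrics: Dict[str, List[float]], target: List[float],
                   min_relevance: float = 0.3,
                   max_redundancy: float = 0.9) -> List[str]:
    """Keep metrics related to the target and not near-duplicates of each other."""
    ranked = sorted(metrics, key=lambda name: abs(pearson(metrics[name], target)),
                    reverse=True)
    kept: List[str] = []
    for name in ranked:
        if abs(pearson(metrics[name], target)) < min_relevance:
            continue  # too weakly related to the observed feature
        if any(abs(pearson(metrics[name], metrics[k])) > max_redundancy for k in kept):
            continue  # nearly duplicates a metric that was already kept
        kept.append(name)
    return kept


# Hypothetical observed feature values (one per student) and candidate metrics.
observed = [10.0, 20.0, 30.0, 40.0, 50.0]
candidates = {
    "commits": [1.0, 2.0, 3.0, 4.0, 5.0],
    "commits_doubled": [2.0, 4.0, 6.0, 8.0, 10.0],  # redundant with "commits"
    "forum_posts": [2.0, 1.0, 4.0, 3.0, 5.0],
    "noise": [3.0, 1.0, 3.0, 1.0, 3.0],  # uncorrelated with the observed values
}
selected = select_metrics(candidates, observed)
```

With these data, one of the two commit metrics and `forum_posts` are kept, while `noise` is discarded for its low correlation with the observed values.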

Representation of the Personal Awareness Component
We divided the PAC visualizations into two subcomponents to represent the students' collaborative behaviour features. The PAC-CBI [Fig.2.a] displays an   The PAC-Features subcomponent [see Fig. 2.b] is represented through a radar diagram. Each feature of the students' collaborative behaviour is depicted as a vertex of a coloured pentagon, whose size corresponds to the normalized value of the features. Similar to the PAC-CBI, we depict four concentric pentagons, where the first one represents the theoretical ideal value expected for all the behaviour features. This value will be defined by lecturers according to specific targets that the students could ideally achieve. The pentagons of variable size represent the normalized minimum, average and maximum values of the features for the overall group of students. This enables a student to compare, for each feature, his/her own performance with that of his/her peers.
Notice that both the PAC-CBI and the PAC-Features visualizations represent the behaviour of the specific student to whom the feedback is being displayed as colour-filled shapes (a circle and a pentagon, respectively). Moreover, the visualization of additional blank shapes, which represent the ideal as well as the minimum, average and maximum values, provides the student with an understanding of his/her current state with regard to his/her fellow students and the ideal value.
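The reference shapes described above can be derived directly from the group's feature values. The following Python sketch shows one way to compute them; the data layout, the function name and the option to leave the displayed student out of the group statistics are assumptions made for illustration.

```python
from statistics import mean
from typing import Dict

FEATURES = ["communication", "coordination", "motivation", "performance", "satisfaction"]


def reference_levels(group: Dict[str, Dict[str, float]],
                     exclude: str = "") -> Dict[str, Dict[str, float]]:
    """Minimum, average and maximum of every feature over the group.

    `exclude` optionally leaves one student (e.g. the one receiving the
    feedback) out of the statistics.
    """
    peers = [values for name, values in group.items() if name != exclude]
    if not peers:
        raise ValueError("at least one peer is required")
    return {
        feature: {
            "min": min(p[feature] for p in peers),
            "avg": mean(p[feature] for p in peers),
            "max": max(p[feature] for p in peers),
        }
        for feature in FEATURES
    }


# Hypothetical normalized feature values (0-100) for three students.
group = {
    "s1": dict(zip(FEATURES, [90.0, 80.0, 70.0, 95.0, 60.0])),
    "s2": dict(zip(FEATURES, [40.0, 50.0, 60.0, 40.0, 70.0])),
    "s3": dict(zip(FEATURES, [60.0, 70.0, 80.0, 60.0, 50.0])),
}
levels = reference_levels(group, exclude="s1")
```

When the displayed student is excluded, his/her own value can fall outside the group range, as happens here for the performance of `s1` (95 against a peer maximum of 60).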

Representation of the Social Awareness Component
For the visual representation of the SAC, we defined two different criteria to recommend collaborators. [Fig.4] shows the "highly recommended collaborators" and the "other recommended collaborators" areas, where the former includes at least one potential collaborator, located at the closest MDS distance from the represented student, and the latter includes the former and has a range that covers at least 20% of the closest potential collaborators. This percentage was decided on the basis of the Pareto principle or 80-20 rule [Hardy 2010]; therefore, we considered that 20% of all possible collaborators can produce the most significant impact on the collaboration process.
This method for suggesting collaborators is based on the correlation between values of the CBI components for several students. Here, it is possible to recommend students who have similar or complementary behaviours in the same collaboration dimensions. Particularly, [Fig.4.a] represents the feedback provided to a student, where only those possible collaborators with complementary behaviour are proposed. In this case, students who have high values of certain behaviour features are suggested as potential collaborators of other students who have small values in such features, and vice versa. [Fig.4.b] suggests collaborators using different colours depending on whether they have a behaviour similar or complementary to that of the student receiving the feedback.
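The two recommendation areas can be sketched in Python from the 2D positions of the students. This sketch covers only the distance-based criterion; it does not distinguish similar from complementary behaviour. The function name, the tie handling and the example coordinates are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def recommend(student: str, positions: Dict[str, Point],
              share: float = 0.20) -> Tuple[List[str], List[str]]:
    """Return (highly_recommended, other_recommended) peers for `student`."""
    origin = positions[student]
    ranked = sorted(
        (name for name in positions if name != student),
        key=lambda name: math.dist(origin, positions[name]),
    )
    if not ranked:
        return [], []
    closest = math.dist(origin, positions[ranked[0]])
    # Peers tied at the minimum distance are all "highly recommended".
    highly = [n for n in ranked
              if math.isclose(math.dist(origin, positions[n]), closest)]
    # The wider area contains the closest peers and at least `share` of all candidates.
    size = max(len(highly), math.ceil(share * len(ranked)))
    return highly, ranked[:size]


# Hypothetical 2D positions, e.g. as produced by an MDS projection.
positions = {
    "ana": (0.0, 0.0),
    "p1": (1.0, 0.0), "p2": (0.0, 2.0), "p3": (3.0, 0.0), "p4": (0.0, 4.0),
    "p5": (5.0, 0.0), "p6": (0.0, 6.0), "p7": (7.0, 0.0), "p8": (0.0, 8.0),
    "p9": (9.0, 0.0), "p10": (0.0, 10.0),
}
highly, others = recommend("ana", positions)
```

With ten candidates and a 20% share, the wider area contains the two closest peers, the first of which is also the highly recommended one.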

Evaluating the Accuracy of BAM
We used the Weka workbench system [Hall et al. 2009] to represent and evaluate the models used to calculate the features of the students' collaborative behaviour. We chose this system since it incorporates a variety of learning algorithms and some tools for the evaluation and comparison of the results. To evaluate the accuracy of these models we used the Correlation Coefficient (CC), the Mean Absolute Error (MAE) and the Root Mean Square Error (RMSE). The CC indicates the degree of correlation between the models and the values of the features as captured by the ratings provided by human observers. The MAE and RMSE are common measures used to determine the quality of prediction models. The MAE gives the same weight to all the deviations between the real and predicted values. By contrast, the RMSE weighs large errors more heavily than small ones. We use the following definitions:
$$CC = \frac{\sum_{i=1}^{N} (p_i - \bar{p})(a_i - \bar{a})}{\sqrt{\sum_{i=1}^{N} (p_i - \bar{p})^2 \, \sum_{i=1}^{N} (a_i - \bar{a})^2}}, \qquad MAE = \frac{1}{N} \sum_{i=1}^{N} |p_i - a_i|, \qquad RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (p_i - a_i)^2}$$
where $a_i$ and $p_i$ are the real and predicted values of a feature for instance $i$, $\bar{a}$ and $\bar{p}$ are their means, and $N$ is the number of instances evaluated.
In a first stage we randomly divided the dataset (42 students) into ten groups. In each round, one group was held out for the evaluation and the other nine were used for training; i.e., we conducted a 10-fold cross-validation.
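The three accuracy measures and the fold split can be sketched in plain Python. This is an illustration and not the Weka implementation: it assumes the standard definitions of the measures and a standard k-fold split in which each group is held out once for evaluation; the function names, the seed and the example values are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error: every deviation counts with the same weight."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error: squaring makes large deviations weigh more."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def correlation(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Pearson correlation between real and predicted values."""
    n = len(actual)
    mean_a, mean_p = sum(actual) / n, sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


def k_fold_indices(n_items: int, k: int = 10,
                   seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Shuffle the item indices and return k (train, test) index pairs."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for held_out in range(k):
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        splits.append((train, folds[held_out]))
    return splits


# Hypothetical real and predicted values of one feature for four students.
real = [50.0, 60.0, 70.0, 80.0]
predicted = [52.0, 58.0, 73.0, 79.0]
errors = (correlation(real, predicted), mae(real, predicted), rmse(real, predicted))

# Ten folds over 42 students: every student is evaluated exactly once.
splits = k_fold_indices(42, k=10)
```

In a full evaluation, a model would be trained on each `train` index list and scored on the matching `test` list, and the measures would be computed over all held-out predictions.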
Once the model was processed using the training data set, we evaluated its prediction accuracy by running the model against the test set, which allowed us to assess the model using new data. Results show that there is a high correlation between the prediction models and the "real" values of the features [see Tab.4]. We can also observe that the prediction errors are relatively small, especially considering that the granularity required for the visual representation of the features should not be very high. Therefore, we can conclude that the prediction models allow a trustworthy visualization of the collaborative behaviour features that is representative of the real-world observations. The awareness model requires a minimum of four value levels for the representation of the features, which correspond to the minimum, average, maximum and ideal values represented by the BAM. This granularity could be increased up to eight levels to represent intermediate values between them.
In a second stage we conducted a new evaluation of the model, where we ordered the dataset chronologically and then split it into two segments. The first data segment was used to train the model, and the second one was used to evaluate it. In this second evaluation process we considered training segments of 24%, 47% and 71% of the whole dataset. The results show that the error decreases as the size of the training set increases, particularly in communication, coordination and performance [see Tab.5]. The error tends to be acceptable when using a training data segment with a size of at least 47% of the dataset. This does not occur in the motivation and satisfaction dimensions since they involve few instances; therefore, a few estimations with a large error have an important negative impact on the prediction of these variables. We have also observed that large errors in motivation and satisfaction do not occur simultaneously in the same students. In fact, a large error in motivation tends to be linked to a small error in satisfaction, and vice versa. Therefore, if we consider as acceptable an error below 0.25, then all students of the dataset have 4 or 5 features that can be considered as representative of what we can observe in practice.
The results also show that the model generated by the Simple Linear Regression scheme obtains the best results. This can be explained by overfitting in the generation of the Multiple Linear Regression model. This means that because of the use of multiple metrics, some extreme values (not representative of the overall behaviour of the students) are taken into account for the creation of the model. Consequently, it might be necessary to perform further processing of the data input to eliminate outliers from the data traces.
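The chronological training-test split of the second stage can be sketched as follows. This is only an illustration: the record layout (a timestamp paired with a payload), the function name and the 100 synthetic records are hypothetical, and rounding the cut point to the nearest record is an assumption.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def chronological_split(records: Sequence[T], train_fraction: float,
                        timestamp: Callable[[T], float]) -> Tuple[List[T], List[T]]:
    """Order records by time and use the earliest fraction for training."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    ordered = sorted(records, key=timestamp)
    cut = round(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


# Hypothetical records given out of order: (timestamp, payload).
events = [(t, f"event-{t}") for t in range(100, 0, -1)]

# The three training sizes considered in the second evaluation stage.
splits = {
    fraction: chronological_split(events, fraction, timestamp=lambda e: e[0])
    for fraction in (0.24, 0.47, 0.71)
}
```

Because the split follows time order, every training record precedes every test record, so the model is always evaluated on later behaviour than it was trained on.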

Empirical Evaluation of BAM
The usability of BAM was evaluated using a software component embedded in the Moodle learning platform, which is used by the students of the DSA course to support their activities. Then, we conducted a user study involving 42 students of that course who did not participate in the previous experience. The visual awareness used in the study involved the three representations considered in BAM. We asked participants to complete three tasks, one for each visualization type, to evaluate the fitness of the awareness proposal. Then, they had to indicate whether those figures represented "poor", "average" or "good" student performance, and whether some students represented in the SAC were "highly recommended", "recommended" or "not recommended" as collaborators. For simplicity, we named these classification tasks according to the rating levels that they represent as "good", "medium" or "bad".
[Fig.5] shows the results of the classification tasks for the three elements of the BAM, which compose the visual representations of our proposal. As we can observe, there is a high rate of correct answers (94.91% on average) for all the figures, which supports the suitability of our feedback proposal. These results were useful to provide insights on how suitable the proposed awareness mechanism is to classify different learning behavioural patterns and suggest possible collaborators.
In addition, we asked participants to answer several questions to assess the usability of the three components of the BAM. These questions were taken from the Usability Perception Scale (UPscale) [Karlin et al. 2013] and the Post-Study System Usability Questionnaire (PSSUQ) [Lewis 2002]. Both tools were adapted to suit the purposes of this study and formatted as a 5-point Likert scale. The resulting usability questionnaires included questions designed to evaluate attributes of the visualizations, such as ease of interpretation, learnability, usefulness, relevance and intention of use.
The prototype evaluation considered the analysis of the perceived usefulness of the feedback model, and also its suitability to be used as part of the awareness  [Fig.6] shows the results obtained from the UPscale, which suggest very positive participant perceptions about the usability (70.42% on average) and engagement (65.69% on average) of the three kinds of visualizations. These results helped us evaluate the students' perceived satisfaction concerning the information quality and its representation, as well as the usefulness and comprehensibility of the feedback.
Similarly, the results from the PSSUQ questionnaire indicate a high rate of participants' satisfaction (76.31% on average) for such visualizations [Fig.7]. Considering both usability questionnaires, the results revealed that the students were most satisfied with the representation provided by the SAC component, followed by the PAC-Features and the PAC-CBI respectively.

Conclusions and Future Work
This paper proposes a Behaviour Awareness Mechanism (BAM) as a method to provide visual feedback to students while they perform collaborative learning activities. This method is intended to enhance the learning experience and encourage self-reflection about the collaboration process. The usability and usefulness of this proposal were evaluated through a proof-of-concept evaluation in an undergraduate course at the Polytechnic University of Catalonia, Spain. This evaluation involved the collection of data traces from 42 students enrolled in an undergraduate software engineering course. Then, a set of 24 new students of that course used an implementation of BAM to support the activities of their software development teams.
The obtained results indicate that the BAM is useful to provide aggregate feedback about the students' behaviour and performance, using information from different data sources. The implemented awareness mechanism was able to prop-

Figure 1: Computer-supported environment required for implementing the BAM

Table 1: Data sources and metrics used for the implementation of the BAM
of the quality of the information and tools provided - Students' ratings of satisfaction with the learning process and outputs - Students' ratings of the quality of collaboration

Figure 2: Design of the visual representations of the CBI and the collaboration features

Figure 3: Example representations displayed for the PAC
[Fig.3] shows an example of the visual representation of the collaborative behaviour of a student as displayed in the PAC. This representation considers the CBI index [Fig.3.a], as a measure of the overall collaborative behaviour, and also the detail of the five collaboration dimensions [Fig.3.b]. Notice that the values of the features for a student can exceed the maximum value of the overall group of students, if the student was excluded from the calculation of the minimum, maximum and average values of the PAC.

Figure 4: Design of the visual representations of the Social Awareness Component

Figure 5: Results of the classification tasks

Figure 6: Results of the UPscale questionnaire

Figure 7: Results of the PSSUQ questionnaire

Table 2: Techniques used for the selection of metrics

Table 3: Selected metrics and correlation with the collaborative behaviour features

Table 4: Evaluation of the model using a cross-validation method

Table 5: Evaluation of the model using a training-test method