Knowledge presented in concept maps: correlations with conventional cognitive knowledge tests

Our study focuses on the correlation between concept map (CMap) structures and learning success measured with short answer tests, taking particular account of the complexity of the subject matter. Novice sixth grade students created CMaps about two subject matters of varying difficulty. The correlation between the complexity of the CMaps and the post-test was small but highly significant for both subject matters. The complexity of the CMaps correlated with long-term knowledge in the difficult subject matter but not in the easy one. Furthermore, the high number of technical errors makes it close to impossible to estimate students' knowledge from the CMaps alone. In summary, CMaps do not provide an adequate alternative to conventional short answer knowledge tests, but combined with them they may offer a better comprehension of a student's knowledge structure and aid in tailoring further instruction to individual needs.


Introduction
The present work analyses the applicability of concept mapping (CM) as a tool for evaluating students' knowledge by comparing knowledge represented in concept maps (CMaps) with the results of conventional knowledge tests in multiple-choice format.

CMaps in school
A CMap is a diagram with nodes representing concepts such as ideas, images or words, connected to each other by labelled arrows describing the relations between the concepts (Figure 1). This representation is similar to a road map with locations and connecting roads. CM is discussed as a helpful learning tool because the structure of a CMap seems to reflect the mental representation of knowledge, as Collins and Quillian (1972) found in their studies of "semantic memory", or the mental organisation of information, as Ausubel describes in his "assimilation theory of cognitive learning" (Ausubel, Novak, and Hanesian 1978; Novak and Cañas 2006).
Because science education deals with complex, interdisciplinary subject matter with a variety of concepts, CM is particularly interesting in this context as a tool for the visual presentation of coherences (Novak 1995). CMaps can help students to understand complicated conceptions, for example in ecology, which are often difficult to penetrate cognitively. For meaningful understanding students need networked thinking (Vester 2002), which is why CM is regarded as an adequate educational tool, with the special possibility of training conceptual learning (Mintzes, Wandersee, and Novak 1997). It has been evaluated as a method for both learning (Slotte and Lonka 1999) and illustrative instruction (Toth, Suthers, and Lesgold 2002). Learning with previously constructed CMaps is especially helpful for students with low reading comprehension skills or with hardly any pre-knowledge about the subject matter (Rewey et al. 1989; Amer 1994; O'Donnell, Dansereau, and Hall 2002).
Creating a CMap by finding connections between different concepts supports learners in the active metacognitive processing that builds long-term knowledge. This is an essential part of meaningful learning. The complexity level of CMaps indicates the level of understanding and apprehension of scientific texts (Slotte and Lonka 1999). Moreover, the capacity to structure knowledge is itself an indicator of competence (Glaser and Bassok 1989). CM is thus an interesting tool for learners, both as an instruction tool and for organising their own knowledge. For teachers it is difficult to analyse students' CMaps, although various approaches have been developed for assessing the quality of CMap structure, and hence the mental ability of the constructor (Novak and Gowin 1984; McClure and Bell 1990; Kinchin, Hay, and Adams 2000; Schaal 2006; Gerstner and Bogner 2009). The focus of this study is to find a method for teachers to analyse their students' CMaps quickly.

Figure 1. Example of a digitised CMap about subject (B) "Ecosystem Lake". Annotation: dotted arrow = link incorrect with respect to content; plain arrow = link incorrect with respect to method; round/spotted/double frames = items of subnets.

The multimedia tool of CM
Learners are dual encoders with limited capacity who actively process information in order to integrate it meaningfully into their existing knowledge (Mayer 2001). CM affords this active form of learning and can be understood as multimedia learning using a pictorial presentation of verbal information (Mayer 2001). On the one hand, this has the advantage of dual coding of the learning matter (Paivio 1971). On the other hand, constructing a CMap requires metacognitive processes that may help learners to arrange individual knowledge meaningfully.
However, two codes plus a complex technique may inhibit learning success, because the requirements of any learning method place additional demands on the limited capacity of working memory, a demand labelled "Cognitive Load" (CL) (e.g. Tarmizi and Sweller 1988; Baddeley 1992; Sweller 2006). Working memory is needed both to integrate new information and to handle the instructions. Sweller, van Merriënboer, and Paas (1998) distinguished three CL components: (i) an intrinsic load caused by content complexity, (ii) an extraneous load caused by the instructional mode and (iii) a germane load necessary for individually processing information and passing it on to long-term memory. As all three components are assumed to be additive (Sweller 2006), an increase in component (i) and/or (ii) without a decrease in the other components may cause Cognitive Overload. The capacity available for (iii), the germane load, would be reduced and consequently cognitive learning of the subject matter would decline. In the present study, we analyse the effect of varied intrinsic load on learning success and CM with two differently complex subject matters.
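The additivity assumption can be written compactly. The notation below (a total load and a working-memory capacity bound $C_{WM}$) is our shorthand for the argument above, not Sweller's original formalism:

```latex
CL_{\text{total}}
  = \underbrace{CL_{\text{intrinsic}}}_{\text{content complexity}}
  + \underbrace{CL_{\text{extraneous}}}_{\text{instructional mode}}
  + \underbrace{CL_{\text{germane}}}_{\text{schema construction}}
  \;\le\; C_{WM}
```

If intrinsic plus extraneous load approaches the capacity bound, the share left for germane load is squeezed towards zero, which is the overload case sketched above.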

CMaps: state of the art
As CM was developed to portray the emerging knowledge of children (Novak and Gowin 1984), its usefulness for evaluating learners' concepts is obvious (Ruiz-Primo and Shavelson 1996). CMaps reflect newly acquired individual knowledge (Stracke 2004). In this context, CM is often discussed as an appropriate method of knowledge testing (Novak and Gowin 1984; Horn and Mikelskis 2003; Schaal 2006), especially because young students enjoy demonstrating their concepts in this creative way (Stice and Alvarez 1987; Conradty and Bogner 2010).
The combination of CM and knowledge tests for gaining information about students' knowledge is not unfamiliar. After introducing CM, Willerman and MacHarg (1991) examined pupils' gain in knowledge by using a cognitive knowledge test. Heinze-Fry and Novak (1990) investigated students' knowledge achievement after CM with a post- and delayed post-test design. Åhlberg and Ahoranta (2008) propose the use of both short answer knowledge tests and CMaps for a better comprehension of students' learning achievements.
As the constructor of a CMap needs to understand the relationships between concepts, only well-integrated, complex knowledge should appear in the CMap. There should therefore be a correlation between CMaps and knowledge tests, one that is even stronger with long-term memory tests, as knowledge presented in CMaps is linked and meaningful.
Knowledge represented in a conventional knowledge test with short answer format and knowledge represented in a CMap are not at the same level, but they depend on each other. Multiple-choice questionnaires provide aided recall, as one of the offered answers is correct. Learning for such tests may result in rote learning, whereas CM promotes meaningful learning, because newly acquired knowledge has to be reorganised for the construction of the CMap (Novak and Gowin 1984; Novak 1990).

Research questions
In this article, we analysed the correlation of scores reflecting the complexity of CMaps with cognitive knowledge tests, taking into account the difficulty of the subject matter. Our hypotheses were:
I. The complexity of a CMap correlates with newly acquired knowledge, presented in short answer knowledge test sum scores.
II. Knowledge reflected in CMaps is meaningful and consequently long lasting. Accordingly, the complexity of a CMap correlates with long-term knowledge, presented in the sum scores of a short answer knowledge test administered six weeks later.
III. The correlation of CMap complexity and knowledge test sum score depends on the degree of difficulty of the subject matter.
IV. Technical errors cause an underestimation of the knowledge presented in CMaps.

Methodology
Our study was conducted in ten high school classes of the highest stratification level (Gymnasium) in Bavaria, Germany. We selected novice sixth grade students (N = 283). The participants' mean age was 12.56 (SD = 0.08) years. By chance, the gender distribution was perfectly balanced. In order to reduce teacher effects, a single teacher previously unknown to all students tutored all lessons (cf. Table 1). To standardise the pre-lesson, a computer-aided learning unit was implemented, with CM introduced as a consolidation phase recapitulating the computer-aided material. The cohort already had sufficient experience with computers, because informatics is part of the regular syllabus, but had no experience with CM. We decided in favour of groups of two students assembled by the participants' choice, because several studies have reported that cooperative CM results in higher achievement scores than CMaps constructed individually (Okebukola and Jegede 1989; Okebukola 1992).
To analyse the effect of the complexity of the subject matter, students constructed CMaps about two subjects of varying difficulty. The difficulty of the subject matter was defined in the following way: we presume that the less pre-knowledge is available, the more difficult further learning becomes (Mayer 2001), and thus the more difficult the subject is. Furthermore, we expected subject (A) to be easier, as students had already had hands-on experience with it. According to the curriculum, subject (B) should be new to this age-group. Additionally, its content is quite abstract and includes Latin terms.
In the pre-lesson, students worked cooperatively with autonomous computer-aided instruction for 60 min on each subject matter (Table 1). The first (moderate) module (A), "From Polliwog to Frog", highlighted the following topics: (a) relationship of body and mode of life (physique, living in the course of the year), (b) food relationships, (c) reproduction and development, (d) endangerment and conservation issues related to the species and (e) hands-on experiments to support abstract rational thinking skills. The second (complex) module (B), "Ecosystem Lake", incorporated a higher difficulty level and highlighted the following subjects: (a) basic concepts of ecology, (b) plants (prominence and function of photosynthesis), (c) energy conversion and respiration, (d) food webs, (e) information about several fascinating animals to create awareness of the need for protection (further examples of the relation between physique and mode of life) and (f) endangerment and conservation of ecosystems. A workbook with questions provided a guide through the lesson. All students were requested to complete their own workbook autonomously. No additional teacher support was needed, except when technical problems of the in-school computer network appeared.
For the introduction of the technique of CM, the novice students produced a CMap about a generally well-known example unrelated to our subject content in a separate 15 min preface under teacher supervision. Students then created individual CMaps about subject (A) and subsequently (B). For each subject, 35 min were available. All items for the CM were predefined (Table 2), thus presumably reducing CL (Nückles et al. 2004). Furthermore, predefined items may support CM validation, as all maps are built from the same item set (Figure 1). In order to test students' pre-knowledge and their short-term and long-term learning success, a multiple-choice knowledge test with three distractors and one correct answer was applied three times: one week before the instruction (pre-test, K1), immediately after the instruction (post-test, K2) and six weeks after the instruction (retention-test, K3; Table 1). For the item set an expert rating was employed. To control for test effects, our quasi-experimental BACI design (Smith 2002) included a control group (n = 56) that did not participate in the treatments but filled in the knowledge tests.
We tested learning achievements using two different methods of knowledge testing: (i) a traditional knowledge test in multiple-choice format and (ii) knowledge represented in CMaps. To analyse the applicability of CM for knowledge testing, we correlated both K2 and K3 with the scores derived from the CMaps. As learning achievements and learning style could differ with the complexity of the subject matter, all students had lessons, CMs and tests on both an easy and a difficult subject matter.

Analysis
For statistical analyses we used SPSS 14.0 and SPSS 16.0. The hand-drawn CMaps were digitised with the Mannheimer Netzwerk Elaborations Technik (MaNet) Version 1.6.1, © Mannheim Research Company MaResCom GmbH (www.marescom.net).

Analysis of knowledge test
Previous knowledge levels and changes in knowledge were measured by means of a pre-test (K1) applied two weeks before lesson participation, a post-test (K2) immediately after the lesson and a retention test (K3) six weeks later (Table 1). To guarantee content validity, items were constructed according to the learning goals of the syllabus-based intervention. Seventeen items in multiple-choice format with four response options each were applied, providing three distractors and one correct answer (see examples, Table 3). Thus, the pure guessing probability was 0.25.
Item difficulties, defined as the percentage of correct answers, should range between 0.2 and 0.8 (Bortz and Döring 1995). Items outside this range were discarded. The Corrected Item-Total Correlation fell within the +0.2 to +0.5 range. Test reliability was evaluated following Lienert and Raatz (1998). Within K2 as well as K3, reliability reached values of α = 0.75. Split by the two subject matters, reliability remained acceptable at α = 0.65 for both. Due to a non-normal distribution of our knowledge sum scores (Shapiro-Wilk p < 0.0001), we applied non-parametric tests. Knowledge items were scored as correct (1 point) or incorrect (0 points) and analysed as sum scores. Significance of learning success was calculated with the Wilcoxon signed-rank test, and correlations between knowledge tests and CM factors with Spearman's rho.
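The item screening and the rank correlation described above can be sketched in a few lines of plain Python. The function names are ours, purely for illustration; the actual analyses were run in SPSS, and a statistics package would normally be used instead.

```python
from statistics import mean

def item_difficulty(responses):
    """Proportion of correct (1) answers for one item across all students."""
    return sum(responses) / len(responses)

def keep_item(responses, lo=0.2, hi=0.8):
    """Keep an item only if its difficulty lies within [lo, hi]."""
    return lo <= item_difficulty(responses) <= hi

def _ranks(xs):
    """1-based average ranks; tied values share the mean of their positions."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(xs):
        j = i
        while j + 1 < len(xs) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of tied positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman's rho = Pearson correlation computed on the rank vectors."""
    rx, ry = _ranks(xs), _ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5
```

For example, an item answered correctly by half of the students (difficulty 0.5) is kept, while one answered correctly by 90% is discarded, and perfectly monotone score pairs yield rho = 1.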

Analysis of CMaps
The maximal complexity is the total number of connections a student created in a CMap. Students made several types of errors in their CMaps, resulting both from technical reasons and from misconceptions (Conradty and Bogner 2010). After deletion of all incorrect relations, the remaining connections form the actual complexity (AC). The corrected actual complexity (CAC) is the complexity after deletion of only those relations that are incorrect with regard to content; errors with respect to technique were not deleted. We also counted the number of mistakes and of subnets, i.e. nets without connection to each other within the CMap (Figure 1). In this scheme, a perfect CMap has only one net. To evaluate the applicability of CMaps for knowledge testing, we correlated the knowledge test sum scores with the number of subnets and with the complexity of the corrected CMaps without wrong connections. Although the total number of connections could indicate the student's concepts, wrong connections reveal misconceptions, which is why the original, uncorrected CMap cannot be compared with knowledge test sum scores.
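Under these definitions, scoring a digitised CMap is a small graph computation. The sketch below is ours, not the MaNet tool's: each link carries an error label, AC drops every flagged link, CAC drops only content errors, and subnets are counted as connected components of the full map (isolated unused items are not counted, an assumption the paper does not spell out).

```python
from collections import defaultdict

def cmap_scores(edges):
    """
    edges: list of (concept_a, concept_b, error) tuples, where error is
    None, "content" or "technical" (our encoding, not the paper's).
    Returns (maximal complexity, AC, CAC, number of subnets).
    """
    maximal = len(edges)                                   # all drawn links
    ac = sum(1 for _, _, err in edges if err is None)      # error-free links
    cac = sum(1 for _, _, err in edges if err != "content")  # keep technical
    # Subnets: connected components of the full map, linked nodes only.
    adj = defaultdict(set)
    nodes = set()
    for a, b, _ in edges:
        adj[a].add(b)
        adj[b].add(a)
        nodes.update((a, b))
    seen, subnets = set(), 0
    for start in nodes:
        if start in seen:
            continue
        subnets += 1
        stack = [start]                                    # depth-first walk
        while stack:
            n = stack.pop()
            if n in seen:
                continue
            seen.add(n)
            stack.extend(adj[n] - seen)
    return maximal, ac, cac, subnets
```

A map with four links, one technical error and one content error would score a maximal complexity of 4, an AC of 2 and a CAC of 3; a perfect map yields one subnet.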

Preconditions
Pre-, post- and retention-tests reveal that all students gained cognitive knowledge about both subject matters (Wilcoxon p < 0.0001).
A multiple-choice test with four alternatives has a guessing probability of 25%, so results higher than 50% indicate real knowledge. The pre-test verified sufficient previous knowledge about subject (A) (mean score 58%), but hardly any about subject (B) (mean score 39%). Due to these findings, we classify the subjects as (A) easy and (B) complex.

Analysis of CMaps
Fifty-nine per cent of the CMaps of subject (A) showed fewer than six mistakes, with a median of five. Only 27% of the CMaps of subject (B) were almost correct; sixty-three per cent of them contained 6-25 mistakes, with a median of eight (Figure 2). The median AC of the CMaps was 13 for subject (A) and 11 for subject (B) (Figure 3). When technical errors are ignored, the CAC is higher, with a median of 14 for subject (A) and 14.5 for subject (B). The experts' CMaps had a complexity of 28 (A) and 32 (B), respectively. The median number of subnets per CMap was five for subject (A) and fully 13 for subject (B) (Figure 4).

Correlation of CMaps and knowledge test
The correlation of AC with the post-test K2 is highly significant, as is that of the CAC, in which technical errors are not deleted (Table 4). However, the correlations with the post-tests are weak, with coefficients of about 0.2 (Table 4). The correlations of AC and CAC with K3 are also weak and barely significant in subject (A), but still highly significant in subject (B). Significant correlations of other aspects of CMap quality, such as the number of subnets, with the cognitive tests were not found.

Subject difficulty
The subjects were indeed of varying difficulty, as students showed less pre-knowledge and lower cognitive achievement in subject matter (B), which belonged to the curriculum of an older age-group and included Latin terms.

Quantitative analysis
The more difficult the subject matter, the more mistakes were made and, thereby, the more subnets and the lower the AC in the resulting CMaps. The easier the subject, the fewer the mistakes and subnets and the higher the AC. These results indicate that pre-knowledge is essential for learning, as explained by Ausubel's cognitive assimilation theory (Ausubel 1968; Novak and Gowin 1984). Because (B) was more complicated, also due to its Latin terms and high number of items (Nückles et al. 2004; Scharfenberg, Bogner, and Klautke 2007), a high CL might have existed (Baddeley 1992).
After deleting technical errors, the CAC was higher in (B) than in (A). This is comparable to the experts' CMaps, which were also more complex in (B) than in (A). This indicates that the AC underestimates students' knowledge, whereas the CAC fits well. We reach the same conclusion on the basis of the correlations of AC and CAC with the knowledge sum scores. That the CAC correlates better with K2 than the AC, in both the easy and the difficult subject, indicates an overestimation of errors by the AC, or rather an underestimation of the knowledge the student illustrated, simply because of technical errors. Therefore, CM could be a promising method for knowledge testing if students are conversant with the technique: a little initial training and a second, corrective introduction to CM are needed to eliminate technical errors. Students and their teacher have to use the same CM "code": rules for arrows and their labelling must be standardised, similar to grammar. Common mistakes in labelling could be addressed with examples of frequently made errors, e.g. the list of Conradty and Bogner (2010).
In spite of the difficulty of subject (B), the correlation of the CAC with K3 was significant for subject (B), but not for (A). This indicates that knowledge about the difficult subject matter was limited but consistent, whereas knowledge about the easy subject matter was lost within six weeks. This is in contrast to Kinchin, Hay, and Adams (2000), who found that the quality of CMaps indicates the quality of retention knowledge test results. In the present study this held for the complex subject matter (B) but not for the easy one. This may indicate that students learnt less, but meaningfully, about the difficult topic.

Conclusion
CMaps very likely are capable of representing students' knowledge; however, under the conditions of the present study, CMaps are no substitute for conventional (short answer) knowledge tests. Difficult subject matters appear to cause an increase in CM errors, resulting in an underestimation of cognitive knowledge. The correlations of the knowledge tests with AC and CAC, respectively, are highly significant, but weak. This is consistent with Novak, Gowin, and Johansen (1983). A CMap reflects the knowledge of its draughtsman. The CAC, in which technical errors are not deleted, correlates even better. This suggests that the CAC represents cognitive knowledge of the subject, whereas the AC mainly represents the ability to construct CMaps.
However, quite contrary to our expectations, the correlations of the long-term knowledge tests with the (C)AC disappeared for the easy topic. Too many factors aside from subject knowledge, such as verbal ability, may affect the ability to handle CMaps, especially after a computer-aided pre-lesson as used in the present study. Students may have difficulties structuring and integrating information provided by hypertext into the CMap in an appropriate way (Reader and Hammond 1994). As Britt and colleagues described, learners often cannot integrate information from multiple texts (Britt et al. 1999). Although their students did not create CMaps but learned with them, Hilbert and Renkl (2008) found that participants in the cluster with the lowest learning outcome had significantly worse verbal abilities than learners in the cluster with the best learning outcome. Further research is needed on the effect of verbal abilities on CM and on learning with CM. In natural science education, conceptual change seems to be a helpful method of instruction. For instance, Ausubel's specific counsel to teachers was simply: "The most important single factor influencing learning is what the learner already knows. Ascertain this and teach accordingly" (Ausubel 1968). Students already hold conceptions about their environment, though these may not be scientifically correct. CM may be a helpful technique to facilitate conceptual change, as it gives students an overview of their misconceptions. Furthermore, CM promotes meaningful learning. Especially following hypermedia and computer-aided learning instruction, CM could be a useful tool to foster learning (Hilbert and Renkl 2008).
As an evaluation technique for marking students' newly acquired knowledge, CM may not be appropriate. However, for effective instruction a better comprehension of students' knowledge is helpful. Therefore, we think the consideration of both CM results and short answer tests can be complementary and a worthwhile tool for both learners and teachers, just as Åhlberg and Ahoranta (2008) recommended.