Variable,Description,Type,Possible Values,Example,Comment
company_name,Extracted company name.,String,132 unique values,Allianz,
report_year,Year of the report.,Integer,"Min: 2017; Max: 2022",2022,
llm_page,Page number extracted by the LLM.,String,62 unique values,73,"Page is a string due to non-numeric page names in the PDFs (e.g., “Env33”)."
llm_year,Year extracted by the LLM.,Integer,"Min: 2008; Max: 2022",2019,
llm_scope,Scope extracted by the LLM.,Categorical (String),"Possible values: 1; 2mb; 2lb; 3",3,
llm_value,Value extracted by the LLM.,Numeric,"Min: 0; Max: 35119419",556660,
llm_unit,Unit extracted by the LLM.,String,68 unique values,Metric tons (t) CO2e,
non_exp_1_id,Unique identifier for non-expert 1.,Categorical (Integer),"Min: 0; Max: 8",1,
non_exp_2_id,Unique identifier for non-expert 2.,Categorical (Integer),"Min: 0; Max: 8",4,
non_exp_1_is_value_correct,Non-expert 1’s classification of the LLM-extracted value as correct or incorrect.,Categorical (String),"Possible values: ""Yes""; ""No""; NA",Yes,
non_exp_2_is_value_correct,Non-expert 2’s classification of the LLM-extracted value as correct or incorrect.,Categorical (String),"Possible values: ""Yes""; ""No""; NA",No,
non_exp_1_corrected_value,Corrected numeric emission value provided by non-expert 1.,Numeric,"Min: 0; Max: 8730756",1228129,
non_exp_2_corrected_value,Corrected numeric emission value provided by non-expert 2.,Numeric,"Min: 0; Max: 6736962",1506,
non_exp_1_is_unit_correct,Non-expert 1’s classification of the LLM-extracted unit as correct or incorrect.,Categorical (String),"Possible values: ""Yes""; ""No""; NA",Yes,
non_exp_2_is_unit_correct,Non-expert 2’s classification of the LLM-extracted unit as correct or incorrect.,Categorical (String),"Possible values: ""Yes""; ""No""; NA",Yes,
non_exp_1_corrected_unit,Corrected emission unit provided by non-expert 1.,String,44 unique values,"1,000 t-CO2",
non_exp_2_corrected_unit,Corrected emission unit provided by non-expert 2.,String,40 unique values,10^3 tonnes CO2e,
non_exp_1_is_page_correct,Non-expert 1’s classification of the LLM-extracted page as correct or incorrect.,Categorical (String),"Possible values: ""Yes""; ""No""; NA",No,
non_exp_2_is_page_correct,Non-expert 2’s classification of the LLM-extracted page as correct or incorrect.,Categorical (String),"Possible values: ""Yes""; ""No""; NA",Yes,
non_exp_1_corrected_page,Corrected page number provided by non-expert 1.,String,47 unique values,22,
non_exp_2_corrected_page,Corrected page number provided by non-expert 2.,String,39 unique values,276,
non_exp_1_expert_requested_for_rec,Whether non-expert 1 requested expert review for the record.,Binary,"Possible values: ""Yes""; ""No""",Yes,
non_exp_2_expert_requested_for_rec,Whether non-expert 2 requested expert review for the record.,Binary,"Possible values: ""Yes""; ""No""",No,
non_exp_1_expert_requested_for_doc,Whether non-expert 1 requested expert review for the document.,Binary,"Possible values: ""Yes""; ""No""",Yes,
non_exp_2_expert_requested_for_doc,Whether non-expert 2 requested expert review for the document.,Binary,"Possible values: ""Yes""; ""No""",No,
non_exp_1_metric_name,"Metric name identified by non-expert 1, as described in the report.",String,176 unique values,Scope 3 GHG Emissions,
non_exp_2_metric_name,"Metric name identified by non-expert 2, as described in the report.",String,161 unique values,Scope 3 (Indirect emissions: business travel),
non_exp_1_display_type,Display format of the metric identified by non-expert 1.,String,6 unique values,Table,
non_exp_2_display_type,Display format of the metric identified by non-expert 2.,String,5 unique values,Graphic,
non_exp_filter_doc,Ad-hoc column to identify documents that require expert revision.,Binary,"Possible values: ""Yes""; ""No""",No,
non_exp_filter_rec,Ad-hoc column to identify rows that require expert revision.,Binary,"Possible values: ""Yes""; ""No""",No,
non_exp_1_value,"Non-expert 1's value depending on classification in non_exp_1_is_value_correct (""Yes"" => llm_value, ""No"" => non_exp_1_corrected_value, ""NA"" => NA)",Numeric,"Min: 0; Max: 35119419",772607,
non_exp_2_value,"Non-expert 2's value depending on classification in non_exp_2_is_value_correct (""Yes"" => llm_value, ""No"" => non_exp_2_corrected_value, ""NA"" => NA)",Numeric,"Min: 0; Max: 35119419",467234,
non_exp_1_unit,"Non-expert 1's unit depending on classification in non_exp_1_is_unit_correct (""Yes"" => llm_unit, ""No"" => non_exp_1_corrected_unit, ""NA"" => NA)",String,65 unique values,kt CO2,
non_exp_2_unit,"Non-expert 2's unit depending on classification in non_exp_2_is_unit_correct (""Yes"" => llm_unit, ""No"" => non_exp_2_corrected_unit, ""NA"" => NA)",String,68 unique values,kg CO2e,
non_exp_1_page,"Non-expert 1's page depending on classification in non_exp_1_is_page_correct (""Yes"" => llm_page, ""No"" => non_exp_1_corrected_page, ""NA"" => NA)",String,71 unique values,69,
non_exp_2_page,"Non-expert 2's page depending on classification in non_exp_2_is_page_correct (""Yes"" => llm_page, ""No"" => non_exp_2_corrected_page, ""NA"" => NA)",String,65 unique values,11,
non_exp_1_value_reasoning,Non-expert 1's explanation of why the LLM extracted the wrong value.,Categorical (String),"7 possible values: 0. Missed out on correct value (false NA); 1. WV: Irrelevant, not absolute GHG emissions; 2. WV: Extracted value is not related to the whole company; 3. WV: Extracted value is related to different scope; 4. WV: Extracted value is related to different year; 5. WV: LLM Hallucination; 6. WV: other reasons (please specify in comment); 7. NA",3. WV: Extracted value is related to different scope,"WV stands for ""Wrong Value"""
non_exp_2_value_reasoning,Non-expert 2's explanation of why the LLM extracted the wrong value.,Categorical (String),"7 possible values: 0. Missed out on correct value (false NA); 1. WV: Irrelevant, not absolute GHG emissions; 2. WV: Extracted value is not related to the whole company; 3. WV: Extracted value is related to different scope; 4. WV: Extracted value is related to different year; 5. WV: LLM Hallucination; 6. WV: other reasons (please specify in comment); 7. NA",5. WV: LLM Hallucination,"WV stands for ""Wrong Value"""
non_exp_1_unit_reasoning,Non-expert 1's explanation of why the LLM extracted the wrong unit.,Categorical (String),"7 possible values: 0. Missed out on correct value (false NA); 1. WV: Irrelevant, not absolute GHG emissions; 2. WV: Extracted value is not related to the whole company; 3. WV: Extracted value is related to different scope; 4. WV: Extracted value is related to different year; 5. WV: LLM Hallucination; 6. WV: other reasons (please specify in comment); 7. NA",5. WV: LLM Hallucination,"WV stands for ""Wrong Value"""
non_exp_2_unit_reasoning,Non-expert 2's explanation of why the LLM extracted the wrong unit.,Categorical (String),"7 possible values: 0. Missed out on correct value (false NA); 1. WV: Irrelevant, not absolute GHG emissions; 2. WV: Extracted value is not related to the whole company; 3. WV: Extracted value is related to different scope; 4. WV: Extracted value is related to different year; 5. WV: LLM Hallucination; 6. WV: other reasons (please specify in comment); 7. NA",2. WV: Extracted value is not related to the whole company,"WV stands for ""Wrong Value"""
non_exp_manually_added_record,TRUE if the record was added manually by non-experts.,Binary,"TRUE; FALSE",TRUE,"To distinguish LLM-generated records from manually added records. Manually added records are prefixed by ""x""."
exp_group_1_exp_1_id,Unique identifier for the first expert in group 1.,Categorical (Integer),"Min: 11; Max: 14",13,
exp_group_1_exp_2_id,Unique identifier for the second expert in group 1.,Categorical (Integer),"Min: 12; Max: 14",12,
exp_group_1_value_who_is_right,Expert group 1 selects which of the non-experts' value annotations is deemed correct.,Categorical (String),"4 possible values: Ann1; Ann2; Neither; NA",Neither,
exp_group_1_corrected_value,Expert group 1's corrected numeric emission value.,Numeric,"Min: 14067; Max: 78645",26959,
exp_group_1_value_reasoning,Expert group 1's explanation of why the LLM extracted the wrong value.,Categorical (String),"7 possible values: 0. Missed out on correct value (false NA); 1. WV: Irrelevant, not absolute GHG emissions; 2. WV: Extracted value is not related to the whole company; 3. WV: Extracted value is related to different scope; 4. WV: Extracted value is related to different year; 5. WV: LLM Hallucination; 6. WV: other reasons (please specify in comment); 7. NA",6. WV: other reasons (please specify in comment),"WV stands for ""Wrong Value"""
exp_group_1_unit_who_is_right,Expert group 1 selects which of the non-experts' unit annotations is deemed correct.,Categorical (String),"4 possible values: Ann1; Ann2; Neither; NA",Ann2,
exp_group_1_corrected_unit,Expert group 1's corrected unit.,String,7 unique values,Emissions (Tonnes),
exp_group_1_unit_reasoning,Expert group 1's explanation of why the LLM extracted the wrong unit.,Categorical (String),"7 possible values: 0. Missed out on correct value (false NA); 1. WV: Irrelevant, not absolute GHG emissions; 2. WV: Extracted value is not related to the whole company; 3. WV: Extracted value is related to different scope; 4. WV: Extracted value is related to different year; 5. WV: LLM Hallucination; 6. WV: other reasons (please specify in comment); 7. NA",2. WV: Extracted value is not related to the whole company,"WV stands for ""Wrong Value"""
exp_group_1_corrected_page,Expert group 1's corrected page.,String,3 unique values,47,
exp_group_2_exp_1_id,Unique identifier for the first expert in group 2.,Categorical (Integer),"Min: 11; Max: 14",11,
exp_group_2_exp_2_id,Unique identifier for the second expert in group 2.,Categorical (Integer),"Min: 12; Max: 14",12,
exp_group_2_value_who_is_right,Expert group 2 selects which of the non-experts' value annotations is deemed correct.,Categorical (String),"4 possible values: Ann1; Ann2; Neither; NA",Ann2,
exp_group_2_corrected_value,Expert group 2's corrected numeric emission value.,Numeric,"Min: 5835; Max: 6436",5835,
exp_group_2_value_reasoning,Expert group 2's explanation of why the LLM extracted the wrong value.,Categorical (String),"7 possible values: 0. Missed out on correct value (false NA); 1. WV: Irrelevant, not absolute GHG emissions; 2. WV: Extracted value is not related to the whole company; 3. WV: Extracted value is related to different scope; 4. WV: Extracted value is related to different year; 5. WV: LLM Hallucination; 6. WV: other reasons (please specify in comment); 7. NA",2. WV: Extracted value is not related to the whole company,"WV stands for ""Wrong Value"""
exp_group_2_unit_who_is_right,Expert group 2 selects which of the non-experts' unit annotations is deemed correct.,Categorical (String),"4 possible values: Ann1; Ann2; Neither; NA",Ann1,
exp_group_2_corrected_unit,Expert group 2's corrected unit.,String,9 unique values,metric tonnes of CO2 equivalent,
exp_group_2_unit_reasoning,Expert group 2's explanation of why the LLM extracted the wrong unit.,Categorical (String),"7 possible values: 0. Missed out on correct value (false NA); 1. WV: Irrelevant, not absolute GHG emissions; 2. WV: Extracted value is not related to the whole company; 3. WV: Extracted value is related to different scope; 4. WV: Extracted value is related to different year; 5. WV: LLM Hallucination; 6. WV: other reasons (please specify in comment); 7. NA",5. WV: LLM Hallucination,"WV stands for ""Wrong Value"""
exp_group_2_corrected_page,Expert group 2's corrected page.,String,2 unique values,66,
exp_group_1_value,"Expert group 1's value depending on selection in exp_group_1_value_who_is_right (""Ann1"" => non_exp_1_value, ""Ann2"" => non_exp_2_value, ""Neither"" => exp_group_1_corrected_value)",Numeric,"Min: 0; Max: 8730756",1540,
exp_group_2_value,"Expert group 2's value depending on selection in exp_group_2_value_who_is_right (""Ann1"" => non_exp_1_value, ""Ann2"" => non_exp_2_value, ""Neither"" => exp_group_2_corrected_value)",Numeric,"Min: 0; Max: 8730756",0.2,
exp_group_1_unit,"Expert group 1's unit depending on selection in exp_group_1_unit_who_is_right (""Ann1"" => non_exp_1_unit, ""Ann2"" => non_exp_2_unit, ""Neither"" => exp_group_1_corrected_unit)",String,40 unique values,MT CO2e,
exp_group_2_unit,"Expert group 2's unit depending on selection in exp_group_2_unit_who_is_right (""Ann1"" => non_exp_1_unit, ""Ann2"" => non_exp_2_unit, ""Neither"" => exp_group_2_corrected_unit)",String,41 unique values,CO2e in metric tons),
exp_group_1_page,"Expert group 1's page depending on selection in exp_group_1_value_who_is_right (""Ann1"" => non_exp_1_page, ""Ann2"" => non_exp_2_page, ""Neither"" => exp_group_1_corrected_page)",String,40 unique values,92,
exp_group_2_page,"Expert group 2's page depending on selection in exp_group_2_value_who_is_right (""Ann1"" => non_exp_1_page, ""Ann2"" => non_exp_2_page, ""Neither"" => exp_group_2_corrected_page)",String,38 unique values,121,
exp_disc_value,Corrected value resulting from expert discussion.,Numeric,"Min: 0; Max: 746750",51,
exp_disc_unit,Corrected unit resulting from expert discussion.,String,22 unique values,"CO2 emissions (1,000 tonnes)",
exp_disc_page,Corrected page resulting from expert discussion.,String,19 unique values,67,