README — Anonymized Raw Session Data
"Does Unfairness Hurt Women? The Effects of Losing Unfair Competitions"
Piasenti, Valente, Van Veldhuizen, Pfeifer — Economic Journal


================================================================
1. OVERVIEW
================================================================
This folder contains anonymized versions of all raw experimental
session files. The folder structure and all filenames are identical
to the original raw_data folder (11 session subfolders).

The script that produced these files, anonymize.R, is stored in
the original raw_data folder, which is not distributed. Sensitive
columns are either set to NA or transformed using a deterministic
rule. All experimental outcome variables are untouched.


================================================================
2. CHANGES APPLIED
================================================================

2.1  XLSX files (one per session)

  Column                  Action               Reason
  ---------------------------------------------------------------
  PROLIFIC_PID            Transformed          Participant identifier;
                                               same rule used across
                                               all datasets in this study
  IPAddress               Set to NA            Identifies home network
                                               and approximate location
  ResponseId              Set to NA            Qualtrics internal key
                                               mapping to individual
                                               responses
  RecipientLastName       Set to NA            Name field populated by
                                               Qualtrics
  RecipientFirstName      Set to NA            Name field populated by
                                               Qualtrics
  RecipientEmail          Set to NA            Email field populated by
                                               Qualtrics
  ExternalReference       Set to NA            External reference that
                                               may contain identifying
                                               information
  LocationLatitude        Set to NA            Approximate coordinates
                                               (Qualtrics geolocation
                                               from the IP address)
  LocationLongitude       Set to NA            Same as LocationLatitude
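
A quick sanity check after loading a session file is to confirm that
the columns listed above as "Set to NA" hold no values. The following
is a minimal sketch in base R, assuming the XLSX sheet has already been
read into a data frame. The helper name all_blank and the toy data
frame are illustrative and are not part of the replication code.

```r
# Columns that section 2.1 lists as "Set to NA".
na_cols <- c("IPAddress", "ResponseId", "RecipientLastName",
             "RecipientFirstName", "RecipientEmail", "ExternalReference",
             "LocationLatitude", "LocationLongitude")

# Hypothetical helper: for each listed column, returns TRUE if it is
# entirely NA, FALSE if it still holds a value, NA if it is absent.
all_blank <- function(df, cols = na_cols) {
  vapply(cols, function(col) {
    if (!col %in% names(df)) return(NA)
    all(is.na(df[[col]]))
  }, logical(1))
}

# Toy stand-in for one session's data (all values are made up).
toy <- data.frame(PROLIFIC_PID = c("id_a", "id_b"),
                  IPAddress = NA, ResponseId = NA,
                  RecipientLastName = NA, RecipientFirstName = NA,
                  RecipientEmail = NA, ExternalReference = NA,
                  LocationLatitude = NA, LocationLongitude = NA)

blank <- all_blank(toy)
stopifnot(all(blank))
```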

2.2  Prolific CSV and TXT files (one or more per session)

  Column                  Action               Reason
  ---------------------------------------------------------------
  participant_id          Transformed          Prolific user ID; same
                                               rule as PROLIFIC_PID so
                                               the two can be linked
  session_id              Set to NA            Prolific session
                                               identifier
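
Because PROLIFIC_PID and participant_id are transformed with the same
rule, the experimental data and the Prolific export can still be joined
on the transformed identifier. The following is a minimal sketch in
base R with made-up values. The IDs shown are placeholders, the
transformation rule itself is not reproduced here, and the column
task_score is a hypothetical stand-in for an outcome variable.

```r
# Toy experimental data (stand-in for one session's XLSX file).
experiment <- data.frame(PROLIFIC_PID = c("id_a", "id_b", "id_c"),
                         task_score = c(12, 9, 15))

# Toy Prolific export; session_id is NA as described in section 2.2.
prolific <- data.frame(participant_id = c("id_c", "id_a", "id_b"),
                       age = c(31, 24, 45),
                       session_id = NA)

# Join on the transformed identifier, which has a different column
# name in each source.
linked <- merge(experiment, prolific,
                by.x = "PROLIFIC_PID", by.y = "participant_id")
```

Rows whose identifier appears in only one of the two sources are
dropped by default; passing all.x = TRUE to merge() keeps every row of
the experimental data instead.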


================================================================
3. WHAT WAS NOT CHANGED
================================================================
The following columns are kept exactly as in the original files:

  All CSV/TXT columns except participant_id and session_id:
    timestamps, age, entered_code, num_approvals, prolific_score,
    status, time_taken, and all demographic variables (Sex,
    Nationality, Country of Birth, Country of Residence, Ethnicity,
    First Language, Education, Employment, Student Status).

  All experimental variables in the XLSX files:
    task scores, win/loss outcome, merit, belief variables, payment
    variables, treatment assignment, timestamps.

  All Qualtrics page-timing columns:
    _First Click, _Last Click, _Page Submit, _Click Count.


================================================================
4. FILE TYPES PER SESSION
================================================================
Each session subfolder contains:

  [session name].xlsx
    Experimental data (all outcome variables and PROLIFIC_PID).

  prolific data space.csv
    Prolific participant export, comma-separated with header.
    This is the primary file read by the merge code (01_code_merge.R).

  prolific data comma.txt  (most sessions)
    Comma-separated, no header row. Supplementary participants
    appended by the merge code.

Note: the 1st session MEN, 1st session WOMEN, and 3rd session Women
folders contain only one Prolific file each (prolific data space.csv;
no prolific data comma.txt).
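
The two Prolific files described above can be read and stacked along
the following lines. This is a minimal sketch in base R and is not
taken from 01_code_merge.R. The helper name read_prolific is
hypothetical, and the sketch assumes that the headerless TXT file has
the same columns in the same order as the CSV file.

```r
# Hypothetical helper: read one session's Prolific export(s).
read_prolific <- function(folder) {
  csv_path <- file.path(folder, "prolific data space.csv")
  txt_path <- file.path(folder, "prolific data comma.txt")

  # Primary file: comma-separated with a header row.
  primary <- read.csv(csv_path, header = TRUE,
                      stringsAsFactors = FALSE, check.names = FALSE)

  # Some sessions have no TXT file; return the primary file alone.
  if (!file.exists(txt_path)) return(primary)

  # ASSUMPTION: the TXT columns match the CSV columns in order, so the
  # CSV header can be reused as column names.
  extra <- read.csv(txt_path, header = FALSE, col.names = names(primary),
                    stringsAsFactors = FALSE, check.names = FALSE)
  rbind(primary, extra)
}

# Toy session folder with made-up rows, written to a temp directory.
demo <- file.path(tempdir(), "toy_session")
dir.create(demo, showWarnings = FALSE)
writeLines(c("participant_id,status,age", "id_a,APPROVED,24"),
           file.path(demo, "prolific data space.csv"))
writeLines("id_b,APPROVED,45",
           file.path(demo, "prolific data comma.txt"))

combined <- read_prolific(demo)
```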


================================================================
5. REPRODUCIBILITY
================================================================
The ID transformation is fully deterministic. Re-running anonymize.R
from the original raw_data produces identical output.

The replication code that uses these files is:
  replication_code/01_code_merge.R
  - merges all 11 session subfolders into dat_numeric.csv
    (2086 rows x 375 columns)
  - verified to reproduce exactly the dataset used in the paper
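
After running the merge code, the dimensions of the output can be
compared with the figures stated above. The following is a minimal
sketch in base R. It assumes that 2086 x 375 means rows x columns and
that dat_numeric.csv has a header row. The helper name
has_expected_dim is hypothetical, and the location of dat_numeric.csv
must be supplied by the user.

```r
# Dimensions stated in this README (assumed to be rows x columns).
expected_dim <- c(2086L, 375L)

# Hypothetical helper: TRUE if the CSV at `path` has the expected
# number of rows and columns.
has_expected_dim <- function(path, expected = expected_dim) {
  dat <- read.csv(path, check.names = FALSE)
  identical(dim(dat), expected)
}

# Demonstration on a small temporary file with made-up contents.
toy_path <- tempfile(fileext = ".csv")
write.csv(data.frame(a = 1:3, b = 4:6), toy_path, row.names = FALSE)
toy_ok <- has_expected_dim(toy_path, expected = c(3L, 2L))
```

For the real file, the call would be has_expected_dim(path) with path
pointing to the dat_numeric.csv written by 01_code_merge.R.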
