Published July 14, 2025 | Version v1
Dataset Open

Notarial Systems, Legal Origins, and the Property Rights Gap: Evidence from Global IPRI Data

Description

Title

Notaries, Property Institutions, and the Civil Law Trap: Evidence from Global IPRI Data

Description

This project examines the relationship between national notarial systems and institutional quality across countries using data from the International Property Rights Index (IPRI). We analyze how different legal traditions—particularly civil law systems with Latin Notariat models—affect property rights enforcement, regulatory quality, and corporate governance.

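By applying Principal Component Analysis (PCA) to 14 standardized institutional variables from the IPRI dataset (the three component scores LP, PPR, and IPR plus their eleven sub-indicators, as listed in the code), we derive three dimensions of governance: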

  1. Property Rights and Enforcement

  2. Political Governance and Rule of Law

  3. Registration and Financing Access

We then fit a separate binary logistic regression for each notarial system (e.g., Latin Notariat), estimating the probability that a country uses that system as a function of the three principal components. The findings underscore how legal traditions shape institutional outcomes and help explain the persistence of economically inefficient practices such as monopolistic notary regimes.
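
As a compact restatement of the two steps described above, the model can be written as follows. The notation ($z_{ij}$, $w_{jm}$, $\beta^{(k)}$) is introduced here for illustration only and does not appear in the original record.

```latex
% Step A: standardize indicator j for country i, then project onto component m
\[
z_{ij} = \frac{x_{ij} - \bar{x}_j}{s_j}, \qquad
\mathrm{PC}_{mi} = \sum_{j} w_{jm}\, z_{ij}, \quad m = 1, 2, 3
\]

% Step B: binary logit for notarial system k (one model per system)
\[
\Pr\!\left(y_i^{(k)} = 1\right)
  = \frac{1}{1 + \exp\!\left[-\left(\beta_0^{(k)}
      + \beta_1^{(k)}\,\mathrm{PC}_{1i}
      + \beta_2^{(k)}\,\mathrm{PC}_{2i}
      + \beta_3^{(k)}\,\mathrm{PC}_{3i}\right)\right]}
\]
```

Here $y_i^{(k)} = 1$ if country $i$ uses notarial system $k$ and 0 otherwise, so a positive $\beta_m^{(k)}$ means that higher scores on component $m$ go with higher odds of observing that system.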

This work contributes to law-and-economics scholarship, legal origins theory, and institutional reform debates, with practical implications for international development, business law harmonization, and civil code modernization.

Code (Python – Google Colab compatible)

# Step 1: Import libraries
import pandas as pd
import statsmodels.api as sm
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from google.colab import files

# Step 2: Upload CSV (choose IPRI_with_Notary_Systems.csv in the file dialog; Colab only)
uploaded = files.upload()

# Step 3: Load CSV
df = pd.read_csv('IPRI_with_Notary_Systems.csv')

# Step 4: Rename IPRI columns
rename_dict = {
    'Legal and Political Environment (LP)': 'LP',
    'Judicial Independence': 'Judicial_Independence',
    'Rule of Law': 'Rule_of_Law',
    'Political Stability': 'Political_Stability',
    'Control of Corruption': 'Control_of_Corruption',
    'Physical Property Rights (PPR)': 'PPR',
    'Physical Property Protection': 'Physical_Property_Protection',
    'Registering Process': 'Registering_Process',
    'Access to Financing': 'Access_to_Financing',
    'Intellectual Property Rights (IPR)': 'IPR',
    'Intellectual Property Protection': 'Intellectual_Property_Protection',
    'Patent Protection': 'Patent_Protection',
    'Copyright Protection': 'Copyright_Protection',
    'Trademark Protection': 'Trademark_Protection'
}
df.rename(columns=rename_dict, inplace=True)
ipri_vars = list(rename_dict.values())

# Step 5: Drop missing values
df.dropna(subset=['Notary System'] + ipri_vars, inplace=True)

# Step 6: Create dummy variables for notary systems
df['Notary_System_Clean'] = df['Notary System'].str.replace(" ", "_")
notary_dummies = pd.get_dummies(df['Notary_System_Clean'], prefix='Notary', drop_first=False)

# Step 7: Standardize IPRI variables for PCA
scaler = StandardScaler()
X_scaled = scaler.fit_transform(df[ipri_vars])

# Step 8: Run PCA
pca = PCA(n_components=3)
X_pca = pca.fit_transform(X_scaled)
df[['PC1', 'PC2', 'PC3']] = X_pca

# Step 9: Label principal components
# The labels are fixed interpretations chosen in advance; the three variables with the
# largest absolute loadings are appended so each label can be checked against the data.
pca_loadings = pd.DataFrame(pca.components_.T, columns=['PC1', 'PC2', 'PC3'], index=ipri_vars)
top_features = {
    pc: pca_loadings[pc].abs().sort_values(ascending=False).head(3).index.tolist()
    for pc in ['PC1', 'PC2', 'PC3']
}
pc_names = {
    'PC1': f"Property Rights and Enforcement ({', '.join(top_features['PC1'])})",
    'PC2': f"Political Governance and Rule of Law ({', '.join(top_features['PC2'])})",
    'PC3': f"Registration and Financing Access ({', '.join(top_features['PC3'])})"
}

# Step 10: Run one binary logistic regression per notary system
X = sm.add_constant(df[['PC1', 'PC2', 'PC3']])
logit_results = {}
for notary_type in notary_dummies.columns:
    y = notary_dummies[notary_type].astype(int)
    try:
        logit_results[notary_type] = sm.Logit(y, X).fit(disp=0).summary()
    except Exception as err:  # e.g. perfect separation or a singular matrix in small groups
        logit_results[notary_type] = f"Model could not be fitted: {err}"

# Step 11: Print interpretation of PCs
print("🔍 Principal Component Names and Top Influences:")
for pc, name in pc_names.items():
    print(f"{pc}: {name}")

# Step 12: Print regression summary for one system
print("\n📊 Logistic Regression: Civil_Law_(Latin)")
print(logit_results.get('Notary_Civil_Law_(Latin)', "Notary type not found."))

License

Creative Commons Attribution 4.0 International (CC BY 4.0)

Files

IPRI_with_Notary_Systems.csv

Files (27.5 kB)

md5:bdd95fae8070830fe4a7a67c81bafd27 (10.8 kB)
md5:71839c6aa6a157e6515a15d9bc594bba (16.7 kB)