The Institutional Basis of Gender Inequality: The Social Institutions and Gender Index (SIGI)

This study uses variables from the Organisation for Economic Co-operation and Development (OECD) Centre's Gender, Institutions and Development (GID) Database to construct the Social Institutions and Gender Index (SIGI) and its subindices Family code, Civil liberties, Physical integrity, Son preference, and Ownership rights. Instead of measuring gender inequality in outcomes, the SIGI and its subindices measure long-lasting social institutions defined as societal practices and legal norms that frame gender roles. The SIGI combines them into a multidimensional index of women's deprivation caused by gendered social institutions. Inspired by the Foster–Greer–Thorbecke poverty measures, the SIGI offers a new way of aggregating gender inequality by penalizing high inequality in each dimension and allowing only partial compensation between subindices. The indices identify countries and dimensions of gendered social institutions that deserve attention. Empirical results confirm that the SIGI complements other gender-related indices.


INTRODUCTION
Despite considerable progress in recent decades, gender inequality in many dimensions of well-being remains pervasive in many developing countries. This is an intrinsic issue of equity, as affected women are deprived of basic freedoms (Amartya Sen 1999). Furthermore, there is considerable evidence that these discrepancies have high costs for society in the form of lower human capital, worse governance, and lower growth (World Bank 2001, 2011; Stephan Klasen 2002; Stephan Klasen and Francesca Lamanna 2009; Ray Rees and Ray Riezman 2012). The intrinsic and instrumental value of gender equality has been recognized and incorporated in the development agenda, for example, in Millennium Development Goal 3: Promote Gender Equality and Empower Women, as well as the Convention on the Elimination of Discrimination against Women.
To measure the extent of gender inequality across countries, several gender-related indices have been proposed. They include, but are not limited to, three measures from UNDP: the Gender-Related Development Index (GDI) and the Gender Empowerment Measure (GEM; UNDP 1995) and the recently published Gender Inequality Index (GII; UNDP 2010); the Global Gender Gap Index from the World Economic Forum (Augusto Lopez-Claros and Saadia Zahidi 2005); and the Gender Equity Index developed by Social Watch (2005) or the African Gender Status Index proposed by the Economic Commission for Africa (2004). These measures focus on gender inequality in well-being or in agency, and they are typically outcome-focused, usually considering gaps in education, health and survival, employment, and political participation. They show that there is great heterogeneity in levels and trends of such outcome-based measures of gender inequality, as well as in gender gaps in agency. 1 While these gender-equality measures have clearly contributed to important research and policy, focusing only on these outcomes neglects the question of the origins of these inequalities. Gender inequality is the result of human behavior, and institutions influence how people behave and interact. Thus, to understand gender inequality beyond outcomes, one needs to study the institutional basis of gender inequality. These institutions include formal institutions such as laws and codes of conduct as well as informal institutions such as norms and values that guide and constrain behavior. From an economics perspective, institutions are the result of collective choices in a society to achieve efficiency, solve collective action dilemmas, and reduce transaction costs (Douglass C. North 1990). Other social sciences emphasize legitimacy and appropriateness instead of efficiency. Institutions influence the preferences of actors and provide role models that are internalized by them (Peter A. Hall and Rosemary C. R. Taylor 1996; Indra de Soysa and Johannes Jütting 2007).
In the literature, the only other composite indices that are closer to our intention here are the gender-specific human rights measures of the CIRI Human Rights Data Project. These include the Women's Political Rights index (WOPOL), which focuses on the right of women to vote, petition, and be elected; the Women's Economic Rights index (WECON), which focuses on women's equal rights in the labor market; and the Women's Social Rights index (WOSOC), which focuses on rights in the social sphere (marriage, inheritance, travel, education, etc.). These indices measure on a yearly basis whether a number of internationally recognized rights for women are included in laws, and whether governments enforce
The SIGI and the subindices are useful tools to compare the societal situation of women in over 100 non-OECD countries from a new perspective, allowing the identification of countries and dimensions of social institutions that deserve policymakers' attention and scrutiny. The SIGI presents a different approach by focusing on the institutional drivers of gender inequality, and our empirical results show that the SIGI provides additional information to that of other well-known gender-related indices. Regression analysis also shows that the SIGI provides new insights into gender-specific outcomes, even if one controls for region, religion, and level of economic development. 3

CONSTRUCTION OF THE INDICES
Building reliable, useful, and internationally comparable indices is generally a tough challenge. In the field of gender analysis, particular problems relate to the lack of gender-disaggregated statistics and the need for more careful interpretation of dimensions where men and women differ for biological reasons (for example, how to interpret indicators of fertility or reproductive health in gender gap analyses). 4 And institutions often pose particular challenges, as they cannot often be assessed well in quantitative terms. As a result, any index that attempts to capture gender inequality in social institutions across the world will invariably run up against conceptual and data constraints. For many items, the data may be unavailable, unreliable, or noncomparable; and sometimes the data are hard to interpret. As a result, difficult compromises have to be made when choosing indicators and scoring them; data availability, coverage, and statistical validity often assume as much importance as conceptual superiority of a particular indicator or scoring approach. It is important, however, to advance this research agenda by putting together data and indicators even in these hard-to-measure dimensions of gender inequality, even if the resulting index can only be seen as a starting-point for a more ambitious data-gathering and research agenda.
The SIGI is a multidimensional composite index that reflects the deprivation of women caused by social institutions related to gender inequality. The SIGI is composed of five dimensions that are measured by five subindices: Family code, Civil liberties, Physical integrity, Son preference, and Ownership rights. These dimensions were chosen because each represents an important yet distinct aspect of gender-based institutions (see below).
The subindices are built out of variables from the OECD Development Centre's GID Database (Morrisson and Jütting 2005; Jütting et al. 2008). This is a cross-country database covering about 120 non-OECD countries with more than twenty variables measuring social institutions related to gender inequality. 5 These variables proxy social institutions through prevalence rates, legal indicators, or indicators of social practices; all are coded from 0 to 1, 6 with the value 0 meaning no or very low inequality, and the value 1 indicating high inequality. The choice of the variables used for the construction of the social institutions indicators is guided by the informational content they provide, their relevance for a comprehensive measure of social institutions related to gender, and their coverage, so that as many countries as possible can be ranked.

The subindices
The subindices each measure one dimension of social institutions related to gender inequality. Before combining the variables into the respective subindex, we checked whether these variables measure the same underlying concept by estimating the statistical association between them. Then, the variables are combined into a one-dimensional subindex using the method of polychoric principal component analysis (PCA) to extract the common information of the variables (Stanislav Kolenikov and Gustavo Angeles 2009). 7 We use the first principal component as a proxy for the common information contained by the variables corresponding to the subindices. 8 The weight that each variable gets in these linear combinations is obtained by analyzing the correlation structure in the data. As in the case of the variables, the subindices and the SIGI range from 0 to 1, with 0 corresponding to no inequality and the value 1 to complete inequality.
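The weighting step can be illustrated with a minimal sketch. It is not the procedure used for the published subindices: it substitutes ordinary Pearson correlations for the polychoric correlations, finds the first principal component by power iteration, and assumes a rescaling in which a hypothetical country scoring 0 on every variable receives 0 and one scoring 1 on every variable receives 1. The function names and the five-country data are hypothetical.

```python
import math

def mean(values):
    return sum(values) / len(values)

def std(values):
    # Population standard deviation.
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

def pearson(xs, ys):
    # ASSUMPTION: Pearson correlation as a stand-in for polychoric correlation.
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / (std(xs) * std(ys))

def first_component_loadings(columns, iterations=500):
    # Power iteration on the correlation matrix yields its dominant eigenvector,
    # that is, the loadings of the first principal component.
    k = len(columns)
    corr = [[pearson(columns[i], columns[j]) for j in range(k)] for i in range(k)]
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def effective_weights(columns):
    # Loadings apply to standardized variables, so divide by each standard
    # deviation; normalizing to sum to 1 implements the assumed 0-to-1 rescaling.
    loadings = first_component_loadings(columns)
    raw = [l / std(col) for l, col in zip(loadings, columns)]
    total = sum(raw)
    return [r / total for r in raw]

def subindex_scores(columns):
    # Weighted average of the 0-1 coded variables, one score per country.
    weights = effective_weights(columns)
    n_countries = len(columns[0])
    return [sum(w * col[c] for w, col in zip(weights, columns))
            for c in range(n_countries)]

# Hypothetical data: three variables (rows) coded 0-1 for five countries.
example = [
    [0.0, 0.2, 0.5, 0.8, 1.0],
    [0.0, 0.0, 0.5, 1.0, 1.0],
    [0.1, 0.3, 0.4, 0.7, 0.9],
]
print(effective_weights(example))
print(subindex_scores(example))
```

The sketch presumes that the variables are positively correlated, so that all loadings share the same sign and the resulting scores stay between 0 and 1.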
The precise variable lists and coding guidelines are presented in the online appendix, available under the supplemental content tab on the publisher's website, and will only be summarized here. The Family code dimension refers to the private sphere with institutions that influence women's decision-making power in the household. Family code is measured by the variables Parental authority of women during marriage and after divorce, Inheritance rights, the prevalence of Early marriage among teenage girls, and the acceptance or legality of Polygamy. 9 The Civil liberties dimension captures the social sphere by measuring the freedom of women's social participation, an important pre-condition for their opportunities to participate on an equal footing in public and economic life. It includes the variables Freedom of movement of women outside their own household and Freedom of dress, measuring the requirement to follow a dress code when leaving the house.
The Physical integrity dimension comprises two indicators on violence against women, which measure the institutional basis of women's control over their bodies. The variable Violence against women measures the existence of laws against domestic violence, sexual assault or rape, and sexual harassment, and Female genital mutilation measures the prevalence of the practice.
The dimension Son preference measures a manifestation of son preference under scarce resources; it encapsulates social institutions (relating to marriage practices, locality of sons and daughters after marriage, and old-age arrangements) that lead parents to prefer sons to daughters as offspring (Stephan Klasen and Claudia Wink 2003). It includes the variable Missing women, which measures the share of women that have suffered from gender bias in mortality or pre-birth sex selection. 10 The Ownership rights dimension covers the economic sphere of social institutions proxied by women's access to several types of property with three variables referring to access to land, credit, and property other than land, respectively. 11 Supplemental Table 1 in the online appendix indicates the coding guidelines of all variables. Experts using coding manuals did the coding, and the OECD Development Centre undertook the actual coding as part of the construction of the GID Database. While we believe that these indicators and dimensions capture essential elements of social institutions affecting gender inequality, there are some notable limitations. First, some indicators only partially capture the dimension we wish to capture. For example, Violence against women is based on laws only, as comparable data on actual prevalence were not available; similarly, Freedom of dress only captures one particular aspect of women's obligations and restrictions when leaving the home. Second, in some dimensions one could think of additional indicators that could better capture the inequality in social institutions. For example, the Physical integrity subindex does not include any indicators related to social institutions associated with reproduction. We experiment with an indicator of abortion rights as one possible dimension, and report below on how this would change the results. 
Similarly, one could expand the Ownership rights dimension by including more specifically rights and institutions in the labor market; the Son preference dimension could also include indicators of fertility preference. In both cases, data limitations prevented an extension. Third, the scoring is often based on subjective interpretation of available information and can therefore be subject to criticism. Lastly, one may wonder why no measures of women's political rights and participation are included. This was based on the decision that the measures are meant to focus on social institutions and thus do not explicitly consider formal political institutions. Clearly, this is another area where one can reasonably disagree.
Thus we want to emphasize that our proposed measure could still be improved if better data on some of the dimensions of social institutions were available. We do, however, see our indicators and the composite index as a good starting-point for furthering this important research agenda.

The SIGI
Based on the indicators and subindices discussed above, the SIGI is then an unweighted average of a nonlinear function of the subindices. We use equal weights for the subindices, as we see no reason for valuing one dimension more or less than others. The nonlinear function arises because we assume that inequality in gender-related social institutions leads to deprivation experienced by affected women, and that deprivation increases more than proportionally when inequality increases. Thus, high inequality is penalized in every dimension. The nonlinearity also means that the SIGI does not allow for total compensation of inequality among subindices but permits only partial compensation. Partial compensation implies that high inequality in one dimension, that is one subindex, can only be partially compensated with low inequality on another dimension.
For our specific five subindices, the value of the index, the SIGI, is then calculated as follows:

$$\mathrm{SIGI} = \tfrac{1}{5}(\text{Subindex Family code})^2 + \tfrac{1}{5}(\text{Subindex Civil liberties})^2 + \tfrac{1}{5}(\text{Subindex Physical integrity})^2 + \tfrac{1}{5}(\text{Subindex Son preference})^2 + \tfrac{1}{5}(\text{Subindex Ownership rights})^2 \qquad (1)$$

Using a more general notation, the formula for the SIGI $I(X)$, where $X$ is the vector containing the values of the subindices $x_i$ with $i = 1, \ldots, n$, is derived from the following considerations. For any subindex $x_i$, we interpret the value 0 as the goal of no inequality to be achieved in every dimension. We define a deprivation function $\varphi(x_i)$. Higher values of $x_i$ should lead to a penalization in $I(X)$ that increases with the distance of $x_i$ to 0. In our case, the deprivation function is the square of the distance to 0, $\varphi(x_i) = (x_i - 0)^2 = x_i^2$, so that deprivation increases more than proportionally as inequality increases. Consistent with Equation (1), the index is then the unweighted average of these deprivations:

$$I(X) = \frac{1}{n}\sum_{i=1}^{n} \varphi(x_i) = \frac{1}{n}\sum_{i=1}^{n} x_i^2$$
This is inspired by the FGT poverty measures; this formula is defined for $y_i \le z$ as:

$$FGT_\alpha(Y; z) = \frac{1}{n}\sum_{i:\, y_i \le z} \left(\frac{z - y_i}{z}\right)^{\alpha}$$

where $Y$ is the vector containing all incomes, $y_i$ with $i = 1, \ldots, n$ is the income of individual $i$, $z$ is the poverty line, and $\alpha > 0$ is a penalization parameter.
To compute the SIGI, the value 2 is chosen for α as the square function has the advantage of easy interpretation. With α = 2, the transfer principle is satisfied (Foster, Greer, and Thorbecke 1984). In the case of the SIGI, the transfer principle means that, starting from a situation of equal score in two dimensions, an increase in score (that is, higher inequality) in one dimension and an equal-sized decrease of the score in the other dimension (that is, lower inequality) will raise the SIGI, thereby signaling higher overall gender inequality. 12 To highlight the effects of partial compensation as compared to total compensation, we computed the statistical association between the SIGI and a simple arithmetic average of the five subindices that allows for total compensation and compared the country rankings of both measures. The Pearson correlation coefficient between the SIGI and the simple arithmetic average of the five subindices shows a very high and statistically significant correlation between both measures (See Table 1). However, when we compare the ranks of the SIGI with those obtained using a simple arithmetic average of the five subindices in Supplemental Table 2 in the online appendix, we observe that there are noticeable differences in the rankings of the 102 included countries. Examples are China and Nepal. China ranks in position 55 using the simple average but worsens to place 83 in the SIGI ranking. Nepal has place 84 considering the simple average and improves to rank 65 using the SIGI. For China, this is due to the high inequality on the subindex Son preference, which in the SIGI case cannot be fully compensated with relatively low values for the other subindices. For Nepal, we observe the opposite case as all subindices have values reflecting moderate inequality.
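The difference between partial and total compensation can be made concrete with a short sketch of the aggregation in Equation (1). The function names and the subindex profiles are hypothetical and serve only as illustration; they are not taken from the GID Database.

```python
def sigi(subindices):
    """Unweighted average of the squared subindex values, as in Equation (1)."""
    return sum(x ** 2 for x in subindices) / len(subindices)

def simple_average(subindices):
    """Arithmetic mean of the subindices, which allows total compensation."""
    return sum(subindices) / len(subindices)

# Hypothetical profiles in the order Family code, Civil liberties,
# Physical integrity, Son preference, Ownership rights.
concentrated = [0.0, 0.0, 0.0, 1.0, 0.0]  # all inequality in one dimension
balanced = [0.2, 0.2, 0.2, 0.2, 0.2]      # same mean, spread evenly

for name, profile in [("concentrated", concentrated), ("balanced", balanced)]:
    print(name, simple_average(profile), sigi(profile))

# Transfer principle with two dimensions: starting from equal scores,
# raising one score and lowering the other by the same amount raises the index.
before = [0.4, 0.4]
after = [0.5, 0.3]
print(sigi(before) < sigi(after))
```

Both hypothetical profiles have a simple average of 0.2, but the concentrated profile has a SIGI of about 0.20 against about 0.04 for the balanced one, so the two measures would rank these profiles differently. This is the mechanism behind the rank changes reported for China and Nepal.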

RESULTS
Country rankings and regional patterns
Table 2 presents the results for the SIGI and its five subindices. Among the 102 developing countries considered by the SIGI, 13 Paraguay, Croatia, Kazakhstan, Argentina, and Costa Rica have the lowest levels of gender inequality related to social institutions. Sudan is the country that occupies the last position, followed by Afghanistan, Sierra Leone, Mali, and Yemen, which means that gender inequality in social institutions is a major problem there. As can be seen by studying the subindices, most of the top third of performers have no inequality in Civil liberties, no evidence for Son preference, and no inequality in Ownership rights. Therefore, for these countries, the final ranking is heavily influenced by performance in Family code and Physical integrity where (nearly) all countries show some inequalities. Particularly, the acceptance of violence against women plays a rather important role here.
In the subindex Family code, the best performers are China, Jamaica, Croatia, Belarus, and Kazakhstan, while the worst performers are Mali, Chad, Afghanistan, Mozambique, and Zambia. In the dimension Civil liberties, the top two-thirds of countries report no inequalities. Sudan, Saudi Arabia, Afghanistan, Yemen, and Iran occupy the last five positions of high inequality. In the subindex Physical integrity, Hong Kong, Bangladesh, Taiwan, Ecuador, El Salvador, Paraguay, and Philippines are at the top of the ranking, while Mali, Somalia, Sudan, Egypt, and Sierra Leone are at the bottom. In the dimension Son preference, two-thirds of countries report no inequality; the countries that rank worst are China, Afghanistan, Papua New Guinea, Pakistan, India, and Bhutan; for them, poor performance in this indicator has a sizable influence on their ranking in the overall SIGI. Finally, in the subindex Ownership rights, 42 countries share position 1, as they have no inequality in this dimension. 
On the other hand, the four worst performing countries are Sudan, Sierra Leone, Chad, and the Democratic Republic of Congo. Thus it is noticeable that, despite some correlation of performance across subindices, there is a great deal of heterogeneity in country performance across indicators, which further justifies only allowing partial compensation across dimensions. China is most extreme here, as it is ranked best in three dimensions (Family code, Civil liberties, and Ownership rights), while it performs abysmally on Son preference and also rather poorly on Physical integrity. Similarly, there are a number of Sub-Saharan African countries who score perfectly on Son preference and Civil liberties but very poorly on the other three, leading to poor rankings overall. Conversely, the countries that are most balanced in their (generally poor) performance across dimensions are from South Asia and the Middle East and North Africa, although there are some individual country exceptions.
To find out whether apparent regional patterns in social institutions related to gender inequality are systematic, we divide the countries into quintiles following the scores of the SIGI and its subindices (Table 3). The first quintile includes countries with the lowest inequality and the fifth quintile countries with the highest inequality. The SIGI does not rank any country in Europe and Central Asia (ECA) or Latin America and the Caribbean (LAC) in the two quintiles that reflect the highest inequality in social institutions related to gender. In contrast, most countries in South Asia (SA), Sub-Saharan Africa (SSA), and Middle East and North Africa (MENA) rank in these two quintiles. Despite this, it is interesting to note that two countries from these regions rank in the first (that is, best) two quintiles. These are Mauritius (SSA) and Tunisia (MENA). East Asia and Pacific (EAP) has countries in all five quintiles with Philippines, Thailand, Hong Kong, and Singapore in the best quintile and China in the worst quintile. The latter result is heavily influenced by China being the worst performer in the Son preference dimension (Klasen and Wink 2003). Examining the subindices, the patterns are overall similar to the one of the SIGI and are briefly summarized as follows:
(1) Family code: No country in ECA, LAC, or EAP shows high inequality in this dimension. SA, MENA, and SSA remain problematic, with most countries having social institutions related to high gender inequality. Exceptions are Bhutan in SA, Mauritius in SSA, and Tunisia and Israel in MENA.
(2) Civil liberties: Only three groups of countries using the quintile analysis can be generated with the first group including the first three quintiles. In SSA, over half of the countries are now in the first group. Also, in MENA, there are some countries with good scores (Israel, Morocco, and Tunisia). No country in SA is found in the first three quintiles of low and moderate inequality.
While these rankings for the SIGI and its subindices generate interesting results for the prevalence and country distribution of social institutions related to gender inequality, one may wonder to what extent these are driven by data limitations, choice of indicators, and dimensions. In particular, we discuss briefly three topics related to the selection of variables and country sample. The first one is that it could be argued that some of the indicators we use here are mostly relevant to a given region; for example, Son preference in SA, Female genital mutilation in SSA, or Freedom of dress as an issue for countries with Muslim populations. We investigated this issue in some detail. 14 First, none of the subindices or the individual indicators has a perfect regional correlation in the sense that inequality only occurs in one region. Son preference is an issue affecting all regions, and Female genital mutilation is an issue in five of the six regions. Of course, different regions are affected to different degrees, but that is precisely one of the issues this research would hope to uncover. Second, even the converse is (mostly) true: there are hardly any subindices and indicators where an individual region is entirely unaffected in the sense of having perfect equality. The exceptions are that ECA and LAC score perfectly on the Civil liberties subindex (and, by implication, on the indicators Freedom of movement and dress); and that in ECA, there is no issue of Female genital mutilation. All other indicators and subindices show some inequality in all regions. 
Third, there is substantial within-region heterogeneity in all indicators and subindices. 15 Lastly, we consider the issue of Freedom of dress, an issue that typically affects countries with a sizable Muslim population. Even if Freedom of dress is mainly an issue for countries with a Muslim majority, the correlation between religion and this variable, which arguably would indicate a social institution that makes it more difficult for women to participate in public life, is not automatic.
Of the forty-one Muslim majority countries, in nineteen there is a perfect score on the Civil liberties indicator (meaning no inequality), while only four countries rank highest for this inequality. 16 We will now briefly discuss the impact of including one additional variable, the legality of abortion, for which we have complete data available. The legality of abortion variable could arguably be included in the Physical integrity subindex. United Nations (2007) provides information on the legal availability of abortion by countries, classifying seven legal reasons for abortion, ranging from "to save the life of the woman" to "available upon request." Based on the approach taken by David E. Bloom, David Canning, Günther Fink, and Jocelyn E. Finlay (2009), we use the seven categories to equidistantly code the variable (with "available on request" receiving a score of 0 and "not allowed under any circumstances" a score of 1).
As a robustness exercise, we consider a reformulated SIGI using the same methods but including the abortion rights indicator (scored as just described) as an additional variable in the Physical integrity subindex. The results for the countries for which we can compute both the SIGI and the reformulated SIGI are shown in Table 4. Since many Latin American countries have more restrictive abortion policies, presumably related to their Catholic heritage, while many ECA countries, largely due to their socialist heritage, have particularly liberal policies, including this indicator changes the Physical integrity rankings at the top of the SIGI league table. In particular, Croatia now tops the list and seven ECA countries are among the top ten. Only Argentina and Cuba remain in the top ten, while Paraguay, El Salvador, and Ecuador rank a bit lower. But the change in rankings is based on rather small changes in the overall SIGI, and it only has a noticeable impact on rankings of countries in these two regions. At the bottom of the rankings, there are few changes.
While these are useful results, we decided ultimately not to include the abortion rights indicator in the final index, for the following two reasons. First, there is the question of the extent to which restrictions on abortions can be seen as gender inequality in Physical integrity. While one may agree that abortion restrictions in instances of rape, incest, or when the mother's life or health are endangered are legitimate issues of gender inequality in Physical integrity, it is more controversial whether restrictions related to socioeconomic reasons or the health of the fetus are issues of gender inequality in Physical integrity. It is also unclear how to quantitatively treat the different restrictions in the variable scoring. Second, there are also limitations to the data available. As noted in UN (2007), in a number of countries where abortion is not legal under any circumstances, it is unclear whether "a defense of necessity" to save a woman's life would be accepted by a court as de facto justification; thus, it is unclear whether a score of 1 in these cases is actually justified.
Lastly, we want to discuss the issue that the SIGI is produced only for non-OECD countries. The main problem is that our indicators are not appropriate for an accurate assessment of social institutions related to gender inequality in OECD countries. Using our indicators, the vast majority of OECD member countries (with the exception of Turkey, Mexico, and South Korea) would get a perfect score in the SIGI. This is partly due to the fact that legal discrimination that governs women's economic, social, and public life is largely absent in OECD countries; it is partly also due to the indicators that we use. For example, violence against women continues to be a problem afflicting OECD countries; but our proxy, as discussed above, does not pick up the prevalence (only the legality of it), which again gives most OECD countries a perfect score. Therefore, by not including OECD countries we avoid the misleading impression that there are no remaining inequalities in social institutions that affect OECD countries. One way out could be to produce a different SIGI using different indicators for OECD countries, as similarly done with the two versions of UNDP's (1996) Human Poverty Index, or to extend the SIGI to include more dimensions that have greater relevance for OECD countries. Both options are fruitful avenues to pursue this matter further.

Simple correlation with other gender-related indices
The SIGI seeks to understand gender inequality in a new way by focusing on gender gaps in social institutions that influence the basic functioning of society and explain gender inequality in outcomes. From this conceptual perspective, the SIGI contributes to existing gender-related measures. Additionally, one can take an empirical redundancy perspective and ask whether the SIGI provides additional empirical information as compared to other measures, by analyzing the statistical association between the SIGI and other well-known gender-related indices. Relying on Mark McGillivray and Howard White (1993), we use a correlation coefficient of 0.80 in absolute value as the threshold to separate redundancy from nonredundancy.
We also calculate Kendall Tau-b as a measure of rank correlation between the SIGI and each of the following indices: the GDI and GEM from UNDP (2006); the Global Gender Gap Index (GGG) from Ricardo Hausmann, Laura D'Andrea Tyson, Saadia Zahidi, and Klaus Schwab (2007); and the CIRI Women's Social Rights Index (CIRI Human Rights Data Project n.d.). As the GDI and the GEM have been criticized in the literature (for example, Stephan Klasen [2006]; Dana Schüler [2006]), we also do the analysis for two alternative measures, the Gender Gap Index Capped (GGI) and a revised Gender Empowerment Measure (GEM revised) based on income shares proposed by Stephan Klasen and Dana Schüler (2011). 17 For all the indices considered, Kendall Tau-b is lower than 0.60 in absolute value and statistically significant (Table 5); and rankings differ substantially (Table 6). 18 Clearly, the SIGI is related to these gender inequality measures but is nonredundant. This suggests that the SIGI conceptually reflects a different approach to measuring gender inequality, and it also empirically captures different aspects than currently available measures. Interestingly, the highest correlation in absolute value (around 0.50) is found between the SIGI and the GDI and GGI (capped) with both measures combining health, education, and income (or labor force participation). The lowest correlation (around 0.43) is observed for the two empowerment measures GEM and GEM revised. The results for GGG and WOSOC are in between (around 0.48). 19 Similar results regarding correlations of the SIGI with other gender indices are reported by Irene van Staveren (2011). She finds that the SIGI is actually least correlated when studying the correlations of the SIGI, UNDP's new GII, the GGG, and the Gender Equity Index based on the Indices of Social Development (ISD) database, with the Pearson correlation coefficients of the SIGI running from 0.64 to 0.77.
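The redundancy check described above can be sketched as follows. The implementation of Kendall Tau-b is a plain pairwise count rather than the routine used for Table 5. The scores are hypothetical. The function `is_redundant` and its treatment of a coefficient exactly equal to the threshold are assumptions made for illustration.

```python
import math
from itertools import combinations

def kendall_tau_b(x, y):
    """Kendall Tau-b rank correlation, with the standard correction for ties."""
    concordant = discordant = ties_x_only = ties_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: enters neither factor below
        elif dx == 0:
            ties_x_only += 1
        elif dy == 0:
            ties_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied_in_y = concordant + discordant + ties_x_only
    untied_in_x = concordant + discordant + ties_y_only
    return (concordant - discordant) / math.sqrt(untied_in_x * untied_in_y)

def is_redundant(coefficient, threshold=0.80):
    # ASSUMPTION: a coefficient exactly at the threshold counts as redundant.
    return abs(coefficient) >= threshold

# Hypothetical scores for six countries. A higher SIGI means more inequality,
# while a higher value of the outcome index means less inequality.
sigi_scores = [0.02, 0.05, 0.10, 0.20, 0.35, 0.60]
outcome_index = [0.78, 0.70, 0.74, 0.60, 0.66, 0.55]

tau = kendall_tau_b(sigi_scores, outcome_index)
print(tau, is_redundant(tau))
```

In this hypothetical example the two measures are negatively related, but the absolute value of the coefficient stays below 0.80, so the example would be classified as nonredundant under the threshold used here.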
Summarizing these correlations, it is clear that the SIGI is related to outcome-based measures; but this correlation is far from perfect. This is what we would expect. Clearly, gender inequality in social institutions should be an important driver of gender inequality in outcomes; but we would not expect a perfect match. We therefore now turn to investigate to what extent the SIGI and its components can indeed be seen as a driver of gender inequality outcomes. 20
gaps in outcome variables related to basic rights such as health, economic participation, and political empowerment. The second response variable is the ratio of GDI to Human Development Index (HDI) as a composite measure of gender inequality in the dimensions health, education, and income. As the GDI is not really a measure of gender inequality but a measure of human development that penalizes for gender inequality, one can use the ratio of GDI to HDI as a proxy for gender inequality. Additionally, we also use the ratio of the female to male HDI as calculated by Klasen (2006) as another measure of gender gaps in development outcomes. In all three regressions, we control for the level of economic development using the log of per capita GDP in constant prices (US$, PPP, base year: 2005; World Bank 2008); for religion using a Muslim majority and a Christian majority dummy, the left-out category being countries that have neither a majority of Muslim nor a majority of Christian population (CIA 2009); and for geography and other unexplained heterogeneity that might go together with region using region dummies, the left-out category being Sub-Saharan Africa. 21 Table 7 presents the regression results. With GGG as a dependent variable, the SIGI is negatively associated with GGG and significant at the 1 percent level. 
The second regression, with the ratio of GDI to HDI as dependent variable, shows that the SIGI is again negatively associated with the response variable, and this association is statistically significant at the 1 percent level. The same is true when using the ratio of the female to male HDI, where the SIGI has a strong and highly significant negative association with the response variable. This confirms again that gender inequality in well-being and empowerment is strongly associated with social institutions that shape gender roles. 22 To check that our findings are not driven by observations that have large residuals and/or high leverage, we also run a range of robustness checks, obtaining similar results. 23 While these regressions document a significant correlation, one should certainly be careful with any statement about causality, as there could be omitted variables, measurement error, and reverse causality (Jeffrey Wooldridge 2002). We include control variables in the regressions to reduce omitted variable bias, but it is not possible to rule out this problem. As the institutions we capture tend to be long-lasting, we also believe that reverse causality is rather unlikely.
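The regression specification described above can be written compactly as follows. This is a sketch based only on the variables named in the text; the symbols (Y, the beta and gamma coefficients, and the variable labels) are notation introduced here for illustration, not the authors' own. The second expression is the standard HC3 covariance estimator referred to in note 21.

```latex
% Y_i is one of the three response variables: GGG, GDI/HDI, or female/male HDI.
% Region dummies omit Sub-Saharan Africa; the religion dummies omit countries
% with neither a Muslim nor a Christian majority.
\[
Y_i = \beta_0 + \beta_1\,\mathrm{SIGI}_i + \beta_2 \ln(\mathrm{GDPpc}_i)
    + \beta_3\,\mathrm{Muslim}_i + \beta_4\,\mathrm{Christian}_i
    + \sum_{r} \gamma_r\,\mathrm{Region}_{ri} + \varepsilon_i
\]

% Standard HC3 heteroscedasticity-robust covariance matrix, where h_{ii} is the
% leverage of observation i and \hat{\varepsilon}_i is the OLS residual.
\[
\widehat{\mathrm{Var}}(\hat{\beta}) = (X^{\top}X)^{-1} X^{\top}
  \,\mathrm{diag}\!\left(\frac{\hat{\varepsilon}_i^{2}}{(1-h_{ii})^{2}}\right)
  X\,(X^{\top}X)^{-1},
\qquad h_{ii} = x_i^{\top}(X^{\top}X)^{-1}x_i
\]
```

In this notation, the results reported for Table 7 correspond to a negative and statistically significant estimate of the coefficient on the SIGI in each of the three regressions.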
In addition, we submit that the SIGI might be a useful measure to tackle endogeneity in other types of regression. For example, in regressions examining the impact of gender gaps in education or health on economic growth or other development outcomes, endogeneity is likely to be an issue. To the extent that the SIGI is able to explain these gender gaps and is not directly related to growth or the development outcome examined, it would be a plausible instrument to tackle endogeneity issues in such types of analyses.
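The suggested use of the SIGI as an instrument can be sketched as a generic two-stage setup. The notation below is an assumption made for illustration (G for the gender gap in education or health, Y for growth or another development outcome, X for control variables); the study itself does not estimate such a model.

```latex
% First stage: the SIGI explains the endogenous gender gap G_i.
\[
G_i = \pi_0 + \pi_1\,\mathrm{SIGI}_i + \pi_2^{\top} X_i + v_i
\]

% Second stage: the outcome is regressed on the fitted gender gap.
\[
Y_i = \alpha_0 + \alpha_1\,\hat{G}_i + \alpha_2^{\top} X_i + u_i
\]

% Conditions stated in the text, in this notation:
% relevance (the SIGI explains the gap) and exclusion (no direct link to Y).
\[
\pi_1 \neq 0, \qquad \mathrm{Cov}(\mathrm{SIGI}_i,\, u_i) = 0
\]
```

The second condition is the demanding one: it requires that social institutions affect the outcome only through the gender gap being instrumented, which cannot be tested directly with a single instrument.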

CONCLUSIONS
In this study, we presented a composite index that approaches gender inequality in a way that has been neglected in the literature and by other gender measures that focus mainly on well-being and agency. Instead of measuring gender inequality in well-being or agency outcome dimensions, the proposed measures proxy the underlying social institutions that are mirrored by societal practices and legal norms that might produce inequalities between women and men in developing countries. We construct five subindices, each capturing one dimension of social institutions related to gender inequality, that we combine into the SIGI, a multidimensional index of deprivation of women caused by social institutions related to gender inequality. The aggregation procedure used for the SIGI has the advantage of penalizing high inequality in each dimension and allowing for only partial compensation among the five dimensions. At the same time, the SIGI is easy to understand and to communicate. The SIGI's composite measures allow for comparison and ranking of the deprivation of women in over 100 developing countries. Empirical results show that the SIGI is statistically nonredundant and adds new information to other well-known gender-related measures. The SIGI and the five subindices can help policymakers detect the problems that need to be addressed in certain developing countries and in specific dimensions of social institutions. The SIGI suggests that the regions with the highest inequality are South Asia, Sub-Saharan Africa, and Middle East and North Africa. The composite measures can be valuable instruments to generate public discussion. Moreover, the SIGI and its subindices have the potential to influence current development thinking, as they highlight social institutions that affect overall development. As shown in the literature (World Bank 2011; Stephen Knowles, Paula K. Lorgelly, and P. Dorian Owen 2002; Klasen 2002; Klasen and Lamanna 2009), gender inequality in education and employment negatively affects overall development. Economic research investigating these outcome inequalities should consider social institutions related to gender inequality as possible explanatory factors. Results from regression analysis show that the SIGI is related to gender inequality in well-being and empowerment, even after controlling for region, religion, and level of economic development.
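The aggregation property summarized above (penalizing high inequality in a dimension and allowing only partial compensation) can be illustrated with a short sketch. It assumes that the SIGI is the unweighted mean of the squared subindex values, in the spirit of the Foster–Greer–Thorbecke measure with exponent 2; the exact formula is defined in the methodology section and is not reproduced in this part of the text, so the exponent, the function names, and the two example countries are assumptions for illustration only.

```python
def sigi(subindices, alpha=2):
    """Unweighted mean of subindex values raised to the power alpha.

    Each subindex is expected to lie in [0, 1], where 0 means no inequality.
    alpha=2 is an assumption made for this sketch.
    """
    if not subindices:
        raise ValueError("at least one subindex is required")
    for value in subindices:
        if not 0.0 <= value <= 1.0:
            raise ValueError("subindex values must lie in [0, 1]")
    return sum(value ** alpha for value in subindices) / len(subindices)


def simple_mean(subindices):
    """Arithmetic mean, which allows full compensation between dimensions."""
    return sum(subindices) / len(subindices)


# Two hypothetical countries with the same arithmetic mean (0.4) over the
# five dimensions: one with evenly spread inequality, one with extreme
# inequality concentrated in a single dimension.
even = [0.4, 0.4, 0.4, 0.4, 0.4]
concentrated = [1.0, 0.25, 0.25, 0.25, 0.25]

for name, scores in (("even", even), ("concentrated", concentrated)):
    print(name, round(simple_mean(scores), 4), round(sigi(scores), 4))
```

Under this assumption the two countries are indistinguishable by the arithmetic mean, but the country with one extreme dimension receives the higher (worse) aggregate score, because low values elsewhere only partly offset it.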
When constructing composite indices, one is always confronted with decisions and trade-offs concerning the choice and treatment of the variables included, the weighting scheme, and the aggregation method. Some limitations of the subindices and the SIGI must be noted. First, a composite index depends on the quality of the data used as input. Social institutions related to gender inequality are hard to measure, and the creation of the OECD Development Centre's GID Database containing several indicators on social institutions is an important step forward (Morrisson and Jütting 2005; Jütting et al. 2008). It is worthwhile to continue this endeavor and invest more resources in the measurement of social institutions related to gender inequality. This includes improving data coverage and coding schemes and expanding and refining the indicators. It would also be useful to exploit available prevalence and perception data; for example, the Demographic and Health Surveys (DHS) capture women's perceptions of domestic violence. Similarly, more comparable data on fertility preferences or social institutions involving labor markets would be useful. In some cases, extensions or even new data-gathering exercises will be required.
Second, by aggregating variables and subindices, some information is inevitably lost. Figures and rankings according to the SIGI and the subindices should not substitute for a careful investigation of the variables from the database. Furthermore, to understand the situation in a given country, additional qualitative information could be valuable. Detailed information on each country is available in OECD (2010), which includes a country discussion on the five dimensions of the SIGI.
Third, the SIGI only measures institutions at the country level. For some dimensions, the use of micro data could be useful to generate a more disaggregated version of the SIGI; here again, the DHS or other cross-country comparable micro data sets (such as UNICEF's MICS, the World Values Survey, or Gallup World Poll data) would be useful sources. 24 Fourth, the omission of OECD countries remains a problem of the measure. While an inclusion in the current formulation of the SIGI is problematic for the reasons discussed above, creating a SIGI specifically for OECD countries or enhancing the indicator suite to make it more sensitive to gender issues in OECD countries would be desirable. Similarly, generating data to develop indicators in currently unmeasured aspects of social institutions could also affect the ranking among developing regions. As our sensitivity analysis with abortion rights shows, inclusion of an additional indicator can affect the ranking of regions. Thus we caution that the good performance of some regions (including Latin America and Eastern Europe and Central Asia) might be partly due to the omission of indicators on gender gaps in social institutions there.
Centre "Poverty, Equity and Growth in Developing and [...]
[...] (2009), we use polychoric PCA, which relies on polychoric and polyserial correlations. These correlations are estimated with maximum likelihood, assuming that there are latent normally distributed variables that underlie the ordinal categorical data.
8 The first principal component is the weighted sum of the standardized original variables that captures as much of the variance in the data as possible. The proportion of explained variance by the first principal component is 70 percent for Family code, 93 percent for Civil liberties, 60 percent for Physical integrity, and 87 percent for Ownership rights. The standardization of the original variables is done as follows: In the case of continuous variables, one subtracts the mean and then divides by the standard deviation; in the case of ordinal categorical variables, the standardization uses results of an ordered probit model.
9 Acceptance of polygamy in the population might proxy actual practices better than the formal indicator legality of polygamy, as laws might be changed faster than practices. Therefore, the acceptance variable is the first choice for the subindex Family code. The reason for using legality when acceptance is missing is to increase the number of countries included.
10 Originally, Missing women was part of the dimension Physical integrity, but we argue that missing women reflects another dimension of gender inequality. The two components of Physical integrity, Violence against women and Female genital mutilation, focus on freedom from bodily harm, while Missing women is a more general proxy for Son preference that results in skewed fertility strategies and allocation decisions favoring sons. It also turns out that the statistical association between the two indicators of Physical integrity and Son preference is rather weak, suggesting that it is measuring a different concept.
11 Note that these indicators are based on legal rights, not actual prevalence. See Cheryl R. Doss, Caren Grown, and Carmen Diana Deere (2008) for a careful discussion of how to generate micro-based indicators of asset ownership by gender.
12 Some differences between the SIGI and the FGT measures must be highlighted. In the case of the SIGI, we are aggregating across dimensions and not over individuals. Moreover, in contrast to the income case, a lower value of x_i is preferred, and the normalization achieved when dividing by the poverty line z is not necessary as 0 ≤ x_i ≤ 1, i = 1, . . . , n.
13 The subindices are computed only for countries that have no missing values on the relevant input variables. In the case of the SIGI, only countries that have values for every subindex are considered.
14 Most of the results we report here can be deduced from the tables with the country rankings. We did not report separate tables for this analysis, but they are available on request.
15 The only exception here is in MENA, where Inheritance rights uniformly score 0.5.
16 Moreover, from a statistical point of view, the rank correlation coefficient Kendall Tau-b between the other variable in the subindex (namely, Freedom of movement) and Civil liberties as it is defined here is close to 0.9. This suggests that excluding the variable Freedom of dress, and having Freedom of movement as the only variable capturing the freedom of women's social participation, would not lead to a major change in the ranking of countries according to this subindex.
17 The GGI is a geometric mean of the ratios of female to male achievements in the dimensions health, education, and labor force participation. "Capped" means that every component is capped at one before calculating the geometric mean. This is done to ensure that only gaps hurting women are considered. The GGI can be more directly interpreted as a measure of gender inequality, while the GDI measures human development penalizing gender inequality. The GEM has three components: political representation, representation in senior positions in the economy, and power over economic resources. The most problematic component is power over economic resources proxied by earned incomes. This component measures female and male earned incomes using income levels adjusted by gender gaps; it is empirically largely driven by income levels, not by gender gaps. To avoid this problem, the revised GEM only uses income shares of men and women in this component.
18 We have also computed the Pearson correlation coefficient between SIGI and all the measures. The Pearson correlation coefficient is lower than 0.80 for all correlations.
19 It must be noted that the samples used for computing the rank correlation differ from case to case, ranging from thirty-three countries (GEM) to ninety-nine (WOSOC).
20 See Branisa, Klasen, and Ziegler (2013) and Branisa and Ziegler (2010) for more detailed assessments of the empirical relevance of the SIGI and its subindices in explaining development outcomes.
21 As the number of observations is lower than 100, we use HC3 robust standard errors proposed by Russell Davidson and James G. MacKinnon (1993) to account for possible heteroscedasticity in our data.
22 Using the difference between the HDI and the GDI, another possible measure of gender inequality, the impact of the SIGI is similarly significant.
23 Results are available upon request. The type of robust regression we perform uses iteratively reweighted least squares and is described in Lawrence C. Hamilton (1992). A regression is run with ordinary least-squares, then case weights based on absolute residuals are calculated, and a new regression is performed using these weights. The iterations continue as long as the maximum change in weights remains above a specified value.
24 See Doss, Grown, and Deere (2008) for suggestions regarding developing micro data on gender inequality in asset holdings.
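Note 17 describes the capped GGI as a geometric mean of female-to-male ratios, each capped at one. A minimal sketch of that calculation follows; the function name and the example ratios are hypothetical, and details such as how each ratio is constructed from the underlying indicators are not specified in the note and are left out.

```python
from math import prod


def ggi_capped(ratios):
    """Geometric mean of female-to-male ratios after capping each at 1.

    Capping means that a female advantage in one dimension cannot offset
    a female disadvantage in another.
    """
    if not ratios:
        raise ValueError("at least one ratio is required")
    if any(r < 0 for r in ratios):
        raise ValueError("ratios must be non-negative")
    capped = [min(r, 1.0) for r in ratios]
    return prod(capped) ** (1 / len(capped))


# Hypothetical ratios for health, education, and labor force participation.
example = [1.2, 0.8, 0.5]
print(round(ggi_capped(example), 4))
```

In the example, the ratio of 1.2 is treated as 1, so only the two dimensions in which women are disadvantaged lower the index.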