Regional Governance Matters: Quality of Government within European Union Member States

Charron N., Dijkstra L. and Lapuente V. Regional governance matters: quality of government within European Union member states, Regional Studies. This study presents novel data (European QoG Index – EQI) on the ‘quality of government’ (QoG) – understood as low corruption, impartial public services and rule of law – for national and sub-national levels in twenty-seven European Union countries. The EQI shows notable within-country variations: while high-performing regions in Italy and Spain (for example, Bolzano, País Vasco) rank amongst the best European Union regions, others perform well below the European Union average. The index is highly correlated with sub-national levels of socio-economic development and levels of social trust, yet political decentralization is uncorrelated with greater within-country, or higher levels of overall, QoG.


INTRODUCTION
The principal aim of this study is descriptive in nature: to present newly created data on 'quality of government' (QoG) for twenty-seven European Union (EU) countries and 172 sub-national regions, primarily taken from the largest multi-country, sub-national-level survey on QoG to date. The first and primary part of this paper provides a detailed account of the construction of the data and the robustness checks, and shows the final results; the data for all countries and regions are provided in Appendix A and are free for scholarly use. 1 Overall, the data show that even in a highly developed area such as Europe, significant QoG variation exists between and within countries, not only between new and old member states, but also even among the original six countries.
The secondary aim of the study is to use the newly created data to test several hypotheses on the relationship between QoG and numerous commonly tested national-level correlates. While the empirical section is admittedly modest, it is emphasized that such tests at the sub-national level were not possible before the creation of these data and the data presented here are to encourage future QoG research at the sub-national level (for example, CHARRON and LAPUENTE, 2013). Overall, it is believed that the data provided here significantly contribute to one's understanding of the surprisingly vast amount of QoG variation across and within EU countries and they provide a valuable tool for researchers interested in moving past national comparisons to more detailed, regional-level studies within the EU.

WHY STUDY QUALITY OF GOVERNMENT IN EUROPE?
During the last two decades numerous studies have indicated that QoG is a major determinant of many variables associated with the well-being of individuals within a country. This literature emphasizes the importance of how a government delivers its policies, instead of what a government delivers, that is, the size or 'quantity' of government. In particular, the focus is on the extent to which a government delivers its policies, irrespective of their nature and degree or provision, in an effective and impartial way and without corruption.
Evidence of such attention to the way a government performs its tasks can be found not only in the rise of academic publications with a focus on this topic, but also in the interest from international organizations such as The World Bank and the United Nations, which have increasingly underlined the value of good governance and sound institutions from a development perspective (HOLMBERG et al., 2009). This has in turn given rise to a recent surge in new data creation, quantifying aspects of QoG and, in particular, its most measurable components (even if the measures are most often subjective or perception based), such as the lack/control of corruption, the strength of the rule of law and bureaucratic quality or 'government effectiveness'. There is such a high correlation amongst these cross-country indicators that comparative scholars have coined the term 'quality of government' (QoG) to encapsulate the concept of a government that is impartial, efficient and non-corrupt (ROTHSTEIN and TEORELL, 2008). Countries with high QoG score higher in almost all dimensions related to the welfare of their citizens (HOLMBERG et al., 2009). QoG has been found, in an extensive and growing literature, to lead to outcomes such as better economic performance (KNACK and KEEFER, 1995; MAURO, 1995; MO, 2001), higher environmental sustainability (MORSE, 2006; WELSCH, 2004), lower income inequality and poverty (GUPTA et al., 1998), better education and health outcomes (MAURO, 1998), higher levels of subjective happiness (FREY and STUTZER, 2000), and lower probabilities of civil armed conflict (ÖBERG and MELANDER, 2010).
Despite the importance of these findings, empirical measures within this subfield are still relatively underdeveloped. One of the major shortcomings is that most data and research related to QoG have focused exclusively on the national level, with a particular focus on developing countries. The two implicit assumptions in the extant research efforts to gather data on QoG have thus been that national differences matter more than sub-national ones and that, across similar Western democracies, the differences in QoG are fairly minor. This study challenges both assumptions. First, it focuses exclusively on the twenty-seven member states of the EU, arguably all moderately to highly developed countries, which nonetheless present noticeable, and statistically significant, differences in QoG. 2 Second, this study gathers data on both national and sub-national differences, uncovering how the latter quite frequently trump the former: for example, the gap between Italy's Bolzano region, which ranks near the top of all EU regions, and Campania, which is among the lowest, is wider than the gap between the countries of Denmark and Hungary.
The main findings of the study are the following. First, it is found that there is significant variation in QoG across four main cluster groups of states: the top performers are mostly from the Scandinavian, Germanic and English-speaking countries; a second group would largely be formed by Mediterranean countries together with Estonia and Slovenia; the third group would consist of most new member states plus, notably, Italy and Greece; and a fourth group is made up of the two newest member states, Romania and Bulgaria. It is found, however, that in several countries QoG national-level data offer a highly distorted picture due to the presence of significant sub-national variations in QoG, overestimating low-performing regions while underestimating higher-performing ones. The previous literature has pointed in that direction. For example, differences between Northern and Southern Italy are widely known thanks to several influential works (for example, PUTNAM et al., 1993), and the divergences between Flanders and Wallonia in Belgium as well as the provinces in Spain are often discussed. Yet, such regional differences in QoG in Belgium and Spain, and in most other EU countries, have not been quantified throughout the EU in a systematic way. The most encompassing empirical studies of European regional differences (for example, TABELLINI, 2005) have mostly relied on a rather indirect measure: levels of gross domestic product per capita as proxies of the level of QoG in a region. Thus, despite their cross-sectional nature (this is the first time this information on QoG has been gathered), the data presented in this study are a pioneering effort to corroborate within-country QoG variations in most European countries simultaneously.
In addition, five basic hypotheses are tested using the newly created European Quality of Government Index (EQI), essentially seeking to test whether national-level findings hold at the regional level. First, it is hypothesized that the EQI will be strongly correlated with measures of economic and social development, such as gross domestic product per capita, health measures and levels of education (Hypothesis 1). It is important to bear in mind that, as the previous literature has noted, the causality can work in both directions thanks to feedback effects: that is, QoG may be both cause and consequence of these socio-economic variables. Second, it is tested whether the size of a region, be it population or area size, has any association with QoG levels (Hypothesis 2). Third, based on numerous national-level studies, it is hypothesized that social trust will be positively related with the EQI (Hypothesis 3). Fourth, it is hypothesized that various types of political decentralization, in the form of federalism or regional administrative authority, will be associated with greater degrees of disparity of QoG among regions within a country (Hypothesis 4) or systematically linked with country-levels of QoG (Hypothesis 5). Similar to the empirical literature with a national-level focus, strong evidence is found for Hypotheses 1 and 3 at the regional level in the EU. As regards Hypothesis 2, some empirical evidence is found that population and area size are related to QoG within countries, yet not EU wide. Finally, contrary to the extensive literature on decentralization and QoG, no evidence is found for Hypotheses 4 or 5.
The remainder of the paper is structured as follows. First, it begins the measurement of QoG within the EU with a national-level assessment, using existing data from The World Bank's Worldwide Governance Indicators (KAUFMANN et al., 2009). Next, it describes the regional-level survey undertaken in 172 EU regions from the largest eighteen member states. Subsequently, it combines national and regional QoG data into the full index (that is, the EQI) for the entire EU. Next, it tests the hypotheses discussed above with the EQI. It concludes with several suggestions of important empirical puzzles that could be addressed in future using these new data.

MEASURING QUALITY OF GOVERNMENT AT THE NATIONAL LEVEL IN EUROPE
According to the existing contemporary, national-level data, QoG, or 'good governance', is on average higher for EU countries as compared with other world regions. This in and of itself is not surprising, but a closer look reveals that there is significant variation among many of the countries in the EU, which is discussed in this section. As noted, a proliferation of QoG-type data has emerged since the mid-1990s, measuring such concepts as corruption, rule of law and others at the national level. Many of the indicators cover most or all of the EU countries, such as Transparency International's Corruption Perceptions Index (CPI), the International Country Risk Guide (ICRG), or the World Economic Forum's business leader survey on corruption and bureaucratic effectiveness, to name but a few. After reviewing all available QoG indicators covering EU countries, it is found that The World Bank's Worldwide Governance Indicators (WGI) (KAUFMANN et al., 2009) data are the most suitable source on which to compare and assess QoG for EU countries. First, as opposed to only focusing on one particular concept of QoG, such as corruption, it covers four main, interrelated 'pillars' of QoG that the authors find to be highly salient: control of corruption; rule of law; government effectiveness; and voice and accountability.
Second, the WGI covers all EU countries for at least twelve years, going back to the mid-1990s, and it is now published annually. Third, it is a 'composite index' and is transparent in the way that it is constructed, publishing freely all underlying data on which it is built, along with a relatively clear description of the conceptual meaning of each concept and the methodology used to create each variable. Fourth, the theoretical scope of each QoG concept is wide rather than narrow. The authors believe that unless specified, all aspects of corruption, rule of law, etc. should be included rather than focusing on narrow aspects alone. This allows for more information to be included, which is good for reliability checks of the data, for example. 3 As far as the underlying data indicators are concerned for each pillar, the number of sources varies from country to country in the dataset covering all countries in the world (some small island states have only one source for a given pillar, for example, while some states have more than fifteen). However, the advantage of the EU sample is that there are at least nine common sources for each individual data indicator of QoG for the WGI, and in the case of Rule of Law, there are at least twelve for all countries, making for much more reliable comparison than among countries with only a few sources in common. 4 The sources of the underlying data are mainly 'risk assessment' institutes or 'expert' surveys, yet they also include non-governmental organization assessments such as Reporters Sans Frontières and Freedom House, along with data from government agencies and citizen-based survey data, such as the Gallup World Poll, 5 reducing the likelihood that a country's score was driven by one source or, moreover, that a country's score was exclusively influenced by business interests. 6 The twenty-seven EU countries were ranked according to each of the four 'pillars' of QoG listed above.
7 However, the authors were uncertain about the robustness of the data. Thus, all data used to construct these four QoG indices for the year 2008 were taken, the original results were replicated, and extensive sensitivity tests and internal/external consistency checks were conducted on each of the four areas of QoG. After running a total of 264 alternative simulations, in which the original weighting scheme and the method of aggregation were altered and individual data sources were removed one at a time, the data and country ranking of each pillar were found to be remarkably robust to changes, along with being strongly internally consistent. 8 After confirming the robustness of the original estimates for each of the four composite indices, the four indices were then combined to create a national-level 'QoG index', 9 the results of which can be seen in Table 1.
According to the WGI's own margins of error, the QoG estimates between countries such as Denmark and Finland or the Czech Republic and Hungary are indistinguishable. Thus, hierarchical cluster analysis is used to assess the national-level variance in QoG across EU countries, using Ward's method and squared Euclidean distancing for the four pillars of QoG to identify the appropriate number of cluster groupings, which serve as a helpful tool to identify EU member states that share common challenges to building QoG at the national level. Although distinguishing the number of groups in this type of analysis can be arbitrary at times, it was found that the most appropriate alignment was to distinguish between four groups in the analysis. k-means clustering with squared Euclidean distancing was then used to assign each country to a cluster group. The results show mainly that, with some exceptions, there are noticeable geographic and historic similarities among the countries within each group. Without claiming that these groups are 'set in stone' so to speak, the data indicate that cluster 1 countries exhibit the highest levels of QoG in Europe, while clusters 2 and 3 show, respectively, a moderate and a moderate-to-low performance in QoG, and cluster 4 is at the lowest end comparatively. 10 According to this picture based on aggregate national data, there thus seem to be four Europes with respect to QoG. A first group contains the top performers, mostly from Scandinavian, Germanic and Anglo-Saxon countries. A second group consists mostly of Mediterranean countries plus the two best performers in Central-Eastern Europe (Estonia and Slovenia). The third group consists mostly of post-Communist EU members and, significantly, two Western European countries: Italy and Greece. The fourth group is the two most recent member states, Bulgaria and Romania, which on average show the lowest levels of QoG in each of the four pillars.
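The assignment step described above can be illustrated with a minimal sketch of k-means using squared Euclidean distance. It is not the authors' code: the four-pillar scores below are hypothetical values rather than entries from Table 1, the starting centroids are picked by hand, and the preceding Ward's-method step that motivates the choice of four groups is not reproduced.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, initial_centroids, max_iter=100):
    """Lloyd's algorithm: assign each point to its nearest centroid,
    recompute centroids as cluster means, repeat until labels settle."""
    centroids = list(initial_centroids)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Hypothetical scores on the four pillars (corruption, rule of law,
# effectiveness, voice) for eight illustrative countries.
points = [
    (2.0, 2.0, 2.0, 1.5), (1.9, 1.8, 2.1, 1.6),
    (1.0, 1.0, 1.1, 1.0), (0.9, 1.1, 1.0, 1.1),
    (0.3, 0.4, 0.5, 0.8), (0.4, 0.3, 0.6, 0.7),
    (-0.2, -0.1, 0.0, 0.5), (-0.3, -0.2, 0.1, 0.6),
]
# Assumption: one hand-picked starting centroid per expected group.
labels, centroids = kmeans(points, [points[0], points[2], points[4], points[6]])
```

With real data the result can depend on the starting centroids, so in practice several initializations would normally be compared.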
To put this variation into a more global context, the groups of EU states were compared with countries outside the EU that have equivalent WGI scores. It was found that according to the WGI, Denmark, Sweden and Finland are on par with the highest performing countries in the world such as Canada, New Zealand and Singapore. The average country in group 2, which includes countries such as Portugal and Slovenia, has a WGI score similar to that of South Korea, Israel and Qatar. The average country in the third group, which includes countries such as Latvia and Greece, has similar QoG levels to Botswana, Costa Rica, Uruguay or Kuwait. The two countries in the lowest group, Bulgaria and Romania, are ranked on par with countries such as Colombia, Panama, India and Ghana. Thus, while the EU states score above the fiftieth percentile globally, these global comparisons indeed show evidence of significant variation at the country level, which merits further investigation.

MEASURING QUALITY OF GOVERNMENT AT THE REGIONAL LEVEL IN EUROPE
While certainly relevant as a starting point, the national-level cluster groups do not tell the whole story.
National-level data have of course proliferated in recent years, yet measuring QoG at the regional level within most individual countries is still 'uncharted territory', let alone measuring regional QoG in a multi-country context. Several recent surveys have been launched by Transparency International in Mexico and India to build measurements of corruption at the regional level. However, in most countries, in particular those in Europe, such data do not exist, and those that do are more narrowly focused on capturing corruption alone, mostly in Italy (for example, DEL MONTE and PAPAGNI, 2007; GOLDEN and PICCI, 2005).
To add the necessary nuance to the national-level WGI data, the authors take advantage of data acquired for a large, European Commission-funded project on measuring QoG within the EU (CHARRON et al., 2011). The authors began with a survey of approximately 34 000 EU citizens, which constitutes the largest survey ever undertaken to measure QoG at the sub-national level to date. A regional-level QoG index score for 172 NUTS-1 and NUTS-2 regions (Nomenclature des Unités Territoriales Statistiques) within eighteen EU countries was built based on survey questions on citizen perception of QoG. 11 As a complement to national-level QoG data, the citizen-based data offer a source of information that is not subject to the common criticism that QoG data are biased toward 'business friendly' environments (KURTZ and SCHRANK, 2007). For a more detailed description of the survey, see Appendix A.
To capture the most relevant sub-national variation in QoG possible, the work focused on three public services that are often financed, administered or politically accounted for by sub-national authorities, at either regional, county or local level: education, healthcare and law enforcement. 12 Respondents were asked to rate these three public services with respect to three related concepts of QoG based on their own experiences as well as perceptions: quality, impartiality and the level of corruption of said services.
It can be argued that the administrative and political responsibility of the regions in these three public services varies in different countries and thus this may be problematic for data-gathering. However, it is argued here otherwise. The paper seeks to capture all regional variation within a country and, as noted in the literature (for example, TABELLINI, 2005), numerous empirical indications and much anecdotal evidence suggest that the provision and quality of public services controlled by a powerful central government can nonetheless vary substantially across regions.
Furthermore, regions have become more salient in the EU in terms of expenditure and authority. Public expenditure managed by regional authorities in the EU has grown substantially from 18% of total public expenditure in 1995 to 32% in 2008 (EUROPEAN COMMISSION, 2010b). In addition, a recent study by HOOGHE et al. (2010) shows how over the last forty years the political and fiscal authority of regions in Europe has grown. Lastly, a large share of EU Cohesion Policy funds is managed by the regions themselves. Therefore, regions are becoming more important actors and in the cases where they are currently merely statistical units 13 they are likely to become more relevant in future.
The regional data themselves combine sixteen survey questions about QoG at the regional level. To construct the regional index, the guidelines expressed in the OECD's Handbook on Constructing Composite Indicators (NARDO et al., 2008) were followed carefully. Although the robustness of the results was thoroughly checked by testing alternative methods of building the data, the index was constructed as follows. First, all sixteen QoG questions were aggregated from the individual to the regional level as a mean score. Next, the sixteen regional scores were standardized so as to have a common range. 14 A factor analysis was then performed to see if the sixteen questions formed significant subgroups in the data. Three relevant groups were found, which were labelled 'pillars': questions pertaining to impartiality, to corruption and to quality all constituted separate factor components (media and election questions aligned with the 'quality' pillar). 15 Each variable was given equal weight within each pillar. Finally, the three pillars were combined using equal weighting to form the regional index.
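The aggregation steps listed above can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' procedure: the function names, the toy responses and the three-question layout are hypothetical, every question is assumed to be coded so that higher values mean better QoG, z-scores are used as the standardization, and the factor analysis that assigns questions to pillars is taken as given.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize a list to mean 0 and (population) standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def regional_index(responses, pillars):
    """responses: {region: {question: [individual answers]}}
    pillars:   {pillar name: [question ids]}
    Returns {region: equally weighted index score}."""
    regions = sorted(responses)
    standardized = {}
    for questions in pillars.values():
        for q in questions:
            # Step 1: aggregate individual answers to a regional mean.
            means = [mean(responses[r][q]) for r in regions]
            # Step 2: standardize each question across regions.
            standardized[q] = dict(zip(regions, zscores(means)))
    index = {}
    for r in regions:
        # Step 3: equal weights within each pillar ...
        pillar_scores = [
            mean([standardized[q][r] for q in questions])
            for questions in pillars.values()
        ]
        # Step 4: ... and equal weights across pillars.
        index[r] = mean(pillar_scores)
    return index


# Hypothetical toy data: three regions, one question per pillar.
responses = {
    "A": {"q1": [8, 9], "q2": [7, 9], "q3": [9, 9]},
    "B": {"q1": [5, 5], "q2": [6, 4], "q3": [5, 5]},
    "C": {"q1": [2, 2], "q2": [3, 1], "q3": [1, 1]},
}
pillars = {"quality": ["q1"], "impartiality": ["q2"], "corruption": ["q3"]}
scores = regional_index(responses, pillars)
```

Because every question is standardized across regions before averaging, the resulting index is centred on zero, which matches the text's reading of a score below zero as falling under the sample mean.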
It was found that the results of the aggregation of the regional data revealed fairly predictable patterns among the regions with respect to QoG. All regions within the top performing EU members with regard to the national QoG index (Denmark, Sweden and the Netherlands) were in the top 15% of all 172 regions. Among the new member states, all but one of the regions were in the bottom 50% (that is, they had a score lower than zero), the only exception being Nord Vest (RO11) in Romania. In contrast, most of the EU-15 regions were in the top 50%, with Portugal and Greece being the only exceptions having all their respective regions under the mean. Moreover, several of the regions in France and Italy were under the EU mean, with the latter containing two in the bottom 10% of the sample. 16 As with the national-level data, internal consistency checks and a rigorous sensitivity test of the regional data were performed. To test the internal consistency of the sixteen indicators, Cronbach's alpha, pairwise correlations and a principal component factor analysis were used. 17 Sixty-two alternative simulations were then performed in which the sensitivity of the data was tested. First, the robustness of the equal weighting scheme was checked using factor weights instead. Second, the additive method of aggregation was substituted with geometric aggregation, and the data were normalized via a 'minimum-maximum' method in place of standardization. Third, each individual question was removed one at a time, as were whole question groups (for example, all questions pertaining to 'quality', 'impartiality' or 'corruption'). Finally, for several simulations the data were re-aggregated from the individual level, whereby certain demographic groups, such as men, high-income respondents, young respondents, higher educated respondents and those who did not have any interaction with any of the public services in question within the last twelve months, were excluded.
It was found that even in the most extreme scenarios, the Spearman rank coefficient never fell below 0.90 and that the median shift in the rankings was never above 11 as compared with the original index. The results of the sensitivity testing demonstrate that the regional data and scores are strongly robust and internally consistent. 18
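The two summary statistics reported for the sensitivity tests can be computed as in the following sketch. The helper names and the four-region scores are hypothetical, and the no-ties shortcut formula for the Spearman coefficient is an assumption made for brevity; tied scores would require average ranks.

```python
from statistics import median


def ranks(scores):
    """Rank regions from 1 (highest score) downward; assumes no tied scores."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {region: i + 1 for i, region in enumerate(ordered)}


def spearman(base, alt):
    """Spearman rank coefficient between two score dicts (no ties)."""
    rb, ra = ranks(base), ranks(alt)
    n = len(rb)
    d2 = sum((rb[r] - ra[r]) ** 2 for r in rb)
    return 1 - 6 * d2 / (n * (n * n - 1))


def median_rank_shift(base, alt):
    """Median absolute change in rank position across regions."""
    rb, ra = ranks(base), ranks(alt)
    return median([abs(rb[r] - ra[r]) for r in rb])


# Hypothetical baseline index and one alternative simulation
# (for instance, after dropping one question).
base = {"A": 1.2, "B": 0.4, "C": -0.1, "D": -1.5}
alt = {"A": 1.0, "B": -0.2, "C": 0.3, "D": -1.1}
rho = spearman(base, alt)            # 0.8: regions B and C swap places
shift = median_rank_shift(base, alt)  # 0.5
```

Repeating this comparison for each alternative simulation and taking the lowest coefficient and the highest median shift would give the kind of worst-case figures quoted in the text.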

COMBINING THE TWO LEVELS OF DATA: THE EUROPEAN QOG INDEX (EQI)
Although the entire sample of respondents in the regional-level survey was large (34 000), the number of respondents per region was on the smaller side (200). Further, the authors sought to add the country context to each region's QoG score, which it was assumed would also be influenced by such factors as the national legal system, immigration, trade and security, areas that are not captured in the regional QoG data. Thus, credible and robust observations were added to the regional-level data to compensate for any outlying region or country in the regional survey (which could be the result of limited observations) while adding the 'national context' of QoG. To accomplish this, along with including the nine other smaller EU countries in the sample, the WGI external assessment was combined with the citizen-based, regional-level data to create a comprehensive European QoG Index (EQI). The aim was to come up with a method that included the EU countries omitted from the survey while simultaneously maintaining the richness of the within-country variation in several of the countries surveyed in the regional-level study.
To calculate the score for each region and country, the country average from the WGI data was taken from Table 1 and standardized for the EU-27 sample. For countries outside of the regional survey, there was nothing to add to the WGI country score; thus, the WGI data alone were used as the QoG estimate 19 because regional variation was unobserved.
For the eighteen countries with regional data, the national average based on the WGI was taken and the within-country variance based on the regional-level data described above was added. Simply speaking, one starts by calculating a national, population-weighted average of the regional scores for each of the countries in the survey. This national average score is then subtracted from each region's individual QoG score in the country, the result of which shows if a region is above or below its national average and by how much. This figure is then added to the national-level WGI data, so each region has an adjusted score, nationally centred on the WGI. The formula employed is the following:
$$\mathrm{EQI}_{\text{region } X \text{ in country } Y} = \mathrm{WGI}_{\text{country } Y} + \left(\mathrm{Rqog}_{\text{region } X \text{ in country } Y} - \mathrm{CRqog}_{\text{country } Y}\right)$$
where EQI is the final score for each region or country in the EQI; WGI is The World Bank's national average for each country; Rqog is each region's score from the regional survey; and CRqog is the country average in country Y (weighted by regional population) of all regions within country Y from the regional survey. In keeping with the same scale as the WGI data, EQI is standardized so that the mean is zero with an SD (standard deviation) of 1. The data were also readjusted to go from zero to 100; both scales are given in Appendix A for scholars and practitioners who prefer this range (EQI100).
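The adjustment just described can be sketched as follows. This is an illustrative reading of the verbal description, not the authors' code: the function name, the country and region labels, and all numbers are hypothetical, and the final rescaling to mean zero and SD 1 (and to the zero-to-100 range) is left out because the text does not specify how units are weighted in that step.

```python
def adjust_to_wgi(wgi, regional, population):
    """wgi:        {country: standardized national WGI score}
    regional:   {country: {region: regional survey score}}
    population: {region: population}
    Returns {unit: score}, where a unit is a region for surveyed countries
    and the country itself otherwise."""
    adjusted = {}
    for country, national in wgi.items():
        regions = regional.get(country)
        if not regions:
            # Country outside the regional survey: WGI score used alone.
            adjusted[country] = national
            continue
        total_pop = sum(population[r] for r in regions)
        # Population-weighted national average of the regional scores.
        crqog = sum(score * population[r] for r, score in regions.items()) / total_pop
        for r, rqog in regions.items():
            # Regional deviation from the national average, centred on the WGI.
            adjusted[r] = national + (rqog - crqog)
    return adjusted


# Hypothetical inputs: country X has two surveyed regions, country Y has none.
wgi = {"X": 0.5, "Y": -1.0}
regional = {"X": {"X1": 1.0, "X2": -0.5}}
population = {"X1": 1_000_000, "X2": 3_000_000}
eqi = adjust_to_wgi(wgi, regional, population)

# The population-weighted mean of X's adjusted regional scores
# recovers X's national WGI score.
weighted_mean_x = (
    eqi["X1"] * population["X1"] + eqi["X2"] * population["X2"]
) / (population["X1"] + population["X2"])
```

The design keeps each country's population-weighted regional average equal to its WGI score, so the regional survey contributes only within-country differences.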
Although the national-level data and regional-level data are indeed directed at different levels of government (the WGI taps into the quality of the national public sector broadly speaking, while the regional survey explicitly asked respondents about their regional services), it is argued that these two measures are indeed similar enough to combine. First and most obviously, they both capture aspects of QoG such as corruption, quality of services, impartiality and rule of law. While the national data might be focused on several sectors of the national bureaucracy that are not measured by the regional-level data (for example, defence, immigration, etc.), this is not found to be problematic. While administrative and fiscal responsibilities vary from region to region in the EU, such areas of the public sector are out of the realm of all regional governments; thus it is most appropriate that they are not included in any regional studies. Second, the WGI data are robust, well-established and internationally used measures, and are thus suited to estimate the country-level scores. In adjusting the national-level scores of the member states, none of the rich sub-national variation from the regional-level survey data is sacrificed. Finally, in using the WGI as an 'anchor' so to speak, around which each country's regional variation is explained, one can retroactively adjust data if in future there are rounds of regional data collection when additional countries or regions are added. 20 Fig. 1 shows the combined data between the WGI national-level QoG scores and the regional QoG data; Fig. 2 shows the national averages with the within-country range of scores. For a full list of scores for each region and country in rank order, see Appendix A. The data show that eleven of the EU-15 countries have all their regions and/or national scores above the EU average, while all regional and national-level scores for the new member states are under the EU-wide mean.
21 Five EU-15 countries (Italy, Spain, France, Belgium and Portugal) contain regions that are both above and below the mean score, while Greece is the only EU-15 country to have all of its regions below the mean level of QoG in the EU. Among the new member states, the regional ranks are all below the EU mean, with the highest ranking region being Jihozápad (CZ03) from the Czech Republic (-0.05).
To facilitate reliable comparisons across regions, a margin of error at the 95% confidence level was constructed. This level equates to the probability that a margin of error around the reported QoG estimate for each region would include the 'true' value of QoG; in other words, one can say with about 95% confidence that a region's estimate of QoG can be found within a ±1 margin of error. 22 While not exactly a 'margin of error' in a traditional sense, the range expresses the uniformity with which respondents ranked their region's QoG in the sixteen questions. Thus, the regions that have the largest margins of error are those in which respondents expressed a relatively large gap in the response between two or more sets of questions for a service or concept (such as education or 'impartiality'). For example, most respondents in Spanish regions believed their public services were among the most impartial in Europe, yet they ranked them below average on corruption questions, leading to wider margins of error than in most other EU regions. The lowest margin of error belongs to the Polish region of Kujawsko-Pomorskie (0.166), meaning that respondents ranked their services in this region very consistently across all sixteen questions. The results show that Danish and Polish regions on the whole have the tightest confidence intervals, while Spanish, Romanian and Czech regions tend to have the widest margins. For all margins of error around the final EQI estimates, see Appendix A. 23 Fig. 2 shows the within-country regional variation of the EQI using a simple method of 'minimum-maximum' comparison. 24 Interestingly, the data show that within-country QoG variation is at times equally or more important than cross-country variation. Fig. 2 shows the rank order of EU countries (again, using the WGI national assessment as the country mean).
For instance, the gap between Bolzano (ITD1) and Campania (ITF3) in the data is much larger than the gap in the national averages between Denmark and Portugal, while the gap between Flanders (BE2) and Wallonia (BE3) is larger than that between Belgium as a whole and Hungary. Further, while the gap between Bulgaria and Romania at the national level is negligible, their national scores are noticeably lower than the national scores of other states such as Slovakia, Poland, Italy and Greece. However, the top regions from each country, Nord Vest (RO11) in Romania and Severoiztochen (BG33) in Bulgaria, are statistically indistinguishable from average ranking regions within those other four countries. It is worth noting that the EQI can be employed in cross-sectional analysis only at the regional level at this point, yet the WGI combined national-level index can be employed to make comparisons over time. 25

FIVE HYPOTHESES ON WHY SOME REGIONS HAVE BETTER QUALITY OF GOVERNMENT
This section seeks to elucidate some general patterns of QoG variation within and across countries by testing five prevailing hypotheses in the literature using the newly constructed EQI. Following most of the empirical literature on QoG, this study does not aim to establish a unique causal direction (especially given the cross-sectional nature of the data), but only to show whether a statistically meaningful relationship is present. First, it has been argued and found in several empirical cross-national studies that indicators of QoG are highly correlated with proxies for socio-economic development, such as educational attainment, income levels, technology or health. Scholars have consistently found a strong empirical connection between reaching higher levels of economic development and higher levels of various measures of QoG (ACEMOGLU et al., 2004; KNACK and KEEFER, 1995; MAURO, 1995; HOLMBERG et al., 2009), as well as measures of macro-level indicators of health in society (MAURO, 1998; GUPTA et al., 1998). Therefore, the following hypothesis was tested:
Hypothesis 1: Levels of socio-economic development will be positively associated with the European QoG Index (EQI) in regions and countries across the European Union. 26
Second, the relationship between several demographic variables and the EQI, such as the regional population and size of the geographical area, was tested. Several studies have examined these factors, or equivalent ones, with, generally speaking, mixed results on their importance in explaining variation in QoG (ALESINA and SPOLAORE, 1998; KNACK and AZFAR, 1999; ALESINA, 2003). The arguments and evidence regarding the size of a polity and its level of QoG are diverse. On the one hand, the argument that smaller populations are more manageable goes back to ancient Greece. Aristotle wrote that 'experience has also shown that it is difficult, if not impossible, for a populous state to be run by good laws' (quoted in ALESINA, 2003, p. 303).
This seems a reasonable prediction, as relatively small Nordic countries, like Denmark, Sweden and Finland, are all among the best performers in most QoG measures worldwide. On the other hand, KNACK (2002) provides evidence that larger US states have higher-quality management practices, even when controlling for a number of socio-economic variables. Moreover, KNACK and AZFAR (1999) find no relationship between size and corruption in a large cross-country sample. Thus, there is no clear direction to predict, but it is tested whether QoG and population or area size are systematically related within and across EU countries:
Hypothesis 2: Quality of government (QoG) within and across countries in the European Union is systematically related to the size of a region or country.
Third, based on several recent studies, it is hypothesized that regions/countries with higher degrees of social trust will have higher scores in the EQI. It has been extensively argued that higher levels of generalized trust (that is, trust in strangers or people who do not belong to 'your group') are a function of higher QoG (ROTHSTEIN and USLANER, 2005). Where 'people have faith only in their in-group' (understood as a family, a clan, an ethnic group or other social groupings such as a political party), a society, and thus its politics, is 'seen as a zero-sum game between conflicting groups' (ROTHSTEIN and USLANER, 2005, pp. 45-46). In these conditions, citizens feel less attached to their political communities than to a particular social group and are thus less eager to contribute to the provision of general public goods, such as paying taxes, respecting and protecting public spaces, and, very importantly, engaging in social and political mobilizations asking for improvements in QoG. Generally speaking, 'free-riding' becomes more frequent at all social levels. In turn, public authorities lack both adequate resources and incentives to deliver policies, consolidating a 'vicious cycle'. The following hypothesis was therefore tested:
Hypothesis 3: Quality of government (QoG) is positively associated with social trust within and across countries in the European Union.
Finally, it was tested whether there was an empirical link between the level of political decentralization and/or federalism and the amount of within-country variation in QoG as well as the level of QoG itself across countries.
Although mostly untested empirically due to a lack of data at the regional level, several scholars have asserted that greater levels of decentralization are associated with larger disparities from region to region with respect to variables such as bureaucratic quality or corruption in the public sector (TANZI, 2001). In other words, when regions gain more decision-making control, the stronger ones will perform better and the weaker ones will sink even deeper, creating larger gaps within decentralized states than in centralized ones. Further, several studies have tested whether decentralization leads to higher or lower QoG across countries, with several arguing that political decentralization and/or federalism create greater problems of collective action and more cumbersome decision-making rules (GERRING and THACKER, 2004), while others such as LIJPHART (1977) and WATTS (1999) argue that greater vertical power sharing in the form of decentralization or federalism would lead to better QoG outcomes. Two hypotheses regarding the impact of decentralization on within-country variance and on levels of QoG were therefore tested:
Hypothesis 4: Greater levels of political decentralization will be associated with higher levels of within-country variance of quality of government (QoG).
Hypothesis 5: Greater levels of political decentralization will systematically impact the level of quality of government (QoG) at the country level.

RESULTS
The results are presented in two steps: first, with bivariate scatterplots, and, second, with multivariate ordinary least squares (OLS) regressions with country fixed effects for the five hypotheses. While these methods are admittedly simplistic and unable to determine causal direction, it is argued that, as a 'first cut' analysis with the EQI, the straightforward scatterplots and basic fixed-effects regressions are quite revealing.
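The difference between the two steps can be sketched with a small example. It assumes a single regressor and implements country fixed effects through the 'within' transformation (subtracting country means), which for one regressor gives the same slope as OLS with country dummies. The data and function names are hypothetical; the models reported in Table 2 include further controls.

```python
from collections import defaultdict


def ols_slope(x, y):
    """Slope of a bivariate OLS regression of y on x (with intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def demean_by_group(values, groups):
    """Subtract each group's mean from its members (within transformation)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


def fixed_effects_slope(x, y, groups):
    """Slope from OLS with group (country) fixed effects, one regressor."""
    return ols_slope(demean_by_group(x, groups), demean_by_group(y, groups))


# Hypothetical regions: (country, development score, QoG score).
rows = [
    ("A", 40, -1.0), ("A", 50, -0.8), ("A", 60, -0.6),
    ("B", 70, 0.9), ("B", 80, 1.1), ("B", 90, 1.3),
]
country = [r[0] for r in rows]
development = [r[1] for r in rows]
qog = [r[2] for r in rows]

pooled = ols_slope(development, qog)
within = fixed_effects_slope(development, qog, country)
print(f"pooled slope={pooled:.4f}, within-country slope={within:.4f}")
```

In this constructed example the pooled slope is steeper than the within-country slope because part of the pooled association comes from differences between the two countries, which is the kind of unobserved country difference the fixed effects are meant to absorb.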
One of the most established indicators of socio-economic development is used to test Hypothesis 1: the Human Development Index (HDI). This is a composite index based on several measures such as life expectancy with good health, net-adjusted household income, and the ratio of high and low education achievement in the population aged 25-64 years (BUBBICO and DIJKSTRA, 2011). The HDI ranges from zero to 100, with higher values equalling greater levels of socio-economic development. 27 Given the strong likelihood of endogeneity between QoG and HDI, as well as additional factors that could cause increases/decreases in either QoG or human development, this section begins with the most basic analyses: a bivariate scatter plot with significance values to show the regional and country variation across the EU, and an OLS regression with country fixed effects to account for unobserved country differences and to test whether the relationship between development and QoG is also present within countries (Table 2). Fig. 3 shows clear support for Hypothesis 1. The R² statistic shows that the HDI explains almost 60% of the total variation of the EQI. The beta coefficient from the bivariate regression reveals that an increase of 25 points in the HDI is associated with an increase in the EQI of 1 (a full 1 SD). Table 2 shows that the relationship between the EQI and HDI holds when controlling for population, area size and country fixed effects (model 1), and even when controlling for social trust in a more limited sample (model 3). In all cases, the HDI is significant at the 99% level of confidence.
Fig. 4 illustrates the relationship between QoG and the size of a region (population or area). Both population (in thousands) and area (km²) are taken from Eurostat. In order to explore different connections, the relationship between the EQI and logged variables for the size of regions is shown.
28 Here no evidence is found to suggest that more (or less) populous regions have higher levels of QoG in the EU-wide sample. Nor is it found that area size is systematically linked to the EQI. Yet, Table 2 shows an interesting finding, namely that when fixed effects and HDI are included, both variables become significant. This means that while not significant EU-wide, more populous regions and larger regions in terms of area have lower and higher QoG, respectively, within countries. A closer look at the data shows that in several countries the populous and smaller-area regions such as Bucharest, Sofia, Prague, London, Brussels and Budapest have the lowest QoG score in their respective country, which is mainly driven by the fact that citizens in these regions rated the three public services as more corrupt than citizens in other regions within these countries did. It was found that area size is positively related to the EQI when controlling for country fixed effects, yet this finding is not as robust as that for the population variable. It is concluded that while the two variables are not significant EU-wide, they are systematically related, on average, within countries, demonstrating somewhat mixed support for Hypothesis 2. An explanation could be that in large cities (which tend to have a large and often diverse population in a small area) residents have more opportunities to experience corruption than in other regions in the same country; this explanation, however, requires further research.
Fig. 5 tests Hypothesis 3. Data on social trust are taken from TABELLINI (2005) and are available at the regional level for seventy-three EU regions. 29 Fig. 5 shows a strong, positive relationship between trust and QoG, and the beta coefficient (0.03) is positive and significant at the 99% level of confidence.
The bivariate relationship weakens somewhat when removing the Italian regions from the analysis (beta drops to 0.018), yet the relationship remains significant at the 99% level. 30 Furthermore, even when controlling for country fixed effects, it is found that higher levels of QoG are associated with higher social trust within countries as well, yet the relationship drops from significance in model 3 of Table 2. 31 Again, deciding whether this correlation reflects a causal relationship, and the direction of any such causality (for a thorough discussion on this issue, see ROTHSTEIN and USLANER, 2005), or whether it is spurious is, because of the cross-sectional nature of the data analysed, beyond the scope of this analysis.
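To make the reported magnitudes concrete, the bivariate coefficients can be read as follows. This is a sketch that assumes a simple linear fit and uses only the figures stated in the text; the symbols $\hat{\beta}$ and $\Delta$ and the 10-point comparison are introduced here purely for illustration.

```latex
\[
\hat{\beta}_{\mathrm{HDI}} \approx \frac{1\ \text{SD of EQI}}{25\ \text{HDI points}} = 0.04
\quad\Longrightarrow\quad
\Delta \mathrm{EQI} \approx 0.04 \times 10 = 0.4\ \text{SD for a 10-point difference in HDI}
\]
\[
\hat{\beta}_{\mathrm{trust}} = 0.03
\quad\Longrightarrow\quad
\Delta \mathrm{EQI} \approx 0.03 \times 10 = 0.3\ \text{SD for a 10-point difference in trust}
\]
```

These are descriptive associations between regional averages and should not be read as causal effects.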
In testing Hypotheses 4 and 5, the paper began by looking at the level of within-country variation (measured as the highest regional QoG score minus the lowest in each country 32 ) and country levels of QoG in federal, semi-federal and unitary countries. Fig. 2 shows the EU-27 in rank order with respect to the EQI and the sub-national variation in each country. 33 Here it is found that some EU countries considered as truly federal (Austria and Germany) have less within-country variation in QoG than Romania or Bulgaria, which are two highly centralized countries. In addition, Austria and Germany have less within-country variation than two highly centralized older EU members such as Portugal or Greece, yet another federal country, Belgium, reveals much variation among its regions.
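The measures of within-country variation mentioned above and in notes 24, 32 and 33 can be sketched as follows. The example assumes regional scores on the 0-100 normalized scale, so that the coefficient of variation and the Gini index (both of which divide by the mean) are well defined; the two countries and their scores are hypothetical.

```python
import statistics


def min_max_range(scores):
    """Highest regional score minus the lowest."""
    return max(scores) - min(scores)


def coefficient_of_variation(scores):
    """Population standard deviation divided by the mean (mean must be non-zero)."""
    return statistics.pstdev(scores) / statistics.mean(scores)


def gini(scores):
    """Gini index via the mean absolute difference (scores must be non-negative)."""
    n = len(scores)
    mean = statistics.mean(scores)
    total_abs_diff = sum(abs(a - b) for a in scores for b in scores)
    return total_abs_diff / (2 * n * n * mean)


# Hypothetical regional scores on a 0-100 scale (NOT EQI data).
regions_by_country = {
    "Country X": [80.0, 78.0, 82.0, 79.0],
    "Country Y": [75.0, 40.0, 55.0, 20.0],
}

for country, scores in regions_by_country.items():
    print(f"{country}: range={min_max_range(scores):.1f}, "
          f"CV={coefficient_of_variation(scores):.3f}, "
          f"Gini={gini(scores):.3f}")
```

The minimum-maximum range depends only on the two extreme regions, whereas the other two measures use every region, which is why the notes report checking that the measures agree.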
The countries that are considered 'semi-federal' and have meaningful political and administrative regions within the data (Spain and Italy) have quite high within-country variation. Nevertheless, with only two data points it is too difficult to generalize. Moreover, while the three federal countries perform above the EU mean, the two semi-federal countries are at (Spain) or below (Italy) the EU average, and unitary countries range from highest (Denmark and Sweden) to lowest (Romania and Bulgaria). What can be seen, however, at least in terms of a unitary/federal variable, is that there is no clear relationship between this variable and within-country QoG variation or countrywide QoG levels.
To test Hypotheses 4 and 5 further, this paper takes advantage of several recent indicators of political decentralization from HOOGHE et al. (2010). It was tested whether four of their variables indicating the level of political decentralization are related to either higher within-country QoG variation or simply higher levels of QoG. Those indicators of decentralization are 'policy scope', 'representation', 'law-making' and 'constitutional reform'. 34 Figs 6 and 7 show the relationship between the four decentralization variables and QoG for the eighteen countries that have regional data in the EQI. 35 Fig. 6 shows no evidence suggesting that higher levels of policy scope or representation have any relationship with either within-country QoG variation or levels of QoG at the national level. Although all four Pearson correlation coefficients are in the expected direction (positive), none is statistically significant at even the 90% level of confidence. Fig. 7 shows that higher levels of decentralization, measured as law-making and constitutional reform, are not associated with greater disparities of within-country QoG among regions either. In this case, Pearson correlation coefficients are in the opposite direction to that expected (that is, negative), yet they are statistically indistinguishable from zero. Some empirical evidence, however, was found to suggest that EU countries that have greater regional law-making and constitutional reform have higher levels of overall QoG according to the EQI, yet the correlations only approach conventional levels of statistical significance. Moreover, with a majority of EU countries scoring zero on both these measures, any generalization from these results on decentralization and variation in, or actual levels of, QoG should be made with a good deal of caution.
Fig. 6. Political decentralization, European QoG Index (EQI) and within-country variation
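The correlation tests described above can be sketched in a few lines. The example computes a Pearson coefficient and its t statistic and compares the latter with the two-sided 90% critical value for sixteen degrees of freedom (eighteen countries minus two), taken from standard t tables. The data are hypothetical, with most countries scoring zero on the decentralization measure as in the text, and the variable names are illustrative only.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for the null of zero correlation (n - 2 degrees of freedom)."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Two-sided 90% critical value of Student's t with 16 degrees of freedom.
T_CRIT_90_DF16 = 1.746

# Hypothetical values for eighteen countries (NOT the paper's data).
law_making = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.5, 0.5, 1, 1, 1.5, 1.5, 2, 2]
qog_range = [0.4, 1.1, 0.3, 0.9, 0.6, 1.4, 0.2, 0.8, 0.5, 1.0,
             0.7, 0.4, 1.2, 0.3, 0.6, 0.9, 0.5, 0.8]

r = pearson_r(law_making, qog_range)
t = t_statistic(r, len(law_making))
print(f"r={r:.3f}, t={t:.3f}, "
      f"significant at 90%: {abs(t) > T_CRIT_90_DF16}")
```

With so many observations clustered at zero, a handful of decentralized countries determines the sign and size of the coefficient, which is one reason for the caution urged above.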

CONCLUSIONS
The original data and analysis presented in this study make several contributions to the literature. First and foremost, the paper has mapped out the levels of QoG among 172 EU regions based on the experiences and perceptions of citizens, which, in combination with the external, largely expert assessment of the national-level data, represents the most encompassing data to date on sub-national variation in corruption or good governance variables. The authors believe that this study and data may be highly valuable to both scholars and practitioners focusing on a wide range of topics regarding governance in Europe. For example, a region with a low QoG in the EU is much less likely to use the Cohesion Policy funds in an efficient and effective manner, and more likely to have lower levels of small business entrepreneurship (CHARRON et al., 2012). Such a region may remain stuck in a low-growth and low-QoG equilibrium, while the regional government remains to some degree sheltered from the financial consequences of low QoG through continuing support from the EU. This may explain why the reform of EU Cohesion Policy puts a greater emphasis on creating the right conditions for development as an important prerequisite to (continue to) receive funding (EUROPEAN COMMISSION, 2010b).
Finding the right mix of incentives and policies that improve QoG in lagging regions could make a substantial contribution to higher growth in those regions and thus to more convergence between EU regions. The data presented here can serve as a valuable benchmark to monitor changes in governance at the national and regional levels in the EU.

Fig. 7. Political decentralization, European QoG Index (EQI) and within-country variation
This study has found a notable amount of variation both between and within EU countries. At the national level, a first group of Northern European countries tends to show the highest levels of QoG. They distinguish themselves from the three groups next on the ladder: the second group encompasses most Southern Mediterranean states, together with Estonia and Slovenia, with moderate levels of QoG; the third group covers most of the new member states, which demonstrate moderate to low levels of QoG; and the fourth group consists of the two newest member states, which have the lowest levels of QoG in the EU. At the regional level, significant within-country variations can be found in federal or semi-federal nations such as Italy, Belgium or Spain, but also, noticeably, in more centralized ones, such as Portugal, Romania or Bulgaria. Other countries, like Denmark, Poland, Austria or Slovakia, show very little variation across regions.
Five hypotheses were tested that could help to explain some of the variation in QoG found between and within EU countries. Strong empirical evidence was found that the HDI is positively related to the indicator of QoG, both within and across countries in the EU. Similar evidence was found with respect to the variable for social trust. On Hypothesis 2, a more nuanced relationship was found: population and area size had no statistical relationship with the EQI in the full EU-wide sample, yet when accounting for country fixed effects robust evidence was found to suggest that more populous regions (regions with greater area size) have lower (higher) QoG within countries on average. This may imply that QoG is lower in the large cities in a country, which the data show is largely driven by higher corruption in these areas relative to other regions.
Most surprising, given the sizeable literature on the consequences of federalism and/or decentralization for governance, was the lack of a relationship between a relatively large number of proxies for political decentralization and QoG. It was hypothesized that countries with greater degrees of political decentralization would exhibit higher degrees of within-country variation of QoG for their respective regions. No such evidence was found using several different measures of political decentralization. Furthermore, there is no empirical pattern between decentralization and country levels of QoG according to the EQI: decentralized and/or federal countries are not more or less likely to have higher levels of aggregate QoG relative to more unitary/centralized states within the EU.
One explanation could be that variation within a country is linked to variation in both political decision-making (as one would expect in federal/more politically decentralized countries) and the quality of implementation of a (theoretically) centrally administered service, the latter of which has been relatively overlooked in the theoretical literature on federalism. For example, even though a country like Romania is highly politically centralized, certain regions may have developed specific patterns of policy implementation (for example, more merit-based and less patronage-based public organizations), which may play a decisive role in the quality of their public services. The only consistent pattern is that, irrespective of decentralization, the countries in the highest cluster group all have relatively low within-country QoG variation, even Germany and Austria, which are federal countries.
The findings presented in this study open the door to several relevant questions which could be explored in future research. For instance, which cultural legacies, economic variables or institutional factors may explain the notable regional differences in governance? How are, for example, regional QoG and political party or electoral systems at the regional level related? The data presented here can thus be of use for scholars addressing these questions in fields as diverse as comparative political economy, EU studies, federalism, decentralization and regional politics or comparative public administration. In addition, with regions playing such a growing role in the provision of public services and being the recipients of large transfers (at national level in many EU member states, but also at European level through the EU Development Funds), the data presented here can serve as an initial tool of empirical assessment for practitioners interested in regional development policy and aid allocation.
Based on the findings of this study, the authors would like to conclude by underscoring the importance of focusing on QoG not only in developing regions of the world, but also inside the EU. As this study has shown, still too many EU residents report having first-hand experience of corruption and discrimination, and the share of residents confronted with these issues is far higher in some regions and countries. Despite the methodological problems always inherent in capturing a concept like 'good governance', the preliminary data indicate that QoG in the EU seems to vary to a very large extent both between countries and between regions within these countries. In addition, those regions where QoG is perceived to be low by their own citizens are those regions that perform the worst in the standard indicators of human development. A tentative normative conclusion would thus be that, apart from the existing transfer policies, a joint and targeted effort to improve QoG in those regions with lower levels could substantially improve the economic prospects of these regions and the lives of their residents.

Description of sub-national survey
The European Union regional survey was undertaken between 15 December 2009 and 1 February 2010 by Efficience 3, a French market-research and survey company. The respondents, aged eighteen years or older, were contacted randomly via telephone in the local language by the 'birthday method' with replacement. As found by LONGSTRETH and SHIELDS (2009), although not as demographically representative as the 'quota method', the birthday method obtains a reasonably representative sample of the population while providing a better distribution of opinion.
In trying to capture any regional variation within a country, thirty-four QoG and demographic-based questions were asked of the approximately 200 respondents per NUTS region. Regarding the QoG questions, respondents were asked about three general public services in their regions: education, healthcare and law enforcement. Publicly administered areas such as immigration, customs or national security were intentionally avoided because these are dealt with at the national or even the supra-national level. In focusing on these three services, respondents were asked to rate their public services with respect to three related concepts of QoG: the quality, the impartiality, and the level of corruption of said services. 36 In addition, two further questions were included in the index: one about the fairness of regional elections; and the other about the strength and effectiveness of the media in the region to expose corruption.
Sixteen survey questions incorporated in the regional QoG index

Rule of law-focused questions
- How would you rate the quality of the police force in your area? (Low/high, 0-10)
- The police force gives special advantages to certain people in my area. (Agree/disagree, 0-10)
- All citizens are treated equally by the police force in my area. (Agree, rather agree, rather disagree, or disagree, 1-4)
- Corruption is prevalent in the police force in my area.

Government effectiveness-focused questions
- How would you rate the quality of public education in your area? (Low/high, 0-10)
- How would you rate the quality of the public healthcare system in your area? (Low/high, 0-10)
- Certain people are given special advantages in the public education system in my area. (Agree/disagree, 0-10)
- Certain people are given special advantages in the public healthcare system in my area. (Agree/disagree, 0-10)
- All citizens are treated equally in the public education system in my area. (Agree, rather agree, rather disagree, or disagree, 1-4)
- All citizens are treated equally in the public healthcare system in my area. (Agree, rather agree, rather disagree, or disagree, 1-4)

Voice and accountability-focused questions
- In your opinion, if corruption by a public employee or politician were to occur in your area, how likely is it that such corruption would be exposed by the local mass media? (Unlikely/likely, 0-10)
- Please respond to the following: Elections in my area are honest and clean from corruption. (Agree/disagree, 0-10)

Corruption-focused questions
- Corruption is prevalent in my area's local public school system. (Agree/disagree, 0-10)
- Corruption is prevalent in the public healthcare system in my area. (Agree/disagree, 0-10)
- In the past 12 months have you or anyone living in your household paid a bribe in any form to: Health or medical services? (Yes/no)
- In your opinion, how often do you think other citizens in your area use bribery to obtain public services? (Never/very often, 0-10)

[...] 1990-1991 and 1995-1997 and assigns each respondent to their corresponding region. The mean number of respondents per region in the sample is 320. The TRUST variable ranges from 14.18 to 64.14, with higher values equating to higher levels of social trust.
- Population: total population of a country or region (logged) (from Eurostat).
- Area size: total area in km² (logged) (from Eurostat).

Measures of political decentralization (from HOOGHE et al., 2010)
- 'Policy scope', which gauges the extent to which regions in a country have authority over policies such as culture-education, welfare, police, economic policy and control over local governments.
- 'Representation', which indicates how regional assemblies and executives obtain their authority (through election, appointment, there is no regional executive/parliament, etc.).
- 'Law-making', which shows the extent of regional law-making influence at the national level, from no representation to the ability for a majority of regions to veto national legislation.
- 'Constitutional reform', which measures the extent to which a majority of regions (independent of the national parliament) can change the national constitution.
All variables are coded so that higher values mean higher levels of decentralization.

[...] (2010).
9. For the sake of parsimony, equal weighting is used, yet the sensitivity of this aggregation method is tested by using factor weights. The Spearman rank coefficient was over 0.99.
10. It is worth noting several alternative cluster groupings. For example, if just two groups had been chosen, then countries from Cyprus and above in Fig. 1 would belong to group 1 and from Estonia and below would belong to group 2. Three groups would remain the same, except that Romania and Bulgaria would join group 3. The next division would have been six groups, whereby groups 1 and 2 in Fig. 1 [...]
[...] Reform, Denmark's former sixteen counties were replaced with the five NUTS-2 regions, all now having elected regional governments with near exclusive political and administrative power over the healthcare system, along with transportation and other local policy areas. A similar type of reform was made in Poland in 1999, as the elected sub-regions were drawn around the European Commission's NUTS-2 level regions in preparation for European Union membership. For more information on Denmark's reform, see: http://www.regioner.dk/; for more on Poland's reform, see FERRY (2003).
14. For example, some questions range from zero to ten, others from zero to three, and others are dichotomous. To combine two or more indicators into a composite index, the data must be adjusted to have a common range.
15. To determine the number of factor groupings, the Kaiser criterion was followed, whereby a significant group must have an eigenvalue greater than 1 and the sum total of all significant factors must equal 60% or greater of the total variation.
16. Although only the finalized EQI data are reported in Appendix A, for the raw regional data, see http://www.qog.pol.gu.se/data/eu-project-2010/.
17. Cronbach's alpha coefficient of reliability was 0.94, while 89% of the pairwise correlations among the sixteen questions were positive and significant. Principal component analysis (PCA) demonstrated that the questions factored together according to QoG concepts of corruption, impartiality and quality.
18. As noted by NARDO et al. (2008), it is important to check the interaction effects of each of these adjustments, thus, for example, testing the removal of the 'corruption' pillar in all possible combinations of weighting, aggregation and normalization of the data. For a more detailed account of the robustness checks, see CHARRON et al. (2011).
19. The score is slightly changed due to the re-standardization when the national and regional level estimates are combined so as to set the final EQI's mean to zero and standard deviation to 1.
20. For example, round 2 of the EQI data, scheduled for 2013, will include countries such as Croatia and Turkey in addition to the eighteen countries in this study.
21. However, on specific combined pillars, such as Rule of Law or GE, several regions in the Czech Republic are above the European Union mean score.
22. The authors know from basic statistical probability that in a sample 'x', 95% of the area of a basic normal bell curve lies between the estimate (µ) ± 1.96 times the standard error around µ.
The standard error (SE) is calculated as $SE = s/\sqrt{N}$. The margin of error for each individual region is based around the QoG estimate: $1.96 \times s/\sqrt{N}$, with $N = 16$, because there are sixteen indicators in the QoG index which have been aggregated from the survey data. Each region thus has its own individual margin of error based on the consistency of the estimates for each of the sixteen aggregated questions in the survey. The authors end up with an average margin of error of 0.338, or about one-third of a full 1 SD, with a minimum of 0.166 and a maximum of 0.705.
23. In addition to the standardized scale for the EQI, the data were also normalized to range from zero to 100. The authors thank an anonymous reviewer for this suggestion.
24. Although minimum-maximum is sometimes overly simplistic and can overlook variation within the minimum and maximum regions, it is worth noting that when this was compared with other measures of within-country regional variation of QoG, such as a Gini index, the coefficient of variation and the Theil index, it was found that all measures correlate very highly (a Spearman rank correlation of 0.85 or higher) with the measure of minimum-maximum shown in Fig. 2.
25. The next round of data collection at the regional level is scheduled for 2013, thus making a limited over-time comparison possible by 2014.
26. By 'countries' is meant the nine European Union member states outside of the regional survey, which covers the eighteen largest European Union members.
27. For a description of all variables in this section, see Appendix A. The HDI is not available for overseas French departments.
28. In addition, the relationship with the non-logged variables is tested, with the only difference being a slightly weaker relationship with the EQI.
29. Regions are available for Belgium, Italy, Spain, the UK, Germany (West only), Portugal and the Netherlands.
30. The authors thank an anonymous reviewer for this suggestion. The scatter plot without Italian regions is not shown here. The bivariate regression yielded the following results: β = 0.018, p = 0.001, R² = 0.24, and number of observations = 52.
31. That 'trust' falls from significance in model 3 is most likely due to multicollinearity: the correlation between HDI and trust is 0.63.
32. In addition to the parsimonious 'minimum-maximum' method of calculating regional variation within a country, Hypothesis 4 was tested using a Gini index and the coefficient of variation. Differences in the results were negligible.
33. There are, of course, multiple ways of measuring the extent to which 'within-country variation' is present. The simplest method was chosen for the sake of parsimony: minimum-maximum (the maximum regional score minus the minimum score in each country). For more approaches to this issue, see SHANKAR and SHAH (2003, pp. 1422-1425).
34. For a description and statistical summary of each variable, see Appendix A. For a more detailed description of these variables, see HOOGHE et al. (2010, pp. 126-136).
35. In both figures, the left-hand side (within-country variation) tests Hypothesis 4, while the right-hand side tests Hypothesis 5 (EQI).
36. These are related concepts that have come up frequently in the comparative QoG literature, thus the authors try to include citizens' opinion regarding all three; for more information, see HOLMBERG et al. (2009).