This file includes the code for a chapter to be submitted to the Routledge Handbook of Philosophy of Economics (edited by Conrad Heilmann and Julian Reiss).
Loading packages:
require(data.table)
require(ggplot2)
require(ggrepel)
require(tidyr)
require(igraph)
require(RMySQL)
require(bibliometrix)
require(dplyr)
require(stringr)
require(tm)
require(RColorBrewer)
require(janeaustenr)
require(tidytext)
require(knitr)
require(zoo)
require(viridis)
require(tools)
require(xtable)
require(DT)
Loading project-specific functions:
source("FCT_util.R")
We load and transform some smaller objects that will be central to the workflow. The bigger objects will be loaded in individual code chunks (and then removed at the end of the chunk) to avoid caching too much data.
# Loading discipline info:
discipline_info <- readRDS("/projects/digital_history/interdisciplinarity/data/discipline_info.rds")
# We want philosophy and science studies together
discipline_info[Code_Discipline %in% c(126, 139), discipline:= "Philosophy and Science Studies" ]
# When we start
first_y <- 1990
# Last year for the JEL methodology ("metho") corpus (see below for the explanation)
last_y_metho <- 2018
#To be sure that the tf-idf graph and the topic_thru_time graph have the same colors, we define them here.
colors_cluster <- viridis(9)
names(colors_cluster) <- c("Moral\nPhilosophy", "Big M", "Political\nEconomy",
"Decision\nTheory", "History of\nEconomics", "Small m","Critical\nRealism",
"Institutional\nEconomics", "Behavioral\nEconomics")
#Our cleaning process produces some spelling errors, so we correct them with this conversion table.
conversion_table <- c("terence hutchison" = "Terence Hutchison",
"data mining" = "data-mining",
"mccloskey" = "McCloskey",
"datamining" = "data-mining",
"adam smith" = "Adam Smith",
"amartya sens" = "Amartya Sen",
"tony lawson" = "Tony Lawson",
"post keynesian" = "Post Keynesian",
"blaug" = "Blaug",
"john" = "John",
"mill" = "Mill",
"nobel" = "Nobel",
"alfred marshall" = "Alfred Marshall",
"lionel robbins" = "Lionel Robbins",
"veblen" = "Veblen",
"friedman" = "Friedman",
"friedman methodology" = "Friedman methodology",
"lakatos" = "Lakatos",
"boland" = "Boland",
"cambridge" = "Cambridge",
"cambridge controversy" = "Cambridge controversy",
"evidencebased" = "evidence-based",
"sens" = "Sen's",
"symposium amartya" = "symposium Amartya",
"sens philosophy" = "Sen's philosophy",
"mises" = "Mises",
"weintraubs" = "Weintraub's",
"cook" = "Cook",
"coase theorem" = "Coase theorem",
"soros" = "Soros",
"keyness" = "Keynes's",
"george" = "George",
"coats" = "Coats",
"hausman" = "Hausman",
"american" = "American",
"austrian" = "Austrian",
"kuhns paradigm" = "Kuhn's paradigm",
"kuhnian perspective" = "Kuhnian perspective",
"zillak" = "Zillak",
"darwinism" = "Darwinism",
"hayeks" = "Hayek's",
"coases" = "Coase's",
"postkeynesian" = "Post-Keynesian",
"keynesian economics" = "Keynesian economics",
"friedrich hayek" = "Friedrich Hayek",
"marxist" = "Marxist",
"american school" = "American school",
"industrialrelations" = "industrial relations",
"economicthought action" = "economic-thought action",
"lucas" = "Lucas",
"shiller" = "Shiller",
"stuart mill" = "Stuart Mill",
"ricardos method" = "Ricardo's method",
"malthus" = "Malthus",
"john stuart" = "John Stuart",
"italian" = "Italian",
"schumpeter" = "Schumpeter",
"pareto" = "Pareto",
"german" = "German")
#For the tf-idf graph, we hard-code the order in which the topics are supposed to appear. We want it to match the order in the topic-through-time graph for readability. Since the order of disciplines in the topic-through-time graph is calculated after the tf-idf graph is made, we hard-code it here instead of changing the order in which the graphs appear in the markdown (the present order makes more sense).
order_disc_philo <- c("Moral\nPhilosophy", "Behavioral\nEconomics", "Big M", "Small m", "Decision\nTheory")
order_disc_metho <- c("Institutional\nEconomics","Critical\nRealism", "Political\nEconomy", "Big M", "Small m", "History of\nEconomics")
JEL_doc_topic_map <- data.table(document = 1:6, Topic = c("Big M","Political\nEconomy","History of\nEconomics","Critical\nRealism", "Institutional\nEconomics", "Small m"))
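The conversion table defined above is applied later in the text-cleaning workflow. As a minimal sketch of how such a named vector works with stringr::str_replace_all() — the terms below are hypothetical examples, and only a small subset of the table is redefined so the chunk runs on its own:

```r
library(stringr)

# Small subset of the conversion table, redefined so this sketch is self-contained
conversion_subset <- c("mccloskey" = "McCloskey",
                       "datamining" = "data-mining",
                       "keyness" = "Keynes's")

# str_replace_all() accepts a named vector: names are patterns, values are replacements
terms <- c("the mccloskey critique", "datamining concerns", "keyness theory")
str_replace_all(terms, conversion_subset)
```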
We have two corpora coming from distinct bibliometric sources. Although we cannot share the data (because of license restrictions), this section provides much of the pre-treatment code used. Anyone who fetches the data from the bibliometric databases should be able to replicate our results.
Our corpus capturing the field of specialized philosophy of economics is composed of two journals:
Although our team typically gets its bibliometric data from the Web of Science's (WoS) version of the Observatoire des sciences et technologies, WoS contains data for JEM only since 2013. To have a complete corpus, we thus turned to Scopus. Data were retrieved in early 2020, so we have complete records for the journals up to and including 2019.
The R package bibliometrix makes it easy to load the data and generate initial descriptive results.
dt_one_j <- convert2df(file =
"/projects/digital_history/philo_and_economics/data/BibTeX files/Corpus A - Core Phil Eco Journals and Books/Eco & Phil Vols 1-35 (1985-2019).bib",
dbsource= "scopus", format = "bibtex")
##
## Converting your scopus collection into a bibliographic dataframe
##
## Done!
##
##
## Generating affiliation field tag AU_UN from C1: Done!
# print(paste("There are",nrow(dt_one_j), "documents from this journal."))
print("Automated analysis from the bibliometrix package.")
## [1] "Automated analysis from the bibliometrix package."
results_one_j <- biblioAnalysis(dt_one_j, sep = ";")
# summary of the analysis (error with the function)
summary(object = results_one_j, k = 10, pause = FALSE)
##
##
## MAIN INFORMATION ABOUT DATA
##
## Timespan 1985 : 2019
## Sources (Journals, Books, etc) 1
## Documents 614
## Average years from publication 16.2
## Average citations per documents 14.38
## Average citations per year per doc 0.8024
## References 15728
##
## DOCUMENT TYPES
## article 422
## article in press 9
## conference paper 31
## editorial 10
## erratum 4
## letter 2
## note 10
## review 126
##
## DOCUMENT CONTENTS
## Keywords Plus (ID) 45
## Author's Keywords (DE) 388
##
## AUTHORS
## Authors 512
## Author Appearances 758
## Authors of single-authored documents 346
## Authors of multi-authored documents 166
##
## AUTHORS COLLABORATION
## Single-authored documents 497
## Documents per Author 1.2
## Authors per Document 0.834
## Co-Authors per Documents 1.23
## Collaboration Index 1.42
##
##
## Annual Scientific Production
##
## Year Articles
## 1985 19
## 1986 11
## 1987 19
## 1988 16
## 1989 13
## 1990 13
## 1991 16
## 1992 13
## 1993 17
## 1994 19
## 1995 16
## 1996 9
## 1997 16
## 1998 8
## 1999 14
## 2000 15
## 2001 13
## 2002 11
## 2003 19
## 2004 20
## 2005 24
## 2006 18
## 2007 22
## 2008 26
## 2009 18
## 2010 12
## 2011 13
## 2012 16
## 2013 20
## 2014 23
## 2015 20
## 2016 18
## 2017 15
## 2018 33
## 2019 39
##
## Annual Percentage Growth Rate 2.137593
##
##
## Most Productive Authors
##
## Authors Articles Authors Articles Fractionalized
## 1 HAUSMAN DM 11 NA NA 11.0
## 2 NA NA 11 HAUSMAN DM 10.0
## 3 SUGDEN R 11 BROOME J 8.5
## 4 BROOME J 9 SUGDEN R 8.5
## 5 FLEURBAEY M 7 QIZILBASH M 6.0
## 6 LIST C 7 SEN A 6.0
## 7 MONGIN P 6 FLEURBAEY M 5.5
## 8 QIZILBASH M 6 MONGIN P 5.5
## 9 SEN A 6 GUSTAFSSON JE 5.0
## 10 VOORHOEVE A 6 CARTER I 4.5
##
##
## Top manuscripts per citations
##
## Paper TC TCperYear
## 1 BINMORE K, 1987, ECON PHILOS 292 8.59
## 2 MORRIS S, 1995, ECON PHILOS 174 6.69
## 3 SUGDEN R, 2000, ECON PHILOS 157 7.48
## 4 BUCHANAN JM, 1991, ECON PHILOS 153 5.10
## 5 HIRSCHMAN AO, 1985, ECON PHILOS 139 3.86
## 6 ETZIONI A, 1986, ECON PHILOS 138 3.94
## 7 FLEURBAEY M, 1995, ECON PHILOS 137 5.27
## 8 NUSSBAUM MC, 2001, ECON PHILOS 136 6.80
## 9 DONALDSON T, 1995, ECON PHILOS 117 4.50
## 10 STALNAKER R, 1996, ECON PHILOS 111 4.44
##
##
## Corresponding Author's Countries
##
## Country Articles Freq SCP MCP MCP_Ratio
## 1 USA 74 0.4966 71 3 0.0405
## 2 UNITED KINGDOM 23 0.1544 21 2 0.0870
## 3 FRANCE 8 0.0537 8 0 0.0000
## 4 GERMANY 7 0.0470 7 0 0.0000
## 5 CANADA 5 0.0336 4 1 0.2000
## 6 SWEDEN 5 0.0336 5 0 0.0000
## 7 NETHERLANDS 4 0.0268 4 0 0.0000
## 8 BELGIUM 3 0.0201 2 1 0.3333
## 9 NORWAY 3 0.0201 3 0 0.0000
## 10 AUSTRALIA 2 0.0134 1 1 0.5000
##
##
## SCP: Single Country Publications
##
## MCP: Multiple Country Publications
##
##
## Total Citations per Country
##
## Country Total Citations Average Article Citations
## 1 USA 1218 16.46
## 2 UNITED KINGDOM 212 9.22
## 3 SWEDEN 191 38.20
## 4 NORWAY 80 26.67
## 5 ISRAEL 50 25.00
## 6 FRANCE 45 5.62
## 7 SWITZERLAND 43 21.50
## 8 GERMANY 29 4.14
## 9 BELGIUM 24 8.00
## 10 AUSTRALIA 17 8.50
##
##
## Most Relevant Sources
##
## Sources Articles
## 1 ECONOMICS AND PHILOSOPHY 614
##
##
## Most Relevant Keywords
##
## Author Keywords (DE) Articles Keywords-Plus (ID) Articles
## 1 DECISION THEORY 7 ABORTION 6
## 2 PRIORITARIANISM 6 BEHAVIOR 6
## 3 FAIRNESS 5 FAMILY PLANNING 6
## 4 NUDGE 5 ECONOMICS 5
## 5 EGALITARIANISM 4 CRITIQUE 4
## 6 EXPLOITATION 4 ECONOMIC FACTORS 4
## 7 BEHAVIOURAL ECONOMICS 3 FERTILITY CONTROL 4
## 8 CLIMATE CHANGE 3 INDUCED 4
## 9 DELIBERATION 3 POSTCONCEPTION 4
## 10 DISTRIBUTIVE JUSTICE 3 PSYCHOLOGICAL FACTORS 4
# Multiple graphs:
plot(x = results_one_j, k = 10, pause = FALSE)
# print("Getting the most-cited references")
CR <- citations(dt_one_j, field = "article", sep = ";")
kable(cbind(CR$Cited[1:10]), caption = "Most-cited references")
| RAWLS, J., (1971) A THEORY OF JUSTICE, , HARVARD UNIVERSITY PRESS | 24 |
| RAWLS, J., (1971) A THEORY OF JUSTICE, , CAMBRIDGE, MA: HARVARD UNIVERSITY PRESS | 17 |
| BROOME, J., (2004) WEIGHING LIVES, , OXFORD: OXFORD UNIVERSITY PRESS | 15 |
| COHEN, G.A., ON THE CURRENCY OF EGALITARIAN JUSTICE (1989) ETHICS, 99, PP. 906-944 | 15 |
| ANDERSON, E., WHAT IS THE POINT OF EQUALITY? (1999) ETHICS, 109, PP. 287-337 | 11 |
| BROOME, J., (1991) WEIGHING GOODS, , OXFORD: BLACKWELL | 11 |
| KAHNEMAN, D., TVERSKY, A., PROSPECT THEORY: AN ANALYSIS OF DECISION UNDER RISK (1979) ECONOMETRICA, 47, PP. 263-291 | 10 |
| PARFIT, D., (1984) REASONS AND PERSONS, , OXFORD UNIVERSITY PRESS | 10 |
| DWORKIN, R., WHAT IS EQUALITY? PART 2: EQUALITY OF RESOURCES (1981) PHILOSOPHY AND PUBLIC AFFAIRS, 10, PP. 283-345 | 9 |
| HARSANYI, J.C., CARDINAL WELFARE, INDIVIDUALISTIC ETHICS, AND INTERPERSONAL COMPARISONS OF UTILITY (1955) JOURNAL OF POLITICAL ECONOMY, 63, PP. 309-321 | 9 |
# print("Getting the most-cited first author")
CR <- citations(dt_one_j, field = "author", sep = ";")
kable(cbind(CR$Cited[1:10]), caption = "Most-cited first authors")
| SEN A | 404 |
| SUGDEN R | 192 |
| BROOME J | 178 |
| RAWLS J | 158 |
| SEN A K | 156 |
| KAHNEMAN D | 146 |
| TVERSKY A | 119 |
| FLEURBAEY M | 102 |
| HAYEK F A | 101 |
| COHEN G A | 99 |
rm(dt_one_j,results_one_j)
dt_one_j <- convert2df(file =
"/projects/digital_history/philo_and_economics/data/BibTeX files/Corpus A - Core Phil Eco Journals and Books/JEM Vols 1-26 (1994-2019).bib",
dbsource= "scopus", format = "bibtex")
##
## Converting your scopus collection into a bibliographic dataframe
##
##
## Warning:
## In your file, some mandatory metadata are missing. Bibliometrix functions may not work properly!
##
## Please, take a look at the vignettes:
## - 'Data Importing and Converting' (https://cran.r-project.org/web/packages/bibliometrix/vignettes/Data-Importing-and-Converting.html)
## - 'A brief introduction to bibliometrix' (https://cran.r-project.org/web/packages/bibliometrix/vignettes/bibliometrix-vignette.html)
##
##
## Missing fields: IDDone!
##
##
## Generating affiliation field tag AU_UN from C1: Done!
# print(paste("There are",nrow(dt_one_j), "documents from this journal."))
print("Automated analysis from the bibliometrix package.")
## [1] "Automated analysis from the bibliometrix package."
results_one_j <- biblioAnalysis(dt_one_j, sep = ";")
# summary of the analysis (error with the function)
summary(object = results_one_j, k = 10, pause = FALSE)
##
##
## MAIN INFORMATION ABOUT DATA
##
## Timespan 1994 : 2019
## Sources (Journals, Books, etc) 1
## Documents 611
## Average years from publication 12.2
## Average citations per documents 9.142
## Average citations per year per doc 0.6864
## References 22574
##
## DOCUMENT TYPES
## article 495
## conference paper 34
## editorial 25
## erratum 4
## letter 2
## note 14
## review 37
##
## DOCUMENT CONTENTS
## Keywords Plus (ID) 0
## Author's Keywords (DE) 1564
##
## AUTHORS
## Authors 507
## Author Appearances 779
## Authors of single-authored documents 323
## Authors of multi-authored documents 184
##
## AUTHORS COLLABORATION
## Single-authored documents 487
## Documents per Author 1.21
## Authors per Document 0.83
## Co-Authors per Documents 1.27
## Collaboration Index 1.48
##
##
## Annual Scientific Production
##
## Year Articles
## 1994 19
## 1995 16
## 1996 15
## 1997 16
## 1998 11
## 1999 21
## 2000 18
## 2001 29
## 2002 17
## 2003 28
## 2004 23
## 2005 29
## 2006 23
## 2007 25
## 2008 18
## 2009 25
## 2010 27
## 2011 26
## 2012 29
## 2013 34
## 2014 25
## 2015 31
## 2016 26
## 2017 22
## 2018 22
## 2019 36
##
## Annual Percentage Growth Rate 2.589274
##
##
## Most Productive Authors
##
## Authors Articles Authors Articles Fractionalized
## 1 DAVIS JB 14 DAVIS JB 10.78
## 2 MKI U 11 MKI U 10.33
## 3 MAYER T 10 HAUSMAN DM 9.00
## 4 SUGDEN R 10 REISS J 9.00
## 5 BACKHOUSE RE 9 MAYER T 8.33
## 6 HAUSMAN DM 9 BACKHOUSE RE 8.00
## 7 REISS J 9 NA NA 8.00
## 8 ROSS D 9 ROSS D 7.50
## 9 HOOVER KD 8 SUGDEN R 6.67
## 10 NA NA 8 HOOVER KD 6.50
##
##
## Top manuscripts per citations
##
## Paper TC TCperYear
## 1 SUGDEN R, 2000, J ECON METHODOL-a 170 8.10
## 2 HODGSON GM, 2007, J ECON METHODOL 164 11.71
## 3 WITT U, 2004, J ECON METHODOL 103 6.06
## 4 SCHRAM A, 2005, J ECON METHODOL 95 5.94
## 5 MKI U, 2005, J ECON METHODOL 87 5.44
## 6 MORGAN MS, 2001, J ECON METHODOL 85 4.25
## 7 MORGAN MS, 2005, J ECON METHODOL 78 4.88
## 8 CHICK V, 2005, J ECON METHODOL 76 4.75
## 9 WOODWARD J, 2006, J ECON METHODOL 74 4.93
## 10 READ D, 2005, J ECON METHODOL 71 4.44
##
##
## Corresponding Author's Countries
##
## Country Articles Freq SCP MCP MCP_Ratio
## 1 USA 69 0.2738 66 3 0.0435
## 2 UNITED KINGDOM 52 0.2063 48 4 0.0769
## 3 NETHERLANDS 26 0.1032 22 4 0.1538
## 4 FRANCE 13 0.0516 13 0 0.0000
## 5 GERMANY 13 0.0516 12 1 0.0769
## 6 FINLAND 12 0.0476 10 2 0.1667
## 7 ITALY 10 0.0397 9 1 0.1000
## 8 SOUTH AFRICA 7 0.0278 4 3 0.4286
## 9 GEORGIA 6 0.0238 4 2 0.3333
## 10 CANADA 5 0.0198 5 0 0.0000
##
##
## SCP: Single Country Publications
##
## MCP: Multiple Country Publications
##
##
## Total Citations per Country
##
## Country Total Citations Average Article Citations
## 1 NETHERLANDS 573 22.04
## 2 UNITED KINGDOM 560 10.77
## 3 USA 503 7.29
## 4 GERMANY 233 17.92
## 5 FINLAND 89 7.42
## 6 FRANCE 77 5.92
## 7 ITALY 59 5.90
## 8 SOUTH AFRICA 35 5.00
## 9 GEORGIA 34 5.67
## 10 PORTUGAL 30 10.00
##
##
## Most Relevant Sources
##
## Sources Articles
## 1 JOURNAL OF ECONOMIC METHODOLOGY 611
# Multiple graphs:
plot(x = results_one_j, k = 10, pause = FALSE)
# print("Getting the most-cited references")
CR <- citations(dt_one_j, field = "article", sep = ";")
kable(cbind(CR$Cited[1:10]), caption = "Most-cited references")
| HAUSMAN, D., (1992) THE INEXACT AND SEPARATE SCIENCE OF ECONOMICS, , CAMBRIDGE: CAMBRIDGE UNIVERSITY PRESS | 33 |
| HAUSMAN, D.M., (1992) THE INEXACT AND SEPARATE SCIENCE OF ECONOMICS, , CAMBRIDGE: CAMBRIDGE UNIVERSITY PRESS | 33 |
| LAWSON, T., (1997) ECONOMICS AND REALITY, , LONDON: ROUTLEDGE | 26 |
| HANDS, D.W., (2001) REFLECTION WITHOUT RULES: ECONOMIC METHODOLOGY AND CONTEMPORARY SCIENCE THEORY, , CAMBRIDGE: CAMBRIDGE UNIVERSITY PRESS | 23 |
| KAHNEMAN, D., TVERSKY, A., PROSPECT THEORY: AN ANALYSIS OF DECISION UNDER RISK (1979) ECONOMETRICA, 47, PP. 263-291 | 15 |
| BLAUG, M., (1980) THE METHODOLOGY OF ECONOMICS, , CAMBRIDGE: CAMBRIDGE UNIVERSITY PRESS | 13 |
| HUTCHISON, T.W., (1938) THE SIGNIFICANCE AND BASIC POSTULATES OF ECONOMIC THEORY, , LONDON: MACMILLAN | 12 |
| KEYNES, J.M., (1936) THE GENERAL THEORY OF EMPLOYMENT, INTEREST AND MONEY, , LONDON: MACMILLAN | 12 |
| LAWSON, T., (2003) REORIENTING ECONOMICS, , LONDON: ROUTLEDGE | 11 |
| SMITH, V.L., MICROECONOMIC SYSTEMS AS AN EXPERIMENTAL SCIENCE (1982) AMERICAN ECONOMIC REVIEW, 72, PP. 923-955 | 11 |
# print("Getting the most-cited first author")
CR <- citations(dt_one_j, field = "author", sep = ";")
kable(cbind(CR$Cited[1:10]), caption = "Most-cited first authors")
| MKI U | 241 |
| SUGDEN R | 198 |
| SEN A | 193 |
| KAHNEMAN D | 170 |
| FRIEDMAN M | 164 |
| TVERSKY A | 151 |
| HAYEK F A | 141 |
| LOEWENSTEIN G | 129 |
| GIGERENZER G | 127 |
| LAWSON T | 127 |
rm(dt_one_j,results_one_j)
Now, we combine the two journals in a single corpus, keep only articles and reviews, and drop documents published prior to 1990.
#Loading corpus
df_CorpusA <- list(convert2df(file = "/projects/digital_history/philo_and_economics/data/BibTeX files/Corpus A - Core Phil Eco Journals and Books/Eco & Phil Vols 1-35 (1985-2019).bib",dbsource= "scopus", format = "bibtex"),
convert2df(file = "/projects/digital_history/philo_and_economics/data/BibTeX files/Corpus A - Core Phil Eco Journals and Books/JEM Vols 1-26 (1994-2019).bib",dbsource= "scopus", format = "bibtex")) %>% rbindlist(fill = TRUE)
##
## Converting your scopus collection into a bibliographic dataframe
##
## Done!
##
##
## Generating affiliation field tag AU_UN from C1: Done!
##
##
## Converting your scopus collection into a bibliographic dataframe
##
##
## Warning:
## In your file, some mandatory metadata are missing. Bibliometrix functions may not work properly!
##
## Please, take a look at the vignettes:
## - 'Data Importing and Converting' (https://cran.r-project.org/web/packages/bibliometrix/vignettes/Data-Importing-and-Converting.html)
## - 'A brief introduction to bibliometrix' (https://cran.r-project.org/web/packages/bibliometrix/vignettes/bibliometrix-vignette.html)
##
##
## Missing fields: IDDone!
##
##
## Generating affiliation field tag AU_UN from C1: Done!
#We keep only articles and reviews published in 1990 or later
dt_CorpusArticles <- df_CorpusA[DT %in% c("ARTICLE", "REVIEW") & PY>=first_y]
print("Automated analysis from the bibliometrix package.")
## [1] "Automated analysis from the bibliometrix package."
results_cA <- biblioAnalysis(dt_CorpusArticles, sep = ";")
# summary of the analysis (error with the function)
summary(object = results_cA, k = 10, pause = FALSE)
##
##
## MAIN INFORMATION ABOUT DATA
##
## Timespan 1990 : 2019
## Sources (Journals, Books, etc) 2
## Documents 1007
## Average years from publication 13.3
## Average citations per documents 11.35
## Average citations per year per doc 0.7613
## References 33760
##
## DOCUMENT TYPES
## article 846
## review 161
##
## DOCUMENT CONTENTS
## Keywords Plus (ID) 45
## Author's Keywords (DE) 1749
##
## AUTHORS
## Authors 836
## Author Appearances 1259
## Authors of single-authored documents 550
## Authors of multi-authored documents 286
##
## AUTHORS COLLABORATION
## Single-authored documents 808
## Documents per Author 1.2
## Authors per Document 0.83
## Co-Authors per Documents 1.25
## Collaboration Index 1.44
##
##
## Annual Scientific Production
##
## Year Articles
## 1990 13
## 1991 16
## 1992 13
## 1993 17
## 1994 38
## 1995 32
## 1996 23
## 1997 29
## 1998 19
## 1999 32
## 2000 32
## 2001 38
## 2002 27
## 2003 34
## 2004 27
## 2005 44
## 2006 35
## 2007 35
## 2008 40
## 2009 41
## 2010 37
## 2011 36
## 2012 44
## 2013 46
## 2014 42
## 2015 45
## 2016 32
## 2017 30
## 2018 43
## 2019 67
##
## Annual Percentage Growth Rate 5.817198
##
##
## Most Productive Authors
##
## Authors Articles Authors Articles Fractionalized
## 1 SUGDEN R 20 HAUSMAN DM 15.00
## 2 HAUSMAN DM 16 SUGDEN R 14.17
## 3 GUALA F 10 MKI U 10.00
## 4 MKI U 10 GUALA F 9.00
## 5 DAVIS JB 9 REISS J 8.00
## 6 MAYER T 9 ROSS D 7.50
## 7 ROSS D 9 MAYER T 7.33
## 8 GOLDFARB RS 8 HANDS DW 7.00
## 9 REISS J 8 QIZILBASH M 7.00
## 10 BACKHOUSE RE 7 DAVIS JB 6.78
##
##
## Top manuscripts per citations
##
## Paper TC TCperYear
## 1 MORRIS S, 1995, ECON PHILOS 174 6.69
## 2 SUGDEN R, 2000, J ECON METHODOL-a 170 8.10
## 3 HODGSON GM, 2007, J ECON METHODOL 164 11.71
## 4 SUGDEN R, 2000, ECON PHILOS 157 7.48
## 5 BUCHANAN JM, 1991, ECON PHILOS 153 5.10
## 6 FLEURBAEY M, 1995, ECON PHILOS 137 5.27
## 7 NUSSBAUM MC, 2001, ECON PHILOS 136 6.80
## 8 DONALDSON T, 1995, ECON PHILOS 117 4.50
## 9 STALNAKER R, 1996, ECON PHILOS 111 4.44
## 10 CUBITT RP, 2003, ECON PHILOS 96 5.33
##
##
## Corresponding Author's Countries
##
## Country Articles Freq SCP MCP MCP_Ratio
## 1 USA 113 0.3343 109 4 0.0354
## 2 UNITED KINGDOM 63 0.1864 59 4 0.0635
## 3 NETHERLANDS 27 0.0799 24 3 0.1111
## 4 FRANCE 19 0.0562 19 0 0.0000
## 5 GERMANY 18 0.0533 17 1 0.0556
## 6 ITALY 11 0.0325 10 1 0.0909
## 7 FINLAND 10 0.0296 9 1 0.1000
## 8 CANADA 9 0.0266 9 0 0.0000
## 9 SWEDEN 7 0.0207 6 1 0.1429
## 10 BELGIUM 6 0.0178 5 1 0.1667
##
##
## SCP: Single Country Publications
##
## MCP: Multiple Country Publications
##
##
## Total Citations per Country
##
## Country Total Citations Average Article Citations
## 1 USA 1218 10.78
## 2 UNITED KINGDOM 686 10.89
## 3 NETHERLANDS 493 18.26
## 4 SWEDEN 185 26.43
## 5 GERMANY 159 8.83
## 6 FRANCE 113 5.95
## 7 NORWAY 93 23.25
## 8 FINLAND 86 8.60
## 9 ITALY 59 5.36
## 10 BELGIUM 51 8.50
##
##
## Most Relevant Sources
##
## Sources Articles
## 1 JOURNAL OF ECONOMIC METHODOLOGY 532
## 2 ECONOMICS AND PHILOSOPHY 475
##
##
## Most Relevant Keywords
##
## Author Keywords (DE) Articles Keywords-Plus (ID) Articles
## 1 METHODOLOGY 48 ABORTION 6
## 2 ECONOMIC METHODOLOGY 21 BEHAVIOR 6
## 3 RATIONALITY 20 FAMILY PLANNING 6
## 4 MODELS 18 ECONOMICS 5
## 5 EXPERIMENTAL ECONOMICS 16 CRITIQUE 4
## 6 EXPLANATION 16 ECONOMIC FACTORS 4
## 7 EXPERIMENTS 13 FERTILITY CONTROL 4
## 8 BEHAVIORAL ECONOMICS 12 INDUCED 4
## 9 ECONOMICS 12 POSTCONCEPTION 4
## 10 GAME THEORY 12 PSYCHOLOGICAL FACTORS 4
# Multiple graphs:
plot(x = results_cA, k = 10, pause = FALSE)
# print("Getting the most-cited references")
CR <- citations(dt_CorpusArticles, field = "article", sep = ";")
kable(cbind(CR$Cited[1:10]), caption = "Most-cited references")
| HAUSMAN, D., (1992) THE INEXACT AND SEPARATE SCIENCE OF ECONOMICS, , CAMBRIDGE: CAMBRIDGE UNIVERSITY PRESS | 32 |
| HAUSMAN, D.M., (1992) THE INEXACT AND SEPARATE SCIENCE OF ECONOMICS, , CAMBRIDGE: CAMBRIDGE UNIVERSITY PRESS | 32 |
| KAHNEMAN, D., TVERSKY, A., PROSPECT THEORY: AN ANALYSIS OF DECISION UNDER RISK (1979) ECONOMETRICA, 47, PP. 263-291 | 24 |
| RAWLS, J., (1971) A THEORY OF JUSTICE, , HARVARD UNIVERSITY PRESS | 24 |
| HANDS, D.W., (2001) REFLECTION WITHOUT RULES: ECONOMIC METHODOLOGY AND CONTEMPORARY SCIENCE THEORY, , CAMBRIDGE: CAMBRIDGE UNIVERSITY PRESS | 22 |
| RAWLS, J., (1971) A THEORY OF JUSTICE, , CAMBRIDGE, MA: HARVARD UNIVERSITY PRESS | 22 |
| LAWSON, T., (1997) ECONOMICS AND REALITY, , LONDON: ROUTLEDGE | 21 |
| BROOME, J., (2004) WEIGHING LIVES, , OXFORD: OXFORD UNIVERSITY PRESS | 15 |
| CAMERER, C., LOEWENSTEIN, G., PRELEC, D., NEUROECONOMICS: HOW NEUROSCIENCE CAN INFORM ECONOMICS (2005) JOURNAL OF ECONOMIC LITERATURE, 43, PP. 9-64 | 15 |
| COHEN, G.A., ON THE CURRENCY OF EGALITARIAN JUSTICE (1989) ETHICS, 99, PP. 906-944 | 15 |
# print("Getting the most-cited first author")
CR <- citations(dt_CorpusArticles, field = "author", sep = ";")
kable(cbind(CR$Cited[1:10]), caption = "Most-cited first authors")
| SEN A | 476 |
| SUGDEN R | 358 |
| KAHNEMAN D | 289 |
| MKI U | 255 |
| TVERSKY A | 236 |
| HAYEK F A | 217 |
| FRIEDMAN M | 210 |
| BROOME J | 185 |
| SEN A K | 181 |
| LOEWENSTEIN G | 169 |
dt_CorpusArticles$ID <- 1:nrow(dt_CorpusArticles)
save(dt_CorpusArticles,file = "/projects/digital_history/philo_and_economics/data/dt_ScopusCorpus_PhiEcon.RData")
rm(df_CorpusA)
The descriptive results above show that the references need some cleaning. For instance, Hausman shows up as 'HAUSMAN, D.' but also as 'HAUSMAN, D.M.'. The following code chunk is far from optimized for speed; it was run once and its output saved.
dt_refs <- data.table(ID = numeric(), order = numeric(), refs = character())
for(i in unique(dt_CorpusArticles$ID)){
  # Skip documents whose reference field (CR) is missing or empty
  if (!is.na(dt_CorpusArticles[ID==i]$CR) & dt_CorpusArticles[ID==i]$CR != "")
  {
    # Scopus separates references with "; "; we create one row per cited reference
    refs <- dt_CorpusArticles[ID==i]$CR %>% strsplit("; ") %>% unlist()
    dt_refs <- rbind(dt_refs, data.table(ID = i, order = 1:length(refs), refs = refs))
  }
}
dt_refs[,Year := str_extract(refs, '(?<=\\()[0-9-]+(?=\\))')]
#Adding info from citing doc (journal + citing year)
dt_refs <- merge(dt_refs, dt_CorpusArticles[,list(ID, SO, PY)], by="ID", all.x = TRUE)
#Finding duplicates with authors name
dt_good_format_refs <- dt_refs[!is.na(Year) & str_length(Year) == 4]
refs_splits <- strsplit(dt_good_format_refs$refs, ",")
dt_good_format_refs$Author <- lapply(refs_splits, function(l) l[[1]])
dt_good_format_refs$Author <- as.character(dt_good_format_refs$Author) #There are 8212 unique authors
setkey(dt_good_format_refs, Author)
dt_before <- dt_good_format_refs # to look what sources were changed by the algorithm
h = 0; old_perc = 0 # to keep track of where we are in the loop
n_aut = length(unique(dt_good_format_refs$Author))
for (author in unique(dt_good_format_refs$Author))
{
  h = h + 1
  rows <- which(dt_good_format_refs$Author == author) # row indices in the full table
  if (length(rows) > 1)
  {
    for (i in 1:(length(rows) - 1))
    {
      # Note the parentheses: i+1:n is parsed as i + (1:n) and would run past the table
      for (j in (i + 1):length(rows))
      {
        dt <- duplicatedMatching(dt_good_format_refs[rows[c(i, j)]], Field = "refs", tol = 0.85)
        if (nrow(dt) == 1)
        {
          # Assign by reference on the full table; chained subsetting would only modify a copy
          dt_good_format_refs[rows[c(i, j)], refs := dt$refs]
        }
      }
    }
  }
  # Print progress only when the completed percentage increases
  perc = round(h/n_aut*100)
  if(perc > old_perc){
    print(paste(perc, "% of authors completed at", Sys.time()))
    old_perc = perc
  }
}
#looking at what was changed
dt_difference <- merge(dt_good_format_refs[,list(ID, order, refs)], dt_before[,list(ID, order, refs)], by = c("ID", "order"))
dt_refs <- dt_good_format_refs
save(dt_refs, file = "/projects/digital_history/philo_arefnd_economics/data/dt_refs.RData")
rm(dt_refs,dt_good_format_refs,dt_before)
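The year-extraction pattern used above is compact; a toy check on a hypothetical Scopus-style reference string:

```r
library(stringr)

# The lookbehind/lookahead pair grabs the digits (and hyphens) sitting between
# parentheses, which is where Scopus stores the publication year
ref <- "HAUSMAN, D., (1992) THE INEXACT AND SEPARATE SCIENCE OF ECONOMICS"
str_extract(ref, '(?<=\\()[0-9-]+(?=\\))')
# "1992"
```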
source("/home/olivier/my_projects/central_banks/functions/functions_for_bibliometrics.R")
load("/projects/digital_history/philo_and_economics/data/dt_refs.RData")
#The previous chunk of code got rid of duplicates, but we still have references to different editions of the same work.
#This is problematic because the top 5 cited refs then sometimes list the same source twice, once per edition.
#Our goal here is to create a unique identifier for references.
#To do so, we calculate the Levenshtein distance between the opening characters of references by the same author:
#the title of a reference usually comes first, and the edition information after.
#If the distance is below a threshold, we consider it to be the same reference (in the code below, the compared
#chunk is scaled to the references' length and the threshold to roughly 10% of that chunk).
if(!"ref_id" %in% colnames(dt_refs))
{
#We give an ID to references
identifying_refs <- dt_refs[,list(refs)] %>% unique()
identifying_refs[,ref_id := c(1:.N)]
dt_refs <- merge(dt_refs, identifying_refs, by = "refs", all.x = TRUE)
}
#We don't want to run this twice if it was already done
if (!"unique_ref_id" %in% colnames(dt_refs))
{
  #Default thresholds; they are recomputed below as per-pair values scaled to each reference's length
  lv_distance_trhld <- 10
  size_of_ref_chunk <- 50
#Since the goal of the procedure is to get top citations, we only try to find duplicates for the top 500
#authors
top_authors <- dt_refs[, .N, by = Author][order(-N)] %>% head(500)
#First we need ids for references
ref_ids <- dt_refs[,list(refs)] %>% unique()
setkey(ref_ids, refs)
ref_ids[,ref_id := c(1:.N)]
#Then we only take references from the top 500 authors
dt <- dt_refs[Author %in% top_authors$Author, list(Author, refs)] %>% unique()
dt <- merge(dt, ref_ids, by = "refs") #merging ref_id
#Creating all combinations of references per author
dt_refs_duplicates <- dt[,list(Target = rep(refs[1:(length(refs)-1)],(length(refs)-1):1),
Source = rev(refs)[sequence((length(refs)-1):1)]),
by= Author]
#Merging back ids
dt_refs_duplicates <- merge(dt_refs_duplicates, dt[,list(refs, ref_id)], by.x = "Target", by.y="refs")
setnames(dt_refs_duplicates, "ref_id", "Target_ID")
dt_refs_duplicates <- merge(dt_refs_duplicates, dt[,list(refs, ref_id)], by.x = "Source", by.y="refs")
setnames(dt_refs_duplicates, "ref_id", "Source_ID")
#Cleaning the references to get rid of special characters and numbers (we don't want the year to matter, only the title)
dt_refs_duplicates[,`:=`(Target = fct_clear_string_for_fuzzy_match(Target) %>% removeNumbers(),
Source = fct_clear_string_for_fuzzy_match(Source) %>% removeNumbers())]
dt_refs_duplicates[, size_of_ref_chunk := ((str_length(Source) + str_length(Target)) / 4) %>% round()]
dt_refs_duplicates[, lv_distance_trhld := (size_of_ref_chunk * 0.1) %>% round()]
#Calculating levenshtein distance
dt_refs_duplicates[, distance := stringdist::stringdist(substr(Target, 1, size_of_ref_chunk),
substr(Source, 1, size_of_ref_chunk))]
  #Only keeping pairs of refs whose distance is below the threshold
dt_refs_ids <- dt_refs_duplicates[distance < lv_distance_trhld]
#Sorting in order to have ref_ids that are only in the Source column
dt_refs_ids[, `:=`(Target = c(Target,Source) %>% sort() %>% first(), Source = c(Target,Source) %>% sort() %>% last())
,by = .(Target,Source)]
  #Keeping only the rows whose Source_ID does not appear in the Target column
dt_refs_ids <- dt_refs_ids[!Source_ID %in% Target_ID, list(ID = Source_ID, Source_ID, Target_ID)]
#Binding the Source and the Target ids together
dt_refs_ids <- rbind(dt_refs_ids[,list(ref_id = Target_ID, unique_ref_id = ID)],
dt_refs_ids[,list(ref_id = Source_ID, unique_ref_id = ID)]) %>% unique()
#In some rare cases, some ref_ids have more than one unique_ref_id. We take the unique_ref_id that is the most frequent
dt_refs_ids[,N := .N, by = unique_ref_id]
setorder(dt_refs_ids, -N)
dt_refs_ids <- dt_refs_ids[,head(.SD, 1), by = ref_id]
#We merge back the unique_ref_ids on the dt_refs table
#dt_refs <- merge(dt_refs, ref_ids, by = "refs")
dt_refs <- merge(dt_refs, dt_refs_ids[,list(ref_id, unique_ref_id)], by = "ref_id", all.x = TRUE)
  #Taking care of references that don't have a unique_ref_id
dt_refs[is.na(unique_ref_id), unique_ref_id := ref_id]
save(dt_refs, file = "/projects/digital_history/philo_and_economics/data/dt_refs.RData")
rm(dt_refs_duplicates); rm(dt_refs_ids)
}
rm(dt_refs)
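The pairwise-combination idiom used above (rep() over a decreasing sequence()) is compact but opaque; a toy check with three hypothetical references from one author:

```r
library(data.table)

# For refs a, b, c the idiom should yield every unordered pair exactly once:
# (a, c), (a, b), (b, c)
dt <- data.table(Author = "x", refs = c("a", "b", "c"))
pairs <- dt[, list(Target = rep(refs[1:(length(refs)-1)], (length(refs)-1):1),
                   Source = rev(refs)[sequence((length(refs)-1):1)]),
            by = Author]
pairs
```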
This corpus is composed of articles classified as "economic methodology" by economists, but not published in the main journals of specialized philosophy of economics. Creating the corpus involves a few steps:
#Loading xml file
data <- xml2::read_xml("/projects/digital_history/philo_and_economics/data/2020-09-21_JEL-Methodology.xml")
#This line creates a nodeset containing every entry (rec node)
data <- xml2::xml_find_all(data, "//rec")
#Extracting data
title <- xml2::xml_text(xml2::xml_find_first(data, ".//atl"))
year <- data.table(xml2::xml_text(xml2::xml_find_first(data, ".//dt")))
year[,V1 := substr(V1,1,4)]
journal <- xml2::xml_text(xml2::xml_find_first(data, ".//jtl"))
vol <- xml2::xml_text(xml2::xml_find_first(data, ".//vid"))
no <- xml2::xml_text(xml2::xml_find_first(data, ".//iid"))
pubType <- xml2::xml_text(xml2::xml_find_first(data, ".//pubtype"))
pages <- xml2::xml_text(xml2::xml_find_first(data, ".//pages"))
dt_JEL <- data.table(Title = title, Year = year$V1, Journal = journal, Vol = vol, No = no, Pages = pages, PubType = pubType)
# Load the full list of journals (revues)
dt_revue <- fread("/projects/digital_history/interdisciplinarity/data/revues.txt")
# names of journals from JEL
dt_JEL[,Journal := toupper(Journal)]
# merge
# creating cleaner strings to merge
dt_JEL[,match_field := Journal %>% str_replace_all(" AND ", " & ") %>%
str_replace_all("[:punct:] "," ") %>% str_replace_all(" "," ")]
dt_revue[,match_field := Revue %>% str_replace_all(" AND ", " & ") %>%
str_replace_all("[:punct:] "," ") %>% str_replace_all(" "," ")]
dt_JEL <- merge(dt_JEL[,], dt_revue, by= "match_field", all.x=TRUE, all.y=FALSE) # actual merge
# Manual fix for journal mismatches
j_no_match <- dt_JEL[PubType == "Journal Article" & is.na(Code_Revue),.N,by=Journal][N>4][order(-N)]
dt_JEL[grepl("JOURNAL OF ECONOMIC ISSUES",Journal), Code_Revue := 8600]
dt_JEL[Journal == "JOURNAL OF INSTITUTIONAL AND THEORETICAL ECONOMICS", Code_Revue := 9055]
dt_JEL[Journal == "JOURNAL OF THE HISTORY OF ECONOMIC THOUGHT (CAMBRIDGE UNIVERSITY PRESS)", Code_Revue := 18851]
dt_JEL[Journal == "REVUE D'ECONOMIE POLITIQUE", Code_Revue := 14216]
dt_JEL[grepl("RECHERCHES ECONOMIQUES DE LOUVAIN",Journal), Code_Revue := 13831]
dt_JEL[grepl("ZEITSCHRIFT FUR WIRTSCHAFTS-[ ]?UND SOZIALWISSENSCHAFTEN",Journal), Code_Revue := 18816]
dt_JEL[grepl("CANADIAN JOURNAL OF DEVELOPMENT STUDIES",Journal), Code_Revue := 2850]
dt_JEL[grepl("OXFORD ECONOMIC PAPERS",Journal), Code_Revue := 12488]
dt_JEL[Journal == "ECONOMICS: THE OPEN-ACCESS, OPEN-ASSESSMENT E-JOURNAL", Code_Revue := 19904]
dt_JEL[grepl("CANADIAN JOURNAL OF AGRICULTURAL ECONOMICS",Journal), Code_Revue := 2832]
dt_JEL$Code_Discipline <- NULL
dt_JEL <- merge(dt_JEL, dt_revue[,list(Code_Revue, Code_Discipline)],
by= "Code_Revue", all.x=TRUE, all.y=FALSE) # actual merge
# tools used to diagnose mismatches
# for(j in j_no_match$Journal){
# if(!j %in% c("METHODUS", "OECONOMICA", "OECONOMIA", "EKONOMIA","INNOVATIONS")){ # skipping a few that generate many false positive
# hit <- dt_revue[agrepl(j,Revue)]
# if(nrow(hit)>0){
# print(paste(j, "is detected as similar enough to", hit$Revue))
# }
# }
# }
#
# txt <- "JOURNAL OF BEHAVIOURAL ECONOMICS"
# dt_JEL[grepl(txt, Journal) & PubType == "Journal Article" & is.na(Code_Revue)]
# dt_revue[grepl(txt,Revue)]
# to check if journals are on WoS: https://mjl.clarivate.com/home
dt_JEL_all <- dt_JEL
dt_JEL_all$match_field <- NULL
dt_JEL_all <- dt_JEL_all %>% unique()
save(dt_JEL_all, file = "/projects/digital_history/philo_and_economics/data/dt_JEL_all.RData" )
# Writing a small file with the Code_Revue of the matched journals
write(paste( dt_JEL_all[!is.na(Code_Revue),unique(Code_Revue)],
collapse = ", "),
"/projects/digital_history/philo_and_economics/data/journals_w_JEL_methodo.txt")
# getting some basic info
n_docs <- nrow(dt_JEL_all)
n_art <- dt_JEL_all[PubType == "Journal Article" & Year >= first_y] %>% nrow()
n_art_w_j <- dt_JEL_all[PubType == "Journal Article" & !is.na(Code_Revue) & Year >= first_y] %>% nrow()
n_j <- dt_JEL_all[!is.na(Code_Revue) & Year >= first_y,unique(Code_Revue)] %>% length()
prop_missing_j <- dt_JEL[PubType == "Journal Article" & is.na(Code_Revue) & Year >= first_y,.N]/ dt_JEL[PubType == "Journal Article" & Year >= first_y,.N]
rm(data,dt_JEL_all, dt_JEL,dt_revue,year,pages,vol,pubType,no,journal)
In the EconLit database, we retrieve all articles with the JEL code corresponding to 'Economic Methodology'. More specifically, the search string used is SU "Economic Methodology". After 1991, the corresponding JEL code is B4, but using B4 (or B4*) as a JEL code retrieves too few articles for 1990 and 1991 because the JEL classification system was transitioning from the older codes (under which 'Economic Methodology' was code 00360). Our extraction from EconLit on September 21st, 2020 gave us 9166 documents.
Among these documents, we take those that are journal articles and focus on documents published since 1990, which brings us down to 4081 documents.
We match the journal names to the list of journals in Web of Science. Some manual corrections were necessary at this step because the spelling of journal names sometimes differs between EconLit and Web of Science. Given that Web of Science covers only a subset of academic journals, only 2586 articles in our corpus can be matched to their corresponding journal in Web of Science.
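The normalize-then-match logic can be sketched in base R. The journal strings below are invented stand-ins for the kind of spelling mismatches we encountered, and `normalize()` is a hypothetical helper, not the exact cleaning used in the chunk above:

```r
# Invented stand-ins for a journal spelled differently in EconLit and WoS
econlit <- c("OXFORD ECONOMIC PAPERS-NEW SERIES", "JOURNAL OF ECONOMIC ISSUES")
wos     <- c("OXFORD ECONOMIC PAPERS", "JOURNAL OF ECONOMIC ISSUES")

# hypothetical helper: upper-case, drop conjunctions and punctuation, squeeze spaces
normalize <- function(x) {
  x <- toupper(x)
  x <- gsub(" AND | & ", " ", x)
  x <- gsub("[[:punct:]]", " ", x)
  gsub("[[:space:]]+", " ", trimws(x))
}

# exact match on normalized strings first...
exact <- match(normalize(econlit), normalize(wos))   # NA, 2

# ...then edit distance to flag likely matches for manual inspection
d     <- adist(normalize(econlit), normalize(wos))
fuzzy <- apply(d, 1, which.min)                       # 1, 2
```

In the actual pipeline, the leftover mismatches are then fixed by hand (the Code_Revue assignments below), which is safer than trusting fuzzy matches blindly.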
#This data was parsed above
load("/projects/digital_history/philo_and_economics/data/dt_JEL_all.RData" )
dt_JEL <- dt_JEL_all[PubType == "Journal Article" & !is.na(Code_Revue) & Year >= first_y]
dt_JEL[,c("First_Page", "Last_Page") := tstrsplit(Pages, "-", fixed=TRUE)]
# Loading journal issues (retrieved from WoS after previous step)
dt_revueID <- fread("/projects/digital_history/philo_and_economics/data/JEL_meth_issueIDs.tsv")
# data.table(read.csv2("/projects/digital_history/behavioral\ economics/data/revueID.csv"))
#We recode NAs as the string "NULL" so we can try to match all entries with our database
dt_JEL[is.na(Vol), Vol := "NULL"]
dt_JEL[is.na(No), No := "NULL"]
dt_JEL$Year <- as.numeric(dt_JEL$Year)
#ECONOMICS: THE OPEN-ACCESS, OPEN-ASSESSMENT E-JOURNAL (a.k.a. Economics-Kiel, Code_Revue 19904) has no volume or issue numbers
dt_Economics_Kiel <- dt_JEL[Code_Revue == 19904]
#Manually attributing ID_Art to the five articles found in WoS
dt_Economics_Kiel[order(Year,Vol,Title), ID_Art := c(NA, 47615209, 47615210, 47711327, 54947652, 56295156)]
# Removing the journal before processing to avoid issues.
dt_JEL <- dt_JEL[Code_Revue != 19904]
#Journal 7495 uses inconsistent issue-number formats; normalize them
dt_JEL[Code_Revue == 7495 & No == "4/5", No := "4-5"]
dt_JEL[Code_Revue == 7495 & No == "7/8", No := "7-8"]
dt_JEL[Code_Revue == 7495 & No == "3/4/5", No := "3-5"]
dt_JEL[Code_Revue == 7495 & No == "7/8/9", No := "7-9"]
dt_JEL[Code_Revue == 7495 & No == "9-10-11", No := "9-11"]
dt_JEL[Code_Revue == 7495 & No == "10-11-12", No := "10-12"]
# Journal 352 has issue filled instead of volume in WoS
dt_JEL[Code_Revue == 352 , `:=`(No = Vol, Vol = "NULL")]
#Merging journal unique identifier (IssueID)
dt_JEL <- merge(dt_JEL, dt_revueID[,list(Code_Revue, IssueID, Volume, Numero, Annee_Bibliographique)],
by.x = c("Code_Revue", "Vol", "No", "Year"), by.y = c("Code_Revue", "Volume", "Numero", "Annee_Bibliographique"),
all.x = TRUE, all.y = FALSE)
# Making sure that our specialized journals are excluded:
dt_JEL <- dt_JEL[Journal != "ECONOMICS AND PHILOSOPHY" &
Journal != "JOURNAL OF ECONOMIC METHODOLOGY"]
# We have quite a few articles with matching journal but no matching issue
n_wo_issueID <- dt_JEL[is.na(IssueID), .N]
# Here's some code to test what these articles without matches are
# We found out that it is mostly because WoS did not index these journals for the relevant issues
# (j <- dt_JEL[is.na(IssueID),list(first(Journal), .N),by= Code_Revue][order(-N)] %>% head(10))
# for(j_i in dt_JEL[is.na(IssueID),list(first(Journal), .N),by= Code_Revue][order(-N)]$Code_Revue){
# if(any(dt_JEL[is.na(IssueID) & Code_Revue %in% j_i,
# unique(Year)]
# %in% dt_revueID[Code_Revue %in% j_i,
# unique(Annee_Bibliographique)]
# )){
# print(paste("Possibly something to improve with journal", dt_JEL[Code_Revue == j_i, unique(Journal)], " which is Code_Revue = ", j_i))
# }
# }
#
# j_i <-8601 # Checking missing Journal of Economic Literature
# (dt_temp <- dt_JEL[is.na(IssueID) & Code_Revue %in% j_i,
# list(Code_Revue,Journal, IssueID, Year, Vol,No, First_Page,Last_Page)][order(Year)]
# )
# dt_revueID[Code_Revue %in% j_i & Annee_Bibliographique %in% unique(dt_temp$Year),
# list(Code_Revue,Revue, IssueID, Annee_Bibliographique, Volume,Numero)][
# order(Annee_Bibliographique)]
# Shrinking the table to only lines with issueID
dt_JEL <- dt_JEL[!is.na(IssueID)]
# Saving a small file to be able to match in WoS at the article level
fwrite(dt_JEL[,list(IssueID,First_Page,Code_Revue)],
file = "/projects/digital_history/philo_and_economics/data/articles_in_JEL_corpus_to_match_w_WoS.csv")
save(dt_JEL, file = "/projects/digital_history/philo_and_economics/data/dt_JEL_with_issueID.RData" )
rm(dt_JEL,dt_revueID, dt_JEL_all)
load("/projects/digital_history/philo_and_economics/data/dt_JEL_with_issueID.RData")
#Fetching Article table
dt_Articles <- fread("/projects/digital_history/philo_and_economics/data/2020-09-28_id_art_of_JEL_econ_methodo.tsv",
quote="")
dt_cleaned_JEL <- merge(dt_JEL, dt_Articles, by = c("IssueID", "First_Page", "Code_Revue"), all.x = TRUE)
n_art_no_idart <- dt_cleaned_JEL[is.na(ID_Art),.N]
dt_cleaned_JEL <- dt_cleaned_JEL[! is.na(ID_Art)]
# Bringing the KIEL articles back in the dataframe
dt_cleaned_JEL <- rbindlist(list(dt_cleaned_JEL,dt_Economics_Kiel[!is.na(ID_Art)]),fill = T)
# Loading the references (that were fetched on WoS with Sql and then improved upon with New_id2 in the R script 'script_only_for_JEL-WoS-refs.R')
load("/projects/digital_history/philo_and_economics/data/dt_citing_cited_metho_FULL.RData")
# Keeping only refs with identifier
dt_ref <- dt_ref[!is.na(New_id2)]
# table with primary key being New_id2 (cited doc unique info)
dt_refs_of_JEL_metho <- dt_ref[,list(First_author = first(Nom), cited_year = first(Annee),
Publication = first(Revue_Abbrege), Volume = first(Volume), Page = first(Page),
last_name = first(last_name), times_cited = .N
),by=New_id2]
# Simple citing-cited table
dt_citing_cited <- unique(dt_ref[,list(ID_Art, New_id2)])
# Articles with refs:
dt_Articles <- dt_cleaned_JEL[ID_Art %in% unique(dt_ref$ID_Art)]
n_art_w_refs <- nrow(dt_Articles)
#Our code doesn't work if a ref is cited in only one article, or if an article has only one reference.
#dt_ref <- dt_ref[,N := .N, by = "ID_Art"][N > 4]
#dt_ref <- dt_ref[,N := .N, by = "New_id2"][N > 4]
# Cleaning journal names:
j <- "JOURNAL OF ECONOMIC ISSUES"
dt_Articles[grepl(j,Journal), Journal := j]
save(dt_refs_of_JEL_metho, file = "/projects/digital_history/philo_and_economics/data/dt_refs_of_JEL_metho.RData")
save(dt_citing_cited, file = "/projects/digital_history/philo_and_economics/data/dt_citing_cited_metho.RData")
save(dt_Articles, file = "/projects/digital_history/philo_and_economics/data/dt_Articles_metho.RData")
rm(dt_ref); rm(dt_Articles); rm(dt_JEL)
rm(dt_refs_of_JEL_metho,dt_citing_cited, dt_cleaned_JEL)
load("/projects/digital_history/philo_and_economics/data/dt_Articles_metho.RData")
load("/projects/digital_history/philo_and_economics/data/dt_citing_cited_metho.RData")
ggplot(dt_Articles[Year>=first_y],aes(x=Year)) + geom_bar() + ggtitle("Number of articles in the JEL 'Methodology' corpus")
# dt_Articles[Year>=first_y,.N,by=Year][order(Year)]
# dt_Articles[between(Year,first_y,2018),.N,by=Year][, mean(N)]
n_2019 <- dt_Articles[Year == 2019, .N]
average1990_2018 <- dt_Articles[between(Year, first_y,last_y_metho), .N,Year][,mean(N)]
n_final <- dt_Articles[between(Year, first_y,last_y_metho), .N]
n_refs_JEL <- nrow(dt_citing_cited[ID_Art %in% dt_Articles[between(Year,first_y,last_y_metho), ID_Art]])
n_journals_JEL <- dt_Articles[between(Year, first_y,last_y_metho),unique(Code_Revue)] %>% length()
rm(dt_Articles, dt_citing_cited)
Although EconLit "includes the most sought-after economics publications", some of the journals included do not fall neatly within economics as a discipline. Given that our WoS database has a disciplinary classification for each journal, we can compute the share of each discipline and of each journal in our corpus:
load("/projects/digital_history/philo_and_economics/data/dt_Articles_metho.RData")
# Name of disciplines with articles:
dt_Articles <- merge(dt_Articles[between(Year, first_y, last_y_metho)],
discipline_info[,list(Code_Discipline,Discipline = str_replace(discipline," \n |\n"," "))], by= "Code_Discipline")
# breakdown by disciplines over the full period
top_citing_disc <- dt_Articles[,list(nb_cit =.N),by=Discipline]
top_citing_disc[order(-nb_cit), list(Discipline, `Share of articles` = round(nb_cit / sum(nb_cit),3) )][1:10] %>%
kable(caption = "Share of articles for the JEL Methodology corpus")
| Discipline | Share of articles |
|---|---|
| Economics | 0.775 |
| Other Social Sciences | 0.078 |
| Humanities | 0.041 |
| Political Science & Public Administration | 0.027 |
| Management | 0.024 |
| International Relations | 0.023 |
| Geography | 0.019 |
| Psychology | 0.004 |
| Law | 0.004 |
| Demography | 0.002 |
prop_economics <- top_citing_disc[Discipline == "Economics",
nb_cit]/sum(top_citing_disc$nb_cit)
# breakdown by journals
top_citing_journals <- dt_Articles[,list(Discipline = first(Discipline),
Journal = first(Journal), nb_cit =.N),by=Code_Revue]
top_citing_journals <- top_citing_journals[order(-nb_cit)]
top_citing_journals[, list(Discipline, Journal, `Share of articles` = round(nb_cit / sum(nb_cit),3) )][1:12] %>%
kable(caption = "Share of articles for the JEL Methodology corpus")
| Discipline | Journal | Share of articles |
|---|---|---|
| Economics | CAMBRIDGE JOURNAL OF ECONOMICS | 0.131 |
| Economics | JOURNAL OF ECONOMIC ISSUES | 0.067 |
| Economics | HISTORY OF POLITICAL ECONOMY | 0.065 |
| Economics | JOURNAL OF POST KEYNESIAN ECONOMICS | 0.038 |
| Economics | JOURNAL OF ECONOMIC BEHAVIOR AND ORGANIZATION | 0.034 |
| Economics | AMERICAN JOURNAL OF ECONOMICS & SOCIOLOGY | 0.032 |
| Economics | JOURNAL OF INSTITUTIONAL ECONOMICS | 0.032 |
| Other Social Sciences | CRITICAL REVIEW | 0.030 |
| Economics | JOURNAL OF INSTITUTIONAL AND THEORETICAL ECONOMICS | 0.029 |
| Economics | REVIEW OF SOCIAL ECONOMY | 0.027 |
| Other Social Sciences | SCIENCE AND SOCIETY | 0.025 |
| Economics | EUROPEAN JOURNAL OF THE HISTORY OF ECONOMIC THOUGHT | 0.023 |
biggest_j_not_in_econ <- top_citing_journals[Discipline != "Economics", first(Journal)]
rank_biggest_j_not_in_econ <- which(top_citing_journals$Journal == biggest_j_not_in_econ)
rm(dt_Articles,top_citing_disc,top_citing_journals)
It is noteworthy that disciplines other than economics together account for 22.5% of the articles, and that a journal such as Critical Review ranks 8th.
# Loading the specialized philo of econ corpus
load("/projects/digital_history/philo_and_economics/data/dt_ScopusCorpus_PhiEcon.RData")
art_spec_phi_econ <- dt_CorpusArticles[SO %in% c("ECONOMICS AND PHILOSOPHY", "JOURNAL OF ECONOMIC METHODOLOGY")]
rm(dt_CorpusArticles)
# Loading the JEL corpus
load("/projects/digital_history/philo_and_economics/data/dt_Articles_metho.RData")
load("/projects/digital_history/philo_and_economics/data/dt_citing_cited_metho.RData")
setkey(dt_Articles,ID_Art)
art_JEL_corpus <- dt_Articles[J(unique(dt_citing_cited$ID_Art))][ between(Year, first_y,last_y_metho) ]
rm(dt_citing_cited, dt_Articles)
# Combining the two corpora
art_combo <- rbind(art_spec_phi_econ[,list(Corpus = "Specialized Philosophy\nof Economics", Year = PY)],
art_JEL_corpus[,list(Corpus = "JEL Economic Methodology", Year)]
)
art_combo_n <- art_combo[Year >=1990,list(n= .N), by= .(Year,Corpus)]
ggplot(art_combo_n,aes(x=Year, y=n,color = Corpus)) + geom_line(lwd=1.5) + ylim(0,max(art_combo_n$n, na.rm = TRUE)) + ylab("Number of articles")
rm(art_spec_phi_econ,art_JEL_corpus,art_combo, art_combo_n)
One way to see how far apart the two corpora are is simply to compare the citation ranks:
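Before the full pipeline, a minimal sketch of the ranking convention applied below (most-cited first, with ties sharing the smallest rank), using invented citation counts:

```r
# Invented citation counts for five references in each corpus
n_philo <- c(40, 25, 25, 10, 2)
n_metho <- c( 5, 30,  8, 30, 1)

# rank() ranks ascending, so negate to put the most cited first;
# ties.method = "min" gives tied documents the smallest (best) shared rank
rank_phi   <- rank(-n_philo, ties.method = "min")   # 1 2 2 4 5
rank_metho <- rank(-n_metho, ties.method = "min")   # 4 1 3 1 5

# a document is "in both tops" when it clears the threshold in both corpora
in_both <- rank_phi <= 2 & rank_metho <= 2          # only the second document
```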
# Philo of econ
load("/projects/digital_history/philo_and_economics/data/dt_refs.RData")
dt_refs <- dt_refs[SO %in% c("ECONOMICS AND PHILOSOPHY", "JOURNAL OF ECONOMIC METHODOLOGY")]
# aggregating it:
refs_phi <- dt_refs[,list(n_philo = .N, First_surname = first(Author),
cited_year = first(Year), refs_philo = first(refs)),
by = unique_ref_id][order(-n_philo)]
setnames(refs_phi,"unique_ref_id","ID_phi")
rm(dt_refs)
# Correcting a few things:
refs_phi[grepl("KUHN",x = First_surname),cited_year := c(1970,cited_year[2:.N])]
refs_phi[grepl("ROBBINS",x = First_surname),cited_year := c(1935,cited_year[2:.N])]
refs_phi[grepl("MARSHALL",x = First_surname),cited_year := c(1920,cited_year[2:.N])]
refs_phi[grepl("MCCLOSKEY",x = First_surname),cited_year := c(1998,cited_year[2:.N])]
refs_phi[grepl("BHASKAR",x = First_surname),cited_year := c(1978,cited_year[2:.N])]
refs_phi[grepl("BLAUG",x = First_surname),cited_year := c(1992,cited_year[2:.N])]
refs_phi[grepl("HUME",x = First_surname),cited_year := c(1978,cited_year[2:.N])]
refs_phi[grepl("SCHUMPETER",x = First_surname) &
grepl("THEORY OF ECONOMIC DEVELOPMENT",x = refs_philo) ,
cited_year := c(1934,cited_year[2:.N])]
refs_phi[grepl("POPPER",x = First_surname) &
grepl("LOGIC",x = refs_philo) ,
`:=`(cited_year = c(rep(1968,3),cited_year[4:.N]),
ID_phi = c(rep(ID_phi[1],3),ID_phi[4:.N] ))]
refs_phi[grepl("SCHUMPETER",x = First_surname) &
grepl("CAPITALISM",x = refs_philo),
cited_year := c(1950)]
refs_phi[grepl("VEBLEN",x = First_surname) &
grepl("LEISURE",x = refs_philo),
cited_year := c(1994)]
refs_phi[grepl("SAYER",x = First_surname),cited_year := c(1992,cited_year[2:.N])]
refs_phi[grepl("JEVONS",x = First_surname),cited_year := c(1957,cited_year[2:.N])]
refs_phi[grepl("ROSENBERG",x = First_surname),cited_year := c(1992,cited_year[2:.N])]
refs_phi[grepl("FRIEDMAN",x = First_surname) &
grepl("POSITIVE EC",x = refs_philo) &
cited_year == 1953,
ID_phi := ID_phi[1]]
refs_phi[grepl("MARX",x = First_surname) &
grepl("CAPITAL",x = refs_philo),
`:=`(cited_year = 1970,
ID_phi = ID_phi[1])]#,cited_year := c(1992,cited_year[2:.N])]
refs_phi[,n_philo := sum(n_philo), by =ID_phi]
refs_phi <- refs_phi[order(-n_philo),head(.SD,1), by =ID_phi]
# JEL methodology
load( file = "/projects/digital_history/philo_and_economics/data/dt_refs_of_JEL_metho.RData")
setnames(dt_refs_of_JEL_metho, c("last_name", "times_cited"), c("First_surname","n_metho"))
dt_refs_of_JEL_metho <- dt_refs_of_JEL_metho[order(-n_metho), list(ID_metho = New_id2,
n_metho, First_surname, cited_year,
publication_metho = Publication, vol_metho = Volume,
page_metho = Page)]
# a bit of cleaning
dt_refs_of_JEL_metho[grepl("FRIEDMAN",x = First_surname) & cited_year == 1953, ID_metho := ID_metho[1]]
dt_refs_of_JEL_metho[grepl("SMITH",x = First_surname) & cited_year == 1776, ID_metho := ID_metho[1]]
dt_refs_of_JEL_metho[, n_metho := sum(n_metho), by = ID_metho]
dt_refs_of_JEL_metho <- dt_refs_of_JEL_metho[order(-n_metho), head(.SD,1), by = ID_metho]
# merging the two:
refs_combo <- merge(refs_phi,dt_refs_of_JEL_metho, by = c("First_surname", "cited_year"),all=TRUE)
refs_combo <- refs_combo[order(-n_philo,-n_metho)]
refs_combo[, dist := adist(publication_metho,refs_philo), by= .(ID_metho, ID_phi)]
setcolorder(refs_combo,c("n_philo", "n_metho", "dist", "First_surname", "cited_year","publication_metho", "refs_philo"))
refs_combo <- refs_combo[order(-n_metho,-n_philo,dist)]
max_id <- max(refs_combo$ID_metho,na.rm = TRUE)
refs_combo[is.na(ID_metho), `:=`(ID_metho = (max_id + (1:.N)), n_metho = 0)]
refs_combo <- refs_combo[,head(.SD,1),by=ID_metho]
refs_combo <- refs_combo[order(-n_philo, -n_metho,dist)]
max_id <- max(refs_combo$ID_phi,na.rm = TRUE)
refs_combo[is.na(ID_phi), `:=`(ID_phi = (max_id + (1:.N)), n_philo = 0)] # assign new ID_phi (not ID_metho) to unmatched references
refs_combo <- refs_combo[,head(.SD,1),by=ID_phi]
# Creating rank variable (giving the smallest rank to ties)
refs_combo[, `:=`(rank_phi = rank(-n_philo,ties.method="min"),
rank_metho = rank(-n_metho,ties.method="min"))]
# The articles in top 50 of two corpora:
refs_combo <- refs_combo[order(rank_phi+ rank_metho)] # ordering by weighing each rank as much
n_top = 50
refs_combo[ rank_phi <= n_top & rank_metho <= n_top, `In both` := TRUE]
refs_combo[ is.na(`In both`), `In both` := FALSE]
# in_two_tops <- refs_combo[ rank_phi <= n_top & rank_metho <= n_top, list(`Rank Phi` = rank_phi,
# `Rank Meth` = rank_metho,
# `First Author` = First_surname %>% tolower %>% toTitleCase(),
# `Year` = cited_year,
# `Abbreviated Publication` = publication_metho %>% tolower %>% toTitleCase()) ]
in_tops <- refs_combo[ rank_phi <= n_top | rank_metho <= n_top,list(`In both`, `Rank Phi` = rank_phi,
`Rank Meth` = rank_metho,
`First Author` = First_surname %>% tolower %>% toTitleCase(),
`Year` = cited_year,
`Abbreviated Publication` = publication_metho %>% tolower %>% toTitleCase()) ]
in_tops %>%
datatable(options = list(ordering = TRUE),
caption = "Documents among the 50 most cited in at least one of the two corpora. Those with TRUE in the first column are in both top 50.")
# code to identify what to manually correct above
# refs_combo <- refs_combo[order(-n_philo, -n_metho,dist)]
# refs_combo[rank_metho > (rank_phi +1000),list(n_metho,rank_phi,
# rank_metho,First_surname,cited_year,
# publication_metho,refs_philo)]
# refs_combo[rank_phi <= 50,list(n_philo, n_metho,rank_phi,
# rank_metho,First_surname,cited_year,publication_metho,refs_philo)]
# refs_combo <- refs_combo[order(-n_metho,-n_philo,dist)]
# refs_combo[rank_metho <= 50,list(n_philo, n_metho,rank_phi,
# rank_metho,First_surname,cited_year,publication_metho,refs_philo)]
We test different community detection algorithms: Louvain, fast greedy, walktrap and infomap. We select the algorithm that produces the partition with the highest modularity score, which is Louvain in our case (as is typically the case).
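For intuition, modularity can be computed by hand from Newman's formula Q = sum over communities c of (e_cc − a_c²), where e_cc is the fraction of edges inside community c and a_c the fraction of edge ends attached to c. A base-R sketch on an invented toy graph (the real computation below uses igraph's modularity()):

```r
# Undirected toy graph: two triangles {1,2,3} and {4,5,6} joined by one bridge (3-4)
edges <- rbind(c(1, 2), c(1, 3), c(2, 3), c(4, 5), c(4, 6), c(5, 6), c(3, 4))
membership <- c(1, 1, 1, 2, 2, 2)  # candidate partition: one community per triangle

modularity_by_hand <- function(edges, membership) {
  m  <- nrow(edges)                                            # number of edges
  cm <- cbind(membership[edges[, 1]], membership[edges[, 2]])  # community of each endpoint
  Q  <- 0
  for (cl in sort(unique(membership))) {
    e_cc <- sum(cm[, 1] == cl & cm[, 2] == cl) / m  # fraction of edges inside cl
    a_c  <- sum(cm == cl) / (2 * m)                 # fraction of edge ends in cl
    Q <- Q + (e_cc - a_c^2)
  }
  Q
}

Q_good <- modularity_by_hand(edges, membership)            # partition along the triangles
Q_bad  <- modularity_by_hand(edges, c(1, 2, 1, 2, 1, 2))   # scrambled partition
```

The triangle-respecting partition scores higher than the scrambled one, which is exactly the property the algorithm comparison below exploits.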
#dt_refs contains all articles references.
load("/projects/digital_history/philo_and_economics/data/dt_refs.RData")
dt_refs <- dt_refs[SO %in% c("ECONOMICS AND PHILOSOPHY", "JOURNAL OF ECONOMIC METHODOLOGY")]
#Giving an ID to every references
dt_references <- unique(dt_refs[,list(refs)])
dt_references$refID <- c(1:nrow(dt_references))
#Merging the refID back on a single table
dt_edges <- merge(dt_refs[, list(ID, refs, PY)], dt_references, by = "refs")[,list(ID, refID, PY)]
#Getting rid of references that appear only once: filter on raw rows, deduplicate the
#citing-cited pairs, then keep only references cited by more than one distinct article
dt_edges <- unique(dt_edges[,N := .N, by = "refID"][N > 1][, list(ID, refID, PY)])
dt_edges <- dt_edges[,N := .N, by = "refID"][N > 1][, list(ID, refID, PY)]
bib_coup <- bibliographic_coupling(dt_edges, "ID", "refID")
#Getting louvain communities from the bibliographic coupling table
graph <- graph_from_data_frame(bib_coup, directed=FALSE, vertices=NULL)
louvain_result <- cluster_louvain(graph, weights = bib_coup$N)
fast_greedy <- cluster_fast_greedy(graph, weights = bib_coup$N)
walktrap <- cluster_walktrap(graph, weights = bib_coup$N)
infomap <- cluster_infomap(graph, e.weights = bib_coup$N)
#What algorithm has the best modularity?
modularity_value <- c(modularity(louvain_result),
modularity(fast_greedy),
modularity(walktrap),
modularity(infomap))
#Naming the modularity results
clustering_algorithm <- c("Louvain", "Fast Greedy", "Walktrap", "Infomap")
modularity_test_results <- data.table(clustering_algorithm, modularity_value)
kable(modularity_test_results, caption = "Modularity value for different clustering algorithms on our corpus")
| clustering_algorithm | modularity_value |
|---|---|
| Louvain | 0.3772045 |
| Fast Greedy | 0.3341597 |
| Walktrap | 0.3381432 |
| Infomap | 0.2532636 |
#Formatting the results
article_and_com_id <- data.table()
for (i in (1:length(louvain_result)))
{
com <- data.table(ID = louvain_result[[i]])
com$com_ID <- i
article_and_com_id <- rbind(article_and_com_id,com, fill = TRUE)
}
#Only keeping communities that have more than 10 articles
article_and_com_id <- article_and_com_id[,N:=.N, by="com_ID"][N > 10]
article_and_com_id$ID <- as.numeric(article_and_com_id$ID)
save(article_and_com_id, file = "/projects/digital_history/philo_and_economics/data/article_and_com_id.RData")
com_w_nb <- article_and_com_id[,.N,com_ID]
rm(article_and_com_id); rm(dt_edges); rm(dt_references)
After excluding clusters with 10 or fewer articles, we are left with 5 clusters with the following article distribution (labels were given manually based on the contents of each cluster, see below):
doc_topic_map <- data.table(document = 1:5, Topic = c("Moral\nPhilosophy","Big M","Decision\nTheory","Small m","Behavioral\nEconomics"))
com_w_nb <- merge(com_w_nb,doc_topic_map ,by.x = "com_ID", by.y= "document")
com_w_nb[order(-N),list(Cluster = str_replace(Topic, "\n"," "), `Number of articles`=N)]
Our first representation uses the most characteristic tokens in the titles of the articles of each cluster over the full period. We manually named the clusters based on these words and on perusing the documents they cite most often (see below).
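The "most characteristic tokens" are selected by tf-idf: a token scores high in a cluster when it is frequent there but used by few clusters. A minimal base-R sketch on invented token counts (the actual pipeline lives in the helper functions sourced from FCT_util.R):

```r
# Invented token counts for three clusters
counts <- rbind(
  "Big M"           = c(falsification = 5, rhetoric = 1, preference = 0),
  "Small m"         = c(falsification = 1, rhetoric = 6, preference = 0),
  "Decision Theory" = c(falsification = 0, rhetoric = 0, preference = 7)
)

tf    <- counts / rowSums(counts)   # term frequency within each cluster
df    <- colSums(counts > 0)        # number of clusters using each token
idf   <- log(nrow(counts) / df)     # tokens used by few clusters score high
tfidf <- sweep(tf, 2, idf, `*`)     # tf-idf weight of each token in each cluster
```

Here 'preference', exclusive to the Decision Theory cluster, gets the largest weight there; that exclusivity is what makes a token "characteristic" of a cluster.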
load("/projects/digital_history/philo_and_economics/data/article_and_com_id.RData")
load("/projects/digital_history/philo_and_economics/data/dt_ScopusCorpus_PhiEcon.RData")
Communities <- merge(dt_CorpusArticles[,list(ID, TI,PY,AU)], article_and_com_id[,list(ID, com_ID)], by="ID")
setnames(Communities, c("TI", "com_ID"), c("titre", "modularity_class"))
corpus <- cleaning_corpus(Communities, "titre")
imp_bigrams <- find_most_significant_bigram(corpus)
#Generating the unigram table (all unigrams for each document that aren't part of the significant bigrams we found)
path_dic_philo <- "/projects/digital_history/philo_and_economics/data/dictionary_philo.RData"
dt_ngram_occurences <- make_unigram_and_bigram_occurence_table(corpus = corpus, imp_bigrams = imp_bigrams,
unstem_dictionary_path = path_dic_philo)
dt_ngram_occurences[,conversion := conversion_table[n_gram]][!is.na(conversion), n_gram := conversion]
dt_ngram_occurences[,conversion := NULL]
#Naming topics
dt_ngram_occurences <- merge(dt_ngram_occurences,doc_topic_map ,by = "document", all.x = TRUE)
plot_tfidf(dt_ngram_occurences, colors = colors_cluster, order_disc = order_disc_philo, title_graph = "")
From our position as participants in the field, the sets of keywords are quite telling and our labelling of each cluster was straightforward. Only a few keywords might be surprising to an insider. We add a subsection investigating why these keywords are present.
Our first surprise is to find 'reversals' as the second most distinctive token of the cluster that we label Big M based on all the other information. We look at the titles containing this token:
# Checking "reversal" in Big M
token <- "reversa"
com <- "Big M"
revers_papers <-
Communities[
modularity_class ==doc_topic_map [grepl(com, Topic), document] &
grepl(pattern = token, x = toTitleCase(titre),ignore.case = T),
list(
Author = AU, Year = PY, Title = toTitleCase(tolower(titre))
)
][order(Year)]
kable(revers_papers)
| Author | Year | Title |
|---|---|---|
| TAMMI T | 1999 | Incentives and Preference Reversals: Escape Moves and Community Decisions in Experimental Economics |
| GUALA F | 2000 | Artefacts in Experimental Economics: Preference Reversals and the Beckerdegrootmarschak Mechanism |
| ANGNER E | 2002 | Levi's Account of Preference Reversals |
We see that three titles in the Big M cluster contain this keyword, while no title in the other clusters does. These papers might not be best described as Big M, but clustering algorithms necessarily make some debatable allocations by creating mutually exclusive sets among shades of grey. Two of these authors are well known in the field (Angner and Guala), so we can verify that their papers tend to be put in the cluster that is closest to their main topic of research: Behavioral Economics (and, by extension, experimental economics):
# Our three relevant authors
aut_rev <- c("TAMMI T", "GUALA F", "ANGNER E")
aut_rev <- merge( Communities[
# modularity_class ==doc_topic_map[grepl(com, Topic), document] &
grepl(pattern = paste0(aut_rev,collapse = "|"), x = AU,ignore.case = T),],
doc_topic_map[,list(modularity_class =document, Cluster = str_replace(Topic, "\n", " "))],
by = "modularity_class")
# correcting AU for cases where Guala coauthors with others:
one_auth <- "GUALA F"
aut_rev[grepl(one_auth,x = AU), AU := one_auth]
setnames(aut_rev,"AU", "Scholar")
aut_rev[,list(`Number of papers` = .N),by = .(Scholar,Cluster)][order(Scholar,-`Number of papers`)] #%>% kable()
We indeed find that the general allocation fits our prior understanding of these scholars' profiles.
Now, our Decision Theory cluster includes two potentially surprising surnames: Keynes and Soros. We look at the keyword "Keynes's" first:
# Checking "Keynes's" in Decision Theory
token <- "Keynes's"
com <- "Decision"
the_papers <-
Communities[
modularity_class ==doc_topic_map[grepl(com, Topic), document] &
grepl(pattern = token, x = toTitleCase(titre),ignore.case = T),
list(
Author = AU, Year = PY, Title = toTitleCase(tolower(titre))
)
][order(Year)]
kable(the_papers)
| Author | Year | Title |
|---|---|---|
| COTTRELL A | 1993 | Keynes's Theory of Probability and Its Relevance to His Economics: Three Theses |
| CHICK V | 2003 | Theory, Method and Mode of Thought in Keynes's General Theory |
| DOW SC;GHOSH D | 2009 | Fuzzy Logic and Keynes's Speculative Demand for Money |
So there are three titles in Decision Theory that have this keyword, and no title in the other clusters has it. We can intuitively understand why these articles are in Decision Theory: at least two of them have a formal dimension (either Keynes's theory of probability or fuzzy logic). Finally, "Soros" is included because of a 2013 symposium on his work on the importance of 'reflexivity' in economic decisions:
# Checking "Soros" in Decision Theory
token <- "Soros"
com <- "Decision"
the_papers <-
Communities[
modularity_class ==doc_topic_map[grepl(com, Topic), document] &
grepl(pattern = token, x = toTitleCase(titre),ignore.case = T),
list(
Author = AU, Year = PY, Title = toTitleCase(tolower(titre))
)
][order(Year)]
kable(the_papers)
| Author | Year | Title |
|---|---|---|
| CALDWELL B | 2013 | George Soros: Hayekian? |
| NOTTURNO MA | 2013 | Soros and Popper: On Fallibility, Reflexivity, and the Unity of Method |
| HANDS DW | 2013 | Introduction to Symposium on 'Reflexivity and Economics: George Soros's Theory of Reflexivity and the Methodology of Economic Science' |
| DAVIS JB | 2013 | Soros's Reflexivity Concept in a Complex World: Cauchy Distributions, Rational Expectations, and Rational Addiction |
| CROSS R;HUTCHINSON H;LAMBA H;STRACHAN D | 2013 | Reflections on Soros: Mach, Quine, Arthur and Far-from-Equilibrium Dynamics |
All in all, closer inspection of these titles has revealed extremely few anomalies. Our labelling of the clusters seems to rest on solid ground.
We can repeat the same tf-idf procedure for each decade in order to get an idea of the temporal evolution of topics.
load("/projects/digital_history/philo_and_economics/data/article_and_com_id.RData")
load("/projects/digital_history/philo_and_economics/data/dt_ScopusCorpus_PhiEcon.RData")
Communities <- merge(dt_CorpusArticles[,list(ID, TI)], article_and_com_id[,list(ID, com_ID)], by="ID")
setnames(Communities, c("TI", "com_ID"), c("titre", "modularity_class"))
#merging Publication Year and Author
Communities <- merge(Communities, dt_CorpusArticles[,list(ID,AU, PY)], by = "ID")
#Adding decade column
for (i in c(1990,2000,2010)){
Communities[between(PY,i,i+9),decade:=i]
}
for(d in unique(Communities[!is.na(decade)]$decade)){
corpus <- cleaning_corpus(Communities[decade == d], "titre")
imp_bigrams <- find_most_significant_bigram(corpus)
#Generating the unigram table (all unigrams for each document that aren't part of the significant bigrams we found)
path_dic_philo <- "/projects/digital_history/philo_and_economics/data/dictionary_philo.RData"
dt_ngram_occurences <- make_unigram_and_bigram_occurence_table(corpus = corpus, imp_bigrams = imp_bigrams,
unstem_dictionary_path = path_dic_philo)
dt_ngram_occurences[,conversion := conversion_table[n_gram]][!is.na(conversion), n_gram := conversion]
dt_ngram_occurences[,conversion := NULL]
#Naming topics
dt_ngram_occurences <- merge(dt_ngram_occurences,doc_topic_map,by = "document", all.x = TRUE)
print(plot_tfidf(dt_ngram_occurences,title_graph = paste0("TF-IDF for ",d, " to ", d+9)
,colors = colors_cluster, order_disc = order_disc_philo))
}
We produce a set of tables about the citations of each cluster. The first table is the one we use in the chapter: we produce an HTML version and a LaTeX version for the paper. We then present the same results in a long format (with full citation information). The last table shows not what each cluster cites most, but which articles in each cluster have received the most academic citations overall.
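The collapse-and-spread steps used below can be sketched in base R, with invented author~year strings; aggregate() plays the role of the data.table aggregation and reshape() that of tidyr::spread:

```r
# Invented 'Author~Year' strings for two clusters across two decades
refs <- data.frame(
  Cluster = c("Big M", "Big M", "Big M", "Small m"),
  decade  = c(1990, 1990, 2000, 1990),
  ref     = c("Friedman~1953", "Blaug~1992", "Hands~2001", "McCloskey~1985")
)

# collapse the references of each cluster-decade into one string...
agg <- aggregate(ref ~ Cluster + decade, data = refs, FUN = paste, collapse = " ")

# ...then pivot to one column per decade (base-R counterpart of tidyr::spread)
wide <- reshape(agg, idvar = "Cluster", timevar = "decade", direction = "wide")
# columns: Cluster, ref.1990, ref.2000
```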
load("/projects/digital_history/philo_and_economics/data/Communities_partition_30years.RData")
load("/projects/digital_history/philo_and_economics/data/dt_refs.RData")
Communities[, Topic := str_replace(Topic, "\n", " ")]
Communities <- merge(Communities, dt_refs[,list(refs, unique_ref_id)] %>% unique(), by = "refs", all.x = TRUE, all.y = FALSE)
#Top citations per community
top_ref_per_topic <- Communities[,list(nb_refs = .N), by = c("Topic", "unique_ref_id")]
top_ref_per_topic <- merge(top_ref_per_topic, dt_refs[,head(.SD, 1),
by = unique_ref_id][,list(unique_ref_id, Author, Year)],
by = "unique_ref_id")
setorder(top_ref_per_topic, Topic, -nb_refs)
#top_ref_per_topic[,refs := paste0(Author, "-", Year)]
top_ref <- unique(top_ref_per_topic[,list(Author, Year, Topic)])[,head(.SD, 5), by="Topic"]
top_ref[,refs := paste0(Author %>% tolower() %>% toTitleCase(), "~", Year)]
#Formatting the table so it has the form: author-date/author-date/author-date/..
top_ref <- top_ref[,aggregate(refs, list(Topic), paste0, collapse = " ")]
setnames(top_ref, c("Group.1", "x"), c("Cluster", "Full period"))
top_ref_full_period <- copy(top_ref)
#kable(top_ref, caption = "Top 5 of most cited documents")
#Top citations per decades
top_ref_per_topic_decade <- Communities[,list(nb_refs = .N), by = c("Topic", "unique_ref_id", "decade")]
top_ref_per_topic_decade <- merge(top_ref_per_topic_decade, dt_refs[,head(.SD, 1),
by = unique_ref_id][,list(unique_ref_id, Author, Year)],
by = "unique_ref_id")
setorder(top_ref_per_topic_decade, Topic, decade, -nb_refs)
#top_ref_per_topic_decade[,refs := paste0(Author, "-", Year)]
top_ref <- unique(top_ref_per_topic_decade[!is.na(decade),list(Author, Year, Topic, decade)])[,head(.SD, 5), by=.(Topic,decade)]
top_ref[,refs := paste0(Author %>% tolower() %>% toTitleCase(), "~",
Year)]
#Formatting the table so it has the form: author-date/author-date/author-date/..
top_ref <- top_ref[,aggregate(refs, list(Topic), paste, collapse = " "), by = "decade"]
setnames(top_ref, c("Group.1", "x"), c("Cluster", "reference"))
# Making a column per decade
top_ref <- spread(top_ref,decade, reference)
setnames(top_ref,c("1990","2000","2010"), c("1990-1999","2000-2009","2010-2019"))
# Getting the ordering as in the tf-idf and time series graph
top_ref <- merge(top_ref, data.table(Order = 1:length(order_disc_philo),
Cluster = str_replace(order_disc_philo, "\n"," ")),
by = "Cluster" )
# Bringing in full period data:
top_ref <- merge(top_ref, top_ref_full_period, by = "Cluster")
# ordering rows:
setorder(top_ref,Order)
top_ref$Order <- NULL
# ordering columns:
setcolorder(top_ref, c("Cluster", "Full period"))
# Printing
kable(top_ref, caption = "Top 5 of most cited documents per decade (compact format)")
| Cluster | Full period | 1990-1999 | 2000-2009 | 2010-2019 |
|---|---|---|---|---|
| Moral Philosophy | Rawls~1971 Nozick~1974 Parfit~1984 Sen~1970 Broome~1991 | Rawls~1971 Parfit~1984 Nozick~1974 Harsanyi~1955 Broome~1991 | Rawls~1971 Broome~1991 Nozick~1974 Scanlon~1998 Arrow~1951 | Rawls~1971 Sen~1999 Sen~1970 Broome~2004 Harsanyi~1955 |
| Behavioral Economics | Kahneman~1979 Savage~1954 Camerer~2005 Gul~2008 Ross~2005 | Kahneman~1979 Friedman~1953 Keynes~1971 Allais~1953 Becker~1993 | Kahneman~1979 Savage~1954 Ellsberg~1961 Smith~1982 Allais~1953 | Savage~1954 Kahneman~1979 Camerer~2005 Gul~2008 Kahneman~2011 |
| Big M | Hausman~1992 Friedman~1953 Mccloskey~1985 Blaug~1962 Robbins~1932 | Hausman~1992 Mccloskey~1985 Blaug~1962 Friedman~1953 Rosenberg~1993 | Hausman~1992 Friedman~1953 Hands~2001 Hutchison~1938 Blaug~1962 | Hausman~1992 Friedman~1953 Reiss~2012 Robbins~1932 Hands~2001 |
| Small m | Haavelmo~1944 Hoover~2001 Mccloskey~1985 Pearl~2001 Spirtes~2000 | Mccloskey~1985 Mirowski~1989 Cooley~1985 Engle~1987 Gilbert~1986 | Haavelmo~1944 Hoover~2000 Hendry~1995 Hoover~2001 Kuhn~1962 | Deaton~2010 Haavelmo~1944 Pearl~2001 Spirtes~2000 Hoover~2001 |
| Decision Theory | Keynes~1936 Luce~1957 Pearce~1984 Aumann~1976 Lewis~1969 | Keynes~1936 Binmore~1987 Selten~1975 Aumann~1976 Bernheim~1984 | Hollis~1998 Keynes~1921 Keynes~1936 Lewis~1969 Bernheim~1984 | Soros~2013 Bacharach~2006 Keynes~1936 Mackenzie~2008 Schelling~1960 |
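tidyr::spread(), used above to create one column per decade, has since been superseded by pivot_wider(); an equivalent call on toy data would be:

```r
library(tidyr)

top_ref_toy <- data.frame(Cluster = c("Big M", "Big M"),
                          decade = c(1990, 2000),
                          reference = c("Hausman~1992", "Friedman~1953"))
# Same result as spread(top_ref_toy, decade, reference)
wide <- pivot_wider(top_ref_toy, names_from = decade, values_from = reference)
```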
top_ref_compact <- top_ref
top_ref_compact %>% xtable(align= c("r|","L{0.14\\textwidth}", rep("L{0.17\\textwidth}",4)),
caption = "Most cited documents per cluster in the corpus of specialized philosophy of economics",
label = "tab:most_ref_phi") %>%
print(include.rownames=FALSE, sanitize.text.function = identity,
hline.after=-1:nrow(top_ref_compact), size = "small"
)
## % latex table generated in R 3.6.3 by xtable 1.8-4 package
## % Fri Oct 2 02:32:40 2020
## \begin{table}[ht]
## \centering
## \begingroup\small
## \begin{tabular}{L{0.14\textwidth}L{0.17\textwidth}L{0.17\textwidth}L{0.17\textwidth}L{0.17\textwidth}}
## \hline
## Cluster & Full period & 1990-1999 & 2000-2009 & 2010-2019 \\
## \hline
## Moral Philosophy & Rawls~1971 Nozick~1974 Parfit~1984 Sen~1970 Broome~1991 & Rawls~1971 Parfit~1984 Nozick~1974 Harsanyi~1955 Broome~1991 & Rawls~1971 Broome~1991 Nozick~1974 Scanlon~1998 Arrow~1951 & Rawls~1971 Sen~1999 Sen~1970 Broome~2004 Harsanyi~1955 \\
## \hline
## Behavioral Economics & Kahneman~1979 Savage~1954 Camerer~2005 Gul~2008 Ross~2005 & Kahneman~1979 Friedman~1953 Keynes~1971 Allais~1953 Becker~1993 & Kahneman~1979 Savage~1954 Ellsberg~1961 Smith~1982 Allais~1953 & Savage~1954 Kahneman~1979 Camerer~2005 Gul~2008 Kahneman~2011 \\
## \hline
## Big M & Hausman~1992 Friedman~1953 Mccloskey~1985 Blaug~1962 Robbins~1932 & Hausman~1992 Mccloskey~1985 Blaug~1962 Friedman~1953 Rosenberg~1993 & Hausman~1992 Friedman~1953 Hands~2001 Hutchison~1938 Blaug~1962 & Hausman~1992 Friedman~1953 Reiss~2012 Robbins~1932 Hands~2001 \\
## \hline
## Small m & Haavelmo~1944 Hoover~2001 Mccloskey~1985 Pearl~2001 Spirtes~2000 & Mccloskey~1985 Mirowski~1989 Cooley~1985 Engle~1987 Gilbert~1986 & Haavelmo~1944 Hoover~2000 Hendry~1995 Hoover~2001 Kuhn~1962 & Deaton~2010 Haavelmo~1944 Pearl~2001 Spirtes~2000 Hoover~2001 \\
## \hline
## Decision Theory & Keynes~1936 Luce~1957 Pearce~1984 Aumann~1976 Lewis~1969 & Keynes~1936 Binmore~1987 Selten~1975 Aumann~1976 Bernheim~1984 & Hollis~1998 Keynes~1921 Keynes~1936 Lewis~1969 Bernheim~1984 & Soros~2013 Bacharach~2006 Keynes~1936 Mackenzie~2008 Schelling~1960 \\
## \hline
## \end{tabular}
## \endgroup
## \caption{Most cited documents per cluster in the corpus of specialized philosophy of economics}
## \label{tab:most_ref_phi}
## \end{table}
# Same thing, but in "long format"
top_ref_per_topic_decade <- Communities[,list(
nb_refs = .N
), by = c("Topic", "unique_ref_id", "decade")]
setorder(top_ref_per_topic_decade, Topic, decade, -nb_refs)
top_ref_per_topic_decade <- merge(
top_ref_per_topic_decade[,head(.SD,5), by = c("Topic", "decade")],
dt_refs[,list(reference = first(refs)), by = unique_ref_id],
by = "unique_ref_id")
setorder(top_ref_per_topic_decade, Topic, decade,- nb_refs)
top_ref_per_topic_decade %>% datatable(caption = "Top 5 of most cited documents per decade (long format)")
#Top cited article in community
setorder(Communities, Topic, -TC)
datatable(unique(Communities[,list(AU, titre, TC, Cluster = Topic)])[,head(.SD,5), by= "Cluster"], caption = "Articles in each community that are the most cited in general")
One surprising result is the centrality of Keynes's General Theory to the cluster we labelled Decision Theory. Here are the 21 articles in the Decision Theory cluster that cite this book:
load("/projects/digital_history/philo_and_economics/data/Communities_partition_30years.RData")
load("/projects/digital_history/philo_and_economics/data/dt_refs.RData")
Communities[, Topic := str_replace(Topic, "\n", " ")]
Communities <- merge(Communities, dt_refs[,list(refs, unique_ref_id, Author,Year)] %>% unique(), by = "refs", all.x = TRUE, all.y = FALSE)
Communities[grepl("Decision", Topic) & Year == 1936 & Author == "KEYNES",
list(`First Author` = unlist(str_split(AU, ";"))[1],
Title = titre %>% tolower %>% toTitleCase(), Year = PY),by = ID][order(Year),
list(`First Author`,Year,Title)] %>%
kable(row.names = TRUE)
|  | First Author | Year | Title |
|---|---|---|---|
| 1 | COTTRELL A | 1993 | Keynes's Theory of Probability and Its Relevance to His Economics: Three Theses |
| 2 | COTTRELL A | 1995 | Intentionality and Economics |
| 3 | MORRIS S | 1995 | The Common Prior Assumption in Economic Theory |
| 4 | MAYER T | 1997 | The Rhetoric of Friedman's Quantity Theory Manifesto |
| 5 | SNOWDON B | 1998 | Transforming Macroeconomics: An Interview with Robert E. Lucas Jr. |
| 6 | VERCELLI A | 1999 | The Evolution of IS-LM Models: Empirical Evidence and Theoretical Presuppositions |
| 7 | FIORETTI G | 2001 | von Kries and the Other 'German Logicians': Non-Numerical Probabilities Before Keynes |
| 8 | CHICK V | 2003 | Theory, Method and Mode of Thought in Keynes's General Theory |
| 9 | CHICK V | 2005 | The Meaning of Open Systems |
| 10 | BACKHOUSE RE | 2009 | An Unfinished Manuscript by Terence Hutchison |
| 11 | WITT U | 2009 | Novelty and the Bounds of Unknowledge in Economics |
| 12 | WILSON MC | 2009 | Creativity, Probability and Uncertainty |
| 13 | DOW SC | 2009 | Fuzzy Logic and Keynes's Speculative Demand for Money |
| 14 | FRYDMAN R | 2013 | Fallibility in Formal Macroeconomics and Finance Theory |
| 15 | DAVIS JB | 2013 | Soros's Reflexivity Concept in a Complex World: Cauchy Distributions, Rational Expectations, and Rational Addiction |
| 16 | ROSENBERG A | 2013 | Reflexivity, Uncertainty and the Unity of Science |
| 17 | BRONK R | 2013 | Reflexivity Unpacked: Performativity, Uncertainty and Analytical Monocultures |
| 18 | GASPARD M | 2014 | Logic, Rationality and Knowledge in Ramsey's Thought: Reassessing 'Human Logic' |
| 19 | KOUMAKHOV R | 2014 | Conventionalism, Coordination, and Mental Models: From Poincaré to Simon |
| 20 | BALLANDONNE M | 2019 | The Historical Roots (1880–1950) of Recent Contributions (2000–2017) to Ecological Economics: Insights from Reference Publication Year Spectroscopy |
| 21 | IVAROLA L | 2019 | Alternative Consequences and Asymmetry of Results: their Importance for Policy Decision Making |
rm(dt_refs, Communities)
load("/projects/digital_history/philo_and_economics/data/article_and_com_id.RData")
load("/projects/digital_history/philo_and_economics/data/dt_ScopusCorpus_PhiEcon.RData")
load("/projects/digital_history/philo_and_economics/data/dt_refs.RData")
Communities <- merge(dt_CorpusArticles[,list(ID, TI)], article_and_com_id[,list(ID, com_ID)], by="ID")
setnames(Communities, c("TI", "com_ID"), c("titre", "modularity_class"))
#merging Publication Year, Author and Citation count (TC)
Communities <- merge(Communities, unique(dt_refs[,list(ID, PY, refs )]), by = "ID")
Communities <- merge(Communities, dt_CorpusArticles[,list(ID,AU,TC)], by = "ID")
#Adding decade column
for (i in c(1990,2000,2010))
{
Communities[between(PY,i,i+9),decade:=i]
}
#naming communities
Communities <- merge(Communities,doc_topic_map[,list(modularity_class = document, Topic)],by = "modularity_class", all.x = TRUE)
#We save that table so we can use it to find the top citations per cluster
save(Communities, file = "/projects/digital_history/philo_and_economics/data/Communities_partition_30years.RData")
#Plotting topics through time
plot_topic_thru_time(Communities, 1990, colors = colors_cluster)
## `geom_smooth()` using formula 'y ~ x'
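plot_topic_thru_time() is a project-specific function (see FCT_util.R); its core, yearly article counts per cluster smoothed with geom_smooth(), can be sketched on toy data:

```r
library(data.table)
library(ggplot2)

# Toy corpus: two clusters over ten years
toy <- data.table(PY = rep(1990:1999, 2),
                  Topic = rep(c("Big M", "Small m"), each = 10))
# Count articles per cluster and year, then smooth the series
counts <- toy[, .(n_articles = .N), by = .(Topic, PY)]
ggplot(counts, aes(PY, n_articles, color = Topic)) +
  geom_smooth(se = FALSE) + labs(x = "Year", y = "Number of articles")
```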
The Decision Theory cluster decreased markedly over the studied period. To test the hypothesis that work on the topic has simply shifted elsewhere, we look at the citation patterns of three game-theory classics heavily cited by the cluster: Luce and Raiffa (1957), Aumann (1976) and Pearce (1984). We use Web of Science data.
# Loading data fetched from WoS
cit_to_gt_classics <- fread("/projects/digital_history/philo_and_economics/data/2020-09-27_papers_citing_at_least_one_game_th_classic.tsv",
quote = "")
## Warning in fread("/projects/digital_history/philo_and_economics/data/
## 2020-09-27_papers_citing_at_least_one_game_th_classic.tsv", : Discarded single-
## line footer: <<Completion time: 2020-09-27T13:23:08.1770850-04:00>>
setnames(cit_to_gt_classics, "Annee_Bibliographique", "Year")
# journal code for Econ & Philo is 4721 and for JEM is 21784
# Coding refs from philo of econ and from the rest
name_phi_econ <- "E&P or JEM"; name_other <- "Other journals"
cit_to_gt_classics[Code_Revue %in% c(4721,21784), Source := name_phi_econ]
cit_to_gt_classics[is.na(Source), Source := name_other]
# year-source journal aggregation
last_year = 2018
agg_cit_to_gt <- cit_to_gt_classics[between(Year, first_y,last_year),list(nb_cit =.N),by=.(Source,Year)]
# filling up years with 0 citations
dt1 <- merge(data.table(Year = first_y:last_year),
agg_cit_to_gt[Source == name_phi_econ,], by= "Year", all = TRUE)
dt1[is.na(Source), `:=`(Source = name_phi_econ, nb_cit =0)]
dt2 <- merge(data.table(Year = first_y:last_year),
agg_cit_to_gt[Source == name_other,], by= "Year", all = TRUE)
dt2[is.na(Source), `:=`(Source = name_other, nb_cit =0)]
agg_cit_to_gt <- rbindlist(list(dt1,dt2))
rm(dt1, dt2)
ggplot( agg_cit_to_gt, aes(x = Year, y = nb_cit, color = Source)) + geom_smooth() +
labs(y = "Number of citations", title = "Citations to Luce and Raiffa (1957), Aumann (1976) or Pearce (1984)")
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
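The zero-filling of missing years above, done with two merges, can also be written in one step with data.table's CJ() cross-join. A sketch on toy data:

```r
library(data.table)

agg <- data.table(Source = c("E&P or JEM", "Other journals"),
                  Year = c(1990, 1991), nb_cit = c(3L, 7L))
# Right join onto all Source-Year combinations; absent pairs get NA, then 0
full <- agg[CJ(Source = unique(Source), Year = 1990:1992), on = .(Source, Year)]
full[is.na(nb_cit), nb_cit := 0L]
```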
So we see that citations to these documents have not dropped significantly. Let's see in which journals and fields the documents tend to be cited:
# Name of discipline in cit info:
cit_to_gt_classics <- merge(cit_to_gt_classics,
discipline_info[,list(Code_Discipline,Discipline = str_replace(discipline," \n |\n"," "))], by= "Code_Discipline")
# breakdown by disciplines over the full period
top_citing_disc <- cit_to_gt_classics[,list(nb_cit =.N),by=Discipline]
top_citing_disc[order(-nb_cit), list(Discipline, `Share of citations` = round(nb_cit / sum(nb_cit),3) )][1:10] %>%
kable(caption = "Share of citations to Luce and Raiffa (1957), Aumann (1976) or Pearce (1984)")
| Discipline | Share of citations |
|---|---|
| Economics | 0.354 |
| Management | 0.122 |
| Psychology | 0.087 |
| Computers & Operations Research | 0.082 |
| Philosophy and Science Studies | 0.059 |
| Political Science & Public Administration | 0.037 |
| Other Engineering and Technology | 0.034 |
| Mathematics | 0.033 |
| Law | 0.031 |
| International Relations | 0.029 |
# Looking at decades:
cit_to_gt_classics[between(Year,1990,1999),Decade := 1990]
cit_to_gt_classics[between(Year,2000,2009),Decade := 2000]
cit_to_gt_classics[between(Year,2010,2019),Decade := 2010]
# Aggregating by decade
top_citing_disc_decade <- cit_to_gt_classics[!is.na(Decade),list(nb_cit =.N),by=.(Decade,Discipline)][order(Decade,-nb_cit)]
top_citing_disc_decade[, `Share of citations` := round(nb_cit / sum(nb_cit),3), by= Decade]
top_citing_disc_decade[,.SD[1:5,list(Discipline, `Share of citations`)], by = Decade] %>%
kable(caption = "Share of citations to Luce and Raiffa (1957), Aumann (1976) or Pearce (1984)")
| Decade | Discipline | Share of citations |
|---|---|---|
| 1990 | Economics | 0.430 |
| 1990 | Management | 0.112 |
| 1990 | Computers & Operations Research | 0.091 |
| 1990 | Psychology | 0.067 |
| 1990 | Philosophy and Science Studies | 0.042 |
| 2000 | Economics | 0.361 |
| 2000 | Computers & Operations Research | 0.127 |
| 2000 | Management | 0.100 |
| 2000 | Psychology | 0.078 |
| 2000 | Philosophy and Science Studies | 0.069 |
| 2010 | Economics | 0.413 |
| 2010 | Management | 0.106 |
| 2010 | Philosophy and Science Studies | 0.102 |
| 2010 | Computers & Operations Research | 0.081 |
| 2010 | Psychology | 0.079 |
# And then also looking at journals most citing the papers
# Over the full period
top_citing_j <- cit_to_gt_classics[,list(nb_cit =.N),by=Revue]
top_citing_j[order(-nb_cit), list(Journal = Revue, `Share of citations` = round(nb_cit / sum(nb_cit),3) )][1:10] %>%
kable(caption = "Share of citations to Luce and Raiffa (1957), Aumann (1976) or Pearce (1984)")
| Journal | Share of citations |
|---|---|
| JOURNAL OF ECONOMIC THEORY | 0.037 |
| GAMES AND ECONOMIC BEHAVIOR | 0.033 |
| THEORY AND DECISION | 0.032 |
| ECONOMETRICA | 0.024 |
| INTERNATIONAL JOURNAL OF GAME THEORY | 0.015 |
| SYNTHESE | 0.015 |
| JOURNAL OF CONFLICT RESOLUTION | 0.013 |
| MATHEMATICAL SOCIAL SCIENCES | 0.013 |
| MANAGEMENT SCIENCE | 0.012 |
| ECONOMICS AND PHILOSOPHY | 0.010 |
# by decade
top_citing_j_decade <- cit_to_gt_classics[!is.na(Decade),list(nb_cit =.N),by=.(Decade,Revue)][order(Decade,-nb_cit)]
top_citing_j_decade[, `Share of citations` := round(nb_cit / sum(nb_cit),3), by= Decade]
top_citing_j_decade[,.SD[1:5,list(Journal = Revue, `Share of citations`)], by = Decade] %>%
kable(caption = "Share of citations to Luce and Raiffa (1957), Aumann (1976) or Pearce (1984)")
| Decade | Journal | Share of citations |
|---|---|---|
| 1990 | JOURNAL OF ECONOMIC THEORY | 0.062 |
| 1990 | GAMES AND ECONOMIC BEHAVIOR | 0.058 |
| 1990 | THEORY AND DECISION | 0.041 |
| 1990 | INTERNATIONAL JOURNAL OF GAME THEORY | 0.033 |
| 1990 | ECONOMETRICA | 0.028 |
| 2000 | GAMES AND ECONOMIC BEHAVIOR | 0.050 |
| 2000 | JOURNAL OF ECONOMIC THEORY | 0.038 |
| 2000 | THEORY AND DECISION | 0.025 |
| 2000 | ECONOMIC THEORY | 0.021 |
| 2000 | SYNTHESE | 0.020 |
| 2010 | GAMES AND ECONOMIC BEHAVIOR | 0.056 |
| 2010 | JOURNAL OF ECONOMIC THEORY | 0.034 |
| 2010 | SYNTHESE | 0.029 |
| 2010 | THEORY AND DECISION | 0.026 |
| 2010 | INTERNATIONAL JOURNAL OF GAME THEORY | 0.020 |
load("/projects/digital_history/philo_and_economics/data/article_and_com_id.RData")
load("/projects/digital_history/philo_and_economics/data/dt_ScopusCorpus_PhiEcon.RData")
Communities <- merge(dt_CorpusArticles[,list(ID, TI, AU, SO, PY,TC)], article_and_com_id[,list(ID, com_ID)], by="ID")
setnames(Communities, c("TI", "com_ID", "SO"), c("titre", "modularity_class", "Journal"))
journal_share_per_cluster <- Communities[,list(n =.N),by = .(modularity_class,Journal)]
journal_share_per_cluster[,share_cluster := n/sum(n),by=modularity_class]
#naming communities
journal_share_per_cluster <- merge(journal_share_per_cluster,doc_topic_map[,list(modularity_class = document, Topic = str_replace(Topic, "\n", " "))],by = "modularity_class", all.x = TRUE)
setorder(journal_share_per_cluster,modularity_class, Journal)
journal_share_per_cluster[,list(Topic,Journal, share_cluster)] %>%
kable()
| Topic | Journal | share_cluster |
|---|---|---|
| Moral Philosophy | ECONOMICS AND PHILOSOPHY | 0.8592058 |
| Moral Philosophy | JOURNAL OF ECONOMIC METHODOLOGY | 0.1407942 |
| Big M | ECONOMICS AND PHILOSOPHY | 0.1581395 |
| Big M | JOURNAL OF ECONOMIC METHODOLOGY | 0.8418605 |
| Decision Theory | ECONOMICS AND PHILOSOPHY | 0.4715026 |
| Decision Theory | JOURNAL OF ECONOMIC METHODOLOGY | 0.5284974 |
| Small m | ECONOMICS AND PHILOSOPHY | 0.1318681 |
| Small m | JOURNAL OF ECONOMIC METHODOLOGY | 0.8681319 |
| Behavioral Economics | ECONOMICS AND PHILOSOPHY | 0.3468208 |
| Behavioral Economics | JOURNAL OF ECONOMIC METHODOLOGY | 0.6531792 |
We use the Louvain algorithm once again, since it gave the best results on the other corpus.
load("/projects/digital_history/philo_and_economics/data/dt_citing_cited_metho.RData")
load("/projects/digital_history/philo_and_economics/data/dt_Articles_metho.RData")
# Constraining to documents in corpus and in temporal bounds
dt_citing_cited <- dt_citing_cited[ID_Art %in% dt_Articles[between(Year,first_y,last_y_metho) #& Code_Discipline == 119
,ID_Art]]
# Removing references cited only once (because they cannot contribute to bibliographic coupling)
i <- dt_citing_cited[,.N,by = New_id2][N>1,New_id2]
dt_ref <- dt_citing_cited[New_id2 %in% i]
bib_coup <- bibliographic_coupling(dt_ref, "ID_Art", "New_id2")
graph <- graph_from_data_frame(bib_coup, directed=FALSE, vertices=NULL)
louvain_result <- cluster_louvain(graph, weights = bib_coup$N)
#Formatting the results
article_and_com_id <- data.table()
for (i in (1:length(louvain_result)))
{
com <- data.table(ID = louvain_result[[i]])
com$com_ID <- i
article_and_com_id <- rbind(article_and_com_id,com, fill = TRUE)
}
#Only keeping communities that have more than 10 articles
article_and_com_id <- article_and_com_id[,N:=.N, by="com_ID"][N > 10]
article_and_com_id$ID <- as.numeric(article_and_com_id$ID)
save(article_and_com_id, file = "/projects/digital_history/philo_and_economics/data/article_and_com_id_metho.RData")
com_w_nb <- article_and_com_id[,.N,com_ID]
rm(article_and_com_id); rm(louvain_result); rm(graph)
com_w_nb <- merge(com_w_nb,JEL_doc_topic_map ,by.x = "com_ID", by.y= "document")
com_w_nb[order(-N),list(Cluster = str_replace(Topic, "\n"," "), `Number of articles`=N)]
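bibliographic_coupling() is a project-specific function (see FCT_util.R); its core computation, counting shared references between pairs of citing articles, can be sketched with a data.table self-join (toy data, with column names matching the real call):

```r
library(data.table)

# Toy citing-cited edge list: articles A, B, C citing references r1-r3
dt_ref <- data.table(ID_Art  = c("A", "A", "B", "B", "C"),
                     New_id2 = c("r1", "r2", "r1", "r3", "r1"))
# Pair up articles citing the same reference, then count shared references
pairs <- merge(dt_ref, dt_ref, by = "New_id2", allow.cartesian = TRUE)
bib_coup <- pairs[ID_Art.x < ID_Art.y, .(N = .N), by = .(ID_Art.x, ID_Art.y)]
# Here every pair (A-B, A-C, B-C) shares exactly one reference, r1
```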
We again produce the keywords over the full period. We manually named the clusters based on these words and on perusing the documents they cite most often (see below).
load("/projects/digital_history/philo_and_economics/data/article_and_com_id_metho.RData")
load("/projects/digital_history/philo_and_economics/data/dt_Articles_metho.RData")
Communities <- merge(dt_Articles[,list(ID_Art, Titre, Year)], article_and_com_id[,list(ID, com_ID)], by.x="ID_Art", by.y = "ID")
setnames(Communities, c("ID_Art", "Titre", "com_ID", "Year"), c("ID", "titre", "modularity_class", "PY"))
setkey(Communities, modularity_class)
#Making a dictionary in order to unstem our tf-idf later on
path_dic_metho <- "/projects/digital_history/philo_and_economics/data/dictionary_metho.RData"
make_unstem_dictionary(Communities, path_to_save_to = path_dic_metho, col_name = "titre")
corpus <- cleaning_corpus(table = Communities, col_name = "titre")
imp_bigrams <- find_most_significant_bigram(corpus)
#Generating the unigram table (all unigrams for each document that are not part of the significant bigrams we found)
dt_ngram_occurences <- make_unigram_and_bigram_occurence_table(corpus = corpus, imp_bigrams = imp_bigrams,
unstem_dictionary_path = path_dic_metho)
dt_ngram_occurences[,conversion := conversion_table[n_gram]][!is.na(conversion), n_gram := conversion]
dt_ngram_occurences[,conversion := NULL]
#Naming topics
dt_ngram_occurences <- merge(dt_ngram_occurences,JEL_doc_topic_map ,by = "document", all.x = TRUE)
#Plotting the tf-idf
plot_tfidf(dt_ngram_occurences, colors = colors_cluster, order_disc = order_disc_metho, title_graph = "")
rm(dt_ngram_occurences)
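plot_tfidf() is also project-specific; the score it plots can be reproduced with tidytext::bind_tf_idf() on an n-gram count table. A sketch with invented counts:

```r
library(data.table)
library(tidytext)

# Toy n-gram counts: one row per (cluster, term)
dt <- data.table(document = c(1, 1, 2, 2),
                 n_gram   = c("realism", "ontology", "rhetoric", "ontology"),
                 n        = c(5, 2, 4, 1))
# tf-idf is high for terms frequent in one cluster but rare across clusters
dt <- bind_tf_idf(dt, term = n_gram, document = document, n = n)
```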
We could check various keywords here, but the patterns are rather clear without digging further. See especially the main references and the main sources below.
Checking the use of "conning", "Shiller", "Sen's" and "Sraffa" in Small m:
load("/projects/digital_history/philo_and_economics/data/Communities_metho.RData")
# Checking "conning", "Shiller", "Sen's" and "Sraffa" in the Small m cluster
token <- "conning|shiller|sen's|sraffa"
com <- "Small"
revers_papers <-
Communities[
modularity_class ==JEL_doc_topic_map[grepl(com, Topic), document] &
grepl(pattern = token, x = toTitleCase(titre),ignore.case = T),
list(
#Author = AU,
Year = PY, Title = toTitleCase(tolower(titre))
)
][order(Year)]
kable(revers_papers)
| Year | Title |
|---|---|
| 2006 | A Comment on Sen's 'Sraffa, Wittgenstein, and Gramsci' |
| 2007 | Variations on the Theme of Conning in Mathematical Economics |
| 2012 | Piero Sraffa and 'The True Object of Economics': The Role of the Unpublished Manuscripts |
| 2012 | Piero Sraffa and the Future of Economics |
| 2012 | The Change in Sraffa's Philosophical Thinking |
| 2013 | Rational Expectations: Retrospect and Prospect a Panel Discussion with Michael Lovell, Robert Lucas, Dale Mortensen, Robert Shiller, and Neil Wallace |
| 2017 | Market Sociality: Mirowski, Shiller and the Tension Between Mimetic and Anti-Mimetic Market Features |
Now the keywords for each decade.
load("/projects/digital_history/philo_and_economics/data/article_and_com_id_metho.RData")
load("/projects/digital_history/philo_and_economics/data/dt_Articles_metho.RData")
Communities <- merge(dt_Articles[,list(ID_Art, Titre, Year)], article_and_com_id[,list(ID, com_ID)], by.x="ID_Art", by.y = "ID")
setnames(Communities, c("ID_Art", "Titre", "com_ID", "Year"), c("ID", "titre", "modularity_class", "PY"))
setkey(Communities, modularity_class)
#Adding decade column
for (i in c(1990,2000,2010)){
Communities[between(PY,i,i+9),decade:=i]
}
path_dic_metho <- "/projects/digital_history/philo_and_economics/data/dictionary_metho.RData"
for(d in unique(Communities[!is.na(decade)]$decade)){
com_decade <- Communities[decade == d]
make_unstem_dictionary(com_decade, path_to_save_to = path_dic_metho, col_name = "titre")
corpus <- cleaning_corpus(Communities[decade == d], "titre")
imp_bigrams <- find_most_significant_bigram(corpus)
#Generating the unigram table (all unigrams for each document that are not part of the significant bigrams we found)
dt_ngram_occurences <- make_unigram_and_bigram_occurence_table(corpus = corpus, imp_bigrams = imp_bigrams,
unstem_dictionary_path = path_dic_metho)
dt_ngram_occurences[,conversion := conversion_table[n_gram]][!is.na(conversion), n_gram := conversion]
dt_ngram_occurences[,conversion := NULL]
#Naming topics
dt_ngram_occurences <- merge(dt_ngram_occurences,JEL_doc_topic_map ,by = "document", all.x = TRUE)
print(plot_tfidf(dt_ngram_occurences,title_graph = paste0("TF-IDF for ",d, " to ", ifelse(d==2010,d+8,d+9))
,colors = colors_cluster, order_disc = order_disc_metho))
}
We produce the set of tables about the citations for each cluster in a similar manner to what we did for the corpus on specialized philosophy of economics.
load("/projects/digital_history/philo_and_economics/data/article_and_com_id_metho.RData")
load("/projects/digital_history/philo_and_economics/data/dt_Articles_metho.RData")
Communities <- merge(dt_Articles[,list(ID = ID_Art, titre= Titre, PY = Year)],
article_and_com_id[,list(ID, modularity_class = com_ID)], by="ID")
#Adding decade column
for (i in c(1990,2000,2010))
{
Communities[between(PY,i,i+9),decade:=i]
}
#naming communities
Communities <- merge(Communities,JEL_doc_topic_map[,list(modularity_class = document, Topic)],by = "modularity_class", all.x = TRUE)
save(Communities, file = "/projects/digital_history/philo_and_economics/data/Communities_metho.RData")
# Loading reference data
load("/projects/digital_history/philo_and_economics/data/dt_citing_cited_metho.RData")
load("/projects/digital_history/philo_and_economics/data/dt_refs_of_JEL_metho.RData")
Communities[, Topic := str_replace(Topic, "\n", " ")]
Communities <- merge(Communities[,list(ID, modularity_class, PY, decade, Topic)], dt_citing_cited %>% unique(), by.x = "ID", by.y = "ID_Art", all.x = TRUE, all.y = FALSE)
Communities <- merge(Communities,dt_refs_of_JEL_metho, by = "New_id2", all.x = TRUE)
#Top citations per community
top_ref_per_topic <- Communities[,list(nb_refs = .N, Author = first(First_author), Year = first(cited_year)), by = c("Topic", "New_id2")]
setorder(top_ref_per_topic, Topic, -nb_refs)
#top_ref_per_topic[,refs := paste0(Author, "-", Year)]
top_ref <- unique(top_ref_per_topic[,list(Author, Year, Topic)])[,head(.SD, 5), by="Topic"]
top_ref[,c("last","init") := tstrsplit(Author,"-")]
top_ref[,refs := paste0(toTitleCase(tolower(last)),"-", substr(init,0,1), "~", Year)]
#Formatting the table so it has the form: author-date/author-date/author-date/..
top_ref <- top_ref[,aggregate(refs, list(Topic), paste0, collapse = " ")]
setnames(top_ref, c("Group.1", "x"), c("Cluster", "Full period"))
top_ref_full_period <- copy(top_ref)
#kable(top_ref, caption = "Top 5 of most cited documents")
#Top citations per decades
top_ref_per_topic_decade <- Communities[,list(nb_refs = .N,
Author = first(First_author), Year = first(cited_year)
), by = c("Topic", "New_id2", "decade")]
setorder(top_ref_per_topic_decade, Topic, decade, -nb_refs)
#top_ref_per_topic_decade[,refs := paste0(Author, "-", Year)]
top_ref <- unique(top_ref_per_topic_decade[!is.na(decade),list(Author, Year, Topic, decade)])[,head(.SD, 5), by=.(Topic,decade)]
top_ref[,c("last","init") := tstrsplit(Author,"-")]
top_ref[,refs := paste0(toTitleCase(tolower(last)),"-", substr(init,0,1), "~", Year)]
#Formatting the table so it has the form: author-date/author-date/author-date/..
top_ref <- top_ref[,aggregate(refs, list(Topic), paste, collapse = " "), by = "decade"]
setnames(top_ref, c("Group.1", "x"), c("Cluster", "reference"))
# Making a column per decade
top_ref <- spread(top_ref,decade, reference)
setnames(top_ref,c("1990","2000","2010"), c("1990-1999","2000-2009","2010-2018"))
# Getting the ordering as in the tf-idf and time series graph
top_ref <- merge(top_ref, data.table(Order = 1:length(order_disc_metho),
Cluster = str_replace(order_disc_metho, "\n"," ")),
by = "Cluster" )
# Bringing in full period data:
top_ref <- merge(top_ref, top_ref_full_period, by = "Cluster")
# ordering rows:
setorder(top_ref,Order)
top_ref$Order <- NULL
# ordering columns:
setcolorder(top_ref, c("Cluster", "Full period"))
# Printing
kable(top_ref, caption = "Top 5 of most cited documents per decade (compact format)")
| Cluster | Full period | 1990-1999 | 2000-2009 | 2010-2018 |
|---|---|---|---|---|
| Institutional Economics | Nelson-R~1982 North-D~1990 Robbins-L~1935 Marshall-A~1920 Smith-A~1776 | Nelson-R~1982 Williamson-O~1985 Veblen-T~1919 Marshall-A~1920 Williamson-O~1975 | Nelson-R~1982 North-D~1990 Hayek-F~1948 Robbins-L~1935 Marshall-A~1920 | Robbins-L~1935 North-D~1990 Smith-A~1776 Marshall-A~1920 Nelson-R~1982 |
| Critical Realism | Lawson-T~1997 Lawson-T~2003 Bhaskar-R~1978 Bhaskar-R~1989 Fleetwood-S~1999 | Lawson-T~1997 Bhaskar-R~1978 Bhaskar-R~1989 Lawson-T~1994 Bhaskar-R~1986 | Lawson-T~1997 Lawson-T~2003 Bhaskar-R~1978 Bhaskar-R~1989 Fleetwood-S~1999 | Lawson-T~1997 Lawson-T~2003 Lawson-T~2012 Lawson-T~2006 Bhaskar-R~1978 |
| Political Economy | Searle-J~1995 Marx-K~1970 Marx-K~1973 Wendt-A~1999 George-A~2005 | Marx-K~1970 Marx-K~1973 Ollman-B~1993 Hegel-G~1969 Cohen-G~1978 | Searle-J~1995 Searle-J~1983 Searle-J~1969 Searle-J~1990 Tuomela-R~1995 | Searle-J~1995 George-A~2005 Searle-J~2010 Wendt-A~1999 King-G~1994 |
| Big M | Friedman-M~1953 Kuhn-T~1970 Mccloskey-D~1998 Blaug-M~1992 Popper-K~1968 | Friedman-M~1953 Kuhn-T~1970 Mccloskey-D~1998 Blaug-M~1992 Popper-K~1968 | Friedman-M~1953 Kuhn-T~1970 Popper-K~1968 Caldwell-B~1982 Mccloskey-D~1998 | Friedman-M~1953 Kuhn-T~1970 Mccloskey-D~1998 Popper-K~1968 Keynes-J~1936 |
| Small m | Leamer-E~1983 Keynes-J~1936 Lucas-R~1981 Sraffa-P~1960 Leamer-E~1978 | Stokey-N~1989 Davidson-P~1982 Arrow-K~1971 Keynes-J~1936 Leamer-E~1978 | Keynes-J~1936 Leamer-E~1983 Sraffa-P~1960 Arrow-K~1971 Schwartz-J~1986 | Leamer-E~1983 Keynes-J~1936 Lucas-R~1976 Sims-C~1980 Lucas-R~1981 |
| History of Economics | Schumpeter-J~1954 Marshall-A~1920 Smith-A~1776 Schumpeter-J~1934 Schumpeter-J~1950 | Schumpeter-J~1954 Hayek-F~1948 Marshall-A~1920 Smith-A~1776 Becker-G~1976 | Schumpeter-J~1954 Blaug-M~1985 Blaug-M~1980 Mill-J~1848 Hayek-F~1967 | Schumpeter-J~1954 Keynes-J~1936 Smith-A~1776 Schumpeter-J~1934 Marshall-A~1920 |
top_ref_compact <- top_ref
top_ref_compact %>% xtable(align= c("r|","L{0.14\\textwidth}", rep("L{0.20\\textwidth}",4)),
caption = "Most cited documents per cluster in the corpus of JEL code 'Economic Methodology'",
label = "tab:most_ref_metho") %>%
print(include.rownames=FALSE, sanitize.text.function = identity,
hline.after=-1:nrow(top_ref_compact), size = "small"
)
## % latex table generated in R 3.6.3 by xtable 1.8-4 package
## % Fri Oct 2 02:26:05 2020
## \begin{table}[ht]
## \centering
## \begingroup\small
## \begin{tabular}{L{0.14\textwidth}L{0.20\textwidth}L{0.20\textwidth}L{0.20\textwidth}L{0.20\textwidth}}
## \hline
## Cluster & Full period & 1990-1999 & 2000-2009 & 2010-2018 \\
## \hline
## Institutional Economics & Nelson-R~1982 North-D~1990 Robbins-L~1935 Marshall-A~1920 Smith-A~1776 & Nelson-R~1982 Williamson-O~1985 Veblen-T~1919 Marshall-A~1920 Williamson-O~1975 & Nelson-R~1982 North-D~1990 Hayek-F~1948 Robbins-L~1935 Marshall-A~1920 & Robbins-L~1935 North-D~1990 Smith-A~1776 Marshall-A~1920 Nelson-R~1982 \\
## \hline
## Critical Realism & Lawson-T~1997 Lawson-T~2003 Bhaskar-R~1978 Bhaskar-R~1989 Fleetwood-S~1999 & Lawson-T~1997 Bhaskar-R~1978 Bhaskar-R~1989 Lawson-T~1994 Bhaskar-R~1986 & Lawson-T~1997 Lawson-T~2003 Bhaskar-R~1978 Bhaskar-R~1989 Fleetwood-S~1999 & Lawson-T~1997 Lawson-T~2003 Lawson-T~2012 Lawson-T~2006 Bhaskar-R~1978 \\
## \hline
## Political Economy & Searle-J~1995 Marx-K~1970 Marx-K~1973 Wendt-A~1999 George-A~2005 & Marx-K~1970 Marx-K~1973 Ollman-B~1993 Hegel-G~1969 Cohen-G~1978 & Searle-J~1995 Searle-J~1983 Searle-J~1969 Searle-J~1990 Tuomela-R~1995 & Searle-J~1995 George-A~2005 Searle-J~2010 Wendt-A~1999 King-G~1994 \\
## \hline
## Big M & Friedman-M~1953 Kuhn-T~1970 Mccloskey-D~1998 Blaug-M~1992 Popper-K~1968 & Friedman-M~1953 Kuhn-T~1970 Mccloskey-D~1998 Blaug-M~1992 Popper-K~1968 & Friedman-M~1953 Kuhn-T~1970 Popper-K~1968 Caldwell-B~1982 Mccloskey-D~1998 & Friedman-M~1953 Kuhn-T~1970 Mccloskey-D~1998 Popper-K~1968 Keynes-J~1936 \\
## \hline
## Small m & Leamer-E~1983 Keynes-J~1936 Lucas-R~1981 Sraffa-P~1960 Leamer-E~1978 & Stokey-N~1989 Davidson-P~1982 Arrow-K~1971 Keynes-J~1936 Leamer-E~1978 & Keynes-J~1936 Leamer-E~1983 Sraffa-P~1960 Arrow-K~1971 Schwartz-J~1986 & Leamer-E~1983 Keynes-J~1936 Lucas-R~1976 Sims-C~1980 Lucas-R~1981 \\
## \hline
## History of Economics & Schumpeter-J~1954 Marshall-A~1920 Smith-A~1776 Schumpeter-J~1934 Schumpeter-J~1950 & Schumpeter-J~1954 Hayek-F~1948 Marshall-A~1920 Smith-A~1776 Becker-G~1976 & Schumpeter-J~1954 Blaug-M~1985 Blaug-M~1980 Mill-J~1848 Hayek-F~1967 & Schumpeter-J~1954 Keynes-J~1936 Smith-A~1776 Schumpeter-J~1934 Marshall-A~1920 \\
## \hline
## \end{tabular}
## \endgroup
## \caption{Most cited documents per cluster in the corpus of JEL code 'Economic Methodology'}
## \label{tab:most_ref_metho}
## \end{table}
# Same thing, but in "long format"
top_ref_per_topic_decade <- Communities[,list(
nb_refs = .N
), by = c("Topic", "New_id2", "decade")]
setorder(top_ref_per_topic_decade, Topic, decade, -nb_refs)
top_ref_per_topic_decade <- merge(
top_ref_per_topic_decade[,head(.SD,5), by = c("Topic", "decade")],
dt_refs_of_JEL_metho[,list(reference = paste0(first(First_author)," (",first(cited_year),"), ", first(Publication))), by = New_id2],
by = "New_id2")
setorder(top_ref_per_topic_decade, Topic, decade, -nb_refs)
datatable(top_ref_per_topic_decade
, caption = "Top 5 of most cited documents per decade (long format)")
The idea now is to identify the journals that publish the most articles of each cluster over the period. We then do the same by discipline.
load("/projects/digital_history/philo_and_economics/data/Communities_metho.RData")
load("/projects/digital_history/philo_and_economics/data/dt_Articles_metho.RData")
# Name of disciplines with articles:
dt_Articles <- merge(dt_Articles[between(Year, first_y, last_y_metho)],
discipline_info[,list(Code_Discipline,Discipline = str_replace(discipline," \n |\n"," "))], by= "Code_Discipline", all.x =T)
Communities <- merge(Communities[,list(ID_Art = ID, decade, Topic)],
dt_Articles[,list(ID_Art, Year, Code_Discipline, Code_Revue,
Journal, Discipline)], by = "ID_Art")
# counting over the whole period
top_j <- Communities[, list(nb_art = .N ) , by = .(Topic, Journal) ][order(Topic,-nb_art)]
top_j[, perc_j := nb_art/sum(nb_art), by = Topic]
top_j <- top_j[, head(.SD,3), by = Topic]
# counting per decade
top_decade_j <- Communities[, list(nb_art = .N ) , by = .(Topic, decade, Journal) ][order(Topic,decade, -nb_art)]
top_decade_j[, perc_j := nb_art/sum(nb_art), by = .(Topic,decade)]
top_decade_j <- top_decade_j[, head(.SD,3), by = .(Topic,decade)]
# The two together
top_j[, decade := "Full period"]
top_j <- rbindlist(list(top_j,top_decade_j), use.names = TRUE)
rm(top_decade_j)
# Formatting
top_j[ ,Journal := Journal %>% tolower() %>% toTitleCase()]
top_j[, Source := paste0(Journal, "~(", round(perc_j*100),"%)")]
top_j <- top_j[,aggregate(Source, list(Topic), paste, collapse = " "), by = "decade"]
setnames(top_j, c("Group.1", "x"), c("Cluster", "reference"))
top_j <- spread(top_j,decade, reference)
setnames(top_j,c("1990","2000","2010"), c("1990-1999","2000-2009","2010-2018"))
setcolorder(top_j,c("Cluster","Full period"))
top_j[, Cluster := str_replace(Cluster, "\n", " ")]
# manip to order rows as in tf-idf
top_j <- merge(top_j, data.table(Order = 1:length(order_disc_metho),
Cluster = str_replace(order_disc_metho, "\n"," ")),
by = "Cluster" )
setorder(top_j,Order)
top_j$Order <- NULL
# Printing
kable(
data.table(apply(top_j,MARGIN = 2, function(x) { str_replace_all(x, "~", " ") }))
, caption = "Top 3 sources of articles per decade (compact format)")
| Cluster | Full period | 1990-1999 | 2000-2009 | 2010-2018 |
|---|---|---|---|---|
| Institutional Economics | Journal of Economic Issues (9%) Cambridge Journal of Economics (8%) Journal of Institutional and Theoretical Economics (7%) | Journal of Institutional and Theoretical Economics (19%) Journal of Economic Issues (12%) History of Political Economy (8%) | Cambridge Journal of Economics (16%) Journal of Economic Issues (11%) Journal of Economic Behavior and Organization (11%) | Journal of Institutional Economics (16%) Journal of Economic Behavior and Organization (9%) Cambridge Journal of Economics (7%) |
| Critical Realism | Cambridge Journal of Economics (42%) Journal of Post Keynesian Economics (8%) Review of Social Economy (7%) | Cambridge Journal of Economics (22%) Journal of Post Keynesian Economics (22%) Review of Social Economy (22%) | Cambridge Journal of Economics (51%) Journal of Post Keynesian Economics (10%) Journal of Economic Issues (5%) | Cambridge Journal of Economics (41%) Journal of Economic Issues (6%) Review of Radical Political Economics (4%) |
| Political Economy | Science and Society (22%) Review of International Political Economy (8%) American Journal of Economics & Sociology (8%) | Science and Society (66%) Review of International Political Economy (10%) Journal of Economic Issues (7%) | American Journal of Economics & Sociology (31%) Review of International Political Economy (16%) Science and Society (9%) | New Political Economy (13%) European Journal of International Relations (12%) Science and Society (10%) |
| Big M | History of Political Economy (12%) Journal of Economic Issues (8%) Cambridge Journal of Economics (5%) | History of Political Economy (20%) Journal of Economic Issues (10%) Revue Economique (6%) | Journal of Post Keynesian Economics (9%) Journal of Economic Issues (6%) Cambridge Journal of Economics (6%) | Journal of Economic Behavior and Organization (7%) Journal of Economic Issues (6%) Cambridge Journal of Economics (6%) |
| Small m | Journal of Economic Perspectives (11%) Cambridge Journal of Economics (11%) Journal of Post Keynesian Economics (8%) | Journal of Post Keynesian Economics (15%) Economic Journal (12%) History of Political Economy (12%) | Journal of Post Keynesian Economics (14%) Cambridge Journal of Economics (10%) World Development (10%) | Cambridge Journal of Economics (18%) Journal of Economic Perspectives (14%) Oxford Review of Economic Policy (7%) |
| History of Economics | History of Political Economy (20%) European Journal of the History of Economic Thought (14%) Cambridge Journal of Economics (12%) | History of Political Economy (38%) Journal of Economic Issues (12%) Scottish Journal of Political Economy (6%) | Politicka Ekonomie (16%) History of Political Economy (16%) European Journal of the History of Economic Thought (16%) | European Journal of the History of Economic Thought (22%) Cambridge Journal of Economics (20%) History of Economic Ideas (12%) |
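The count-rank-truncate pattern used above (count per group, order descending, keep the first three rows with `head(.SD, 3)`) can be sketched on toy data; the table and values here are invented for illustration only:

```r
library(data.table)

toy <- data.table(
  Topic   = c("A", "A", "A", "B", "B"),
  Journal = c("J1", "J1", "J2", "J3", "J1")
)

# Count articles per (Topic, Journal), then order descending within Topic
top <- toy[, .(nb_art = .N), by = .(Topic, Journal)][order(Topic, -nb_art)]
# Share of each journal within its Topic
top[, perc_j := nb_art / sum(nb_art), by = Topic]
# Keep only the top 3 journals per Topic
top <- top[, head(.SD, 3), by = Topic]
```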
# Latex is commented out
# top_j %>% xtable(align= c("r|","L{0.14\\textwidth}", rep("L{0.17\\textwidth}",4)),
# caption = "Top 3 sources of articles per cluster in the corpus of JEL code 'Economic Methodology'",
# label = "tab:top_source_metho") %>%
# print(include.rownames=FALSE, sanitize.text.function = identity,
# hline.after=-1:nrow(top_j), size = "small"
# )
### Turning now to the aggregation by disciplines ###
# counting over the whole period
top_disc <- Communities[, list(nb_art = .N ) , by = .(Topic, Discipline) ][order(Topic,-nb_art)]
top_disc[, perc_j := nb_art/sum(nb_art), by = Topic]
top_disc <- top_disc[, head(.SD,3), by = Topic]
# counting per decade
top_decade_disc <- Communities[, list(nb_art = .N ) , by = .(Topic, decade, Discipline) ][order(Topic,decade, -nb_art)]
top_decade_disc[, perc_j := nb_art/sum(nb_art), by = .(Topic,decade)]
top_decade_disc <- top_decade_disc[, head(.SD,3), by = .(Topic,decade)]
# The two together
top_disc[, decade := "Full period"]
top_disc <- rbindlist(list(top_disc,top_decade_disc), use.names = TRUE)
rm(top_decade_disc)
# Formatting
top_disc[ ,Discipline := Discipline %>% tolower() %>% toTitleCase()]
top_disc[, Source := paste0(Discipline, "~(", round(perc_j*100),"%)")]
top_disc <- top_disc[,aggregate(Source, list(Topic), paste, collapse = " "), by = "decade"]
setnames(top_disc, c("Group.1", "x"), c("Cluster", "reference"))
top_disc <- spread(top_disc,decade, reference)
setnames(top_disc,c("1990","2000","2010"), c("1990-1999","2000-2009","2010-2018"))
setcolorder(top_disc,c("Cluster","Full period"))
top_disc[, Cluster := str_replace(Cluster, "\n", " ")]
# manip to order rows as in tf-idf
top_disc <- merge(top_disc, data.table(Order = 1:length(order_disc_metho),
Cluster = str_replace(order_disc_metho, "\n"," ")),
by = "Cluster" )
setorder(top_disc,Order)
top_disc$Order <- NULL
# Printing
kable(
data.table(apply(top_disc,MARGIN = 2, function(x) { str_replace_all(x, "~", " ") })),
caption = "Top 3 disciplinary sources of articles per decade (compact format)")
| Cluster | Full period | 1990-1999 | 2000-2009 | 2010-2018 |
|---|---|---|---|---|
| Institutional Economics | Economics (81%) Other Social Sciences (7%) Humanities (6%) | Economics (89%) Other Social Sciences (7%) Management (1%) | Economics (79%) Other Social Sciences (12%) Geography (3%) | Economics (74%) Humanities (11%) Other Social Sciences (4%) |
| Critical Realism | Economics (87%) Geography (4%) Political Science & Public Administration (2%) | Economics (97%) Geography (3%) | Economics (89%) Geography (6%) Management (2%) | Economics (82%) Political Science & Public Administration (5%) Humanities (4%) |
| Political Economy | Economics (40%) Other Social Sciences (32%) International Relations (13%) | Other Social Sciences (69%) Economics (28%) International Relations (3%) | Economics (59%) International Relations (16%) Other Social Sciences (16%) | Economics (36%) Other Social Sciences (23%) International Relations (16%) |
| Big M | Economics (76%) Other Social Sciences (6%) Management (5%) | Economics (85%) Management (6%) Other Social Sciences (4%) | Economics (62%) Other Social Sciences (11%) Management (8%) | Economics (68%) Humanities (8%) Other Social Sciences (7%) |
| Small m | Economics (95%) Other Social Sciences (2%) Political Science & Public Administration (1%) | Economics (97%) Political Science & Public Administration (3%) | Economics (100%) | Economics (93%) Other Social Sciences (4%) Humanities (2%) |
| History of Economics | Economics (82%) Humanities (12%) Other Social Sciences (4%) | Economics (94%) Other Social Sciences (6%) | Economics (84%) Other Social Sciences (8%) Humanities (8%) | Economics (74%) Humanities (22%) Management (2%) |
# Latex is commented out
# top_disc %>% xtable(align= c("r|","L{0.14\\textwidth}", rep("L{0.17\\textwidth}",4)),
# caption = "Top 3 disciplinary sources of articles per cluster in the corpus of JEL code 'Economic Methodology'",
# label = "tab:top_source_metho") %>%
# print(include.rownames=FALSE, sanitize.text.function = identity,
#             hline.after=-1:nrow(top_disc), size = "small"
# )
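The wide layout built above with `tidyr::spread()` (now superseded by `pivot_wider()`) can equally be obtained with `data.table::dcast()`; a minimal sketch on invented data:

```r
library(data.table)

long <- data.table(
  Cluster   = c("A", "A", "B", "B"),
  decade    = c("1990", "2000", "1990", "2000"),
  reference = c("J1 (50%)", "J2 (40%)", "J3 (60%)", "J1 (30%)")
)

# One row per Cluster, one column per decade
wide <- dcast(long, Cluster ~ decade, value.var = "reference")
```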
print("Below: specific focus on the possible turn to mainstream of Small m. Note that we coded the journals as mainstream or not ourselves.")
## [1] "Below: specific focus on the possible turn to mainstream of Small m. Note that we coded the journals as mainstream or not ourselves."
# Focus on Small m
small_m <- Communities[Topic == "Small m", list(decade, Year, Code_Discipline,Code_Revue,Journal)]
# Writing a file for manually coding which journals are mainstream or not
fwrite(small_m[, list(.N, Journal = unique(Journal)), by= Code_Revue],
"/projects/digital_history/philo_and_economics/data/2020-10-01_small_m_j.csv"
)
# Reloading the coded journals
mainstream_or_not <- fread("/projects/digital_history/philo_and_economics/data/2020-10-01_classifying_mainstream_for_small_m.csv")
# merging with initial data to have the mainstream/not for all journals
small_m <- merge(small_m, mainstream_or_not, by = "Journal", all.x= TRUE)
if(nrow(small_m[is.na(Coder1)])){
stop("Some journals in which a Small m article is published have not been classified as mainstream or not.")
}
small_m[Coder1 == 1 & Coder2== 1, Journal_status := "All Mainstream"]
small_m[Coder2 == 0 & Coder1== 0, Journal_status := "All not mainstream"]
small_m[is.na(Journal_status), Journal_status := "Unclear"]
# ggplot(small_m,aes(x=decade,fill=Journal_status)) + geom_bar()
kable(
small_m[, list(`Proportion` = sum(Coder1)/.N), by= decade][order(decade)]
, caption = "Proportion of articles in Small m that are published in mainstream economics journals according to Coder 1")
| decade | Proportion |
|---|---|
| 1990 | 0.5454545 |
| 2000 | 0.4285714 |
| 2010 | 0.3859649 |
kable(
small_m[, list(`Proportion` = sum(Coder2)/.N), by= decade][order(decade)]
, caption = "Proportion of articles in Small m that are published in mainstream economics journals according to Coder 2")
| decade | Proportion |
|---|---|
| 1990 | 0.5757576 |
| 2000 | 0.4761905 |
| 2010 | 0.6315789 |
kable(
small_m[, list(`Proportion` = sum(Journal_status == "All Mainstream")/sum(Journal_status != "Unclear")), by= decade][order(decade)]
, caption = "Proportion of Small m articles published in journals classified as mainstream by both coders, among journals with an unambiguous classification.")
| decade | Proportion |
|---|---|
| 1990 | 0.5625000 |
| 2000 | 0.4444444 |
| 2010 | 0.5116279 |
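The three proportions above rest on two binary codings per journal; the agreement logic can be sketched on toy coder data (all values here are made up for illustration):

```r
library(data.table)

coded <- data.table(
  decade = c("1990", "1990", "2000", "2000"),
  Coder1 = c(1, 0, 1, 1),
  Coder2 = c(1, 0, 0, 1)
)

# A journal's status is unambiguous only when both coders agree
coded[Coder1 == 1 & Coder2 == 1, Journal_status := "All Mainstream"]
coded[Coder1 == 0 & Coder2 == 0, Journal_status := "All not mainstream"]
coded[is.na(Journal_status),     Journal_status := "Unclear"]

# Mainstream-by-both over unambiguous cases, per decade
coded[, .(prop = sum(Journal_status == "All Mainstream") /
                 sum(Journal_status != "Unclear")), by = decade]
```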
rm(Communities,dt_Articles)
load("/projects/digital_history/philo_and_economics/data/Communities_metho.RData")
plot_topic_thru_time(Communities, 1990, colors = colors_cluster)
## `geom_smooth()` using formula 'y ~ x'
rm(Communities)