The getLattes R package, written by Roney Fraga Souza and Winicius Sabino, was built to extract data from Lattes curricula exported as XML. The XML file must first be extracted from the downloaded .zip archive.
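The extraction step can be scripted with base R. The sketch below is only an illustration: `unzip_lattes` is a hypothetical helper, not part of getLattes, and it assumes each .zip archive holds a single XML file, which it saves under the archive's own base name so that files from different archives do not overwrite each other.

```r
# Hypothetical helper (not part of getLattes): extract the XML file from
# every .zip archive in `zip_dir` and save it as <archive name>.xml in `out_dir`.
unzip_lattes <- function(zip_dir, out_dir = zip_dir) {
  zips <- list.files(zip_dir, pattern = "\\.zip$", full.names = TRUE)
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

  vapply(zips, function(z) {
    # Extract into a scratch directory that is removed when this call ends
    tmp <- tempfile("lattes_")
    dir.create(tmp)
    on.exit(unlink(tmp, recursive = TRUE), add = TRUE)

    extracted <- utils::unzip(z, exdir = tmp)
    xml <- extracted[grepl("\\.xml$", extracted, ignore.case = TRUE)][1]
    if (is.na(xml)) return(NA_character_)  # archive without an XML file

    # Name the output after the archive, e.g. 123.zip -> 123.xml
    target <- file.path(
      out_dir,
      paste0(tools::file_path_sans_ext(basename(z)), ".xml")
    )
    file.copy(xml, target, overwrite = TRUE)
    target
  }, character(1), USE.NAMES = FALSE)
}

# Example call, assuming the archives are stored in datatest/:
# xml_paths <- unzip_lattes("datatest")
```

The function returns the paths of the XML files it wrote, with `NA` for any archive that contained no XML file.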
To automate the download process, please see Captchas Negated by Python reQuests - CNPQ.
Stable version from CRAN.
install.packages('getLattes')
library(getLattes)
Development version from GitHub.
# install and load devtools from CRAN
install.packages("devtools")
library(devtools)

# install and load getLattes
devtools::install_github("roneyfraga/getLattes")
library(getLattes)
# the file 4984859173592703.xml is stored in the datatest directory
# cl <- readLattes(filexml='4984859173592703.xml', path='datatest/')

# import all Lattes XML files in datatest
# cls <- readLattes(filexml='*.xml$', path='datatest/')

# import all Lattes XML files in the working directory
cls <- readLattes(filexml='*.xml$')
The examples below use xmlsLattes, an R list holding two imported Lattes curricula from researchers who were important in my academic journey.
# to combine a list of data frames into one data frame
library(dplyr)

# to import from one curriculum
getDadosGerais(xmlsLattes[[2]])

# to import from two or more curricula
lt <- lapply(xmlsLattes, getDadosGerais)
head(bind_rows(lt))
# to import from one curriculum
getArtigosPublicados(xmlsLattes[[2]])

# to import from two or more curricula
lt <- lapply(xmlsLattes, getArtigosPublicados)
head(bind_rows(lt))