Published July 23, 2023 | Version draft
Software Open

ALGORITHMIC LITERACY: Generative Artificial Intelligence Technologies for Data Librarians

  • 1. UFRGS

Description

The basic parameters and requirements for creating the web scraping codes are: communicating with the AI in natural language (a description of a programming problem in English as input, and code produced by the AI code-generating solution as output), specifying Python version 3.11, using the PyCharm IDE, and installing the Python libraries listed under INSTALLATION REQUIREMENTS in the notes.

Notes

Readme @ ORCIDS
About the codes
These web scraping codes were created by AI: a natural language description of a programming problem, written in English, is given as input, and the AI code-generating solution returns code as output.
========================================
OUTPUT AI CODE FILE: ORCID_author.py 
The code uses web scraping to search for authors' ORCID IDs and add them to a TSV file. It first imports necessary modules and sets the path and counter variables. It then loops through TSV files in the directory that start with 'v' and end with '_3.tsv'. For each file, it reads the contents, extracts the names of authors, and searches for their ORCID IDs using Selenium. If a match is found, the ORCID ID is added to the corresponding author's name in the TSV file. The updated TSV file is then saved as 'v{counter}_4.tsv'.
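The file-handling part of this workflow can be sketched as follows. This is a minimal illustration under stated assumptions, not the contents of ORCID_author.py: the `author` column name, the added `orcid` column, and the `lookup_orcid` callable (a stand-in for the Selenium search) are hypothetical names, and the counter is assumed to start at 1.

```python
import csv
from pathlib import Path


def add_orcids(directory, lookup_orcid, author_column="author"):
    """Read each v*_3.tsv file and write a v{counter}_4.tsv copy with ORCID IDs.

    lookup_orcid is a hypothetical callable: author name -> ORCID ID or None.
    """
    written = []
    # sorted() materializes the file list first, so new output files are not re-read.
    for counter, src in enumerate(sorted(Path(directory).glob("v*_3.tsv")), start=1):
        with src.open(newline="", encoding="utf-8") as fh:
            reader = csv.DictReader(fh, delimiter="\t")
            fieldnames = list(reader.fieldnames or [])
            rows = list(reader)

        if "orcid" not in fieldnames:
            fieldnames.append("orcid")
        for row in rows:
            # Leave the cell empty when no match is found.
            row["orcid"] = lookup_orcid(row.get(author_column) or "") or ""

        dst = src.with_name(f"v{counter}_4.tsv")
        with dst.open("w", newline="", encoding="utf-8") as fh:
            writer = csv.DictWriter(
                fh, fieldnames=fieldnames, delimiter="\t", extrasaction="ignore"
            )
            writer.writeheader()
            writer.writerows(rows)
        written.append(dst)
    return written
```

Passing the lookup as a callable keeps the file loop independent of the browser automation, so it can be tried with a plain dictionary lookup before Selenium is involved.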
========================================
OUTPUT AI CODE FILE: GSCHOLARscraping.py 
This Python script extracts data from a list of Google Scholar pages and writes the results to a TSV file. The script uses the requests library to retrieve the HTML content of each page and the BeautifulSoup library to parse the HTML and extract the desired data. The extracted data includes the name of the scholar, total citations, citations since 2013, H-index, and H-index since 2013. The script also calculates the average H-index and total citations across all pages. Overall, this script can be helpful for researchers who want to analyze the impact of a group of scholars in a particular field.
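The output and averaging step described above might look like the sketch below. It assumes the five values have already been extracted into one dictionary per scholar; the function name, the column names, and the reading of "average H-index and total citations" as the mean of both values are assumptions made for illustration, not details taken from GSCHOLARscraping.py.

```python
import csv
from statistics import mean

# Hypothetical column names for the five extracted values.
COLUMNS = [
    "name",
    "citations",
    "citations_since_2013",
    "h_index",
    "h_index_since_2013",
]


def write_scholar_summary(records, out_path):
    """Write one TSV row per scholar and return the group averages.

    records must be a non-empty list of dicts keyed by COLUMNS.
    """
    with open(out_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=COLUMNS, delimiter="\t")
        writer.writeheader()
        writer.writerows(records)
    return {
        "average_h_index": mean(r["h_index"] for r in records),
        "average_citations": mean(r["citations"] for r in records),
    }
```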
==========================================
OUTPUT AI CODE FILE: SCOPUSscrapingAnon.py 
The code extracts data from a list of links to authors' Scopus profiles and writes the extracted data to a file. The extracted data includes the author's Scopus ID, H-Index, the number of documents authored, and the number of citations received.
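One way the Scopus ID could be obtained is directly from the profile link. The sketch below assumes the links carry the ID in an `authorId` query parameter, as in `https://www.scopus.com/authid/detail.uri?authorId=...`; that link format and the function name are assumptions, and SCOPUSscrapingAnon.py may read the ID from the page instead.

```python
from urllib.parse import parse_qs, urlparse


def scopus_id_from_link(link):
    """Return the authorId query parameter of a profile link, or None if absent."""
    values = parse_qs(urlparse(link).query).get("authorId")
    return values[0] if values else None
```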
==========================================
OUTPUT AI CODE FILE: PublonsScrapingAnon-average2.py 
This code is a web scraper that extracts citation metrics data from a list of researcher profiles on the Publons platform and then writes the extracted data to a tab-separated values (TSV) file. It uses the Selenium and BeautifulSoup libraries to navigate and parse web pages.
==========================================
OUTPUT AI CODE FILE: compare h index sources
The code reads three input files containing H-Index data from Google Scholar, Web of Science, and Scopus. It then looks up the H-Index values in each file and calculates the mean H-Index for each researcher by averaging the H-Index values from all three sources. Finally, it writes the output to a new file containing the mean H-Index for each researcher and the overall mean H-Index for the entire list.
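The averaging step can be sketched as below, assuming each input file has already been read into a dictionary that maps a researcher's name to an H-Index. The function name and the handling of researchers missing from a source (they are averaged over the sources that do list them) are assumptions made for illustration.

```python
from statistics import mean


def mean_h_index(google_scholar, web_of_science, scopus):
    """Average each researcher's H-Index over the sources that list them.

    Returns (per-researcher means, overall mean of those means).
    """
    sources = (google_scholar, web_of_science, scopus)
    names = sorted(set().union(*sources))
    per_researcher = {
        name: mean(src[name] for src in sources if name in src)
        for name in names
    }
    overall = mean(per_researcher.values()) if per_researcher else None
    return per_researcher, overall
```

Note that the overall value here is the mean of the per-researcher means, which weights every researcher equally regardless of how many sources list them.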
==========================================
INSTALLATION REQUIREMENTS:
The basic requirements are Python 3.11, the PyCharm IDE, and the following Python libraries: lxml, requests, bs4, Selenium, and os (part of the Python standard library). The Machinet AI and Bito AI plugins are also needed; they are used to write, comment, check the security of, and explain the syntax of the code. These plugins need an OpenAI API key, available at: (https://platform.openai.com/account/api-keys).
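Before running the scripts, the listed libraries can be checked with a short sketch like the one below. The import names are assumptions derived from the list above (in particular, `requests` is assumed for "request"), and `missing_modules` is a hypothetical helper, not part of the archive.

```python
from importlib.util import find_spec

# Import names assumed for the libraries listed above.
REQUIRED_MODULES = ["lxml", "requests", "bs4", "selenium", "os"]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the module names that cannot be imported in this environment."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("Missing:", ", ".join(missing) if missing else "none")
```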

Files

AIproject-AIPROJECT 2.zip

Files (42.2 MB)

md5:8c4ecaeb44ff696414c8f1da33c9b3d5
