MetaGraph for FinNLP Research: A Large-Scale Knowledge Graph of GenAI in Financial NLP (2022–2025)
Description
This dataset accompanies the paper "Prompting the Market? A Large-Scale Meta-Analysis of GenAI in Financial NLP (2022–2025)". It introduces MetaGraph, a machine-readable knowledge graph automatically constructed from 681 scientific papers (see note below) in the field of financial NLP. MetaGraph enables large-scale meta-analyses of research trends, conceptual relationships, and method co-usage in GenAI for finance.
Content Overview
MetaGraph represents the FinNLP research landscape as a structured graph where:
- Nodes correspond to papers, models, datasets, tasks, and key research concepts.
- Edges encode citation links, shared methods, topic similarity, and temporal co-evolution.
The graph facilitates:
- Discovery of underexplored intersections in FinNLP research
- Temporal analyses of GenAI adoption
- Visual exploration of conceptual and methodological trends
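One way to look for underexplored intersections is to count which model and task combinations appear together in papers and list the combinations that never occur. The sketch below assumes the graph has already been flattened into (paper, model, task) triples. That record shape and the function names are illustrative assumptions, not the dataset's actual schema.

```python
from collections import Counter
from itertools import product


def pair_counts(records):
    """Count the distinct papers behind each (model, task) combination.

    `records` is an iterable of (paper_id, model, task) tuples. This shape
    is an assumption made for illustration only.
    """
    seen = set(records)  # drop repeated (paper, model, task) rows
    return Counter((model, task) for _, model, task in seen)


def unexplored_pairs(records):
    """Return the (model, task) combinations that no paper covers."""
    records = list(records)
    counts = pair_counts(records)
    models = sorted({model for _, model, _ in records})
    tasks = sorted({task for _, _, task in records})
    # Counter returns 0 for missing keys, so unseen pairs are easy to detect.
    return [(m, t) for m, t in product(models, tasks) if counts[(m, t)] == 0]
```

Pairs with a count of zero are only candidates: a combination may be absent because it is not meaningful, not because it has been overlooked.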
Included Files
- finnlp_ontology.json: Defines an extensible ontology of FinNLP concepts (e.g., model, dataset, task) and relation types.
- finnlp_graph.graphml: The full MetaGraph in GraphML format, compatible with tools like Gephi, NetworkX, and Graphviz.
- finnlp_graph.json: A simplified JSON version for human-readable access, including structured metadata for nodes and edges.
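As a minimal sketch of a first look at the GraphML file, the following uses only Python's standard library to count nodes and edges and to tally one node attribute. The attribute name `type` is an assumption. The actual attribute names are declared in the file's `<key>` elements and described in finnlp_ontology.json.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Standard GraphML namespace.
GRAPHML_NS = {"g": "http://graphml.graphdrawing.org/xmlns"}


def summarize_graphml(source, attr_name="type"):
    """Return (node_count, edge_count, Counter of attr_name values on nodes).

    `source` is a file path or a binary file object. `attr_name` is an
    assumed attribute name; replace it with one the file actually declares.
    """
    root = ET.parse(source).getroot()
    # GraphML declares attributes as <key id=... for=... attr.name=...>;
    # <data key=...> elements refer to the key id, not the attribute name.
    key_ids = {
        key.get("id")
        for key in root.findall("g:key", GRAPHML_NS)
        if key.get("attr.name") == attr_name
        and key.get("for") in ("node", "all", None)
    }
    nodes = root.findall(".//g:node", GRAPHML_NS)
    edges = root.findall(".//g:edge", GRAPHML_NS)
    values = Counter()
    for node in nodes:
        for data in node.findall("g:data", GRAPHML_NS):
            if data.get("key") in key_ids:
                values[(data.text or "").strip()] += 1
    return len(nodes), len(edges), values


# Example call, assuming the file is in the working directory:
# n_nodes, n_edges, kinds = summarize_graphml("finnlp_graph.graphml")
```

For anything beyond a quick summary, a graph library such as NetworkX (which can read GraphML directly) is likely the more convenient choice.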
Use Cases
This dataset is intended for researchers in NLP, finance, and scientometrics. It can be used to:
- Conduct bibliometric studies
- Analyze the methodological evolution of GenAI in finance
- Train or evaluate systems for scientific knowledge extraction or reasoning
Citation
If you use this resource, please cite the accompanying paper.
Note: 58 of the 681 papers used in our analysis are not included in the published dataset. These papers were posted to arXiv under a CC BY-NC-ND 4.0 license, which prohibits distributing derivative or adapted forms of the original material. As such, we do not distribute them as part of the MetaGraph knowledge graph.