pytups.core package
Submodules
pytups.core.NOAADataset module
- class pytups.core.NOAADataset.NOAADataset(study_data)[source]
Bases: object
This class encapsulates study metadata and its related components (e.g. publications, sites) retrieved from the NOAA API.
- study_id
The unique NOAA study identifier.
- Type:
str
- xml_id
The XML identifier of the study.
- Type:
str
- metadata
A dictionary containing basic metadata such as studyName, dataType, earliestYearBP, etc.
- Type:
dict
- investigators
A comma-separated string of investigator names.
- Type:
str
- publications
A list of Publication objects associated with the study.
- Type:
list of Publication
pytups.core.Dataset module
- class pytups.core.Dataset.Dataset[source]
Bases: object
A wrapper class for interacting with the NOAA Studies API.
Manages the retrieval, parsing, and aggregation of NOAA study data, and provides methods to access summaries, publications, sites, and external data files.
- BASE_URL
The NOAA API endpoint URL.
- Type:
str
- studies
A mapping from NOAADatasetId to NOAADataset instances.
- Type:
dict
- data_table_index
A mapping from dataTableID to associated study, site, and paleo data.
- Type:
dict
- search_studies(...)[source]
Searches for studies using provided parameters and parses the response.
- get_data(dataTableIDs, file_urls)[source]
Fetches and returns external data based on data table IDs or file URLs.
- BASE_URL = 'https://www.ncei.noaa.gov/access/paleo-search/study/search.json'
- get_data(dataTableIDs=None, file_urls=None)[source]
Fetch external data for given dataTableIDs or file URLs, perform validations, and attach study and site metadata.
- Parameters:
dataTableIDs (list or str, optional) – One or more NOAA data table IDs.
file_urls (list or str, optional) – One or more file URLs.
- Returns:
A list of DataFrames corresponding to the fetched data.
- Return type:
list of pandas.DataFrame
- Raises:
ValueError – For missing parent study mapping, missing file URL, or proprietary/unsupported file types.
Exception – Propagates any exceptions raised by the parser.
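Both parameters accept either a single value or a list. The following is a minimal sketch of how such "list or str" arguments might be normalized before any fetching happens. The helpers as_list and normalize_get_data_args are hypothetical and not part of pytups, and the rule that at least one of the two arguments must be supplied is an assumption rather than documented behavior.

```python
def as_list(value):
    """Return [] for None, wrap a single string in a list, copy other iterables."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def normalize_get_data_args(dataTableIDs=None, file_urls=None):
    """Normalize both arguments to lists (hypothetical helper, not pytups API)."""
    ids = as_list(dataTableIDs)
    urls = as_list(file_urls)
    # Assumption: calling with neither argument is treated as an error.
    if not ids and not urls:
        raise ValueError("Provide at least one dataTableID or file URL.")
    return ids, urls
```

Treating a bare string as a one-element list avoids the common pitfall of iterating over its characters.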
- get_data_deprecated(dataTableIDs=None, file_urls=None)[source]
Fetch external data for given dataTableIDs or file URLs and attach study/site metadata.
- Parameters:
dataTableIDs (list or str, optional) – One or more NOAA data table IDs.
file_urls (list or str, optional) – One or more file URLs.
- Returns:
A list of DataFrames, each corresponding to fetched data.
- Return type:
list of pandas.DataFrame
- get_publications_dataframe()[source]
Get a DataFrame of all publications aggregated from the studies.
- Returns:
A DataFrame containing publication details with study context.
- Return type:
pandas.DataFrame
- get_sites_dataframe()[source]
Get a DataFrame of all sites aggregated from the studies, including paleo data.
- Returns:
A DataFrame containing site details with study context and paleo data.
- Return type:
pandas.DataFrame
- get_summary_dataframe()[source]
Get a DataFrame summarizing all loaded studies.
- Returns:
A DataFrame with a summary of study metadata and components.
- Return type:
pandas.DataFrame
- search_studies(xml_id=None, noaa_id=None, data_publisher='NOAA', data_type_id=None, keywords=None, investigators=None, max_lat=None, min_lat=None, max_lon=None, min_lon=None, location=None, publication=None, search_text=None, earliest_year=None, latest_year=None, cv_whats=None, recent=False)[source]
Search for NOAA studies using the provided parameters.
At least one parameter must be specified for a search to be initiated.
- Parameters:
xml_id (str, optional) – XML identifier for a study.
noaa_id (str, optional) – NOAA study identifier.
data_publisher (str, optional) – Publisher of the data, default is “NOAA”.
data_type_id (str, optional) – Data type identifier.
keywords (str, optional) – Keywords for the search.
investigators (str, optional) – Investigator names.
max_lat (float, optional) – Maximum latitude.
min_lat (float, optional) – Minimum latitude.
max_lon (float, optional) – Maximum longitude.
min_lon (float, optional) – Minimum longitude.
location (str, optional) – Location description.
publication (str, optional) – Publication details.
search_text (str, optional) – Additional text to search within the study.
earliest_year (int, optional) – Earliest year of study.
latest_year (int, optional) – Latest year of study.
cv_whats (str, optional) – Controlled vocabulary term.
recent (bool, optional) – Flag to filter recent studies.
- Returns:
Nothing is returned; the method populates the instance's internal attributes with the retrieved data. At least one search parameter must be supplied. Parameter validation is planned but not yet implemented.
- Return type:
None
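Because search_studies returns None and stores its results on the instance, the results are read afterwards through accessors such as get_summary_dataframe. The sketch below illustrates that two-step pattern. search_and_summarize is a hypothetical helper, not part of pytups; it accepts any object that exposes the two documented methods, and its up-front check for at least one criterion is an assumption about what the planned validation might look like.

```python
def search_and_summarize(dataset, **criteria):
    """Run a search, then return the summary of the loaded studies.

    Hypothetical helper: `dataset` is any object exposing `search_studies`
    and `get_summary_dataframe` as documented above.
    """
    # Drop arguments left unset so only real search criteria are forwarded.
    supplied = {name: value for name, value in criteria.items() if value is not None}
    # Assumption: an empty search is rejected before any request is made.
    if not supplied:
        raise ValueError("Specify at least one search parameter.")
    dataset.search_studies(**supplied)  # returns None; results live on `dataset`
    return dataset.get_summary_dataframe()
```

With the documented class, a call might look like search_and_summarize(Dataset(), keywords="coral", min_lat=-10.0, max_lat=10.0), where the search values are purely illustrative.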