pyeo.queries_and_downloads
Functions for querying, filtering and downloading data.
SAFE files
Sentinel-2 data is downloaded in the form of a .SAFE file; all download functions end with data in this structure. A .SAFE file is a directory structure that contains the imagery, metadata and supplementary data of a Sentinel-2 image. The rasters themselves are in the GRANULE/[granule_id]/IMG_DATA/[resolution]/ folder; each band is contained in its own .jp2 file. For full details, see https://sentinel.esa.int/web/sentinel/user-guides/sentinel-2-msi/data-formats
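As an illustration of the layout described above, the following is a minimal sketch that lists the band rasters inside a downloaded .SAFE directory. The helper name find_band_files is hypothetical and is not part of pyeo; the sketch assumes only the GRANULE/[granule_id]/IMG_DATA layout given in the text, and searches recursively so that it works whether or not the [resolution] subfolders are present.

```python
from pathlib import Path


def find_band_files(safe_dir):
    """Return all .jp2 rasters under GRANULE/*/IMG_DATA, sorted by path.

    Hypothetical helper for illustration; not a pyeo function.
    """
    band_files = []
    for img_data in Path(safe_dir).glob("GRANULE/*/IMG_DATA"):
        # rglob covers bands stored directly in IMG_DATA as well as bands
        # stored in per-resolution subfolders below it.
        band_files.extend(img_data.rglob("*.jp2"))
    return sorted(band_files)
```

Calling find_band_files on the path of a .SAFE directory returns a list of pathlib.Path objects, one per band file.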
There are two ways to refer to a given Sentinel-2 product: the UUID and the product ID. The UUID is a string of five hyphen-separated groups of hexadecimal characters (EXAMPLE HERE). The product ID is a (more or less) human-readable string containing all the information needed to uniquely identify a product, with fields separated by the underscore character. For more information on the structure of a product ID, see (EXAMPLE HERE)
Query data structure
All query functions return a dictionary. Each key is the UUID of a product; each value is a further set of nested dictionaries containing information about the product to be downloaded. (PUT STRUCTURE HERE)
Data download sources
This library presently offers two download sources: Scihub and Amazon Web Services. If in doubt, use Scihub.
Scihub
The Copernicus Open Access Hub (Scihub) is the default option for downloading Sentinel-2 images. Images are downloaded in .zip format and then automatically unzipped. Users must register a username and password before downloading, and each username is limited to two concurrent downloads. Scihub is entirely free.
AWS
Sentinel data is also publicly hosted on Amazon Web Services. This storage is provided by Sinergise, and is normally updated a few hours after new products are made available. There is a small charge associated with downloading this data. To access the AWS repository, you must register an Amazon Web Services account (including providing payment details) and obtain an API key for that account. See https://aws.amazon.com/s3/pricing/ for pricing details; the relevant table is Data Transfer Pricing for the EU (Frankfurt) region. There is no limit on concurrent downloads from the AWS bucket.
Functions
pyeo.queries_and_downloads.activate_and_dl_planet_item(session, item, asset_type, file_path)
Activates and downloads a single Planet item.
pyeo.queries_and_downloads.build_search_request(aoi, start_date, end_date, item_type, search_name)
Builds a search request for the Planet API.
pyeo.queries_and_downloads.check_for_s2_data_by_date(aoi_path, start_date, end_date, conf, cloud_cover=50)
Gets all the products between start_date and end_date. Wraps sent2_query to avoid having passwords and long-format timestamps in code.
- Parameters
aoi_path – Path to a geojson file containing a polygon of the outline of the area you wish to download. See www.geojson.io for a tool to build these.
start_date – Start date in the format yyyymmdd.
end_date – End date of the query in the format yyyymmdd
conf – Output from a configuration file containing your username and password for the ESA hub. If needed, this can be dummied with a dictionary of the following format: conf={'sent_2':{'user':'your_username', 'pass':'your_pass'}}
cloud_cover – The maximum level of cloud cover in images to be downloaded.
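The two conveniences mentioned above (credentials held in a configuration object, and short yyyymmdd dates instead of long-format timestamps) can be sketched with the standard library alone. This is an illustrative sketch, not pyeo's implementation: the file contents, load_conf and to_long_timestamp are hypothetical, and only the sent_2 / user / pass layout is taken from the parameter description. How pyeo itself converts the dates (for example, which time of day it uses) is not shown here.

```python
import configparser
from datetime import datetime

# Hypothetical configuration file contents; the section and key names follow
# the conf structure described in the parameter list.
EXAMPLE_CONF = """
[sent_2]
user = your_username
pass = your_pass
"""


def load_conf(text):
    """Parse ini-style text into an object indexable as conf['sent_2']['user']."""
    conf = configparser.ConfigParser()
    conf.read_string(text)
    return conf


def to_long_timestamp(yyyymmdd):
    """Convert a yyyymmdd date to YYYY-MM-DDThh:mm:ssZ at midnight."""
    return datetime.strptime(yyyymmdd, "%Y%m%d").strftime("%Y-%m-%dT%H:%M:%SZ")


conf = load_conf(EXAMPLE_CONF)
print(conf["sent_2"]["user"])         # your_username
print(to_long_timestamp("20190101"))  # 2019-01-01T00:00:00Z
```

A plain dictionary with the same keys can be indexed in the same way, which is why the dummy dictionary form works as a stand-in for a parsed configuration file.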
pyeo.queries_and_downloads.do_quick_search(session, search_request)
Tries the quick search; returns a dict of features.
pyeo.queries_and_downloads.do_saved_search(session, search_request)
Does a saved search; this doesn’t seem to work yet.
pyeo.queries_and_downloads.download_blob_from_google(bucket, object_prefix, out_folder, s2_object)
Still experimental.
pyeo.queries_and_downloads.download_from_aws_with_rollback(product_id, folder, uuid, user, passwd)
Attempts to download a single product from AWS using product_id; if the product is not found, falls back to Scihub using the UUID.
- Parameters
product_id – The product ID (“L2A_…”)
folder – The folder to download the .SAFE file to.
uuid – The product UUID (4dfB4-432df….)
user – Scihub username
passwd – Scihub password
pyeo.queries_and_downloads.download_from_google_cloud(product_ids, out_folder, redownload=False)
Still experimental.
pyeo.queries_and_downloads.download_from_scihub(product_uuid, out_folder, user, passwd)
Downloads and unzips product_uuid from Scihub.
- Parameters
product_uuid – The product UUID (4dfB4-432df….)
out_folder – The folder to save the .SAFE file to
user – Scihub username
passwd – Scihub password
Notes
If interrupted mid-download, there will be a .incomplete file in the download folder. You might need to remove this for further processing.
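The cleanup described in the note can be sketched as follows. The helper remove_incomplete_downloads is hypothetical (it is not part of pyeo) and assumes only what the note states: that an interrupted download leaves a file ending in .incomplete in the download folder. It defaults to a dry run so that nothing is deleted until the list has been checked.

```python
from pathlib import Path


def remove_incomplete_downloads(download_folder, dry_run=True):
    """List, and optionally delete, *.incomplete files in download_folder.

    Hypothetical helper for illustration; not a pyeo function.
    """
    leftovers = sorted(Path(download_folder).glob("*.incomplete"))
    if not dry_run:
        for leftover in leftovers:
            leftover.unlink()
    return leftovers
```

Call it once with the default dry_run=True to inspect the returned paths, then again with dry_run=False to delete them.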
pyeo.queries_and_downloads.download_planet_image_on_day(aoi_path, date, out_path, api_key, item_type='PSScene4Band', search_name='auto', asset_type='analytic', threads=5)
Queries and downloads all images in the given AOI on the given date.
pyeo.queries_and_downloads.download_s2_data(new_data, l1_dir, l2_dir, source='scihub', user=None, passwd=None, try_scihub_on_fail=False)
Downloads Sentinel-2 imagery from AWS, Google Cloud or Scihub. new_data is a dictionary of query results.
- Parameters
new_data – A query dictionary containing the products you want to download
l1_dir – The directory to download level 1 products to.
l2_dir – The directory to download level 2 products to.
source – The source to download the data from. Can be ‘scihub’ or ‘aws’; see section introduction for details
user – The username for sentinelhub
passwd – The password for sentinelhub
try_scihub_on_fail – If true, this function will fall back to downloading from Scihub on a failure of any other downloader.
- Raises
BadDataSource – Raised when passed either a bad datasource or a bad image ID
pyeo.queries_and_downloads.filter_non_matching_s2_data(query_output)
Filters a query such that it only contains paired level 1 and level 2 data products.
- Parameters
query_output – Query list
- Returns
- Return type
A dictionary of products containing only paired L1 and L2 data.
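One way to think about pairing is sketched below. It is not pyeo's implementation. It assumes that each product dictionary carries a title field holding the product ID, and that a level 1 and a level 2 product belong together when they share the same datatake time, orbit and tile fields in that ID. The UUID keys and titles in the example are made up for illustration.

```python
from collections import defaultdict


def acquisition_key(title):
    """Return the (datatake, orbit, tile) fields of an underscore-separated title."""
    parts = title.split("_")
    return parts[2], parts[4], parts[5]


def keep_paired_products(query_output):
    """Keep only products whose acquisition has both an L1C and an L2A product."""
    levels_seen = defaultdict(set)
    for product in query_output.values():
        title = product["title"]
        # The second field of the title is the level marker, e.g. MSIL1C.
        levels_seen[acquisition_key(title)].add(title.split("_")[1])
    return {
        uuid: product
        for uuid, product in query_output.items()
        if {"MSIL1C", "MSIL2A"} <= levels_seen[acquisition_key(product["title"])]
    }


# Made-up query output: two products form a pair, the third has no partner.
EXAMPLE_QUERY = {
    "uuid-1": {"title": "S2A_MSIL1C_20190613T101031_N0207_R022_T32UMU_20190613T121945"},
    "uuid-2": {"title": "S2A_MSIL2A_20190613T101031_N0212_R022_T32UMU_20190613T134502"},
    "uuid-3": {"title": "S2A_MSIL1C_20190613T101031_N0207_R022_T32UNU_20190613T121945"},
}
print(sorted(keep_paired_products(EXAMPLE_QUERY)))  # ['uuid-1', 'uuid-2']
```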
pyeo.queries_and_downloads.filter_to_l1_data(query_output)
Takes the products returned by check_for_s2_data_by_date and removes all non-Level 1 products.
- Parameters
query_output – A dictionary of products
- Returns
- Return type
A dictionary of products containing only the L1C data products
pyeo.queries_and_downloads.filter_to_l2_data(query_output)
Takes the products returned by check_for_s2_data_by_date and removes all non-Level 2A products.
- Parameters
query_output – A dictionary of products
- Returns
- Return type
A dictionary of products containing only the L2A data products
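The level filters above amount to a dictionary comprehension over the query output. The sketch below is illustrative only: filter_by_level is a hypothetical helper, and the processinglevel field name is an assumption about the product dictionaries rather than something this page confirms. The level strings match those documented for get_query_level.

```python
def filter_by_level(query_output, level):
    """Return only the products whose processing level equals `level`.

    Assumes each product dictionary has a 'processinglevel' entry;
    products without one are dropped.
    """
    return {
        uuid: product
        for uuid, product in query_output.items()
        if product.get("processinglevel") == level
    }


# Made-up query output for illustration.
example = {
    "uuid-1": {"processinglevel": "Level-1C"},
    "uuid-2": {"processinglevel": "Level-2A"},
}
print(list(filter_by_level(example, "Level-2A")))  # ['uuid-2']
```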
pyeo.queries_and_downloads.get_granule_identifiers(safe_product_id)
Returns the parts of an S2 name that uniquely identify that granule at a moment in time.
- Parameters
safe_product_id – The filename of a SAFE product
- Returns
satellite – A string of either “S2A” or “S2B”
intake_date – The timestamp of the data intake of this granule
orbit number – The orbit number of this granule
granule – The ID of this granule
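Because the fields of a product ID are separated by underscores, the four return values can be recovered with a single split. The sketch below is an illustration rather than the library's code: split_product_id is a hypothetical name, and the example ID is an illustrative string in the standard Sentinel-2 naming pattern with seven underscore-separated fields.

```python
def split_product_id(safe_product_id):
    """Return (satellite, intake_date, orbit_number, granule) from a SAFE name."""
    name = safe_product_id
    if name.endswith(".SAFE"):
        name = name[: -len(".SAFE")]
    # Seven fields are expected; the level, baseline and trailing
    # discriminator fields are ignored here.
    satellite, _, intake_date, _, orbit_number, granule, _ = name.split("_")
    return satellite, intake_date, orbit_number, granule


print(split_product_id(
    "S2A_MSIL1C_20170105T013442_N0204_R031_T53NMJ_20170105T013443.SAFE"
))
# ('S2A', '20170105T013442', 'R031', 'T53NMJ')
```

A name with a different number of fields raises a ValueError at the unpacking step, which is a reasonable signal that the input is not a product ID of this form.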
pyeo.queries_and_downloads.get_paginated_items(session, search_id)
Let’s leave this out for now.
pyeo.queries_and_downloads.get_planet_product_path(planet_dir, product)
Returns the path to a Planet product within a Planet directory.
pyeo.queries_and_downloads.get_query_datatake(query_item)
Gets the datatake timestamp of a query item.
- Parameters
query_item – An item from a query results dictionary.
- Returns
- Return type
The timestamp of that item’s datatake.
pyeo.queries_and_downloads.get_query_granule(query_item)
Gets the granule ID (ex: 48MXU) of a query item.
- Parameters
query_item – An item from a query results dictionary.
- Returns
- Return type
The granule ID of that item.
pyeo.queries_and_downloads.get_query_level(query_item)
Returns the processing level of the query item.
- Parameters
query_item – An item from a query results dictionary.
- Returns
- Return type
A string of either ‘Level-1C’ or ‘Level-2A’.
pyeo.queries_and_downloads.get_query_processing_time(query_item)
Returns the processing timestamp of a query item.
- Parameters
query_item – An item from a query results dictionary.
- Returns
- Return type
The processing timestamp in the format yyyymmddThhmmss (ex: 20190613T123002)
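A timestamp in this format can be turned into a datetime object with the standard library, which is convenient for sorting or comparing products. This is a small illustrative sketch; parse_processing_time is a hypothetical helper name, not a pyeo function.

```python
from datetime import datetime


def parse_processing_time(timestamp):
    """Parse a yyyymmddThhmmss string into a naive datetime."""
    return datetime.strptime(timestamp, "%Y%m%dT%H%M%S")


parsed = parse_processing_time("20190613T123002")
print(parsed.isoformat())  # 2019-06-13T12:30:02
```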
pyeo.queries_and_downloads.load_api_key(path_to_api)
Returns an API key from a single-line text file containing that API key.
- Parameters
path_to_api – The path to a text file containing only the API key
- Returns
- Return type
The API key
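Reading a key from a single-line file can be sketched as below. This is an assumption-laden illustration, not the library's code: read_single_line_key is a hypothetical name, and stripping surrounding whitespace (such as a trailing newline) is a choice made for this sketch.

```python
def read_single_line_key(path_to_api):
    """Return the first line of the file with surrounding whitespace removed."""
    with open(path_to_api, "r") as key_file:
        return key_file.readline().strip()
```

Keeping the key in its own file, outside version control, avoids embedding it in scripts.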
pyeo.queries_and_downloads.planet_query(aoi_path, start_date, end_date, out_path, api_key, item_type='PSScene4Band', search_name='auto', asset_type='analytic', threads=5)
Downloads data from Planetlabs for a given time period in the given AOI.
- Parameters
aoi_path (str) – Filepath of a single-polygon geojson containing the AOI
start_date (str) – the inclusive start of the time window in UTC format
end_date (str) – the inclusive end of the time window in UTC format
out_path (filepath-like object) – A path to the output folder. Any identically-named imagery will be overwritten.
item_type (str) – Image type to download (see Planet API docs)
search_name (str) – A name to refer to the search (required for large searches)
asset_type (str) – Planet asset type to download (see Planet API docs)
threads (int) – The number of downloads to perform concurrently
Notes
IMPORTANT: Will not run for searches returning more than 250 items.
pyeo.queries_and_downloads.read_aoi(aoi_path)
Opens the geojson file for the AOI. If it contains a FeatureCollection, returns the first feature.
- Parameters
aoi_path – The path to the geojson file
- Returns
- Return type
A dictionary translation of the feature inside the .json file
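The behaviour described above can be sketched with the json module. This is an illustration under stated assumptions rather than pyeo's implementation: first_feature is a hypothetical name, and the sketch returns the parsed object unchanged when the file does not hold a FeatureCollection.

```python
import json


def first_feature(aoi_path):
    """Load a geojson file; for a FeatureCollection, return its first feature."""
    with open(aoi_path, "r") as aoi_file:
        aoi = json.load(aoi_file)
    if aoi.get("type") == "FeatureCollection":
        return aoi["features"][0]
    return aoi
```

Any features after the first are ignored, so an AOI file intended for these functions should contain a single polygon.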
pyeo.queries_and_downloads.sent2_query(user, passwd, geojsonfile, start_date, end_date, cloud=50)
Fetches a list of Sentinel-2 products.
- Parameters
user (string) – Username for ESA hub. Register at https://scihub.copernicus.eu/dhus/#/home
passwd (string) – password for the ESA Open Access hub
geojsonfile (string) – Path to a geojson file containing a polygon of the outline of the area you wish to download. See www.geojson.io for a tool to build these.
start_date (string) – Date of beginning of search in the format YYYY-MM-DDThh:mm:ssZ (ISO standard)
end_date (string) – Date of end of search in the format YYYY-MM-DDThh:mm:ssZ. See https://www.w3.org/TR/NOTE-datetime, or use check_for_s2_data_by_date.
cloud (string (optional)) – The maximum cloud cover (as calculated by Copernicus) to download.
- Returns
A dictionary of Sentinel-2 granule products that are touched by your AOI polygon, keyed by product UUID.
Returns both level 1 and level 2 data.
Notes
If you get a ‘request too long’ error, it is likely that your polygon is too complex. The following functions download by granule; there is no need to have a precise polygon at this stage.
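Since a precise outline is not needed at the query stage, one simple way to shorten an overly complex polygon is to replace it with its bounding box. The sketch below shows this for a GeoJSON Polygon geometry. bounding_box_polygon is a hypothetical helper, not part of pyeo, and it does not handle polygons that cross the antimeridian.

```python
def bounding_box_polygon(polygon):
    """Return a GeoJSON Polygon geometry covering the extent of `polygon`.

    `polygon` is a GeoJSON Polygon geometry dict; only its exterior ring
    (the first ring) is used.
    """
    exterior = polygon["coordinates"][0]
    xs = [point[0] for point in exterior]
    ys = [point[1] for point in exterior]
    min_x, max_x, min_y, max_y = min(xs), max(xs), min(ys), max(ys)
    # Closed ring in counter-clockwise order: first and last points match.
    ring = [
        [min_x, min_y],
        [max_x, min_y],
        [max_x, max_y],
        [min_x, max_y],
        [min_x, min_y],
    ]
    return {"type": "Polygon", "coordinates": [ring]}
```

The result has five vertices regardless of how detailed the input was, at the cost of possibly touching a few extra granules around the edges of the AOI.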