geospatial_learn package

Submodules

geospatial_learn.raster module

The geodata module.

Description

A series of tools for manipulating geospatial imagery/rasters, such as masking, raster-algebra-type functions, and the conversion of Sentinel-2 data to GDAL-compatible formats.

raster.array2raster(array, bands, inRaster, outRas, dtype, FMT=None)

Save a raster from a numpy array using the geoinfo from another.

Parameters
  • array (np array) – a numpy array.

  • bands (int) – the no of bands.

  • inRaster (string) – the path of a raster.

  • outRas (string) – the path of the output raster.

  • dtype (int) – a GDAL datatype (see the GDAL website), e.g. gdal.GDT_Int32 - note this is an integer code, so you need to know what the number represents!

  • FMT (string) – (optional) a GDAL raster format (see the GDAL website) eg Gtiff, HFA, KEA.

raster.batch_translate(folder, wildcard, FMT='Gtiff')

Using the gdal python API, this function translates the format of files to commonly used formats

Parameters
  • folder (string) – the folder containing the rasters to be translated

  • wildcard (string) – the format wildcard to search for e.g. ‘.tif’

  • FMT (string (optional)) – a GDAL raster format (see the GDAL website) eg Gtiff, HFA, KEA

raster.calc_ndvi(inputIm, outputIm, bandsList, blocksize=256, FMT=None, dtype=None)

Create a copy of an image with an ndvi band added

Parameters
  • inputIm (string) – the input image

  • bands (list) – a list of band indices to be used, e.g. [3, 4] for Sentinel-2 data

  • FMT (string) – the output gdal format eg ‘Gtiff’, ‘KEA’, ‘HFA’

  • blocksize (int) – the chunk of raster read in & write out
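The index itself reduces to simple band arithmetic per block. A minimal numpy sketch of what an NDVI pass computes (the function name and zero-division handling here are illustrative, not the module's actual internals):

```python
import numpy as np

def ndvi_block(red, nir):
    """NDVI = (nir - red) / (nir + red), guarding against
    division by zero where both bands are 0."""
    red = red.astype(np.float32)
    nir = nir.astype(np.float32)
    denom = nir + red
    return np.where(denom == 0, 0,
                    (nir - red) / np.where(denom == 0, 1, denom))

# Sentinel-2 style digital numbers: red and NIR bands of a tiny raster
red = np.array([[100, 200]], dtype=np.uint16)
nir = np.array([[300, 200]], dtype=np.uint16)
ndvi = ndvi_block(red, nir)  # 0.5 for the first pixel, 0.0 for the second
```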

raster.clip_raster(inRas, inShape, outRas, nodata_value=None, blocksize=None, blockmode=True)

Clip a raster

Parameters
  • inRas (string) – the input image

  • inShape (string) – the input polygon file path

  • outRas (string) – the clipped raster

  • nodata_value (numerical (optional)) – self explanatory

  • blocksize (int (optional)) – the square chunk processed at any one time

  • blockmode (bool (optional)) – whether the raster will be clipped entirely in memory or by chunks
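Block mode reads the raster in square chunks rather than all at once. A pure-python sketch of the kind of window iteration such block processing involves (illustrative only, not the module's code):

```python
def block_windows(xsize, ysize, blocksize):
    """Yield (xoff, yoff, xcount, ycount) read windows covering a raster,
    clipping edge blocks so they don't run past the extent."""
    for yoff in range(0, ysize, blocksize):
        ycount = min(blocksize, ysize - yoff)
        for xoff in range(0, xsize, blocksize):
            xcount = min(blocksize, xsize - xoff)
            yield xoff, yoff, xcount, ycount

# a 5 x 5 raster with blocksize 2 needs 9 windows; edge blocks shrink
windows = list(block_windows(5, 5, 2))
```

Each window would be passed to GDAL's ReadAsArray/WriteArray with those offsets and counts.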

raster.color_raster(inRas, color_file, output_file)

Generate a txt colorfile and make a RGB image from a grayscale one

Parameters
  • inRas (string) – Path to input raster (single band greyscale)

  • color_file (string) – Path to output colorfile.txt

raster.combine_scene(scl, c_scn, blocksize=256)

Combine another scene classification with the sen2cor one

Parameters
  • scl (string) – the sen2cor scene classification

  • c_scn (string) – the independently derived one - this will be modified

  • blocksize (int) – the chunk to process

raster.hist_match(inputImage, templateImage)

Adjust the pixel values of a grayscale image such that its histogram matches that of a target image.

Writes to the inputImage dataset so that it matches

As the entire band histogram is required this can become memory intensive with big rasters eg 10 x 10k+

Inspired by/adapted from something on Stack Overflow on image processing - credit to that author

Parameters
  • inputImage (string) – image to transform; the histogram is computed over the flattened array

  • templateImage (string) – template image can have different dimensions to source
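The description matches the well-known numpy CDF-interpolation technique; a self-contained sketch of that approach on arrays (not necessarily the exact code used here, which reads and writes GDAL datasets):

```python
import numpy as np

def hist_match_arrays(source, template):
    """Match the histogram of `source` to `template` by mapping the
    source CDF onto the template CDF via interpolation."""
    oldshape = source.shape
    source = source.ravel()
    template = template.ravel()
    # unique values, their positions in the source, and their counts
    s_values, bin_idx, s_counts = np.unique(source, return_inverse=True,
                                            return_counts=True)
    t_values, t_counts = np.unique(template, return_counts=True)
    # empirical CDFs of both images
    s_quantiles = np.cumsum(s_counts).astype(np.float64) / source.size
    t_quantiles = np.cumsum(t_counts).astype(np.float64) / template.size
    # map each source quantile to the template value at that quantile
    interp = np.interp(s_quantiles, t_quantiles, t_values)
    return interp[bin_idx].reshape(oldshape)
```

Note the whole-band flattening above is exactly why memory use grows with raster size.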

raster.jp2_translate(folder, FMT=None, mode='L1C')

Translate all files from an S2 download to a usable format

Default FMT is GTiff (leave blank); for .img use FMT=’HFA’, for .vrt use FMT=’VRT’

If you possess a gdal compiled with the correct openjpeg support, use that

This function might be useful if you wish to retain separate rasters, but the use of stack_S2 is recommended

Parameters
  • folder (string) – S2 granule dir

  • mode (string) – ‘L2A’ , ‘20’, ‘10’, L1C (default)

  • FMT (string (optional)) – a GDAL raster format (see the GDAL website) eg Gtiff, HFA, KEA

raster.jp2_translate_batch(mainFolder, FMT=None, mode=None)

Batch version of jp2_translate

Perhaps only useful for the old tile format

Parameters
  • mainFolder (string) – the path to S2 tile folder to process

  • FMT (string) – a GDAL raster format (see the GDAL website) eg Gtiff, HFA, KEA

  • mode (string (optional)) – ‘L2A’ , ‘20’, ‘10’, L1C (default)

raster.mask_raster(inputIm, mval, overwrite=True, outputIm=None, blocksize=None, FMT=None)

Perform a numpy masking operation on a raster where all values corresponding to the mask value are retained - this is done in blocks for efficiency on larger rasters

Parameters
  • inputIm (string) – the input raster

  • mval (int) – the mask value eg 1, 2 etc

  • FMT (string) – the output gdal format eg ‘Gtiff’, ‘KEA’, ‘HFA’

  • outputIm (string (optional)) – optionally write a separate output image, if None, will mask the input

  • blocksize (int) – the chunk of raster to read in

Returns

A string of the output file path

Return type

string
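Per block, the retain-the-mask-value operation is essentially a numpy where; a toy sketch (names illustrative):

```python
import numpy as np

def mask_block(block, mval, outval=0):
    """Retain pixels equal to the mask value; set everything else to outval."""
    return np.where(block == mval, block, outval)

arr = np.array([[1, 2], [2, 1]])
masked = mask_block(arr, 2)  # pixels != 2 become 0
```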

raster.mask_raster_multi(inputIm, mval=1, outval=None, mask=None, blocksize=256, FMT=None, dtype=None)

Perform a numpy masking operation on a raster where all values corresponding to the mask value are retained - this is done in blocks for efficiency on larger rasters

Parameters
  • inputIm (string) – the input raster

  • mval (int) – the masking value that delineates pixels to be kept

  • outval (numerical dtype eg int, float) – the value the removed areas will be written to; default is 0

  • mask (string) – the mask raster to be used (optional)

  • FMT (string) – the output gdal format eg ‘Gtiff’, ‘KEA’, ‘HFA’

  • mode (string) – None > 10m data, ‘20’ >20m

  • blocksize (int) – the chunk of raster read in & write out

raster.multi_temp_filter(inRas, outRas, bands=None, windowSize=None)

The multi temp filter for radar data as outlined & published by Quegan et al, Uni of Sheffield

This is only suitable for small images, as it holds intermediate data in memory

Parameters
  • inRas (string) – the input raster

  • outRas (string) – the output raster

  • blocksize (int) – the chunk processed

  • windowsize (int) – the filter window size

  • FMT (string) – gdal compatible format (optional); default is tif

raster.multi_temp_filter_block(inRas, outRas, bands=None, blocksize=256, windowsize=7, FMT=None)

Multi temporal filter implementation for radar data

See Quegan et al. for the paper

Requires an installation of OTB

Parameters
  • inRas (string) – the input raster

  • outRas (string) – the output raster

  • blocksize (int) – the chunk processed

  • windowsize (int) – the filter window size

  • FMT (string) – gdal compatible format (optional); default is tif

raster.polygonize(inRas, outPoly, outField=None, mask=True, band=1, filetype='ESRI Shapefile')

Lifted straight from the cookbook and gdal func docs.

http://pcjericks.github.io/py-gdalogr-cookbook

Parameters

inRas (string) – the input image

outPoly (string) – the output polygon file path

outField (string (optional)) – the name of the field containing the burned values

mask (bool (optional)) – use the input raster as a mask

band (int) – the input raster band

raster.raster2array(inRas, bands=[1])

Read a raster and return an array, either single or multiband

Parameters
  • inRas (string) – input raster

  • bands (list) – a list of bands to return in the array

raster.remove_cloud_S2(inputIm, sceneIm, blocksize=256, FMT=None, min_size=4, dist=1)

Remove cloud using a scene classification

This saves back to the input raster by default

Parameters
  • inputIm (string) – the input image

  • sceneIm (string) – the scene map to use as a mask for removing cloud. It is assumed the scene map consists of 1 = shadow, 2 = cloud, 3 = land, 4 = water

  • FMT (string) – the output gdal format eg ‘Gtiff’, ‘KEA’, ‘HFA’

  • min_size (int) – size in pixels to retain of cloud mask

  • blocksize (int) – the square chunk processed at any one time
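Given that scene-map convention, the core masking step is a simple numpy lookup; a sketch (the real function also filters small mask objects via min_size and can grow the mask via dist, both omitted here):

```python
import numpy as np

def drop_cloud_shadow(image, scene):
    """Zero image pixels the scene map flags as shadow (1) or cloud (2)."""
    out = image.copy()
    out[np.isin(scene, [1, 2])] = 0
    return out

image = np.array([[10, 20], [30, 40]])
scene = np.array([[1, 2], [3, 4]])   # shadow, cloud, land, water
cleaned = drop_cloud_shadow(image, scene)
```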

raster.remove_cloud_S2_stk(inputIm, sceneIm1, sceneIm2=None, baseIm=None, blocksize=256, FMT=None, max_size=10, dist=1)

Remove cloud using the c_utils scene classification. The KEA format is recommended; .tif is the default.

There is no need to add the file extension - this is done automatically

Parameters
  • inputIm (string) – the input image

  • sceneIm1, 2 (string) – the classification rasters used to mask out areas in the input image

  • baseIm (string) – Another multiband raster of same size/extent as the inputIm where the baseIm image values are used rather than simply converting to zero (in the use case of 2 sceneIm classifications)

Returns

None

Notes

Useful if you have a base image which is a cloudless composite, which you intend to replace with the current image for the next round of classification/change detection

raster.rgb_ind(inputIm, outputIm, blocksize=256, FMT=None, dtype=5)

Create a copy of an image with RGB-derived index bands added

Parameters
  • inputIm (string) – the input rgb image

  • outputIm (string) – the output image

  • FMT (string) – the output gdal format eg ‘Gtiff’, ‘KEA’, ‘HFA’

  • blocksize (int) – the chunk of raster read in & write out

raster.stack_S2(granule, inFMT='jp2', FMT=None, mode=None, old_order=False, blocksize=2048, overwrite=True)

Stacks S2 bands downloaded from ESA site

Can translate directly from jp2 format (this is recommended and is default).

If you possess gdal 2.1 with jp2k support then alternatively use gdal_translate

Parameters
  • granule (string) – the granule folder

  • inFMT (string (optional)) – the format of the bands will likely be jp2

  • FMT (string (optional)) – the output gdal format eg ‘Gtiff’, ‘KEA’, ‘HFA’

  • mode (string (optional)) – None, ‘10’ ‘20’

  • old_order (bool (optional)) – this function used to order the 20m imagery 2,3,4,5,6,7,11,12,8a; if False it is ordered 2,3,4,5,6,7,8a,11,12

  • blocksize (int (optional)) – the chunk of jp2 to read in - glymur seems to work fastest with 2048

Returns

A string of the output file path

Return type

string

raster.stack_ras(rasterList, outFile)

Stack some rasters for change classification

Parameters
  • rasterList (list of strings) – the input images

  • outFile (string) – the output file path including file extension

raster.stat_comp(inRas, outMap, bandList=None, stat='percentile', q=95, blocksize=256, FMT=None, dtype=6)

Calculate depth wise stat on a multi band raster with selected or all bands

Parameters
  • inRas (string) – input Raster

  • outMap (string) – the output raster calculated

  • stat (string) – the statistic to be calculated; make sure there are no nans, as nan percentile is far too slow

  • blocksize (int) – the chunk processed

  • q (int) – the ith percentile if percentile is the stat used

  • FMT (string) – gdal compatible format (optional); default is tif

  • dtype (int) – a gdal datatype code (the default of 6 corresponds to gdal.GDT_Float32)
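With the raster read as a (bands, rows, cols) array, a depth-wise statistic is a reduction along axis 0; a numpy sketch of the percentile composite:

```python
import numpy as np

# three bands of a 2 x 2 raster, stacked depth-wise
stack = np.array([[[0., 10.], [20., 30.]],
                  [[40., 50.], [60., 70.]],
                  [[80., 90.], [100., 110.]]])

comp = np.percentile(stack, 95, axis=0)  # per-pixel 95th percentile
med = np.median(stack, axis=0)           # per-pixel median
```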

raster.temporal_comp(fileList, outMap, stat='percentile', q=95, folder=None, blocksize=None, FMT=None, dtype=5)

Calculate an image based on a time series collection of imagery (e.g. a year’s worth of S2 data)

Parameters
  • fileList (list of strings) – the files to be input; if None, a folder must be specified

  • outMap (string) – the output raster calculated

  • stat (string) – the statistic to be calculated

  • blocksize (int) – the chunk processed

  • q (int) – the ith percentile if percentile is the stat used

  • FMT (string) – gdal compatible format (optional); default is tif

  • dtype (int) – a gdal datatype code (default gdal.GDT_Int32)

raster.tile_rasters(inImage, outputImage, tilesize)

Split a large raster into smaller ones

Parameters
  • inImage (string) – the path to input raster

  • outputImage (string) – the path to the output image

  • tilesize (int) – the side of a square tile

geospatial_learn.learning module

the learning module

Description

The learning module set of functions provides a framework to optimise and classify EO data for both per-pixel and object properties

learning.RF_oob_opt(model, X_train, min_est, max_est, step, regress=False)

This function uses the oob score to find the best parameters.

This cannot be parallelized due to the warm-start bootstrapping, so it is potentially slower than the cross-validation approach in the create_model function

This function is based on an example from the sklearn site

This function plots a graph displaying the oob error rate

Parameters
  • model (string (.gz)) – path to model to be saved

  • X_train (np array) – numpy array of training data where the 1st column is labels

  • min_est (int) – min no of trees

  • max_est (int) – max no of trees

  • step (int) – the step at which no of trees is increased

  • regress (bool) – boolean where if True it is a regressor

Returns

error rate, best estimator

Return type

tuple of np arrays

learning.classify_object(model, inShape, attributes, field_name=None)

Classify a polygon/point file attributes (‘object based’) using an sklearn model

Parameters
  • model (string) – path to input model

  • inShape (string) – input shapefile path (must be .shp for now….)

  • attributes (list of strings) – list of attribute names

  • field_name (string) – name of classified label field (optional)

learning.classify_pixel(model, inputDir, bands, outMap, probMap)

A function to classify an image using a pre-saved model - assumes a folder of tiled rasters for memory management. classify_pixel_bloc is recommended instead of this function

Parameters

  • model (string) – a path to a scikit learn model that has been saved

  • inputDir (string) – a folder with images to be classified

  • bands (int) – the no of image bands eg 8

  • outMap (string) – path to output image excluding the file format ‘pathto/mymap’

  • probMap (string) – path to output prob image excluding the file format ‘pathto/mymap’

  • FMT (string) – optional parameter - gdal readable fmt

learning.classify_pixel_bloc(model, inputImage, bands, outMap, blocksize=None, FMT=None, ndvi=None, dtype=5)

A block processing classifier for large rasters, supports KEA, HFA, & Gtiff formats. KEA is recommended, Gtiff is the default

Parameters
  • model (sklearn model) – a path to a scikit learn model that has been saved

  • inputImage (string) – path to image including the file fmt ‘Myimage.tif’

  • bands (band) – the no of image bands eg 8

  • outMap (string) – path to output image excluding the file format ‘pathto/mymap’

  • FMT (string) – optional parameter - gdal readable fmt

  • blocksize (int (optional)) – size of raster chunk in pixels; 256 tends to be quickest. If you pass None it will read the block size from gdal (this doesn’t always pay off!)

  • dtype (int (optional - gdal syntax e.g. gdal.GDT_Int32)) – a gdal datatype - default is int32

Notes

Block processing is sequential, but quite a few sklearn models are parallel so that has been prioritised rather than raster IO
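The block pattern here is: read a chunk, flatten it to a (pixels, bands) table, predict, reshape back. A minimal sketch with a stand-in model (any object with a sklearn-style predict method works; the threshold model is invented for illustration):

```python
import numpy as np

class ThresholdModel:
    """Stand-in for a saved sklearn model: anything exposing
    predict(X) over a (n_samples, n_features) array."""
    def predict(self, X):
        return (X[:, 0] > 50).astype(np.int32)

def classify_blocks(image, model, blocksize):
    """Classify a (bands, rows, cols) array chunk by chunk, as
    classify_pixel_bloc does with raster blocks."""
    bands, ysize, xsize = image.shape
    out = np.zeros((ysize, xsize), dtype=np.int32)
    for yoff in range(0, ysize, blocksize):
        for xoff in range(0, xsize, blocksize):
            block = image[:, yoff:yoff + blocksize, xoff:xoff + blocksize]
            b, by, bx = block.shape
            X = block.reshape(b, -1).T          # pixels as rows, bands as columns
            out[yoff:yoff + by, xoff:xoff + bx] = model.predict(X).reshape(by, bx)
    return out
```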

learning.create_model(X_train, outModel, clf='svc', random=False, cv=6, cores=-1, strat=True, regress=False, params=None, scoring=None)

Brute force or random model creation using scikit learn. Either use the default params in this function or enter your own (recommended - see sklearn)

Parameters
  • X_train (np array) – numpy array of training data where the 1st column is labels

  • outModel (string) – the output model path which is a gz file

  • clf (string) – an sklearn or xgb classifier/regressor logit, sgd, linsvc, svc, svm, nusvm, erf, rf, gb, xgb

  • random (bool) – if True, a random param search

  • cv (int) – no of folds

  • cores (int or -1 (default)) – the no of parallel jobs

  • strat (bool) – a stratified grid search

  • regress (bool) – a regression model if True, a classifier if False

  • params (a dict of model params (see scikit learn)) – enter your own params dict rather than the range provided

  • scoring (string) – a suitable sklearn scoring type (see notes)

There are more sophisticated ways to tune a model; this greedily searches everything, which can be computationally costly. Fine tuning in a more measured way is likely better - there are numerous books, guides etc. E.g. with gb: first tune the no of trees, then the learning rate, then the tree-specific params

From my own experience and reading around

sklearn svms tend not to be great on large training sets and are slower with these (I have tried on HPCs and they time out on multi fits)

sklearn ‘gb’ is very slow to train, though quick to predict

xgb is much faster, but rather different in algorithmic detail - i.e. it won’t produce the same results as sklearn…

xgb also uses the sklearn wrapper params, which differ from those in the xgb docs, hence they are commented next to the relevant area of code

Scoring types - there are a lot - some of which won’t work for multi-class, regression etc - see the sklearn docs!

‘accuracy’, ‘adjusted_rand_score’, ‘average_precision’, ‘f1’, ‘f1_macro’, ‘f1_micro’, ‘f1_samples’, ‘f1_weighted’, ‘neg_log_loss’, ‘neg_mean_absolute_error’, ‘neg_mean_squared_error’, ‘neg_median_absolute_error’, ‘precision’, ‘precision_macro’, ‘precision_micro’, ‘precision_samples’, ‘precision_weighted’, ‘r2’, ‘recall’, ‘recall_macro’, ‘recall_micro’, ‘recall_samples’, ‘recall_weighted’, ‘roc_auc’
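The brute-force search the docstring describes maps directly onto scikit-learn's GridSearchCV; a toy sketch (the data, classifier choice, and parameter grid are invented, and create_model's actual defaults may differ):

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.svm import SVC

# toy training array in the layout create_model expects:
# first column labels, remaining columns features
rng = np.random.RandomState(0)
X_train = np.column_stack([rng.randint(0, 2, 60), rng.rand(60, 3)])
labels, feats = X_train[:, 0], X_train[:, 1:]

param_grid = {'C': [0.1, 1, 10], 'gamma': ['scale']}   # illustrative grid
search = GridSearchCV(SVC(), param_grid,
                      cv=StratifiedKFold(n_splits=3),  # strat=True analogue
                      n_jobs=-1)                        # cores=-1 analogue
search.fit(feats, labels)
best = search.best_estimator_
```

The fitted best_estimator_ is what would then be dumped to the .gz model path via joblib.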

learning.create_model_tpot(X_train, outModel, cv=6, cores=-1, regress=False, params=None, scoring=None)

Create a model using the tpot library, where genetic algorithms are used to optimise the pipeline and params.

This also supports xgboost incidentally

Parameters
  • X_train (np array) – numpy array of training data where the 1st column is labels

  • outModel (string) – the output model path (which is a .py file) from which to run the pipeline

  • cv (int) – no of folds

  • cores (int or -1 (default)) – the no of parallel jobs

  • strat (bool) – a stratified grid search

  • regress (bool) – a regression model if True, a classifier if False

  • params (a dict of model params (see tpot)) – enter your own params dict rather than the range provided

  • scoring (string) – a suitable sklearn scoring type (see notes)

learning.get_training(inShape, inRas, bands, field, outFile=None)

Collect training as an np array for use with the create_model function

Parameters
  • inShape (string) – the input shapefile - must be esri .shp at present

  • inRas (string) – the input raster from which the training is extracted

  • bands (int) – no of bands

  • field (string) – the attribute field containing the training labels

  • outFile (string (optional)) – path to the training file saved as joblib format (eg - ‘training.gz’)

Returns

A tuple containing:

  • np array of training data

  • list of polygons with invalid geometry that were not collected
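The returned array keeps labels in column 0, which is the layout create_model expects; assembling such an array by hand looks like this (values invented):

```python
import numpy as np

labels = np.array([1, 1, 2])            # class labels from the shapefile field
pixels = np.array([[0.1, 0.4],          # one row of band values per pixel
                   [0.2, 0.5],
                   [0.9, 0.3]])
training = np.column_stack([labels, pixels])
# shape (3, 1 + bands): column 0 is the label column create_model reads
```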

learning.get_training_point(inShape, inRas, bands, field)

Collect training as a np array for use with the create_model function using point data

Parameters
  • inShape (string) – the input shapefile - must be esri .shp at present

  • inRas (string) – the input raster from which the training is extracted

  • bands (int) – no of bands

  • field (string) – the attribute field containing the training labels

  • outFile (string (optional)) – path to the training file saved as joblib format (eg - ‘training.gz’)

Returns

A tuple containing:

  • np array of training data

  • list of polygons with invalid geometry that were not collected

UNFINISHED DO NOT USE

learning.get_training_shp(inShape, label_field, feat_fields, outFile=None)

Collect training from a shapefile attribute table. Used for object-based classification (typically).

Parameters
  • inShape (string) – the input shapefile - must be esri .shp at present

  • label_field (string) – the field name for the class labels

  • feat_fields (list) – the field names of the feature data

  • outFile (string (optional)) – path to training data to be saved (.gz)

Returns

  • training data as a dataframe, first column is labels, rest are features

  • list of reject features

learning.plot_feature_importances(modelPth, featureNames)

Plot the feature importances of an ensemble classifier

Parameters
  • modelPth (string) – A sklearn model path

  • featureNames (list of strings) – a list of feature names

learning.prob_pixel_bloc(model, inputImage, bands, outMap, classes, blocksize=None, FMT=None, one_class=None)

A block processing classifier for large rasters that produces a probability output.

Supports KEA, HFA, & Gtiff formats -KEA is recommended, Gtiff is the default

Parameters
  • model (string) – a path to a scikit learn model that has been saved

  • inputImage (string) – path to image including the file fmt ‘Myimage.tif’

  • bands (int) – the no of image bands eg 8

  • outMap (string) – path to output image excluding the file format ‘pathto/mymap’

  • classes (int) – no of classes

  • blocksize (int (optional)) – size of raster chunk; 256 tends to be quickest. If you pass None it will read the size from gdal (this doesn’t always pay off!)

  • FMT (string) – optional parameter - gdal readable fmt eg ‘Gtiff’

  • one_class (int) – choose a single class to produce output prob raster

Block processing is sequential, but quite a few sklearn models are parallel so that has been prioritised rather than raster IO

learning.rmse_vector_lyr(inShape, attributes)

Using sklearn, get the rmse of 2 vector attributes (the actual and predicted, of course in the order [‘actual’, ‘pred’])

Parameters
  • inShape (string) – the input vector of OGR type

  • attributes (list) – a list of strings denoting the attributes

geospatial_learn.shape module

The shape module.

Description

This module contains various functions for the writing of data in OGR vector formats. The functions are mainly concerned with writing geometric or pixel based attributes, with the view to them being classified in the learning module

shape.meshgrid(inRaster, outShp, gridHeight=1, gridWidth=1)
shape.ms_snake(inShp, inRas, outShp, band=2, buf1=0, buf2=0, algo='ACWE', nodata_value=0, iterations=200, smoothing=1, lambda1=1, lambda2=1, threshold='auto', balloon=-1)

Deform a polygon using active contours on the values of an underlying raster.

This uses morphsnakes and explanations are from there.

Parameters
  • inShp (string) – input shapefile

  • inRas (string) – input raster

  • outShp (string) – output shapefile

  • band (int) – an integer val eg - 2

  • algo (string) – either “GAC” (geodesic active contours) or the default “ACWE” (active contours without edges)

  • buf1 (int) – the buffer if any in map units for the bounding box of the poly which extracts underlying pixel values.

  • buf2 (int) – the buffer if any in map units for the expansion or contraction of the poly which will initialise the active contour. This is here as you may wish to adjust the init polygon so it does not converge on an adjacent one or undesired area.

  • nodata_value (numerical) – If used the no data val of the raster

  • iterations (uint) – Number of iterations to run.

  • smoothing (uint, optional) – Number of times the smoothing operator is applied per iteration. Reasonable values are around 1-4. Larger values lead to smoother segmentations.

  • lambda1 (float, optional) – Weight parameter for the outer region. If lambda1 is larger than lambda2, the outer region will contain a larger range of values than the inner region.

  • lambda2 (float, optional) – Weight parameter for the inner region. If lambda2 is larger than lambda1, the inner region will contain a larger range of values than the outer region.

  • threshold (float, optional) – Areas of the image with a value smaller than this threshold will be considered borders. The evolution of the contour will stop in these areas.

  • balloon (float, optional) – Balloon force to guide the contour in non-informative areas of the image, i.e., areas where the gradient of the image is too small to push the contour towards a border. A negative value will shrink the contour, while a positive value will expand the contour in these areas. Setting this to zero will disable the balloon force.

shape.ransac_lines(inRas, outRas, sigma=3, row=True, col=True, binwidth=40)
shape.shape_props(inShape, prop, inRas=None, label_field='ID')

Calculate various geometric properties of a set of polygons. Output will be relative to geographic units where relevant, and normalised where not (e.g. Eccentricity)

Parameters

inShape (string) – input shape file path

inRas (string) – a raster to get the correct dimensions from (optional), required for scikit-image props

prop (string) – a scikit-image regionprops property (see http://scikit-image.org/docs/dev/api/skimage.measure.html). OGR is used to generate most of these as it is faster, but the string keys are the same as in scikit-image - see the notes for which props require a raster

Notes

Only shape file needed (OGR / shapely / numpy based)

‘MajorAxisLength’, ‘MinorAxisLength’, ‘Area’, ‘Eccentricity’, ‘Solidity’, ‘Extent’, ‘Perimeter’ (written to the field ‘Perim’)

Raster required

‘Orientation’ and the remainder of the props calculable with scikit-image. These process a bit slower than the above ones
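For the OGR-side properties, Area for instance comes down to the shoelace formula over the ring coordinates; a pure-python sketch of that calculation (roughly what OGR's GetArea computes for a simple polygon - this is not the module's own code):

```python
def shoelace_area(coords):
    """Polygon area from an ordered ring of (x, y) vertices via the
    shoelace formula; abs() makes it orientation-independent."""
    area = 0.0
    n = len(coords)
    for i in range(n):
        x1, y1 = coords[i]
        x2, y2 = coords[(i + 1) % n]   # wrap back to the first vertex
        area += x1 * y2 - x2 * y1
    return abs(area) / 2.0

rect_area = shoelace_area([(0, 0), (2, 0), (2, 1), (0, 1)])  # 2 x 1 rectangle
```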

shape.shp2gj(inShape, outJson)

Converts a shapefile to a geojson/json file

Parameters

inShape (string) – input shapefile

outJson (string) – output geojson

Notes

Credit to the person who posted this on the pyshp site

shape.snake(inShp, inRas, outShp, band=1, buf=1, nodata_value=0, boundary='fixed', alpha=0.1, beta=30.0, w_line=0, w_edge=0, gamma=0.01, max_iterations=2500, smooth=True, eq=False, rgb=False)

Deform a line using active contours based on the values of an underlying raster - based on skimage at present, so not quick!

Notes

Param explanations for snake/active contour from scikit-image api

Parameters
  • vector_path (string) – input shapefile

  • raster_path (string) – input raster

  • band (int) – an integer val eg - 2

  • buf (int) – the buffer area to include for the snake deformation

  • alpha (float) – Snake length shape parameter. Higher values make the snake contract faster.

  • beta (float) – Snake smoothness shape parameter. Higher values make the snake smoother.

  • w_line (float) – Controls attraction to brightness. Use negative values to attract toward dark regions.

  • w_edge (float) – Controls attraction to edges. Use negative values to repel snake from edges.

  • gamma (float) – Explicit time stepping parameter.

  • max_iterations (int) – No of iterations to evolve snake

  • boundary (string) – Scikit-image text: Boundary conditions for the contour. Can be one of ‘periodic’, ‘free’, ‘fixed’, ‘free-fixed’, or ‘fixed-free’. ‘periodic’ attaches the two ends of the snake, ‘fixed’ holds the end-points in place, and ‘free’ allows free movement of the ends. ‘fixed’ and ‘free’ can be combined by parsing ‘fixed-free’, ‘free-fixed’. Parsing ‘fixed-fixed’ or ‘free-free’ yields same behaviour as ‘fixed’ and ‘free’, respectively.

  • nodata_value (numerical) – If used the no data val of the raster

  • rgb (bool) – read in bands 1-3 assuming them to be RGB

shape.texture_stats(vector_path, raster_path, band, gprop='contrast', offset=2, angle=0, write_stat=None, nodata_value=0, mean=False)

Calculate and optionally write texture stats for an OGR compatible polygon based on underlying raster values

Parameters
  • vector_path (string) – input shapefile

  • raster_path (string) – input raster path

  • gprop (string) – a skimage glcm property: entropy, contrast, dissimilarity, homogeneity, ASM, energy, correlation

  • offset (int) – distance in pixels to measure - minimum of 2!!!

  • angle (int) – angle in degrees from the pixel, per the diagram

        135  90  45
          \  |  /
           c - 0

  • mean (bool) – take the mean of all offsets

Important to note that the results will be unreliable for glcm texture features if seg is true, as non-masked values will be zero or some weird no data and will affect results

Notes

Important

The texture of the bounding box is at present the “reliable” measure

Using the segment only results in potentially spurious results due to the scikit-image algorithm measuring texture over zero/nodata to number pixels (e.g 0>54). The segment part will be developed in due course to overcome this issue

shape.thresh_seg(inShp, inRas, outShp, band, buf=0, algo='otsu', min_area=4, nodata_value=0)

Use an image processing technique to threshold foreground and background in a polygon segment.

The default is otsu’s method.

Parameters
  • vector_path (string) – input shapefile

  • raster_path (string) – input raster

  • band (int) – an integer val eg - 2

  • algo (string) – ‘otsu’, ‘niblack’, ‘sauvola’

  • nodata_value (numerical) – If used the no data val of the raster

shape.write_text_field(inShape, fieldName, attribute)

Write a string to an OGR vector file

Parameters
  • inShape (string) – input OGR vector file

  • fieldName (string) – name of field being written

  • attribute (string) – ‘text to enter in each entry of column’

shape.zonal_rgb_idx(vector_path, raster_path, nodata_value=0)

Calculate RGB-based indices per segment/AOI

Parameters
  • vector_path (string) – input shapefile

  • raster_path (string) – input raster

  • nodata_value (numerical) – If used the no data val of the raster

shape.zonal_stats(vector_path, raster_path, band, bandname, stat='mean', write_stat=None, nodata_value=0)

Calculate zonal stats for an OGR polygon file

Parameters
  • vector_path (string) – input shapefile

  • raster_path (string) – input raster

  • band (int) – an integer val eg - 2

  • bandname (string) – eg - blue

  • stat (string) – the stat to calculate; if omitted it will be ‘mean’. Others: ‘mode’, ‘min’, ‘max’, ‘std’, ‘sum’, ‘count’, ‘var’, ‘skew’, ‘kurt(osis)’

  • write_stat (bool (optional)) – if True, the stat will be written to the OGR file; if False, a dataframe only is returned

  • nodata_value (numerical) – If used the no data val of the raster
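With each polygon rasterised to a zone label array, a zonal stat is a masked reduction per label; a numpy sketch (names illustrative, nodata handling simplified):

```python
import numpy as np

def zonal_mean(values, zones, nodata_value=0):
    """Mean raster value per zone label, skipping nodata pixels."""
    stats = {}
    for z in np.unique(zones):
        pix = values[(zones == z) & (values != nodata_value)]
        stats[int(z)] = float(pix.mean()) if pix.size else None
    return stats

vals = np.array([[1, 2], [3, 0]])    # 0 is nodata here
zone_map = np.array([[1, 1], [2, 2]])
stats = zonal_mean(vals, zone_map)
```

Swapping mean for another reducer (median, std, etc.) gives the other supported stats.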

shape.zonal_stats_all(vector_path, raster_path, bandnames, statList=['mean', 'min', 'max', 'median', 'std', 'var', 'skew', 'kurt'])

Calculate zonal stats for an OGR polygon file

Parameters
  • vector_path (string) – input shapefile

  • raster_path (string) – input raster

  • band (int) – an integer val eg - 2

  • bandnames (list) – eg - [‘b’,’g’,’r’,’nir’]

  • nodata_value (numerical) – If used the no data val of the raster

geospatial_learn.utilities module

Created on Thu Sep 8 22:35:39 2016 @author: Ciaran Robb. The utilities module - things here don’t have an exact theme or home yet, so may eventually move elsewhere

If you use this code to publish work, cite/acknowledge me and the authors of the libs etc. as appropriate

utilities.accum_gabor(inRas, outRas=None, size=(9, 9), stdv=1, no_angles=16, wave_length=3, eccen=1, phase_off=0, pltgrid=(4, 4), blockproc=False)

Process with custom gabor filters and output a raster containing each kernel output as a band

Parameters
  • inRas (string) – input raster

  • outRas (string) – output raster

  • size (tuple) – size of in gabor kernel in pixels (ksize)

  • stdv (int) – the stdv of the gaussian envelope of the gabor kernel (sigma/stdv)

  • no_angles (int) – number of angles in gabor kernel (theta)

  • wave_length (int) – width of stripe in gabor kernel (lambda/wavelength)

  • phase_off (int) – the phase offset of the kernel

  • eccen (int) – the ellipticity of the kernel; when = 1 the gaussian envelope is circular

  • blockproc (bool) – whether to process in chunks - necessary for very large images!

utilities.colorscale(seg, prop)
utilities.combine_hough_seg(inRas1, inRas2, outRas, outShp, min_area=None)
utilities.get_corners(bboxes)

Get corners of bounding boxes

Parameters

bboxes (numpy.ndarray) – Numpy array containing bounding boxes of shape N X 4 where N is the number of bounding boxes and the bounding boxes are represented in the format x1 y1 x2 y2

Returns

Numpy array of shape N x 8 containing N bounding boxes each described by their corner co-ordinates x1 y1 x2 y2 x3 y3 x4 y4

Return type

numpy.ndarray
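The N x 4 to N x 8 expansion is just column shuffling in numpy; a sketch (the exact corner ordering used by get_corners is an assumption here):

```python
import numpy as np

def bbox_corners(bboxes):
    """Expand N x 4 axis-aligned boxes (x1 y1 x2 y2) into N x 8 corner
    coordinates, here ordered (x1 y1, x2 y1, x1 y2, x2 y2)."""
    x1, y1, x2, y2 = bboxes[:, 0], bboxes[:, 1], bboxes[:, 2], bboxes[:, 3]
    return np.stack([x1, y1, x2, y1, x1, y2, x2, y2], axis=1)

corners = bbox_corners(np.array([[0, 0, 2, 3]]))
```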

utilities.get_enclosing_box(corners)

Get an enclosing box for rotated corners of a bounding box

Parameters

corners (numpy.ndarray) – Numpy array of shape N x 8 containing N bounding boxes each described by their corner co-ordinates x1 y1 x2 y2 x3 y3 x4 y4

Returns

Numpy array containing enclosing bounding boxes of shape N X 4 where N is the number of bounding boxes and the bounding boxes are represented in the format x1 y1 x2 y2

Return type

numpy.ndarray
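
The enclosing box is just the per-row min/max of the corner coordinates; a numpy sketch of the documented behaviour (not the package source):

```python
import numpy as np

def get_enclosing_box(corners):
    """N x 8 corner coordinates -> N x 4 axis-aligned enclosing boxes."""
    xs = corners[:, 0::2]  # every x column
    ys = corners[:, 1::2]  # every y column
    return np.stack([xs.min(1), ys.min(1), xs.max(1), ys.max(1)], axis=1)
```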

utilities.hough2line(inRas, outShp, edge='canny', sigma=2, thresh=None, ratio=2, n_orient=6, n_scale=5, hArray=True, vArray=True, prob=False, line_length=100, line_gap=200, valrange=1, interval=10, band=2, min_area=None)
utilities.image_thresh(image)
utilities.iter_ransac(image, sigma=3, no_iter=10, order='col', mxt=2500)
utilities.min_bound_rectangle(points)

Find the smallest bounding rectangle for a set of points. Returns a set of points representing the corners of the bounding box.

Parameters

points (list) – an nx2 iterable of points

Returns

an nx2 list of coordinates

Return type

list
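
A rotating-calipers sketch of the same idea. The approach (convex hull, then testing each edge orientation) is an assumption; the package's corner ordering may differ:

```python
import numpy as np

def _convex_hull(pts):
    """Andrew's monotone chain; returns hull vertices in order."""
    pts = pts[np.lexsort((pts[:, 1], pts[:, 0]))]
    def half(seq):
        out = []
        for p in seq:
            while len(out) >= 2 and (
                    (out[-1][0] - out[-2][0]) * (p[1] - out[-2][1])
                    - (out[-1][1] - out[-2][1]) * (p[0] - out[-2][0])) <= 0:
                out.pop()
            out.append(p)
        return out[:-1]
    return np.array(half(pts) + half(pts[::-1]))

def min_bound_rectangle(points):
    """Smallest-area rotated rectangle around 2D points (rotating calipers)."""
    hull = _convex_hull(np.asarray(points, dtype=float))
    edges = np.diff(np.vstack([hull, hull[:1]]), axis=0)
    angles = np.unique(np.mod(np.arctan2(edges[:, 1], edges[:, 0]), np.pi / 2))
    best_area, best_corners = np.inf, None
    for a in angles:
        rot = np.array([[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]])
        r = hull @ rot.T                      # rotate hull to axis-align one edge
        mn, mx = r.min(0), r.max(0)
        area = np.prod(mx - mn)
        if area < best_area:
            corners = np.array([[mn[0], mn[1]], [mx[0], mn[1]],
                                [mx[0], mx[1]], [mn[0], mx[1]]])
            best_area, best_corners = area, corners @ rot  # rotate back
    return best_corners
```

The minimum-area rectangle always has one side collinear with a hull edge, which is why only the hull edge angles need testing.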

utilities.ms_toposeg(inRas, outShp, iterations=100, algo='ACWE', band=2, dist=30, se=3, usemin=False, imtype=None, useedge=True, burnedge=False, merge=False, close=True, sigma=4, hi_t=None, low_t=None, init=4, smooth=1, lambda1=1, lambda2=1, threshold='auto', balloon=1)

Topology preserving segmentation, implemented in python/numpy, inspired by ms_topo and morphsnakes

This uses morphsnakes level sets to make the segments; the parameter explanations are mainly from there.

Parameters
  • inRas (string) – input raster whose pixel vals will be used

  • outShp (string) – the output shapefile

  • band (int) – the band to use, e.g. 2

  • algo (string) – either “GAC” (geodesic active contours) or “ACWE” (active contours without edges)

  • sigma (float) – stdv of the gaussian envelope used for the canny edge detection; a unitless value

  • iterations (uint) – Number of iterations to run.

  • smooth (uint, optional) – Number of times the smoothing operator is applied per iteration. Reasonable values are around 1-4. Larger values lead to smoother segmentations.

  • lambda1 (float, optional) – Weight parameter for the outer region. If lambda1 is larger than lambda2, the outer region will contain a larger range of values than the inner region.

  • lambda2 (float, optional) – Weight parameter for the inner region. If lambda2 is larger than lambda1, the inner region will contain a larger range of values than the outer region.

  • threshold (float, optional) – Areas of the image with a value smaller than this threshold will be considered borders. The evolution of the contour will stop in these areas.

  • balloon (float, optional) – Balloon force to guide the contour in non-informative areas of the image, i.e., areas where the gradient of the image is too small to push the contour towards a border. A negative value will shrink the contour, while a positive value will expand the contour in these areas. Setting this to zero will disable the balloon force.
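
The ACWE option evolves a levelset toward the two region means. Below is a stripped-down numpy illustration of the data term alone, with no smoothing, edge stopping or topology preservation, so it is only a sketch of the idea, not the function's implementation:

```python
import numpy as np

def acwe_step(img, u, lambda1=1.0, lambda2=1.0):
    """One data-term update of active contours without edges (ACWE)."""
    c1 = img[u == 1].mean()   # mean intensity inside the contour
    c2 = img[u == 0].mean()   # mean intensity outside
    # a pixel joins the inside wherever that lowers the Chan-Vese energy
    force = lambda1 * (img - c1) ** 2 - lambda2 * (img - c2) ** 2
    return (force < 0).astype(np.uint8)

img = np.zeros((20, 20)); img[5:15, 5:15] = 1.0                # bright square
u = np.zeros_like(img, dtype=np.uint8); u[8:12, 8:12] = 1      # rough init
for _ in range(5):
    u = acwe_step(img, u)   # the levelset settles onto the bright square
```

With lambda1 larger than lambda2 the outer region tolerates a wider range of values, matching the parameter descriptions above.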

utilities.ms_toposnakes(inSeg, inRas, outShp, iterations=100, algo='ACWE', band=2, sigma=4, alpha=100, smooth=1, lambda1=1, lambda2=1, threshold='auto', balloon=-1)

Topology preserving morphsnakes, implemented in python/numpy exclusively by C.Robb

This uses morphsnakes and explanations are from there.

Parameters
  • inSeg (string) – input segmentation raster

  • inRas (string) – input raster whose pixel vals will be used

  • band (int) – the band to use, e.g. 2

  • algo (string) – either “GAC” (geodesic active contours) or “ACWE” (active contours without edges)

  • sigma (float) – stdv of the gaussian envelope used for the canny edge detection; a unitless value

  • iterations (uint) – Number of iterations to run.

  • smooth (uint, optional) – Number of times the smoothing operator is applied per iteration. Reasonable values are around 1-4. Larger values lead to smoother segmentations.

  • lambda1 (float, optional) – Weight parameter for the outer region. If lambda1 is larger than lambda2, the outer region will contain a larger range of values than the inner region.

  • lambda2 (float, optional) – Weight parameter for the inner region. If lambda2 is larger than lambda1, the inner region will contain a larger range of values than the outer region.

  • threshold (float, optional) – Areas of the image with a value smaller than this threshold will be considered borders. The evolution of the contour will stop in these areas.

  • balloon (float, optional) – Balloon force to guide the contour in non-informative areas of the image, i.e., areas where the gradient of the image is too small to push the contour towards a border. A negative value will shrink the contour, while a positive value will expand the contour in these areas. Setting this to zero will disable the balloon force.

utilities.ms_toposnakes2(inSeg, inRas, outShp, iterations=100, algo='ACWE', band=2, sigma=4, smooth=1, lambda1=1, lambda2=1, threshold='auto', balloon=-1)

Topology preserving morphsnakes, using the Jirka Borovec implementation with C++/cython elements; credit to him!

This is memory intensive, so large images will likely fill RAM; it produces similar results to ms_toposnakes

This uses morphsnakes and explanations are from there.

Parameters
  • inSeg (string) – input segmentation raster

  • inRas (string) – input raster whose pixel vals will be used

  • band (int) – the band to use, e.g. 2

  • algo (string) – either “GAC” (geodesic active contours) or “ACWE” (active contours without edges)

  • sigma (float) – stdv of the gaussian envelope used for the canny edge detection; a unitless value

  • iterations (uint) – Number of iterations to run.

  • smooth (uint, optional) – Number of times the smoothing operator is applied per iteration. Reasonable values are around 1-4. Larger values lead to smoother segmentations.

  • lambda1 (float, optional) – Weight parameter for the outer region. If lambda1 is larger than lambda2, the outer region will contain a larger range of values than the inner region.

  • lambda2 (float, optional) – Weight parameter for the inner region. If lambda2 is larger than lambda1, the inner region will contain a larger range of values than the outer region.

  • threshold (float, optional) – Areas of the image with a value smaller than this threshold will be considered borders. The evolution of the contour will stop in these areas.

  • balloon (float, optional) – Balloon force to guide the contour in non-informative areas of the image, i.e., areas where the gradient of the image is too small to push the contour towards a border. A negative value will shrink the contour, while a positive value will expand the contour in these areas. Setting this to zero will disable the balloon force.

utilities.otbMeanshift(inputImage, radius, rangeF, minSize, outShape)

OTB meanshift via the otb command line. Written for convenience, as the otb python api is rather verbose

There is a maximum size for the .shp format that otb doesn't seem to want to move beyond (2 GB), so enormous rasters may need to be subdivided

You will need to install OTB etc. separately

Parameters
  • inputImage (string) – the input image

  • radius (int) – the kernel radius

  • rangeF (int) – the kernel range

  • minSize (int) – minimum segment size

  • outShape (string) – the output shapefile

utilities.ragmerge(inSeg, inRas, outShp, band, thresh=0.02)
utilities.raster2array(inRas, bands=[1])

Read a raster and return an array, either single or multiband

Parameters
  • inRas (string) – input raster

  • bands (list) – a list of bands to return in the array

utilities.rotate_box(corners, angle, cx, cy, h, w)

Rotate the bounding box.

Parameters
  • corners (numpy.ndarray) – Numpy array of shape N x 8 containing N bounding boxes each described by their corner co-ordinates x1 y1 x2 y2 x3 y3 x4 y4

  • angle (float) – angle by which the image is to be rotated

  • cx (int) – x coordinate of the center of image (about which the box will be rotated)

  • cy (int) – y coordinate of the center of image (about which the box will be rotated)

  • h (int) – height of the image

  • w (int) – width of the image

Returns

Numpy array of shape N x 8 containing N rotated bounding boxes each described by their corner co-ordinates x1 y1 x2 y2 x3 y3 x4 y4

Return type

numpy.ndarray
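
The rotation itself is a standard 2D rotation about a pivot; a numpy sketch of the documented geometry (the h and w arguments, which the package uses for post-rotation bookkeeping, are omitted here):

```python
import numpy as np

def rotate_box(corners, angle, cx, cy):
    """Rotate N x 8 corner coordinates by `angle` degrees about (cx, cy)."""
    theta = np.deg2rad(angle)
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
    pts = corners.reshape(-1, 4, 2) - [cx, cy]     # centre on the pivot
    return (pts @ rot.T + [cx, cy]).reshape(-1, 8)
```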

utilities.rotate_im(image, angle)

Rotate the image.

Rotate the image such that the rotated image is enclosed inside the tightest rectangle. The area not occupied by the pixels of the original image is colored black.

Parameters
  • image (numpy.ndarray) – numpy image

  • angle (float) – angle by which the image is to be rotated

Returns

Rotated Image

Return type

numpy.ndarray
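
The tightest enclosing rectangle follows from projecting the image sides onto the axes; a small numpy sketch of that geometry (the canvas-size calculation rotate_im relies on, as assumed from the description):

```python
import numpy as np

def rotated_canvas_size(h, w, angle):
    """Tightest enclosing (height, width) after rotating an h x w image
    by `angle` degrees."""
    t = np.deg2rad(angle)
    new_w = int(np.ceil(h * abs(np.sin(t)) + w * abs(np.cos(t))))
    new_h = int(np.ceil(h * abs(np.cos(t)) + w * abs(np.sin(t))))
    return new_h, new_w
```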

utilities.spinim(self, img, bboxes)
utilities.temp_match(vector_path, raster_path, band, nodata_value=0, ind=None)

Based on polygons return template matched images

Parameters
  • vector_path (string) – input shapefile

  • raster_path (string) – input raster

  • band (int) – the band to use, e.g. 2

  • nodata_value (numerical) – the no-data value of the raster, if used

  • ind (int) – The feature ID to use - if used this will use one feature and rotate it 90 for the second

Returns

Return type

list of template match arrays same size as input
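
Template matching of this kind boils down to normalised cross-correlation; a brute-force numpy illustration (the package itself may use a library matcher):

```python
import numpy as np

def ncc_match(image, template):
    """Brute-force normalised cross-correlation of a template over an image."""
    th, tw = template.shape
    t = template - template.mean()
    out = np.full((image.shape[0] - th + 1, image.shape[1] - tw + 1), -1.0)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            win = image[i:i + th, j:j + tw]
            w = win - win.mean()
            denom = np.sqrt((w ** 2).sum() * (t ** 2).sum())
            if denom > 0:   # skip constant windows
                out[i, j] = (w * t).sum() / denom
    return out

img = np.zeros((10, 10))
img[3:6, 4:7] = np.arange(1, 10).reshape(3, 3)   # paste a distinctive patch
score = ncc_match(img, img[3:6, 4:7].copy())      # peak where the patch sits
```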

utilities.test_gabor(im, size=9, freq=0.1, angle=None, funct='cos', plot=True, smooth=True, interp='none')

Process an image with a gabor filter bank of a specified orientation, or one derived from the bounding box of the image's positive values; implemented in numpy with more intuitive params

This is the numpy based one

Parameters
  • im (np array) – the input image

  • size (int) – size of in gabor kernel in pixels (ksize)

  • freq (float) – frequency of the gabor kernel

  • angle (int) – number of angles in the gabor kernel (theta)

utilities.test_gabor_cv2(im, size=9, stdv=1, angle=None, wave_length=3, eccen=1, phase_off=0, plot=True, smooth=True, interp='none')

Process an image with a gabor filter bank of a specified orientation, or one derived from the bounding box of the image's positive values

This is the open cv based one

Parameters
  • im (np array) – the input image

  • size (int) – size of in gabor kernel in pixels (ksize)

  • stdv (int) – stdv of the gabor kernel's gaussian envelope (sigma)

  • angle (int) – number of angles in the gabor kernel (theta)

  • wave_length (int) – width of stripe in the gabor kernel (lambda/wavelength); optional, best left as None and hence the same as size

  • phase_off (int) – the phase offset of the kernel

  • eccen (int) – the ellipticity of the kernel; when = 1 the gaussian envelope is circular (gamma)

utilities.visual_callback_2d(background, fig=None)

Returns a callback that can be passed as the iter_callback argument of morphological_geodesic_active_contour and morphological_chan_vese for visualizing the evolution of the levelsets. Only works for 2D images.

Parameters
  • background ((M, N) array) – Image to be plotted as the background of the visual evolution.

  • fig (matplotlib.figure.Figure) – Figure where results will be drawn. If not given, a new figure will be created.

Returns

callback – A function that receives a levelset and updates the current plot accordingly. This can be passed as the iter_callback argument of morphological_geodesic_active_contour and morphological_chan_vese.

Return type

Python function

Module contents