PP exploration dashboard

Todo:

  • put detailed data into cloud-accessible file format
  • define dashboard elements:
    • panes:
      • map of yearly mean
        • widgets
          • year start/stop
          • select region for timeseries
      • yearly timeseries
      • climatology
    • widgets:
      • for map, yearly timeseries:
        • mean/int over months
        • month start/stop

Arrigo & von Dijken database

On Nov 16, 2017, at 11:51 AM, Gert van Dijken gertvd@stanford.edu wrote:

Hi Achim,

Glad to hear. Yes, they are single precision float, 32 bit or 4 bytes, sorry, I just said 'plain binary'. And indeed, production maps are the same.

Enjoy! Gert


On Nov 16, 2017, at 6:15 AM, Achim Randelhoff Achim.Randelhoff@npolar.no wrote:

Hi Gert,

thank you very much! I only got sensible lat/lon coordinates when I interpreted the binaries as single-precision floats (float32), so I am assuming the same holds for the productivity maps. It looks good so far, I will look into this in the coming months.

Cheers, Achim


On Nov 13, 2017, at 12:32 PM, Gert van Dijken gertvd@stanford.edu wrote:

Hi Achim,

I made our productivity data available on our website. You can find it at: http://albedo.stanford.edu/gertvd/research/arctic/prod/

There are two directories, one containing the navigation data and the other one the productivity data. They are all plain binary files, containing arrays of 750x750 pixels (maps, datatype: float). I put all daily maps together in one gzipped tar-file. Units are mg C/m2/day. The files starting with an A are based on MODIS/Aqua Chl a, and the ones starting with an S on SeaWiFS Chl.

The navigation files contain a similar array of 750x750 pixels. Each will have the latitude or longitude value of the corresponding pixel of the production files.

Hope it all makes sense, if not let me know and I will try to help.

Yes, please let us know when you use the data for publication or share with other people.

Good luck! Gert
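The layout Gert describes (flat binary, 750x750 float32, no header) can be sanity-checked on a synthetic file. The array values and temporary filename below are made up for illustration; the -99.0 fill value is the one masked in the ingest code further down.

```python
import os
import tempfile
import numpy as np

# Hypothetical stand-in for one daily map: flat binary of 750x750 float32
# values, with -99.0 as the fill value.
arr = np.full((750, 750), -99.0, dtype='float32')
arr[100:200, 100:200] = 250.0           # fake productivity patch, mg C/m2/day

with tempfile.NamedTemporaryFile(suffix='.bin', delete=False) as f:
    arr.tofile(f)                        # plain binary, no header
    fname = f.name

# Read it back the same way the notebook does:
p = np.fromfile(fname, dtype='float32').reshape(750, 750)
p = np.where(p == -99.0, np.nan, p)      # mask the fill value

print(p.shape, np.nanmax(p))
os.remove(fname)
```

The navigation files can be read identically; each 750x750 array then holds the latitude or longitude of the corresponding productivity pixel.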

In [ ]:
import datetime as dt

import numpy as np
import xarray as xr

datadir = '/Users/doppler/databases/vandijken_arrigo_PP/'

# Navigation grids are the same for every daily map; load them once.
lat = np.fromfile(datadir+'nav_data/lats_750x750_4f.flat', dtype='float32')
lon = np.fromfile(datadir+'nav_data/lons_750x750_4f.flat', dtype='float32')

pp = []

for thisyear in np.arange(2012, 2016):
    yeardir = 'prod_data/prod_A_%i/' % thisyear  # _A_ meaning MODIS/Aqua, see above
    for thisdoy in range(100, 301):
        filedate = '%4i%03i' % (thisyear, thisdoy)

        p = np.fromfile(datadir+yeardir+filedate+'_A_R2014.0_L3b_reanal1_d4_c2chl_90p0_prod.bin',
                        dtype='float32')
        p = np.where(p == -99.0, np.nan, p)  # -99 is the fill value

        pp.append(xr.Dataset(
            data_vars=dict(
                pp=(['date', 'pos'], p[np.newaxis, :])
            ),
            coords=dict(
                date=[dt.datetime(thisyear, 1, 1) + dt.timedelta(thisdoy - 1)],
                pos=(['pos'], np.arange(len(lat))),
                lon=(['pos'], lon),
                lat=(['pos'], lat)
            ),
            attrs=dict(
                pp='Prim Prod, mg C/m2/d'
            )
        ))

pp = xr.concat(pp, dim='date')
pp

Derive yearly PP and OWD (open water days, approximated here by the number of days with valid data per pixel and year)...

In [ ]:
bins = pd.date_range(start='2014', periods=3, freq='YS') #end=dt.datetime(2009,1,1)
groups = pp.pp.groupby_bins(
    group='date',
    bins=bins,
    labels=bins[:-1]
    # labels=bins[:-1] + (bins[1:] - bins[:-1])/2
)
pp_annual = groups.sum(dim='date')/1000  # mg C/m2 -> g C/m2 per year
pp_count = groups.count(dim='date')      # days with valid data per year (OWD proxy)
In [ ]:
pp = xr.Dataset(dict(pp=pp_annual, owd=pp_count)
              ).rename(dict(date_bins='date')).to_dataframe().reset_index()
pp.head()

to dataframe, do lat/lon binning

In [ ]:
p = pp.copy()

# Bin lat into 1-degree and lon into 5-degree bins, labelled by the bin midpoints.
# (+1 on the stop so np.arange includes the uppermost bin edge.)
latbins = np.arange(np.floor(pp.lat.min()), np.ceil(pp.lat.max()) + 1, 1)
p['lat'] = pd.cut(p.lat, bins=latbins, labels=(latbins[:-1]+latbins[1:])/2)
lonbins = np.arange(-180.5, 180.5, 5)
p['lon'] = pd.cut(p.lon, bins=lonbins, labels=(lonbins[:-1]+lonbins[1:])/2)

pa = p.groupby(['date', 'lon', 'lat']).mean()
p = pa.unstack(level=0)
pa = pa.reset_index()

p.dropna().head()
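The midpoint-labelling trick used for the lat/lon bins can be checked on a tiny example (the latitude values here are made up):

```python
import numpy as np
import pandas as pd

# Toy latitudes binned into 1-degree bins labelled by bin midpoints,
# mirroring the pd.cut calls used on the real lat/lon columns.
lat = pd.Series([70.2, 70.9, 71.5, 73.3])
latbins = np.arange(70, 75, 1)             # edges 70, 71, 72, 73, 74
mid = (latbins[:-1] + latbins[1:]) / 2     # midpoints 70.5 ... 73.5
binned = pd.cut(lat, bins=latbins, labels=mid)
print(binned.tolist())
```

Each value is replaced by the midpoint of the interval it falls into, so a later groupby on the binned column aggregates per grid cell.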

regression OWD-pp

In [ ]:
def lsq(row):
    """Least-squares slope and intercept of pp vs. owd for one grid cell."""
    x = row['owd'].values
    y = row['pp'].values
    if len(x) == 0 or len(y) == 0:
        return [np.nan, np.nan]
    mask = ~np.isnan(x)
    x, y = x[mask], y[mask]
    if len(np.unique(x)) > 1:  # need at least two distinct x values to fit a line
        return list(np.polyfit(x=x, y=y, deg=1))
    return [np.nan, np.nan]

reg = p.apply(axis=1, func=lsq)

reg = pd.DataFrame(reg.values.tolist(),
                   columns=['slope', 'y0'],
                   index=reg.index)
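The masked-fit logic inside `lsq` can be sanity-checked on made-up numbers: with one NaN in x and an exactly linear relation, the fit should recover the known slope and intercept.

```python
import numpy as np

# Synthetic data: y = 30*x + 50, with one NaN in x that the mask must drop.
x = np.array([10., 20., np.nan, 40.])
y = 30. * x + 50.

mask = ~np.isnan(x)                      # same masking as in lsq
slope, y0 = np.polyfit(x=x[mask], y=y[mask], deg=1)
print(slope, y0)
```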

to geodataframe

In [ ]:
from shapely.geometry import Point
from fiona.crs import from_epsg
import geopandas as gpd

reg = gpd.GeoDataFrame(reg.reset_index()).dropna()
reg['geometry'] = [Point(x, y) for x, y in zip(reg.lon, reg.lat)]
reg.crs = from_epsg(4326)  # WGS84 lon/lat

reg.head()

plot

In [ ]:
reg.slope.describe()
In [ ]:
ax = geoplot.pointplot(
    reg,
    hue='slope', legend=True, limits=(-50, 50), cmap='Reds',
    projection=gcrs.NorthPolarStereo())
ax.coastlines()
In [ ]:
plt.scatter(pa.lat, pa.owd)
In [ ]:
_=pp.groupby('lat').plot.scatter(x='owd', y='pp')
In [ ]:
pa.plot.scatter(x='owd', y='pp')
In [ ]:
plt.close('all')
In [ ]:
hv.Scatter(pa, kdims=['owd'], vdims=['pp', 'lat']).groupby('lat')

PP plots

In [ ]:
ax = geoplot.pointplot(
    pp.loc[(pp.pp > 1) & (pp.date == pd.Timestamp(2014, 1, 1))],
    hue='pp', legend=True,
    cmap='Reds',
    projection=gcrs.NorthPolarStereo())
ax.coastlines()
In [ ]:
np.log10(p.pp+1e-2).hist()