CDAT - cdms2, masks, cdscan, cdutil, genutil - Introduction

Lesson 3
cdms2, masks, cdscan, cdutil, genutil

cdms2: selecting data
import cdms2
DATAPATH = '/CAS_OBS/mo/sst/HadISST/'
f = cdms2.open(DATAPATH + 'sst_HadISST_Climatology_1961-1990.nc')
x = f('sst') # retrieves the whole dataset - a "slab"
# Selecting a specific area
x = f('sst', latitude=(0., 35.), longitude=(20., 100.))
print x.shape
# Selecting specific times
y = f('sst', time=('0-6-1', '0-6-31'))
# You can also select the specific time from a "slab"
z = x(time=('0-6-1', '0-6-31'))
# You can "squeeze" out "singleton" dimensions
y = f('sst', time=('0-6-1', '0-6-31'), squeeze=1)
# You can also change the order of returned dimensions
z = x(time=('0-6-1', '0-6-31'), order='xyt')

Querying the slab
# You can query the order of returned dimensions!
print 'Order of dimensions for x is:', x.getOrder()
# You can change the order of returned dimensions!
x2 = f('sst', latitude=(0., 35.), longitude=(20., 100.), order='xty')
# If I just want to ensure that time is my first axis
x3 = f('sst', order='t...')
# You can check the axis and its properties
lat_axis = x2.getLatitude()
print lat_axis.info()
lat_bounds = lat_axis.getBounds()
# Axes can be retrieved by index position
my_ax = x2.getAxis(1)
print my_ax.info()

Bounds!
[Figure: a grid cell centred on (lat, lon), with latitude edges lat_bounds_1 and lat_bounds_2 and longitude edges lon_bounds_1 and lon_bounds_2]

Specifying precise regions to extract
# If I need a specific point - say lat, lon = 32.1, 100.3
x3 = f('sst', latitude=(32.1, 32.1, 'cob'),
longitude=(100.3, 100.3, 'cob'))
print x3.shape
# The first 2 positions are either 'c'losed or 'o'pen
# The 3rd position is 'b' or 'e' or …..
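
The idea behind selecting on bounds can be pictured with a toy example in plain Python (no cdms2 needed). This is only a sketch: it assumes that a bounds-based selection picks every cell whose bounds overlap the requested interval. The helper select_by_bounds and the 1-degree axis are hypothetical and are not cdms2's implementation.

```python
def select_by_bounds(bounds, lo, hi):
    """Return indices of cells whose (b0, b1) bounds overlap the closed interval [lo, hi].

    Hypothetical helper for illustration only; not part of cdms2.
    """
    selected = []
    for i, (b0, b1) in enumerate(bounds):
        if b0 <= hi and b1 >= lo:
            selected.append(i)
    return selected


# A toy latitude axis with 1-degree cells centred on 30.5, 31.5, 32.5, 33.5
centres = [30.5, 31.5, 32.5, 33.5]
bounds = [(c - 0.5, c + 0.5) for c in centres]

# 32.1 is not an axis value, but it lies inside the cell centred on 32.5
hits = select_by_bounds(bounds, 32.1, 32.1)
print(hits)
```

Selecting on axis values alone would find nothing at 32.1; selecting on bounds finds the one cell that contains the point.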

cdutil ‐ overview
• A set of utilities specific to climate data.
• The cdutil package contains a collection of sub-packages useful for dealing with climate data.
• Sub‐components are:
– region
– times: tools to deal with the time dimension.
– vertical
– averager

cdutil: region
• The cdutil.region module allows the user to extract a region "exactly", i.e. it resets the latitude and longitude bounds to match the requested area, so that an "exact" average is computed when the result is passed to the averager function.
• Predefined regions are:
– AntarcticZone, AAZ (South of latitude 66.6S)
– ArcticZone, AZ (North of latitude 66.6N)
– NorthernHemisphere, NH (useful for datasets whose latitude cells cross the equator)
– SouthernHemisphere, SH
– Tropics (latitudes band: 23.4S, 23.4N)

cdutil: region
• cdutil.region
– A cdms2 selector that extracts the "exact" region, i.e. it resets the bounds so that averaging accounts only for the area actually inside the region, not for the full cell.
S = f("var", cdutil.region.NorthernHemisphere)
• Creating your own regions
NINO34 = cdms2.selectors.Selector(cdutil.region.domain(latitude=(-5., 5.), longitude=(190., 240.)))
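
Why resetting the bounds matters can be shown with a little arithmetic. The sketch below assumes the usual spherical area weight for a latitude band, proportional to sin(north) - sin(south); the helper lat_weight and the 5-degree cell are made up for illustration and are not taken from cdutil.

```python
import math


def lat_weight(south, north):
    """Relative area of a latitude band: proportional to sin(north) - sin(south)."""
    return math.sin(math.radians(north)) - math.sin(math.radians(south))


# A 5-degree cell centred on the equator, with bounds (-2.5, 2.5)
full_cell = lat_weight(-2.5, 2.5)

# Only the Northern-Hemisphere part of that cell: bounds reset to (0, 2.5)
clipped_cell = lat_weight(0.0, 2.5)

ratio = clipped_cell / full_cell
print(ratio)
```

If the bounds were left untouched, this equatorial cell would enter a Northern-Hemisphere average with about twice the weight it deserves.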

Interpolation (re‐gridding)
# Suppose we have a slab
ds1 = f('sst', latitude=(0., 35.), longitude=(20., 100.),
        time=('0-6-1', '0-6-31'), order='xty')
print 'ds1.shape = ', ds1.shape, 'Axis order=', ds1.getOrder()
# Let us now extract another dataset.
f2 = cdms2.open('/CAS_OBS/climatology/NCEP_NCAR_Climatology_ltm/slp.day.ltm.nc')
ds2 = f2('slp', latitude=(0., 35.), longitude=(20., 100.),
time=('1-6-1', '1-6-31'))
# I want to transform ds2 into the grid in ds1
ds3 = ds2.regrid(ds1.getGrid())
print 'ds3.shape = ', ds3.shape, 'Axis order=', ds3.getOrder()
# Alternate regridder
from regrid2 import Regridder  # regrid2 is the regridding package that accompanies cdms2
ingrid = ds2.getGrid()
outgrid = ds1.getGrid()
regridFunc = Regridder(ingrid, outgrid)
new_ds2 = regridFunc(ds2)
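
What a regridder does can be sketched in one dimension. The slides do not say which interpolation scheme is used, so the example below assumes an overlap-weighted (area-weighted) average of source cells onto each target cell. The function conservative_regrid_1d and the data are hypothetical and only illustrate the idea.

```python
def conservative_regrid_1d(src_bounds, src_values, dst_bounds):
    """Average source cells onto target cells, weighting by the overlap length.

    Hypothetical helper for illustration only; not the regridder's own code.
    """
    out = []
    for d0, d1 in dst_bounds:
        total = 0.0
        wsum = 0.0
        for (s0, s1), value in zip(src_bounds, src_values):
            overlap = min(d1, s1) - max(d0, s0)
            if overlap > 0:
                total += value * overlap
                wsum += overlap
        # A target cell that touches no source cell gets no value
        out.append(total / wsum if wsum > 0 else None)
    return out


# Four fine cells regridded onto two coarse cells
src_bounds = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0), (3.0, 4.0)]
src_values = [10.0, 20.0, 30.0, 40.0]
dst_bounds = [(0.0, 2.0), (2.0, 4.0)]

coarse = conservative_regrid_1d(src_bounds, src_values, dst_bounds)
print(coarse)
```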

Vertical regridding
• You can regrid pressure-level coordinates in the vertical axis using the pressureRegrid() method.
• You need to define a new vertical axis, or use an existing one.
• Then call pressureRegrid() on the variable you wish to regrid, passing it the new levels as the argument.
• If var is the variable to regrid and newlevs is the vertical axis to regrid to:
>>> var_on_new_levels = var.pressureRegrid(newlevs)
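
For a feel of what vertical regridding involves, here is a plain-Python sketch of interpolating a profile to a new pressure level. It assumes interpolation that is linear in the logarithm of pressure, a common choice for pressure-level data; the slides do not state which method pressureRegrid() uses. The helper interp_log_pressure and the temperature profile are made up.

```python
import math


def interp_log_pressure(levels, values, new_level):
    """Interpolate values given on pressure levels to new_level, linearly in log(p).

    levels must be strictly monotonic and must bracket new_level.
    Hypothetical helper for illustration only.
    """
    for i in range(len(levels) - 1):
        p0, p1 = levels[i], levels[i + 1]
        v0, v1 = values[i], values[i + 1]
        if min(p0, p1) <= new_level <= max(p0, p1):
            frac = (math.log(new_level) - math.log(p0)) / (math.log(p1) - math.log(p0))
            return v0 + frac * (v1 - v0)
    raise ValueError("new_level is outside the range of the input levels")


levels = [1000.0, 850.0, 500.0]  # hPa
temps = [288.0, 280.0, 252.0]    # K, made-up profile

t700 = interp_log_pressure(levels, temps, 700.0)
print(round(t700, 2))
```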

cdms2: Using masks
sst_mask = cdms2.MV2.getmask(ds1(order='tyx'))
print 'sst_mask.shape after reorder = ', sst_mask.shape
print sst_mask.__class__
# So we resize the mask to match the shape of ds3
sst_mask.resize(ds3.shape)
# Create a new variable from the ds3 data with the SST mask applied
ds4 = cdms2.createVariable(ds3[:], mask=sst_mask,
                           id='masked_psl', fill_value=1.e+20)

cdscan
• A utility that helps you manage many data files.
• When you have many .nc (or .ctl) files, you can use this utility to generate a single XML file that makes life simple.
• Try the following:
cdscan -x "some_filename.xml" DATA_PATH1/*.nc
– You can also change the time axis while you are at it!
cdscan -x "some_filename.xml" -i 1 -r "months since 1-1-1" DATA_PATH2/*.nc

cdutil: times
• cdutil.times: tools for time axes, geared toward climate data
– The climatology, departures and anomalies tools work on BOUNDS, NOT on time values. They are designed for monthly seasons, but one can create an engine for other kinds of data (daily, yearly, etc.).
ac = cdutil.times.ANNUALCYCLE.climatology(s)
– In order to set bounds you can use:
cdutil.setTimeBoundsMonthly(Obj)
cdutil.setTimeBoundsYearly(Obj)
cdutil.setTimeBoundsDaily(Obj, frequency=1)
(Obj can be a slab or a time axis)
– Create your own seasons:
MONSOON = cdutil.times.Seasons('JJAS')
• cdutil imports everything in the times module, so you can just call e.g.:
cdutil.setTimeBoundsMonthly(Obj)

The importance of bounds
• CDAT used to set bounds automatically. E.g.:
longitude = [0, 90, 180, 270]
∴ bounds = [[-45, 45], [45, 135],
[135, 225], [225, 315]]
• Seems reasonable, but imagine a monthly mean time series where the times are recorded on the 1st day of each month:
timeax = ["1999-1-1", "1999-2-1", …, "2100-12-1"]
• CDAT assumes that each value represents the period from the 15th of the previous month to the 15th of the current month.
• Since the cdutil tools use bounds, they will misinterpret the data. You need to set the bounds sensibly:
>>> cdutil.setTimeBoundsMonthly(timeax)
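
The two kinds of bounds discussed above can be reproduced in plain Python. This is a sketch only: midpoint_bounds mimics the "halfway between neighbours" guess from the longitude example, and monthly_bounds builds first-of-month to first-of-next-month bounds. Both helpers are hypothetical and are not the CDAT routines.

```python
import datetime


def midpoint_bounds(values):
    """Guess cell bounds by placing edges halfway between neighbouring axis values."""
    edges = [(a + b) / 2.0 for a, b in zip(values, values[1:])]
    # Extrapolate the outer edges using the spacing of the nearest interior edge
    first = values[0] - (edges[0] - values[0])
    last = values[-1] + (values[-1] - edges[-1])
    edges = [first] + edges + [last]
    return [[lo, hi] for lo, hi in zip(edges, edges[1:])]


def monthly_bounds(dates):
    """Bounds running from the 1st of each month to the 1st of the next month."""
    bounds = []
    for d in dates:
        start = d.replace(day=1)
        if start.month == 12:
            end = start.replace(year=start.year + 1, month=1)
        else:
            end = start.replace(month=start.month + 1)
        bounds.append((start, end))
    return bounds


lon_bounds = midpoint_bounds([0, 90, 180, 270])
print(lon_bounds)

jan_bounds = monthly_bounds([datetime.date(1999, 1, 1)])[0]
dec_bounds = monthly_bounds([datetime.date(1999, 12, 1)])[0]
print(jan_bounds)
```

The midpoint guess is fine for the longitude axis, but applied to first-of-month time stamps it would straddle two calendar months, which is the problem the monthly bounds avoid.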

Pre‐defined time‐related means
• DJF, MAM, JJA, SON (seasons)
>>> djf_mean=cdutil.DJF(my_var)
• SEASONALCYCLE (means for the 4 predefined seasons [DJF, MAM, JJA, SON]), an array of the above.
>>> seas_mns=cdutil.SEASONALCYCLE(my_var)
• YEAR (annual means)
• ANNUALCYCLE (monthly means for each month of the
year)
– EXERCISE: Try calculating the climatological annual cycle for
the NCEP Reanalysis data you have read in.

Climatologies and departures
Season extractors have 2 functions available:
• climatology: computes the average of all seasons passed. ANNUALCYCLE.climatology() returns the 12-month annual cycle for the slab:
>>> ann = cdutil.ANNUALCYCLE.climatology(v)
• departures: given an optional climatology, computes seasonal departures from it.
>>> d = cdutil.ANNUALCYCLE.departures(v, cli60_99)
# The second argument is optional. It can be a pre-computed climatology: here cli60_99 is a 1960-1999 climatology, while the variable v is defined from 1900 to 2000. If it is not given, the overall climatology of v is used.
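
The arithmetic behind an annual-cycle climatology and its departures can be sketched on a plain list of monthly values. Note the assumptions: the series starts in January, every month gets equal weight (the cdutil tools work from the time bounds instead), and both function names are hypothetical.

```python
def annual_cycle_climatology(monthly):
    """Average each calendar month over all years; monthly must start in January."""
    clim = []
    for month in range(12):
        vals = monthly[month::12]
        clim.append(sum(vals) / float(len(vals)))
    return clim


def annual_cycle_departures(monthly, clim=None):
    """Subtract the climatology of the matching calendar month from every value.

    If clim is not given, the climatology of the series itself is used.
    """
    if clim is None:
        clim = annual_cycle_climatology(monthly)
    return [v - clim[i % 12] for i, v in enumerate(monthly)]


# Two made-up years: the second year is uniformly 2 units warmer than the first
series = [float(m) for m in range(1, 13)] + [float(m + 2) for m in range(1, 13)]

clim = annual_cycle_climatology(series)
anom = annual_cycle_departures(series)
print(clim)
print(anom)
```

Passing a climatology computed from a sub-period as the second argument gives departures relative to that reference period rather than to the whole series.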

Simple user‐defined averaging
• You can create your own simple averages using
arrays, slabs or variables in the usual way:
– Averaging over 4 time steps:
>>> t.shape
(4, 181, 360)
>>> av=(t[0]+t[1]+t[2]+t[3])/4
• Drawbacks:
– Doesn’t retain your metadata.
– Cannot average simply across axes within a variable.

MV2 Averaging
• The MV2 module has an averaging function:
MV2.average(x, axis=0, weights=None, returned=0)
– Computes the average value of the non-masked elements of x along the selected axis. If weights is given, it must match the size and shape of x, and the value returned is:
sum(x * weights) / sum(weights)
– Elements corresponding to those masked in x or weights are ignored. If returned is true, a 2-tuple consisting of the average and the sum of the weights is returned.
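
The behaviour described above can be mimicked in a few lines of plain Python, with None standing in for a masked element. This is a one-dimensional sketch of the idea only; masked_average is a hypothetical function, not the MV2 routine.

```python
def masked_average(x, weights=None, returned=False):
    """Weighted mean of x, skipping entries where x or its weight is None (masked)."""
    if weights is None:
        weights = [1.0] * len(x)
    num = 0.0
    wsum = 0.0
    for xi, wi in zip(x, weights):
        if xi is None or wi is None:
            continue  # masked elements contribute to neither sum
        num += xi * wi
        wsum += wi
    avg = num / wsum if wsum != 0.0 else None
    if returned:
        return avg, wsum
    return avg


avg, wsum = masked_average([1.0, None, 3.0, 5.0],
                           weights=[1.0, 1.0, 2.0, 1.0],
                           returned=True)
print(avg)
print(wsum)
```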

MV2 Averaging: example
• To calculate a set of zonal means:
import cdms2, MV2
f = cdms2.open('/CAS_OBS/mo/sst/HadISST/sst_HadISST_Climatology_1961-1990.nc')
data = f('sst')
print data.shape
zm=MV2.average(data, axis=2)
print zm.shape
print zm.info()

The cdutil “averager” function
• The cdutil.averager() function is the key to spatial and temporal averaging in CDAT.
• Masks are dealt with implicitly.
• Powerful area averaging function.
• Provides control over the order of operations (i.e. which dimensions are averaged over first).
• Allows weightings for the different axes:
– pass your own array of weights for each dimension, use the default (grid) weights, or specify equal weighting.

Usage of cdutil.averager
result = averager(V, axis=axisoptions,
                  weights=weightoptions,
                  action=actionoptions,
                  returned=returnedoptions,
                  combinewts=combinewtsoptions)
axisoptions has to be a string. You can pass axis='tyx', or '123', or 'x (plev)'.
weightoptions is one of 'generate' | 'weighted' | 'equal' | 'unweighted' | array | Masked Variable.
actionoptions is 'average' | 'sum' [default = 'average'].
You can return either the weighted average or the weighted sum of the data.
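
The difference between area weighting and equal weighting is easy to see on a tiny made-up example. The sketch below assumes latitude-band weights proportional to sin(north) - sin(south); the helpers band_weight and weighted_mean and the three zonal-mean values are hypothetical, and this is not how cdutil.averager is implemented.

```python
import math


def band_weight(south, north):
    """Relative area of a latitude band on a sphere."""
    return math.sin(math.radians(north)) - math.sin(math.radians(south))


def weighted_mean(values, weights):
    """Plain weighted mean: sum(v * w) / sum(w)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Zonal-mean values on three 30-degree latitude bands (made-up numbers)
lat_bounds = [(0.0, 30.0), (30.0, 60.0), (60.0, 90.0)]
zonal_means = [28.0, 15.0, -2.0]

area_weights = [band_weight(s, n) for s, n in lat_bounds]
equal_weights = [1.0] * len(zonal_means)

area_avg = weighted_mean(zonal_means, area_weights)
equal_avg = weighted_mean(zonal_means, equal_weights)
print(area_avg)
print(equal_avg)
```

Equal weighting gives the small polar band as much say as the large tropical band, so the two averages differ noticeably.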

Example: Region Averaging and Climatology
import cdms2
import cdutil
# Define your custom region.
NINO3 = cdms2.selectors.Selector(cdutil.region.domain(latitude=(-5., 5., 'ccb'),
                                                      longitude=(210., 270., 'ccb')))
# INDIR is the directory holding the SST file (not defined on this slide)
fsst = cdms2.open(INDIR + 'sst_HadISST_1870-1_2011-1.nc')
nino3_data = fsst('sst', NINO3)
print nino3_data.shape, nino3_data.getOrder()
# Compute the Spatial average
nino3_average = cdutil.averager(nino3_data, axis='xy')
# Anomaly from climatology computed over 1961-1990
nino3_slice = nino3_average(time=('1961-1-1', '1990-12-31'))
nino3_clim = cdutil.ANNUALCYCLE.climatology(nino3_slice)
print nino3_clim.shape
# Now departures
nino3_anomaly = cdutil.ANNUALCYCLE.departures(nino3_average, nino3_clim)
print nino3_anomaly.shape

EXERCISE
• Extract the SST data and compute global
anomalies from the 1961‐1990 climatology for
the whole length of dataset.
• Average the anomaly data over x and y axes
using “equal” weights for both axes and
compare against area “weighted” average.

genutil : general utilities
• genutil.statistics: set of basic statistical
functions
• correlation, covariance, geometricmean,
laggedcorrelation, laggedcovariance,
linearregression, meanabsdiff, median,
array_indexing, percentiles, arrayindexing,
rank, autocorrelation, rms, autocovariance,
std, variance

Statistics Example
• c1 = genutil.statistics.correlation(a, b, axis='t')
• c1.shape
• c2 = genutil.statistics.correlation(a, b, axis='xy')
• c2.shape
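
The quantity being computed in each case is a Pearson correlation taken along the named axis, so axis='t' leaves the spatial dimensions and axis='xy' leaves the time dimension. A minimal one-dimensional sketch in plain Python follows; the function correlation here is a hypothetical stand-in, not the genutil routine, and it ignores weights and masks.

```python
import math


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / float(n)
    mean_b = sum(b) / float(n)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


a = [1.0, 2.0, 3.0, 4.0]
b = [2.0, 4.0, 6.0, 8.0]   # rises with a
c = [4.0, 3.0, 2.0, 1.0]   # falls as a rises

r_ab = correlation(a, b)
r_ac = correlation(a, c)
print(r_ab)
print(r_ac)
```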

Support for other grid types
RectGrid ‐ Associated latitude and longitude are
1‐D axes, with strictly monotonic values.
CurveGrid ‐ Latitude and longitude are 2‐D
coordinate axes (Axis2D).
GenericGrid ‐ Latitude and longitude are 1‐D
auxiliary coordinate axes (AuxAxis1D)

Curvilinear and Generic Grids

Acknowledgements
• Dean Williams, Charles Doutriaux (PCMDI,
LLNL)
• Dr. Johnny Lin
