2. cdms2: selecting data
import cdms2
DATAPATH = '/CAS_OBS/mo/sst/HadISST/'
f = cdms2.open(DATAPATH + 'sst_HadISST_Climatology_1961-1990.nc')
x = f('sst')  # retrieves the whole dataset - a "slab"
# Selecting a specific area
x = f('sst', latitude=(0., 35.), longitude=(20., 100.))
print x.shape
# Selecting specific times
y = f('sst', time=('0-6-1', '0-6-31'))
# You can also select the specific time from a "slab"
z = x(time=('0-6-1', '0-6-31'))
# You can "squeeze" out "singleton" dimensions
y = f('sst', time=('0-6-1', '0-6-31'), squeeze=1)
# You can also change the order of returned dimensions
z = x(time=('0-6-1', '0-6-31'), order='xyt')
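As a rough picture of what these selections do to the underlying array, here is a minimal NumPy sketch. It assumes a hypothetical 1-degree grid and works on cell centres only, ignoring cell bounds; it is not how cdms2 is implemented.

```python
import numpy as np

# Hypothetical 1-degree grid; in practice the axes come from the file.
lat = np.arange(-89.5, 90.0, 1.0)           # 180 cell centres
lon = np.arange(0.5, 360.0, 1.0)            # 360 cell centres
data = np.zeros((12, lat.size, lon.size))   # (time, lat, lon), i.e. order 'tyx'


def coord_slice(axis_values, lo, hi):
    """Return the slice of indices whose coordinate lies in [lo, hi]."""
    idx = np.nonzero((axis_values >= lo) & (axis_values <= hi))[0]
    return slice(idx[0], idx[-1] + 1)


# Rough analogue of latitude=(0., 35.), longitude=(20., 100.)
sub = data[:, coord_slice(lat, 0.0, 35.0), coord_slice(lon, 20.0, 100.0)]
june = sub[5:6]                        # one month, keeps a singleton time axis
squeezed = np.squeeze(june)            # drops the singleton, like squeeze=1
xyt = np.transpose(june, (2, 1, 0))    # 'tyx' -> 'xyt'
```

The shapes (`sub.shape`, `squeezed.shape`, `xyt.shape`) show the effect of each step.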
3. Querying the slab
# You can query the order of returned dimensions!
print 'Order of dimensions for x is:', x.getOrder()
# You can change the order of returned dimensions!
x2 = f('sst', latitude=(0., 35.), longitude=(20., 100.), order='xty')
# If I just want to ensure that time is my first axis
x3 = f('sst', order='t...')
# You can check the axis and its properties
lat_axis = x2.getLatitude()
print lat_axis.info()
lat_bounds = lat_axis.getBounds()
# Axes can be retrieved by index position
my_ax = x2.getAxis(1)
print my_ax.info()
5. Specifying precise regions to extract
# If I need a specific point - say lat, lon = 32.1, 100.3
x3 = f('sst', latitude=(32.1, 32.1, 'cob'),
       longitude=(100.3, 100.3, 'cob'))
print x3.shape
# In the 3-character indicator, the first 2 positions are each 'c'losed or 'o'pen
# The 3rd position is 'b' or 'e' or ...
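The closed/open part of the indicator can be illustrated with plain Python. This sketch covers only the first two characters; the helper `in_interval` is hypothetical and the third character is deliberately not modelled.

```python
def in_interval(value, lo, hi, indicator="cc"):
    """Test value against lo..hi with 'c'losed or 'o'pen endpoints."""
    left_ok = value >= lo if indicator[0] == "c" else value > lo
    right_ok = value <= hi if indicator[1] == "c" else value < hi
    return left_ok and right_ok


lats = [30.0, 32.5, 35.0]
closed_open = [v for v in lats if in_interval(v, 30.0, 35.0, "co")]
open_closed = [v for v in lats if in_interval(v, 30.0, 35.0, "oc")]
```

With `'co'` the lower endpoint is kept and the upper one dropped; `'oc'` does the opposite.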
6. cdutil ‐ overview
• A set of utilities specific to climate data.
• The cdutil package contains a collection of
sub-packages for working with climate data.
• Sub‐components are:
– region
– times: tools to deal with the time dimension.
– vertical
– averager
8. cdutil: region
• cdutil.region
– cdms2 selector to extract an “exact” region, i.e. it resets the
bounds so that averaging accounts only for the area actually
inside the region, not the full cell.
S = f('var', cdutil.region.NorthernHemisphere)
• Creating your own regions
NINO34 = cdms2.selectors.Selector(
    cdutil.region.domain(latitude=(-5., 5.), longitude=(190., 240.)))
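Why resetting the bounds matters can be sketched in plain Python. The example assumes hypothetical 8-degree latitude cells and uses the sine of latitude for band area; the helpers `lat_weight` and `clipped_weights` are illustrative names, not cdutil functions.

```python
import math


def lat_weight(lo, hi):
    """Relative area of the latitude band between lo and hi (degrees)."""
    return math.sin(math.radians(hi)) - math.sin(math.radians(lo))


def clipped_weights(bounds, region_lo, region_hi):
    """Area weight of each cell counting only the part inside the region."""
    weights = []
    for lo, hi in bounds:
        c_lo, c_hi = max(lo, region_lo), min(hi, region_hi)
        weights.append(lat_weight(c_lo, c_hi) if c_hi > c_lo else 0.0)
    return weights


cell_bounds = [(-10.0, -2.0), (-2.0, 6.0), (6.0, 14.0)]  # hypothetical cells
full = [lat_weight(lo, hi) for lo, hi in cell_bounds]     # whole-cell weights
clipped = clipped_weights(cell_bounds, -5.0, 5.0)         # 5S-5N only
```

Cells straddling the region edge get a smaller weight than their full area, and cells outside get zero, so the clipped weights add up to the area of the region itself.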
9. Interpolation (re‐gridding)
# Suppose we have a slab
ds1 = f('sst', latitude=(0., 35.), longitude=(20., 100.),
        time=('0-6-1', '0-6-31'), order='xty')
print 'ds1.shape = ', ds1.shape, 'Axis order=', ds1.getOrder()
# Let us now extract another dataset.
f2 = cdms2.open('/CAS_OBS/climatology/NCEP_NCAR_Climatology_ltm/slp.day.ltm.nc')
ds2 = f2('slp', latitude=(0., 35.), longitude=(20., 100.),
         time=('1-6-1', '1-6-31'))
# I want to transform ds2 into the grid in ds1
ds3 = ds2.regrid(ds1.getGrid())
print 'ds3.shape = ', ds3.shape, 'Axis order=', ds3.getOrder()
# Alternate regridder
from regrid2 import Regridder
ingrid = ds2.getGrid()
outgrid = ds1.getGrid()
regridFunc = Regridder(ingrid, outgrid)
new_ds2 = regridFunc(ds2)
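To show the idea of moving values from one grid to another, here is a one-dimensional sketch using linear interpolation with NumPy. The coordinates and values are made up, and the scheme is chosen for simplicity; the cdms2 regridders may well use a different method.

```python
import numpy as np

# Hypothetical source grid (2.5-degree spacing) and values along latitude
src_lat = np.array([0.0, 2.5, 5.0, 7.5])
src_val = np.array([1000.0, 1005.0, 1010.0, 1015.0])

# Hypothetical target grid points
dst_lat = np.array([0.5, 1.5, 2.5, 6.5])

# Linear interpolation of the source values onto the target points
dst_val = np.interp(dst_lat, src_lat, src_val)
```

The result has one value per target point, so it can be compared cell by cell with data already on the target grid.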
11. cdms2: Using masks
sst_mask = cdms2.MV2.getmask(ds1(order='tyx'))
print 'sst_mask.shape after reorder = ', sst_mask.shape
print sst_mask.__class__
# So we resize the mask
sst_mask.resize(ds3.shape)
# Apply the SST mask to the regridded data
ds4 = cdms2.createVariable(ds3[:], mask=sst_mask,
                           id='masked_psl', fill_value=1.e+20)
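The same idea, borrowing the mask of one field and applying it to another of identical shape, can be sketched with NumPy masked arrays. The tiny arrays below are made up for illustration.

```python
import numpy as np

# Hypothetical SST field where 1e20 marks missing (land) points
sst = np.ma.masked_values([[1e20, 15.0], [16.0, 1e20]], 1e20)

# Hypothetical pressure field on the same grid, with no missing values
slp = np.array([[1010.0, 1012.0], [1008.0, 1011.0]])

# Take the boolean mask from SST and attach it to the pressure data
land_mask = np.ma.getmaskarray(sst)
masked_slp = np.ma.array(slp, mask=land_mask, fill_value=1e20)
```

Statistics on `masked_slp` then use only the points that are valid in the SST field.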
12. cdscan
• A utility that helps you manage files better.
• When you have many .nc (or .ctl) files you can
use this utility to generate a single “xml” file
that makes life simple.
• Try the following:
cdscan -x some_filename.xml DATA_PATH1/*.nc
– You can also change the time axis while you are at it!
cdscan -x some_filename.xml -i 1 -r "months since 1-1-1" DATA_PATH2/*.nc
13. cdutil: times
• cdutil.times – for time axes, geared toward climate data
– The climatology, departure, and anomaly tools work on BOUNDS,
NOT on time values. They are designed for month-based seasons, but
one can create an engine for other kinds of data (daily, yearly,
etc.).
ac=cdutil.times.ANNUALCYCLE.climatology(s)
– In order to set bounds you can use:
cdutil.setTimeBoundsMonthly(Obj)
cdutil.setTimeBoundsYearly(Obj)
cdutil.setTimeBoundsDaily(Obj, frequency=1)
(Obj can be slab or time axis)
– Create your own seasons:
MONSOON = cdutil.times.Seasons('JJAS')
• cdutil imports everything in the times module so you can just
call e.g.:
cdutil.setTimeBoundsMonthly(slab/axis)
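A season string such as 'JJAS' is just a run of consecutive month initials. One way to turn it into month numbers is sketched below; the helper `season_months` is hypothetical, and how cdutil parses the string internally is an assumption not checked here.

```python
MONTH_LETTERS = "JFMAMJJASOND"


def season_months(season):
    """Map a string of consecutive month initials to month numbers (1-12)."""
    # Search a doubled string so seasons may wrap around the year end (DJF).
    start = (MONTH_LETTERS * 2).find(season)
    if start < 0:
        raise ValueError("not a run of consecutive months: %r" % season)
    return [(start + i) % 12 + 1 for i in range(len(season))]


monsoon = season_months("JJAS")
winter = season_months("DJF")
```

Note that very short strings can be ambiguous (for example 'J' alone), so the sketch simply takes the first match.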
14. The importance of bounds
• CDAT used to set bounds automatically. E.g.:
longitude = [0, 90, 180, 270]
∴ bounds = [[-45, 45], [45, 135],
[135, 225], [225, 315]]
• Seems reasonable, but imagine a monthly mean time series where the times
are recorded on 1st day of each month:
timeax=[“1999-1-1”, “1999-2-1”, …, “2100-12-1”]
• With automatic bounds, CDAT assumes that each value represents the period
from about the 15th of the previous month to the 15th of the current month.
• Because the cdutil tools rely on bounds, they will misinterpret the data, so
the bounds need to be set sensibly:
>>> cdutil.setTimeBoundsMonthly(timeax)
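The difference between automatic and monthly bounds can be reproduced in a few lines of plain Python. The midpoint rule below matches the longitude example on this slide; whether CDAT used exactly this rule for the end cells is an assumption.

```python
def midpoint_bounds(values):
    """Bounds halfway between neighbours; end cells mirror their inner half."""
    mids = [(a + b) / 2.0 for a, b in zip(values[:-1], values[1:])]
    first = values[0] - (mids[0] - values[0])
    last = values[-1] + (values[-1] - mids[-1])
    edges = [first] + mids + [last]
    return [[lo, hi] for lo, hi in zip(edges[:-1], edges[1:])]


lon_bounds = midpoint_bounds([0, 90, 180, 270])

# Days since 1999-1-1 for the 1st of Jan, Feb, Mar and Apr 1999
month_starts = [0, 31, 59, 90]
auto = midpoint_bounds(month_starts)

# Monthly bounds: each value covers its own calendar month (day 120 = 1 May)
monthly = [[a, b] for a, b in zip(month_starts, month_starts[1:] + [120])]
```

Here `auto[1]` runs from mid-January to mid-February, while `monthly[1]` covers February itself, which is what a monthly mean stamped on 1 February is meant to represent.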
15. Pre‐defined time‐related means
• DJF, MAM, JJA, SON (seasons)
>>> djf_mean=cdutil.DJF(my_var)
• SEASONALCYCLE (means for the 4 predefined seasons
[DJF, MAM, JJA, SON ]) – array of above.
>>> seas_mns=cdutil.SEASONALCYCLE(my_var)
• YEAR (annual means)
• ANNUALCYCLE (monthly means for each month of the
year)
– EXERCISE: Try calculating the climatological annual cycle for
the NCEP Reanalysis data you have read in.
16. Climatologies and departures
Season extractors have 2 functions available:
• climatology: which computes the average of all
seasons passed. ANNUALCYCLE.climatology(), will return
the 12 month annual cycle for the slab:
>>> ann=cdutil.ANNUALCYCLE.climatology(v)
• departures: which given an optional climatology will
compute seasonal departures from it.
>>> d=cdutil.ANNUALCYCLE.departures(v, cli60_99)
# The second argument is optional. It can be a pre-computed climatology:
# here cli60_99 is a 1960-1999 climatology, while the variable v spans
# 1900-2000. If it is omitted, the overall climatology of v is used.
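The arithmetic behind an annual-cycle climatology and its departures can be sketched in plain Python. This is a simplification: it assumes a monthly series that starts in January and covers whole years, and it picks months by position, whereas cdutil works from the time bounds. The function names are hypothetical.

```python
def annual_cycle_climatology(monthly_values):
    """Mean of each calendar month over all years in the series."""
    n_years = len(monthly_values) // 12
    return [sum(monthly_values[m::12]) / n_years for m in range(12)]


def annual_cycle_departures(monthly_values, climatology=None):
    """Subtract a 12-month climatology; default to the series' own."""
    if climatology is None:
        climatology = annual_cycle_climatology(monthly_values)
    return [v - climatology[i % 12] for i, v in enumerate(monthly_values)]


# Two made-up years: the second is 2.0 warmer in every month
series = [float(m) for m in range(12)] + [float(m) + 2.0 for m in range(12)]

clim = annual_cycle_climatology(series)
anom = annual_cycle_departures(series)

# Departures from a reference climatology built from the first year only
anom_ref = annual_cycle_departures(series, annual_cycle_climatology(series[:12]))
```

Using the first year as the reference period gives zero departures for that year and +2.0 for the second, while the overall climatology splits the difference.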
18. MV2 Averaging
• The MV2 module has an averaging function:
MV2.average(x, axis=0, weights=None, returned=0)
– computes the average value of the non-masked elements of x
along the selected axis. If weights is given, it must match the
size and shape of x, and the value returned is:
sum(x * weights) / sum(weights)
– elements corresponding to those masked in x or weights are
ignored. If returned is true, a 2-tuple consisting of the average
and the sum of the weights is returned.
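The behaviour described above can be tried with NumPy's masked-array average, which has the same call pattern. That MV2.average behaves identically in every detail is an assumption; the numbers are made up.

```python
import numpy as np

# Third element is masked, so it and its weight should be ignored
x = np.ma.array([1.0, 2.0, 3.0, 4.0], mask=[False, False, True, False])
w = np.array([1.0, 1.0, 5.0, 2.0])

# returned=True gives back (average, sum of the weights actually used)
avg, wsum = np.ma.average(x, axis=0, weights=w, returned=True)

# The same value by hand, using only the non-masked elements
manual = (1.0 * 1.0 + 2.0 * 1.0 + 4.0 * 2.0) / (1.0 + 1.0 + 2.0)
```

The masked element contributes neither to the numerator nor to the sum of the weights.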
21. Usage of cdutil.averager
result = averager( V, axis=axisoptions,
weights=weightoptions,
action=actionoptions,
returned=returnedoptions,
combinewts=combinewtsoptions)
axisoptions has to be a string. You can pass axis='tyx', or '123', or 'x (plev)'.
weightoptions is one of 'generate' | 'weighted' | 'equal' | 'unweighted' |
array | Masked Variable
actionoptions is 'average' | 'sum' [Default = 'average'].
You can either return the weighted average or the weighted sum of the
data.
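The difference between area weights and equal weights, and between the 'average' and 'sum' actions, can be sketched in plain Python for a single latitude axis. The band weights use the sine of the latitude bounds; that cdutil generates its weights in exactly this way is an assumption, and the helper names are hypothetical.

```python
import math


def band_weights(lat_bounds):
    """Area weight of each latitude band from its bounds (degrees)."""
    return [math.sin(math.radians(hi)) - math.sin(math.radians(lo))
            for lo, hi in lat_bounds]


def reduce_axis(values, weights, action="average"):
    """Weighted sum, or weighted average when action is 'average'."""
    weighted_sum = sum(v * w for v, w in zip(values, weights))
    if action == "sum":
        return weighted_sum
    return weighted_sum / sum(weights)


lat_bounds = [(0.0, 30.0), (30.0, 60.0)]   # two made-up 30-degree bands
values = [10.0, 20.0]

generated = band_weights(lat_bounds)
area_avg = reduce_axis(values, generated, action="average")
area_sum = reduce_axis(values, generated, action="sum")
equal_avg = reduce_axis(values, [1.0, 1.0], action="average")
```

The band nearer the equator covers more area, so the area-weighted average sits closer to its value than the equal-weight average does.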
22. Example: Region Averaging and Climatology
import cdms2
import cdutil
# Define your custom region (INDIR below is the directory holding the data file).
NINO3 = cdms2.selectors.Selector(cdutil.region.domain(
    latitude=(-5., 5., 'ccb'), longitude=(210., 270., 'ccb')))
fsst = cdms2.open(INDIR + 'sst_HadISST_1870-1_2011-1.nc')
nino3_data = fsst('sst', NINO3)
print nino3_data.shape, nino3_data.getOrder()
# Compute the Spatial average
nino3_average = cdutil.averager(nino3_data, axis='xy')
# Anomaly from climatology computed over 1961-1990
nino3_slice = nino3_average(time=('1961-1-1', '1990-12-31'))
nino3_clim = cdutil.ANNUALCYCLE.climatology(nino3_slice)
print nino3_clim.shape
# Now departures
nino3_anomaly = cdutil.ANNUALCYCLE.departures(nino3_average, nino3_clim)
print nino3_anomaly.shape