For this post, we'll work with one of NOAA's climate-change simulations. The file is in .nc format (NetCDF), so to open it we need to be able to read NetCDF files. For instructions, see my blog post on how to install NetCDF on Linux and use Ruby as an interface (RubyNetCDF).
I'm using Ubuntu Linux, Ruby 1.8.7, NetCDF 3.6, ruby-netcdf-0.6.5.
Download The File
For this tutorial, download row #6 (ftp download: tas_A1.020101-022012.nc), "air_temperature," from NOAA's GFDL CM2.1 climate model.
How do we open this file?
I downloaded the file tas_A1.020101-022012.nc into my Downloads directory. Now I want to read it. Assuming you've installed ruby-netcdf, you can follow this short tutorial.
NB: To get the same results as below, you may need to add the following line at the start of your irb sessions:
require 'rubygems'
First navigate to the Downloads folder and use the following ncdump command (which comes with the NetCDF software we installed) to display information about the file.
$ cd Downloads
$ ncdump -h tas_A1.020101-022012.nc
This command has three parts:
- The command ncdump
- The option -h, which restricts the output to just summary data about the file
- The filename.
netcdf tas_A1.020101-022012 {
dimensions:
    lon = 144 ;
    lat = 90 ;
    time = UNLIMITED ; // (240 currently)
    bnds = 2 ;
variables:
    double lon(lon) ;
        lon:standard_name = "longitude" ;
        lon:long_name = "longitude" ;
        lon:units = "degrees_east" ;
        lon:axis = "X" ;
        lon:bounds = "lon_bnds" ;
    double lon_bnds(lon, bnds) ;
    double lat(lat) ;
        lat:standard_name = "latitude" ;
        lat:long_name = "latitude" ;
        lat:units = "degrees_north" ;
        lat:axis = "Y" ;
        lat:bounds = "lat_bnds" ;
    double lat_bnds(lat, bnds) ;
    double time(time) ;
        time:standard_name = "time" ;
        time:long_name = "time" ;
        time:units = "days since 0001-01-01 00:00:00" ;
        time:axis = "T" ;
        time:calendar = "noleap" ;
        time:bounds = "time_bnds" ;
    double time_bnds(time, bnds) ;
    double height ;
        height:standard_name = "height" ;
        height:long_name = "height" ;
        height:units = "m" ;
        height:axis = "Z" ;
        height:positive = "up" ;
    float tas(time, lat, lon) ;
        tas:standard_name = "air_temperature" ;
        tas:long_name = "Surface Air Temperature" ;
        tas:units = "K" ;
        tas:cell_methods = "time: mean" ;
        tas:coordinates = "height" ;
        tas:original_name = "t_ref" ;

// global attributes:
        :title = "GFDL CM2.1, 1%to2x (run1) 1%/year CO2 increase experiment (to doubling) output for IPCC AR4 and US CCSP" ;
        :institution = "NOAA GFDL (US Dept of Commerce / NOAA / Geophysical Fluid Dynamics Laboratory, Princeton, NJ, USA)" ;
        :source = "GFDL_CM2.1 (2004): atmosphere: AM2.1 (am2p13fv, M45L24); ocean: OM3.1 (mom4p1p7_om3p5, tripolar360x200L50); sea ice: SIS; land: LM2; infrastructure: FMS preK release" ;
        :contact = "GFDL.Climate.Model.Info@noaa.gov" ;
        :project_id = "IPCC Fourth Assessment and US CCSP Projects" ;
        :table_id = "Table A1 (20 September 2004)" ;
        :experiment_id = "1%/year CO2 increase experiment (to doubling)" ;
        :realization = 1 ;
        :cmor_version = 0.96f ;
        :Conventions = "CF-1.0" ;
        :history = "input/atmos.020101-022012.t_ref.nc At 20:33:05 on 02/01/2005, CMOR rewrote data to comply with CF standards and IPCC Fourth Assessment and US CCSP Projects requirements" ;
        :references = "The GFDL Data Portal (http://nomads.gfdl.noaa.gov/) provides access to NOAA/GFDL\'s publicly available model input and output data sets. From this web site one can view and download data sets and documentation, including those related to the GFDL CM2.1 model experiments run for the IPCC\'s 4th Assessment Report and the US CCSP." ;
        :comment = "GFDL experiment name = CM2.1U-D4_1PctTo2X_I1. PCMDI experiment name = 1%to2x (run1). Initial conditions for this experiment were taken from 1 January of year 1 of the 1860 control model experiment named CM2.1U_Control-1860_D4. In the CM2.1U-D4_1PctTo2X_I1 experiment atmospheric CO2 levels were prescribed to increase from their initial mixing ratio level of 286.05 ppmv at a compounded rate of +1 percent per year until year 70 (the point of doubling). CO2 levels were held constant at 572.11 ppmv from year 71 through the end of the 220 year long experiment. For the entire 220 year duration of the experiment, all non-CO2 forcing agents (CH4, N2O, halons, tropospheric and stratospheric O3, tropospheric sulfates, black and organic carbon, dust, sea salt, solar irradiance, and the distribution of land cover types) were held constant at values representative of year 1860." ;
        :gfdl_experiment_name = "CM2.1U-D4_1PctTo2X_I1" ;
}
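A side note on the time variable in that header: its units are "days since 0001-01-01 00:00:00" and its calendar is "noleap", which in the CF conventions means every year has exactly 365 days. Here is a minimal sketch of turning such a value into a model year and month. The helper name noleap_year_month and the two sample values are my own inventions for illustration; they are not read from the file and are not part of RubyNetCDF.

```ruby
# Convert a "days since 0001-01-01 00:00:00" value on a noleap
# (365-day) calendar into [year, month].
# Hypothetical helper for illustration; not part of RubyNetCDF.
def noleap_year_month(days)
  # Month lengths never change on a noleap calendar.
  month_lengths = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]

  # Whole years elapsed, and the (zero-based) day within the current year.
  years_elapsed, day_of_year = days.divmod(365)

  month = 1
  month_lengths.each do |len|
    break if day_of_year < len
    day_of_year -= len
    month += 1
  end

  # The reference date is in year 1, so add 1 to the elapsed years.
  [years_elapsed + 1, month]
end

# Made-up sample values (not taken from the file):
p noleap_year_month(73015.5)  # 200 full years plus 15.5 days
p noleap_year_month(73045.5)  # 200 full years plus 45.5 days
```

If the file name 020101-022012 means what it appears to mean (January of year 201 through December of year 220), that would account for the 240 time steps as 20 years of monthly means, but I haven't verified that against the time values themselves.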
Next, open an interactive Ruby session, load the RubyNetCDF library, and open the file:
$ irb --simple-prompt
>> require 'rubygems' #this will return false because I already loaded RubyGems
=> false
>> require 'numru/netcdf'
=> true
>> file = NumRu::NetCDF.open("tas_A1.020101-022012.nc")
=> NetCDF:tas_A1.020101-022012.nc
Two notes about the preceding lines:
- I used the --simple-prompt argument, but it's not necessary, and has nothing to do with NetCDF. It just cleans up the output a bit.
- The line require 'numru/netcdf' will fail unless you have RubyGems loaded. Mine is loaded automatically, so require 'rubygems' returns false, but I included it here as a reminder.
First, let's play around with the object we created.
NetCDF methods: NetCDF
The object we created is of class NetCDF, so we can use any of the NetCDF class's methods, such as var_names, att_names, and ndims.
>> file.class
=> NumRu::NetCDF
This uses the Ruby method class to show again that the object we named "file" is an object of class NetCDF. Try the NetCDF method att_names:
>> file.att_names
=> ["title", "institution", "source", "contact", "project_id", "table_id", "experiment_id", "realization", "cmor_version", "Conventions", "history", "references", "comment", "gfdl_experiment_name"]
Try the NetCDF method var_names:
>> file.var_names
=> ["lon", "lon_bnds", "lat", "lat_bnds", "time", "time_bnds", "height", "tas"]
Try the NetCDF method nvars:
>> file.nvars
=> 8
As you can see, var_names returned the names of all the variables associated with the NetCDF object called "file", and the NetCDF method nvars returned the number of those variables. You can do the same with dim_names and ndims (dimensions), as well as att_names and natts (attributes). This gives us a clue as to the structure of a NetCDF file: It has attributes, variables, and dimensions.
Attributes: NetCDFAtt
We already saw the names of all the attributes of the NetCDF object "file". Let's look at one of those in more depth:
>> file.att("title")
=> NetCDFAtt:title
The NetCDF method att opens an attribute. To use it, we just specify the name of the attribute we want to open. Ruby returns NetCDFAtt:title, which means we now have an object of class NetCDFAtt. So now we can use any of the NetCDFAtt methods on this object, such as name, atttype, or get.
>> file.att("title").name
=> "title"
>> file.att("title").atttype
=> "char"
>> file.att("title").get
=> "GFDL CM2.1, 1%to2x (run1) 1%/year CO2 increase experiment (to doubling) output for IPCC AR4 and US CCSP"
As you can see, name returns the name of the attribute; atttype returns the type of the attribute (possible values are things like character, float, etc.); and get returns the actual value of the attribute.
Just to drive home the point that with file.att we're dealing with an object of class "NetCDFAtt" and not "NetCDF", use the Ruby method class to check the class:
>> file.att("title").class
=> NumRu::NetCDFAtt
Variables: NetCDFVar
Now let's move on to variables. Just like att opens an attribute, we have var to open a variable:
>> file.var_names
=> ["lon", "lon_bnds", "lat", "lat_bnds", "time", "time_bnds", "height", "tas"]
>> file.var("tas")
=> NetCDFVar:tas_A1.020101-022012.nc?var=tas
>> file.var("tas").class
=> NumRu::NetCDFVar
Now we can use all the NetCDFVar methods on this object, such as vartype, att_names, att, and get.
>> file.var("tas").vartype
=> "sfloat"
>> file.var("tas").att_names
=> ["standard_name", "long_name", "units", "cell_methods", "coordinates", "original_name"]
>> file.var("tas").att("standard_name")
=> NetCDFAtt:standard_name
>> file.var("tas").att("standard_name").class
=> NumRu::NetCDFAtt
>> file.var("tas").att("standard_name").get
=> "air_temperature"
With the code above we see that the variable "tas":
- is of type "sfloat" (a single-precision floating-point number)
- has the 6 attributes listed above (you could check the number with the NetCDFVar method natts)
- has an attribute called "standard_name", which is an object of class NetCDFAtt...
- ...and that attribute has the value "air_temperature."
But how do we actually see some air temperature values? We use the NetCDFVar method get, which reads the variable's data into an array that we can then index:
>> file.var("tas").get[0]
=> 248.853698730469
>> file.var("tas").get[1]
=> 248.853637695312
>> file.var("tas").get[857390]
=> 271.350402832031
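Those numbers look cold because, according to the ncdump header, the units attribute of "tas" is "K" (kelvin). As a small sketch, here is how you might convert the three values shown above to degrees Celsius. The helper kelvin_to_celsius is my own name for illustration, not a RubyNetCDF method, and the values are pasted in by hand so the snippet runs without the NetCDF file.

```ruby
# Convert a temperature from kelvin to degrees Celsius.
# Hypothetical helper for illustration; not part of RubyNetCDF.
def kelvin_to_celsius(kelvin)
  kelvin - 273.15
end

# The three "tas" values printed in the irb session above, copied by hand.
samples = [248.853698730469, 248.853637695312, 271.350402832031]

samples.each do |kelvin|
  puts format("%.3f K = %.3f degrees C", kelvin, kelvin_to_celsius(kelvin))
end
```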
Dimensions: NetCDFDim
What are dimensions? It's not immediately clear, so let's dive in and look at the dimensions of the "tas" variable.
>> file.var("tas").dim_names
=> ["lon", "lat", "time"]
>> file.var("tas").dim(0)
=> NetCDFDim:lon
>> file.var("tas").dim(1)
=> NetCDFDim:lat
>> file.var("tas").dim(2)
=> NetCDFDim:time
>> file.var("tas").dim(2).class
=> NumRu::NetCDFDim
As we can see from the last line, once we open a dimension this way, we have an object of class NetCDFDim, so we can use that class's methods, such as name, length, and unlimited?.
>> file.var("tas").dim(0).name
=> "lon"
>> file.var("tas").dim(0).length
=> 144
>> file.var("tas").dim(0).unlimited?
=> false
>> file.var("tas").dim(2).name
=> "time"
>> file.var("tas").dim(2).length
=> 240
>> file.var("tas").dim(2).unlimited?
=> true
>> file.var("tas").dim(2).length_ul0
=> 0
Things to notice about the above code:
- Ruby indexing is zero-based, not one-based, so the first dimension is dim(0)
- RubyNetCDF lists the dimensions in the reverse of the order ncdump printed them (the header shows tas(time, lat, lon))
- The NetCDFDim method length returns the current length even when the dimension is unlimited: 240 for "time"
- A dimension being "unlimited" means a variable can grow to any length along that dimension. An example of this would be the ID number of individual records in a database; the ID will increment forever.
- The NetCDFDim method length_ul0 will return 0 (instead of 240) if the dimension is unlimited.
- The other two dimensions, latitude and longitude, correspond to real-world physical dimensions.
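The dimension lengths also explain what a single index such as 857390 (as in get[857390] earlier) refers to. Here is a minimal sketch, assuming the array returned by get is laid out with the first listed dimension (lon) varying fastest. That layout is my assumption about the underlying array; the lengths 144, 90, and 240 come from the ncdump header. The helper unravel is a made-up name, not a RubyNetCDF method.

```ruby
# Dimension lengths in the order RubyNetCDF lists them: lon, lat, time.
shape = [144, 90, 240]

# Turn a flat index into one index per dimension, assuming the first
# dimension varies fastest. Hypothetical helper; not part of RubyNetCDF.
def unravel(flat, shape)
  shape.map do |len|
    flat, idx = flat.divmod(len)  # peel off one dimension at a time
    idx
  end
end

lon_i, lat_i, time_i = unravel(857390, shape)
puts "lon index #{lon_i}, lat index #{lat_i}, time index #{time_i}"

# Going the other way reproduces the flat index.
flat_again = lon_i + shape[0] * (lat_i + shape[1] * time_i)
puts "flat index #{flat_again}"
```

Under that assumption, flat index 857390 is the grid cell at lon index 14, lat index 14, in time step 66 (all zero-based).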
What's the difference between a variable and dimension?
The dimension class exists to help describe a variable. In fact, these dimensions are themselves variables. Check it out:
The variable "tas" (air_temperature) has three dimensions (lon, lat, time):
>> file.var("tas").dim_names
=> ["lon", "lat", "time"]
But if we get the variables of the whole "file" object, we see those same three "dimensions" appear here as variables:
>> file.var_names
=> ["lon", "lon_bnds", "lat", "lat_bnds", "time", "time_bnds", "height", "tas"]
If we look at the dimensions of the variable/dimension "lon" we see that it has only one dimension, which is itself:
>> file.var("lon").dim_names
=> ["lon"]
So all the data are stored as variables, but some of the variables serve as dimensions to other variables. That's the difference between variables and dimensions, and that's why NetCDF files are called "self-describing."
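To make that relationship concrete, here is a small pure-Ruby sketch that models the file's layout with plain hashes and picks out the variables that double as dimensions. The data is typed in by hand from the ncdump header (with each variable's dimensions in the order ncdump prints them), so it runs without RubyNetCDF; the names dimensions, variables, coordinate_vars, and plain_dims are my own.

```ruby
# Dimension names and current lengths, copied from the ncdump header.
dimensions = { "lon" => 144, "lat" => 90, "time" => 240, "bnds" => 2 }

# Each variable mapped to its dimensions, in ncdump order.
variables = {
  "lon"       => ["lon"],
  "lon_bnds"  => ["lon", "bnds"],
  "lat"       => ["lat"],
  "lat_bnds"  => ["lat", "bnds"],
  "time"      => ["time"],
  "time_bnds" => ["time", "bnds"],
  "height"    => [],                     # a scalar: no dimensions at all
  "tas"       => ["time", "lat", "lon"]
}

# Dimensions that also exist as a variable of the same name.
coordinate_vars = dimensions.keys.select { |name| variables.key?(name) }

# Dimensions with no matching variable.
plain_dims = dimensions.keys - coordinate_vars

puts "Dimensions that are also variables: #{coordinate_vars.join(', ')}"
puts "Dimensions with no variable: #{plain_dims.join(', ')}"
```

In this file, lon, lat, and time each have a matching variable, while bnds is only a dimension, so "dimensions are variables" holds for the ones that carry real-world coordinates.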
So I've got a NetCDF file described as such: https://gist.github.com/4198037
Do you happen to know of a way to get output of a whole "row" of data?
I'm after the value in the "level" dim... and I'm looking for the code which would allow me to do something like:
file.var("vegtype").get[0].dim(2)
where I could step through the value given here as " 0 " for each row ... and subsequently get the dim(2) ... the "level" from each row