I'm using Ubuntu Linux, Ruby 1.8.7, NetCDF 3.6, ruby-netcdf-0.6.5.
First, download and open the file as described in Part 1. Start an IRb (Interactive Ruby) session and open the file:
$ irb --simple-prompt >> require 'rubygems' #this will return false because I already loaded RubyGems => false >> require 'numru/netcdf' => true >> file = NumRu::NetCDF.open("tas_A1.020101-022012.nc") => NetCDF:tas_A1.020101-022012.nc
to motive this discussion, drill down to the actual temperature data (the variable is "tas") and look around:
>> file.var("tas").class => NumRu::NetCDFVar >> file.var("tas").get[0] => 248.853698730469 >> file.var("tas").get[0].class => Float >> file.var("tas").get.class => NArrayA few things to notice, in order:
- The variable itself is of the class
NetCDFVar
. No surprises there. - The actual value of the very first temperature data point (
get[0]
) is 248.853698730469.This implies the temperature readings are in Kelvin. I'll also note that it's a little strange to report temperature to 12 decimal places. - The actual data is float object, meaning a number with a decimal point.
- The command
get
without an index returns an object of classNArray
.
This file (tas_A1.020101-022012.nc) is part of a NOAA climate model. It holds surface temperature data. Each data point has a longitude, a latitude, a time, and a temperature. To begin, I would like to know two things:
- how many longitude/latitude points are we talking about, and
- for each geographical point, how many temperature readings do we have?
The "tas" NArray Object
First, how many points are we talking about? We can know that by asking how large is the NArray object returned by theget
command on the variable "tas":>> file.var("tas").get.size => 3110400That's 3,110,400. But are those all temperature readings? What exactly are we dealing with? First notice that this NArray object has 3 dimensions:
>> file.var("tas").get.dim => 3Why should temperature data be in a 3-dimensional array? I have no idea. Let's dig deeper. What exactly are the lengths of these 3 dimensions? That's the shape method (of an NArray object):
>> file.var("tas").get.shape => [144, 90, 240]Notice that 144 * 90 * 240 = 3,110,400, so that's where the size of this array comes from. But why those lengths? Who knows. But since we know it's dimensions, we can call individual elements by specifying the index of each dimension:
>> file.var("tas").get[0,0,0] => 248.853698730469 >> file.var("tas").get[1,0,0] => 248.853637695312 >> file.var("tas").get[82,23,42] => 283.665496826172
That's a dead end for me, so let's dive into a different variable.
Time
>> file.dim_names => ["lon", "lat", "time", "bnds"] >> file.dim("time") => NetCDFDim:time >> file.dim("time").length => 240That's a great clue: the length of the variable "time" is 240, the same length as one of the dimensions of the "tas" (temperature) variable. Keep digging:
>> file.var("time").vartype => "float" >> file.var("time").natts => 6 >> file.var("time").att_names => ["standard_name", "long_name", "units", "axis", "calendar", "bounds"] >> file.var("time").att("long_name").get => "time" >> file.var("time").att("units").get => "days since 0001-01-01 00:00:00"This is the next important clue: the units of the "time" variable are days since...the beginning (whenever that is). Next, we use the shape_current command to reaffirm the length of this dimension, and then check that we indeed have only 240 values:
>> file.var("time").shape_current => [240] >> file.var("time").get[0] => 73015.5 >> file.var("time").get[239] => 80284.5 >> file.var("time").get[240] IndexError: index out of range from (irb):194:in `[]' from (irb):194 from :0As expected, when we try for an index of 240, it throws an error. So we have 240 points in time, measured in days since the beginning. We can divide by 365 to get a better feel for what we're dealing wtih. (Note that I use the
get
method to return an NArray object, which includes the methods min
and max
.)>> file.var("time").get.min => 73015.5 >> file.var("time").get.min/365 => 200.042465753425 >> file.var("time").get.max/365 => 219.957534246575So the starting time is year 200, and the ending time is about year 220. We have 240 data points, so assuming they're evenly distributed, we have 240 / 20 = 12, or one data point every month. Makes sense. It seems like we have one temperature reading per month; so as we move along the 240 axis, we're moving along in time. Now what about the other two axes? Let's look at "lon".
Longitude
>> file.var_names => ["lon", "lon_bnds", "lat", "lat_bnds", "time", "time_bnds", "height", "tas"] >> file.var("lon").att_names => ["standard_name", "long_name", "units", "axis", "bounds"] >> file.var("lon").att("long_name").get => "longitude" >> file.var("lon").att("units").get => "degrees_east" >> file.var("lon").att("axis").get => "X" >> file.var("lon").att("bounds").get => "lon_bnds"
We have longitute, measured in degrees east along the X axis, bounded by "lon_bounds." Not sure what that last one means.
>> file.var("lon").get.shape => [144] >> file.var("lon").get.min => 1.25 >> file.var("lon").get.max => 358.75So we have 144 longitude points, from 1.25 to 358. We know there are 360 degrees around the Earth, so to find our precision, we divide 360 by 144:
>> 360.0/144 => 2.5 >> 360.0/144.0 => 2.5So we essentially have an average of one data point every 2.5 degrees. At the equator that translates to one data point approximately every 170 miles. Now let's look at latitude.
Latitude
>> file.var_names => ["lon", "lon_bnds", "lat", "lat_bnds", "time", "time_bnds", "height", "tas"] >> file.var("lat").att_names => ["standard_name", "long_name", "units", "axis", "bounds"] >> file.var("lat").att("long_name").get => "latitude" >> file.var("lat").att("units").get => "degrees_north" >> file.var("lat").att("axis").get => "Y"
We have latitude, measured in degrees north, represented by the Y axis. More detail:
>> file.var("lat").get.shape => [90] >> file.var("lat").get.min => -89.494384765625 >> file.var("lat").get.max => 89.494384765625
We have 90 data points, stretching from about -90 to +90 degrees north; that's about one every two degrees, or about one data point every 140 miles. Note that these are appoximations, since we can't map a square grid onto a sphere. The geographical data resolution will shift across the topology of a sphere.
Now let's see if we can get at the data itself.
Put it all together
We know from before that the temperature data (in variable "tas") is in a 3-dimensional grid with lengths [144, 90, 240]. We now know that those surely correspond to [longitude, latitude, months]. Let's say we want to find the temperature at longitude 0 (corresponds to England, Morroco, and Ghana), latitude 0 (the equator), month 0 (the beginning). First we need to know which index in 0 to 90 corresponds to latitude 0. If we print out all the latitudes, we see that they start at -90 and increase to +90:
>> file.var("lat").get.to_a.each_with_index{|value, index| puts "#{index} #{value}"} 0 -89.494384765625 1 -87.9775314331055 2 -85.9550552368164 3 -83.9325866699219 4 -81.9101104736328 5 -79.8876419067383 6 -77.8651657104492 7 -75.8426971435547 8 -73.8202209472656 9 -71.7977523803711 10 -69.7752838134766 11 -67.7528076171875 12 -65.730339050293 13 -63.7078666687012 14 -61.6853942871094 15 -59.6629219055176 16 -57.6404495239258 17 -55.617977142334 18 -53.5955047607422 19 -51.5730323791504 20 -49.5505599975586 21 -47.5280914306641 22 -45.5056190490723 23 -43.4831466674805 24 -41.4606742858887 25 -39.4382019042969 26 -37.4157295227051 27 -35.3932571411133 28 -33.3707847595215 29 -31.3483142852783 30 -29.3258419036865 31 -27.3033714294434 32 -25.2808990478516 33 -23.2584266662598 34 -21.235954284668 35 -19.2134838104248 36 -17.191011428833 37 -15.1685390472412 38 -13.1460676193237 39 -11.1235952377319 40 -9.10112380981445 41 -7.07865190505981 42 -5.05618000030518 43 -3.03370785713196 44 -1.01123595237732 45 1.01123595237732 46 3.03370785713196 47 5.05618000030518 48 7.07865190505981 49 9.10112380981445 50 11.1235952377319 51 13.1460676193237 52 15.1685390472412 53 17.191011428833 54 19.2134838104248 55 21.235954284668 56 23.2584266662598 57 25.2808990478516 58 27.3033714294434 59 29.3258419036865 60 31.3483142852783 61 33.3707847595215 62 35.3932571411133 63 37.4157295227051 64 39.4382019042969 65 41.4606742858887 66 43.4831466674805 67 45.5056190490723 68 47.5280914306641 69 49.5505599975586 70 51.5730323791504 71 53.5955047607422 72 55.617977142334 73 57.6404495239258 74 59.6629219055176 75 61.6853942871094 76 63.7078666687012 77 65.730339050293 78 67.7528076171875 79 69.7752838134766 80 71.7977523803711 81 73.8202209472656 82 75.8426971435547 83 77.8651657104492 84 79.8876419067383 85 81.9101104736328 86 83.9325866699219 87 85.9550552368164 88 87.9775314331055 89 89.494384765625The midway point, then, corresponds to latitude zero, the equator. We access that with
>> file.var("lat").get[45] => 1.01123595237732This is a close enough approximation to zero latitude. Now we can find the temperature at longitude 0, latitude 0, and month 0:
>> file.var("tas").get[0, 45, 0] => 300.574279785156 >> file.var("tas").get[0, 45, 0]-273 => 27.5742797851562We get an answer of 300 Kelvin, or about 27 degrees Celsius. 0 longitude and 0 latitude is a spot off the southern coast of Ghana, Africa. I suppose 27 Celsius sounds about right for January. Let's see what temperature England reports, at the same time and longitude, but at England's latitude (which is about 52). We look up latitude 52 in the list above, and see that it corresponds to an index of 70.
>> file.var("tas").get[0,70,0]-273 => 1.03604125976562Despite the absurd number of significant digits, that number looks about right for England in January. How about England in July?
>> file.var("tas").get[0,70,6]-273 => 17.9317932128906Much more pleasant.