I'm using Ubuntu Linux, Ruby 1.8.7, NetCDF 3.6, ruby-netcdf-0.6.5.
First, download and open the file as described in Part 1. Start an IRb (Interactive Ruby) session and open the file:
$ irb --simple-prompt
>> require 'rubygems' #this will return false because I already loaded RubyGems
=> false
>> require 'numru/netcdf'
=> true
>> file = NumRu::NetCDF.open("tas_A1.020101-022012.nc")
=> NetCDF:tas_A1.020101-022012.ncto motive this discussion, drill down to the actual temperature data (the variable is "tas") and look around:
>> file.var("tas").class
=> NumRu::NetCDFVar
>> file.var("tas").get[0]
=> 248.853698730469
>> file.var("tas").get[0].class
=> Float
>> file.var("tas").get.class
=> NArrayA few things to notice, in order:- The variable itself is of the class
NetCDFVar. No surprises there. - The actual value of the very first temperature data point (
get[0]) is 248.853698730469.This implies the temperature readings are in Kelvin. I'll also note that it's a little strange to report temperature to 12 decimal places. - The actual data is float object, meaning a number with a decimal point.
- The command
getwithout an index returns an object of classNArray.
This file (tas_A1.020101-022012.nc) is part of a NOAA climate model. It holds surface temperature data. Each data point has a longitude, a latitude, a time, and a temperature. To begin, I would like to know two things:
- how many longitude/latitude points are we talking about, and
- for each geographical point, how many temperature readings do we have?
The "tas" NArray Object
First, how many points are we talking about? We can know that by asking how large is the NArray object returned by theget command on the variable "tas":>> file.var("tas").get.size
=> 3110400
That's 3,110,400. But are those all temperature readings? What exactly are we dealing with? First notice that this NArray object has 3 dimensions:>> file.var("tas").get.dim
=> 3Why should temperature data be in a 3-dimensional array? I have no idea. Let's dig deeper. What exactly are the lengths of these 3 dimensions? That's the shape method (of an NArray object):>> file.var("tas").get.shape
=> [144, 90, 240]Notice that 144 * 90 * 240 = 3,110,400, so that's where the size of this array comes from. But why those lengths? Who knows. But since we know it's dimensions, we can call individual elements by specifying the index of each dimension:>> file.var("tas").get[0,0,0]
=> 248.853698730469
>> file.var("tas").get[1,0,0]
=> 248.853637695312
>> file.var("tas").get[82,23,42]
=> 283.665496826172
That's a dead end for me, so let's dive into a different variable.
Time
>> file.dim_names
=> ["lon", "lat", "time", "bnds"]
>> file.dim("time")
=> NetCDFDim:time
>> file.dim("time").length
=> 240
That's a great clue: the length of the variable "time" is 240, the same length as one of the dimensions of the "tas" (temperature) variable. Keep digging:>> file.var("time").vartype
=> "float"
>> file.var("time").natts
=> 6
>> file.var("time").att_names
=> ["standard_name", "long_name", "units", "axis", "calendar", "bounds"]
>> file.var("time").att("long_name").get
=> "time"
>> file.var("time").att("units").get
=> "days since 0001-01-01 00:00:00"
This is the next important clue: the units of the "time" variable are days since...the beginning (whenever that is). Next, we use the shape_current command to reaffirm the length of this dimension, and then check that we indeed have only 240 values:>> file.var("time").shape_current
=> [240]
>> file.var("time").get[0]
=> 73015.5
>> file.var("time").get[239]
=> 80284.5
>> file.var("time").get[240]
IndexError: index out of range
from (irb):194:in `[]'
from (irb):194
from :0
As expected, when we try for an index of 240, it throws an error. So we have 240 points in time, measured in days since the beginning. We can divide by 365 to get a better feel for what we're dealing wtih. (Note that I use the get method to return an NArray object, which includes the methods min and max.)>> file.var("time").get.min
=> 73015.5
>> file.var("time").get.min/365
=> 200.042465753425
>> file.var("time").get.max/365
=> 219.957534246575So the starting time is year 200, and the ending time is about year 220. We have 240 data points, so assuming they're evenly distributed, we have 240 / 20 = 12, or one data point every month. Makes sense. It seems like we have one temperature reading per month; so as we move along the 240 axis, we're moving along in time. Now what about the other two axes? Let's look at "lon".Longitude
>> file.var_names
=> ["lon", "lon_bnds", "lat", "lat_bnds", "time", "time_bnds", "height", "tas"]
>> file.var("lon").att_names
=> ["standard_name", "long_name", "units", "axis", "bounds"]
>> file.var("lon").att("long_name").get
=> "longitude"
>> file.var("lon").att("units").get
=> "degrees_east"
>> file.var("lon").att("axis").get
=> "X"
>> file.var("lon").att("bounds").get
=> "lon_bnds"
We have longitute, measured in degrees east along the X axis, bounded by "lon_bounds." Not sure what that last one means.
>> file.var("lon").get.shape
=> [144]
>> file.var("lon").get.min
=> 1.25
>> file.var("lon").get.max
=> 358.75So we have 144 longitude points, from 1.25 to 358. We know there are 360 degrees around the Earth, so to find our precision, we divide 360 by 144:>> 360.0/144 => 2.5 >> 360.0/144.0 => 2.5So we essentially have an average of one data point every 2.5 degrees. At the equator that translates to one data point approximately every 170 miles. Now let's look at latitude.
Latitude
>> file.var_names
=> ["lon", "lon_bnds", "lat", "lat_bnds", "time", "time_bnds", "height", "tas"]
>> file.var("lat").att_names
=> ["standard_name", "long_name", "units", "axis", "bounds"]
>> file.var("lat").att("long_name").get
=> "latitude"
>> file.var("lat").att("units").get
=> "degrees_north"
>> file.var("lat").att("axis").get
=> "Y"We have latitude, measured in degrees north, represented by the Y axis. More detail:
>> file.var("lat").get.shape
=> [90]
>> file.var("lat").get.min
=> -89.494384765625
>> file.var("lat").get.max
=> 89.494384765625
We have 90 data points, stretching from about -90 to +90 degrees north; that's about one every two degrees, or about one data point every 140 miles. Note that these are appoximations, since we can't map a square grid onto a sphere. The geographical data resolution will shift across the topology of a sphere.
Now let's see if we can get at the data itself.
Put it all together
We know from before that the temperature data (in variable "tas") is in a 3-dimensional grid with lengths [144, 90, 240]. We now know that those surely correspond to [longitude, latitude, months]. Let's say we want to find the temperature at longitude 0 (corresponds to England, Morroco, and Ghana), latitude 0 (the equator), month 0 (the beginning). First we need to know which index in 0 to 90 corresponds to latitude 0. If we print out all the latitudes, we see that they start at -90 and increase to +90:
>> file.var("lat").get.to_a.each_with_index{|value, index| puts "#{index} #{value}"}
0 -89.494384765625
1 -87.9775314331055
2 -85.9550552368164
3 -83.9325866699219
4 -81.9101104736328
5 -79.8876419067383
6 -77.8651657104492
7 -75.8426971435547
8 -73.8202209472656
9 -71.7977523803711
10 -69.7752838134766
11 -67.7528076171875
12 -65.730339050293
13 -63.7078666687012
14 -61.6853942871094
15 -59.6629219055176
16 -57.6404495239258
17 -55.617977142334
18 -53.5955047607422
19 -51.5730323791504
20 -49.5505599975586
21 -47.5280914306641
22 -45.5056190490723
23 -43.4831466674805
24 -41.4606742858887
25 -39.4382019042969
26 -37.4157295227051
27 -35.3932571411133
28 -33.3707847595215
29 -31.3483142852783
30 -29.3258419036865
31 -27.3033714294434
32 -25.2808990478516
33 -23.2584266662598
34 -21.235954284668
35 -19.2134838104248
36 -17.191011428833
37 -15.1685390472412
38 -13.1460676193237
39 -11.1235952377319
40 -9.10112380981445
41 -7.07865190505981
42 -5.05618000030518
43 -3.03370785713196
44 -1.01123595237732
45 1.01123595237732
46 3.03370785713196
47 5.05618000030518
48 7.07865190505981
49 9.10112380981445
50 11.1235952377319
51 13.1460676193237
52 15.1685390472412
53 17.191011428833
54 19.2134838104248
55 21.235954284668
56 23.2584266662598
57 25.2808990478516
58 27.3033714294434
59 29.3258419036865
60 31.3483142852783
61 33.3707847595215
62 35.3932571411133
63 37.4157295227051
64 39.4382019042969
65 41.4606742858887
66 43.4831466674805
67 45.5056190490723
68 47.5280914306641
69 49.5505599975586
70 51.5730323791504
71 53.5955047607422
72 55.617977142334
73 57.6404495239258
74 59.6629219055176
75 61.6853942871094
76 63.7078666687012
77 65.730339050293
78 67.7528076171875
79 69.7752838134766
80 71.7977523803711
81 73.8202209472656
82 75.8426971435547
83 77.8651657104492
84 79.8876419067383
85 81.9101104736328
86 83.9325866699219
87 85.9550552368164
88 87.9775314331055
89 89.494384765625
The midway point, then, corresponds to latitude zero, the equator. We access that with >> file.var("lat").get[45]
=> 1.01123595237732
This is a close enough approximation to zero latitude. Now we can find the temperature at longitude 0, latitude 0, and month 0:>> file.var("tas").get[0, 45, 0]
=> 300.574279785156
>> file.var("tas").get[0, 45, 0]-273
=> 27.5742797851562
We get an answer of 300 Kelvin, or about 27 degrees Celsius. 0 longitude and 0 latitude is a spot off the southern coast of Ghana, Africa. I suppose 27 Celsius sounds about right for January. Let's see what temperature England reports, at the same time and longitude, but at England's latitude (which is about 52). We look up latitude 52 in the list above, and see that it corresponds to an index of 70.>> file.var("tas").get[0,70,0]-273
=> 1.03604125976562
Despite the absurd number of significant digits, that number looks about right for England in January. How about England in July?>> file.var("tas").get[0,70,6]-273
=> 17.9317932128906
Much more pleasant.