[gradsusr] Using subsetting on a grads data server
Jennifer Adams
jma at cola.iges.org
Wed Aug 10 15:10:17 EDT 2011
Hi, Shaun --
In your first example, when you just open the URL and then define the
3-D object, the I/O is done one 2-D grid at a time, looping over time, so
you will hit the server 29 times, bringing back your lat/lon subset
grid for each time step. The server does not cache the subset data, so
if you were to ask for this same data set again, it would take just as
long.
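As a rough illustration of that request count (a sketch only, not GDS or
GrADS code; the function name is made up here), the number of server hits
is just the number of 2-D grids in the defined object. The per-level
multiplier is an assumption for variables with a vertical dimension;
apcpsfc has a single level.

```python
def count_grid_requests(t_start, t_end, n_levels=1):
    """Number of 2-D grids (and so server requests) for a defined object.

    Assumes one request per time step per vertical level; the function
    name and signature are hypothetical, for illustration only.
    """
    if t_end < t_start or n_levels < 1:
        raise ValueError("need t_end >= t_start and n_levels >= 1")
    n_times = t_end - t_start + 1  # 'set t 1 29' is inclusive at both ends
    return n_times * n_levels


# 'set t 1 29' on a single-level variable such as apcpsfc
requests = count_grid_requests(1, 29)
print(requests)
```

Narrowing the lat/lon box shrinks each of those grids, but it does not
change how many requests are made.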
In your second example, you are using the _expr_ syntax to create a
new data set that is not an analysis result, merely a subset. The
difference is that this time the GDS will write out this 3-D data set
as a binary file in its cache and then send it to you across the network
as you make data requests from the GrADS client for the variable
called 'result'. Because you invoke 'define' again, with the same 3-D
grid dimensions, the network traffic will be the same from the server
to you, 29 hits, one 2-D grid per time step.
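For reference, the pieces in your script concatenate into a single
request of the form _expr_{dataset}{expression}{dimensions}. The sketch
below just mirrors that string assembly in Python, using the values from
your message; the helper name is hypothetical, and nothing here contacts
the server.

```python
def build_expr_url(base_url, dataset, expression, dimensions):
    """Join the parts of an _expr_ request, wrapping each part in braces.

    Hypothetical helper for illustration; it only builds the string.
    """
    parts = (dataset, expression, dimensions)
    return base_url + "".join("{" + p + "}" for p in parts)


url = build_expr_url(
    "http://nomads.ncep.noaa.gov:9090/dods/nam/_expr_",
    "nam20110808/nam_crb_00z",
    "apcpsfc",
    "269:273,12:15,1000:1000,00Z08AUG2011:12Z11AUG2011",
)
print(url)
```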
This second technique requires an extra I/O step: the first step reads
the grib2 data from the original file and writes it out to the cache as
a binary file, and the second reads that binary file from the cache to
fulfill your data request from the client. So, for small
subsets, it will be faster to just open the URL and invoke 'define'.
However, the GDS does not invoke GrADS to do the I/O to read the
binary data in the cache -- it is a direct file read, so if the cached
file is sufficiently big, then the second technique will end up being
faster. In this case, "sufficiently big" means BIG, and you are likely
to run into configuration limits on the size of an _expr_ result before
you notice a performance difference. Also, you will populate the
server's cache with a lot of binary files, causing the server to
gobble up local resources and slow down.
The bottom line is this: don't use _expr_ syntax for basic subsetting.
--Jennifer
On Aug 10, 2011, at 10:28 AM, Shaun Carney wrote:
> Hello,
> I have what hopefully is a simple question. I'm trying to do a clip of
> a precip grid to a small area on a grads server rather than opening
> the entire dataset and then doing the clip. The following works:
>
> 'sdfopen http://nomads.ncep.noaa.gov:9090/dods/nam/nam20110808/nam_crb_00z'
> 'set lon 269 273'
> 'set lat 12 15'
> 'set t 1 29'
> 'define namprecip = apcpsfc'
>
> However, when I try to give an expression so that I only get the
> subset from the server, it will not work:
>
> baseurl = 'http://nomads.ncep.noaa.gov:9090/dods/nam/_expr_'
> datasets = '{nam20110808/nam_crb_00z}'
> expression = '{apcpsfc}'
> dimensions = '{269:273,12:15,1000:1000,00Z08AUG2011:12Z11AUG2011}'
> 'sdfopen '%baseurl%datasets%expression%dimensions
> 'set t 1 29'
> 'define namprecip = result'
>
> Grads gives the following error:
> gadsdf: Couldn't ingest SDF metadata
>
> Any ideas why this will not read? On a more basic level, will there be
> any speed improvement using the second method rather than the first,
> or does setting the coord box before defining namprecip reduce the
> data transfer?
>
> This will be implemented on a computer with slow connection speed so I
> want to minimize data transfer.
>
> Thanks!
> Shaun
--
Jennifer M. Adams
IGES/COLA
4041 Powder Mill Road, Suite 302
Calverton, MD 20705
jma at cola.iges.org