[gradsusr] Using subsetting on a grads data server

Jennifer Adams jma at cola.iges.org
Wed Aug 10 15:10:17 EDT 2011


Hi, Shaun --
In your first example, when you just open the URL and then define the  
3-D object, the I/O is done one 2-D grid at a time, looping over time, so
you will hit the server 29 times, bringing back your lat/lon subset  
grid for each time step. The server does not cache the subset data, so  
if you were to ask for this same data set again, it would take just as  
long.
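
To put rough numbers on that request pattern, here is a minimal Python
sketch. It is only an illustration: the function name, the grid
dimensions (nx, ny) and the 4 bytes per value are assumptions, not
properties of the NAM data set or of the GDS.

```python
def subset_requests(nx, ny, n_times, bytes_per_value=4):
    """Estimate the request count and raw payload for a lat/lon/time subset.

    Assumes one request per time step, each returning one nx-by-ny 2-D grid,
    and a fixed number of bytes per grid value (4 is an assumption).
    Protocol overhead is ignored.
    """
    requests = n_times                          # one hit per time step
    payload = nx * ny * bytes_per_value * n_times
    return requests, payload


# Hypothetical 40 x 30 point subset over the 29 time steps in the example
requests, payload = subset_requests(nx=40, ny=30, n_times=29)
print(requests, payload)
```

The point of the arithmetic is that the payload scales with the size
of the lat/lon box you set before 'define', not with the size of the
full grid.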

In your second example, you are using the _expr_ syntax to create a  
new data set that is not an analysis result, merely a subset. The  
difference is that this time the GDS will write out this 3-D data set  
as a binary file in its cache and then send it to you across the network
as you make data requests from the GrADS client for the variable  
called 'result'. Because you invoke 'define' again, with the same 3-D  
grid dimensions, the network traffic will be the same from the server  
to you, 29 hits, one 2-D grid per time step.

This second technique requires extra I/O on the server: one step to
read the grib2 data from the original file and write it out to the
cache as a binary file, and then a second step to read the binary file
in the cache to fulfill your data request from the client. So, for
small subsets, it will be faster to just open the URL and invoke 'define'.
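
The extra server-side work can be counted with a small Python sketch.
This is a rough tally under stated assumptions, not a measurement: the
function name is made up, and it treats every pass over the subset as
moving the same number of bytes, which ignores grib2 packing and any
per-request overhead.

```python
def server_io_bytes(subset_bytes, use_expr):
    """Approximate bytes the server moves on disk to serve one subset.

    use_expr=False: open the URL directly and 'define' the subset.
    use_expr=True:  request the subset through the _expr_ syntax.
    """
    grib_read = subset_bytes        # read the subset from the original file
    if not use_expr:
        return grib_read
    cache_write = subset_bytes      # write the subset to the cache as binary
    cache_read = subset_bytes       # read it back to answer client requests
    return grib_read + cache_write + cache_read


subset = 139200  # hypothetical subset size in bytes
print(server_io_bytes(subset, use_expr=False))
print(server_io_bytes(subset, use_expr=True))
```

Under these assumptions the network transfer is identical in both
cases; only the disk work on the server changes.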

However, the GDS does not invoke GrADS to do the I/O to read the  
binary data in the cache -- it is a direct file read, so if the cached  
file is sufficiently big, then the second technique will end up being  
faster. In this case, "sufficiently big" means BIG, and you are likely  
to run into configuration limits on the size of an _expr_ result before
you notice a performance difference. Also, you will populate the
server's cache with a lot of binary files, causing the server to
gobble up local resources and slow down.

The bottom line is this: don't use _expr_ syntax for basic subsetting.
--Jennifer


On Aug 10, 2011, at 10:28 AM, Shaun Carney wrote:

> Hello,
> I have what hopefully is a simple question. I'm trying to do a clip of
> a precip grid to a small area on a grads server rather than opening
> the entire dataset and then doing the clip. The following works:
>
> 'sdfopen http://nomads.ncep.noaa.gov:9090/dods/nam/nam20110808/nam_crb_00z'
> 'set lon 269 273'
> 'set lat 12 15'
> 'set t 1 29'
> 'define namprecip = apcpsfc'
>
> However, when I try to give an expression so that I only get the
> subset from the server, it will not work:
>
> baseurl    = 'http://nomads.ncep.noaa.gov:9090/dods/nam/_expr_'
> datasets   = '{nam20110808/nam_crb_00z}'
> expression = '{apcpsfc}'
> dimensions = '{269:273,12:15,1000:1000,00Z08AUG2011:12Z11AUG2011}'
> 'sdfopen '%baseurl%datasets%expression%dimensions
> 'set t 1 29'
> 'define namprecip = result'
>
> Grads gives the following error:
> gadsdf: Couldn't ingest SDF metadata
>
> Any ideas why this will not read? On a more basic level, will there be
> any speed improvement using the second method rather than the first,
> or does setting the coord box before defining namprecip reduce the
> data transfer?
>
> This will be implemented on a computer with slow connection speed so I
> want to minimize data transfer.
>
> Thanks!
> Shaun

--
Jennifer M. Adams
IGES/COLA
4041 Powder Mill Road, Suite 302
Calverton, MD 20705
jma at cola.iges.org
