[gradsusr] Data download help
Jennifer Adams
jma at cola.iges.org
Wed Jul 10 18:06:27 EDT 2013
Dear All,
Jim Potemra brings up a good point which is worth mentioning to the entire forum. You might want to read his message (included below) before reading the rest of my response.
When reading netcdf files, GrADS doesn't read every time axis value to make sure the increment is the same throughout the file. It assumes a linear time axis and therefore only reads the first two values to get start time and increment. (If you provide all the time axis metadata in a descriptor, GrADS never checks the metadata in the data file at all).
I encountered this problem with some CMIP5 data -- some time steps were missing from an NCAR model run, so they were simply skipped and the distributed file did not have a linear time axis. The missing grids were hidden in time axis values that did not increase linearly -- somewhere in the middle of the file was a delta-T twice as big as all the others. This is very difficult for GrADS to detect. My solution was to break the file into two parts at the point where the missing time occurred and rename the subset files with appropriate date strings. GrADS handled it from there.
I believe it would be a performance hit to check the time axis values for all times in every file when opening a dataset, especially one being served via opendap. And I am unlikely to take a lot of time writing new code to slow down GrADS to accommodate (what I feel are) sloppy data practices. Gridded model output should be regularly spaced in time.
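GrADS won't do this check for you, but it is cheap to do once, offline, on a suspect file. A minimal sketch in shell, assuming you have already extracted the raw time values (e.g. from the output of 'ncdump -v time'); all values and filenames below are made up:

```shell
# Flag any step in a list of time-axis values that differs from
# the increment implied by the first two values (which is all
# GrADS reads when it assumes a linear axis).
check_uniform() {
  awk 'NR == 2 { dt = $1 - prev }
       NR > 2 && ($1 - prev) != dt { printf "gap before value %s\n", $1 }
       { prev = $1 }'
}

# A time axis with one doubled delta-T, as in the CMIP5 example:
printf '%s\n' 0 1 2 3 5 6 | check_uniform
# prints: gap before value 5

# Once the gap is located, the split-at-the-gap workaround could use
# NCO's ncks (0-based, inclusive hyperslab indices; names made up):
#   ncks -d time,0,59   model_run.nc part1.nc
#   ncks -d time,60,119 model_run.nc part2.nc
```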
It is my opinion that GrADS handles file aggregation elegantly, and all the other methods out there cannot compete, but I am obviously extremely biased.
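Jim's six-day example below can be made concrete with a quick sketch (plain shell; the day names stand in for daily grids, and Wednesday's file is the missing one):

```shell
# Grids actually stored in the aggregated file (Wednesday missing):
stored="Sun Mon Tue Thu Fri Sat"

# GrADS labels slot i as start + i * increment, where start and
# increment are taken from the first two time values only.
i=0
for grid in $stored; do
  label=$(echo "Sun Mon Tue Wed Thu Fri Sat" | cut -d' ' -f$((i + 1)))
  echo "labeled $label -> data from $grid"
  i=$((i + 1))
done
# From the gap onward, every grid is labeled one day early:
# "labeled Wed -> data from Thu", through "labeled Fri -> data from Sat".
```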
--Jennifer
On Jul 10, 2013, at 5:18 PM, James T. Potemra wrote:
> Hi Jennifer:
>
> Your suggestion on using the descriptor file is the better way to go, and it reminded me of an
> issue with accessing aggregated files in GrADS. It seems like GrADS gets time information from
> the start and increment values. If this is true, one could potentially run into problems with missing
> files (times).
>
> As an example, let's say you have daily files for a week, but one day (Wednesday) is missing. If you make
> an aggregate file from the six valid days using ncrcat (or some other tool), then load this aggregated
> file into GrADS and plot a timeseries, I think you will actually get a six-day plot from Sunday through
> Friday, rather than a seven-day plot from Sunday through Saturday with a gap for Wednesday.
> Data that appear for "Wednesday" are actually from "Thursday", and so on.
>
> This behavior can be especially problematic when reading files via OPeNDAP, since you can't really be
> sure of the contents in the aggregated file unless you check the time values in the file.
>
> I could be wrong on this, so I didn't want to post to the listserv, but feel free to reply there...
>
> Thanks,
>
> Jim
>
> On 7/9/13 8:00 AM, Jennifer Adams wrote:
>> You can skip the ncrcat step by using a GrADS descriptor file to aggregate the files together. Put the following three lines in a text file called air.sig995.xtl in the same directory with the data files:
>>
>> DSET ^air.sig995.%y4.nc
>> OPTIONS template
>> TDEF time 24107 linear 1jan1948 1dy
>>
>>
>> Then open this data set with GrADS using the 'xdfopen' command:
>>
>> ga-> xdfopen air.sig995.xtl
>>
>> Now all the files appear as a single data set -- there is no need to concatenate them together. You can begin doing whatever data analysis you want with the data. If you want to create a subset of the data in a new file, have a look at the 'sdfwrite' command.
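>> If you want a concrete starting point for 'sdfwrite', a minimal session might look like this (the variable name 'air' and the output filename are assumptions -- check the actual variable names with 'q file'):
>>
>> ga-> xdfopen air.sig995.xtl
>> ga-> set time 1jan2000 31dec2000
>> ga-> define sub = air
>> ga-> set sdfwrite subset.nc
>> ga-> sdfwrite sub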
>>
>> --Jennifer
>>
>>
>>
>> On Jul 9, 2013, at 1:49 PM, James T. Potemra wrote:
>>
>>> Emily:
>>>
>>> The files in the URL below are served via ftp, so this is not really a GrADS issue. Instead, you can use "wget" to retrieve all the files, then "ncrcat" to concatenate them all together. For example,
>>>
>>> wget -r -A.nc ftp://ftp.cdc.noaa.gov/Datasets/ncep.reanalysis.dailyavgs/surface/
>>>
>>> will retrieve all the files on that page that end with ".nc". Next,
>>>
>>> ncrcat -h air.sig995.*.nc one_big_file.nc
>>>
>>> will cat all the individual files into one big file. "ncrcat" is part of the netCDF Operators (NCO) toolkit; more info at http://nco.sourceforge.net/
>>>
>>> Jim
>>>
>>> On 7/9/13 7:09 AM, Emily Wilson wrote:
>>>> Hello All,
>>>>
>>>> I want to write a GrADS script (in emacs) that downloads a bunch of NetCDF files from a website and combines them into one large file. The website is
>>>> http://www.esrl.noaa.gov/psd/cgi-bin/db_search/DBListFiles.pl?did=33&tid=38147&vid=668 and I want to download all of the NetCDF files on this page. What commands can I use for this task? If that is not possible, could a script instead read each file from the webpage one at a time, pull the specific data from it, write or save it to a master file, and then repeat for the next file, and so on?
>>>> Thanks for the help in advance,
>>>>
>>>> Emily P. Wilson, Intern
>>>> Research and Conservation Department
>>>> Denver Botanic Gardens
>>>> 1007 York St.
>>>> Denver, CO 80206
>>>> 720-865-3593
>>>>
>>>>
>>>> _______________________________________________
>>>> gradsusr mailing list
>>>> gradsusr at gradsusr.org
>>>> http://gradsusr.org/mailman/listinfo/gradsusr
>>>
>>
>> --
>> Jennifer M. Adams
>> IGES/COLA
>> 4041 Powder Mill Road, Suite 302
>> Calverton, MD 20705
>> jma at cola.iges.org
>>
>>
>>
>>
>>
>