Memory optimization (was Re: Performance tuning)

Arlindo da Silva dasilva at ALUM.MIT.EDU
Wed Sep 17 14:51:04 EDT 2008


On Wed, Sep 17, 2008 at 1:33 PM, Kevin M Levey <klevey at customweather.com> wrote:

> WED 17SEP08: 1030PDT
>
> Hi Arlindo
>
> Below is a snippet you contributed and I'd like to know in a little
> more detail what you mean by
>
> "Grib-2 is a good format for data transmission but is far less
> optimal for day to day use. For example, the performance of your
> meteograms scripts will go up by a lot with grib-1"
>
> Are you saying that GRADS processes GRIB1 data much better/faster than
> GRIB2 data?


Yes. Grib-1 compression is simpler.


> Also more specifically, does GRADS 2x take longer to
> process GRIB2 data than say GRADS 1.9RC1 takes to process GRIB1 using
> the same grads script?
>

Due to better compression algorithms, Grib-2 files are smaller (therefore
better for transmission and storage) but you have to pay a price to
decompress them.  Grib-2 is particularly problematic for processing single-point
time series such as those used in meteograms. Even if you only need one
point, Grib-2 typically requires that you read in the whole globe in order
to undo the compression. GrADS v2 does a lot of caching behind the
scenes to minimize this, at the expense of increased memory usage. The g2()
extension for GrADS v1.9 does not do any such caching.


>
> Right now I've run into a time issue using the GFS 0.5x0.5 GRIB2 data
> using your GRADS 1.9RC1 + GEX  - I thought it was merely the result of
> using higher res data that caused the time issue (i.e. taking very
> long to process all the necessary scripts).


Run a test to convince yourself. Get one of your Grib-2 datasets, run it
through lats4d to create a grib-1 version, and rerun your usual scripts. If
your script is I/O bound you should see a dramatic increase in speed. (If
you do that, post your numbers here for future reference.)
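
One way to collect those numbers is a small timing harness like the sketch
below. It only measures wall-clock time for whatever command you give it;
the script and control file names in the comments are hypothetical
placeholders, and the lats4d conversion step is not shown.

```python
import subprocess
import time

def time_command(cmd, repeats=3):
    """Run cmd several times and return the best wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best

def speedup(t_grib2, t_grib1):
    """Ratio of Grib-2 run time to Grib-1 run time (above 1 means Grib-1 is faster)."""
    return t_grib2 / t_grib1

# Example use (file names are placeholders for your own script and datasets):
#   t2 = time_command(["grads", "-blc", "run meteogram.gs gfs_grib2.ctl"])
#   t1 = time_command(["grads", "-blc", "run meteogram.gs gfs_grib1.ctl"])
#   print(f"grib-2: {t2:.1f}s  grib-1: {t1:.1f}s  speedup: {speedup(t2, t1):.1f}x")
```

Keep in mind that the operating system caches recently read files, so the
first run against a dataset can be slower than later ones; compare like
with like when reporting results.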


>
> Would it be more efficient to use GRIB1 data instead of GRIB2 data?


Grib-1 takes more disk space, so it is not more efficient in terms of
storage.


> As
> you said, the conversion factor is often the problem point here as
> it does take some time to convert the 0.5x0.5 GFS GRIB2 data to GRIB1 and
> it is not possible to do in our case as it takes too long to convert 61
> time steps between 0 and 180 hours, even with high end Linux servers.
>


When disk space is not a concern, the only situations where I have found this
grib-1 conversion to pay off are when you are doing a lot of single grid point
analysis, as for meteograms, or accessing the same datasets over and over
again.  If you only read your files once it does not matter; this extra
grib-2 to grib-1 conversion just adds complexity to your system.   On the
other hand, if you have a dynamic website where the plots are done on the
fly, or a GDS site for that matter, having your files in Grib-1 could
improve your response time by a lot. These days storage is getting cheaper
while individual processor speed has flattened out, so using more disk space
for grib-1 files seems a viable alternative for increasing performance.
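
The trade-off can be put as a simple break-even count: the one-time
conversion cost divided by the time saved per access. The sketch below
assumes you have measured those three times yourself; the function name is
hypothetical and the numbers in the comment are made up for illustration.

```python
import math

def break_even_reads(convert_seconds, grib2_read_seconds, grib1_read_seconds):
    """Number of accesses after which a one-time conversion has paid for itself.

    Returns None when Grib-1 access is not faster, i.e. conversion never pays off.
    """
    saving = grib2_read_seconds - grib1_read_seconds
    if saving <= 0:
        return None
    return math.ceil(convert_seconds / saving)

# Illustration with made-up timings: a 600 s conversion that saves
# 1.5 s per access needs break_even_reads(600, 2.0, 0.5) accesses to break even.
```

A dataset that is read once sits far below that count, while a meteogram
server that hits the same files for every request passes it quickly.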

   Does anybody out there have a different experience/opinion about this?

         Arlindo


 ps: BTW, internally gzipped HDF-4 files have a similar issue with single
point data.


--
Arlindo da Silva
dasilva at alum.mit.edu