[gradsusr] Performance Tips

Christopher Gilroy chris.gilroy at gmail.com
Mon Feb 15 14:13:21 EST 2016


Jennifer,

I'm trying to speed up processing (each hour is a separate script call,
using args) and I'm wondering if there's anything out of the ordinary with
the way we're doing our 10:1 snowfall:

define snowtot = const((sum(maskout(weasdsfc.1-weasdsfc.1(t-1),weasdsfc.1-weasdsfc.1(t-1)),t=2,t+0)*0.0393701)*10, 0, -u)

d snowtot
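For reference, the unit factors in that expression work out like this (a minimal Python sketch, not GrADS source; the example increment values are made up):

```python
# Sketch of the unit conversion in the snowtot expression above.
# weasd increments are in kg/m^2, numerically equal to mm of liquid
# water; 0.0393701 converts mm to inches, and the factor of 10 applies
# the fixed 10:1 snow-to-liquid ratio.
MM_TO_IN = 0.0393701
SNOW_RATIO = 10.0

def snow_inches(weasd_increments_mm):
    # maskout discards negative increments (where the mask is < 0), and
    # const(..., 0, -u) turns the masked points into zeros, so negative
    # increments contribute nothing to the sum
    liquid_mm = sum(dw for dw in weasd_increments_mm if dw >= 0)
    return liquid_mm * MM_TO_IN * SNOW_RATIO

# 5.08 mm of new liquid equivalent -> about 2 inches of snow at 10:1
print(snow_inches([2.54, -0.1, 2.54]))
```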

The main question is this: I assume the calculation should take longer the
further out the data goes, so is there any way to reduce that time? The
calculation was presumably what made the processing time grow longer and
longer, since it does a maskout, sum and const over less data at the
beginning and more and more data the further out it goes; that makes sense.
I thought I could run an fwrite script that exports the "pre-calculated"
snowtot variable, and then when plotting the already-calculated snowtot I
could essentially do something like:

d snowtot(t=2)+snowtot(t=3)+snowtot(t=4)

etc., but the processing time seemed to be exactly the same, instead of
the simple addition being (as I expected) extremely quick.


Basically, our approach looks like the following, and I'd have no problem
changing it to something like define snowtot2, define snowtot3, etc.:

//download t 2 data
set t 2
define snowtot = const((sum(maskout(weasdsfc.1-weasdsfc.1(t-1),weasdsfc.1-weasdsfc.1(t-1)),t=2,t+0)*0.0393701)*10, 0, -u)
d snowtot *draws timestep 2 snow
quit

//download t 3 data
set t 3
define snowtot = const((sum(maskout(weasdsfc.1-weasdsfc.1(t-1),weasdsfc.1-weasdsfc.1(t-1)),t=2,t+0)*0.0393701)*10, 0, -u)
d snowtot *draws timestep 2 and 3 snowtot
quit

//download t 4 data
set t 4
define snowtot = const((sum(maskout(weasdsfc.1-weasdsfc.1(t-1),weasdsfc.1-weasdsfc.1(t-1)),t=2,t+0)*0.0393701)*10, 0, -u)
d snowtot *draws timestep 2, 3 and 4 snowtot
quit


So the thought process was: since we already calculate snowtot at t=2, we
could reuse that result for the subsequent run. Then at t=4 we've already
calculated t=2 and t=3, so we could just use those. I thought fwrite'ing
each hour's "already calculated" data would have drastically sped up each
future hour, but it seemed to take the same amount of time as doing it this
way.
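To make the incremental idea concrete, here is one way it might be scripted. This is only a hedged sketch: the descriptor names (model.ctl, snowtot_prev.ctl), the prevtot variable, and the hr argument are all hypothetical, and the fwrite output would still need a matching ctl written so the next hour's run can open it:

```
* Hypothetical sketch: add only the current hour's increment to a
* previously fwrite'd running total, instead of re-summing from t=2.
'open model.ctl'
'open snowtot_prev.ctl'
'set t 'hr
'define inc = const(maskout(weasdsfc.1-weasdsfc.1(t-1),weasdsfc.1-weasdsfc.1(t-1)), 0, -u)'
'define snowtot = prevtot.2 + (inc*0.0393701)*10'
'set gxout fwrite'
'set fwrite snowtot_'hr'.dat'
'd snowtot'
'disable fwrite'
```

The point is that prior hours then cost one grid read from the previous total's file rather than a re-sum; if fwrite didn't speed things up before, it may be because the displayed expression still contained the full sum over t.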

Sorry if that makes no sense (I'll try to explain better if so), or is way
overboard, but fundamentally we process the data in real time as the files
are released from the models.

-Chris



On Mon, Feb 15, 2016 at 9:13 AM, Jennifer M Adams <jadams21 at gmu.edu> wrote:

>
> On Feb 14, 2016, at 12:34 AM, Christopher Gilroy <chris.gilroy at gmail.com>
> wrote:
>
> I figured this would be the perfect post to ask these questions in:
>
> 1.) "Defining" a variable actually 'costs' nothing in terms of
> performance, correct? It only actually gets calculated (the cost of
> processing) when you display it, right?
>
> The define command copies the result of your expression into memory, so
> that the I/O and any calculations are done only that one time (when you
> invoke ‘define’). In a sense, all of the ‘cost’ is paid with define, and
> subsequent displays get the grid from memory, so they are effectively
> ‘free’. You can draw your defined grid twice (say, once for shaded contours
> and again for a line-contour highlight), without using any extra processing
> time. If you do not define your expression, but just ‘display’ it, then
> after the drawing is finished, the grid is released. If you ‘display’ the
> expression again, the I/O and calculations are repeated.
>
> Many use ‘define’ and then ‘display' as a matter of habit, even if the
> grid is only drawn once and it would be just as fast to just use ‘display’
> once. This practice does no harm, but it can lead to an unnecessarily
> bloated memory footprint of your GrADS session.
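To make that concrete with the thread's own variable names, a short GrADS sketch (the undefine at the end frees the memory once the grid is no longer needed):

```
* define pays the I/O and calculation cost once; both displays
* below reuse the in-memory grid at no extra cost.
'define hgt850tmp = tmpprs.1(lev=850)-273.15'
'set gxout shaded'
'd hgt850tmp'
'set gxout contour'
'd hgt850tmp'
* release the memory when the grid is no longer needed:
'undefine hgt850tmp'
```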
>
>
> 2.) Is there any way to process the display once, but use its output
> multiple times without incurring additional processing time? A simple
> example would be: display t step 2, but also fwrite the processed data to
> a file, so when you process t step 3 you can just open the already
> processed data and simply add t step 3's data to it? Think of a total
> accumulation plot: the further out it gets, the longer it takes to
> process, presumably because it's recalculating all previous hours' data
> (which has technically already been processed) before it even calculates
> the current hour's.
>
> This sounds like a job for tloop().
>
>
> 3.) All of the pdef talk in this, without defining a smaller pdef is the
> actual entire file read as soon as you open the corresponding ctl file?
> From the way the talks have been in this, it seems so but I figured I'd
> ask.
>
> The I/O for pdef’d data is done one grid box at a time, for each point in
> the destination grid (which is defined with your xdef and ydef entries).
> Assuming you are doing bilinear interpolation, for each grid box in the
> destination grid, four data points are read from the native grid, then the
> weighted average of those four points goes into one element of the array
> containing the interpolated grid. If the native grid is cached, then this
> 4:1 I/O factor is not so noticeable. GrADS does internal caching for grib,
> grib2, and netcdf4 data types. The operating system does some caching for
> binary data. The one that is really slow is netcdf3; it is very slow to
> read a high-res netcdf3 file that needs PDEF. It is on my list to add
> internal caching in the GrADS I/O layer for *all* data types, but I
> haven’t gotten that code written yet.
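As an illustration only (plain Python, not GrADS source), the per-gridbox bilinear step described above looks roughly like this; a 'pdef bilin' file effectively stores the precomputed fractional i,j positions so that this setup work is not redone on every open:

```python
# Illustrative sketch of bilinear interpolation: one destination grid
# box pulls four surrounding points from the native grid and takes
# their distance-weighted average.
def bilinear(native, x, y):
    """Interpolate native[j][i] at non-integer position (x, y)."""
    i0, j0 = int(x), int(y)
    dx, dy = x - i0, y - j0
    v00 = native[j0][i0]          # the four native points around (x, y)
    v10 = native[j0][i0 + 1]
    v01 = native[j0 + 1][i0]
    v11 = native[j0 + 1][i0 + 1]
    # weighted average of the four surrounding native points
    return (v00 * (1 - dx) * (1 - dy) + v10 * dx * (1 - dy)
            + v01 * (1 - dx) * dy + v11 * dx * dy)

grid = [[0.0, 1.0], [2.0, 3.0]]
print(bilinear(grid, 0.5, 0.5))  # midpoint of the four corners: 1.5
```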
>
> Perhaps I'm confused on what setting an actual lat/lon 'does'; I assume
> it calculates the display output based on those dimensions,
>
> and from the way I'm reading this it seems like the lat/lon is simply
> "controlling" grads in terms of the actual bounding dimension to display
> the data?
>
> That’s right.
>
>
> For example, we do this:
>
> 'set lat 10 68'
> 'set lon -138 -55'
>
> 'define radar1kagl = refd1000m.1'
> 'define hgt850tmp = tmpprs.1(lev=850)-273.15’
>
>
> At this point your variable hgt850tmp is copied in memory, with the
> dimension limits you specified.
>
>
> 'set lat 20 58'
> 'set lon -128 -65'
>
> 'd hgt850tmp’
>
> Now you changed the dimension limits and display the defined variable — no
> new I/O is performed from the original data file. If your new limits are
> outside the domain where the variable is defined, you will get missing
> data. In this case, your display will be complete because your new
> dimensions are within the domain of the defined variable.
> —Jennifer
>
>
> As you guys are probably aware, if we were to make the lat/lon before the
> define half the size the second lat/lon would still be the bounding box but
> the data would only be calculated from the first set, which is also why I'm
> questioning #3.
>
>
>
>
> On Fri, Feb 12, 2016 at 1:38 PM, Wesley Ebisuzaki - NOAA Federal <
> wesley.ebisuzaki at noaa.gov> wrote:
>
>> Travis,
>>
>>     I think this is how Jennifer wants it to work.
>>
>> You have a 1 km grid and a ctl for the 1 km grid.  This works great for
>> my town but is super slow for the CONUS, where your screen can't resolve
>> 1 km.
>>
>> You make a ctl that uses the 1km file but uses xdef/ydef for a 10 km
>> grid.
>> This will require a pdef line.  This control file will be good for state
>> maps.
>>
>> You can also make a ctl that uses the 1km file but uses xdef/ydef for a
>> 100 km grid.
>> This will also require a pdef line.  This ctl file will be good for CONUS
>> maps.
>>
>> How you make the pdef line/file is the subject of another email.
>>
>> Wesley
>>
>>
>>
>>
>> On Fri, Feb 12, 2016 at 12:55 PM, Travis Wilson - NOAA Federal <
>> travis.wilson at noaa.gov> wrote:
>>
>>> Hi All,
>>>
>>>
>>> These are great tips.  Changing xdef and ydef is a great option but if I
>>> understand correctly, it won’t work with Jennifer’s new code since
>>> pdefwrite would have to be redone.  Making a dummy grid on the fly and
>>> using lterp looks to be the next best option for us.
>>>
>>>
>>> The most surprising thing I found from my python test is that plotting
>>> performance doesn’t really degrade as you view a larger area of a high
>>> resolution grib file.
>>>
>>>
>>> HRRR example (attached in original email)
>>>
>>> Grads =  0.86s (California view)  --> 6.36 (Conus View)
>>>
>>> Python = 4.7 (California view) --> 6.8 (Conus View)
>>>
>>>
>>> It may be beneficial if grads had a grid-to-xwindow ratio and/or a
>>> grid-to-image ratio setting to acknowledge the fact that we don’t want to
>>> keep writing over the same pixel for high resolution grids (basically
>>> regrid or start skipping grib points for the plot/xwindow when things
>>> become redundant).  GrADS could possibly allow users to turn this option
>>> on/off and set their desired ratio.  This would make grads very snappy with
>>> possibly little to no image/xwindow quality loss.  A good example is when
>>> someone is doing an analysis with a HRRR grib file and grads performance
>>> would change very little whether someone is looking at the entire conus or
>>> just a small region.   Right now, grads performance changes by a factor of
>>> 7 in the examples I sent in the PDF.  Python’s performance changes by only
>>> a factor of 1.5, so I suspect it is doing some regridding or selective grib
>>> point plotting on the fly to keep things speedy.  Anyways, it is just a
>>> thought and may be beneficial as we head towards higher resolution
>>> datasets.  Thank you all for your help, I really appreciate it.
>>>
>>>
>>> Travis
>>>
>>> On Thu, Feb 11, 2016 at 11:12 AM, Jennifer M Adams <jadams21 at gmu.edu>
>>> wrote:
>>>
>>>> Dear Travis, Wesley, et al.,
>>>>
>>>> I have done some testing with the high-res fnexrad 1km radar data,
>>>> comparing the use of ‘pdef lccr’ (where the interpolation weights are
>>>> calculated internally) and ‘pdef bilin’ (where interpolation weights
>>>> are provided by the user in an external file). Reading the weights from
>>>> a file was significantly faster — something like 30x faster!
>>>>
>>>> The tricky part of taking advantage of this performance gain is
>>>> creating the pdef file itself, which depends on you being able to calculate
>>>> non-integer i,j values in the native grid that correspond to each grid
>>>> point in the destination grid, which is defined by what you put in your
>>>> XDEF and YDEF statements. This is not necessarily simple.
>>>>
>>>> The good news is that GrADS does this calculation for you every time
>>>> you open a descriptor with a pdef statement that doesn’t point to an
>>>> external file — lcc, lccr, nps, sps, etc. I am going to implement a command
>>>> ‘pdefwrite’ that will write out the interpolation weights calculated
>>>> internally for these types of PDEF entries so that the file can be used
>>>> with ‘pdef bilin’ instead. The protocol will be something like this:
>>>> 1. Create a descriptor that has a pdef statement like this:
>>>> pdef 4736 3000 lccr 23.0 -120 1 1 40.0 40.0 -100 1016.2360 1016.150
>>>> 2. Open it with grads
>>>> 3. Invoke pdefwrite with a file name as an argument
>>>> 4. Rewrite your descriptor to use this pdef statement instead:
>>>> pdef 4736 3000 bilin stream binary *your-filename-here*
>>>> 5. Don’t change the XDEF and YDEF statements — those match the pdef
>>>> file you created in step 3.
>>>> 6. Open the new descriptor with GrADS and start working right away.
>>>>
>>>> Additional comments on Travis’s email:
>>>>
>>>> Shade1 may be faster than shade2 in some cases, but it won’t look right
>>>> with transparent colors because the polygons in the old algorithm overlap.
>>>> By the way, in the newer versions of GrADS, ‘gxout shaded’ is an alias for
>>>> shade2, so if you want to use shade1 you have to say so explicitly.
>>>>
>>>> For regridding, the new code in lterp() does just about everything re()
>>>> does only it is faster and more accurate. It is true that the destination
>>>> grid definition requires an open file, but I use something like this all
>>>> the time:
>>>> dset ^foo.bin
>>>> options template
>>>> undef -9.99e8
>>>> xdef 90 linear 2 4
>>>> ydef 45 linear -88 4
>>>> tdef 1 linear 01Jan0001 1dy
>>>> zdef 1 linear 1 1
>>>> vars 1
>>>> foo 0 99 foo
>>>> endvars
>>>>
>>>> You can even create that dummy descriptor on the fly, depending on what
>>>> destination grid you need at the time. Also, if you are using pdef, it is a
>>>> waste of resources to use lterp(), just put your desired destination grid
>>>> in the XDEF and YDEF statements.
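A hypothetical usage of that dummy descriptor with lterp() might look like the following (the file and variable names are made up; lterp(a, b) interpolates the first argument onto the grid of the second):

```
* regrid a hi-res variable onto the dummy low-res grid from foo.ctl
'open hires.ctl'
'open foo.ctl'
'define lowres = lterp(refd1000m.1, foo.2)'
'd lowres'
```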
>>>>
>>>> High res data sets take longer to render because they have more data to
>>>> grind through to calculate where to draw the contours. But if your data is
>>>> high res, don’t you want to see that reflected in your plot?
>>>>
>>>> I like ‘gxout grfill’ to really see the finer details in the data.
>>>> Contours over highly variable data (e.g. temperature in the Rocky
>>>> Mountains) can look really noisy but grfill lets you see that variability
>>>> without all the annoying squiggly contour lines.
>>>>
>>>> Regarding the resolution of the image output — there is no point to
>>>> write out really high res data to a small image file; you just end up
>>>> drawing over the same pixel multiple times. If image file dimensions are
>>>> your limiting factor, then it might make sense to downgrade the resolution
>>>> of your grid. I don’t think the optimal ratio between grid size and image
>>>> size is 1:1, however. There’s probably a sweet spot somewhere where you can
>>>> still see all the details in your data and the image size is lean enough. I
>>>> think 800x600 is pretty small, and it is also not quite the same aspect
>>>> ratio as 11x8.5 so your image will be a bit distorted from what you see in
>>>> the display window.
>>>>
>>>> Don’t forget about the utility ‘pngquant’ for making the image output
>>>> files (from v2.1+) less bulky so you can store more of them and they will
>>>> load faster in a browser.
>>>>
>>>> —Jennifer
>>>>
>>>>
>>>> On Feb 11, 2016, at 10:40 AM, Wesley Ebisuzaki - NOAA Federal <
>>>> wesley.ebisuzaki at noaa.gov> wrote:
>>>>
>>>> Travis,
>>>>
>>>>    I haven't tried this but it may work.
>>>>
>>>>    Instead of regridding your hi-res lat-lon data, make a new control
>>>> file
>>>> which has a PDEF .. BILIN.  This PDEF would map low_res(i,j) ->
>>>> hi_res(n*i, n*j)
>>>>
>>>>      low_res() : the low-res x-y grid which is defined in the low-res
>>>> ctl file.
>>>>      hi_res(): the hi-res grib file grid
>>>>
>>>> I don't remember if grids start at grid(0,0) or grid(1,1).  If grids
>>>> start at (1,1) then
>>>> the above formula would have to be changed.
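A small sketch of that index mapping (illustrative Python, with the 0-based vs 1-based caveat Wesley mentions; the decimation factor n=10 below is just an example):

```python
# Sketch of the subsampling idea: each low-res point maps straight to a
# hi-res point, so the "interpolation" is a pure pick, with no averaging.
def lowres_to_hires_index(i, j, n, one_based=False):
    """Map low_res(i, j) onto hi_res(n*i, n*j) for decimation factor n.
    If grids start at (1, 1), shift so that point (1, 1) maps to (1, 1)."""
    if one_based:
        return (n * (i - 1) + 1, n * (j - 1) + 1)
    return (n * i, n * j)

print(lowres_to_hires_index(2, 3, 10))                  # 0-based: (20, 30)
print(lowres_to_hires_index(2, 3, 10, one_based=True))  # 1-based: (11, 21)
```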
>>>>
>>>> Wesley
>>>>
>>>>
>>>>
>>>> On Tue, Feb 9, 2016 at 5:37 PM, Travis Wilson - NOAA Federal <
>>>> travis.wilson at noaa.gov> wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>>
>>>>> Attached is a very short ppt on grads performance vs python using grib
>>>>> files.  In most cases, grads blows python away.  Times are relative to our
>>>>> machine and consider everything from starting grads/opening the file, to
>>>>> closing the file.
>>>>>
>>>>>
>>>>> - In particular we have found that shaded1 is much faster.  Up to 40%
>>>>> faster on our machines.
>>>>>
>>>>> - Wesley Ebisuzaki recommended converting the grib files to a lat/lon
>>>>> grid to eliminate the PDEF entry to significantly speed up the opening time
>>>>> of high resolution grib files.
>>>>> http://gradsusr.org/pipermail/gradsusr/2016-January/039339.html
>>>>>
>>>>> - Again noted by Wesley, grib packing can have an impact on
>>>>> performance
>>>>> http://gradsusr.org/pipermail/gradsusr/2010-May/027683.html
>>>>>
>>>>>
>>>>> One thing we show in the ppt is that the wider the view gets (i.e.,
>>>>> the more points are plotted), the slower grads is relative to
>>>>> python.  At some point, python will become faster.  Anyways, to battle
>>>>> with this, regridding (using the re() function) the data within grads
>>>>> significantly speeds up the plotting time (see last slide) when you have a
>>>>> lot of points.  As far as I know, you can’t use re() in grads 2.1a3.  You
>>>>> do have lterp() but a grid is needed.  Is there anything that will allow me
>>>>> to lterp to my image dimensions?  Say my image dimensions are x800 y600
>>>>> then lterp would interpolate my high resolution grib file to x800 y600 (or
>>>>> some multiple of) when a view exceeds 800 points across.  This will
>>>>> significantly speed up the plotting time when viewing a wide view of a high
>>>>> resolution grib file while not degrading the image quality by much (again,
>>>>> see last slide).
>>>>>
>>>>>
>>>>> Also, if anyone has other performance tips on plotting high resolution
>>>>> grib files we would love to hear them.
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Travis
>>>>>
>>>>> _______________________________________________
>>>>> gradsusr mailing list
>>>>> gradsusr at gradsusr.org
>>>>> http://gradsusr.org/mailman/listinfo/gradsusr
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Jennifer Miletta Adams
>>>> Center for Ocean-Land-Atmosphere Studies (COLA)
>>>> George Mason University
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>>
>
>
> --
> -Chris A. Gilroy
>
>
> --
> Jennifer Miletta Adams
> Center for Ocean-Land-Atmosphere Studies (COLA)
> George Mason University
>
>
>
>
>
>


-- 
-Chris A. Gilroy