[gradsusr] flushing

Muhammad Yunus Ahmad Mazuki ukm.yunus at gmail.com
Mon Jan 21 21:38:31 EST 2013


Jennifer and Jeff,

I'll share some of my experience, though I'm not sure whether it will
help you. When you read from many files and write many files at the
same time, the speed of your hard disk limits how fast your loops can
run. One solution I used was to copy my files into RAM: on Ubuntu, I
copied my data and descriptor files (nc, grib, idx, ctl) and my gs
scripts into /dev/shm, which is backed by the computer's physical
RAM. I also set the scripts to write their output inside the same RAM
folder. This reduced the run time of my loops from 12 minutes to a
mere 4 minutes.
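
The copy step described above can be sketched in Python. This is only
a minimal illustration, not the exact procedure I used: the function
name stage_to_ram, the folder name grads_work and the extension list
are assumptions made for the example, so adjust them to your own files.

```python
import shutil
from pathlib import Path

# Extensions taken from the file types named above; extend as needed.
EXTENSIONS = {".nc", ".grib", ".idx", ".ctl", ".gs"}


def stage_to_ram(src_dir, ram_root="/dev/shm", name="grads_work"):
    """Copy data, descriptor and script files into a RAM-backed folder.

    Returns the destination folder and the list of copied files.
    """
    src = Path(src_dir)
    dest = Path(ram_root) / name
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(src.iterdir()):
        # Skip sub-folders and anything that is not a GrADS-related file.
        if path.is_file() and path.suffix.lower() in EXTENSIONS:
            shutil.copy2(path, dest / path.name)
            copied.append(dest / path.name)
    return dest, copied
```

Calling stage_to_ram("/path/to/my/data") would place the files under
/dev/shm/grads_work. Anything stored in /dev/shm uses up RAM and is
lost at reboot, so the output has to be copied back to disk once the
run is finished.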

On that note, it seems that the GrADS version I'm using, 2.0.a9, does
not make use of multiple processor cores; it uses only one. When I ran
4 GrADS instances that all read from and wrote to a single HDD, the
run times simply added up to 12x4=48 minutes. But when I ran 4 GrADS
instances reading and writing inside RAM, they ran in parallel and
finished in a total of only 4 minutes instead of 48. So even if you
use a really good HPC machine, with only a single old spinning HDD
your processing speed will be limited by the speed of that HDD. You
may want to invest in a RAID array, as striping across disks can
roughly double read/write speed or better, and this will reduce the
time it takes to complete your loops.
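
Running several instances side by side could look roughly like the
Python sketch below. It is an illustration under assumptions, not the
way I actually launched my runs: the helper names and the script names
part1.gs to part4.gs are made up, and each script is assumed to end
with 'quit' so that GrADS exits when it is done.

```python
import subprocess


def grads_command(script):
    # -b: batch mode, -l: landscape, -c: command to run at start-up.
    # The script is assumed to end with 'quit'.
    return ["grads", "-blc", f"run {script}"]


def run_parallel(commands, cwd=None):
    """Start every command at once, wait for all, return the exit codes."""
    procs = [subprocess.Popen(cmd, cwd=cwd) for cmd in commands]
    return [p.wait() for p in procs]


# Hypothetical usage: four scripts, each handling a quarter of the work,
# all reading and writing inside the RAM folder.
# codes = run_parallel(
#     [grads_command(f"part{i}.gs") for i in range(1, 5)],
#     cwd="/dev/shm/grads_work",
# )
```

Each instance is a separate process, so the operating system can put
them on different cores even though a single GrADS run uses only one.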

As for your observation that the loops take longer towards the end, I
believe it's just a bottleneck problem: the data waiting to be written
has piled up so much that your HDD is running at full speed.
Previously, even though I used a quad-core computer, my processing
speed was limited by my HDD speed. When I looked at my processor
usage, it went up and down. Now that I read from and write to RAM
directly, the processors stay at 100% usage until the end of the
process, all 4 of them in fact, from running 4 instances of GrADS.
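
To put numbers on the slow-down, the '!date' stamps from the loop
(quoted below) can be turned into per-iteration durations. Here is a
minimal Python sketch; the function names are made up for the example,
and it assumes that all stamps are in the same time zone and that the
"> " quoting has been stripped from each line first.

```python
from datetime import datetime


def parse_stamp(line):
    # `date` output looks like "Mon Jan 21 13:55:23 CST 2013".
    # The time-zone field is dropped (all stamps assumed to share one zone).
    _, month, day, clock, _, year = line.split()
    return datetime.strptime(f"{month} {day} {clock} {year}",
                             "%b %d %H:%M:%S %Y")


def iteration_seconds(lines):
    """Seconds elapsed between each pair of consecutive time stamps."""
    times = [parse_stamp(line) for line in lines]
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


# First three stamps from the quoted output.
stamps = [
    "Mon Jan 21 13:55:23 CST 2013",
    "Mon Jan 21 13:55:43 CST 2013",
    "Mon Jan 21 13:56:08 CST 2013",
]
print(iteration_seconds(stamps))  # prints [20, 25]
```

Feeding in all twenty stamps shows the step time growing from 20
seconds at the start to 62 seconds at the end.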

Is there a future plan for GrADS to support multicore usage?

Yunus

On Tue, Jan 22, 2013 at 4:14 AM, Jeff Duda <jeffduda319 at gmail.com> wrote:
> Jennifer,
> I have attached a somewhat simplified version of the script I'm running.
> The significant part of it is the loop (while a <= ensemble_size) ...
> I put '!date' statements in to see how long each iteration of the loop
> takes.  Here is representative output from that loop:
>
> Mon Jan 21 13:55:23 CST 2013
> Mon Jan 21 13:55:43 CST 2013
> Mon Jan 21 13:56:08 CST 2013
> Mon Jan 21 13:56:33 CST 2013
> Mon Jan 21 13:57:01 CST 2013
> Mon Jan 21 13:57:34 CST 2013
> Mon Jan 21 13:58:05 CST 2013
> Mon Jan 21 13:58:38 CST 2013
> Mon Jan 21 13:59:13 CST 2013
> Mon Jan 21 13:59:48 CST 2013
> Mon Jan 21 14:00:29 CST 2013
> Mon Jan 21 14:01:12 CST 2013
> Mon Jan 21 14:01:56 CST 2013
> Mon Jan 21 14:02:39 CST 2013
> Mon Jan 21 14:03:26 CST 2013
> Mon Jan 21 14:04:14 CST 2013
> Mon Jan 21 14:05:07 CST 2013
> Mon Jan 21 14:06:02 CST 2013
> Mon Jan 21 14:07:01 CST 2013
> Mon Jan 21 14:08:03 CST 2013
>
> Note that the first few iterations take 20-25 seconds, but that increases to
> about 60 seconds by the last few iterations of the loop.  What I want to
> know is 1) why is it taking longer despite the same code running, and 2) can
> I reduce that time?
>
> Jeff
>
>
> On Thu, Jan 17, 2013 at 4:50 PM, Jennifer Adams <jma at cola.iges.org> wrote:
>>
>> Hi, Jeff --
>> GRIB2 records are cached in case you want to reread data from the same
>> grid more than once -- having grib records in the cache saves you the time
>> of having to re-do the I/O and the uncompression. However, the cache can get
>> big, and if you are pushing the memory limits of your machine, you can
>> clear the cache with the 'flush' command if you are sure you won't need any
>> of the previously-read grids again. It is part of the 'reinit' command, and
>> I just put it in there as a separate command in case it became necessary,
>> but I don't think it will fix your slow-down problem. If you can provide a
>> script that is as simple as possible but still illustrates the slow down, I
>> will take a look.
>> --Jennifer
>>
>>
>>
>> On Jan 17, 2013, at 5:16 PM, Jeff Duda wrote:
>>
>> What is the purpose of the command flush?  I see from the documentation
>> that it clears the GRIB2 cache.  I think I understand what that basically
>> means, but I wonder if it has implications in something I'm doing.
>>
>> I'm running a series of complicated GrADS scripts that read GRIB2 data and
>> make a lot of plots.  For some of my plots I am using the set defval
>> command.  One thing I've noticed while using this command is that the more
>> times I run it in a loop of a GrADS script, the longer it seems to take to
>> run with each iteration.  I know that sounds non-specific, so I can provide
>> script code if anyone wants it, but I'm just feeling around here to see if
>> the flush command may enable my scripts to run faster.
>>
>> Jeff Duda
>>
>> --
>> Jeff Duda
>> Graduate research assistant
>> University of Oklahoma School of Meteorology
>> Center for Analysis and Prediction of Storms
>> _______________________________________________
>> gradsusr mailing list
>> gradsusr at gradsusr.org
>> http://gradsusr.org/mailman/listinfo/gradsusr
>>
>>
>> --
>> Jennifer M. Adams
>> IGES/COLA
>> 4041 Powder Mill Road, Suite 302
>> Calverton, MD 20705
>> jma at cola.iges.org
>>
>>
>>
>>
>>
>
>
>
> --
> Jeff Duda
> Graduate research assistant
> University of Oklahoma School of Meteorology
> Center for Analysis and Prediction of Storms
>
>


