Decrease LCF Fitting Computational Time in Athena
Hi,

I have been given access to a Virtual Machine (VM) with 8 GB of RAM and 4 CPUs in the hope of decreasing the time required to do LCF fittings. I have a large set of environmental samples from different waste streams and/or contaminant sources. Currently I have 14 standards and I want to run an LCF model fitting up to 4 standards. This results in 1456 fits if I select "fit all combinations". This takes about 100 minutes to run 1 sample (I have 60+ and more coming).

It appears that Athena is constrained to using at most 25% of the available CPU, and the physical memory is limited to 22%.

1) Is there a way I could change Athena's settings to take advantage of this VM's multiple CPUs and larger RAM?

2) Or, if Athena can't handle multiple CPUs, is there a way to increase the maximum percentage of the CPU being used? It appears that Athena is limited to accessing up to 25% of the CPUs. One CPU of the four appears to be processing a little more data than the other 3 CPUs (when viewed on the Task Manager, Performance tab). When more CPUs were added to the VM, computational time remained similar... if not worse.

3) Similarly, the physical memory is limited to 22% of available memory (when viewed on the Task Manager, Performance tab) when Athena is running the LCF models. Could Athena's settings be changed to increase the % of RAM available for LCF fitting?

4) Any other suggestions?

I spoke with my IT guy, who said he believes the software is limiting my hopes of decreasing the computational time. I'd love to call it all contaminated dirt, rent some big equipment like D-9 dozers, big trucks, a couple of barges and dump it into the ocean... but we don't have that much $$$ for remediation and my kids eat fish. That is a joke, of course... my kids don't eat fish :-)

btw, thanks for your time and for use of the software!

Respectfully,
Bradley W. Miller, Ph.D.
Post Doctoral Fellow, Oak Ridge Institute for Science and Education
U.S. Environmental Protection Agency
National Risk Management Research Laboratory
Land Remediation and Pollution Control Division
5995 Center Hill Avenue, Cincinnati, OH 45224-1702
www.tinyurl.com/bwmiller

The great tragedy of Science: the slaying of a beautiful hypothesis by an ugly fact. -- Thomas H. Huxley

Views or opinions expressed in this email are solely those of the sender and do not represent those of the EPA or any other agency.
Hi Bradley,

Athena cannot use multiple CPUs. No threads, no message passing. Sorry.

Why Athena uses only 25% of the CPU in your VM is a mystery to me. That is not a restriction I put in the code and it is not something that normally happens with programs written using the tools Athena uses. My suspicion is that it is a VM configuration issue.

As for the memory issue, the dominant use of memory by a long stretch in Athena is the memory that is statically allocated for the Ifeffit library. While it is possible to compile Ifeffit to use more memory, that isn't likely to have an impact on how fast your LCF runs.

As for the underlying problem -- the fact that it takes an hour and 40 minutes to run a large combinatorial set -- the best advice I can offer is to use some prior knowledge to restrict the scope of the problem. If you know that a standard is present in a sample, you can restrict the combinatorial set to always include that one. Similarly, if you can reduce the size of your set of standards by excluding ones that you know to be unlikely or absent, that too will help. Combinatorial fitting is a blunt instrument and you seem to be using it in the most blunt way.

Sorry I didn't have happier news for you.

B

PS: One thing that occurs to me is that you could run two or three instances of Athena. Each one will use its own CPU and you can be a human load leveler.

If you are up for a bit of programming, I can suggest another solution which uses my new Demeter code to do the LCF. On the plus side, it would be much easier to automate your large amount of work. On the minus side, you would have to do some programming.

On Tuesday, March 20, 2012 04:22:24 pm BradleyW Miller wrote:
Hi,
I have been given access to a Virtual Machine (VM) with 8 GB of RAM and 4 CPUs in the hope of decreasing the time required to do LCF fittings. I have a large set of environmental samples from different waste streams and/or contaminant sources. Currently I have 14 standards and I want to run an LCF model fitting up to 4 standards. This results in 1456 fits if I select "fit all combinations". This takes about 100 minutes to run 1 sample (I have 60+ and more coming).
--
Bruce Ravel ------------------------------------ bravel@bnl.gov

National Institute of Standards and Technology
Synchrotron Methods Group at NSLS --- Beamlines U7A, X24A, X23A2
Building 535A
Upton NY, 11973

My homepage: http://xafs.org/BruceRavel
EXAFS software: http://cars9.uchicago.edu/ifeffit/Demeter
Hi Bradley, Bruce,
On Wed, Mar 21, 2012 at 8:44 AM, Bruce Ravel wrote:
Hi Bradley,
Athena cannot use multiple CPUs. No threads, no message passing. Sorry.
Why Athena uses only 25% of the CPU in your VM is a mystery to me. That is not a restriction I put in the code and it is not something that normally happens with programs written using the tools Athena uses. My suspicion is that it is a VM configuration issue.
I think Bradley said 'using at most 25% of the available CPU', which probably means 'at most 1 CPU core at a time', which is correct. The OS may be able to switch which core is used for Athena/Ifeffit, but only one can be used at any time.
As for the memory issue, the dominant use of memory by a long stretch in Athena is the memory that is statically allocated for the Ifeffit library. While it is possible to compile Ifeffit to use more memory, that isn't likely to have an impact on how fast your LCF runs.
I doubt memory (22% of 8 GB) is an issue here. Ifeffit is 32-bit, and will never use more than ~2 GB on Windows, though the GUI could add to that, and might be able to go above 2 GB. Anyway, memory is not being swapped from RAM to disk. Low-level cache memory will definitely be swapped to RAM, and that, rather than the number of cores as such, is likely what actually limits run time. That's all to say that the mathematical problem probably could be solved much better by making use of multiple cores. But Athena/Ifeffit won't be doing that anytime soon.
As for the underlying problem -- the fact that it takes an hour and 40 minutes to run a large combinatorial set -- the best advice I can offer is to use some prior knowledge to restrict the scope of the problem. If you know that a standard is present in a sample, you can restrict the combinatorial set to always include that one. Similarly, if you can reduce the size of your set of standards by excluding ones that you know to be unlikely or absent, that too will help. Combinatorial fitting is a blunt instrument and you seem to be using it in the most blunt way.
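To put rough numbers on the advice about restricting the combinatorial set, here is a minimal counting sketch. It assumes that "fit all combinations" tries every subset of 2, 3, or 4 standards, which is an inference from the fact that this reproduces the 1456 fits quoted in the original post. The function name `n_fits` and the reduced scenarios (one forced standard, ten standards) are hypothetical illustrations. The time estimate assumes that run time scales linearly with the number of fits.

```python
from math import comb

def n_fits(n_standards, max_size, min_size=2, n_forced=0):
    """Count combinatorial fits when n_forced standards appear in every fit."""
    free = n_standards - n_forced
    total = 0
    for size in range(max(min_size, n_forced), max_size + 1):
        # Pick the remaining (size - n_forced) members from the free standards.
        total += comb(free, size - n_forced)
    return total

full = n_fits(14, 4)                    # 14 standards, subsets of size 2..4
forced = n_fits(14, 4, n_forced=1)      # one standard known to be present
trimmed = n_fits(10, 4)                 # four unlikely standards excluded
both = n_fits(10, 4, n_forced=1)        # both restrictions together

minutes_per_fit = 100 / full            # from "about 100 minutes" per sample
for label, n in [("all", full), ("one forced", forced),
                 ("10 standards", trimmed), ("both", both)]:
    print(f"{label:>13}: {n:5d} fits, ~{n * minutes_per_fit:.0f} min per sample")
```

Under these assumptions, forcing a single known standard or dropping four unlikely ones each cuts the number of fits to roughly a quarter (377 and 375 fits respectively), and doing both leaves 129.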
Right, throwing more spectra at Athena's LCF is going to be expensive. The combinations are tried one at a time, and that won't ever be changed. OTOH, recasting the basic problem in terms of linear algebra might help, and might be able to make use of multiple cores, but that is not something Athena is going to do with Ifeffit 1.2, or probably ever.
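As an illustration of what "recasting the basic problem in terms of linear algebra" could look like, here is a minimal NumPy sketch. It is not how Athena or Ifeffit performs LCF. It treats each combination as an ordinary unconstrained linear least-squares problem, with the standards as columns of a matrix, and ranks the combinations by a normalized sum of squared residuals. The spectra are synthetic random arrays, not XANES data, and the names `fit_combination` and `best_fits` are hypothetical. Constraints such as non-negative weights or weights summing to one are deliberately left out.

```python
from itertools import combinations
import numpy as np

def fit_combination(standards, sample, idx):
    """Least-squares weights for the standards in idx; returns (weights, misfit)."""
    A = standards[:, idx]                        # (n_points, k) design matrix
    w, *_ = np.linalg.lstsq(A, sample, rcond=None)
    resid = sample - A @ w
    return w, float(np.sum(resid**2) / np.sum(sample**2))

def best_fits(standards, sample, max_size=4, min_size=2, keep=5):
    """Try every combination of min_size..max_size standards; keep the best few."""
    results = []
    n = standards.shape[1]
    for size in range(min_size, max_size + 1):
        for idx in combinations(range(n), size):
            w, misfit = fit_combination(standards, sample, list(idx))
            results.append((misfit, idx, w))
    results.sort(key=lambda t: t[0])             # sort on misfit only
    return results[:keep]

# Synthetic, noise-free demo: 14 fake "standards" on 200 points, and a
# "sample" that is exactly 60% of standard 2 plus 40% of standard 7.
rng = np.random.default_rng(0)
standards = rng.random((200, 14))
sample = 0.6 * standards[:, 2] + 0.4 * standards[:, 7]

best = best_fits(standards, sample)
for misfit, idx, w in best:
    print(idx, np.round(w, 3), f"{misfit:.2e}")
```

Because the demo data are noise-free, every combination containing both standards 2 and 7 fits essentially perfectly, so the top of the list consists of such combinations. With real, noisy data, larger combinations will always fit at least as well as their subsets, so a ranking like this needs a statistical criterion on top of it. Each combination is independent of the others, which is what would make this formulation straightforward to spread across cores.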
PS: One thing that occurs to me is that you could run two or three instances of Athena. Each one will use its own CPU and you can be a human load leveler.
Yes, running 4 instances of Athena would help, as each could presumably use a different core. Even if they're sharing physical cores, it might actually go faster, depending on caching. Splitting the problem in half 4 different ways and ensuring overlap is likely to take less real time.

--Matt
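The "human load leveler" idea can be planned with a few lines of Python. This is only a bookkeeping sketch: it deals a list of samples out to a number of Athena instances so that each gets a near-equal share. It does not drive Athena in any way. The sample names, the helper `deal_out`, and the figure of 60 samples are placeholders based on the "60+" in the original post. The wall-time estimate assumes each instance keeps a core to itself and runs at the single-instance speed of about 100 minutes per sample.

```python
def deal_out(items, n_workers):
    """Round-robin items into n_workers lists of near-equal length."""
    return [items[i::n_workers] for i in range(n_workers)]

# Hypothetical sample names; replace with the real project/group names.
samples = [f"sample_{i:02d}" for i in range(1, 61)]
batches = deal_out(samples, 4)

minutes_per_sample = 100
serial_hours = len(samples) * minutes_per_sample / 60
parallel_hours = max(len(b) for b in batches) * minutes_per_sample / 60

for n, batch in enumerate(batches, start=1):
    print(f"Athena instance {n}: {len(batch)} samples, first is {batch[0]}")
print(f"one instance: ~{serial_hours:.0f} h, four instances: ~{parallel_hours:.0f} h")
```

Round-robin dealing keeps the batches within one sample of each other in size, which matters when every sample costs about the same. If the restricted combinatorial sets differ a lot between samples, dealing out by estimated fit count would balance the load better.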
participants (3)
- BradleyW Miller
- Bruce Ravel
- Matt Newville