DOE Pulse
Number 388 | May 13, 2013

Makeover Puts CHARMM Back in Business

NREL scientists Michael Crowley and Antti-Pekka Hynninen have developed algorithms that speed calculations done by the software tool CHARMM (Chemistry at Harvard Molecular Mechanics) by several orders of magnitude, using code such as the one pictured. Using the new petascale high performance computer housed in NREL's Energy Systems Integration Facility, scientists will be able to simulate the motions of thousands of atoms, leading to greater understanding of how molecular models work. Credit: Dennis Schroeder

Biofuels scientists are asking more complex questions about how molecules spin, bond, and break when enzymes attack plants — all in the name of quickening the process of turning biomass into fuels for the sake of cleaner air and better energy security.

They're the kinds of questions that require trillions of mathematical operations each second on supercomputers. But software engineers hadn't been able to keep up with the ever-increasing demands of the scientists and the growing capabilities of modern supercomputers. That is, until unique work at DOE’s National Renewable Energy Laboratory (NREL) supercharged an essential decades-old software program to run on a single high-performance computer such as the new petascale computer at NREL's Energy Systems Integration Facility.

Software engineers at NREL have reworked code and algorithms in the CHARMM (Chemistry at Harvard Molecular Mechanics) program so that it can simulate molecular motion over millions to billions of computational steps. That corresponds to nanoseconds to microseconds of molecular motion, which takes days of computing time.

How long is a nanosecond? Well, a nanosecond (a billionth of a second) is to a second as a second is to 31.7 years.

And a nanosecond is a very long time when measuring all the movements of thousands of atoms in a molecule. It takes a million molecular dynamics (MD) steps to simulate a nanosecond of molecular motion.
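The arithmetic behind these figures can be checked in a few lines. The sketch below is illustrative only: the helper name `steps_for` is hypothetical, and the 1-femtosecond default time step is an assumption inferred from "a million steps per nanosecond"; the article does not state the time step CHARMM users actually choose.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year, in seconds

def steps_for(sim_time_ns, timestep_fs=1.0):
    """Number of MD steps needed to cover sim_time_ns at the given time step.

    ASSUMPTION: the 1 fs default is inferred from the article's
    "million steps per nanosecond", not stated in it.
    """
    fs_per_ns = 1_000_000  # 1 ns = 10^6 fs
    return round(sim_time_ns * fs_per_ns / timestep_fs)

# A nanosecond is to a second as a second is to a billion seconds.
years_in_a_billion_seconds = 1e9 / SECONDS_PER_YEAR

print(f"{years_in_a_billion_seconds:.1f} years")  # 31.7 years
print(steps_for(1.0))                             # 1000000
print(steps_for(100.0))                           # 100000000
```

A longer time step reduces the count proportionally: under this sketch, `steps_for(10.0, timestep_fs=2.0)` gives 5,000,000 steps.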

"For an average system of 100,000 atoms on a single modern processor core, it would take us half a day of computing to simulate less than half a nanosecond," NREL Senior Scientist Michael Crowley said.

But they need to simulate molecular motion for much longer than that — as long as 100 nanoseconds. "Using the original version of parallel CHARMM, it would take half a year, no matter how many processors we used, to simulate molecular motion for that long," Crowley said.

Thanks to the improvements the NREL engineers made to the CHARMM algorithms and code, they can now do that simulation in a day with hundreds of processors running in parallel.
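To put those wall-clock figures on a common scale, they can be converted to nanoseconds of simulated motion per day. This is a back-of-the-envelope sketch, not a benchmark: it takes "half a year" as 182.6 days, and the function name `ns_per_day` is hypothetical.

```python
DAYS_PER_HALF_YEAR = 365.25 / 2  # about 182.6 days (assumed reading of "half a year")

def ns_per_day(simulated_ns, wall_clock_days):
    """Simulation throughput in nanoseconds of molecular motion per day."""
    return simulated_ns / wall_clock_days

# "Less than half a nanosecond" in "half a day" on one core: an upper bound.
single_core_max = ns_per_day(0.5, 0.5)
# 100 ns in half a year with the original parallel CHARMM.
original_parallel = ns_per_day(100.0, DAYS_PER_HALF_YEAR)
# 100 ns in a day with the reworked code on hundreds of processors.
reworked = ns_per_day(100.0, 1.0)

speedup = reworked / original_parallel
print(f"single core:       under {single_core_max:.1f} ns/day")
print(f"original parallel: {original_parallel:.2f} ns/day")
print(f"reworked:          {reworked:.0f} ns/day")
print(f"speedup on this 100 ns job: about {speedup:.0f}x")
```

On these numbers alone, the 100-nanosecond job finishes roughly 180 times sooner than it did with the original parallel code.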

"To get a microsecond [1,000 nanoseconds] on a thousand processors will now take a few days," Crowley said.

The only limit on the questions scientists can ask, and expect answers to, is computing speed. For more than a decade, each time scientists asked new questions that required faster computers to answer, engineers could count on a computer's speed doubling every year or so to keep up.

"But this is not enough to keep up anymore. Computer chips are not getting any faster — they are getting more parallel," said NREL's Antti-Pekka Hynninen, a physicist and software engineer. "We now have to parallelize the code" to multiply the speed at which the simulations can be run.

CHARMM was developed at Harvard University in the 1980s to allow scientists to generate and analyze a wide range of molecular simulations, including production runs of a molecular dynamics trajectory for proteins, nucleic acids, lipids, and carbohydrates.

It is a favorite program of molecular researchers around the world for simulating biological reactions such as the action of cellulase on cellulose for converting biomass into ethanol. CHARMM is also a crucial code for the pharmaceutical industry.

CHARMM is unique in its ability to build, simulate, and analyze results of molecular motion in a single program, with cutting-edge methods for thermodynamics, reaction sampling, quantum mechanics, molecular mechanics, and advanced imaging, Crowley said.

For all its advantages, though, CHARMM's crunching velocity hadn't kept up with the new demands and the new questions. The new biomolecular simulations were so large (more than 1 million atoms) and ran for so long (5 million time steps for a 10-nanosecond simulation) that they exceeded the capabilities of CHARMM.

So, three years ago, Crowley hired Hynninen to update the code and increase its performance.

If Hynninen had tried rewriting all 600,000 lines of code, he estimates it would have taken him about 10 years.

Instead, he focused on rewriting the heart of CHARMM, the molecular dynamics engine, and he was able to pare the chore down to two years. The molecular dynamics engine is where all the heavy computation is done.

He's the first to admit it wasn't exactly a day (or two years) at the beach.

"It's one of those very hard problems, mechanics of atoms and enzymes," Hynninen said. "There is really no limit to how a molecule can behave." Its motions are determined by the interplay of a multitude of interactions between each atom and every other atom nearby — through both chemical bonds and non-bonded interactions, he noted. That results in thousands of different kinds of interactions per atom. And there can be hundreds of thousands of atoms in a simulation.

The day-long task, using hundreds or thousands of processors, simulates a very brief moment while cataloguing every move by thousands of atoms. "It's not just that they all move but that each atom is feeling forces from thousands of other atoms," Crowley said. "And each one of those forces has to be calculated for every atom at every step."
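Crowley's description, with every atom feeling forces from many others at every step, can be sketched as a loop over pairs of atoms. The example below is an illustration only and is not CHARMM's code: it assumes a Lennard-Jones pair potential in reduced units with a plain distance cutoff, and the function name `lj_forces` and its parameters are hypothetical.

```python
def lj_forces(positions, epsilon=1.0, sigma=1.0, cutoff=2.5):
    """Return per-atom force vectors from a Lennard-Jones pair potential.

    Visits every pair of atoms once (O(N^2) work) and skips pairs that are
    farther apart than `cutoff`. Units are reduced (ASSUMPTION for this sketch).
    """
    n = len(positions)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    cutoff2 = cutoff * cutoff
    for i in range(n - 1):
        for j in range(i + 1, n):
            # Separation vector pointing from atom j to atom i.
            d = [positions[i][k] - positions[j][k] for k in range(3)]
            r2 = sum(c * c for c in d)
            if r2 > cutoff2:
                continue
            sr6 = (sigma * sigma / r2) ** 3  # (sigma / r)^6
            # F_i = 24 * eps * (2 * (sigma/r)^12 - (sigma/r)^6) / r^2 * d
            scale = 24.0 * epsilon * (2.0 * sr6 * sr6 - sr6) / r2
            for k in range(3):
                forces[i][k] += scale * d[k]
                forces[j][k] -= scale * d[k]  # equal and opposite force on j
    return forces

# Three atoms on a line; the third lies beyond the cutoff of the first.
atoms = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [3.0, 0.0, 0.0]]
for force in lj_forces(atoms):
    print(force)
```

The double loop is the expensive part: its cost grows with the square of the number of atoms, and it has to be repeated at every time step.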

Simulating the motion of atoms answers important questions about how an enzyme can access the sugars in a plant.

Molecular dynamics code is quite easy to write if you do not care about performance, Hynninen said. But to code the algorithm to run very fast … that's difficult.
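As a rough illustration of the "easy to write" half of that statement, a basic integrator fits in about a dozen lines. The sketch below uses velocity Verlet, a standard textbook MD integration scheme; the article does not say which scheme CHARMM's engine uses, so this is an assumption, and the names `velocity_verlet` and `spring` are hypothetical. The toy system is a single coordinate on a harmonic spring rather than a real molecule.

```python
import math

def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Advance coordinates x and velocities v (lists of floats) by n_steps."""
    f = force(x)
    for _ in range(n_steps):
        v = [vi + 0.5 * dt * fi / mass for vi, fi in zip(v, f)]  # half kick
        x = [xi + dt * vi for xi, vi in zip(x, v)]               # drift
        f = force(x)                                             # new forces
        v = [vi + 0.5 * dt * fi / mass for vi, fi in zip(v, f)]  # half kick
    return x, v

K_SPRING = 1.0  # spring constant of the toy system (reduced units)

def spring(x):
    """Harmonic restoring force, F = -k * x, for each coordinate."""
    return [-K_SPRING * xi for xi in x]

# Start at x = 1 with zero velocity; the exact solution is x(t) = cos(t).
x, v = velocity_verlet([1.0], [0.0], spring, mass=1.0, dt=0.01, n_steps=1000)
print("simulated x:", x[0])
print("exact x:    ", math.cos(10.0))
```

Nothing here is tuned for speed. Making the same loop run efficiently across hundreds of processors, for a million atoms, is the hard part Hynninen describes.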

"I just started digging in," Hynninen said. "It's a lot of sort of lonely work. Just to figure out the algorithms, I went through 20 legal tablets, drawing diagrams — and then writing the algorithm into code."

Hynninen retained the strengths of the CHARMM code and combined them with ideas from other programs to enhance CHARMM's speed.

"Now we're back in the ballgame again," Crowley said. "This is a huge, huge improvement. People are using CHARMM again."

Funding for NREL's work on CHARMM came from two areas in the Energy Department’s Office of Science — Advanced Scientific Computing Research (ASCR) and Biological and Environmental Research (BER) — as well as funds from the National Institutes of Health (NIH) for code modernization at the University of Michigan. Partners include the University of Michigan, Oak Ridge National Laboratory, and the University of California at San Diego.

By fully understanding how enzymes find, reach, and act on the cellulose in plants, scientists may be able to engineer super-efficient enzymes that create abundant energy from algae or agricultural waste products.

The work is crucial because one of the most promising paths toward energy independence and clean energy requires biofuels to achieve price parity with gasoline.

Now that the new version of CHARMM is orders of magnitude faster — supersonic transport versus the Kitty Hawk plane, Crowley attests — the chances of solving problems and avoiding bottlenecks have increased exponentially. And the new and improved CHARMM should prove a boon to molecular scientists working with pharmaceuticals, as well.

Like a three-legged-race team moving in tandem, scientists and computer engineers have to count on each other to keep up.

Crowley compared Hynninen's borrowing of algorithms from other molecular dynamics software packages to "looking at how a VW is built to help you make your Chevy better."

Learn more about NREL's biomass and computational sciences research. —Bill Scanlon

Submitted by DOE’s National Renewable Energy Laboratory