ORNL heads effort to build better supercomputer centers

OAK RIDGE, Tenn., Sep. 6, 2001 — Supercomputers provide researchers with powerful tools, but operating them can also be a super hassle, says an Oak Ridge National Laboratory researcher who heads a team working to fix the problem.

Through an $11 million, five-year project, ORNL and a team from universities and other Department of Energy laboratories will create the Scalable Systems Software Center. The center, funded through DOE's Scientific Discovery through Advanced Computing initiative, will address the lack of software for effective management of terascale computational resources like the ones being installed at ORNL and other sites around the country.

"DOE operates many of the largest computers in the world and some of the largest computer facilities," said Al Geist of ORNL's Computer Science and Mathematics Division. "But today, each computer facility uses ad hoc and homegrown systems software solutions to, for example, schedule jobs and monitor the health of the supercomputers."

With the center, solutions developed at one DOE computer facility could be applied at other large facilities.

"The Scalable Systems Software Center provides the opportunity to create and support a common set of systems software for large computer facilities across the country," Geist said. "It's a problem that the computer industry isn't going to solve because business trends push the industry toward smaller systems aimed at Web serving, database farms and departmental-sized systems."

The vision and goal of the center are to bring together a team of experts who, with industry involvement, can agree on and specify standardized interfaces between system components. Another goal is to produce a fully integrated set of systems software and tools to effectively use terascale computational resources.

Researchers also plan to study and develop more advanced versions of the system tools to meet the needs of future, and even larger, supercomputers.

Scientific Discovery through Advanced Computing is an integrated program that will help create a new generation of scientific simulation codes. The codes will take full advantage of computers that can perform trillions of calculations per second to address increasingly complex problems.

The recently announced 51 DOE SciDAC projects will receive a total of $57 million this fiscal year to advance fundamental research in several areas related to the department's missions, including climate modeling, fusion energy sciences, chemical sciences, nuclear astrophysics, high-energy physics and high-performance computing.

ORNL's partners for the Scalable Systems Software Center are Ames Laboratory, Argonne National Laboratory, Lawrence Berkeley National Laboratory, Los Alamos National Laboratory, Pacific Northwest National Laboratory, Sandia National Laboratories and the National Center for Supercomputing Applications, made up of dozens of universities throughout the United States.

ORNL is a DOE multiprogram facility operated by UT-Battelle.