NSF creates TeraGrid
The National Science Foundation (NSF) has awarded $53 million to four U.S. research institutions to build and deploy a distributed terascale facility (DTF). The DTF will be the largest, most comprehensive infrastructure ever deployed for scientific research—with more than 13.6 teraflops (trillions of calculations per second) of computing power as well as facilities capable of managing and storing more than 450 terabytes (trillions of bytes) of data.
Besides offering the world’s fastest unclassified supercomputers, the DTF’s hardware and software will include ultra-high-speed networks, high-resolution visualization environments, and toolkits for grid computing. All of these components will be tightly integrated into an information infrastructure dubbed the “TeraGrid.” Scientists and industry researchers across the country will be able to tap this infrastructure to solve scientific problems.
“Breakthrough discoveries in fields from biology and genomics to astronomy depend critically on computational and data management infrastructure as a first-class scientific tool,” said Fran Berman, director of NPACI and SDSC and a principal investigator of the DTF award. “The TeraGrid recognizes the increasing importance of data-oriented computing and connection of data archives, remote instruments, computational sites, and visualization over high-speed networks. The TeraGrid will be a far more powerful and flexible scientific tool than any single supercomputing system.”
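Each of the four DTF sites will play a unique role in the project: the National Center for Supercomputing Applications (NCSA), the San Diego Supercomputer Center (SDSC), Argonne National Laboratory, and the California Institute of Technology (Caltech).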
- NCSA will lead the TeraGrid project’s computational aspects with an IBM Linux cluster powered by the next generation of Intel® Itanium™ processors, code-named McKinley. Combined with other NCSA clusters, the DTF-funded systems will reach a peak performance of 8 teraflops, with 240 terabytes of secondary storage.
- SDSC will lead the TeraGrid data and knowledge management effort by deploying a data-intensive IBM Linux cluster based on Intel Itanium family processors (McKinley). This system will have a peak performance of just over 4 teraflops and 225 terabytes of network disk storage. In addition, a next-generation Sun Microsystems high-end server will provide a gateway to grid-distributed data for data-oriented applications.
- Argonne will lead the effort to deploy advanced distributed computing software, high-resolution rendering and remote visualization capabilities, and networks. This effort will require a 1-teraflop IBM Linux cluster with parallel visualization hardware.
- Caltech will focus on providing online access to very large scientific data collections and will facilitate access to those data by connecting data-intensive applications to components of the TeraGrid. Caltech will deploy a 0.4-teraflop IBM cluster based on Intel Itanium family processors (McKinley) and an IA-32 cluster that will manage 86 terabytes of online storage.
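
As a rough cross-check, the per-site figures quoted in the list above can be added up and compared with the headline totals. The sketch below is only illustrative: the dictionary layout and variable names are assumptions, the site figures are rounded (SDSC is described only as "just over 4 teraflops"), and no storage figure is quoted for Argonne, so the sums approximate the announced totals rather than reproduce them.

```python
# Per-site figures as quoted in the list above. These are rounded values,
# so the sums are approximate. The data layout here is an illustrative assumption.
sites = {
    "NCSA": {"peak_tflops": 8.0, "storage_tb": 240},
    "SDSC": {"peak_tflops": 4.0, "storage_tb": 225},   # quoted as "just over 4"
    "Argonne": {"peak_tflops": 1.0, "storage_tb": 0},  # no storage figure quoted
    "Caltech": {"peak_tflops": 0.4, "storage_tb": 86},
}

# Add up the quoted peak performance and storage across the four sites.
total_tflops = sum(site["peak_tflops"] for site in sites.values())
total_storage_tb = sum(site["storage_tb"] for site in sites.values())

print(f"Listed peak performance: about {total_tflops:.1f} teraflops")
print(f"Listed storage: {total_storage_tb} terabytes")
```

The listed storage figures are consistent with the "more than 450 terabytes" cited for the facility as a whole, while the rounded peak-performance figures land slightly below the headline "more than 13.6 teraflops."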