Bandwidth Challenge Will Push the Limits of Technology at SC2003

SC2003 Bandwidth Challenge Contestants

At SC2003, to be held in Phoenix, Arizona, November 15-20, 2003, eight contestants will be challenged to "significantly stress" SCinet, the conference's temporary but powerful on-site network infrastructure, while moving meaningful data files across the multiple research networks that connect to it. The primary standard of performance will be verifiable network throughput, measured from the contestant's equipment through the SCinet switches and routers to external connections. Continuing a tradition started at SC2000, Qwest Communications http://qwest.com/ is awarding monetary prizes for applications that make the most effective or "courageous" use of SCinet resources. The winners will be announced at SC2003 on Thursday, November 20.

Bandwidth Lust: Distributed Particle Physics Analysis using Ultra High Speed TCP on the GRiD
In this demonstration we will show several components of a Grid-enabled distributed Analysis Environment (GAE) being developed to search for the Higgs particles thought to be responsible for mass in the universe, and for other signs of new physics, using CERN's Large Hadron Collider (LHC), SLAC's BaBar, and Fermi National Accelerator Laboratory's CDF and D0 experiments. We use simulated events from proton-proton collisions at 14 tera-electronvolts (TeV) as they would appear in the LHC's Compact Muon Solenoid (CMS) experiment, which is now under construction and will begin collecting data in 2007.
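
As an aside on what "ultra high speed TCP" demands of the end hosts: on a long, fast path the TCP window must cover the bandwidth-delay product, which typically means requesting very large socket buffers. The sketch below illustrates only that buffer-sizing arithmetic; the link capacity, round-trip time, and code are illustrative assumptions, not the team's actual configuration.

    # Illustrative only: sizing TCP socket buffers to the bandwidth-delay
    # product (BDP) of a long, fast path. Figures are hypothetical.
    import socket

    LINK_GBPS = 10            # assumed path capacity
    RTT_SECONDS = 0.080       # assumed round-trip time

    # Bandwidth-delay product in bytes: capacity (bits/s) * RTT / 8
    bdp_bytes = int(LINK_GBPS * 1e9 * RTT_SECONDS / 8)

    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Request send/receive buffers big enough to keep the pipe full; the OS
    # may cap the request (on Linux, see net.core.wmem_max / rmem_max).
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, bdp_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, bdp_bytes)

    print(f"requested {bdp_bytes / 1e6:.0f} MB socket buffers for a "
          f"{LINK_GBPS} Gbps x {RTT_SECONDS * 1000:.0f} ms path")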

Participants

Stanford: Les Cottrell

Concurrent Multi-Location Write to Single File
This entry will demonstrate collaborative data processing over very large distances using a distributed file system. Servers in multiple locations across North America (StarLight, Phoenix, and Edmonton, Canada) will read and write concurrently to a single file. The file may reside on heterogeneous storage that is localized in one site or it may be distributed across many sites. Aggregate throughput to disk should be in excess of 10 Gbps. WAN transmission of data between sites will be over conventional TCP/IP. The system will leverage distributed cache coherence so that all sites read and write to a single globally consistent data image yet maintain near local I/O rates. While this project will demonstrate very high throughput across the WAN, it will also demonstrate data and meta-data localization so that the amount of data that needs to be transmitted between sites is minimized.
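
The access pattern being demonstrated -- several writers updating one file at the same time -- can be sketched as follows. This is only a local, simplified stand-in: each "site" writes a disjoint byte range of one shared file, whereas in the actual demo the distributed file system's cache coherence keeps the single global image consistent across sites. The file name, region size, and site list are hypothetical.

    # Simplified sketch of concurrent writes to disjoint regions of one file.
    import os
    from concurrent.futures import ThreadPoolExecutor

    FILE_PATH = "shared_image.dat"     # hypothetical single global file
    CHUNK_SIZE = 64 * 1024 * 1024      # 64 MB region per writer (illustrative)
    SITES = ["starlight", "phoenix", "edmonton"]

    def write_region(site_index: int) -> int:
        """Write one site's region at a fixed offset; return bytes written."""
        payload = bytes([site_index % 256]) * CHUNK_SIZE
        fd = os.open(FILE_PATH, os.O_WRONLY)
        try:
            return os.pwrite(fd, payload, site_index * CHUNK_SIZE)
        finally:
            os.close(fd)

    # Pre-size the file, then let all "sites" write their regions concurrently.
    with open(FILE_PATH, "wb") as f:
        f.truncate(CHUNK_SIZE * len(SITES))
    with ThreadPoolExecutor(max_workers=len(SITES)) as pool:
        written = list(pool.map(write_region, range(len(SITES))))
    print(f"wrote {sum(written)} bytes from {len(SITES)} concurrent writers")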

Participants

YottaYotta, SGI Inc., Naval Research Laboratory, University of Tennessee, StarLight, CANARIE, Netera Alliance, WestGrid

Distributed Lustre File System Demonstration
Using the Lustre File System, the project will demonstrate both clustered and remote file system access, combining local very-high-bandwidth (80 Gbps) links in the ASCI booth with remote (10 Gbps) access over a large distance -- 2000 miles -- to NCSA. Compute nodes (clients in the Lustre lexicon) in both locations will access servers (OSTs in the Lustre lexicon) in both locations, reading and writing concurrently to a single file and to multiple files spread across the OSTs.

Aggregate performance to disk should be in excess of tens of Gbps. All transmission of data between locations and within the clusters will be over multiple 10 Gbps links using conventional TCP/IP and, potentially, Intel's ETA technology. A single metadata server will manage the single namespace. This demonstration aims to show very high data rates for local clustered file systems integrated with a remote site into the same namespace.
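
To make the client/OST relationship concrete, the sketch below shows how a striped layout maps a logical file offset to a particular OST and an offset within that OST's object. The stripe size and OST count are hypothetical, and this is a conceptual illustration rather than Lustre's actual layout code.

    # Conceptual sketch of striping: a file's logical byte range is spread
    # round-robin in fixed-size stripes across several OSTs.
    STRIPE_SIZE = 4 * 1024 * 1024   # 4 MB stripes (illustrative)
    NUM_OSTS = 8                    # number of object storage targets (illustrative)

    def locate(logical_offset: int) -> tuple[int, int]:
        """Map a logical file offset to (ost_index, offset_within_that_ost)."""
        stripe_number = logical_offset // STRIPE_SIZE
        ost_index = stripe_number % NUM_OSTS
        # Offset inside the OST object = completed full rounds * stripe size
        # plus the position within the current stripe.
        offset_in_ost = (stripe_number // NUM_OSTS) * STRIPE_SIZE + (
            logical_offset % STRIPE_SIZE)
        return ost_index, offset_in_ost

    # Example: the byte at logical offset 100 MB lands on OST 1.
    print(locate(100 * 1024 * 1024))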

Participants

SDSC, NCSA, Intel, Foundry, Supermicro, DataDirect Networks, and CFS

Grid Technology Research Center Gfarm File System
This entry by the National Institute of Advanced Industrial Science and Technology of Japan (AIST) will replicate terabyte-scale experimental data between the United States and Japan. Currently, four clusters in Japan and one cluster in the U.S. are involved in this challenge. They constitute a Grid virtual file system (Gfarm file system) federating local file systems on each cluster node.

Clusters will be located on the SC2003 exhibit floor and at AIST, the Tokyo Institute of Technology, APAN Tokyo XP, and KEK. Between the U.S. and Japan there are several OC-48 links. We are aiming at a file transfer rate of over 3 Gbps -- and preferably 5 Gbps -- between the U.S. and Japan (about 10,000 km, or 6,000 miles).
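
For a sense of scale, the short calculation below (with an illustrative 1 TB figure) shows how long terabyte-scale replication takes at the target rates, which is why the team aims to fill several OC-48 links in parallel.

    # Back-of-the-envelope check on the targets above.
    DATA_TB = 1.0
    for rate_gbps in (3, 5):
        seconds = DATA_TB * 8e12 / (rate_gbps * 1e9)
        print(f"{DATA_TB:.0f} TB at {rate_gbps} Gbps is about {seconds / 60:.0f} minutes")
    # Roughly 44 minutes at 3 Gbps and 27 minutes at 5 Gbps, ignoring protocol overhead.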

Participants

AIST: Osamu Tatebe

High Performance Grid-Enabled Data Movement with Striped GridFTP
This entry will demonstrate striped GridFTP between SDSC and the SDSC booth, driven from an application. We will transfer several files, each over 1 TB in size, from SDSC to the SDSC booth, using the 40 nodes on each side. Each file will be read from GPFS, broken into 40 chunks, transferred to the 40 nodes at SDSC, and then reassembled. The GridFTP file transfer is integrated into VISTA, a rendering toolkit from SDSC, for data visualization.

Many applications create large amounts of data. To take advantage of different computational resources, applications may compute, visualize, and mine this data at geographically distributed sites over a grid. Example applications and data include the National Virtual Observatory and the Southern California Earthquake Center. Sharing data efficiently and effectively between sites is important to improving the usability of the grid, and moving large amounts of data seamlessly, reliably, and quickly is crucial to effective geographically distributed grid computing. By harnessing the power of multiple nodes (and multiple network interfaces) with the new striped GridFTP, data can be transferred in parallel efficiently. We will demonstrate this capability using real data sets and a real visualization application for the bandwidth challenge.

We will create a grid site in SDSC's booth. This grid site consists of 40 IBM dual Intel 1.5 GHz Madison processor nodes. Each node is connected to Gigabit Ethernet (GE) and to a storage area network (SAN). The Gigabit Ethernet is connected to a Force10 switch, which is connected to the SCinet network. The SAN adapters are connected to a Brocade switch, which is connected to 40 TB of Sun StorEdge disks. These nodes mount a shared parallel filesystem (GPFS, IBM's General Parallel File System) that was created on the Sun disks.

The other grid site is SDSC, which has 40 IBM dual Intel 1.3 GHz Madison processor nodes. Again, each node is connected to GE and SAN. This GPFS consists of 77 TB of Sun storage. Data will be moved from the nodes in the booth to the nodes at SDSC.

The striped GridFTP server provides multiple levels of parallelism. Parallel file systems allow parallel access to the disk subsystems, and using multiple nodes to read the data adds parallelism across CPUs, network interfaces, and so on. The client contacts a head node and carries on a normal GridFTP session; the head node forwards the commands to the back-end nodes, which then form connections directly to the destination host.
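
The sketch below is a simplified stand-in for the chunk-and-reassemble pattern described in this entry: the source file is divided into 40 contiguous ranges, each range is moved by its own worker, and every range is written back at its original offset so the destination file comes out identical. In the real demonstration each worker is a GridFTP back-end node with its own network connection and the data is streamed rather than buffered whole; the file names here are hypothetical.

    # Simplified sketch of a striped, chunked transfer with reassembly.
    import os
    from concurrent.futures import ThreadPoolExecutor

    SOURCE = "input.dat"          # hypothetical source file on the parallel FS
    DEST = "output.dat"           # hypothetical reassembled destination file
    NUM_WORKERS = 40              # matches the 40 nodes per site in the demo

    def move_chunk(args):
        index, start, length = args
        src = os.open(SOURCE, os.O_RDONLY)
        dst = os.open(DEST, os.O_WRONLY)
        try:
            data = os.pread(src, length, start)   # worker reads its own range...
            os.pwrite(dst, data, start)           # ...and writes it at the same offset
        finally:
            os.close(src)
            os.close(dst)
        return index, length

    size = os.path.getsize(SOURCE)
    chunk = -(-size // NUM_WORKERS)               # ceiling division
    ranges = [(i, i * chunk, min(chunk, size - i * chunk))
              for i in range(NUM_WORKERS) if i * chunk < size]

    with open(DEST, "wb") as f:                   # pre-size the destination
        f.truncate(size)
    with ThreadPoolExecutor(max_workers=NUM_WORKERS) as pool:
        done = list(pool.map(move_chunk, ranges))
    print(f"reassembled {sum(n for _, n in done)} of {size} bytes in {len(done)} chunks")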

The National Virtual Observatory (NVO) data cover the sky in different wavebands, from gamma and X-rays through optical and infrared to radio. Various mosaics and datasets are retrieved and processed or mined for specific research projects. The dataset used in the demonstration is from the Palomar Observatory and was moved to SDSC for data mining.

The Southern California Earthquake Center (SCEC) created data from surveys and computer simulations. The data from each time step are saved in many files and collected in groups for easy retrieval from archival storage and movement across the grid.

The application that renders the data from these data sets is called VISTA, part of the NPACI Scalable Volume Renderer. It can render any size volume using out-of-core paging.
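
Out-of-core paging simply means the renderer touches the volume one bounded-size piece at a time instead of loading it whole. The fragment below illustrates that access pattern on a raw volume file; the brick size, file name, and the trivial "max voxel" computation are illustrative assumptions, not part of VISTA.

    # Sketch of out-of-core access: visit a huge volume brick by brick so
    # memory use stays bounded regardless of dataset size.
    BRICK_BYTES = 256 * 1024 * 1024    # 256 MB of the volume resident at a time

    def max_voxel(path: str) -> int:
        """Stream a raw 8-bit volume brick-by-brick and track its maximum value."""
        peak = 0
        with open(path, "rb") as volume:
            while True:
                brick = volume.read(BRICK_BYTES)   # only one brick is in memory
                if not brick:
                    break
                peak = max(peak, max(brick))
        return peak

    # Usage: max_voxel("scec_timestep_0001.raw")   # hypothetical volume file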

Participants

San Diego Supercomputer Center: Mike Packard, Martin Margo, Bryan Banister, Don Thorp, Patricia Kovatch
Argonne National Laboratory: Bill Allcock

High Performance Grid-Enabled Data Movement with GPFS
This entry will demonstrate the IBM General Parallel File System (GPFS) operating between SDSC and the SDSC booth at SC2003, driven from an application. We will transfer several files, each over 1 TB in size, from SDSC to the SDSC booth, using the 40 nodes on each side. Each file will be read from GPFS, broken into 40 chunks, transferred to the 40 nodes at SDSC, and then reassembled. The GPFS file transfer is integrated into VISTA, a rendering toolkit from SDSC, for data visualization.

This demonstration uses the same applications and data files as the team's other demo, "High Performance Grid-Enabled Data Movement with Striped GridFTP."

By harnessing the power of a parallel filesystem (GPFS) accessible over the WAN, together with multiple nodes (and multiple network interfaces), data can be transferred in parallel efficiently. We will demonstrate this capability using real data sets and a real visualization application for the bandwidth challenge.

In SDSC's booth at SC2003, we will create a grid site consisting of 40 IBM dual Intel 1.5 GHz Madison processor nodes. Each node is connected to Gigabit Ethernet (GE) and to a storage area network (SAN). The Gigabit Ethernet is connected to a Force10 switch, which is connected to the SCinet network. The SAN adapters are connected to a Brocade switch, which is connected to 40 TB of Sun StorEdge disks. These nodes mount the GPFS shared parallel filesystem that was created on the Sun disks.

The other grid site is SDSC, which has 40 IBM dual Intel 1.3 GHz Madison processor nodes. Again, each node is connected to GE and SAN. This GPFS consists of 77 TB of Sun storage. Data will be moved from SDSC to the nodes in the SDSC booth.

Participants

San Diego Supercomputer Center: Mike Packard, Martin Margo, Bryan Banister, Don Thorp, Patricia Kovatch
Argonne National Laboratory: Bill Allcock

Multi-Continental Telescience
This entry will showcase technology and partnerships encompassing Telescience, the Biomedical Informatics Research Network (BIRN), OptIPuter, and the Pacific Rim Applications and Grid Middleware Assembly (PRAGMA). By utilizing technology and resources from these different projects, we will demonstrate how high performance visualization, tele-instrumentation, and infrastructure for collaborative data-sharing all converge to solve multi-scale challenges in biomedical imaging. Specifically, we will demonstrate how network bandwidth and IPv6 can be effectively used to enhance the control of multiple high data-rate instruments of different modalities, enable interactive multi-scale visualization of data pulled from the BIRN Grid, and facilitate large-scale grid-enabled computation. The coordinated environment will include globally distributed resources and users, spanning multiple locations in the US, Argentina, Japan, Korea, the Netherlands, Sweden, and Taiwan.

Participants

UCSD BIRN: Steve Peltier, Abel Lin, David Lee, Mark Ellisman
UCSD SDSC: Tom Hutton
Universidad de Buenos Aires: Francisco Capani
Karolinska Institute in Sweden: Oleg Shupliakov (connectivity via NORDUnet)
Osaka University, Cybermedia Center: Shimojo Shinji, Toyokazu Akiyama
The Center for Ultra High Voltage Microscopy in Osaka: H. Mori
KDDI R&D Labs: USA division
NCHC in Taiwan: Fang-pang Lin

Project DataSpace
At the National Center for Data Mining, Project DataSpace is developing an open-source infrastructure based on high-performance web services and new network transport protocols for exploring and analyzing remote and distributed data.

For our Bandwidth Challenge demonstration, we will transport a terabyte of geoscience data between Amsterdam and Phoenix. We will also demonstrate high performance Web services for a distributed application involving astronomical data distributed between Chicago, Amsterdam, and Phoenix.

The contest entry will focus on two related metrics: (1) end-to-end performance -- the entry will show that bandwidth-intensive applications can scale not merely by moving bits over a network, but by moving bits from source disk, over the network, to target disk, achieving high end-to-end performance for data-intensive applications; and (2) scaling Web services -- we will demonstrate specialized Web services designed for distributed data-intensive applications and show how many such applications can be built using Web services.

The data stack for Project DataSpace consists of three layers. In the data transport layer, we will use a high-performance Web-service-based protocol called the DataSpace Transfer Protocol, or DSTP. In the network transport layer, we will use two application-layer libraries for high-performance network transport, SABUL and UDT, which were developed as part of Project DataSpace; both are based on UDP, employ separate control and data channels, and use rate control and congestion control mechanisms so that they are friendly both to TCP and to other SABUL and UDT streams. In the path services layer, Project DataSpace can employ path services to set up lambdas on demand, or it can be used with more traditional, statically configured networks.
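
The core idea behind SABUL and UDT -- data over UDP, paced to a target rate, with a separate channel carrying control traffic -- can be sketched roughly as below. This fragment shows only the rate-paced UDP sending; the loss reporting, rate adaptation, and congestion control that make the real libraries TCP-friendly are omitted, and the address, packet size, and rate are hypothetical.

    # Minimal sketch of rate-based UDP sending: packets are spaced in time
    # to approximate a target rate. A real implementation would also run a
    # control channel for loss reports and adjust the rate accordingly.
    import socket
    import time

    DEST = ("198.51.100.10", 9000)       # hypothetical receiver
    PACKET_BYTES = 1400                  # payload per UDP datagram
    TARGET_MBPS = 800                    # illustrative target sending rate

    interval = PACKET_BYTES * 8 / (TARGET_MBPS * 1e6)   # seconds between packets
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    payload = b"\x00" * PACKET_BYTES
    next_send = time.monotonic()
    for seq in range(10_000):
        # Prefix a sequence number so a receiver could report losses back
        # over a control channel (omitted in this sketch).
        sock.sendto(seq.to_bytes(4, "big") + payload, DEST)
        next_send += interval
        sleep_for = next_send - time.monotonic()
        if sleep_for > 0:
            time.sleep(sleep_for)   # coarse pacing; real libraries use finer timing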

Participants

University of Illinois at Chicago: Robert L. Grossman, Yunhong Gu, David Hanley, Xinwei Hong, Michal Sabala
Northwestern University: Joe Mambretti
University of Amsterdam: Cees de Laat, Freek Dijkstra, Hans Blom
SURFNet: Dennis Paus
Johns Hopkins University: Alex Szalay
Oak Ridge National Laboratory: Nagiza F. Samatova, Guru Kora