First NMADS Day

Home

Participating Institutions

Steering Committee

NMADS past events

Related events in the NYC Metro Area

Send mail to NMADS

 

March 24, 2000
IBM T. J. Watson Research Center
(Hawthorne Buildings)

[Directions]


[Schedule]
[List of attendees]
[Invited Talk - slides]


ABSTRACTS


Invited Talk
The Next Generation Internet: Unsafe at any speed?

Kenneth P. Birman , Cornell University

There is a debate concerning the future directions that the Internet should take. Support has been building for the Next Generation Internet (NGI) Initiative, which would greatly enhance the performance and scalability of the current Internet, while also providing improved Internet access for K-12 schools and investing in new kinds of Internet uses, such as for providing government services over the network. These include safety, revenue and life-critical systems which have traditionally been developed using special-purpose computing technology. The NGI envisions a steady transition of such systems onto a commercial off-the-shelf technology base. There really isn't any alternative; only standard technologies yield cost-effective systems. But this begs the question: What needs to be done to transform the current Internet into the NGI, if critical applications will run upon it? In particular, if the current Internet is unsafe for such applications, why should we expect the NGI to be more safe? The talk suggests that as things stand, the NGI will not be safe for critical uses, but also shows how a simple "virtual overlay network" capability could be developed to plug the major deficiency.


Short Talks


Using Jini for Complex Systems Integration 
(Ben Jai, Bell Labs)

I will describe the work that we have been doing using a Jini object model as the basis for Systems Integration, in particular where those systems are large, distributed, complex, and have legacy components. In contrast to other distributed object systems, Jini has the unique advantage that it is protocol-independent, meaning that any communication protocol can be encapsulated in the object's stub or proxy that a client obtains from the Lookup Service. This is immensely important for large systems that often have to support many co-existing protocols.
Contact Information:  benjai@research.bell-labs.com


Evolving the Publish-Subscribe Paradigm 
(Rob Strom, IBM T. J. Watson Research Center)

We discuss the work in the Gryphon project at IBM TJ Watson Research, to provide robust, scalable, internet-wide event distribution for business and personal applications. Gryphon began as an effort to extend conventional group-based publish-subscribe to content-based publish-subscribe. It is now being extended into a system allowing clients to subscribe to "interesting" state derived from published events and from other state. The design and implementation of Gryphon involves a challenging synthesis of concepts drawn from both database and distributed messaging technologies. We discuss recent results, and ongoing work.
Contact Information: strom@us.ibm.com


Computing Communities: Transparent Distribution Middleware 
(Vijay Karamcheti, New York University)

The heterogeneous and dynamic nature of  distributed systems makes it a challenging task to design performance-portable applications. Even when appropriate techniques have been developed (e.g., for fault-tolerant systems), they have seen restricted deployment because of the application development barrier: such techniques have typically required applications to adhere to a different API, ultimately limiting acceptance. The Computing Communities project is designing transparent virtualization and adaptation techniques which aim to address these shortcomings. These techniques lower the application development barrier by being embodied in a middleware layer that is inserted between the application and the base operating system (Windows NT in our case) using API interception techniques. We discuss recent results and ongoing work.
Contact Information: vijayk@cs.nyu.edu
URL: http://www.cs.nyu.edu/cc 


Robustness Testing of CORBA ORB Implementations 
(Jiantao Pan, AT&T Labs - Research)


Today the Common Object Request Broker Architecture (CORBA) is widely recognized as a standard for heterogeneous distributed computing and application level integration of COTS and legacy software components.
As the new standard for object management technology in the new century, CORBA and CORBA-enabled solutions have penetrated into many critical application areas, such as aerospace/defense, banking/finance, e-commerce. The robustness of these applications, however, is becoming a serious concern. Moreover, as the infrastructure that manages communication and requests, the Object Request Broker (ORB) implementation plays a critical role in the overall robustness of the whole system. Ungraceful handling of exceptional inputs may cause crashes of the whole application and result in huge loss. The Ballista® software robustness testing service provides a scalable, portable way of testing generic software
modules for robustness failures caused by exception handling problems. We have implemented a Ballista® client that can test the robustness of ORB implementations. This approach enables us to evaluate the robustness of a specific ORB implementation, to compare different ORB implementations provided by various vendors, and to enhance robustness of a specific ORB implementation.
Contact Information: jpan@research.att.com


Integrating Asynchronous Messaging and Distributed Object Transactions 
(Stefan Tai and Isabelle Rouvello, IBM T. J. Watson Research Center)

Messaging, and distributed transactions, describe two important models for building enterprise-scale software systems. Distributed object middleware provides services for transaction processing and for asynchronous messaging. However, the integration of distributed transactions and asynchronous messaging in distributed object systems is mostly undefined and unsupported. In this talk, we argue for a systematic and well-defined approach to integrating distributed transactions and messaging using middleware. We present first results from our project of developing a messaging and transaction integration facility, and outline important research topics that need to be addressed. 
Contact Information: stai@us.ibm.com


Perfect Failure Detection in Timed Systems with Hardware Watchdogs 
(Christof Fetzer, AT&T Labs - Research)

I address the problem of how a backup computer can decide when it has to take over the services of a primary computer. On one hand, the backup must not take over as long as the primary is alive because in the considered replication scheme the backup is taking over some of the network addresses of the primary. On the other hand, when the primary is crashed, the backup has to take over within a bounded amount of time (unless it has crashed itself). To solve this problem, I define a *timed perfect* failure detector TP that is very similar to a perfect failure detector. The main difference is that TP has to cope with computers recovering from crash failures and TP has to detect crashes/recoveries within a maximum known time. I propose a protocol that implements TP in timed systems with three computers (primary, backup, and a witness) if each of the three computers has a local hardware watchdog.
Contact Information: christof@research.att.com 


Exemptions from Coordination and the Origins of Primary Partition Behavior 
(Aleta Ricciardi, Bell Labs)

We describe a generalization of Uniform Distributed Coordination in which not just the faulty processes are exempt from participating. We define one exemption and show it is the origin of primary-partition behavior, seen in systems that are said to exhibit virtual synchrony. Our characterization of the exemption allows us to weaken this exemption still further to derive problems that are inherently solvable in systems that devolve to multiple partitions.
Contact Information: aleta@research.bell-labs.com 


Fast Deterministic Consensus in a Noisy Environment 
(James Aspnes, Yale University)

It is well known that the consensus problem cannot be solved deterministically in an asynchronous environment, but that randomized solutions are possible. We propose a new model we call noisy scheduling in which an adversarial schedule is perturbed randomly, and show that in this model randomness in the environment can substitute for randomness in the algorithm. In particular, we show that a simplified, deterministic version of Chandra's wait-free shared-memory consensus algorithm solves consensus in time at most logarithmic in the number of active processes. The proof of
termination is based on showing that a race between independent delayed renewal processes produces a winner quickly. In addition, we show that the protocol finishes in constant time using quantum and priority-based scheduling on a uniprocessor, suggesting that it is robust against the choice of model over a wide range.
Contact Information: aspnes@cs.yale.edu 


Fault Tolerant HLRC Distributed Shared Memory 
(Florin Sultan, Thu Nguyen, and Liviu Iftode, Rutgers University)

We describe a design for integrating fault tolerance in a distributed shared memory (DSM) system based on a Home-based Lazy Release Consistency (HLRC) protocol. Fault tolerant HLRC uses checkpointing and volatile memory logging for rollback recovery from single node failures. HLRC has been known as a highly efficient DSM protocol, and therefore any design for fault tolerance support should not adversely impact its performance. Our recovery support introduces low overhead during failure-free operation by using volatile memory for logging, and by aggressively exploiting the base DSM protocol. The main goal of this work is to investigate the feasibility and efficiency of building a fault-tolerant DSM system that can recover from failures using only independent checkpointing and fast log-based recovery.
Independent checkpointing is particularly interesting for computing on large local clusters and on clusters interconnected by a wide area network, since taking coordinated checkpoints in such cases may be either too expensive or simply impractical. Our approach is motivated by the concept of meta-clustering (clusters of local-area clusters connected by the Internet), pursued by projects like InterWeave (U. of Rochester) and Legion (U. of Virginia). The challenge in building such a system lies in controlling the size of the logged and checkpointed state, without forcing processes to synchronize. We have designed a scheme for lazy log trimming (LLT) and checkpoint garbage collection (CGC) in a fault-tolerant HLRC DSM system that does not use global synchronization and therefore retains the advantages of independent checkpointing. We have extended a Home-base LRC system to include our LLT and CGC algorithms along with a local checkpointing policy, and obtained results indicating that log and checkpoint
management can be effectively performed under this scheme. For three update-intensive applications, we show that the amount of logged and checkpointed state can be controlled without recourse to global synchronization operations. For high efficiency, our DSM system is implemented on top of an efficient user-level virtual memory-mapped communication system. An important problem that we had to address was how to provide the minimal support required for recovering the communication state after the crash of a node. All our checkpointing, logging and log-management algorithms are light-weight, do not add extra messages to the base DSM protocol, and do not specifically require global synchronization.
Contact Information: sultan@paul.rutgers.edu 


Automatic Configuration and Run-time Adaptation of Distributed Applications 
(Fangzhe Chang and Vijay Karamcheti, New York
University)

Increased platform heterogeneity and varying resource availability in distributed systems motivates the design of resource-aware applications, which ensure a desired performance level by continuously adapting their behavior to changing resource characteristics. In this talk, we describe an application-independent adaptation framework that simplifies the design of resource-aware applications. This framework eliminates the need for adaptation decisions to be explicitly programmed into the application by relying on two novel components: (1) a tunability interface, which exposes
adaptation choices in the form of alternate application configurations while encapsulating core application functionality; and (2) a virtual execution environment, which emulates application execution under diverse resource availability enabling off-line collection of information about resulting behavior. Together, these components permit automatic run-time decisions on "when" to adapt by continuously monitoring resource conditions and application progress, and "how" to adapt by dynamically choosing an application configuration most appropriate for the prescribed user preference. We have evaluated the framework using an interactive distributed image visualization application. The framework permits successful automatic application adaptation in reaction to changes in CPU load and network bandwidth by choosing a different compression algorithm or controlling the image transmission sequence so as to satisfy user
preferences of visualization quality and timeliness.
Contact Information: fangzhe@cs.nyu.edu
URL: http://www.cs.nyu.edu/cc 


A Component Description Repository 
(Francisco Curbera, Sanjiva Weerawarana and Matthew Duftler, IBM T. J. Watson Research Center)

One of the most powerful factors in the development of open industrial sectors is the standardization of industrial components. This enables industry players to unbind their operations from individual suppliers, and to take advantage of an open market of commoditized components. Take the case of the new economy now developing on the Internet. Here, the central components are electronic services, and several standardization processes are on their way to produce common service interfaces in a wide variety of
sectors. From a developer?s perspective, commoditized services can be viewed as standardized software components which are accessed remotely. The ability to dynamically switch suppliers is provided here by late binding to remote components. In this talk we will present a mechanism for dynamic binding to remote software components built around a networked repository of component interface descriptions and a Java API for dynamic resolution of component references. Three key characteristics distinguish this system from related industry initiatives no assumptions are made about the protocol for component access, rather, this is part of the component?s description; component interfaces are described in an XML IDL language; finally, matching component interfaces is done by structural equivalence.
Contact Information: {curbera, sanjiva, duftler}@us.ibm.com 


Cluster-Based WWW Servers 
(Enrique Vinicio Carrera and Ricardo Bianchini, Rutgers University)


In this talk I will describe and evaluate a new cluster-based network server, the Locality and Load balancing Server (L2S). In essence, L2S uses cache (memory) locality and load balancing information to distribute requests across the cluster nodes. All cluster nodes can both forward and service requests and, thus, the server can achieve high throughput and has no single point of failure or potential bottleneck. L2S can be used to implement a variety of cluster-based network servers, such as ftp or WWW servers, but our evaluation of L2S concentrates on its application as a WWW server. The evaluation is based on the detailed simulation of a 16-node cluster and on a performance comparison of L2S against a traditional server and the
best-known locality-conscious server. Our simulation results demonstrate that L2S achieves throughput that is within 20\% of the full potential of locality-conscious distribution on 16 nodes, significantly outperforming and outscaling the other servers. Preliminary results of a native implementation of L2S are also promising.
Contact Information: vinicio@cs.rutgers.edu 


Prefetching Hyperlinks 
(Dan Duchamp, AT&T Labs - Research)

We develop a new method for prefetching Web pages into the browser's cache. Clients send information about hyperlink reference patterns to Web servers, which aggregate the reference information in near-real-time and then disperse the aggregated information to all clients, piggybacked on GET responses. The information indicates how often hyperlink URLs embedded in pages have been previously accessed relative to the embedding page. Based on knowledge about which hyperlinks are generally popular, clients initiate prefetching of the hyperlinks and their embedded images according to any algorithm they prefer. Both client and server may cap the prefetching mechanism's space overhead and waste of network resources due to speculation. The result of these differences is improved prefetching lower client latency (by 52.3\%) and less wasted network bandwidth (24.0\%).
Contact Information: duchamp@research.att.com 


Posters


DataSpace - querying and monitoring deeply networked collections in
physical space 

(Samir Goel and Tomasz Imielinski, Rutgers University)

The DataSpace is a three dimensional physical space 100 kilometers above and 10 kilometers below the surface of earth that is accessible to the network. It is addressed geographically as opposed to the current ``logical'' addressing scheme of the Internet. With the enormous 128 bit addressing space of IPv6, one can individually address every cubic centimeter of the physical space in DataSpace with approximately 90 bits of area code. This would include every street, building, room, basement or even drawer of a desk. The DataSpace would thus serve as the host for the entire part of the
physical world that is connected to the network. The DataSpace is populated by a massive number of objects that produce and locally store data about themselves. In the DataSpace, physical objects are no longer characterized just by shape, size, and color. They are also characterized by processor type, the amount of memory and the network connection. To support the DataSpace, we propose a version of the multicast protocol called ``spacecast''. Here, the network plays the role of a Database machine, handling queries through spacecast which ``illuminate'' selected datacubes and gather multiple responses from the objects that respond to the query. The objects in a DataSpace must be organized in a way that they can be usefully accessed and manipulated. This is achieved by grouping objects into classes called dataflocks. Dataflocks are often mobile class of objects that move through the physical world while still maintaining (some level of) connectivity to the network. Dataflocks are accessed through the
datacubes they are located in and through their own class properties. For new objects, becoming part of a datacube is a simple ``plug and play'' mechanism. In this work we build on earlier work with Navas where we introduced GeoCast for geographic messaging and have implemented it within DARPA's GLOMO program.
Contact Information: {gsamir, imielins}@cs.Rutgers.edu
URL: http://paul.rutgers.edu/~gsamir/dataspace/ 


Transparent Network Connectivity in Dynamic Cluster Environments
(Xiaodong Fu, Hua Wang, and Vijay Karamcheti, New York University)

Improvements in microprocessor and networking performance have made networks of workstations a very attractive platform for high-end parallel and distributed computing. However, the effective deployment of such environments requires addressing two problems not associated with dedicated parallel machines: heterogeneous resource capabilities and dynamic availability. Achieving good performance requires that application components be able to migrate between cluster resources and efficiently adapt to the underlying resource capabilities. An important component of the required support is maintaining network connectivity, which directly impacts on the transparency of migration to the application and its performance after migration. Unfortunately, existing approaches rely on either extensive operating system modifications or new APIs to maintain network connectivity, both of which limits their wider applicability. This poster presents the design, implementation, and performance of a transparent network connectivity layer for dynamic cluster environments. Our design uses the techniques of API interception and virtualization to construct a transparent layer in user space; use of the layer requires no modification either to the application or the underlying operating system and messaging layers. Our layer enables the migration of application components without breaking network connections, and additionally permits adaptation to the characteristics of the underlying networking substrate. Experiments with supporting a persistent socket interface in two environments---an Ethernet LAN on top of TCP/IP, and a Myrinet LAN on top of Fast Messages---show that our approach incurs minimal overheads and can effectively select the best substrate for implementing application communication requirements.
Contact Information: {xiaodong, wanghua, vijayk}@cs.nyu.edu
URL: http://www.cs.nyu.edu/cc 


Software DSM on PC clusters using the Virtual Interface Architecture
(Murali Rangarajan, Zoltan Jarai, Ricardo Bianchini, Thu Nguyen, and
Liviu Iftode, Rutgers University)

Our goal is to provide a shared address space over a cluster of PCs using a software DSM protocol and a low-latency high-speed interconnect supporting the Virtual Interface Architecture (VIA). VIA standardizes the interface for user-level memory mapped communication over high-performance System Area Networks(SAN). The focus of our project is on providing a high-performance DSM system that can be used for scientific as well as non-scientific applications like data mining and continuous media applications. We have implemented a Home-based Lazy Release Consistent (HLRC) DSM protocol on a cluster of eight PCs connected by a VIA-based Giganet network. Our preliminary results indicate that software DSM on VIA achieves good performance that compares well with that achieved by similar protocols on Myrinet-based networks.
Contact Information: muralir@paul.rutgers.edu 


Modeling Cluster Failures 
Taliver Heath and Rich Martin, Rutgers University)

Our work investigates the distribution of failure rates for a variety of cluster components, including operating system failures, software failures, and component failures. In particular, we focus on modeling the failure rate of the operating system. Our primary method is empirical observation of systems running in a variety of environments, ranging from desktop PCs to clusters of Suns servers housed in strictly controlled machine rooms.
Contact Information: taliver@paul.rutgers.edu 


Split-OS 
(Matias Cuenca, Kiran Nagaraj, Kiran Srinivasan, Florin Sultan, Ricardo Bianchini, Richard Martin, Thu D. Nguyen and Liviu Iftode, Rutgers University)

The introduction of switched-based scalable I/O architecture for servers, like InfiniBand, and the increasing interest for "intelligent" devices with RISC processors and memory on board are creating new opportunities for OS designers. In particular, by redesigning the OS, we may be able to improve system reliability and increase performance through the aggressive use of device intelligence. In Split-OS, we are redesigning the traditional OS by splitting (and possibly replicating) OS functions and state between the host and the intelligent devices. Our current approach is to simulate this expected server architecture using a cluster of PCs interconnected by Virtual Interface Architecture (VIA). In this setup, each PC emulates a particular device. We are working to define a communication protocol between devices using VIA channels in anticipation of the InfiniBand specifications release.
Contact Information: mcuenca@paul.rutgers.edu


Co-operative File Systems for clusters
Matias Cuenca, Suresh Gopalakrishnan, Chandrashekar Krishnan, Kiran Nagaraja, Rahul Pupala, Kiran Srinivasan, Qian Zhang, Ricardo Bianchini, Liviu Iftode, Thu Nguyen - Rutgers University

A cooperative file system is a user-level distributed file system created ad-hoc for a distributed application by merging the local file systems of the cluster nodes. The merging is lazily performed, temporary (application lifetime), volatile (in memory, not on disks), local to the application , and dynamically reconfigurable. The system is called cooperative because local file systems can "help" each other by migrating or replicating hot files among them for better load balacing and fault tolerance. Central to the design is the concept of virtual directories provide location independent file naming and consequently support file migration. The main application for CFS are the distributed servers like file servers, web servers, video servers, email servers etc. Today, servers are extremely loaded and their I/O activity can be easily imbalanced. CFS supports dynamic balacing of the disc I/O load accross the cluster using file migration and replication. We are currently implementing a CFS prototype that uses Virtual Interface Architecture (VIA) to interconnect 8 PCs into a cluster. Our first application is a NFS distributed server. We will be exploring various policies for migration and studying the impact of caching policies on performance. 
Contact Information: gsuresh@cs.rutgers.edu

 

For problems or questions regarding this web page contact [discolab@cs.rutgers.edu].
Last updated: September 06, 2001.