Department of Systems and Computer Engineering


Real-Time and Distributed Systems Group

Software Engineering

Performance Engineering


We are in the Department of Systems and Computer Engineering at Carleton University, in Ottawa, Canada, and we study software engineering, especially software architecture and design and system operation for communications systems and for distributed systems. We develop techniques for visualizing, modelling and evaluating designs, and for making design decisions. Design for real-time performance is a major concern. Distributed applications and high-level protocols are the application focus.

Below you will find our research areas, funded research projects, bibliography and group members, and on nearby pages we have material on Layered Queueing Networks, which we use to build analytic software performance models. You can send us a message at cmw@sce.carleton.ca or at majumdar@sce.carleton.ca.

Research Areas

Software Architecture

We seek to understand the implications of different architectures for real-time systems, especially in communications, and for open distributed systems. Our goal is to match an architecture to the system requirements, and to evaluate implementation architectures (e.g. for scalability).

Use Case Maps

Use Case Maps (UCMs) abstract from a set of use cases to give a total description of behaviour as sequences of responsibilities. UCMs can be mapped over an implementation architecture, and can be used to systematically derive an architecture. While UCMs are themselves graphical and appear to be informal, a formalization has been developed using LOTOS (a process algebra for telecom systems). More...

Downloadable publications

Design of Event-driven Software

Event-driven systems typically must support a range of services which may be specified by scenarios, and which must be mapped to concurrent processes. Large systems are intrinsically difficult to understand, leading to errors in design or redesign for maintenance, and to inconsistent patterns of behaviour which are yet more difficult to understand. This aspect of our research develops design visualizations for understanding, and design principles based on the visualization. Our goal has been a machine-supported "Software CAD".

Work in this area has generated graphical notations for structure (Buhr diagrams, Machine Charts) described in Practical Visual Techniques in System Design with a methodology for developing concurrent objects, and for desired behaviours described in UCM.

Software Performance Engineering

There are many problems and few systematic guidelines for producing software with good performance, particularly for concurrent software. It is shown in MPR how concurrent system performance is viewed in three different ways (called the Map, Path and Resource Views), which must be combined to give a full understanding. Performance engineering processes are suggested which move between views in various ways.

We are developing methods for predicting the performance of a concurrent system, based on measured or estimated workload parameters for components, and a description of how they are combined (and controlled) in the system. We use a framework of Layered Service Systems to develop insights, together with a performance model for prediction, called Layered Queueing Networks, or Rendezvous Networks. This framework is very general and describes most distributed systems, and many other event-driven systems. Performance models can be built either by analysis or by capturing behaviour traces.

Layered Queueing Networks: Approximate Solution

Layered queueing networks (LQNs) describe systems with software servers and logical resources. A server may itself act as a customer. Our earlier work on this model referred to "Active Servers", "Lazy Boss Model", "Rendezvous Networks" and the "Method of Layers." The general techniques of mean value analysis for queues have been adapted to provide approximations which have accuracies ranging from 1-2% (usual) to 10%. A second approach called "Task Directed Aggregation" derives the analysis directly from the structure of a Markov model.
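The mean value analysis that these approximations adapt can be illustrated with the textbook exact MVA recursion for a single-class closed queueing network. This is only a minimal sketch of the underlying technique, not the layered algorithm used by the group's solvers; the function name `mva` and the station demands and think time in the usage lines are hypothetical.

```python
def mva(demands, think_time, n_customers):
    """Exact single-class MVA for a closed network of queueing stations.

    demands      -- service demand (seconds per job) at each station
    think_time   -- delay (seconds) a customer spends outside the stations
    n_customers  -- population of the closed network
    Returns (throughput, list of mean queue lengths per station).
    """
    queue = [0.0] * len(demands)
    throughput = 0.0
    for n in range(1, n_customers + 1):
        # Arrival theorem: an arriving customer sees the queue lengths
        # of the same network with one customer fewer.
        residence = [d * (1.0 + q) for d, q in zip(demands, queue)]
        # Little's law applied to the whole network.
        throughput = n / (think_time + sum(residence))
        # Little's law applied to each station.
        queue = [throughput * r for r in residence]
    return throughput, queue


# Hypothetical example: two stations (say a CPU and a disk), 20 users.
x, q = mva([0.05, 0.03], think_time=1.0, n_customers=20)
print("throughput:", x, "queue lengths:", q)
```

In a layered model a server's service time includes the time it waits for its own lower-level servers, so a recursion of this kind has to be applied across layers and iterated, which is where the approximations come in.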

LQNs have a great advantage over the competing models (Petri nets, Markov chains, timed process algebras) in that they scale up to large systems with dozens or hundreds of cooperating processes. However, the description of the program behaviour must be adapted to the layered software service paradigm. Extensive tools have been constructed.

1. SRVN has a graphical interface and solves models by Rendezvous Nets, Petri Nets, Task-Directed Aggregation, or Simulation.

2. MOL solves by MOL or simulation.

3. LQNS, combining the MVA features of SRVN and MOL.

4. lqndef

5. MultiSRVN generates one and two-parameter sensitivity studies for any of solvers 1, 2, 3.

6. LSG creates a synthetic set of tasks.

7. Performance Bounds using CLP.

Model-Building and Architecture Discovery

A process called TLC (Trace-based Load Characterization) is being developed to determine the service relationships in a system with many processes. It can analyze a design at many stages of development, from scenarios or Use Cases in requirements, to executable models, prototypes and full-scale implementations, using special causal "angio traces". Traces are being obtained from design tools, simulators, and UNIX and DCE-based systems. A layered system can be described by partitioning a network of objects. The resulting model can be solved as an LQN, and there is a bounding analysis which derives software concurrency requirements.

Component and System Measurements are often poorly documented and unrepeatable, and are seldom maintained for long-term use. A workbench has been designed to make the job of characterizing components by workload parameters, such as CPU usage, routine, reliable and easier, with a repository to keep data for later use. Also, using DECALS, another tool which loads, runs and monitors distributed system experiments, we have been measuring performance in different client-server architectures, including deeply layered systems. Similar measurements have also been undertaken to analyze distributed system midware overheads as workload elements (PMMC).

Interval-Based Performance Analysis

A conventional analytic model used for evaluating the performance of computer and communication systems accepts single values as model inputs and computes a single value for each performance measure of interest. However, uncertainties regarding parameter values exist in different situations, such as during early stages of system design. Although the clients in a system are statistically identical, data dependency and other factors can introduce variability in service demands for the devices in a system. We propose to associate intervals or ranges of values, and a probability of occurrence, with parameters of interest. The work on interval-based performance analysis is concerned with extending existing analytic models such as queueing networks to handle interval parameters.
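The idea of interval parameters can be sketched on the simplest possible model. The example below swaps in a single M/M/1 queue in place of a full queueing network, and assumes each service-time interval carries a probability of occurrence; the function names and the numbers are hypothetical and do not come from the group's work.

```python
def mm1_response(arrival_rate, service_time):
    """Mean response time of an M/M/1 queue: R = S / (1 - lambda * S)."""
    utilization = arrival_rate * service_time
    if utilization >= 1.0:
        raise ValueError("queue is unstable (utilization >= 1)")
    return service_time / (1.0 - utilization)


def interval_response(arrival_rate, service_intervals):
    """Map service-time intervals to response-time intervals.

    service_intervals -- list of ((low, high), probability) pairs
    Returns a list of ((r_low, r_high), probability) pairs.
    """
    results = []
    for (low, high), prob in service_intervals:
        # R increases monotonically with service time while the queue is
        # stable, so the interval endpoints map directly to the bounds.
        bounds = (mm1_response(arrival_rate, low),
                  mm1_response(arrival_rate, high))
        results.append((bounds, prob))
    return results


# Hypothetical early-design estimate: the service time is probably
# 50-100 ms, but may be as high as 150 ms.
estimate = [((0.05, 0.10), 0.7), ((0.10, 0.15), 0.3)]
for bounds, prob in interval_response(5.0, estimate):
    print(bounds, prob)
```

For a measure that is not monotonic in its parameters, evaluating the endpoints is not enough and the extreme values inside the interval have to be found some other way.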

Resource Management

The availability of multiple processors and tools for the development of concurrent application software is increasing the demand for parallel and distributed systems. Appropriate resource management strategies are crucial for harnessing the power of the underlying hardware. Research is underway in the following domains: processor management in shared memory and distributed memory systems, performance enhancement of client-server systems through server scheduling and multithreading, as well as management of parallel I/O.

Performance Measurement

We have found special problems in new kinds of software, due to concurrency and distribution. NICE (Notation for Interval Combinations and Events), described in Complex Performance Measurements with NICE, is a language (including a tool called Finale) which identifies complex subsequences of events and bases performance metrics on them. DECALS loads and controls experiments on a heterogeneous UNIX network, and monitors and collects events. It feeds data to Finale for estimation of metrics, or to XTG for browsing and visualization.
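As a rough illustration of basing a metric on a subsequence of events, the sketch below pairs a start event with its matching end event in a time-ordered trace and reports the interval between them. It is not NICE syntax and does not reflect how Finale works; the trace layout `(timestamp, name, key)` and the event names `send` and `reply` are assumptions made for the example.

```python
def interval_durations(events, start_name, end_name):
    """Durations between matching start and end events.

    events -- iterable of (timestamp, name, key) tuples in time order;
              key ties an end event to its start (e.g. a request id)
    Returns the durations in the order the end events occur.
    """
    open_starts = {}
    durations = []
    for timestamp, name, key in events:
        if name == start_name:
            open_starts[key] = timestamp
        elif name == end_name and key in open_starts:
            durations.append(timestamp - open_starts.pop(key))
    return durations


# Hypothetical trace with two overlapping requests.
trace = [
    (0.0, "send", 1),
    (0.5, "send", 2),
    (1.5, "reply", 1),
    (1.75, "reply", 2),
]
times = interval_durations(trace, "send", "reply")
print("mean response time:", sum(times) / len(times))
```

Matching on a key rather than on adjacency is what lets the intervals of concurrent requests overlap in the trace without being confused with one another.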

Funded Research Projects

Telecom Software Methods (TRIO)

This project investigates architectures for telecommunications software, and models for performance that mimic the architecture. The current goals are to characterize software components and their architectural assumptions and to predict performance of distributed systems using analytic models.

A central feature of the performance research is the layered queueing network model. It is being validated against various kinds of distributed software including synthetic task systems and systems using midware such as DCE and CORBA.

The project has produced several software tools including Layered Queueing Network Solver (ref. to Tools95), the X-windows Timeline Generator (ref ? xtg by Karam in BIB.bib) and the Parasol (ref to user guide?) simulator. Parasol is an extremely efficient simulator for multitasking software running on distributed platforms, which has been used to create a simulated development environment for the Alex parallel computer.

TimeBench and MachineCharts

This work is essentially completed. It includes the MachineCharts model for architectural description of concurrent software, as summarized in the book "Practical Visual Techniques in System Design", and a software tool called TimeBench. TimeBench includes a formal graphical definition of structure, definition of behaviour by state-machine submodels, and code generation in Ada and C. It is described by the TimeBench User Guide (ref this in biblio) and the software is available. More...

Design of Object Oriented Real-Time Systems (DOORS)

The goal of this project is to record intended behaviour and its relationship to the modules of a system design. Use Case Maps describe the sequences of responsibilities and activities. Methods for developing a software design from Use Case Maps are the current research focus. More...

Systems Performance Analysis with Concurrent Entities (SPACE)

To make predictive models of performance more practical for working software engineers, this project has developed a technique for building layered queueing models of distributed software systems from special traces called "Angio traces" (ref. to ANGIO95). The tracing capability has been installed in the ObjecTime CASE tool, in Unix and other execution environments.

Management of Distributed Applications Systems (MANDAS)

This project, involving six universities and the IBM Center for Advanced Studies, is developing an approach to managing distributed applications built on top of DCE midware. Management will include the capability to recognize performance problems and diagnose their causes. (ref to DSOM95 -- a survey paper on the project) More ...

Bibliography

Group Members


Real Time and Distributed Systems Group
Last modified: Mon Oct 6 12:45:41 EDT 1997