Chapter 2: Introduction to Analysis
The analysis phase of object-oriented software development is an activity aimed at clarifying the requirements of an intended software system, with:
The final item in this list distinguishes object-oriented analysis from other approaches, such as Structured Analysis [14] and Jackson's method [6].
Constructing a complex artifact is an error-prone process. The intangibility of the intermediate results in the development of a software product amplifies sensitivity to early errors. For example, Davis [4] reports the results of studies done in the early 1970s at GTE, TRW, and IBM regarding the costs of repairing errors made in the different phases of the life cycle. As the following summary table shows, there is about a factor of 30 between the cost of fixing an error during the requirements phase and fixing that error during acceptance testing, and a factor of 100 with respect to the maintenance phase. Given that maintenance can be a sizable fraction of software costs, getting the requirements correct should yield a substantial payoff.
Development Phase | Relative Cost of Repair
------------------|------------------------
Requirements      | 0.1 -- 0.2
Design            | 0.5
Coding            | 1
Unit test         | 2
Acceptance test   | 5
Maintenance       | 20
Further savings are indeed possible. Rather than being aimed at a particular target system, an analysis may attempt to understand a domain that is common to a family of systems to be developed. A domain analysis factors out what is otherwise repeated for each system in the family. Domain analysis lays the foundation for reuse across systems.
There are several common input scenarios, generally corresponding to the amount of ``homework'' done by the customer:
Across such scenarios we may classify inputs into four categories: functional requirements, resource constraints, performance constraints, and miscellaneous (auxiliary) requirements.
Not all these categories are present in all inputs. It is the task of the analyst to alert the customer when there are omissions.
As observed by Rumbaugh et al. [10], the input of a fuzzy target specification is liable to change due to the very nature of the analysis activity. Increased understanding of the task at hand may lead to deviations from the initial problem characterization. Feedback from the analyst to the initiating customer is therefore crucial. Failure to provide it leads to the following consideration [10]: if an analyst does exactly what the customer asked for, but the result does not meet the customer's real need, the analyst will be blamed anyway.
The output of an analysis for a single target system is, in a sense, the same as the input, and may be classified into the same categories. The main task of the analysis activity is to elaborate, to detail, and to fill in ``obvious'' omissions. Resource and miscellaneous requirements often pass right through, although these categories may be expanded as the result of new insights obtained during analysis.
The output of the analysis activity should be channeled in two directions. The client who provided the initial target specification is one recipient. The client should be convinced that the disambiguated specification describes the intended system faithfully and in sufficient detail. The analysis output might thus serve as the basis for a contractual agreement between the customer and a third party (the second recipient) that will design and implement the described system. Of course, especially for small projects, the client, user, analyst, designer, and implementor parties may overlap, and may even all be the same people.
An analyst must deal with the delicate question of the feasibility of the client's requirements. For example, a transportation system with the requirement that it provide interstellar travel times of the order of seconds is quite unambiguous, but its realization violates our common knowledge. Transposed into the realm of software systems, we should insist that desired behavior be plausibly implementable.
Unrealistic resource and/or performance constraints are clear reasons for nonrealizability. Less obvious reasons often hide in behavioral characterizations. Complex operations such as Understand, Deduce, Solve, Decide, Induce, Generalize, Induct, Abduct, and Infer are not recommended in a system description unless these notions correspond to well-defined concepts in a particular technical community.
Even if analysts accept in good faith the feasibility of the stated requirements, they certainly do not have the last word in this matter. Designers and implementors may come up with arguments that demonstrate infeasibility. System tests may show nonsatisfaction of requirements. When repeated fixes in the implementation and/or design do not solve the problem, backtracking becomes necessary in order to renegotiate the requirements. When the feasibility of requirements is suspect, prototyping of a relevant ``vertical slice'' is recommended. A mini-analysis and mini-design for such a prototype can prevent rampant throwaway prototyping.
Most attention in the analysis phase is given to an elaboration of functional requirements. This is performed via the construction of models describing objects, classes, relations, interactions, etc.
We quote from Alan Davis [4]:
A Software Requirements Specification is a document containing a complete description of what the software will do without describing how it will do it.
Subsequently, he argues that this what/how distinction is less obvious than it seems. He suggests that in analogy to the saying ``One person's floor is another person's ceiling'' we have ``One person's how is another person's what''. He gives examples of multiple what/how layers that connect all the way from user needs to the code.
We will follow a few steps of his reasoning using the single requirement from our ATM example that clients can obtain cash from any of their accounts.
The ability of clients to obtain cash is an example of functionality specified by the user.
The requirements already exclude a human intermediary. Thus, we can consider different techniques for human-machine interaction, for example, screen-and-keyboard interaction, touch-screen interaction, and audio and voice recognition. We can also consider different authentication techniques, such as PINs, iris analysis, and hand-line analysis. These considerations address the how dimension.
The suggestion to construct this set of all systems (and apply behavior abstraction?) strikes us as artificial for the present discussion, although an enumeration of techniques may be important to facilitate a physical design decision.
This is debatable and depends on the intended meaning of ``exact behavior''. If this refers to the mechanism of the intended system, then we subscribe to the quotation. However, it could also encompass the removal of ambiguity by elaborating on the description of the customer-ATM interaction. If so, we still reside in what-country. For example, we may want to make explicit that customer authentication precedes all transactions, that account selection is to be done by those customers who own multiple accounts, etc.
We may indeed be more specific by elaborating an interaction sequence, as in: ``A client can obtain cash from an ATM by doing the following things: obtaining proper access to the ATM, selecting one of his or her accounts when more than one is owned, selecting the cash dispense option, indicating the desired amount, and obtaining the delivered cash.'' We can go further by detailing what it means to obtain proper access to an ATM, by stipulating that a bank card has to be inserted and that a PIN has to be entered after the system has asked for it.
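To show how such an elaboration pins down the what without committing to a mechanism, the sequence can be rendered as a simple state machine. The following Java sketch is our own illustration; the class, state, and method names (WithdrawalSequence, selectDispenseOption, and so on) are invented for this example and are not part of the requirements.

    // Minimal sketch: the elaborated withdrawal interaction as a state net.
    // Every name here is hypothetical; only the ordering constraints matter.
    public class WithdrawalSequence {
        enum State { IDLE, CARD_INSERTED, AUTHENTICATED, ACCOUNT_SELECTED,
                     OPTION_SELECTED, AMOUNT_ENTERED, CASH_DELIVERED }

        private State state = State.IDLE;

        void insertCard()           { advance(State.IDLE, State.CARD_INSERTED); }
        void enterPin()             { advance(State.CARD_INSERTED, State.AUTHENTICATED); }
        void selectAccount()        { advance(State.AUTHENTICATED, State.ACCOUNT_SELECTED); }
        void selectDispenseOption() { advance(State.ACCOUNT_SELECTED, State.OPTION_SELECTED); }
        void enterAmount()          { advance(State.OPTION_SELECTED, State.AMOUNT_ENTERED); }
        void deliverCash()          { advance(State.AMOUNT_ENTERED, State.CASH_DELIVERED); }

        // Each step is legal only in the state the elaboration prescribes:
        // authentication precedes all transactions, account selection
        // precedes the dispense option, and so on.
        private void advance(State expected, State next) {
            if (state != expected)
                throw new IllegalStateException(state + " -> " + next);
            state = next;
        }
    }

Note that the sketch says nothing about how a real ATM realizes any step; it only fixes the externally observable ordering.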
Davis continues by arguing that one can define what these components do without describing how they work internally.
In spite of such attempts to blur how versus what, we feel that these labels still provide a good initial demarcation of the analysis versus the design phase.
On the other hand, analysis methods (and not only OO analysis methods) do have a how flavor. This is a general consequence of any modeling technique. Making a model of an intended system is a constructive affair. A model of the dynamic dimension of the intended system describes how that system behaves. However, analysts venture into how-country only to capture the intended externally observable behavior, while ignoring the mechanisms that realize this behavior.
The object-oriented paradigm puts another twist on this discussion. OO analysis models are grounded in object models that often retain their general form from analysis (via design) into an implementation. As a result, there is an illusion that what and how get blurred (or even should be blurred). We disagree with this fuzzification. It is favorable indeed that the transitions between analysis, design, and implementation are easier (as discussed in Chapter 15), but we do want to keep separate the different orientations inherent in analysis, design, and implementation activities.
We should also note that the use of models of any form is debatable. A model often contains spurious elements that are not strictly demanded to represent the requirements. The coherence and concreteness of a model and its resulting (mental) manipulability is, however, a distinct advantage.
OO analysis models center on objects. The definition of objects given in Chapter 1 is refined here for the analysis phase. The bird's eye view definition is that an object is a conceptual entity that:
Since we are staying away from solution construction in the analysis phase, the objects allowed in this stage are constrained. The output of the analysis should make sense to the customer of the system development activity. Thus we should insist that the objects correspond with customers' notions, and add:
Another ramification of avoiding solution construction pertains to the object's operator descriptions. We will stay away from procedural characterizations in favor of declarative ones.
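To illustrate the difference with a hypothetical example of our own (not taken from the requirements): a declarative characterization states what must hold before and after an operation, while remaining silent about the mechanism.

    // A declarative characterization: the effect of an operation is pinned
    // down by pre- and postconditions, with no procedural recipe.
    // All names are hypothetical.
    public interface Account {
        /**
         * Withdraw the given amount (in cents).
         *
         * Precondition:  amount > 0 and amount <= balance()
         * Postcondition: balance() == (balance() before the call) - amount
         *
         * How the debit is realized (ledger entries, locking, journaling)
         * is deliberately left to design and implementation.
         */
        void withdraw(long amount);

        long balance();
    }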
Some OO analysis methods have made the distinction between active and passive objects. For instance, Colbert [3] defines an object as active if it ``displays independent motive power'', while a passive object ``acts only under the motivation of an active object''.
We do not subscribe to these distinctions, at least at the analysis level. Our definition of objects makes them all active, as far as we can tell. The active versus passive distinction seems to be more relevant for the design phase (cf. Bailin [1]).
This notion of objects being active is motivated by the need to faithfully represent the autonomy of the entities in the ``world'', the domain of interest. For example, people, cars, accounts, banks, transactions, etc., all behave in a parallel, semi-independent fashion. Providing OO analysts with objects that have at least one thread of control gives them the means to stay close to a natural representation of the world. This should facilitate explanations of the analysis output to a customer. However, a price must be paid for this. Objects in the programming realm deviate from this computational model: they may share a single thread of control in a module. Consequently, bridging this gap is a major responsibility of the design phase.
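The contrast can be made concrete. Below is a minimal sketch, with invented names, of an analysis-style active object that owns a thread of control; a passive implementation object, by comparison, executes only on its caller's thread.

    // Sketch of an active object: it owns a thread and behaves
    // semi-independently, the way entities in the analysis ``world'' do.
    // The class name and its task are hypothetical.
    public class ActiveAccountMonitor implements Runnable {
        private final Thread thread = new Thread(this);
        private volatile boolean running = true;

        public void start() { thread.start(); }
        public void stop()  { running = false; thread.interrupt(); }

        @Override public void run() {
            while (running) {
                // autonomous behavior, e.g. periodically watching for overdrafts
                try { Thread.sleep(1000); } catch (InterruptedException e) { return; }
            }
        }
    }
    // A passive object, by contrast, has no thread of its own:
    // its methods execute on whatever thread happens to call them.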
A representation of a system is based on a core vocabulary. The foundation of this vocabulary includes both static and dynamic dimensions. Each of these dimensions complements the other. Something becomes significant as a result of how it behaves in interaction with other things, while it is distinguished from those other things by more or less static features. This distinction between static and dynamic dimensions is one of the axes that we use to distinguish the models used in analysis.
Our other main distinction refers to whether a model concentrates on a single object or whether interobject connections are addressed. The composition of these two axes gives us the following matrix:

        | inside object               | between objects
static  | attribute, constraint       | relationship, acquaintanceship
dynamic | state net and/or interface  | interaction and/or causal connection
Detailed treatments of the cells in this matrix are presented in the following chapters.
The static row includes a disguised version of entity-relationship (ER) modeling. ER modeling was initially developed for database design. Entities correspond to objects, and relations occur between objects.(1) Entities are described using attributes. Constraints capture limitations on attribute-value combinations. Acquaintanceships represent the static connections among interacting objects.
(1) The terms ``relation'' and ``relationship'' are generally interchangeable; ``relationship'' emphasizes the notion as a noun phrase.
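To give these static notions concrete shape, here is a small illustrative sketch (every name is invented) showing attributes, a constraint over attribute-value combinations, and a relationship between objects.

    // Illustrative static model: attributes, a constraint, a relationship.
    public class Account {
        // attributes
        private long balance;              // in cents
        private final long overdraftLimit; // in cents

        // relationship: every account is owned by some customer
        private final Customer owner;

        public Account(Customer owner, long overdraftLimit) {
            this.owner = owner;
            this.overdraftLimit = overdraftLimit;
            checkConstraint();
        }

        public void debit(long amount) {
            balance -= amount;
            checkConstraint();
        }

        // constraint: a limitation on attribute-value combinations
        private void checkConstraint() {
            if (balance < -overdraftLimit)
                throw new IllegalStateException("overdraft limit exceeded");
        }
    }

    class Customer { /* details elided */ }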
The dynamic row indicates that some form of state machinery is employed to describe the behavior of a prototypical element of a class. Multiple versions of causal connections capture the ``social'' behavior of objects.
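A toy rendering of such a causal connection, again with hypothetical names: a state transition in one object induces behavior in an acquainted peer.

    // Illustrative causal connection: completing a delivery in the
    // dispenser induces a logging action in an acquainted printer.
    import java.util.ArrayList;
    import java.util.List;

    class Dispenser {
        private final List<Printer> acquaintances = new ArrayList<>();

        void register(Printer p) { acquaintances.add(p); }

        void deliver(long amount) {
            // ... the dispenser's own state transition ...
            for (Printer p : acquaintances)
                p.logDelivery(amount);   // induced behavior in a peer object
        }
    }

    class Printer {
        void logDelivery(long amount) {
            System.out.println("delivered: " + amount + " cents");
        }
    }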
Inheritance impacts all four cells by specifying relationships among classes. Inheritance allows the construction of compact descriptions by factoring out commonalities.
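A minimal sketch of such factoring, with invented names: the attribute and operation that two kinds of account share move into a common superclass, so each subclass describes only its differences.

    // Inheritance factors out what checking and savings accounts share.
    abstract class BankAccount {
        protected long balance;                           // common attribute
        void deposit(long amount) { balance += amount; }  // common operation
    }

    class CheckingAccount extends BankAccount {
        long overdraftLimit;                              // specific to checking
    }

    class SavingsAccount extends BankAccount {
        double interestRate;                              // specific to savings
    }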
The four models form a core. Additional models are commonly added to give summary views and/or to highlight a particular perspective. The core models are usually represented in graphic notations. Summary models are subgraphs that suppress certain kinds of detail.
For instance, a summary model in the static realm may remove all attributes and relationship interconnections in a class graph to highlight inheritance structures. Alternatively, we may want to show everything associated with a certain class C, for example, its attributes, relationships in which C plays a role, and inheritance links in which C plays a role.
An example in the dynamic realm is a class interaction graph, where a link between two classes signifies that an instance of one class can connect in some way or another with an instance of the other. Different interaction mechanisms can give rise to various interaction summary graphs. Another model component can capture prototypical interaction sequences between a target system and its context. Jacobson [7] has labeled this component use cases; the elaborated customer-ATM withdrawal sequence given earlier in this chapter is a typical example. Use cases are discussed in Chapters 10 and 12.
All of these different viewpoints culminate in the construction of a model of the intended system as discussed in Chapter 10.
Several factors prevent analysis from being performed according to a fixed regime. The input to the analysis phase varies not only in completeness but also in precision. Backtracking to earlier phases is required to the extent that the input is incomplete or fuzzy. Problem size, team size, corporate culture, etc., will influence the analysis process as well.
After factoring out these sources of variation, we may still wonder whether there is an underlying ``algorithm'' for the analysis process. Investigation of current OO analysis methods reveals that they prescribe at most a weak ordering of activities: each method shows a bias as to which models to develop first, but none fixes a rigid recipe.
We have similarly adopted a weak bias. Our presentation belongs to the cluster of methods that focuses on the static dimension first and, after having established the static models, gives attention to the dynamic aspects. However, this position is mutable if and when necessary. For instance, in developing large systems, we need top-down functional decompositions to get a grip on manageable subsystems. Such a decomposition requires a preliminary investigation of the dynamic realm. Chapter 9 (Ensembles) discusses these issues in more detail. A prescribed sequence for analysis is given via an example in Chapter 10 (Constructing a System Model). A formalization of this ``algorithm'' is given in Chapter 12 (The Analysis Process).
Analysis provides a description of what a system will do. Recasting requirements in the (semi) formalism of analysis notations may reveal incompleteness, ambiguities, and contradictions. Consistent and complete analysis models enable early detection and repair of errors in the requirements before they become too costly to revise.
Inputs to analysis may be diverse but are categorizable along the dimensions of functionality, resource constraints, performance constraints, and auxiliary constraints.
Four different core models are used to describe the functional aspects of a target system. These core models correspond to the cells of a matrix whose axes are static versus dynamic and inside an object versus between objects.
Analysis is intrinsically non-algorithmic. In an initial iteration we prefer to address first the static dimension and subsequently the behavioral dimension. However, large systems need decompositions that rely on early attention to functional, behavioral aspects.
We witness an explosion of analysis methods. A recent comparative study [5] describes ten methods; a publication half a year later [9] lists seventeen. Major sources include Shlaer and Mellor [11,12], Booch [2], Rumbaugh et al. [10], Wirfs-Brock et al. [13], and Jacobson et al. [8].