A Theory of Objects

Prologue

Procedural languages are generally well understood; their constructs are by now standard, and their formal underpinnings are solid. The fundamental features of these languages have been distilled into formalisms that prove useful in identifying and explaining issues of implementation, static analysis, semantics, and verification.

An analogous understanding has not yet emerged for object-oriented languages. There is no widespread agreement on a collection of basic constructs and on their properties. Consequently, practical object-oriented languages often support disparate features and programming techniques with little concern for orthogonality. This situation might improve if we had a better understanding of the foundations of object-oriented languages.

Various calculi of functions (-calculi) have been used as foundations for procedural languages. It is natural to attempt to use those same calculi for object-oriented languages. Unfortunately, there always seems to be a mismatch when one tries to model objects with functions. This mismatch is manageable for untyped languages, but becomes a serious obstacle when one attempts to explain typed object-oriented languages in terms of typed function calculi.

In this book, we take a different approach. Instead of taking functions as primitive and struggling with complex encodings of objects as functions, we take objects as primitive and concentrate on the intrinsic rules that objects should obey. We introduce object calculi and develop a theory of objects around them. These object calculi are as simple as function calculi, but represent objects directly.

Our theory of objects clarifies the general principles of object-oriented languages. Understanding those general principles can be helpful to those studying or designing programming languages. The theory suggests how to interpret existing constructions, and how to create and assess new ones. It also provides tools for reasoning about languages (existing or new), in particular for soundness proofs.

Using object calculi, we explain both the semantics of objects and their typing rules. We account for a range of object-oriented concepts, such as self, dynamic dispatch, classes, inheritance, prototyping, subtyping, covariance and contravariance, and method specialization. Because of the compactness of object calculi, we hope that they will serve as a convenient basis for further research on specification, verification, and analysis of object-oriented programs.

The book begins with an informal review of object-oriented concepts, where we aim to describe and categorize the fundamental features of object-oriented languages. The main distinction we make is between class-based and object-based languages. Object-based languages simplify and generalize class-based languages by reducing classes to more primitive notions. Our object calculi are loosely modeled after object-based languages, but are even more primitive.

After the review, the main body of the book is divided into three parts where we discuss semantic, typing, and programming structures of increasing sophistication. At the end of each of the three parts, we present an object-oriented programming language that embodies the main features considered in that part. Given some familiarity with our notation, these languages should be understandable independently of the more formal material. For each language, we give an interpretation into one of our object calculi, establishing the correctness of the language's type system.

In Part I of the book we begin the development of our theory by introducing a tiny untyped object calculus. This is a calculus where the only entities are objects and object operations, with a built-in notion of self. In a sense, this calculus embodies the Smalltalk credo that anything can be modeled as an object. We go even further than Smalltalk by showing that objects can represent functions and numbers, and that classes can be programmed from objects. Several important variants of classes arise naturally.

After studying the untyped object calculus, we associate various type structures with it. We consider, in particular, object types, recursive types, subtyping, and dynamic types. These basic type constructions give us a type system rich enough to interpret a first object-oriented language. This language is, in rough terms, similar to languages such as Simula and Modula-3. Although lacking much of the convenience of common languages, it is remarkably expressive in that it integrates class-based and object-based notions.

In Parts II and III we move beyond basic type constructions and we investigate the Self type (the type of self). This topic brings us into the realm of polymorphism, data abstraction, and type operators. We review standard constructions that type theory provides to model these concepts, and incorporate them into our object calculi. We also give typing rules for the Self type and prove their correctness, in the context of both calculi and languages.

The Self type often occurs only as the type of the results of methods. We show that, in this special case, it can be understood through a combination of data abstraction and type recursion, and that method inheritance requires polymorphism. The general case where the Self type may occur in argument positions is also important; it arises in the presence of binary methods, which take arguments of the type of self. The programming and typing constructions related to binary methods are still in flux. Nonetheless, we show how to analyze them using fairly standard constructions from type theory. We conclude Parts II and III with two object-oriented languages that incorporate the Self type, and we discuss their subtle rules.

At the beginning of this Prologue we remarked on a mismatch between objects and functions. In the course of the book we develop a number of ideas and techniques that enable us to assess that mismatch more precisely. Specifically, we consider the problem of translating typed object calculi into typed function calculi. This is, in a sense, the problem of programming in object-oriented style within a typed procedural language. We show that this is possible in principle, given a sufficiently expressive procedural language, but we argue that this would be too inconvenient in practice. Therefore we find that the expressiveness of object-oriented languages cannot be emulated easily by procedural languages.

Luca Cardelli