David L. Shang
Is Circle a subtype of Ellipse? Is Square a subtype of Rectangle? Is the subrange 1..10 a subtype of integer? Is WarmColor a subtype of Color? Some people say yes; some other people say no.
What is a subtype after all? The commonly accepted definition is:
This definition is not very clear. It does not say how a subtype value should be used. The usage may include a conversion of the subtype value: we can first convert a subtype value into a supertype value and then use the converted result in the context where a supertype value is expected. Under this interpretation, the subrange 1..10 is a subtype of integer because a value of 1..10 can always be converted into an integer. As a matter of fact, in languages such as Pascal and Ada, when a subrange value is used where an integer is expected, it is always converted into an integer.
How about a square? Can its value be used in a resize function which expects a rectangle? Yes, we can convert the square value into a rectangle value before applying the resize function.
Then, what is a subtype? Any convertible type can be a subtype! This is certainly not the definition we want. Can we convert a circle into a square? or a color into a three dimensional geometric point? Yes, we can. Conversion is so powerful that we can virtually convert a type into any type. So, can a type be a subtype of any type?
It is necessary to make a clear distinction between subtypes and convertible types. For subtypes,
Let us consider an example in pseudo Pascal:
In fact, subranges are types convertible to integer, but not subtypes of integer, because subranges have a more restrictive premise than an integer.
Subtype rules are not always required for substitutions. In many cases, we just use convertible type rules, which are more permissive and allow more substitutions.
In my last column, I introduced the concept of object name in the Transframe language. There are three different substitutions (I use the term name replacement):
The kind of substitution used depends on the kinds of names involved in the substitution. For example:
Only shared reference substitutions require subtype rules. For a shared reference substitution, it is possible to have two references, one is of the supertype, another is of the subtype, to share the same object. The object cannot be converted into a supertype object, otherwise the reference of subtype will refer to an invalid object.
The following figure shows when subtype rules are used and when convertible type rules are used.
Convertible rules are used in the white part of the above figure, which takes the majority. Subtype rules are used in the gray part. And the equivalent type rules are used in the dark gray part. The black part is a forbidden area where substitutions are not allowed.
The question really is: should a subclass always be a subtype? To answer the question, we first examine the case that makes a subclass not a subtype.
Cardelli and Wegner introduced bounded universal quantification, which was later followed by many research works. Cook, Hill, and Canning suggested F-bounded polymorphism, in which inheritance is characterized as an operation on a recursive definition. With this operational definition of inheritance, subclasses, or inherited types, are not always subtypes. They are merely new types gotten by augmenting and modifying the parent type. Bruce followed this work by introducing a series of statically-typed functional languages where type checking is decidable.
Let me put it in a simple way: a subclass is not a subtype exactly when a covariance is presented in the subclass. The purpose of allowing subclasses which are not subtypes of their parent is to enable more subclasses so that we can get more code reuse benefits from inheritance.
The type system in languages such as Sather and Cecil go further to clearly separate the subtype hierarchy and the class hierarchy (implementation inheritance), which makes the language more complex.
I have examined the case of covariance in my first column. When subclasses require covariance, there is usually an untrue promise defined in their superclass: the superclass fails in the presentation of a precise interface specification, thus subclasses must correct the specification by using covariance and then make a violation that disables them being subtypes.
What lacks in traditional type systems is the ability to describe type dependencies: the type of a method input depends on the type of the object that performs the object; the type of the member object depends on the type of the owner; the type of the output depends on the type of the inputs; and the type of following input depends on the type of previous input. In many cases, this dependency is not just a simple the-same-type-as relation, which is supported by the concept of "selfclass", "MyType" or "like current".
Even the discussion is focused on the simple type dependency case, there is a still a major difference between Transframe's approach and other approaches. The major difference is that whether a subclass should be a subtype if the subclass has a method that uses "selfclass" in its input interface. By Transframe, a subclass is always a subtype even when "selfclass" is used.
Let us consider the example given in Kim Bruce's recent paper, Typing in Object-Oriented Languages: Achieving Expressibility and Safety (the example is simplified and translated into Transframe):
What is the real problem? I don't believe the problem is to let DoubleNode be a subtype of Node. The problem is the function attach. The method attachRight declares that the object on the right must be the same type of the object on the left. But the interface of attach does not say that n1 and n2 are in the same type, therefore, the statement:
n1.attachRight(n2)is illegal. It does not conform to the interface of attachRight declared in the type of Node.
By Transframe, therefore, DoubleNode is a subtype of Node. The problem is not that attachRight uses "selfclass". The problem is the illegal usage of attachRight in the function attach. Such misusage should be prohibited by the compiler for static safety.
The function attach should declare the type dependency in its interface:
If we do want a function attach that takes two nodes of two different types, we should check the two nodes to see whether they are in the same type before we call the attachRight method:
The advantage of letting subclass be a subtype can be illustrated in the following way.
If DoubleNode is not a subtype of Node, the validity of a call to function attach(Node,Node) is very limited. The function can only accept nodes whose exact types are Node:
If DoubleNode is a subtype of Node, and the type dependency is specified in the function attach's interface (i.e. attach< NodeType > (NodeType,NodeType)), the function can accept more valid combinations:
If DoubleNode is not a subtype of Node, the validity of a call to function attach(Node,Node) will depend on whether the exact type of the actual parameters can be inferred by the compiler. This might be a big loss in practice for dynamic or heterogeneous programming. If type inference fails, the program fails:
If DoubleNode is a subtype of Node, and the function attach provides a heterogeneous interface (i.e. attach(Node,Node)), the above code will be acceptable, because any subclass of Node are considered subtypes. Therefore, to enable a polymorphism in heterogeneous data structure, DoubleNode must be a subtype of Node.
Transframe supports multiple interfaces. So the function attach can be written in the following way:
In general, B is a subclass of A if B inherits (directly or indirectly) A in B's declaration. This is the way to establish an explicit subclass relationship between two classes.
Transframe allows more subclasses, and hence more subtypes, by introducing implicit subclasses via class parameterization. As I mentioned in my first column, Transframe's paramerterized classes are real classes; so that subclasses can derived from paramerterized classes.
Transframe's subtype rules contain a implicit subclass rule that judges whether two parallel subclasses of a (parameterized) class are a subclass of the other. For the description of the rule, click here.
Here are some examples:
Convertible rules are more permissive than subtype rules. Transframe allows more convertible types by the following rules:
Subclass conversion rule: Let A and B be two concrete classes and B is an explicit subclass of A, an object of type B can be converted to an object of type A by object slicing: the extended part of the object by A's subclasses is truncated.
Explicit type conversion rule: if class A defines a conversion method explicitly, which converts an object of A or any subclass of A to a type B, then, an object of A or a subclass of A can be converted to an object of type B by this conversion method.
Standard conversion rules: they are defined by the Transframe language for automatic type conversions among primitive types such as integer to float, float to double, etc.
Tuple type conversion rule: it is used to convert one tuple object into a different tuple type. In general, the rule is applied recursively to each of its element. The formal description of the rule is described in the Transframe Reference. The rule considers type parameters, and is mainly used for function interface match check. For example:
The actual parameter of the function call is a tuple type:
((integer;float); (float;char); (char;integer))which is converted to the type:
< R=integer;S=float;T=char > ( (R;S); (S;T); (T;R) );which is a subtype of the interface type declared by the function foo.
For more examples, click here.
As shown in the following figure,
Transframe enables more subtypes:
Transframe allows more convertible types by including parameterized type conversion rules. Compared to other languages, Transframe uses convertible type rules instead of subtype rules for more name attachment situations.
Adding type dependency specification to the function interface, we can eliminate the necessity for global type inference, ensure error isolation, reduce unnecessary run-time type checks, and produce safer programs.