Index
A list of animals in a file (animal_file) has been loaded into the memory, and stored in a heterogeneous array:
int number_of_animals; Animal * animals[]; ReadAnimals (animal_file, &number_of_animals, animals);The task is to sort these animals into several animal houses in which only the animals of the same type are put together. We have three regulations:
The first step is get the following information from the animal list:
- The house should be built in the right size that just fits to the exact number of a specific kind of animals;
- Animals should live directly in the house, or in other words, animals should not live in other places referred by pointers from this house; and finally,
- Byte-wise hacking method is prohibited.
- the number of different species of animals;
- the number of each animal species; and
- a representative animal for each species
This is an easy step. Suppose that we have put these information in the following variable:
int number_of_species; int number_of_each_species[]; Animal* representative[];The second step is to build number_of_species houses. It would be nice to design a generic house for all species, and specific houses for a particular species can be derived from the generic one easily.
That's not a difficult job, why don't we use templates or generic classes? Let's see:
template class AnimalHouse < class AnimalType, int size > { private: AnimalType animals[size]; public: // public method interface of the house };Yes, we've got a nice generic house that satisfies the three regulations we set. But how can we derive concrete houses from the generic one? Can we do:
1 AnimalHouse* houses[number_of_species]; 2 for (int i=0; i < number_of_species; i++) 3 houses[i] = 4 new AnimalHouse 5 < typeof(representative[i]), 6 number_of_each_species[i] 7 >;? Sadly, we find that C++ compiler won't accept our idea. The compiler complains:
line 1: You cannot use template class without giving the actual bindings to each class parameter! line 4: I don't know the exact AnimalType and the exact size, so I cannot build the house!Eiffel's parameterized classes and Ada's generic type will give us the same complaint. They have the following three limitations:
- Lack of Polymorphism. The traditional approach does not consider a parameterized class a real type. It can not be used to declare a polymorphic variable (pointer in C++), as shown at line 1 of the above example. No polymorphism is allowed based on the parameterization. That is, we cannot have a polymorphic animal house variable that can denote any concrete animal houses. In the example we give, it is impossible to declare an array of such houses. We must declare houses one by one, giving each a concrete animal type and the exact size, which is also impossible in this example.
- Lack of Dynamic Biding. Traditional parameterized classes are merely a syntactic expansion feature in language concept. The actual type or object used to bind a class parameter must be statically known by the compiler. A concrete animal house cannot be instantiated by binding a run-time value.
- Planning-Ahead Required. Using parameterized classes in declarations and object instantiations, we must present the exact type. This will require a plan-ahead: all subclasses of Animal must be known previously at the time we create animal houses. New subclasses cannot be created later without changing the existing program that is supposed for all species of animals.
Transframe unifies the concept of parameterized class into the concept of class. A parameterized class has no difference from a non-parameterized class in language concept. They are all real types that can be used to declare variables. This is another thread of demonstration of the beauty and the power of unification I have presented in my last column.
Any class can be parameterized. Class parameters are constrained so that the error of using a class parameter can be detected at the time the parameterized class is designed (Compared to C++, the error of using a class parameter is mostly detected at the time the template is used, sometimes after it is delivered to customers!)
Unlike traditional approaches, Transframe's parameterized class is a real class. The example of AnimalHouse written in Transframe is:
class AnimalHouse #(AnimalType: type of Animal; size: int) { private: animals: AnimalType[size]; public: // public method interface of the house };No keyword like template or generic is necessary because it is just a class.
Since it is a real class, subclass relation can be discussed among parameterized classes. A subclass inherits a class parameter by reconstraining or binding the class parameter. Assuming that we have an animal class hierarchy:
class Animal; class Carnivore is Animal; class Herbivore is Animal; class Tiger is Carnivore; class Cow is Herbivore;We can have the following subclass of AnimalHouse:class CarnivoreHouse is AnimalHouse #(AnimalType: type of Carnivore);The subclass CarnivoreHouse inherits the class parameter AnimalType by reconstraining it into Carnivore. The class parameter size is also inherited but not changed. Consider more subclasses:
class TigerHouse is CarnivoreHouse #(AnimalType=Tiger); class FiveTigerHouse is TigerHouse #(size=5);TigerHouse is subclass of CarnivoreHouse where the class parameter AnimalType is bound to Tiger; and FiveTigerHouse is subclass of TigerHouse where the class parameter size is bound to 5;
In C++, Eiffel or Ada95, the above inheritance structure among parameterized classes seems possible also, but the inheritance actually does not exist because it cannot show any subclass relationship. When parameterized classes are not classes, saying that one is a subclass of another is meaningless.
With Transframe's parameterized class, the following declaration is now possible:
houses: AnimalHouse...number_of_species;houses is declared as heterogeneous array which is similar to Java's array but type-safe (I'll have an explanation of this point later). The other type constructor as in AnimalType[size] creates an homogeneous array similar to C++'s array.The concrete type of the element is not necessarily known at the time the parameterized class AnimalHouse is used. Each element is a polymorphic variable referring to an house of a subclass of AnimalHouse, which might be a tiger house
houses[i] := FiveTigerHouse();or a cow house:houses[i] := AnimalHouse#(Cow,10)();or a house for a new species of animal created in the future.Class parameters are not necessarily bound to a thing known at compile-time. They can be bound by a run-time value. Therefore, Transframe will not complain over the following code:
for (int i=0; i < number_of_species; i++) houses[i] := AnimalHouse #( typeof(representative[i]), number_of_each_species[i] ) ();
In Java, arrays are instances of subclasses of class Object. If class A is a subclass of B then A[] is a subclass of B[]. This introduces the famous covariance problem and hence a hole in the type system (for detail, read my column in the January issue.) Consider:
Animal[] x; Cow[] cow_flock = new Cow[100]; x = cow_flock; // Cow[] is a subclass Animal[] x[i] = new Tiger; // oops, we put a tiger into the cow flock!The error cannot be captured at compile time. Java reports an error at runtime. As result, the system crashes at the point where an array element type exception is raised. The cow flock is destroyed just because of an error that we try to put a tiger into the cow flock. Who should be blamed? The person who write this piece of code? This might be unfair. If we tell people that the function put has an interface:put (Animal[], Animal);we should not blame on people who perform put(x, new Tiger()) by destroying the whole system. It would be nicer to prevent people from doing such a mistake rather than to blame them after the mistake is incurably done.According to Arthur van Hoff's post in Java's news group:
Recently Bill Joy, Guy Steel, David Ungar and Ole Madsen have been evaluating the Java language and they have pointed out that our current array type system may not be the optimal choice. As it turns we only allow co-variant argument types on array types and not on class types. We are working on a proposal which, in the future, allows us to express arrays as parameterized types without the requirement for co-variant arguments.
Yes, parameterized type is the right direction to follow. However, the traditional parameterization concept is not sufficient to support Java's notion of array, because it does not support dynamic binding and polymorphism on an array. If the class array is parameterized by ElementType and size:
class array #(ElementType: type of any; size: int);by the traditional approach, the following declaration is impossible:Animal[] x;because the class parameter size is not bound. Therefore, Animal[] is not a real type and cannot be used to declare a variable.It is is still impossible to create an animal array when the size of the array is given but not evaluative:
x = new Animal[number];because number is a variable whose value cannot be evaluated at compile-time.Even the size given is statically evaluative, the array is still not the right one equivalent to a Java's array:
Animal[100] x;because the element type of the array is bound to Animal; Each element shall be exactly in the type of Animal. Subtypes are not allowed. The array shall not be a Java's heterogeneous array.With Transframe's dynamic parameterization, an expression of arrays as parameterized types is now possible. As a matter of fact, array is defined as a parameterized class in Transframe.
Transframe provides tree built-in structural classes: array, cluster, and tuple.
array is similar to C++'s array in which element of array must have the same type. It defines a homogeneous collection and provides an efficient way to group a number of objects of the same type:
class array #(exact ElementType: type of any; size: int) is structural;The keyword exact suggests that the class parameter ElementType is used as a sealed type: when the class parameter is bound to a class, say Animal, the argument having the type ElementType will be exactly the type of the binding class, i.e. Animal. Therefore, by declaration:x: array#(ElementType=Animal);the type of x[i] is exactly the type of Animal.With the same efficiency, Transframe's array is more powerful than C++'s array. We can declare a polymorphic array variable:
x: array#(ElementType: type of Animal);and then bind the variable to a concrete array later:x:= array#(ElementType=Cow; size=100);With the polymorphic array variable, we can perform operations defined in the Animal class without knowing the concrete type of the element.By C++'s array, the element type is always bound at the time you declare an array variable. Polymorphism on array as whole is not supported.
cluster is similar to Java's array in which element of array may have different types. It defines a heterogeneous collection which is more flexible than C++'s array but has an efficiency penalty on both space and speed. Definitions of cluster and array are almost identical except the class parameter ElementType:
class cluster #(ElementType: type of any; size: int) is structural;The keyword exact is not presented, which suggests that the class parameter ElementType is used as a non-sealed type: when the class parameter is bound to a class, say Animal, the argument having the type ElementType will be the type of the binding class or any subtype of the binding class, i.e. Animal, or any subtype of Animal such as Tiger and Cow. Therefore, by declaration:x: cluster#(ElementType=Animal);the type of x[i] is a subtype of Animal.With the same flexibility, Transframe's cluster is safer than Java's array. Consider:
x: cluster#(ElementType: type of Animal); cow_flock = cluster#(ElementType=Cow, size=100)(); x := cow_flock; x[i] := Tiger();The last assignment is illegal by Transframe. By definition, the type of x[i] is typeof(x).ElementType, which is a subtype of Animal, and Tiger is also a subtype of Animal. They are subtypes in parallel but have no subtype relation between themselves. Therefore, a type assurance statement must be used:assume(y=x[i] is super Tiger) y := Tiger();We can put a tiger into x only if the element type of x is a super type of Tiger.Though run-time type check is also used by Transframe, it is used in a different way. Transframe uses type assurance statement for prevention. Type errors are always prevented at compile-time. No run-time type exceptions are issued. A compile program is guaranteed type-safe. While Java uses run-time type checks for recovery, and unfortunately, a desperate recovery with no cure.
Transframe's cluster is even a more flexible feature than Java's array. The element type of Java's array is always unbound (which is the opposite extreme to C++ array where element type is always bound), therefore, run-time type check is always necessary as long as the class may have subclasses. The element type of a Transframe's cluster can be bound if we desire so:
x: cluster#(ElementType=Animal);then,x[i] := Tiger();should be a valid assignment. Run-time type check is not necessary.While C++'s array and Java's array are two extremes, Transframe's array and cluster cover a complete spectrum between the two extremes.
Transframe offers syntactic sugar for arrays and clusters so that you can declare arrays and clusters in the simple way. Examples are:
[] == array [100] == array#(size=100) Animal[] == array#(ElementType: type of Animal) array of Animal == Animal[] Tiger[] == array#(ElementType=Tiger), when Tiger is a sealed class array of Tiger == Tiger[] Tiger[8] == array#(ElementType=Tiger; size=8) ... == cluster ...100 == cluster#(size=100) Animal... == cluster#(ElementType: type of Animal) cluster of Animal == Animal... Tiger... == cluster#(ElementType=Tiger), when Tiger is non-abstract cluster of Tiger == Tiger... Tiger...8 == cluster#(ElementType=Tiger; size=8)
There are lots of different definitions of dynamic typing. For clarity, I list my definitions as below:
- Dynamically Typed (con: statically typed). An argument (variable, reference, or formal parameter) is dynamically typed if the type of the object to which the argument can be bound cannot be determined statically (at compile or link time). Examples of dynamically typed arguments are:
- Untyped without static type inference: the type is not given before run-time. It can be bound to an object of any type at run-time. This is an extreme case of the dynamically typed. The example is a SmallTalk variable. Note that a variable in ML is also untyped, but the exact type of its bound object is inferred (guaranteed!) before run-time. Therefore, an ML variable, though untyped at the time of declaration, but typed at the time before running. It is not dynamically typed by my definition.
- Polymorphically typed: the type is given at the time the argument is declared. It can be bound to an object of any subtype of the given type at run-time. A Java's variable (non-primitive) falls into this case.
- Tagged Union: a set of possible types are given at the time the argument is declared. It can be bound to one of the given types at run-time.
- Dynamic Typing (con: static typing). A type system is a dynamic typing system if new types can be defined/created at run-time. CLOS and Dylan support dynamic typing while C++, Java, Eiffel, and Ada95 are static typing languages. In a dynamic typing system, the class hierarchy can be dynamically expanded by creating new subclasses at the time the programming is running. But the class hierarchy is fixed at runtime in a static typing system.
- Dynamic Type (con: static type). A type is a dynamic type if its definition can be changed after its creation. Therefore, we can add/delete methods or slots in a dynamic type at run-time. A static type is fixed since it is created. A simulation of dynamic type by dynamic typing system can be done by considering a new type is created whenever the existing type changes.
Transframe supports dynamic typing, but in a different way than CLOS and Dylan. Transframe's dynamic typing system is as efficient as C++, Java, and Eiffel, and even safer than traditional static typing systems.
Transframe supports dynamic typing by dynamic parameterization. New subclasses can be created by binding class parameters at run-time.
Let's assuming that we are writing a server with a generic framework F#(X). For each new type T from the client, the sever must create a new framework (i.e. a new type F#(T)) based on F#(X). Without the support of dynamic parameterization, the responsibility of the creation of F#(T) must be shifted to the clients. If the framework should only be used within the server, such exposure not only makes the client application difficult, but also endangers the server.
Class cannot be changed (static type) by language (it can be changed when it is melted into the dynamic development environment, see my previous column). However, classes can be created at run time by using run-time value. This offers an alternative way for the dynamic change: whenever an existing class is required to be changed, we can always create a new class at run-time to replace the existing class. A simple example is to resize an array. An array is an object that has a fixed number of elements in the same type. Therefore, the number of elements of a concrete array class is fixed. When an array object needs to be resized, we can create a new array class and assign this class to the array object:
function ArrayResize (old_array: array; new_size: int) { new_array: array= new ARRAY #(typeof(old_array).ElementType, new_size); num: int = min(old_size, typeof(old_array).size); for (int i=0; i < num; i++) new_array[i]=old_array[i]; old_array = new_array; }where new_array is a polymorphic variable whose class is created dynamically according to the function input parameters. As result, the reference old_array refers to a new array with a new type. The old array may be garbage collected later if it becomes dangling.Compared to other dynamic typing system, dynamic parameterization is an efficient mechanism. The implementation of dynamic parameterization is just an indirect access, which is very similar to the implementation of a C++ virtual function.
Unlike other dynamic typing system, dynamic parameterization does not require more run-time checks than a static system requires. On the contrary, it reduces the chance of run-time checks in case of type dependencies should be specified in an interface.
Unlike Eiffel, Transframe does not require run-time type check to eliminate holes ( a hole is defined as a leak in the specification through which an argument can pass the check but the type of the argument is actually wrong) in the supertype's interface, instead, it requires run-time type check merely for flexibility in heterogeneous programming, which is exactly the case forbidden by some other approaches.
Class parameters can be used to express the type dependency. And thanks to the type dependency check via type assurance, a compiled Transframe program will never report an run-time type error.
Eiffel allows fields/methods in subclasses to be redefined by subtypes, which have a more flexible type system than that of C++. Eiffel allows covariant changes to types of instance variables. As a result, it is not statically type-safe. A system-level type check is required and this works only in a close-world assumption. It seems that Eiffel's newly proposed type checking rule is too conservative to include an useful usage of a covariance. Transframe considers that the requirement of covariance in subclass is actually a requirement of type dependency in superclass. Let's consider an example:
class Vehicle; class Limo is Vehicle; class Airplane is Vehicle; class Operator { function operate(Vehicle); }; class Chauffeur is Operator { function operate(Limo); }; class Pilot is Operator { function operate(Airplane); };The operate function interface defined in the superclass Operator fails to reflect the fact that a concrete operator cannot drive any type of vehicle. That is, the type of the vehicle that an operator can drive depends on the type of the operator. This imprecision causes subclasses of Operator unable to inherit the interface and they have to correct the interface by covariant redefinition. Parameterized classes in Transframe now have a subtype relation, which makes it possible to rectify the operate interface as follows:class Operator#(VehicleType: type of Vehicle){ function operate(VehicleType); }; class Chauffeur is Operator#(Limo); class Pilot is Operator#(Airplane);According to this interface, an operator can drive a vehicle only if the type of the vehicle is the right type of the operator type. Let's examine the following three cases:Therefore, Transframe requires fewer run-time type checks than Eiffel and other approaches in cases where type dependency is specified.
- For variables whose static types can be statically inferred, the compiler can determine whether an operator can drive an vehicle:
x: Chauffeur; y: Limo; x.operate(y); // correct- For variables whose static types cannot be inferred statically, but the dependency is specified, the compiler can still determine whether it is a valid combination. Consider an open interface where the dependency is specified:
< OperatorType: Operator > ( x: OperatorType; y: OperatorType.VehicleType ) { x.operate(y); } // correctThe implementation does not need run-time type check though the actual types of x and y are unknown.- Only for variables whose static types cannot be inferred and the dependency is unknown either, a runtime type check is required:
x: Operator; y: Vehicle; assume (y is x#.VehicleType) x.operate(y); // correct