Philistinism!-
We have not the expression...
Perhaps we have not the word
because we have so much of the thing
- Matthew Arnold

Much of the work in programming is done by writing expressions: meaningful combinations of symbols that accurately convey the writer's intentions to the software community.
We have so much of the thing that we always feel short of expressions.
Expressions provided by a typical programming language are:
- Literals (integers, character strings, etc.)
- Names (variables, references, parameters, etc.)
- Function calls
- Expressions with Operators
The first three are primitive expressions that contribute little to the expressiveness of a language. Language designers have spent a great deal of effort in designing the syntactic structure of expressions with operators.
C++ is perhaps the language having the most sophisticated operator-oriented expressions. It has 46 operators. Each operator has a precedence level, a rule of associativity, and a specific format for the construction of an expression. 42 operators in C++ can be overloaded to take a different type of input.
However, C++ as well as other existing languages have the following limitations in expressions with operators:
- Fixed format. The format of an operator is fixed. An operator can be overloaded, but its format cannot be changed. For example, a binary operator must always be used as a binary operator. Thus, if operator "<" is defined as a member function within a class, the operator must have exactly one input parameter.
- Limited kinds of formats. Kinds of formats for operators are limited to unary, binary and a few specific formats as in function call operator and subscripting operator.
- No user-defined operators. All operators are built-in. User-defined operators are not supported.
Transframe has a rich set of operators that provide access to most of the operations provided by the underlying platform and required by most applications. For any C++ expression, there is a similar and better expression in Transframe. Moreover, Transframe's expressions are much richer than C++'s, for the following reasons:
The design of Transframe's expressions is orthogonal. Diversity and flexibility are obtained by unification, which is the design philosophy of the language. Instead of providing more and more operators and operator formats, Transframe treats all operators by a set of common rules.
- Flexible format. The format of an operator is not fixed. For example, a binary operator can later be used as a unary operator. Thus, if operator "<" is defined as a member function within a class, the operator may take no input or any number of inputs.
- Unlimited kinds of formats. The format of an operator is defined in the interface specification in whatever way the user likes. The expression is constructed based on the structure defined by the user.
- User-defined operators. There are built-in operators with assigned precedence and associativity. Users can always add new operators.
An operator is a specific class whose name is an operator name. An operator name is composed of one or more identifiers or operator identifiers. The first identifier or operator identifier is called the operator name head and is prefixed with the keyword operator or noperator. Examples of operator names are:
    operator .
    operator <<=
    operator [ ]
    operator set to
    operator select from
    operator perform on catch

An operator interface type is specified by a number of tuple-type sections interleaved with the parts of an operator name. Only the operator name head is significant. The remaining parts are operator name suffixes, which are used to separate the actual parameters presented in an input list for object instantiations.
Examples of operator names with interface declarations:
    operator . (this)
    operator <<= (integer)
    operator [ (index: integer) ] : ElementType
    operator set (char[]) to (char[])
    operator select (name: char[]) from (Database): Object
    operator perform (a:Action) on (s:Site) catch (elst:Event...)

The expression for object instantiation from this class shall be written in a specific way determined by the specification of the operator's interface.
Let us consider an operator interface:
    class Robot {
    public:
        function operator bring (obj:any) to (who: Person) from (where: Room);
    };

The operator name "bring to from" has three parts, and "bring" is the operator name head. The expression that creates a "bring" activity within a robot (in other words, that calls the member function "bring" of a robot) is written exactly in the format defined in the interface:
    robot: Robot;
    robot bring projector to me from southwest_conference_room;

Consider another example:
    class Matrix {
    protected:
        elements: int[][];
    public:
        function operator [ (int, int) ]: pointer of int;
    };

    m: Matrix;
    m[4,5] := m[5,4];

An operator prefixed by the keyword noperator is a name operator. A name operator requires the performing object to be expressed by a name expression. For example:
    class int is numeric {
    public:
        function noperator ++(): int;
    };

    n: int;
    n++;    // correct, because "n" is a name expression
    3++;    // wrong, because "3" is not a name expression

When an operator is a member class, the first part of the operator input interface is not given and is always the object of the enclosing class. For example:
    class foo {
        function operator $ (other: selfclass);
    }

The actual interface type of the operator is:

    (this: selfclass) $ (other: selfclass)

If an operator name is to be placed in front of the performing object in an expression, a prefix interface must be used. A prefix interface is obtained by inserting the keyword prefix at the front of the interface. An object instantiation expression using a prefix interface is a prefix expression. For example:
    class numeric {
        class operator - is function
            prefix (): selfclass,
            (selfclass): selfclass;
    };

The operator has two interfaces. The first is a prefix interface. The expression:

    - x

uses the prefix interface and hence is a prefix expression.
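To make the role of the operator name parts concrete, here is a minimal Python sketch of how an expression such as the robot's "bring ... to ... from" call could be split into its performing object and its arguments. It is not Transframe's implementation: the function name `split_mixfix_call` and the whitespace tokenization are assumptions for illustration, and the sketch ignores the declared interface types that a real compiler would check.

```python
def split_mixfix_call(name_parts, tokens):
    """Split a token list into (receiver, args) using the operator name parts.

    name_parts: e.g. ["bring", "to", "from"]; the first entry is the name head.
    tokens:     the expression, already split into tokens.
    Hypothetical helper; type checking against the interface is omitted.
    """
    head = name_parts[0]
    if head not in tokens:
        raise ValueError(f"operator name head {head!r} not found")
    head_pos = tokens.index(head)
    receiver = tokens[:head_pos]          # everything before the name head
    args = []
    pos = head_pos + 1
    for suffix in name_parts[1:]:
        # Each name suffix closes the argument that precedes it.
        try:
            end = tokens.index(suffix, pos)
        except ValueError:
            raise ValueError(f"operator name suffix {suffix!r} not found") from None
        args.append(tokens[pos:end])
        pos = end + 1
    args.append(tokens[pos:])             # the last argument runs to the end
    return receiver, args
```

Applied to the tokens of `robot bring projector to me from southwest_conference_room`, the name suffixes "to" and "from" act purely as separators between the three actual parameters, which is the role the text assigns to them.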
Each operator has a precedence level and a rule of associativity. When an operand could be grouped with either of two operators, the operand goes with the operator that has the higher precedence. If the two operators have the same precedence, the operand is grouped with the right operator if the operators are right-associative, and with the left operator if the operators are left-associative. Operators in the same group have the same associativity.
What follows is a list of Transframe's built-in operators in order from the highest to the lowest precedence. Operators with the same precedence are grouped together. Items listed in the first group are used for primitive expressions; they are not operators, but they have the highest precedence in the composition of an expression.
    Operator  Description                          Associativity

    E0: Precedence 13 (Primitive Expression)
              literal
              name
              string concatenation
    ( )       tuple object composition brackets
    ::        global scope selectors
    ::        scope selectors

    E1: Precedence 12 (Terms, or Simple Expressions)
    [ ]       array subscript                      left
    .         member selector (by object)          left
              object instantiation (empty name)    left

    E2: Precedence 11 (Prefix Operators)
    ++        prefix increment                     right
    --        prefix decrement                     right
    +         arithmetic positive                  right
    -         arithmetic negative                  right
    &         reference of                         right
    #         type of                              right
    ~         bitwise complement                   right
    !         logical NOT                          right

    E3: Precedence 10 (Incremental Expression)
    ++        post increment                       left
    --        post decrement                       left

    E4: Precedence 9 (Multiplicative Expression)
    *         multiplication                       left
    /         division                             left
    %         remainder                            left
    &         bitwise AND                          left

    E5: Precedence 8 (Additive Expression)
    +         arithmetic addition                  left
    -         arithmetic subtraction               left
    |         bitwise OR                           left
    ^         bitwise exclusive OR                 left

    E6: Precedence 7 (Shift Expression)
    <<        left shift                           left
    >>        right shift                          left
    <+        left rotate shift                    left
    +>        right rotate shift                   left

    E7: Precedence 6 (Relational Expression)
    <         less than                            left
    >         greater than                         left
    <=        less than or equal                   left
    >=        greater than or equal                left
    =         equal                                left
    =/        not equal                            left

    E8: Precedence 5 (Logic AND Expression)
    &&        logical AND                          left

    E9: Precedence 4 (Logic OR Expression)
    ||        logical OR                           left

    E10: Precedence 3 (Conditional Expression)
    ?         conditional                          left

    E11: Precedence 2 (Reserved for User Defined Operators)
              user-defined operators               left

    E12: Precedence 1 (Assignment Expressions, Expressions)
    :=        assignment                           right
    *=        multiplication assignment            right
    /=        division assignment                  right
    %=        modulus assignment                   right
    +=        addition assignment                  right
    -=        subtraction assignment               right
    <<=       left-shift assignment                right
    >>=       right-shift assignment               right
    <+=       left-rotate assignment               right
    +>=       right-rotate assignment              right
    &=        bitwise AND assignment               right
    |=        bitwise OR assignment                right
    ^=        bitwise exclusive OR assignment      right

    E13: Precedence 0 (Expression List)
    ,         sequential evaluation                left

There are 14 precedence groups. Each group is assigned a number, as shown in the table above. All user-defined postfix operators have precedence 2. All user-defined prefix operators have precedence 11. However, if a user-defined operator uses the same name as one of the built-in operators, it takes the built-in operator's precedence.
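The grouping rule for precedence and associativity can be sketched with a small precedence-climbing parser. This is an illustrative Python sketch, not Transframe's parser: the names `BINARY_OPS` and `parse` are hypothetical, only a handful of binary operators from the table are included, operands are plain names or literals, and parentheses and prefix operators are left out.

```python
# (precedence, associativity) for a few binary operators from the table above.
BINARY_OPS = {
    "*": (9, "left"), "/": (9, "left"), "%": (9, "left"),
    "+": (8, "left"), "-": (8, "left"),
    "<<": (7, "left"), ">>": (7, "left"),
    "<": (6, "left"), ">": (6, "left"),
    "&&": (5, "left"), "||": (4, "left"),
    ":=": (1, "right"),
}

def parse(tokens):
    """Parse a flat token list into nested (op, left, right) tuples."""
    tree, pos = _parse_expr(tokens, 0, 0)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree

def _parse_expr(tokens, pos, min_prec):
    if pos >= len(tokens) or tokens[pos] in BINARY_OPS:
        raise SyntaxError("operand expected")
    left = tokens[pos]          # operands are plain names or literals here
    pos += 1
    while pos < len(tokens) and tokens[pos] in BINARY_OPS:
        op = tokens[pos]
        prec, assoc = BINARY_OPS[op]
        if prec < min_prec:
            break
        # A left-associative operator must not absorb another operator of the
        # same precedence on its right; a right-associative one may.
        next_min = prec + 1 if assoc == "left" else prec
        right, pos = _parse_expr(tokens, pos + 1, next_min)
        left = (op, left, right)
    return left, pos
```

With these entries, `a - b - c` groups as `(a - b) - c` because "-" is left-associative, while `a := b := c` groups as `a := (b := c)` because ":=" is right-associative. A user-defined operator could be added in this sketch simply by inserting it into the table with precedence 2.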
There are three kinds of primitive expressions: names, literal constants, and tuple expressions.
    PrimitiveExpression:
        Name
        Literal
        Tuple
    Name:
        self
        selfclass
        Identifier
        OperatorIdentifier
        ScopeQualifier Name
    ScopeQualifier:
        ::
        ClassScopeQualifier
    ClassScopeQualifier:
        ClassName::
        ClassName::ClassScopeQualifier
    ClassName:
        Identifier
        OperatorIdentifier
    Tuple:
        ( ExpressionList )

An Identifier or an OperatorIdentifier is a name or part of the operator name given in a declaration. The value of the identifier is the object that the name represents. If it is a type-exact name, the exact type of the object is determined by the declaration. If it is a polymorphic name, the exact type is determined by its current value, and the declaration gives the face type, which is the superclass of all the possible exact types of the objects that the name can represent.
Let X be an expression consisting of an identifier, R be its storage class, and T be its object type. If R is a static storage class, the type of X is T. If R is a dynamic storage class, the value and the type of the expression depend on where the expression appears:

- When the expression appears where a name expression is expected, for example on the left side of an assignment, the value of the expression is a reference to the object attached to the identifier, and the type of the expression is R#(T).
- When the expression appears where an object (non-name) expression is expected, for example on the right side of an assignment, the value of the expression is the object attached to the identifier, and the type of the expression is T.

Let us reconsider the example:

    class Matrix {
    protected:
        elements: int[][];
    public:
        function operator [ (int, int) ]: pointer of int;
    };

    m: Matrix;
    m[4,5] := m[5,4];

The operator "[]" returns a pointer to int, so the expression "m[i,j]" has a dynamic storage class. The expression "m[4,5]" appears on the left side of ":=", where a name expression is expected (because ":=" is defined as a name operator). Therefore, the type of the expression is a pointer to integer. The expression "m[5,4]" appears on the right side of the assignment operator, where an object expression (integer) is expected; therefore, the result of the expression is an integer.

Note that "pointer" is a low-level object reference and should be used only for system programming or where memory efficiency is critical. For details about object references, refer to Access to Objects.
An identifier declared as a constant cannot appear where a name expression is expected, for example on the left side of an assignment operator. Nor can the reference-of (&) operator be applied to it.
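The distinction between name context and object context can be modelled in a few lines. The following Python sketch is only an analogy under stated assumptions: the classes `Ref` and `Env` and the context strings "name" and "object" are invented for illustration, and storage classes are not modelled at all.

```python
class Ref:
    """A reference to the object attached to an identifier."""
    def __init__(self, env, ident):
        self.env, self.ident = env, ident

    def get(self):
        return self.env.values[self.ident]

    def set(self, value):
        self.env.values[self.ident] = value


class Env:
    """Hypothetical environment mapping identifiers to objects."""
    def __init__(self):
        self.values = {}
        self.constants = set()

    def declare(self, ident, value, constant=False):
        self.values[ident] = value
        if constant:
            self.constants.add(ident)

    def eval_ident(self, ident, context):
        """Evaluate an identifier in a 'name' or an 'object' context."""
        if context == "name":
            if ident in self.constants:
                raise TypeError(f"constant {ident!r} cannot be used as a name expression")
            return Ref(self, ident)      # the value is a reference to the object
        if context == "object":
            return self.values[ident]    # the value is the object itself
        raise ValueError(f"unknown context {context!r}")

    def assign(self, target, source):
        """target := source. Left side: name context; right side: object context."""
        self.eval_ident(target, "name").set(self.eval_ident(source, "object"))
```

In this model a constant may still be read on the right side of an assignment, but using it on the left side is rejected, which mirrors the rule stated above.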
Simple expressions are composed of primitive expressions and operators with the highest precedence.
    SimpleExpression:
        PrimitiveExpression
        SubscriptExpression
        ObjectInstantiation
        MemberExpression
    SubscriptExpression:
        SimpleExpression [ ExpressionList ]
    ObjectInstantiation:
        SimpleExpression Expression_opt
    MemberExpression:
        SimpleExpression . Name

A subscripting expression is used for operators whose name head is the character '['. It is usually used for array subscripting, where the simple expression (commonly an array name) evaluates to an array object. The expression list in the brackets must match the operator's interface specification.
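For comparison, Python offers a similar hook: `m[4, 5]` passes the tuple `(4, 5)` to the object's `__getitem__` method. The sketch below is a Python analogue of the Matrix example, not a translation of it. The `_check` helper is a hypothetical stand-in for matching the expression list against the declared (int, int) interface, and Python uses separate read and write methods where the Transframe operator returns a pointer.

```python
class Matrix:
    def __init__(self, rows, cols):
        self.rows, self.cols = rows, cols
        self.elements = [[0] * cols for _ in range(rows)]

    def _check(self, index):
        # The subscript must match the declared (int, int) shape.
        if not (isinstance(index, tuple) and len(index) == 2
                and all(isinstance(i, int) for i in index)):
            raise TypeError("Matrix subscript must be two integers")
        return index

    def __getitem__(self, index):         # used when m[i, j] is read
        row, col = self._check(index)
        return self.elements[row][col]

    def __setitem__(self, index, value):  # used when m[i, j] is assigned to
        row, col = self._check(index)
        self.elements[row][col] = value
```

With this class, `m[4, 5] = m[5, 4]` reads one element and stores it in another, while a subscript of the wrong shape such as `m[1]` is rejected.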
An object instantiation expression is used to create a new object, where the simple expression (commonly a class name) evaluates to a class. The expression that follows provides an input matching the class's object instantiation interface. The object instantiation interface returns void or an object of the type specified in the interface of the class. Object instantiation calls the "create" operator defined in the class. For details, read The Beauty and the Power of Unification.
A member expression is used for operators whose name head is the character '.'. It is usually used to evaluate a member of an object, where the simple expression evaluates to the object and the name that follows is the name of the member.
Prefix expressions are composed of simple expressions and prefix operators:
    PrefixExpression:
        SimpleExpression
        OperatorIdentifier PrefixExpression AttachedExpression

Attached expressions are composed of operator suffixes and their related expressions, which will be discussed in the following section.
Examples of user-defined prefix operators and prefix expressions using these operators:
    class Image {
        function operator ==> prefix () <==;    // compress
        function operator <== prefix () ==>;    // decompress
    };

    function foo () {
        my_image = Image();
        <== ==> my_image <== ==>;    // compress then decompress
    };
Expressions are divided into levels according to the precedence of the operators they use. Compound expressions are made of operators and primitive expressions. Primitive expressions are at the innermost level, level 0. Primitive expressions combined with operators of the highest precedence constitute compound expressions at level 1; these are the simple expressions, which include the expressions at level 0. Expressions at level 1 combined with operators of the second highest precedence constitute compound expressions at level 2, which include the expressions at level 1; and so on. Let E0 be a primitive expression at level 0, E1 a simple expression at level 1, E2 a prefix expression at level 2, and En an expression at level n. Simple expressions and prefix expressions have been defined above.
Compound expressions above level 2 are made of postfix operators and expressions at lower levels, according to the following syntax rule:
    E_n (3 <= n <= 13):
        E_(n-1)
        E_n AttachedExpression_n
    AttachedExpression_n:
        InfixExpression_n(opt) SuffixExpression_n
    InfixExpression_n:
        OperatorExpression_n
        InfixExpression_n OperatorExpression_n
    SuffixExpression_n:
        LeftOperatorExpression_n
        RightOperatorExpression_n
    OperatorExpression_n:
        ClassName E_13
    LeftOperatorExpression_n:
        ClassName E_(n-1)
    RightOperatorExpression_n:
        ClassName E_n

The syntax does not regulate how operator identifiers combine with the expressions at a lower level. The combination is defined by the semantics of the individual operators' input interfaces. That is, the expressions supplying an operator's input parameters must appear in the order, and have the types, required by the operator's interface.
If a user-defined operator uses the same name as a built-in operator used in simple expressions, the operator must be defined in the same format as the built-in operator, though the types of the input parameters can be changed.
There is no rule to restrict a user from using other built-in operator names. However, the precedence and associativity of the operator will be the same as the built-in's.
An example:
    class Pipe {
        function operator >> (Pipe): Pipe;
        enter (integer);
    };

    function Plumber () {
        pipes: Pipe[4] = ( Pipe(0), Pipe(1), Pipe(2), Pipe(3) );
        pipes[0] >> pipes[1] >> pipes[2] >> pipes[3];
    };

User-defined operators cannot use the following operator strings:
// /* */ :: : , ...
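The Pipe example relies on the overloaded ">>" keeping the left associativity of the built-in operator. Python behaves the same way for its own ">>", so the chaining can be sketched as follows. This is a Python analogue with assumed semantics: the example does not say what the Transframe operator returns, so returning the right-hand pipe and recording it in a `downstream` attribute are illustrative choices.

```python
class Pipe:
    def __init__(self, ident):
        self.ident = ident
        self.downstream = None   # hypothetical attribute: the pipe we feed into

    def __rshift__(self, other):
        """Connect self to other and return other so the chain can continue."""
        self.downstream = other
        return other


def plumber():
    pipes = [Pipe(i) for i in range(4)]
    # Left-associative: ((pipes[0] >> pipes[1]) >> pipes[2]) >> pipes[3]
    pipes[0] >> pipes[1] >> pipes[2] >> pipes[3]
    return pipes
```

Because each application returns its right operand, the left-to-right grouping connects every pipe to its successor. With right associativity the same expression would build a different chain.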
The description of Transframe's expressions above is taken from the Transframe Language Reference, with a few examples added. As you can see from the description, Transframe's expressions are significantly simpler than C++ expressions. There are 20 pages (crowded with small fonts) for C++ expressions in the X3J16 standards document, and 70 pages for Java's expressions in the Java Language Specification.
In Transframe, unless the operator character is a separator (one of { } ( ) [ ] , ; ' and "), an operator string terminates only when the next character is not an operator character. This requires programmers to use white space or another separator to separate two operator strings. For example, the expression

    x--+--y

in C++ contains three operator names: "--", "+", and "--". Transframe considers them a single operator name "--+--". Transframe encourages programmers to write clear expressions. In Transframe, the C++ example above should be written as

    x-- + --y

or

    (x--)+(--y)

In contrast, C++ expressions may create combinations that confuse either the compiler writers or the language users. For example:

    x-----y

It is difficult to guess the exact meaning of this expression:

    ((x--)--)-y

which should generate an error, or

    (x--)-(--y)

which should be correct.
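The two tokenization rules can be compared in a short Python sketch. It rests on several assumptions: `OPERATOR_CHARS` is an illustrative subset of operator characters rather than Transframe's actual set, `CPP_OPERATORS` is a tiny subset of C++'s operators, and both function names are invented. C++ forms tokens by always taking the longest character sequence that can form a token, so it reads `x-----y` as `x -- -- - y`, the first of the two readings above.

```python
OPERATOR_CHARS = set("+-*/%<>=&|^!~")   # assumed subset, for illustration only
SEPARATORS = set("{}()[],;'\"")          # the separators listed in the text

def tokenize_run(source):
    """Transframe-style: a maximal run of operator characters is one operator."""
    tokens, i = [], 0
    while i < len(source):
        ch = source[i]
        if ch.isspace():
            i += 1
        elif ch in SEPARATORS:
            tokens.append(ch)            # a separator always stands alone
            i += 1
        else:
            is_op = ch in OPERATOR_CHARS
            j = i
            # Extend the token while the character class stays the same.
            while (j < len(source) and not source[j].isspace()
                   and source[j] not in SEPARATORS
                   and (source[j] in OPERATOR_CHARS) == is_op):
                j += 1
            tokens.append(source[i:j])
            i = j
    return tokens

CPP_OPERATORS = ["--", "++", "-", "+"]   # tiny subset of C++'s fixed operator set

def tokenize_longest_match(source):
    """C++-style: repeatedly take the longest operator from a fixed set."""
    by_length = sorted(CPP_OPERATORS, key=len, reverse=True)
    tokens, i = [], 0
    while i < len(source):
        ch = source[i]
        if ch.isspace():
            i += 1
            continue
        match = next((op for op in by_length if source.startswith(op, i)), None)
        if match:
            tokens.append(match)
            i += len(match)
        else:
            j = i
            while j < len(source) and (source[j].isalnum() or source[j] == "_"):
                j += 1
            if j == i:
                raise ValueError(f"unexpected character {ch!r}")
            tokens.append(source[i:j])
            i = j
    return tokens
```

Under the run-based rule `x--+--y` yields the single operator `--+--`, and the programmer has to write `x-- + --y` to obtain three operators. Under the longest-match rule the same text silently splits into `--`, `+`, `--`.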
Transframe enhances a number of poorly understood and confusing C++ operator overloading definitions, such as the function call operator and the new operator.
The function call operator in C++ merely provides a regular member function call that borrows the object's name. For example,

    class X {
    public:
        void operator () (int, int);
    };
    X x;
    x(3,4);

is essentially the same as:

    class X {
    public:
        void a_dummy_member_function (int, int);
    };
    X x;
    x.a_dummy_member_function(3,4);

and has little value in use. Transframe's function call operator is a higher-order operator (defined at the meta level). Given a function "f" and an input "(x1,x2,...)", the function call operator applies the function's constructor to the input and generates the output, if any. Consider a similar example:

    class X {
    meta public:
        operator create (X.inputType): X.outputType;
    public:
        enter (int; int);    // X's constructor
    };
    x = X(3,4);

The expression "X(3,4)" is equivalent to:

    create(X, (3,4))

and an object of "X" will be created by the implementation of the "create" operator. This makes it possible to define various protocols for function calls or object instantiations. Examples are thread creation, remote procedure calls, message sending, etc. For details, read The Beauty and the Power of Unification.
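Python has a loosely comparable mechanism: the call `X(3, 4)` is evaluated as `type(X).__call__(X, 3, 4)`, so a metaclass can redefine what instantiation means. The sketch below uses that hook as an analogy for a meta-level "create" operator. The metaclass name `CreateProtocol` and its `log` attribute are hypothetical, and the protocol shown (record the request, then construct normally) is just one possible choice.

```python
class CreateProtocol(type):
    """Metaclass whose __call__ plays the role of a class-level 'create' operator."""
    log = []   # records every instantiation request, for illustration

    def __call__(cls, *inputs):
        CreateProtocol.log.append((cls.__name__, inputs))
        # Default protocol: build the object and run its constructor.
        # A different protocol (thread start, remote call, ...) could go here.
        return super().__call__(*inputs)


class X(metaclass=CreateProtocol):
    def __init__(self, a, b):   # corresponds to X's constructor ("enter")
        self.a, self.b = a, b
```

Here `X(3, 4)` first passes through `CreateProtocol.__call__`, which receives the class and the input tuple much as `create(X, (3,4))` does in the text, and only then reaches the constructor.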
Transframe offers a much richer expression not only in terms of the expression design itself, but also in terms of other convenient constructions including tuple/array/cluster constructors, variable number of input arguments (safe!), and type expressions. I'll cover those subjects in my future columns.