The Expressive Function of Language

An Introduction to Transframe's Innovative Expressions

David L. Shang


Limited Expressions

Much of the work in programming is done by writing expressions -- meaningful combinations of symbols that can accurately convey the writer's intentions within the software community.

We have so much to express that we always feel short of expressions.

Expressions provided by a typical programming language are:

The first three are primitive expressions that contribute little to the expressiveness of a language. Language designers have spent a great deal of effort designing the syntactic structure of expressions with operators.

C++ is perhaps the language having the most sophisticated operator-oriented expressions. It has 46 operators. Each operator has a precedence level, a rule of associativity, and a specific format for the construction of an expression. 42 operators in C++ can be overloaded to take a different type of input.

However, C++ as well as other existing languages have the following limitations in expressions with operators:


The Flexibility of Transframe's Expressions

Transframe has a rich set of operators that provide access to most of the operations offered by the underlying platform and required by most applications. For any C++ expression, there is a similar and better expression in Transframe. Moreover, Transframe's expressions are much richer than C++'s, for the following reasons:

The design of Transframe's expressions is orthogonal. Their diversity and flexibility are obtained by unification, which is the design philosophy of the language. Instead of providing more and more operators and operator formats, Transframe treats all operators by a set of common rules.

Operator Names and Interfaces

An operator is a specific class whose name is an operator name. An operator name is composed of one or more identifiers or operator identifiers. The first identifier or operator identifier is called the operator name head and is prefixed with the keyword operator or noperator. Examples of operator names are:

An operator interface type is specified by a number of sections of tuple types interleaved with the parts of an operator name. Only the operator name head is significant. The remaining parts of the name are operator name suffixes, which are used to separate the actual parameters presented in an input list for object instantiation.

Examples of operator names with interface declarations:

The expression for object instantiation from this class shall be written in a specific way determined by the specification of the operator's interface.

Let us consider an operator interface:

	class Robot
	{
	   public:
		function operator bring (obj:any)
				  to (who: Person)
				  from (where: Room);
	};

The operator "bring to from" has three character strings, and "bring" is the operator name head. The expression to create a "bring" activity within a robot (or, in other words, to call the member function "bring" of a robot) is written exactly in the format defined in the interface:

	robot: Robot;
	robot bring projector to me from southwest_conference_room;

Consider another example,

	class Matrix
	{
	   protected:
		elements: int[][];
	   public:
		function operator [ (int, int) ]: pointer of int;
	};

	m: Matrix;
	m[4,5]:=m[5,4];

An operator prefixed by the keyword noperator is a name operator. A name operator requires the performing object to be expressed by a name expression. For example,

	class int is numeric
	{
	   public:
		function noperator ++(): int;
	};

	n: int;
	n++;     // correct, because "n" is a name expression
	3++;	 // wrong, because "3" is not a name expression

When an operator is a member class, the first part of the operator input interface is not given and is always the object of the enclosing class. For example:

	class foo
	{
		function operator $ (other: selfclass);
	}

The actual interface type of the operator is:

	(this: selfclass) $ (other: selfclass)
Should an operator name be put in front of the performing object in an expression, a prefix interface must be used. A prefix interface is obtained by inserting the keyword prefix at the front of the interface. An object instantiation expression using a prefix interface is a prefix expression. For example:

	class numeric
	{
		class operator - is function
				prefix (): selfclass,
				(selfclass): selfclass;
	};

The operator has two interfaces. The first interface is a prefix interface. The expression:

	- x

uses the prefix interface and hence is a prefix expression.

Precedence and Associativity of Operators

Each operator has a precedence level and a rule of associativity. When an operand could be grouped with either of two operators, the operand goes with the operator that has the higher precedence. If the two operators have the same precedence, the operand is grouped with the right operator if the operators are right-associative, and with the left operator if the operators are left-associative. Operators in the same group have the same associativity.

What follows is a list of Transframe's built-in operators in order from the highest to the lowest precedence. Operators with the same precedence are grouped together. Items listed in the first group are used for primitive expressions; they are not operators, but they have the highest precedence in the composition of an expression.

E0 : Precedence 13 (Primitive Expression)
		literal
		name
		string concatenation
	( )	tuple object composition brackets	
	::	global scope selectors
	::	scope selectors

E1 : Precedence 12 (Terms, or Simple Expressions)
	[ ]	array subscript				left
	.	member selector (by object)		left
		object instantiation (empty name)	left

E2 : Precedence 11 (Prefix Operators)
	++	prefix increment			right
	--	prefix decrement			right
	+	arithmetic positive			right
	-	arithmetic negative			right
	&	reference of 				right
	#	type of 				right
	~	bitwise complement			right
	!	logical NOT				right

E3 : Precedence 10 (Incremental Expression)
	++	post increment				left
	--	post decrement				left

E4 : Precedence 9 (Multiplicative Expression)
	*	multiplication				left
	/	division				left
	%	remainder				left
	&	bitwise AND				left

E5 : Precedence 8 (Additive Expression)
	+	arithmetic addition			left
	-	arithmetic subtraction			left
	|	bitwise OR				left
	^	bitwise exclusive OR			left

E6 : Precedence 7 (Shift Expression)
	<<	left shift				left
	>>	right shift				left
	<+	left rotate shift			left
	+>	right rotate shift			left

E7 : Precedence 6 (Relational Expression)
	<	less than				left
	>	greater than				left
	<=	less than or equal			left
	>=	greater than or equal			left
	=	equal					left
	=/	not equal				left

E8 : Precedence 5 (Logic AND Expression)
	&&	logical AND				left

E9 : Precedence 4 (Logic OR Expression)
	||	logical OR				left

E10 : Precedence 3 (Conditional Expression)
	?	conditional				left

E11 : Precedence 2 (Reserved for User Defined Operators)
							left

E12 : Precedence 1 (Assignment Expressions, Expressions) 
	:=	assignment				right
	*=	multiplication assignment		right
	/=	division assignment			right
	%=	modulus assignment			right
	+=	addition assignment			right
	-=	subtraction assignment			right
	<<=	left-shift assignment			right
	>>=	right-shift assignment			right
	<+=	left-rotate assignment			right
	+>=	right-rotate assignment			right
	&=	bitwise AND assignment			right
	|=	bitwise OR assignment			right
	^=	bitwise exclusive OR assignment		right

E13 : Precedence 0 (Expression List)
	,	sequential evaluation			left

There are 14 precedence groups, each assigned a number as shown in the table above. All user-defined postfix operators have precedence 2, and all user-defined prefix operators have precedence 11. However, if a user-defined operator uses the same name as one of the built-in operators, it takes the built-in operator's precedence.
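The grouping rules can be tried out with a small precedence-climbing parser. What follows is only a sketch in Python, not Transframe's actual parser: the table `BINARY` copies a subset of the infix operators listed above, and the names `lookup`, `parse`, and `group` are made up for the illustration. Any operator not found in the table is treated as a user-defined operator with precedence 2 and left associativity.

```python
# A subset of the infix operators from the table above:
# operator -> (precedence, associativity).
BINARY = {
    "*": (9, "left"), "/": (9, "left"), "%": (9, "left"), "&": (9, "left"),
    "+": (8, "left"), "-": (8, "left"), "|": (8, "left"), "^": (8, "left"),
    "<<": (7, "left"), ">>": (7, "left"),
    "<": (6, "left"), ">": (6, "left"), "=": (6, "left"),
    "&&": (5, "left"), "||": (4, "left"),
    ":=": (1, "right"), "+=": (1, "right"),
}
USER_DEFINED = (2, "left")  # assumption: any unknown operator is user-defined


def lookup(op):
    """A user-defined operator reusing a built-in name gets the built-in entry."""
    return BINARY.get(op, USER_DEFINED)


def parse(tokens, min_prec=0):
    """Consume tokens from the front and return a nested (op, left, right) tuple."""
    left = tokens.pop(0)  # operands are single tokens in this sketch
    while tokens:
        prec, assoc = lookup(tokens[0])
        if prec < min_prec:
            break
        op = tokens.pop(0)
        # A left-associative operator must not absorb an equal-precedence
        # operator on its right; a right-associative operator may.
        right = parse(tokens, prec + 1 if assoc == "left" else prec)
        left = (op, left, right)
    return left


def group(source):
    """Parse a whitespace-separated expression."""
    return parse(source.split())


print(group("a + b * c"))         # ('+', 'a', ('*', 'b', 'c'))
print(group("a - b - c"))         # ('-', ('-', 'a', 'b'), 'c')
print(group("x := y := z"))       # (':=', 'x', (':=', 'y', 'z'))
print(group("a := b ~> c + d"))   # (':=', 'a', ('~>', 'b', ('+', 'c', 'd')))
```

In the last line the invented operator "~>" binds tighter than ":=" (precedence 1) but looser than "+" (precedence 8), which is what precedence 2 implies.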


Primitive Expressions

There are three kinds of primitive expressions: names, literal constants, and tuple expressions.

	PrimitiveExpression:
		Name
		Literal
		Tuple
	Name:
		self
		selfclass
		Identifier
		OperatorIdentifier
		ScopeQualifier Name
	ScopeQualifier:
		::
		ClassScopeQualifier
	ClassScopeQualifier:
		ClassName::
		ClassName::ClassScopeQualifier
	ClassName:
		Identifier
		OperatorIdentifier
	Tuple:
		( ExpressionList )

An identifier or an OperatorIdentifier is a name or part of the operator name given in a declaration. The value of the identifier is the object that the name represents. If it is a type exact name, the exact type of the object is determined by the declaration. If it is a polymorphic name, the exact type is determined by its current value, and the declaration gives the face type which is the superclass of all the possible exact types for the object that the name can represent.

Let X be an expression of an identifier, R be the storage class, and T be the object type. If R is a static storage class, the type of X is T. If R is a dynamic storage class, the value and the type of the expression depend on the context in which the expression appears:

Let us re-consider the example,
	class Matrix
	{
	   protected:
		elements: int[][];
	   public:
		function operator [ (int, int) ]: pointer of int;
	};

	m: Matrix;
	m[4,5]:=m[5,4];

The operator "[]" returns a pointer to int, so the expression "m[i,j]" has a dynamic storage class. The expression "m[4,5]" appears at the left side of ":=", where a name expression is expected (because ":=" is defined as a name operator). Therefore, the type of the expression is a pointer to integer. The expression "m[5,4]" appears at the right side of the assignment operator, where an object expression (integer) is expected; therefore, the result of the expression is an integer.

Note that "pointer" is a low-level object reference, and should only be used for system programming or where memory efficiency is a critical issue. For details about object references, refer to Access to Objects.
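The two readings of "m[i,j]" can be imitated in Python, which has no pointers of its own. The sketch below is an analogy under stated assumptions, not Transframe semantics: `Pointer`, `as_object`, and `assign` are hypothetical helpers, with `Pointer` standing in for "pointer of int" and `assign` playing the role of ":=".

```python
class Pointer:
    """Stand-in for 'pointer of int': a reference to one slot of a row."""

    def __init__(self, row, col):
        self.row, self.col = row, col  # row is a list, col an index into it

    def load(self):
        return self.row[self.col]

    def store(self, value):
        self.row[self.col] = value


class Matrix:
    def __init__(self, rows, cols):
        self.elements = [[0] * cols for _ in range(rows)]

    def at(self, i, j):
        """Plays the role of: function operator [ (int, int) ]: pointer of int."""
        return Pointer(self.elements[i], j)


def as_object(expr):
    """Object-expression context: a pointer result is dereferenced to its value."""
    return expr.load() if isinstance(expr, Pointer) else expr


def assign(name_expr, value_expr):
    """Name-expression context on the left, object-expression context on the right."""
    if not isinstance(name_expr, Pointer):
        raise TypeError("left side of := must be a name expression")
    name_expr.store(as_object(value_expr))


m = Matrix(6, 6)
m.elements[5][4] = 42
assign(m.at(4, 5), m.at(5, 4))   # m[4,5] := m[5,4]
print(m.elements[4][5])          # 42
```

The same call `m.at(...)` yields a reference on the left and a plain integer on the right only because `assign` treats its two arguments differently, which mirrors how the expected kind of expression decides the result type.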

An identifier declared as a constant cannot appear where a name expression is expected, for example, at the left side of an assignment operator. Nor can the reference-of (&) operator be applied to it.


Simple Expressions

Simple expressions are composed of primitive expressions and operators with the highest precedence.

	SimpleExpression:
		PrimitiveExpression
		SubscriptExpression
		ObjectInstantiation
		MemberExpression
	SubscriptExpression:
		SimpleExpression [ ExpressionList ]
	ObjectInstantiation:
		SimpleExpression Expression(opt)
	MemberExpression:
		SimpleExpression . Name

A subscripting expression is used for operators whose name head is the character '['. It is usually used for array subscripting, where the simple expression (commonly an array name) evaluates to an array object. The expression list in the brackets must match the operator's interface specification.

An object instantiation expression is used to create a new object, where the simple expression (commonly a class name) evaluates to a class. The expression that follows provides an input that matches the class's object instantiation interface. The object instantiation interface returns void or an object of the type specified in the interface of the class. Object instantiation will call the "create" operator defined in the class. For details, read The Beauty and the Power of Unification.

A member expression is used for operators whose name head is the character '.'. It is usually used to evaluate a member of an object, where the simple expression evaluates to the object and the following name provides the name of the member.


Prefix Expressions

Prefix expressions are composed of simple expressions and prefix operators:

	PrefixExpression:
		SimpleExpression
		OperatorIdentifier PrefixExpression AttachedExpression

Attached expressions are composed of operator suffixes and their related expressions, which will be discussed in the following section.

Examples of user-defined prefix operators and prefix expressions using these operators:

	class Image
	{
		function operator ==> prefix () <==; //compress;
		function operator <== prefix () ==>; //decompress;
	};
	function foo ()
	{
		my_image = Image();
		<== ==> my_image <== ==>;  // compress then decompress
	};

Compound Expressions

Expressions are divided into levels according to the precedence of the operators they use. Compound expressions are made of operators and primitive expressions. Primitive expressions are at the innermost level, level 0. Primitive expressions combined with operators of the highest precedence constitute compound expressions at level 1; these are the simple expressions, which include expressions at level 0. Expressions at level 1 combined with operators of the second highest precedence constitute compound expressions at level 2, which include expressions at level 1; and so on. Let E0 be a primitive expression at level 0, E1 a simple expression at level 1, E2 a prefix expression at level 2, and En an expression at level n. Simple expressions and prefix expressions have been defined above.

Compound expressions above level 2 are made of postfix operators and expressions at lower levels, according to the following syntax rule:

	En (3 <= n <= 13):
		E(n-1)
		En AttachedExpression_n
	AttachedExpression_n:
		InfixExpression_n(opt) SuffixExpression_n
	InfixExpression_n:
		OperatorExpression_n
		InfixExpression_n  OperatorExpression_n
	SuffixExpression_n:
		LeftOperatorExpression_n
		RightOperatorExpression_n
	OperatorExpression_n:
		ClassName  E13
	LeftOperatorExpression_n:
		ClassName  E(n-1)
	RightOperatorExpression_n:
		ClassName  En

The syntax does not regulate how operator identifiers are combined with expressions at a lower level. The combination is defined by the semantics of each operator's input interface. That is, the expressions supplying an operator's input parameters must be in the order and of the type required by the operator's interface.
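To see how an interface, rather than the grammar, decides where each actual parameter goes, here is a minimal sketch in Python built on the Robot operator "bring ... to ... from" introduced earlier. The function `match_interface` is hypothetical and checks only the order of the name parts; a real implementation would also check each argument against the declared parameter types.

```python
def match_interface(parts, tokens):
    """Split tokens into one argument group per part of the operator name.

    parts  -- the operator name as a list: the head first, then the suffixes
    tokens -- the expression after the performing object, as a list of words
    Returns a dict mapping each name part to the tokens that follow it.
    Raises ValueError when the head or a suffix is missing or out of order.
    """
    if not tokens or tokens[0] != parts[0]:
        raise ValueError(f"expected operator name head {parts[0]!r}")
    current = parts[0]
    args = {current: []}
    remaining = list(parts[1:])
    for tok in tokens[1:]:
        if remaining and tok == remaining[0]:
            current = remaining.pop(0)   # the next suffix starts a new group
            args[current] = []
        else:
            args[current].append(tok)    # otherwise the token is an argument
    if remaining:
        raise ValueError(f"missing operator name suffix {remaining[0]!r}")
    return args


BRING = ["bring", "to", "from"]
words = "robot bring projector to me from southwest_conference_room".split()
performer, rest = words[0], words[1:]
print(match_interface(BRING, rest))
# {'bring': ['projector'], 'to': ['me'], 'from': ['southwest_conference_room']}
```

Writing the suffixes in the wrong order, as in "bring projector from room to me", makes `match_interface` raise `ValueError`, since "from" can only be recognized after "to" has been seen.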


User-Defined Operators

If a user-defined operator uses the same name as a built-in operator used in simple expressions, the operator must be defined in the same format as the built-in operator, though the types of the input parameters can be changed.

There is no rule to restrict a user from using other built-in operator names. However, the precedence and associativity of the operator will be the same as the built-in's.

An example:

	class Pipe
	{
		function operator >> (Pipe): Pipe;
		enter (integer);
	};
	function Plumber ()
	{
		pipes: Pipe[4] = ( Pipe(0), Pipe(1), Pipe(2), Pipe(3) );
		pipes[0] >> pipes[1] >> pipes[2] >> pipes[3];
	};
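Because ">>" keeps its built-in left associativity, the chain above groups as ((pipes[0] >> pipes[1]) >> pipes[2]) >> pipes[3]. Python's ">>" is also left-associative, so the effect can be sketched there. This is an analogy only: the source does not say what the Transframe operator returns, so returning the right-hand pipe is an assumption, and `downstream` and `chain_ids` are hypothetical names.

```python
class Pipe:
    def __init__(self, ident):
        self.ident = ident
        self.downstream = None

    def __rshift__(self, other):
        """Connect self to other; return other so that the chain can continue."""
        self.downstream = other
        return other


def chain_ids(pipe):
    """Follow the downstream links and collect the pipe identifiers."""
    ids = []
    while pipe is not None:
        ids.append(pipe.ident)
        pipe = pipe.downstream
    return ids


pipes = [Pipe(i) for i in range(4)]
pipes[0] >> pipes[1] >> pipes[2] >> pipes[3]
print(chain_ids(pipes[0]))   # [0, 1, 2, 3]
```

With a right-associative operator the same text would first connect pipes[2] to pipes[3], and the value returned by each step would have to be chosen differently to build the same chain.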

User defined operators cannot use the following operator strings:

	//	/*	*/
	::	:	,	...

Simplicity and Clarity

The above description of Transframe's expression is cited from the Transframe Language Reference with a few examples added. As you can see from the description, Transframe's expressions are significantly simpler than C++ expressions. There are 20 pages (crowded with small fonts) for C++ expressions in the X3J16 standards document; and 70 pages for Java's expressions in the Java Language Specification.

In Transframe, unless the operator character is a separator (one of { } ( ) [ ] , ; ' and "), an operator string terminates only when the next character is not an operator character. This requires programmers to use white space or another separator to separate two operator strings. For example, the expression

	x--+--y

in C++ contains three operator names: "--", "+", and "--". Transframe considers them a single operator name "--+--". Transframe encourages programmers to write clear expressions. The above C++ example should be written in Transframe as

	x-- + --y

or

	(x--)+(--y)

In contrast, C++ expressions may create combinations that confuse either the compiler writers or the language users. For example:

	x-----y

It is difficult to guess the exact meaning of this expression:

	((x--)--)-y

which should generate an error, or

	(x--)-(--y)

which should be correct.
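The difference between the two tokenizing rules can be sketched in a few lines of Python. The code is an illustration under assumptions, not Transframe's lexer: `transframe_tokens` treats every character that is not a letter, digit, underscore, white space, or separator as an operator character, and `cpp_like_split` models longest-match scanning over a tiny, hand-picked set of operators.

```python
SEPARATORS = set("{}()[],;'\"")


def kind(ch):
    """Classify a character as space, separator, word, or operator."""
    if ch.isspace():
        return "space"
    if ch in SEPARATORS:
        return "separator"
    if ch.isalnum() or ch == "_":
        return "word"
    return "operator"  # assumption: everything else is an operator character


def transframe_tokens(source):
    """An operator string ends only where the next character is not an operator character."""
    tokens, i = [], 0
    while i < len(source):
        k = kind(source[i])
        if k == "space":
            i += 1
            continue
        j = i + 1
        if k != "separator":  # a separator is always a token by itself
            while j < len(source) and kind(source[j]) == k:
                j += 1
        tokens.append(source[i:j])
        i = j
    return tokens


def cpp_like_split(run, known=("--", "++", "-", "+")):
    """Split a run of operator characters by repeatedly taking the longest known operator."""
    out = []
    while run:
        op = max((k for k in known if run.startswith(k)), key=len)
        out.append(op)
        run = run[len(op):]
    return out


print(transframe_tokens("x--+--y"))      # ['x', '--+--', 'y']
print(transframe_tokens("x-- + --y"))    # ['x', '--', '+', '--', 'y']
print(transframe_tokens("(x--)+(--y)"))  # ['(', 'x', '--', ')', '+', '(', '--', 'y', ')']
print(cpp_like_split("-----"))           # ['--', '--', '-']
```

The last line shows that longest-match scanning reads "x-----y" as "x -- -- - y", which is the first of the two readings given above.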

Orthogonality

Transframe improves on a number of poorly understood and confusing C++ operator overloading definitions, such as the function call operator and the new operator.

The function call operator in C++ merely provides a regular member function call that borrows the class name. For example,

	class X
	{
		public: void operator () (int, int);
	};
	X x;
	x(3,4);
is essentially the same as:
	class X
	{
		public: void a_dummy_member_function (int, int);
	};
	X x;
	x.a_dummy_member_function(3,4);
and has little value in use. Transframe's function call operator is a higher-order operator (defined at the meta level). Given a function "f" and an input "(x1,x2,...)", the function call operator applies the function's constructor to the input and generates the output, if any. Consider a similar example:
	class X
	{
		meta public: operator create (X.inputType): X.outputType;
		public: enter (int; int);   // X's constructor
	};
	x = X(3,4);

The expression "X(3,4)" is equivalent to:

	create(X, (3,4))
and an object of "X" will be created by the implementation of the "create" operator. This makes it possible to define various protocols for function calls or object instantiations. Examples are thread creation, remote procedure calls, message sending, etc. For details, read The Beauty and the Power of Unification.
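A rough analogy exists in Python, where calling a class is itself dispatched at the meta level: `X(3, 4)` is evaluated as `type(X).__call__(X, 3, 4)`. The sketch below uses a metaclass to stand in for a "create" operator. It is an analogy, not Transframe semantics; the names `Create` and `log` are made up, and `enter` is borrowed from the example above as the constructor name.

```python
class Create(type):
    """Metaclass whose __call__ plays the role of a meta-level 'create' operator."""

    log = []  # records every instantiation that passes through this protocol

    def __call__(cls, *inputs):
        Create.log.append((cls.__name__, inputs))
        obj = cls.__new__(cls)   # allocate the object
        obj.enter(*inputs)       # apply the class's constructor to the input
        return obj


class X(metaclass=Create):
    def enter(self, a, b):       # X's constructor
        self.total = a + b


x = X(3, 4)                      # dispatched as Create.__call__(X, 3, 4)
print(x.total)                   # 7
print(Create.log)                # [('X', (3, 4))]
```

Replacing the body of `__call__` is where a different protocol would go, for example starting a thread or forwarding the input to a remote object instead of constructing locally.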

Transframe offers a much richer expression not only in terms of the expression design itself, but also in terms of other convenient constructions including tuple/array/cluster constructors, variable number of input arguments (safe!), and type expressions. I'll cover those subjects in my future columns.