8H. the essentials of the E typesystem
--------------------------------------
This section tries to explain how the E typesystem works from
another perspective.

Most problems people have while programming in E stem from an incorrect
view of how the E type-system works. Also, many people have an idea of how
types work from their previous programming language and try to apply it to
E, which is often fatal, because E is quite different when it comes to types.

The Type System.

E is in essence a TYPELESS language. Variables may indeed have a type,
but it is only used as a specification of how to dereference a variable when
it is used as a pointer. In almost ALL other language constructions,
variables are treated as all being of the same type, namely the 32bit
typeless value.

In practice this means that in expressions, with the exception of
dereferencing operators such as ".", "[]" and "++", all operators and
functions work on 32bit values, regardless of whether they represent
booleans, integers, reals or pointers to something.
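
As a small illustration of this (a minimal sketch; the variable names are
made up for the example), the address of an array can be stored in a
variable without any pointer type and used in ordinary arithmetic, just
like any other 32bit value:

```e
PROC main()
  DEF a[4]:ARRAY OF LONG, n    /* n has no type given, so it is a plain LONG */
  n:=a                         /* the address of a, stored as a 32bit value */
  WriteF('address of a:   \d\n', n)
  WriteF('address plus 4: \d\n', n+4)   /* ordinary integer arithmetic */
ENDPROC
```

The numbers printed depend on where the array happens to live on the stack.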

Pointer Types.

In the E type-system only four types exist: PTR TO CHAR, PTR TO INT,
PTR TO LONG and PTR TO <object>, where <object> is the name of a previously
defined OBJECT. When a variable (or an object member, as we'll see later)
is declared as being of one of these types, it means that if the variable
contains a value that is a legal pointer, this is how it should be
dereferenced.

LONG, ARRAY etc.

All other types one may see in a DEF declaration are not really types, as
they are only other ways of writing one of the above four. As an
example, ARRAY OF <type> is just another way of writing PTR TO <type>, with
the only difference that the former is automatically assigned the address
of an area of stackspace which is big enough to hold data for the number of
elements specified in square brackets.

Here's a table that shows all E 'types' in terms of the basic four:

ARRAY OF CHAR, ARRAY, STRING, LONG   (are equal to)   PTR TO CHAR
ARRAY OF INT                         (is equal to)    PTR TO INT
ARRAY OF LONG, LIST                  (are equal to)   PTR TO LONG
ARRAY OF <object>, <object>          (are equal to)   PTR TO <object>

- LONG is for variables that are not intended to be used as a pointer,
  i.e. integers. Its equivalence with PTR TO CHAR is quite logical, as
  conceptually both talk about things that are measured in units of 1
  (for example, "++" has the same effect on both).
- LIST and STRING are the same as their ARRAY equivalents in that
  they're initialised to a piece of stack-space, but their stack
  representation is a little more complex to facilitate runtime
  bounds-checking (when used with the correct functions).
- An <object> is equivalent to [1]:ARRAY OF <object>. Both represent
  an initialised PTR TO <object>.
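
The equivalence of ARRAY OF <type> and PTR TO <type> can be sketched as
follows (an illustrative example only; the names a and p are made up).
Because the array variable is itself a pointer preset to stack space, it
can be assigned to a pointer variable as is, and both then refer to the
same memory:

```e
PROC main()
  DEF a[3]:ARRAY OF INT, p:PTR TO INT
  p:=a               /* a is itself a PTR TO INT, so this is a plain copy */
  p[0]:=10           /* writing through p ... */
  p[1]:=20
  WriteF('\d \d\n', a[0], a[1])   /* ... is visible through a */
ENDPROC
```

This should print 10 20, since a and p hold the same address.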

In an OBJECT one can have the same declarations, with the addition of CHAR
and INT (similar to LONG) and the omission of LIST and STRING, as these
are complex objects in their own right and cannot be part of an object.

Dereferencing.

Given a pointer p of some type:

"[]" may index other elements that are sequentially ordered next to
     the element it is currently pointing to. Note that this allows for
     both positive and negative indices, and that no assumptions are made
     about where and how many elements are actually allocated.
"++" sets the pointer to the next element in memory, "--" to the previous
     one. Note that these operators always operate on the pointer and
     never on the element the pointer is pointing to.
"."  works similarly to "[]", only it indexes the pointer by name, i.e. the
     pointer must be a PTR TO <object>.

"[]" and "." may be concatenated to a pointer p in any sequence, provided
that the value resulting from the previous step is again known to be of a
"PTR TO" type.
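
The three operators can be seen together in a short sketch. The object
point and the variable names are hypothetical, chosen only for this
example:

```e
OBJECT point
  x:INT
  y:INT
ENDOBJECT

PROC main()
  DEF pts[2]:ARRAY OF point, p:PTR TO point
  p:=pts
  p.x:=1           /* first element, indexed by member name */
  p[1].x:=2        /* "[]" followed by ".": the second element */
  p++              /* moves the pointer itself to the second element */
  WriteF('\d\n', p.x)
ENDPROC
```

After the "++", p.x refers to the member that was written as p[1].x before,
so the value shown should be 2.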

One does not need to write out a dereference in full, as in other
languages. E.g. if p is an ARRAY OF obj, instead of having to write
p[index].member you can write just p[index], which logically results
in the address of that object. This also explains why p[].member
is equivalent to p.member, since p[] is the same as p when it points
to an object.
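
A partial dereference might look like this (again only a sketch, with a
hypothetical object point and made-up variable names):

```e
OBJECT point
  x:INT
  y:INT
ENDOBJECT

PROC main()
  DEF pts[2]:ARRAY OF point, q:PTR TO point
  q:=pts[1]        /* no member named: the address of the second element */
  q[].y:=7         /* q[].y ... */
  WriteF('\d\n', q.y)   /* ... names the same member as q.y */
ENDPROC
```

Here pts[1] is used without a member name, so it yields a pointer that can
be stored in q and dereferenced later.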

Reference Semantics.

Another type-related issue that makes E somewhat different from other
languages, and thus harder to grasp, is its emphasis on Reference Semantics
rather than Value Semantics. I'll try to argue here why that is a good thing.

Informally, Reference Semantics means that objects in a language (mostly
other than the simple ones like LONGs) are represented by pointers, while
Value Semantics treats these objects as just being themselves. An example
of a language that has only Value Semantics is BASIC; examples of
languages that have both are C/C++ and the Pascal-like languages;
and examples of Reference-only languages are newer Object Oriented
languages, functional languages like LISP and of course E.

Using Reference Semantics doesn't mean being occupied with pointers
all the time; rather, you worry about them a lot less than in the
mixed case or the Value-only case, especially since in real-life programs
most non-trivial data-structures get allocated dynamically, which implies
pointers. The best example of this is LISP, where one programs heavily
with pointers without noticing. In E, one could easily forget that a STRING
is a pointer, given the ease with which one can pass it around to other
functions. In C, lots of "&" are often needed where in the equivalent E
case none are, and the Oberon equivalent of bla('hallo') looks like
bla(sys.ADR('hallo')), because there the string doesn't represent a pointer
but a value as a whole...
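
To make the point about strings concrete, here is a minimal sketch. The
procedure name show is made up for the example; StrCopy() and WriteF() are
the usual builtin functions:

```e
PROC show(s)
  WriteF('\s\n', s)       /* s is just a 32bit value holding an address */
ENDPROC

PROC main()
  DEF s[20]:STRING
  StrCopy(s, 'hallo', ALL)
  show(s)                 /* s already is a pointer: nothing like "&" needed */
  show('hallo')           /* a string constant is passed the same way */
ENDPROC
```

Both calls hand a pointer to show(), without any explicit address operator
in sight.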