Module espr::ir
Intermediate Representation (IR) legalized (semantically analyzed) from SyntaxTree.
The legalize procedure consists of three steps:
- Create Namespace
- Resolve SubType/SuperType constraints to yield Constraints
- Legalize each AST portion
The first two steps are global analyses, which allow the last step to work locally.
Namespace creation
The legalize phase starts with Namespace creation. This step also introduces Scope and Path to represent scopes and names in an EXPRESS schema. Using Namespace, we can look up the Path corresponding to a local identifier that appears in a Scope.
SCHEMA sc1; -- Scope "sc1[schema]" starts
ENTITY a; -- Path "sc1[schema].a[entity]" is registered
x: REAL;
y: REAL;
END_ENTITY;
ENTITY b; -- Path "sc1[schema].b[entity]" is registered
z: REAL;
a: a; -- Identifier "a" is resolved as a path "sc1[schema].a[entity]"
END_ENTITY;
END_SCHEMA; -- Scope "sc1[schema]" ends
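The lookup described above can be sketched in a few lines of Rust. This is a minimal illustration, not the crate's actual API: the `Scope`, `Path`, and `Namespace` types below are simplified, hypothetical stand-ins, and the methods `register` and `resolve` are names assumed for this example.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified scope: the chain of enclosing names, e.g. ["sc1", "b"].
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct Scope(Vec<String>);

/// Hypothetical, simplified path: the scope that declares a name, plus the name.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Path {
    scope: Scope,
    name: String,
}

/// Maps each scope to the identifiers registered directly in it.
#[derive(Debug, Default)]
struct Namespace {
    names: HashMap<Scope, Vec<String>>,
}

impl Namespace {
    /// Record that `name` is declared in `scope`.
    fn register(&mut self, scope: &Scope, name: &str) {
        self.names
            .entry(scope.clone())
            .or_default()
            .push(name.to_string());
    }

    /// Walk outward from `scope` until an enclosing scope declares `name`.
    fn resolve(&self, scope: &Scope, name: &str) -> Option<Path> {
        let mut current = scope.0.clone();
        loop {
            let candidate = Scope(current.clone());
            if let Some(names) = self.names.get(&candidate) {
                if names.iter().any(|n| n == name) {
                    return Some(Path {
                        scope: candidate,
                        name: name.to_string(),
                    });
                }
            }
            // Move to the parent scope; give up once the root has been checked.
            current.pop()?;
        }
    }
}
```

With entities a and b registered in the scope of sc1, resolving the identifier a from inside entity b walks up one level and yields the path in the schema scope, which mirrors the comment on the `a: a;` line of the example.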
SubType/SuperType constraints
An ENTITY may have subtype or supertype constraints.
ENTITY person;
name: STRING;
END_ENTITY;
ENTITY employee SUBTYPE OF (person);
pay: INTEGER;
END_ENTITY;
ENTITY student SUBTYPE OF (person);
school_name: STRING;
END_ENTITY;
This means employee has a field pay in addition to name inherited from person, and an employee can be instantiated as a person.
Instances which contain two or more entity values are called “complex entity instances”, and they are mapped into exchange structures using one of two rules, “internal mapping” or “external mapping”:
/* internal mapping */
#1 = EMPLOYEE('Hitori Goto', 10);
#2 = STUDENT('Ikuno Kita', 'Shuka');
/* external mapping */
#3 = (PERSON('Hitori Goto') EMPLOYEE(10));
#4 = (PERSON('Ikuno Kita') STUDENT('Shuka'));
#5 = (PERSON('Nizika Iziti') EMPLOYEE(15) STUDENT('Simokitazawa'));
Using internal mapping, the inherited attributes (name) shall appear sequentially prior to the explicit attributes (pay, school_name). Using external mapping, an instance is represented by a list of “partial complex entity values” enclosed by ().
The instances #1 (#2) described by internal mapping and #3 (#4) described by external mapping are the same value of the employee (student) entity, but internal mapping cannot describe the #5 case, which combines employee and student.
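The difference between the two mappings can be sketched as two small formatting functions. This is only an illustration under simplifying assumptions: `PartialValue`, `external_mapping`, and `internal_mapping` are hypothetical names, attribute values are assumed to be already encoded as text, and any ordering rules of the exchange format are ignored.

```rust
/// Hypothetical type: one partial complex entity value, e.g. PERSON('Hitori Goto').
struct PartialValue {
    entity: &'static str,
    /// Attribute values, assumed to be already encoded as exchange-structure text.
    attributes: Vec<&'static str>,
}

/// Format a single record such as EMPLOYEE('Hitori Goto', 10).
fn encode(entity: &str, attributes: &[&str]) -> String {
    format!("{}({})", entity, attributes.join(", "))
}

/// External mapping: list every partial value, enclosed by parentheses.
fn external_mapping(parts: &[PartialValue]) -> String {
    let encoded: Vec<String> = parts
        .iter()
        .map(|p| encode(p.entity, &p.attributes))
        .collect();
    format!("({})", encoded.join(" "))
}

/// Internal mapping for a single inheritance chain ordered from supertype to
/// leaf subtype: inherited attributes come first, and the record is named
/// after the leaf entity. Returns None for an empty chain.
fn internal_mapping(chain: &[PartialValue]) -> Option<String> {
    let leaf = chain.last()?;
    let attributes: Vec<&str> = chain
        .iter()
        .flat_map(|p| p.attributes.iter().copied())
        .collect();
    Some(encode(leaf.entity, &attributes))
}
```

Because `internal_mapping` names the record after a single leaf entity, a value like #5 with two leaf subtypes has no chain to pass in, which is the limitation noted above.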
Unlike usual Object-Oriented Programming (OOP) languages like C++ or Python, a person can be both an employee and a student simultaneously, i.e. a person object may have both a pay field and a school_name field, as in #5. This type of inheritance is called ANDOR in EXPRESS, and it is the default constraint for supertypes. We can write this constraint explicitly in the entity declaration of person:
ENTITY person SUPERTYPE OF (employee ANDOR student);
name: STRING;
END_ENTITY;
or as a separate SUBTYPE_CONSTRAINT declaration:
SUBTYPE_CONSTRAINT person_prop FOR person;
employee ANDOR student;
END_SUBTYPE_CONSTRAINT;
We cannot determine the subtypes of an entity from its ENTITY declaration alone, because the default constraint need not be written out. SUBTYPE OF relations are therefore gathered into the Constraints struct, so that subtype paths can be looked up from a supertype path before legalizing the AST of entities.
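Gathering the SUBTYPE OF relations amounts to inverting them into a map from supertype to subtypes. The sketch below shows the idea with plain entity names; `EntityDecl` and `gather_subtypes` are hypothetical names for this example, and the real Constraints struct works with resolved paths rather than bare strings.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified entity declaration: a name plus its SUBTYPE OF list.
struct EntityDecl {
    name: &'static str,
    subtype_of: Vec<&'static str>,
}

/// Invert the SUBTYPE OF relation: supertype name -> names of its subtypes.
fn gather_subtypes(decls: &[EntityDecl]) -> BTreeMap<&'static str, Vec<&'static str>> {
    let mut subtypes: BTreeMap<&'static str, Vec<&'static str>> = BTreeMap::new();
    for decl in decls {
        for &supertype in &decl.subtype_of {
            subtypes.entry(supertype).or_default().push(decl.name);
        }
    }
    subtypes
}
```

For the person, employee, and student declarations above, the resulting map has a single key, person, whose value lists employee and student, even though the declaration of person itself mentions neither.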
There are two other types of constraints. The first is the ONEOF constraint:
ENTITY pet;
name : pet_name;
END_ENTITY;
SUBTYPE_CONSTRAINT separate_species FOR pet;
ABSTRACT SUPERTYPE;
ONEOF(cat, rabbit, dog);
END_SUBTYPE_CONSTRAINT;
ENTITY cat SUBTYPE OF (pet);
END_ENTITY;
ENTITY rabbit SUBTYPE OF (pet);
END_ENTITY;
ENTITY dog SUBTYPE OF (pet);
END_ENTITY;
In the real world, a pet cannot be both a cat and a rabbit. This is represented by the ONEOF constraint in the SUBTYPE_CONSTRAINT declaration.
The second is the AND constraint:
ENTITY person;
END_ENTITY;
ENTITY male SUBTYPE OF (person);
END_ENTITY;
ENTITY female SUBTYPE OF (person);
END_ENTITY;
ENTITY citizen SUBTYPE OF (person);
END_ENTITY;
ENTITY alien SUBTYPE OF (person);
END_ENTITY;
SUBTYPE_CONSTRAINT person_prop FOR person;
ONEOF(male, female) AND ONEOF(citizen, alien);
END_SUBTYPE_CONSTRAINT;
AND behaves like the boolean operator: here a person must be one of male or female, and also one of citizen or alien.
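One way to see how ONEOF, AND, and ANDOR differ is to enumerate the subtype combinations each expression allows. The sketch below assumes the usual reading: ONEOF yields each alternative separately, AND combines one choice from each side, and ANDOR allows either side alone or both together. The `Expr` enum and the `instantiable` function are hypothetical names, entity names stand in for resolved paths, and the supertype itself is left out of the result.

```rust
use std::collections::BTreeSet;

/// One allowed combination of subtypes, e.g. {male, citizen}.
type Combo = BTreeSet<&'static str>;

/// Hypothetical, simplified supertype constraint expression.
enum Expr {
    Entity(&'static str),
    OneOf(Vec<Expr>),
    And(Box<Expr>, Box<Expr>),
    AndOr(Box<Expr>, Box<Expr>),
}

/// Combine every combination on the left with every combination on the right.
fn cross(lhs: &BTreeSet<Combo>, rhs: &BTreeSet<Combo>) -> BTreeSet<Combo> {
    let mut out = BTreeSet::new();
    for l in lhs {
        for r in rhs {
            out.insert(l.union(r).copied().collect());
        }
    }
    out
}

/// Enumerate the subtype combinations allowed by `expr`.
fn instantiable(expr: &Expr) -> BTreeSet<Combo> {
    match expr {
        Expr::Entity(name) => BTreeSet::from([Combo::from([*name])]),
        // ONEOF: exactly one alternative is chosen.
        Expr::OneOf(choices) => choices.iter().flat_map(instantiable).collect(),
        // AND: both sides must be satisfied together.
        Expr::And(l, r) => cross(&instantiable(l), &instantiable(r)),
        // ANDOR: either side alone, or both together.
        Expr::AndOr(l, r) => {
            let (l, r) = (instantiable(l), instantiable(r));
            let mut out = cross(&l, &r);
            out.extend(l);
            out.extend(r);
            out
        }
    }
}
```

Under these assumptions, `ONEOF(male, female) AND ONEOF(citizen, alien)` yields four combinations, none of which contains both male and female, while `employee ANDOR student` yields three: employee, student, and both.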
Legalize trait
Most structs in this sub-module implement the Legalize trait, which creates them from the corresponding AST portion. Legalize::legalize is called recursively while traversing the AST.
These structs, called IRs (intermediate representations), are designed with the following rules:
- Code generation only looks at IRs, never at the AST. All information required for code generation must be contained in the IR.
- Code generation does not execute global analysis, e.g. checking whether a type reference refers to a primitive type or not.
This crate is motivated by generating Rust code, but is designed to be usable for generating other content, e.g. Python code or an HTML reference.
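The recursive pattern and the two design rules can be illustrated with a reduced trait. Everything in this sketch is an assumption made for illustration: the trait signature is simplified (the real Legalize::legalize may take different arguments), and `Namespace`, `AstEntity`, `AstAttribute`, `Entity`, `Attribute`, `TypeRef`, and the tuple form of `SemanticError` are hypothetical stand-ins.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for Namespace: local identifier -> fully qualified path.
type Namespace = HashMap<String, String>;

#[derive(Debug, PartialEq)]
struct SemanticError(String);

/// Simplified, assumed shape of the trait: build an IR node from an AST portion.
trait Legalize: Sized {
    type Input;
    fn legalize(ns: &Namespace, input: &Self::Input) -> Result<Self, SemanticError>;
}

// Hypothetical AST portions: names are kept exactly as written in the source.
struct AstAttribute { name: String, ty: String }
struct AstEntity { name: String, attributes: Vec<AstAttribute> }

// Hypothetical IR nodes: whether a type is primitive is decided here, once,
// so that code generation never has to repeat the analysis.
#[derive(Debug, PartialEq)]
enum TypeRef { Primitive(String), Named(String) }
#[derive(Debug, PartialEq)]
struct Attribute { name: String, ty: TypeRef }
#[derive(Debug, PartialEq)]
struct Entity { name: String, attributes: Vec<Attribute> }

impl Legalize for TypeRef {
    type Input = String;
    fn legalize(ns: &Namespace, input: &String) -> Result<Self, SemanticError> {
        match input.as_str() {
            "REAL" | "INTEGER" | "STRING" => Ok(TypeRef::Primitive(input.clone())),
            name => ns
                .get(name)
                .cloned()
                .map(TypeRef::Named)
                .ok_or_else(|| SemanticError(format!("unknown type: {}", name))),
        }
    }
}

impl Legalize for Entity {
    type Input = AstEntity;
    fn legalize(ns: &Namespace, input: &AstEntity) -> Result<Self, SemanticError> {
        let mut attributes = Vec::new();
        for attr in &input.attributes {
            // Recursive step: each nested AST portion is legalized in turn.
            let ty = TypeRef::legalize(ns, &attr.ty)?;
            attributes.push(Attribute { name: attr.name.clone(), ty });
        }
        Ok(Entity { name: input.name.clone(), attributes })
    }
}
```

The point of the design is visible in `TypeRef`: a code generator that receives `TypeRef::Primitive` or `TypeRef::Named` needs neither the AST nor the namespace to decide what to emit.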
Structs
Global constraints in EXPRESS components
Enumeration of values, e.g. TYPE text_path = ENUMERATION OF (up, right, down, left); END_TYPE;
Intermediate Representation
Instantiable subtypes described by a list of partial complex entity, e.g. $[A, B & C]$
Namespace of loaded EXPRESS schema
Partial complex entity data type, e.g. $A \And B \And C$ in ISO document
Rename of user defined type, e.g. TYPE box_height = positive_ratio_measure; END_TYPE;
Scope declaration
Select of user defined types, e.g. TYPE geometric_set_select = SELECT (point, curve); END_TYPE;
Rename of primitive type, e.g. TYPE label = STRING; END_TYPE;
Enums
Expression appearing in SUBTYPE_CONSTRAINT with resolved Path
Named AST portion of corresponding Path
Identifier in EXPRESS language must be one of scopes described in “Table 9 – Scope and identifier defining items”
Semantic errors
Traits
Legalize partial AST input into corresponding intermediate representation