Module espr::ir

Intermediate Representation (IR) legalized (semantically analyzed) from SyntaxTree

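The legalize procedure consists of three steps: Namespace creation, gathering of SubType/SuperType constraints, and legalization of the AST through the Legalize trait.

The first two steps are global analyses, which makes the last step local.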

Namespace creation

The legalize phase starts with Namespace creation. This step also introduces Scope and Path to represent scopes and names in an EXPRESS schema. Using the Namespace, we can look up the Path corresponding to a local identifier that appears in a Scope.

SCHEMA sc1;  -- Scope "sc1[schema]" starts
  ENTITY a;  -- Path "sc1[schema].a[entity]" is registered
    x: REAL;
    y: REAL;
  END_ENTITY;

  ENTITY b;  -- Path "sc1[schema].b[entity]" is registered
    z: REAL;
    a: a;    -- Identifier "a" is resolved as a path "sc1[schema].a[entity]"
  END_ENTITY;
END_SCHEMA;  -- Scope "sc1[schema]" ends
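
The lookup described above can be sketched as follows. This is a minimal illustration, not the crate's actual API: the `Namespace` struct, its `register` and `resolve` methods, and the representation of a scope as a list of strings are all assumptions made for this example. It registers only entities, and resolves an identifier by searching the current scope first and then each enclosing scope.

```rust
use std::collections::HashMap;

/// Hypothetical scope: a chain of components such as
/// ["sc1[schema]", "b[entity]"]. Not the crate's real `Scope` type.
pub type Scope = Vec<String>;

/// Hypothetical namespace: scope -> (identifier, kind) pairs declared in it.
#[derive(Default)]
pub struct Namespace {
    names: HashMap<Scope, Vec<(String, String)>>,
}

impl Namespace {
    /// Record that `name` of the given `kind` is declared directly in `scope`.
    pub fn register(&mut self, scope: &[&str], name: &str, kind: &str) {
        let scope: Scope = scope.iter().map(|s| s.to_string()).collect();
        self.names
            .entry(scope)
            .or_default()
            .push((name.to_string(), kind.to_string()));
    }

    /// Search `scope`, then each enclosing scope, for an identifier `name`
    /// and return its full path as text, e.g. "sc1[schema].a[entity]".
    pub fn resolve(&self, scope: &[&str], name: &str) -> Option<String> {
        let mut scope: Scope = scope.iter().map(|s| s.to_string()).collect();
        loop {
            if let Some(entries) = self.names.get(&scope) {
                if let Some((n, kind)) = entries.iter().find(|(n, _)| n.as_str() == name) {
                    return Some(format!("{}.{}[{}]", scope.join("."), n, kind));
                }
            }
            // Move to the enclosing scope; give up once there is none left.
            scope.pop()?;
        }
    }
}
```

With entities `a` and `b` registered in `sc1[schema]`, resolving `a` from inside `sc1[schema].b[entity]` finds nothing in the entity scope, falls back to the schema scope, and yields `sc1[schema].a[entity]`.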

SubType/SuperType constraints

An ENTITY may have subtype or supertype constraints.

ENTITY person;
  name: STRING;
END_ENTITY;

ENTITY employee SUBTYPE OF (person);
  pay: INTEGER;
END_ENTITY;

ENTITY student SUBTYPE OF (person);
  school_name: STRING;
END_ENTITY;

This means employee has a field pay in addition to name inherited from person, and an employee can be instantiated as a person. Instances that contain two or more entity values are called “complex entity instances”, and they are mapped into exchange structures using one of two rules, “internal mapping” or “external mapping”:

/* internal mapping */
#1 = EMPLOYEE('Hitori Goto', 10);
#2 = STUDENT('Ikuno Kita', 'Shuka');

/* external mapping */
#3 = (PERSON('Hitori Goto') EMPLOYEE(10));
#4 = (PERSON('Ikuno Kita') STUDENT('Shuka'));
#5 = (PERSON('Nizika Iziti') EMPLOYEE(15) STUDENT('Simokitazawa'));

With internal mapping, the inherited attributes (name) appear first, followed by the explicit attributes (pay, school_name). With external mapping, an instance is represented by a list of “partial complex entity values” enclosed in (). The instances #1 (#2) written with internal mapping and #3 (#4) written with external mapping represent the same value of the employee (student) entity, but internal mapping cannot describe the #5 case. Unlike typical Object-Oriented Programming (OOP) languages such as C++ or Python, a person can be both an employee and a student simultaneously, i.e. a person object may have both a pay field and a school_name field, as in #5.
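
The difference between the two mappings can be illustrated with a small formatter. This is a sketch for explanation only: the `Partial` struct and the functions `external_mapping` and `internal_mapping` are hypothetical names, not part of this crate, and attribute values are passed in as pre-formatted text.

```rust
/// Hypothetical partial complex entity value: an entity name plus its own
/// (non-inherited) attribute values, already formatted as text.
pub struct Partial {
    pub entity: &'static str,
    pub params: Vec<&'static str>,
}

/// External mapping: every partial value is written separately inside `( )`.
pub fn external_mapping(id: u32, partials: &[Partial]) -> String {
    let body: Vec<String> = partials
        .iter()
        .map(|p| format!("{}({})", p.entity.to_uppercase(), p.params.join(", ")))
        .collect();
    format!("#{} = ({});", id, body.join(" "))
}

/// Internal mapping: `chain` runs from the supertype down to the leaf entity,
/// so inherited attributes come first. Returns `None` for an empty chain.
pub fn internal_mapping(id: u32, chain: &[Partial]) -> Option<String> {
    let leaf = chain.last()?;
    let params: Vec<&str> = chain
        .iter()
        .flat_map(|p| p.params.iter().copied())
        .collect();
    Some(format!(
        "#{} = {}({});",
        id,
        leaf.entity.to_uppercase(),
        params.join(", ")
    ))
}
```

Internal mapping only works here because the partial values form a single chain from supertype to leaf. An instance that is both employee and student has two leaves, which is why only the external form can express it.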

This kind of inheritance is called ANDOR in EXPRESS, and it is the default constraint for a supertype. We can write this constraint explicitly in the entity declaration of person:

ENTITY person SUPERTYPE OF (employee ANDOR student);
  name: STRING;
END_ENTITY;

or as a separate SUBTYPE_CONSTRAINT declaration:

SUBTYPE_CONSTRAINT person_prop FOR person;
  employee ANDOR student;
END_SUBTYPE_CONSTRAINT;

We cannot determine the subtypes of an entity from its ENTITY declaration alone, because the default constraint need not be written out. SUBTYPE OF relations are therefore gathered into the Constraints struct, which looks up subtype paths from a supertype path, before the AST of entities is legalized.
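
The gathering step can be pictured as inverting the SUBTYPE OF relation. The sketch below is a simplified assumption, not the real Constraints struct: `EntityDecl` and `gather_subtypes` are hypothetical names, and entities are identified by plain names rather than resolved Paths.

```rust
use std::collections::BTreeMap;

/// Hypothetical, minimal view of an entity declaration: its name and the
/// names listed in its `SUBTYPE OF (...)` clause.
pub struct EntityDecl {
    pub name: &'static str,
    pub supertypes: Vec<&'static str>,
}

/// Invert the SUBTYPE OF relation: supertype name -> names of its subtypes,
/// in declaration order.
pub fn gather_subtypes(
    entities: &[EntityDecl],
) -> BTreeMap<&'static str, Vec<&'static str>> {
    let mut map: BTreeMap<&'static str, Vec<&'static str>> = BTreeMap::new();
    for entity in entities {
        for sup in &entity.supertypes {
            map.entry(*sup).or_default().push(entity.name);
        }
    }
    map
}
```

Because every declaration in the schema must be visited before the subtypes of any single entity are known, this is a global analysis and has to run before individual entities are legalized.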

There are two other kinds of constraints. The first is the ONEOF constraint:

ENTITY pet;
  name : pet_name;
END_ENTITY;

SUBTYPE_CONSTRAINT separate_species FOR pet;
  ABSTRACT SUPERTYPE;
  ONEOF(cat, rabbit, dog);
END_SUBTYPE_CONSTRAINT;

ENTITY cat SUBTYPE OF (pet);
END_ENTITY;

ENTITY rabbit SUBTYPE OF (pet);
END_ENTITY;

ENTITY dog SUBTYPE OF (pet);
END_ENTITY;

In the real world, a pet cannot be both a cat and a rabbit. This is expressed by the ONEOF constraint in the SUBTYPE_CONSTRAINT declaration. The second is the AND constraint:

ENTITY person;
END_ENTITY;

ENTITY male SUBTYPE OF (person);
END_ENTITY;

ENTITY female SUBTYPE OF (person);
END_ENTITY;

ENTITY citizen SUBTYPE OF (person);
END_ENTITY;

ENTITY alien SUBTYPE OF (person);
END_ENTITY;

SUBTYPE_CONSTRAINT person_prop FOR person;
  ONEOF(male, female) AND ONEOF(citizen, alien);
END_SUBTYPE_CONSTRAINT;

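AND behaves like the boolean operator: both operands must hold. Here, an instance satisfying the constraint is either male or female and, at the same time, either citizen or alien.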
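
To make the three operators concrete, the following sketch enumerates which combinations of subtypes each expression allows. It is a simplified reading of the semantics described above, not this crate's implementation: the `Expr` enum and the `combinations` function are hypothetical, and abstract supertypes and nested schemas are ignored.

```rust
use std::collections::BTreeSet;

/// Hypothetical, simplified constraint expression over entity names.
#[derive(Debug, Clone)]
pub enum Expr {
    Ref(&'static str),
    OneOf(Vec<Expr>),
    And(Box<Expr>, Box<Expr>),
    AndOr(Box<Expr>, Box<Expr>),
}

/// One allowed combination of subtypes, e.g. {employee, student}.
pub type Combination = BTreeSet<&'static str>;

/// Pairwise union of every combination on the left with every one on the right.
fn product(lhs: &BTreeSet<Combination>, rhs: &BTreeSet<Combination>) -> BTreeSet<Combination> {
    let mut out = BTreeSet::new();
    for l in lhs {
        for r in rhs {
            out.insert(l.union(r).copied().collect());
        }
    }
    out
}

/// Enumerate the subtype combinations permitted by `expr`.
pub fn combinations(expr: &Expr) -> BTreeSet<Combination> {
    match expr {
        // A single entity allows exactly itself.
        Expr::Ref(name) => BTreeSet::from([Combination::from([*name])]),
        // ONEOF: pick a combination from exactly one operand.
        Expr::OneOf(items) => items.iter().flat_map(combinations).collect(),
        // AND: a combination from each operand, merged.
        Expr::And(l, r) => product(&combinations(l), &combinations(r)),
        // ANDOR: either operand alone, or both merged.
        Expr::AndOr(l, r) => {
            let (l, r) = (combinations(l), combinations(r));
            let mut out = product(&l, &r);
            out.extend(l);
            out.extend(r);
            out
        }
    }
}
```

Under this reading, `employee ANDOR student` allows employee, student, and employee together with student, while `ONEOF(male, female) AND ONEOF(citizen, alien)` allows exactly four combinations, each pairing one of male or female with one of citizen or alien.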

Legalize trait

Most structs in this sub-module implement the Legalize trait, which creates them from the corresponding portion of the AST. Legalize::legalize is called recursively while traversing the AST. These structs, called IRs (intermediate representations), are designed according to the following rules:

  • Code generation looks only at IRs, never at the AST. All information required for code generation must be contained in the IR.
  • Code generation does not perform global analysis, e.g. checking whether a type reference refers to a primitive type.

This crate was motivated by generating Rust code, but it is designed to be usable for generating other output as well, e.g. Python code or an HTML reference.
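
The recursive pattern and the two design rules can be sketched with toy types. Everything below is an assumption made for illustration: the trait shown is a simplified stand-in whose signature differs from the crate's own Legalize trait, the AST and IR structs are invented, and a plain set of known type names stands in for the Namespace.

```rust
use std::collections::HashSet;

// Toy AST: type references are still plain, unresolved strings.
pub struct AstAttribute { pub name: String, pub ty: String }
pub struct AstEntity { pub name: String, pub attributes: Vec<AstAttribute> }

// Toy IR: every type reference is already classified, so code generation
// never needs to ask whether a name is a primitive type.
#[derive(Debug, PartialEq)]
pub enum TypeRef { Primitive(String), Named(String) }
#[derive(Debug, PartialEq)]
pub struct Attribute { pub name: String, pub ty: TypeRef }
#[derive(Debug, PartialEq)]
pub struct Entity { pub name: String, pub attributes: Vec<Attribute> }

#[derive(Debug, PartialEq)]
pub struct SemanticError(pub String);

/// Hypothetical, simplified legalize trait (not the crate's real signature).
pub trait Legalize: Sized {
    type Input;
    fn legalize(known: &HashSet<String>, input: &Self::Input) -> Result<Self, SemanticError>;
}

impl Legalize for Attribute {
    type Input = AstAttribute;
    fn legalize(known: &HashSet<String>, input: &AstAttribute) -> Result<Self, SemanticError> {
        let ty = match input.ty.as_str() {
            "REAL" | "INTEGER" | "STRING" => TypeRef::Primitive(input.ty.clone()),
            name if known.contains(name) => TypeRef::Named(name.to_string()),
            other => return Err(SemanticError(format!("unknown type: {}", other))),
        };
        Ok(Attribute { name: input.name.clone(), ty })
    }
}

impl Legalize for Entity {
    type Input = AstEntity;
    fn legalize(known: &HashSet<String>, input: &AstEntity) -> Result<Self, SemanticError> {
        // Recurse into each attribute; the first error aborts the whole entity.
        let attributes = input
            .attributes
            .iter()
            .map(|a| Attribute::legalize(known, a))
            .collect::<Result<Vec<_>, _>>()?;
        Ok(Entity { name: input.name.clone(), attributes })
    }
}
```

Because the set of known names is computed before any entity is visited, each `legalize` call only needs its own AST node plus that precomputed global information, which is what keeps the final step local.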

Structs

Global constraints in EXPRESS components

Enumeration of values, e.g. TYPE text_path = ENUMERATION OF (up, right, down, left); END_TYPE;

Intermediate Representation

Instantiable subtypes described by a list of partial complex entities, e.g. $[A, B & C]$

Namespace of loaded EXPRESS schema

Partial complex entity data type, e.g. $A \And B \And C$ in ISO document

Rename of user defined type, e.g. TYPE box_height = positive_ratio_measure; END_TYPE;

Scope declaration

Select of user defined types, e.g. TYPE geometric_set_select = SELECT (point, curve); END_TYPE;

Rename of primitive type, e.g. TYPE label = STRING; END_TYPE;

Enums

Expression appearing in a SUBTYPE_CONSTRAINT, with resolved Paths

Named AST portion of corresponding Path

An identifier in the EXPRESS language must be one of the scopes described in “Table 9 – Scope and identifier defining items”

Semantic errors

Traits

Legalize partial AST input into corresponding intermediate representation

Functions