The Tunes HLL Semantics
General concepts
- Objects
- The language manipulates objects.
- Any identifiable abstraction in the system is an object.
- Attributes
- Among objects, some are attributes, or equivalently, functions.
- Objects and attributes operate on each other, by some duality isomorphism,
you could exchange objects and attributes, and say that objects apply on
functions instead of functions applying on objects.
- Between attributes and functions is only a difference of point of view.
We tend to say one or the other according to the way we'd like it to be
implemented: we tend to call it function if we think it better evaluated
before its argument object, and attribute if we think it better evaluated
after this object. Actual implementation may make these considerations
meaningless.
- Note that these are just particular objects, and thus share other
features with objects. In particular, attributes can have attributes,
functions can operate on functions, and function or attribute objects can
be result of computations.
- Context
- An object is never considered alone, but in a context, or space,
that describes the meaningful aspects of the object:
the context gives the full semantics of the object.
- Only meaningful constructors may be used on objects; for example, even
if an object could be modified, by considering it in a context where modifying
it is excluded, it is made read-only; by considering an object in a context
which has a modifier method, the object is made read-write.
- The context may also explicitly define specifications of the object,
so that the system may simplify programs in a way that will keep the
semantics by doing transformations that let meaningful things unchanged, while
limiting resource allocation in unmeaningful dependences.
e.g. if your methods for a list maintain a count of list elements that
you don't need, it may be ripped off automatically by the system if not
declared meaningful somewhere. Note that different views on the "same" object
may consider different things as meaningful...
- Rewrite
- We need some way to express the fact that objects interact, and knowledge
evolves, .....
- Rewrite has Higher Order semantics:
to express t.....
- Active annotations
- As an example of the power of annotations context, in some metaspaces,
objects may have active attributes, that are evaluated together with
the object.
- Trivial examples include scheduling/usage information as call counter
for a function, notification of one's login to others, etc.
- actively annotating objects may be a fairly standard programming technique:
to parse a text, dynamically write annotations it during its being read; active
annotations to the byte reader may correspond to the program's state; it may
then become trivial to write programs that depend on the past and
future of the local input context.
- but this feature should be made static (i.e. removed from the dynamic
context) whenever not needed, because it hampers good implementation:
in fact, active annotation may mean that an object must notify others
whenever it is evaluated; dynamic active annotations mean some space and/or
time overhead so that such notification is done for every element in a
dynamicly evolving list of objects.
- Access rights
- Access rights are implemented through HLL context and scoping.
An object is accessible by another object if and only if it is in
that other object's HLL context.
- So an object actually exists in the system if and only if there is a
path of objects that each allows access to the next, beginning from the
boot program, and ending with the considered object.
- You limit the rights by just wrapping the object with evaluators that
will allow access only to parts of existing aspects of the object.
- The compiler or run-time system should inline any frequently evaluated
wrapping code, so this is no performance hit.
- An object that verifies or inlines accesses is called a meta-object.
- In particular, read-only or even constant objects are encouraged
(by not publishing any way to modify the object in any context), because
of their semantics much nicer to express and implement (i.e. much simpler
meta-objects).
- When side-effect functions are hidden from the context, an object may be
completely read-only. Particularly, you can't change an attribute's value if
you don't have the proper "rights".
- There is no such thing in the OS as builtin capability lists or
passwords, but you can still use those lame unsecure mechanisms, if you're
masochistic and paranoid, through proper wrapping.
- Types
- Being able to access objects does not mean you can combine them blindly.
To actually combine two objects or evaluate one someway, you must
somehow have the system trust such operation may not ever result in a crash.
- Thus, two objects can connect if only if there exists meta-objects that
agrees.
- The set of operations that a meta-object allows for an object is often
called the object's type.
We thus call "type-system" a meta-object that manages
connections between objects,
even though these meta-objects can do things completely different
from traditional type systems.
Verifying that objects' type allow them to combine is called
to typecheck the combination.
- Common languages use deterministic first-order or higher-order
type systems; but any kind of type system is possible,
as long as it is trusted.
- An object may have no relevant type in some type system,
and thus not be able to combine with other objects in this type system;
but it may also have types in other type systems, and thus be able to combine
in these.
- Whatever type systems are used is determined by the context.
- To found the system, a very basic type systems will allow to
define and run low-level code, which makes the
LLL a subset of the HLL.
- Besides traditional type systems,
there will be types consisting of list of capabilities,
with or without encrypted signatures, etc.
Actually, there will be a whole type-system construction kit.
- And for optimal security, ultimate type systems will be
based on proofs in some formal mathematical logic system.
- Proofs
- In such ultimate type systems, the type of an object is
a specification of the object, together with a possible proof that this
specification is fullfilled.
- Such type systems, which can be very well combined with lesser type
systems, allow to express just any theoretically computable requirement.
Thus, they (and only they) can make us just as confident in programs as
we can ever be.
- The drawback of such rich type systems is that while there it is
fairly easy to typecheck a program given enough data, there is no
deterministic way to synthetize this data for an arbitrary program.
For this reason, many people refuse to use such typesystems.
While of course nobody should be forced to use them, this attitude
is criminal when it forces people not to use them.
For there is security only such typesystems can provide,
and that is made impossible by the stubborn people who prevent
such typesystems to be used by the common programmer.
- Such systems force programmers to work more to achieve the
additional security they can provide, so what ?
Of course one must work to achieve any result;
I can't imagine any system (computer or not) where this is not the case
(by definition of work).
By letting people underspecify their programs,
these rich systems can be used by the lazy or hurried people
as any type system, so they can still do what they could with
poorer type systems, and are not forced to work more.
But with a rich type system they can provide or require
such information, whereas poor type systems prevent them from
doing so.
Therefore, rich type systems are additional freedom,
while poor type systems are a terrible limitation to computer users.
- Thus, objects that are to provide rich information and still
run in a decidable way must include enough data to generate a
proof of their specification together with this specification.
Typechecking such objects can be done very efficiently in a
deterministic way.
- Generating the proofs can be done with lots of tools,
but we know (Godel, Turing, etc) that there will always be
need for intervention of an intelligent being,
because no deterministic generic algorithm exists to generate proofs.
- Using the Curry-Howard isomorphism,
that shows that proofs and programs have the same structure,
we can also obtain proofs from programs and conversely,
with just the proper annotations
(see the Coq proof environment).
- Specifications include invariant and variant part.
- The invariant describes statical information in the program,
that is, what information
is required, conserved, transformed and acquired by
the program when it is run.
- The variant describes what informational magnitudes should be
minimized or maximized by the program, according to some
partial ordering.
Typically, consumed system resources and unwanted
side-effects are invoked here;
but any information can be used here, so that
the greater (or lesser) the magnitude,
the more efficient the program,
and the more satisfied the user.
- Security
- The weaker the typechecking, the lower the security.
- All systems, from the poorest to the richest,
rely on some number of axioms.
The problem is to limit the number and unsecurity of these axioms,
so we can trust the system as much as possible.
- Eventually, the list of such axioms will be made very small,
so that the human user can control it. The most fundamental axioms,
consistency of the most basic system components, will be very reliable
because the whole world will be warrant for it, the sources being
freely available, and bugs (or hopefully absence of known bugs)
will be reported immediately on the Internet.
- Because proofs depend on axioms, the axiomatic context must be
clear for any object. Also, there must be world-wide ways to identify
axioms and objects, so proofs are universally understandable.
- To ensure security while still allowing objects to use axioms that
other objects distrust, the system can isolate objects from each other
in sandboxes, that can be much like the "processes" of traditional systems.
- Choice
- The language must require as few redundancies as possible:
a programmer may want to give as few information as possible to
the computer, that will have to fill the missing redundant information all
by itself (with human hint as needed)
- There is a mechanism for that, which is the use of a "choice" construct,
or implicitness construct. When such a construct is used, the system must
somehow provide an object fulfilling the requirements, if there exists one,
or else issue an exception.
- Exactly how an object is chosen must be completely human-controllable,
as there is no decidable algorithm to fulfill constraints. This control
is done through annotations and contextual information, as usual;
it can be statically or dynamically bound, etc.
- This is how the user interface is implemented:
when a program needs some data to be provided, it calls the "magic" of the
choice axiom.
- Of course, the choice axiom can be a source of unsecurity, or at least
absence of warranties, and shall be listed in the list of axioms used by
any object invoking it.
Standard Basic constructors
The HLL provides a set of basic constructors.
This set should enable it to emulate any kind of other set of constructors.
- Applying Functions
- As in SELF or LISP, the basic constructor will be applying an
attribute/function to an object.
- As in Scheme or ML, and unlike SELF or LISP, the attribute/function
is not a meta-level symbol, but a zero-level (first-class) object.
- Abstracting terms
- The next basic constructor is typed abstraction on any sub-object of
an expression, using any type system, which allows to create new functions.
i.e. lambda-expressions from an arbitrarily (but strongly) typed lambda
calculus, which gives something like
lambda (x).sin(2*x+1),
or (using Dylan syntax)
method (x) ; sin(2*x+1) ; end method ;
or (using CAML syntax)
fun x -> 2 *. x +. 1
or (using Coq syntax)
[x:Real](2 *. x +. 1)
- Defining typed constructors
- You can define typed constructor axioms; but they won't be trusted
unless you are trusted too; that is, trusting the axiom creator
is part of any proof.
e.g. If you have access to the Coq-like object "Set" (the type of
constructible data objects, that correspond to actual programs and
their data in Coq).
e.g. you can use standard "compilers" that construct objects from tools;
among them are creations of inductive definitions and suches (see Coq)
or (if you're trusted enough in the context), you can associate low-level
objects to a high-level specification without a proof.
[.....]
- Standard constructors (including all the previous) can be selectively
trusted, so that contexts can be very limited (e.g. only applying functions
among those from a list), or very open (trusting all the inherently secure
constructors plus a list of axioms/trustees)
- Inductive definitions (as in Coq) may be used as a secure way to construct
new types.
- Evaluation strategy
- Objects can be future or actual;
- you can explicitly force or delay objects, or do it implicitly by
just annotating a function object to say that its argument may be future,
or should be actual.
- functions can be pure or impure (without or with side-effects);
in the latter case, specifications may include how impure the function is.
- .....
Standard Library of language extensions
- The standard library of extensions provides standard objects for use in
various contexts.
- For the User Interface part, we should collaborate tightly with the
UI subproject.
Semantic Difficulties
- Syntax is definitely NOT a technical difficulty.
Moreover, a proper system should be able to simulate any syntax.
The problems are in the semantics, which is quite another thing.
[However, syntax is a matter of conventions,
and conventions involve difficulties other than technical].
- The most difficult thing (I think) is about administrative, weak
pointers, that are often useful for performant algorithm, but do not count to
determine the object's usefulness. e.g. a device driver or event
handler of any kind must register to the event provider/resource allocator,
which in turn may (or may not) need have pointers back to the handler (e.g.
linked list of handlers); but this back-pointer does NOT represent
that the driver/handler is actually needed.
- Then, there is the problem of the efficiency of dynamic annotations,
all the more when it comes to annotating light objects:
what mechanisms can be used to annotate terminal objects ?
Reserving an annotation list for each object would be quite too costly
(and a C++ like bloat), and should be reserved for objects that are
more often manipulated by humans than used in actual computations
(i.e. the objects nearest to the human user).
- So annotation mechanism depends on: the annotation object, the annotated
object, and the context meta-object.
- It seems that somehow, the system should (dynamically) differentiate
coarse-interfaced objects (including single integers that are being
watched by external humans) from lightly interfaced ones. The former
would have an "annotation" field, while the latter would have a more
expensive mechanism for annotation (e.g. using hashed association tables),
or would have to be annotated through a coarse-object wrapping.
- And what about annotating annotations ?
- We must have some efficient implementation to track dependencies when
there is partial evaluation, so that no invalidated code is run.
Note that this encompasses any kind of "make" utility, and is much better,
as it supports a fine grain and exact semantics !
- We must keep the semantics as simple and controllable as possible,
to allow program proof.
- Most importantly, all objects should be defined clearly in an
abstract context that shows the "limits" of the object:
an object is a term made with implicit and explicit parameters.
Though some implicit parameters may be instantiated, they can be migrated;
though explicit parameters may be future values, they cannot be migrated.
These concepts naturally extend to all meta-objects for a considered
object.
Now, we want to be able to say things about the constructors that an
object was made with. We want to be able to determine a superlist of
constructors used to make an object, and/or to say that the object can
be expressed in a theory that does not include some constructor.
The problem is when we use meta-constructors that can be eliminated
(by e.g. beta-conversion, inference, or more generally higher-order term
rewrite), but that we don't want to be actually eliminated. Such meta-
constructors are part of the implicitly instanciated, migratable
meta-constructors for the object. Typical example: though we define
"true" natural integers by Peano's axioms, we want to use usual
logarithmical size representation to manipulate them. So even if we
discuss about a type Nat=0|S:Nat->Nat, we can manipulate
usual system integers, and other such types.
To Do on this page
Complete the semantics.
Detail the rules to prove programs.
Map each constructor to all its syntaxes.
Find out how multiple type systems can be made compatible...
Separate annotation mechanisms from annotation resolving policies...
Find out how to separate or merge the "useful/administrative" or
"in context/out of context" information.
Wait for feedback, as usual.
Back to the Tunes
HLL Subproject.
Page Maintainer:
Faré
-- rideau@clipper.ens.fr