[draft]

Taking "no computer" as the reference for utility only answers the question "are computers useful?". Now that we know the answer to this question, we shouldn't stop there. Sure, computers in general haven't failed, and OSes earn their share of this success. But that doesn't mean that things are "optimal" and deserve no criticism. On the contrary...

If safety criteria are not expressible, then to be safe, programs must be understandable by one man. Because that man won't always be there to maintain the code, because armies of "maintainers" won't replace him, and because there is no tool to safely adapt old programs to new points of view, every so often code must go to the dustbin. No wonder software evolves so slowly: only some small human experience remains, and even then, because there is no way to express what that experience is, it cannot spread at the speed of technology, but only from man to man.

Util

  • Part I would:
  • Part II would discuss programming language utility, stating the key concepts about it.
  • Notably, after discussing how to construct as many new concepts as possible, it should explain that the key to the expressivity of concepts (which reflectivity cannot indefinitely postpone) is their separation power, and thus the capability to affirm one of multiple alternatives, to express different things, to negate and deny things.
  • Part III would discuss what tradition is, what its role is, how it should or should not be considered, and what it currently does wrong. It would debunk myths
  • Efficiency, Security, Small-grain: take two, you have the third. That is, when you need two of them, you also need the third, but when you have two of them, you automatically have the third.
  • With OO, people discovered that implicit binding is needed. Unfortunately, most "OO" systems only know as-late-as-possible binding, and nothing like reflectivity (=implicitness control) or migration (=modification of implicitness control).
    The problem of having generic programs instead of just specific ones is exactly the main point that we saw about having a good grammar to introduce new generic objects, instead of just an increasing number of terminal, first-order objects that actually do specific things (i.e. extending the vocabulary).


    It may be said that computing has made quantitative leaps, but no comparable qualitative leap; computing grows in extension, but does not evolve toward intelligence; sometimes it merely becomes stupid on a larger scale. This is the problem of operating systems not having a good conceptual kernel: however large and complete their standard library, their utility will be essentially restricted to the direct use of the library.


  • Newest Operating Systems: the so-called "Multimedia revolution"
  • This phenomenon can also be explained by the fact that programmers, long used to software habits from the heroic times when computer memories were too tight to contain more than just the specific software you needed (when they could even contain that), do not seem to know how to fill the memory of today's computers, except with pictures of gorgeous women and digitized music (which is the so-called multimedia revolution). Computer hardware capabilities evolved much quicker than human software capabilities; thus humans find it simpler to fill computers with raw data (or almost raw data) than with intelligence.

    Those habits, it must be said, were especially encouraged by the way information could not spread and augment the common public background: for lack of theory and practice of what a freely communicating world could or should be, only big companies enforcing a "proprietary" label have so far been able to broadcast their software. People who would develop original software thus had (and sadly still have) to rewrite everything almost from scratch, unless they could afford a very high price for every piece of software they might want to build upon, without having much control over the contents of such software.


  • The role of the OS infrastructure in the computer world is much like that of the State in human societies: it should provide justice by guaranteeing, by force if need be, that contracts will be fulfilled, and nothing more. In the case of computer software, this means that it will guarantee that contracts passed between objects will be fulfilled, that objects should fulfill each other's requirements before they can connect. When there is no Justice, there is no society/OS, but only chaos.


    What is really useful is a higher-order grammar, which allows manipulating any kind of abstraction that does any kind of thing at any level. We call level 0 the lowest kind of computer abstraction (e.g. bits, bytes, system words, or to idealize, natural integers). Level 1 is made of abstractions of these objects (i.e. functions manipulating them). More generally, level n+1 is made of abstractions of level n objects. We see that every level is a useful abstraction, as it allows manipulating objects that could not be manipulated otherwise.
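
    As a rough illustration of these levels, here is a minimal Python sketch. All names in it (successor, twice, apply_to_successor) are hypothetical and chosen only for this example; it merely shows one object at each of the first few levels.

```python
# Level 0: plain data (here, a natural integer).
three = 3

# Level 1: a function manipulating level-0 objects.
def successor(n):
    return n + 1

# Level 2: a function manipulating level-1 objects (functions on integers).
def twice(f):
    return lambda n: f(f(n))

# Level 3: a function manipulating level-2 objects.
def apply_to_successor(transformer):
    return transformer(successor)

add_two = apply_to_successor(twice)  # a level-1 object built from above
result = add_two(three)
```

    Each level is only reachable because the language lets functions be passed around like any other value.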

    But why stop there? Every time we have a set of levels, we can define a new level by having objects that arbitrarily manipulate any lower object (that's ordinals); so we have objects that manipulate arbitrary objects of finite level, etc. There is an unbounded infinity of abstraction levels. To have the full power of abstraction, we must allow the use of any such level; but why not allow manipulating such full-powered systems? Any logical limit you put on the system may be reached one day, and on that day the system would become completely obsolete; that's why any system, to last, must potentially contain (not in a subsystem) any single feature that may be needed one day.

    The solution is not to offer any bounded level of abstraction, but unlimited abstracting mechanisms: instead of offering only terminal operators (BASIC), or first-level operators (C), or even finite-order operators, offer combinators of arbitrary order.

    Offer a grammar with an embedding of itself as an object. Of course, a simple logical theorem says that there is no consistent internal way of saying that the manipulated object is indeed the system itself, and the system state will always be much more complicated than what the system can understand about itself; but the system implementation may be such that the manipulated object indeed is the system. This is having a deep model of the system inside itself, and this is quite useful and powerful. This is what I call a higher-order grammar -- a grammar defining a language able to talk about something it believes to be itself. And only in this way can full genericity be achieved: allowing absolutely anything that can be done about the system, from inside, or from outside (after abstracting the system itself).


    ..... First, we see that the same algorithm can apply to arbitrarily complex data structures; but a piece of code can only handle a finitely complex data structure; thus, to write code with full genericity, we need to use code as parameters, that is, second order. In a low-level language (like "C"), this is done using function pointers.
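
    To make "code as parameters" concrete, here is a small Python sketch (Python functions play the role that function pointers play in "C"). The function generic_max and its less_than parameter are hypothetical names for this example: the algorithm is written once, and the caller supplies the code that knows the data.

```python
def generic_max(items, less_than):
    """Return the largest item according to the caller-supplied ordering."""
    iterator = iter(items)
    best = next(iterator)  # raises StopIteration on an empty input
    for item in iterator:
        if less_than(best, item):
            best = item
    return best

# The same algorithm, applied to two unrelated notions of "larger".
longest = generic_max(["ab", "abcd", "a"], lambda x, y: len(x) < len(y))
largest = generic_max([3, 9, 4], lambda x, y: x < y)
```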

    We soon see problems that arise from this method, and solutions for them. The first one is that whenever we use some structure, we have to explicitly give functions together with it to explain to the various generic algorithms how to handle it. Worse even, a function that doesn't need some access method of the structure may be asked to call other algorithms which will turn out to need this access method; and which exact method they need may not be known in advance (because which algorithm will eventually be called is not known, for instance, in an interactive program). That's why explicitly passing the methods as parameters is slow, ugly, and inefficient; moreover, that's code propagation (you propagate the list of methods associated with the structure -- if the list changes, all the using code changes). Thus, you mustn't pass those methods explicitly as parameters. You must pass them implicitly: when using a structure, the actual data and the methods to use it are embedded together. Such a structure including the data and the methods to use it is commonly called an object; the constant data part and the methods constitute the prototype of the object; objects are commonly grouped into classes made of objects with a common prototype and sharing common data. This is the fundamental technique of Object-Oriented programming. Well, some call it Abstract Data Types (ADTs) and say it's only part of the "OO" paradigm, while others don't see anything more in "OO". But that's only a question of dictionary convention. In this paper, I'll call it only ADT, while "OO" will also include more things. But know that words are not settled and that other authors may give the same names to different ideas and vice versa.
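
    The implicit passing of methods described above can be sketched as follows in Python. The classes Stack and Queue and the function drain are hypothetical illustrations: the generic algorithm receives only the object, and the access methods travel with it instead of being listed as extra parameters.

```python
from collections import deque

class Stack:
    """Data (a list) and the methods to use it, embedded together."""
    def __init__(self):
        self._items = []
    def insert(self, x):
        self._items.append(x)
    def remove(self):
        return self._items.pop()        # last in, first out
    def is_empty(self):
        return not self._items

class Queue:
    """Same prototype as Stack, different behaviour."""
    def __init__(self):
        self._items = deque()
    def insert(self, x):
        self._items.append(x)
    def remove(self):
        return self._items.popleft()    # first in, first out
    def is_empty(self):
        return not self._items

def drain(container):
    """Generic algorithm: it is never told which access methods to use."""
    out = []
    while not container.is_empty():
        out.append(container.remove())
    return out

stack, queue = Stack(), Queue()
for x in (1, 2, 3):
    stack.insert(x)
    queue.insert(x)
stack_order = drain(stack)
queue_order = drain(queue)
```

    If a new method is added to the prototype, drain and its callers do not change, which is the point about avoiding code propagation.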

    BTW, the same code-propagation argument explains why side effects are an especially useful thing as opposed to strictly functional programs (see pure ML :); of course, side effects greatly complicate the semantics of programming, to the point that ill use of side effects can make a program impossible to understand or debug -- that's what not to do, and such a possibility is the price to pay to prevent code propagation. Sharing mutable data (data subject to side effects) between different embeddings (different users), for instance, is something whose semantics still has to be clearly settled (see below about object sharing).


    The second problem with second order is that if we are to provide functions with other functions as parameters, we should have tools to produce such functions. Methods can be created dynamically as well as "mere" data, which is all the more frequent as a program needs user interaction. Thus, we need a way to have functions not only as parameters, but also as results of other functions. This is Higher Order, and a language which can achieve this has a reflective semantics. Lisp and ML are such languages; FORTH also, although standard FORTH memory management isn't designed for a largely dynamic use of such a feature in a persistent environment. In "C" and similar low-level languages, which don't allow a direct portable implementation of the higher-order paradigm through the common function pointers (because low-level code generation is not available as it is in FORTH), the only way to achieve higher order is to build an interpreter for a higher-order language such as LISP or ML. (Usually, much more restricted languages are actually interpreted, because programmers don't have time to elaborate their own user customization language, users don't want to learn a new complicated language for each different application, and there is currently no standard user-friendly small-scale higher-order language that everyone can adopt -- there are just plenty of them, either very imperfect or too heavy to include in every single application.)
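
    A minimal Python sketch of functions as results of other functions follows. The names make_scaler and compose are hypothetical; the point is only that new functions are produced at run time, each carrying its own environment.

```python
def make_scaler(factor):
    """Return a new function; `factor` lives in that function's own environment."""
    def scale(x):
        return factor * x
    return scale

def compose(f, g):
    """A combinator: takes two functions and returns a third."""
    return lambda x: f(g(x))

double = make_scaler(2)
triple = make_scaler(3)
times_six = compose(double, triple)  # built dynamically, never written by hand
```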

    With respect to typing, Higher-Order means the target universe of the language is reflective -- it can talk about itself.

    With respect to Object-Oriented terminology, Higher-Order consists in having classes as objects, which can in turn be grouped into meta-classes. And we then see that it _does_ prevent code duplication, even in cases where the code concerns just one user, as the user may want to consider concurrently two -- or more -- different instantiations of the same class (i.e. two sub-users may need to have distinct but mostly similar object classes). Higher-Order somehow allows there to be more than one computing environment: each function has its own independent environment, which can in turn contain functions.


    To end with genericity, here is some material to feed your thoughts about the need for system-builtin genericity: let's consider multiplexing. For instance, Unix (or worse, DOS) user/shell-level programs are ADTs, but with only one exported operation, the "C" main() function, per executable file. As such "OSes" are huge-grained, with ultra-heavy inter-executable-file (even inter-same-executable-file-processes) communication semantics, no one can afford one executable per actual operation exported. Thus you'll group operations into single executables whose main() function will multiplex those functionalities.

    Also, communication channels are heavy to open, use, and maintain, so you must explicitly pass all kinds of different data & code into single channels by manually multiplexing them (the same goes for having heavy multiple files or a manually multiplexed huge file).

    But the system cannot provide builtin multiplexing code for each single program that will need it. It does provide code for multiplexing the hardware: memory, disks, serial, parallel and network lines, screen, sound. POSIX requirements grow with the things a compliant system ought to multiplex; new multiplexing programs keep appearing. So the system grows, yet it will never be enough for user demands as long as not all possible multiplexing has been programmed, and meanwhile applications will spend most of their time manually multiplexing and demultiplexing objects not yet supported by the system.

    Thus, any software development on common OSes is hugeware. Huge in hardware resources needed (=memory - RAM or HD, CPU power, time, etc), huge in resources spent, and most importantly, huge in programming time.

    The problem is that current OSes provide no genericity of services. Thus they can never do the job for you. That's why we really NEED generic system multiplexing, and more generally genericity as part of the system. If one generic multiplexer object were built, with two generic specializations for serial channels or flat arrays and some options for real-time behaviour and recovery strategy on failure, that would be enough for all the multiplexing work currently done everywhere.
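
    As a very reduced sketch of what one generic multiplexer could look like, here is a Python example. The class Multiplexer and its methods are hypothetical, and the sketch assumes an in-memory list standing in for the single shared channel; it leaves out the real-time and failure-recovery options mentioned above.

```python
class Multiplexer:
    """Carries any number of logical channels over one shared sequence of frames."""

    def __init__(self):
        # The single shared channel: an ordered list of (tag, payload) frames.
        self.frames = []

    def send(self, tag, payload):
        """Write a payload of any type on the logical channel named `tag`."""
        self.frames.append((tag, payload))

    def demultiplex(self):
        """Split the shared sequence back into one ordered list per logical channel."""
        channels = {}
        for tag, payload in self.frames:
            channels.setdefault(tag, []).append(payload)
        return channels

mux = Multiplexer()
mux.send("log", "boot")
mux.send("data", 1)
mux.send("log", "ready")
channels = mux.demultiplex()
```

    Because the tags and payloads are arbitrary, the same object serves every program that would otherwise write its own multiplexing code by hand.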


    So much for Full Genericity: Abstract Data Types and Higher Order. Now, if this allows code reuse without code replication -- what we wanted -- it also raises new communication problems: if you reuse objects, especially objects designed far away in space or time (i.e. designed by other people or by another, former, self), you must ensure that the reuse is consistent, that an object can rely upon a used object's behaviour. This is most dramatic if the used object (e.g. part of a library) comes to change, and a bug (one you could have been aware of -- a quirk -- and have already modified your program to accommodate) is removed or added. How can the consistency of object combinations be ensured?

    Current common "OO" languages do not do many consistency checks. At most, they include some more or less powerful kind of type checking (the most powerful ones being those of well-typed functional languages like CAML or SML), but you should know that such type checking, however powerful, is not yet secure. For example, you may well expect a more precise behaviour from a comparison function on an ordered class 'a than just being 'a->'a->{LT,EQ,GT}, i.e. telling that when you compare two elements the result can be "lesser than", "equal", or "greater than": you may want the comparison function to be compatible with the fact that the class is actually ordered, that is, x<y & y<z => x<z and such. Of course, a typechecking scheme, which is more than useful in any case, is a deterministic decision system, and as such cannot completely check arbitrary logical properties such as the one expressed above (see your nearest lectures in Logic or Computation Theory). That's why, to add such enhanced security, you must add non-deterministic behaviour to your consistency checker or ask for human help. That's the price for 100% secure object combining (but not 100% secure programming, as human error is still possible in misexpressing the requirements for using an object, and the non-deterministic behaviour can require human-forced admission by the computer of unproved consistency checks).
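
    The gap between the type of a comparison function and the property x<y & y<z => x<z can be illustrated with a small Python sketch. Both comparison functions below have the same shape (two arguments in, one of LT/EQ/GT out), yet only one of them is an ordering. All names are hypothetical, and the checker only searches a finite sample for a counterexample: it is testing, not a proof of the property.

```python
from itertools import product

LT, EQ, GT = "LT", "EQ", "GT"

def compare_int(x, y):
    """A genuine ordering on integers."""
    return LT if x < y else GT if x > y else EQ

def compare_rps(x, y):
    """Same shape, but cyclic: rock < paper < scissors < rock."""
    beaten_by = {("rock", "paper"), ("paper", "scissors"), ("scissors", "rock")}
    if x == y:
        return EQ
    return LT if (x, y) in beaten_by else GT

def transitivity_counterexample(compare, samples):
    """Return some (x, y, z) with x<y and y<z but not x<z, or None if none is found."""
    for x, y, z in product(samples, repeat=3):
        if compare(x, y) == LT and compare(y, z) == LT and compare(x, z) != LT:
            return (x, y, z)
    return None

ok = transitivity_counterexample(compare_int, range(5))
bad = transitivity_counterexample(compare_rps, ["rock", "paper", "scissors"])
```

    A type checker accepts both functions alike; only a specification of the ordering property tells them apart.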

    This kind of consistency security through logical formal properties of code is called a formal specification method. The future of secure programming lies there (try enquiring in the industry about the cost of testing or debugging software that can endanger the company or even human lives if ill written, and about the insurance funds spent to cover possible failures -- you'll understand). Life-critical industries already use such modular formal specification techniques.

    In any case, we see that even when such methods are not used automatically by the computer system, the programmer has to use them manually, by including the specification in comments or by understanding the code, so he does the computer's work.

    Now that you've settled the skeleton of your language's requirements, you can think about peripheral deduced problems.

    .....

    ..... A technique should be used when and only when it is best fit; any other use may be expedient, but not quite useful.

    Moreover, it is very hard to anticipate one's future needs; whatever you do, there will always be new cases you won't have foreseen.

    Lastly, it doesn't replace combinators. And finally, as for the combinators allowed, allowing local server objects to be saved by the client is hard to implement efficiently without the server becoming useless, or creating a security hole;

    ..... At best, your centralized code will provide not only the primitives you need, but also the necessary combinators; but then, your centralized code is a computing environment by itself, so why need the original computing environment? There is obviously a problem somewhere: if one of the two computing environments were good, the other wouldn't be needed! All these are problems with servers as much as with libraries.



    • implicit vs explicit is what differentiates a HLL from a LLL. A LLL will require the pow
    • not building an artificial border between programmers and users => not only the system programming language must be OO, but the whole system.
    • easy user extensibility -> language-level reflection.
    • sharing mutable data: how? -> specifications & explicitly mutable/immutable (or more or less mutation-prone?) & time & locking -- transactions.
    • objects that must be shared: all the hardware resources -- disks & al.
    • sharing across time -> persistence - reaching precision/mem/speed/resource limit: what to do? -> exceptions
    • recovering from exceptional situations: how? -> continuations (easy if higher-order on)
    • tools to search into a library -> must understand all kinds of morphisms in a logically specified structure.
    • sharing across network -> distribution
    • almost the same: tools for merging code -> that's tricky. Very important for networks or even data distributed on removable memory (aka floppies) -- each object should have its own merging/recovery method.
    • more generally tools for having side effects on the code.
    • A common myth about programming is that low-level programming allows more efficiency than high-level programming. This is completely untrue, while the opposite is quite true. Actually, people spend several million dollars developing optimizing C and FORTRAN compilers, but a much cheaper Common LISP compiler (CMU Common LISP, developed by a few students and teachers) achieves similar performance, while allowing the whole expressivity of a real high-level language. Also, people may see that a large part of what modern optimizers do consists in making the whole code higher-level, so it can be better understood and optimized by the compiler. Any amount of time spent manually optimizing some routine could equally be spent developing some specialized optimizing heuristics with the same effect on the particular low-level routine, but which could generalize to further modified versions of the routine, or to similar routines, thus improving reliability and maintainability as well as performance, and saving a lot of time. Of course, this requires that compiler technology with the ability to accept user-defined optimizing heuristics be widely available. But this is entirely possible and will be the case. Instead of losing ever more time at low-level coding, most low-level people should consider making such a compiler appear sooner. Actually, a trivial theoretical argument could have told us that already: high-level programs contain more information and less noise than low-level programs, hence can be manipulated and compiled more efficiently, with proper tools; and anything that can be done in low-level can be done at least as well, and surely more cleanly and generically, in high-level.

    • Structures
    • We consider Logical Structures: each structure contains some types, and symbols for typed constants, relations, and functions between those types. Then we know some algebraic properties satisfied by those objects. That is, a structure is a set of typed objects, with a set of constant, function, and relation symbols, et al.

      A structure A is interpreted in another structure B if you can map the symbols of A to combinations of symbols of B (with all the properties preserved). The simplest way to be interpreted is to be included.

      A structure A is a specialization of a structure B if it has the same symbols, but you know more properties about the represented objects.

    • Mutable objects
    • We consider the structure of all the possible states for the object. The actual state is a specialization of the structure. The changing states across time constitute a stream of states.

    • Sharing Data
    • The problem is: what to do if someone modifies an object that others see? Well, it depends on the object. An object to be shared must have been programmed with special care. The simplest case is when the object is atomic, and can be read or modified atomically. At any one time, the state is well defined, and this state is what the other sharers see. When the object is a rigid structure of atomic objects, well, we assume that you can lock the parts of the object that must be changed together -- in the meantime, the object is inaccessible or only readable -- and when the modification is done, everyone can access the object as before. That's transactions. Now, what to do when the object is a very long file (say text), each user sees a small part of it (say a full screen of text), and someone somewhere adds or deletes some records (say a sentence)? Will each user's screen scroll according to the number of records deleted? Or will they stay at the same spot? The latter behaviour seems more natural. Thus, a file has this behaviour that whenever a modification is done, all pointers to the file must change. But consider a file shared by _all_ the users across a network. Now, a little modification by someone somewhere will affect everyone! That's why both the semantics and the implementation of shared objects should be thought about at length before they are settled.

    • Problem: recovery
    • What to do when assumptions are broken by higher-priority objects? E.g. when the user interrupts a real-time process, when he forces a modification in an otherwise locked file, when the process is out of memory, etc.

      Imagine that a real-time process is interrupted for imperative reasons (e.g. a cable was unplugged, a higher-priority process took over the CPU, etc): will it continue where it stopped? Or will it skip what was to be done during the interruption? Imagine the system runs out of memory: whose memory are you to reclaim? That of the biggest process? The smallest? The oldest? The one with the lowest real-time priority? The first to ask for more? Or will you "panic" like most existing OSes? If objects spawn, thus filling memory (or CPU), how do you detect "the one" responsible and destroy it?

      If an object locks a common resource, and then is itself blocked by a failure or other unwanted latency, should this transaction be cancelled, so others can access the resource, or should the whole system wait for that single transaction to end?


      As for implementation methods, you should always be aware that defining all those abstractions as the abstractions they are, rather than as hand-coded emulations of them, allows better optimizations by the compiler, a quicker write phase for the programmer, neater semantics for the reader/reuser, no implementation code propagation, etc.

      Partial evaluation should also allow specialization of code that doesn't use all the language's powerful semantics, so that standalone code can be produced without including the full range of heavy reflective tools.
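
      The simple transaction case described under Sharing Data above (lock the parts that must change together, then release) can be sketched in Python. The class SharedLedger and its methods are hypothetical, and the sketch assumes a single process with threads as the "sharers"; it does not address the harder cases of long files or network-wide sharing.

```python
import threading

class SharedLedger:
    """A rigid structure of atomic values whose parts must change together."""

    def __init__(self, balances):
        self._balances = dict(balances)
        self._lock = threading.Lock()

    def transfer(self, source, target, amount):
        """Move `amount` atomically; other sharers never see a half-done state."""
        with self._lock:
            if self._balances[source] < amount:
                return False              # transaction refused, nothing changed
            self._balances[source] -= amount
            self._balances[target] += amount
            return True

    def snapshot(self):
        """Read a consistent copy of the whole structure."""
        with self._lock:
            return dict(self._balances)

ledger = SharedLedger({"a": 4000, "b": 0})

def worker():
    for _ in range(1000):
        ledger.transfer("a", "b", 1)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
final = ledger.snapshot()
```

      Whatever the interleaving of the four threads, the total held by the ledger stays the same, because the two updates of a transfer are never observed separately.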


    Summary

  • Axioms:
    • "No man should do what the computer can do quicker for him (including time spent to have the computer understand what to do)" -- that's why we need to be able to give orders to the computer, i.e. to program.
    • "Do not redo what others already did when you've got more important work" -- that's why we need code reuse.
    • "no uncontrolled code propagation" -- that's why we need genericity.
    • "security is a must when large systems are being designed" -- that's why we need strong typechecking and more.
    • "no artificial border between programming and using" -- that's why the entire system should be OO with a unified language system, not just a hidden system layer.
    • "no computer user is an island, entire by itself" -- you'll always have to connect (through cables, floppies or CD-ROMs or whatever) to external networks, so the system must be open to external modifications, updates and such.

    Current computers are all based on the von Neumann model, in which a centralized unit executes step by step a large program composed of elementary operations. While this model is simple and led to the wonderful computer technology we have, the laws of physics limit the power of future computer technology to at most a factor of 10000 over what is possible today on superdupercomputers.
    This may seem a lot, and it is, which leaves room for many improvements in computer technology; however, the problems computers are confronted with are not limited in any way by the laws of physics. To break this barrier, we must use another computer model: we must have many different machines that cooperate, like cells in a body, ants in a colony, neurons in a brain, people in a society.

    Machines can already communicate; but with existing "operating systems", the only working method they know is the "client/server architecture", that is, everybody communicating his job to a single von Neumann machine that does all the computations, which is limited by the same technological barrier as before. The problem is that current programming technology is based on coarse-grained "processes" that are much too heavy to communicate; thus each job must be done on a single computer. machine that executes Computing s all the requirement to be used as for Tunes, or design a new one if none is found.


    That is, without ADTs, and combining ADTs, you spend most of your time manually multiplexing. Without semantic reflection (higher order), you spend most of your time manually interpreting runtime generated code or manually compiling higher order code. Without logical specification, you spend most of your time manually verifying. Without language reflection, you spend most of your time building user interfaces. Without small grain, you spend most of your time manually inlining simple objects into complex ones, or worse, simulating them with complex ones. Without persistence, you spend most of your time writing disk I/O (or worse, net I/O) routines. Without transactions, you spend most of your time locking files. Without code generation from constraints, you spend most of your time writing redundant functions that could have been deduced from the constraints.

    To conclude, there are essentially two things we fight: the lack of features and power in software, and the artificial barriers that the misdesign of earlier software builds between computer objects and others, between computer objects and human beings, and between human beings and other human beings.

    Persistence is necessary for AI:

    • Intelligence is the fruit of a long tradition. Even a most intelligent and precocious human being must be carefully bred for years before yielding the faintest result.
    • How could you expect a machine to become intelligent as soon as it is built and powered up, or even after being powered up for some hours, or some days?
    • Computers currently do not allow any information to persist reliably for more than a few months, and won't translate information from old software to newer software.
    • Hence, artificial intelligence is not possible with existing architecture.
    • However, systems with persistent memory could be a first step toward AI.

    To conclude, I'll say

  • Stress on comput*ing* systems means having a computer *Project* not only a computer *Object*.
  • Why are existing OSes so lame? For the same reason that ancient lore is completely irrelevant in today's world:
    At a time when life was hard, memories very small and expensive, and development costs very high, people had to invent hackers' techniques to survive; they made arbitrary decisions to survive with their few resources. They behaved dirtily, and thought for the short term.
    They had to.

    Now, technology has always evolved at an increasing pace. What was experimental truth keeps becoming obsolete, and good old recipes keep going out of date. Behaving cleanly and thinking for the long term has become possible.
    The problem is, most people don't think, but blindly follow traditions. They do not try to distinguish what is true and what is false in tradition, what is still true, and what no longer stands. They take it as a whole, and adore it religiously, abdicating all their critical faculties. As a result, their morals are an unspeakable burden, mixing common sense, valid or obsolete experimental data, and valid, outdated, or false rules. Their roots are not in actual facts, but in ancient lore, hence their irrelevance.

    Tunes intends to rip away all these computer superstitions.
  • People tend to think statically in many ways.
  • When confronted with some proposition in TUNES, people tend to consider it separated from the rest of the TUNES ideas, and they then conclude that the idea is silly, because it contradicts something else in the traditional system design. These systems indeed have some coherency, which is why they survived and were passed by tradition. But TUNES tries to be much more coherent even,
    ------>8------>8------>8------>8------>8------>8------>8------>8------>8------
       Now, the description could be restated as:
    "project to replace existing Operating Systems, Languages,
    and User Interfaces by a completely rethought Computing
    System, based on a correctness-proof-secure
    higher-order reflective self-extensible fine-grained
    distributed persistent fault-tolerant version-aware
    decentralized (no-kernel) object system."
    
    
    > i saw your answer about an article in the news, so i wanna know,
    > what is tunes ?
       Well, that's a tough one.
       Here is what I told Yahoo:
    "TUNES is a project to replace existing Operating Systems, Languages,
    and User Interfaces by a completely rethought Computing
    System, based on a correctness-proof-secure
    higher-order reflective self-extensible fine-grained
    distributed persistent fault-tolerant version-aware
    decentralized (no-kernel) object system."
    
       Now, there are lots of technical terms in that.
    Basically, TUNES is a project that strives to develop a system where
    computists would be much freer than they currently are:
    in existing systems, you must suffer the inefficiencies of
    * centralized execution [=overhead in context switching],
    * centralized management [=overhead and single-mindedness in decisions],
    * manual consistency control [=slow operation, limitation in complexity],
    * manual error-recovery [=low security],
    * manual saving and restoration of data [=overhead, loss of data],
    * explicit network access [=slow, bulky, limited, unfriendly, inefficient,
     wasteful distribution of resources],
    * coarse-grained modularity [=lack of features, difficulty to upgrade]
    * unextensibility [=impossibility to do things oneself,
     people being taken hostage by software providers]
    * unreflectivity [=impossibility to write programs clean for both human
     and computer; no way to specify security]
    * low-level programming [=necessity to redo things again every time one
     parameter changes].
    
       If any of these seems unclear to you, I'll try to make it clearer in
    
    
    
    * unindustrialized countries: the low reliability of power feeds makes
    resilient persistence a must.
    
    ------>8------>8------>8------>8------>8------>8------>8------>8------>8------
    




    To Do on this page

  • All stuff in it should be transferred to the definitive version or deleted.


    Faré -- rideau@clipper.ens.fr