Also see the disaster of the State managing services.
Well then, seeing how existing human States and computer kernels fail to do their job is left as an exercise for the reader.
The point is that all this infrastructure is meant to help objects (people) communicate with each other on fair terms, so that global communication is faster, safer, and more accurate, with less noise, while consuming fewer resources. It should bring the objects nearer to each other.
The role of the State is to allow people to communicate.
To stay as politically neutral as possible (after all, this is a technical paper), the paper should try not to refer explicitly to the State, if possible. Instead, it would conclude with a note that the very same argument holds when applied to human societies, as similar dynamical systems.
Those habits, it must be said, were especially encouraged by the way information could not spread and augment the common public background: for lack of theory and practice of what a freely communicating world could or should be, only big companies enforcing the "proprietary" label could, until now, broadcast their software; people who would develop original software thus had (and sadly still have) to rewrite everything almost from scratch, unless they could afford a very high price for every piece of software they might want to build upon, without having much control over the contents of that software.
What is really useful is a higher-order grammar, one that allows us to manipulate any kind of abstraction doing any kind of thing at any level. We call level 0 the lowest kind of computer abstraction (e.g. bits, bytes, system words, or, to idealize, natural integers). Level 1 consists of abstractions of these objects (i.e. functions manipulating them). More generally, level n+1 is made of abstractions of level-n objects. Every level is a useful abstraction, as it allows us to manipulate objects that could not be manipulated otherwise.
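To make the levels concrete, here is a minimal sketch in an ML-style language (OCaml); the names are invented for illustration and are not taken from any existing system:

    (* Level 0: plain data, here machine integers. *)
    let zero_level : int = 42

    (* Level 1: abstractions over level-0 objects, i.e. functions on them. *)
    let double (n : int) : int = 2 * n

    (* Level 2: abstractions over level-1 objects, i.e. functions
       that take or return functions on integers. *)
    let twice (f : int -> int) : int -> int = fun n -> f (f n)

    (* Each level manipulates objects the previous one could not:
       [twice double] is a computation on computations. *)
    let () = Printf.printf "%d\n" (twice double zero_level)   (* prints 168 *)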
But why stop there ? Every time we have a set of levels, we can define a new level made of objects that arbitrarily manipulate any lower-level object (that's ordinals); so we get objects that manipulate arbitrary objects of finite level, and so on. There is an unbounded infinity of abstraction levels. To have the full power of abstraction, we must allow the use of any such level; but why not also allow manipulating such full-powered systems ? Any logical limit you put on the system may be reached one day, and on that day the system becomes completely obsolete; that's why, to last, a system must potentially contain (not merely in a subsystem) any single feature that may be needed one day.
The solution is not to offer some bounded level of abstraction, but unlimited abstracting mechanisms; instead of offering only terminal operators (BASIC), or first-level operators (C), or even combinators of some fixed finite order, offer combinators of arbitrary order;
offer a grammar with an embedding of itself as an object. Of course, a simple logical theorem says that there is no consistent internal way of asserting that the manipulated object is indeed the system itself, and the system's state will always be far more complicated than what the system can understand about itself; but the system's implementation may be such that the manipulated object really is the system. This is having a deep model of the system inside itself, and it is quite useful and powerful. This is what I call a higher-order grammar -- a grammar defining a language able to talk about something it believes to be itself. Only this way can full genericity be achieved: allowing absolutely anything that can be done about the system, from inside, or from outside (after abstracting the system itself).
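As a toy illustration (and no more than that) of a language embedding what it believes to be itself as an object, here is a hedged OCaml sketch of a small expression language whose values include its own expressions, so programs can build and evaluate programs of the same language. All names here are invented:

    type expr =
      | Num   of int
      | Add   of expr * expr
      | Quote of expr          (* an expression as a first-class object *)
      | Eval  of expr          (* run a quoted expression *)

    type value =
      | VNum  of int
      | VExpr of expr          (* the language talking about "itself" *)

    let rec eval (e : expr) : value =
      match e with
      | Num n -> VNum n
      | Add (a, b) ->
          (match eval a, eval b with
           | VNum x, VNum y -> VNum (x + y)
           | _ -> failwith "type error")
      | Quote e' -> VExpr e'
      | Eval e' ->
          (match eval e' with
           | VExpr program -> eval program   (* the embedded program is run *)
           | v -> v)

    (* Build the program (1 + 2) as data, then evaluate it from inside. *)
    let () =
      match eval (Eval (Quote (Add (Num 1, Num 2)))) with
      | VNum n -> Printf.printf "%d\n" n      (* prints 3 *)
      | VExpr _ -> ()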
.....
First, we see that the same algorithm can apply to arbitrarily complex data structures; but a piece of code can only handle a finitely complex data structure; thus, to write code with full genericity, we need to use code as a parameter -- that is, second order. In a low-level language (like "C"), this is done using function pointers.
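For contrast, here is what "code as parameter" looks like in a higher-order ML-style notation (OCaml): a hypothetical generic search whose criterion is passed in, playing the role function pointers play in "C". A sketch only, with invented names:

    (* Generic search: works on arbitrarily complex element types
       because the test is passed in as code. *)
    let rec find (matches : 'a -> bool) (l : 'a list) : 'a option =
      match l with
      | [] -> None
      | x :: rest -> if matches x then Some x else find matches rest

    (* The same generic code serves integers, strings, or any structure. *)
    let first_even = find (fun n -> n mod 2 = 0) [1; 3; 4; 7]          (* Some 4 *)
    let long_word  = find (fun s -> String.length s > 5) ["ab"; "abcdef"]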
We soon see problems that arise from this method, and solutions to them. The first is that whenever we use some structure, we have to explicitly pass functions along with it to tell the various generic algorithms how to handle it. Worse, a function that doesn't itself need some access method for the structure may have to call other algorithms which turn out to need that access method; and exactly which method is needed may not be known in advance (because which algorithm will eventually be called is not known -- in an interactive program, for instance). That's why explicitly passing the methods as parameters is slow, ugly, and inefficient; moreover, it is code propagation (you propagate the list of methods associated with the structure -- if the list changes, all the code using it changes). Thus, you mustn't pass those methods explicitly as parameters; you must pass them implicitly: when using a structure, the actual data and the methods to use it are embedded together. Such a structure, including the data and the methods to use it, is commonly called an object; the constant data part and the methods constitute the prototype of the object; objects are commonly grouped into classes made of objects with a common prototype and sharing common data. This is the fundamental technique of object-oriented programming. Well, some call it Abstract Data Types (ADTs) and say it's only part of the "OO" paradigm, while others don't see anything more in "OO". But that's only a question of dictionary convention. In this paper, I'll call it just ADT, while "OO" will also include more things; but be aware that the words are not settled, and that other authors may give the same names to different ideas and vice versa.
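A hedged sketch of the above, in OCaml: the data and the methods to use it travel together in one record, so generic code never needs the method list passed alongside. The field names are invented for illustration:

    type 'a container = {
      data    : 'a;
      to_text : 'a -> string;     (* method: how to display this data *)
      size    : 'a -> int;        (* method: how to measure this data *)
    }

    (* Generic code uses only the embedded methods, never the raw layout. *)
    let describe c =
      Printf.sprintf "%s (size %d)" (c.to_text c.data) (c.size c.data)

    let ints = { data = [1; 2; 3];
                 to_text = (fun l -> String.concat ";" (List.map string_of_int l));
                 size = List.length }
    let text = { data = "hello"; to_text = (fun s -> s); size = String.length }

    let () = print_endline (describe ints); print_endline (describe text)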
BTW, the same code-propagation argument explains why side effects are an especially useful thing, as opposed to strictly functional programs (see pure ML :); of course, side effects greatly complicate the semantics of programming, to the point that ill use of side effects can make a program impossible to understand or debug -- that's what not to do, and such a possibility is the price to pay to prevent code propagation. Sharing mutable data (data subject to side effects) between different embeddings (different users), for instance, is something whose semantics still has to be clearly settled (see below about object sharing).
The second problem with second order is that if we are to provide functions to other functions as parameters, we need tools to produce such functions. Methods can be created dynamically as well as "mere" data, which is all the more frequent as a program needs user interaction. Thus, we need a way to have functions not only as parameters, but also as results of other functions. This is higher order, and a language which can achieve this has reflective semantics. Lisp and ML are such languages; FORTH also, though standard FORTH memory management isn't designed for largely dynamic use of this feature in a persistent environment. In "C" and similar low-level languages, which don't allow a direct portable implementation of the higher-order paradigm through the usual function pointers (because low-level code generation is not available as it is in FORTH), the only way to achieve higher order is to build an interpreter for a higher-order language such as LISP or ML (usually much more restricted languages are actually interpreted, because programmers don't have time to elaborate their own user-customization language, whereas users don't want to learn a new complicated language for each different application, and there is currently no standard, user-friendly, small-scale higher-order language that everyone can adopt -- there are just plenty of them, either very imperfect or too heavy to include in every single application).
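For illustration, here is what higher order buys in an ML-style language (OCaml): functions built at run time from data, and functions taking and returning functions. A sketch only, with invented names:

    (* A new function, created dynamically from a run-time value --
       something a mere C function pointer cannot portably give you. *)
    let make_scaler (factor : int) : int -> int =
      fun n -> factor * n

    (* Functions as both arguments and results. *)
    let compose f g = fun x -> f (g x)

    let times_six = compose (make_scaler 2) (make_scaler 3)
    let () = Printf.printf "%d\n" (times_six 7)     (* prints 42 *)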
With respect to typing, Higher-Order means the target universe of the language is reflective -- it can talk about itself.
With respect to Objective terminology, higher order consists in having classes as objects, which are in turn groupable into meta-classes. And we then see that it _does_ prevent code duplication, even in cases where the code concerns just one user, as the user may want to consider concurrently two -- or more -- different instantiations of the same class (i.e. two sub-users may need to have distinct but mostly similar object classes). Higher order somehow allows there to be more than one computing environment: each function has its own independent environment, which can in turn contain functions.
To end with genericity, here is some material to feed your thoughts about the need for system-builtin genericity: let's consider multiplexing.
For instance, Unix (or worse, DOS) user/shell-level programs are ADTs, but with only one exported operation: the "C" main() function, one per executable file. As such "OS"es are huge-grained, with ultra-heavy communication semantics between executable files (and even between processes of the same executable file), no one can afford one executable per actual exported operation. Thus you group operations into single executables whose main() function multiplexes those functionalities.
Also, communication channels are heavy to open, use, and maintain, so you must explicitly pass all kinds of different data & code through single channels by manually multiplexing them (the same goes for having many heavy files versus one manually multiplexed huge file).
But the system cannot provide builtin multiplexing code for every single program that will need it. It does provide code for multiplexing the hardware: memory, disks, serial, parallel and network lines, screen, sound. POSIX requirements grow with the things a compliant system ought to multiplex; new multiplexing programs appear all the time. So the system grows, yet it will never be enough for user demands as long as every possible kind of multiplexing has not been programmed; meanwhile, applications will spend most of their time manually multiplexing and demultiplexing objects not yet supported by the system.
Thus, any software development on common OSes is hugeware: huge in the hardware resources needed (memory -- RAM or disk --, CPU power, time, etc.), huge in the resources spent, and, most importantly, huge in programming time.
The problem is that current OSes provide no genericity of services. Thus they can never do the job for you. That's why we really NEED generic system multiplexing, and more generally genericity as part of the system. If one generic multiplexer object were built, with two generic specializations for serial channels and flat arrays, and some options for real-time behaviour and recovery strategy on failure, that would be enough for all the multiplexing work currently done everywhere.
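To make this less abstract, here is a hedged OCaml sketch of what such a system-provided generic multiplexer could look like: one piece of code that shares a single underlying channel among many logical sub-channels, with the transport as a parameter so that the same code specializes to a serial line, a flat array in memory, or anything else. Every name below is invented; this is a guess at the shape of the thing, not a design:

    type 'msg transport = {
      send    : tag:int -> 'msg -> unit;   (* write a tagged message *)
      receive : unit -> int * 'msg;        (* read the next tagged message *)
    }

    (* The generic part: route incoming messages to per-tag handlers. *)
    let demultiplex (t : 'msg transport) (handlers : (int, 'msg -> unit) Hashtbl.t) =
      let tag, msg = t.receive () in
      match Hashtbl.find_opt handlers tag with
      | Some handle -> handle msg
      | None -> ()                          (* unknown sub-channel: drop *)

    (* One possible specialization: a "flat array" transport backed by a queue. *)
    let in_memory_transport () : string transport =
      let q = Queue.create () in
      { send = (fun ~tag msg -> Queue.push (tag, msg) q);
        receive = (fun () -> Queue.pop q) }

    let () =
      let t = in_memory_transport () in
      let handlers = Hashtbl.create 4 in
      Hashtbl.add handlers 1 (fun m -> print_endline ("log: " ^ m));
      Hashtbl.add handlers 2 (fun m -> print_endline ("chat: " ^ m));
      t.send ~tag:2 "hello";
      demultiplex t handlers                 (* prints "chat: hello" *)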
So this is for Full Genericity: Abstract Data Types and Higher Order.
Now, while this allows code reuse without code replication -- what we wanted -- it also raises new communication problems: if you reuse objects, especially objects designed far away in space or time (i.e. designed by other people, or by another, former, self), you must ensure that the reuse is consistent, that an object can rely upon a used object's behaviour. This is most dramatic if the used object (e.g. part of a library) comes to change, and a bug (which you may have been aware of -- a quirk -- and already modified your program to accommodate) is removed or added. How do we ensure the consistency of object combinations ?
Current common "OO" languages are not doing much consistency checks. At most, they include some more or less powerful kind of type checking (the most powerful ones being those of well-typed functional languages like CAML or SML), but you should know that even powerful, such type checking is not yet secure. For example you may well expect a more precise behavior from a comparison function on an ordered class 'a than just being 'a->'a->{LT,EQ,GT} i.e. telling that when you compare two elements the result can be "lesser than", "equal", or "greater than": you may want the comparison function to be compatible with the fact of the class to be actually ordered, that is x<y & y<z => x<z and such. Of course, a typechecking scheme, which is more than useful in any case, is a deterministic decision system, and as such cannot completely check arbitrary logical properties as expressed above (see your nearest lectures in Logic or Computation Theory). That's why to add such enhanced security, you must add non-deterministic behaviour to your consistency checker or ask for human help. That's the price for 100% secure object combining (but not 100% secure programming, as human error is still possible in misexpressing the requirements for using an object, and the non-deterministic behovior can require human-forced admission of unproved consistency checks by the computer).
This kind of consistency security through logical, formal properties of code is called a formal specification method. The future of secure programming lies there (enquire in the industry about the cost of testing or debugging software that can endanger the company, or even human lives, if ill-written, and about the insurance funds spent to cover eventual failures -- you'll understand). Life-critical industries already use such modular formal specification techniques.
In any case, we see that even when such methods are not applied automatically by the computer system, the programmer has to apply them manually, by including the specification in comments or in his understanding of the code; so he is doing the computer's work.
Now that you've settled the skeleton of your language's requirements, you can think about the peripheral problems that follow from it.
.....
.....
A technique should be used when and only when it is the best fit; any other use may be expedient, but not quite useful.
Moreover, it is very hard to anticipate one's future needs; whatever you do, there will always be new cases you won't have anticipated.
Lastly, it doesn't replace combinators: as for the combinations allowed, letting local server objects be saved by the client is hard to implement efficiently without the server becoming useless, or without creating a security hole.
..... At best, your centralized code will provide not only the primitives you need, but also the necessary combinators; but then, your centralized code is a computing environment by itself, so why is the original computing environment needed ? There is obviously a problem somewhere: if one of the two computing environments were good, the other wouldn't be needed ! All these are problems with servers as much as with libraries.
Actually, the same holds for any kind of static information that might have been gathered about programs: you can live without the computer checking it, by checking it yourself. But then you must do the computer's work, you are not guaranteed to do it properly, and you cannot offer the guarantee to your customers, as your proof is all inside your mind and not repeatable !
BTW, this whole wrangle is exactly why I recommend avoiding the term "weakly typed": it means at least three different things to different people, and various combinations thereof to others: 1. dynamic typing, 2. implicit conversions, and 3. unchecked types.
A structure A is interpreted in another structure B if you can map the symbols of A to combinations of symbols of B, with all the properties preserved. The simplest way to be interpreted is to be included.
A structure A is a specialization of a structure B if it has the same symbols, but you know more properties about the represented objects.
The simplest case is when the object is atomic and can be read or modified atomically. At any given time its state is well defined, and that state is what other sharers see.
When the object is a rigid structure of atomic objects, well, we assume that you can lock the parts of the object that must be changed together -- in the meantime, the object is inaccessible, or only readable -- and when the modification is done, everyone can access the object as before. That's transactions.
Now, what do we do when the object is a very long file (say text), each user sees only a small part of it (say a screenful of text), and someone somewhere adds or deletes some records (say a sentence) ? Will each user's screen scroll according to the number of records deleted ? Or will it stay at the same spot ? The latter behaviour seems more natural. Thus, a file has the property that whenever a modification is made, all pointers into the file must change. But consider a file shared by _all_ the users across a network: a little modification by someone somewhere will now affect everyone ! That's why both the semantics and the implementation of shared objects should be thought about at length before they are settled.
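As a small sketch of that "latter behaviour" (in OCaml, with invented names): when records are inserted or deleted in a shared file, every user's pointer into it must be shifted so that it keeps designating the same record.

    type cursor = { user : string; mutable position : int }

    (* [n] records were inserted (n > 0) or deleted (n < 0) at record index [at]. *)
    let apply_edit (cursors : cursor list) ~at ~n =
      List.iter
        (fun c -> if c.position >= at then c.position <- max at (c.position + n))
        cursors

    let () =
      let alice = { user = "alice"; position = 120 }
      and bob   = { user = "bob";   position = 10 } in
      apply_edit [alice; bob] ~at:50 ~n:(-5);      (* 5 records deleted at 50 *)
      Printf.printf "alice=%d bob=%d\n" alice.position bob.position
      (* alice=115 bob=10 : only cursors past the edit point move *)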
Imagine that a real-time process is interrupted for imperative reasons (e.g. a cable was unplugged; a higher-priority process took over the CPU, etc.): will it continue where it stopped, or will it skip what was missed during the interruption ? Imagine the system runs out of memory: whose memory do you reclaim ? The biggest process's ? The smallest's ? The oldest's ? The one with the lowest real-time priority ? The first to ask for more ? Or will you "panic" like most existing OSes ? If objects spawn, thereby filling memory (or CPU), how do you detect "the one" responsible and destroy it ?
If an object locks a common resource and is then itself blocked by a failure or some other unwanted latency, should its transaction be cancelled, so that others can access the resource, or should the whole system wait for that single transaction to end ?
As for implementation methods, you should always be aware that
defining those abstractions as the abstractions they are,
rather than as hand-coded emulations of them,
allows better optimization by the compiler,
a quicker writing phase for the programmer,
neater semantics for the reader/reuser,
no implementation-code propagation for the reimplementer,
etc.
Partial evaluation should also allow the specialization of code that doesn't use all of the language's powerful semantics, so that standalone code can be produced without including the full range of heavy reflective tools.
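As an illustration of the idea (a hand-made sketch in OCaml, not an actual partial evaluator): specializing generic code with respect to the part of its input that is known early leaves residual code that does less work.

    (* Generic code: recurses on n at every call. *)
    let rec power n x = if n = 0 then 1 else x * power (n - 1) x

    (* Specialization with respect to n: all decisions about n are made
       once, at specialization time, leaving only the multiplications. *)
    let specialize_power n : int -> int =
      let rec build k =
        if k = 0 then (fun _ -> 1)
        else let rest = build (k - 1) in
             fun x -> x * rest x
      in build n

    let cube = specialize_power 3
    let () = Printf.printf "%d %d\n" (power 3 5) (cube 5)   (* 125 125 *)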
Current computers are all based on the von Neumann model in which
a centralized unit executes step by step a large program composed of
elementary operations.
While this model is simple and led to the wonderful computer technology
we have, the laws of physics limit the power of future computer technology
built on it to at most a grand maximum factor of 10000
over what is possible today on superdupercomputers.
This may seem a lot, and it is, which leaves room for much improvement
in computer technology;
however, the problems computers are confronted with are not themselves
limited by the laws of physics.
To break this barrier, we must use another computer model,
we must have many different machines that cooperate,
like cells in a body, ants in a colony,
neurones in a brain, people in a society.
Machines can already communicate; but with existing "operating systems", the only working method they know is the "client/server architecture", that is, everybody communicating his job to one von Neumann machine which does all the computations, and which is limited by the same technological barrier as before. The problem is that current programming technology is based on coarse-grained "processes" that are much too heavy to communicate; thus each job must be done on one single computer.
That is, without ADTs, and without ways of combining ADTs, you spend most of your time
manually multiplexing. Without semantic reflection (higher order), you spend
most of your time manually interpreting runtime generated code or manually
compiling higher order code. Without logical specification, you spend most of
your time manually verifying. Without language reflection, you spend most of
your time building user interfaces. Without small grain, you spend most of
your time manually inlining simple objects into complex ones, or worse,
simulating them with complex ones. Without persistence,
you spend most of your time writing disk I/O (or worse, net I/O) routines.
Without transactions, you spend most of your time locking files. Without
code generation from constraints, you spend most of your time writing
redundant functions that could have been deduced from the constraints.
To conclude, there are essentially two things we fight: the lack of features and power in software, and the artificial barriers that the misdesign of former software builds between computer objects and other computer objects, computer objects and human beings, and human beings and other human beings.
Things below are a draft.
III. No computer is an island, entire in itself
Centralized code is also called "client-server architecture": the central code is called the server, while those who use it are called clients. And we saw that a function server is definitely something that no sensible person would use directly; human users tend to write a library that encapsulates calls to the server. But that is how most operating systems and net-aware programs are implemented, as it's the simplest way to implement them. Many companies boast about providing client-server based programs, but there's nothing to boast about: client-server architecture is the simplest and dumbest mechanism ever conceived; even a newbie can do that easily. What they could boast about would be not using client-server architecture, but truly distributed yet dependable software.
A server is nothing more than a bogus implementation of a library, and shares all the disadvantages and limits of a library, with worse extensibility problems and additional overhead. Its only advantage is to have a uniform calling convention, which can be useful in a system with centralized security, or to pass the stream of arguments through a network so that distant clients and servers can communicate. This last use is particularly important, as it's the simplest trick ever found for accessing an object's multiple services through a single communication line. Translating a software interface from library to server is called multiplexing the stream of library/server accesses, while the reverse translation is called demultiplexing it.
Multiplexing means splitting a single communication line or some other resource into multiple sub-lines or sub-resources, so that the resource can be shared among multiple uses. Demultiplexing is recreating a single line (or resource) from those multiple ones; but as dataflow is often bi-directional, this reverse step is usually inseparable from the first, and we'll just say multiplexing for the two of them. Thus, multiplexing can be used to serve multiple functions through a single stream of calls, or conversely to have one function server accessed by multiple clients.
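A hedged OCaml sketch of that translation, with invented names: a "library" of several entry points folded into a "server" with a single entry point dispatching on a stream of tagged requests -- which is exactly the wrapping overhead described above.

    (* The library: several directly callable functions. *)
    let add x y = x + y
    let upper s = String.uppercase_ascii s

    (* The server view: one entry point, one stream of multiplexed calls. *)
    type request = Add of int * int | Upper of string
    type reply   = Int of int       | Str of string

    let serve (r : request) : reply =
      match r with
      | Add (x, y) -> Int (add x y)       (* demultiplex back to the library *)
      | Upper s    -> Str (upper s)

    (* A client must wrap and unwrap every call; that is the overhead. *)
    let () =
      match serve (Add (2, 3)), serve (Upper "hi") with
      | Int n, Str s -> Printf.printf "%d %s\n" n s
      | _ -> ()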
Traditional computing systems often allow the multiplexing of some physical resources, thereby splitting them into a first (but potentially very large) level of equivalent logical resources. For example, a disk may be shared via a file system; CPU time can be shared by task switching; a network interface is shared via a packet-transmission protocol. Actually, whatever an operating system does can be considered multiplexing. But those same traditional computing systems do not provide the same multiplexing capability for arbitrary resources, and the user will eventually end up having to multiplex something himself (see the "term" user-level program that multiplexes a serial line; or the "screen" program that shares a terminal; or window systems, etc.), and as the system does not support anything about it, he won't do it in the best way, nor in synergy with other efforts.
What is wrong with those traditional systems is precisely that they only allow limited, predefined multiplexing of physical resources into a small, predefined number of logical resources; they thereby create a big difference between physical resources (which may be multiplexed) and logical ones (which cannot be multiplexed again by the system). This gap is completely arbitrary (programmed computer abstractions are never purely physical, nor are they ever purely logical); and user-implemented multiplexers must cope with the system's lacks and deficiencies.
So we see that system designers are ill-advised when they provide only such specific multiplexing, which may or may not be useful, whereas other kinds of multiplexing are always needed (a proof of which is people forever boasting about writing -- with real pain -- "client/server" "applications"). What they really should provide is a generic way to automatically multiplex lines, whenever such a thing is needed.
More generally, a useful operating system should provide a generic way to share resources; for that's what an operating system is all about: sharing disks, screens, keyboards, and various devices between the multiple users and programs that may want to use them across time. But genericity is not only for operating systems and sharing. Genericity is useful in every domain, for genericity is instant reuse: your code is generic -- it works in all cases -- so you can use it in any circumstances where it may be needed, whereas specific code must be rewritten or readapted each time it is to be used anew. Specificity may be expedient; but only genericity is useful in the long run.
Let us recall that genericity is the property of writing things in their most generic forms, and having the system specialize them when needed, instead of hard-coding specific values (which is some kind of manual evaluation).
Now, how can genericity be achieved ?
But why should you depend on guesses ? A good programming language would allow you
Now, there are lots of technical terms in that. Basically, TUNES is a project that strives to develop a system where computists would be much freer than they currently are: in existing systems, you must suffer the inefficiencies of
So, to conclude, there is essentially one thing that we have to fight: the artificial informational barriers that the lack of expressivity and the misdesign of former software -- due to ignorance, misunderstanding, and rejection of the goals of computing -- build between computer objects and other computer objects, between computer objects and human beings, and between human beings and other human beings.
....
an open system, where computational information can flow efficiently, with as little noise as possible. An open system means that people can contribute any kind of information they want to the available cultural background, without having to throw everything away and begin from scratch because the kind of information they want to contribute does not fit the system. Example: I can't have lexical scopes in some wordprocessor spell-checker, only one "personalized dictionary" at a time (and even then, I had to hack a lot to have more than one dictionary, by swapping a unique global dictionary). Too bad. I'll have to wait for the next version of the software. Because so few people ask for my feature, it'll be twenty years until it makes it into an official release. Just be patient. Or, if I've got lots of time and money, I can rewrite the whole wordprocessor package to suit my needs. Wow!

In an open system, all software components must come in small grains, with the possibility of incremental change anywhere, so that you can change the dictionary-lookup code to handle multiple dictionaries merged by scope, instead of a unique global one, without having to rewrite everything. Current attempts to build an open system have not been fully successful. The only successful approach to offering fine-grained control over objects has been to make sources freely available, allowing independent hackers/developers to modify and recompile; but apart from the object-grain problem, this doesn't solve the other problems of open software: it offers no semantic control over seamless data conservation across code modifications; contributions are not really incremental, in that the whole software must be integrally recompiled, stopped, and relaunched; and changes that involve the propagation of code throughout the program cannot be done incrementally, because many semantic changes must be manually propagated across the whole program.

"As little noise as possible" means that algorithmic information can be passed without any syntactical or architectural constraint that was not specifically intended; that people are never forced to say either more than they mean or less than they mean. Example: with low-level languages like C, you can't define a generic function that works on any integer and then instantiate it to the integer implementation that fits the problem at hand. If you define a function that works on some limited number type, it won't work on numbers longer than the limit allows, while being wasteful when cheaper, more limited types might have been used. Then if, some 100000 lines later, you see that after all you needed longer numbers, you must rewrite everything, while still keeping the previous version for existing code. Then you'll have two versions to co-debug and maintain, unless you let them diverge inconsistently, which you'll have to document. Too bad. This is being required to say too much. And of course, once the library is written in a way generic enough to handle the biggest numbers you'll need (perhaps dynamically sized numbers), it can't take advantage of any particular situation where known constraints on the numbers could save orders of magnitude in computation; of course, you could still rewrite yet another version of the library, adapted to that particular knowledge, but then you again have the same maintenance problems as above. This is being required to say too little.
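For contrast, here is a hedged sketch (in OCaml, with invented names) of what saying exactly as much as you mean looks like when the language allows it: the algorithm is written once against an abstract notion of number, and instantiated to whichever implementation turns out to be needed, with no rewriting and no divergence.

    module type NUM = sig
      type t
      val zero : t
      val add  : t -> t -> t
      val of_int : int -> t
    end

    (* One definition, usable with any integer implementation. *)
    module Sum (N : NUM) = struct
      let sum l = List.fold_left N.add N.zero (List.map N.of_int l)
    end

    module NativeInt : NUM with type t = int = struct
      type t = int
      let zero = 0
      let add = ( + )
      let of_int n = n
    end

    module BigNum : NUM with type t = Int64.t = struct
      (* Swap in a longer representation later, without touching Sum. *)
      type t = Int64.t
      let zero = 0L
      let add = Int64.add
      let of_int = Int64.of_int
    end

    let () =
      let module S = Sum (NativeInt) in
      Printf.printf "%d\n" (S.sum [1; 2; 3])    (* 6; switching to BigNum needs no rewrite *)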
Any "information" that you are required to give the system before you know it, without your possibly knowing it, without your caring about it, with your not being able to adjust it when you further know more, all that is *noise*. Any information that you can't give the system, because it won't heed it, refuse it as illegal, implement in so inefficient a way that it's not usable, is *lack of expressiveness*. Current languages are all very noisy and inexpressive. Well, some are even more than others. The "best" available way to circumvent lack of expressiveness from available language is known as "literate programming", as developed, for example, by D.E.Knuth with his WEB and C/WEB packages. With those, you must still fully cope with the noise of a language like C, but can circumvent its lack of expressiveness, by documenting in informall human language what C can't express about the intended use for your objects. Only there is no way accurately verify that objects are actually used consistently with the unformal documented requirements, which greatly limits the (nonetheless big) interest of such techniques; surely you can ask humans to check the program for validity with respect to informal documentation, but his not finding a bug could be evidence for his unability to find a real bug, as well as the possible absence of bug, or the inconsistency of the informal documentation. This can't be trusted remotely as reliably as a formal proof. The Ariane V spacecraft software had been human-checked thousands of times against informal documentation, but still, a software error would have $10^9 disappear in fumes; from the spacecraft failure report, it can be concluded that the bug (due to the predictable overflow of an inappropriately undersized number variable) could have been *trivially* pin-pointed by formal methods! Please don't tell me that formal methods are more expensive/difficult to put in place than that the rubbish military-style red-tape-checking that was used in place. As a french taxpayer, I'm asking immediate relegation of the responsible egg-heads to a life-long toilet-washing job (their status of french "civil servants" prevents their being fired). Of course my voice is unheard. Of course, there are lots of other software catastrophes that more expressive languages would have avoided, but even this single 10 G$ crash would pay more than it would ever cost to develop formal methods and (re)write all critical software with!
People are already efficiency-oriented enough that TUNES needn't invest a lot in it; it need only provide a general framework for others to insert optimizations into. In the case of a fine-grained dynamic reflective system, this means that hooks for dynamic partial evaluation must be provided. This is also an original idea that hasn't been fully developed.
It is as if mathematicians had to learn a completely new language, a completely new formalism, a completely new notation, for every mathematical theory! As if no book could assume results from other books unless they were proven again from scratch.
It is as if manufacturers could not assemble parts unless they all came from the same factory.
Well, such phenomena happen in places other than computer software, too. Basically, it may be conceived of as a lack of standards. But it's much worse with computer software, because computer software is pure information. When software is buggy, or unable to communicate, it's not worth a damn; it isn't even good as firewood, as metal to melt, or as anything to recycle. Somehow, the physical world is a universal standard for the use of physical objects. There's no such thing in the computer world, where all standards are conventional.
Faré -- rideau@clipper.ens.fr