The basic idea is that we shall support as much of the underlying
architecture as possible, and allow low-level access to it;
but offer our own, standardized interface, implemented on top of it.
- Basic I/O
-
That's just interfacing Tunes with the usual POSIX file mechanisms,
so we can do I/O. Access can be done through a low-level object, but several
views of files as friendlier abstractions (arrays, lists, sequences, input
or output channels, etc.) are given.
-
To allow multithreaded I/O, we'll have to use "select".
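A minimal sketch of the select()-based readiness test this implies, assuming a scheduler that checks a descriptor before letting a thread call read() on it. The helper name fd_ready_for_read and its millisecond timeout are illustrative only, not part of any Tunes interface.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>     /* select() is declared here on pre-POSIX.1-2001 systems */

/* Hypothetical helper: returns 1 if fd is readable within timeout_ms,
 * 0 if the timeout expired first, -1 on error (errno set by select()).
 * fd must be smaller than FD_SETSIZE. */
int fd_ready_for_read(int fd, long timeout_ms)
{
    fd_set readable;
    struct timeval tv;
    int n;

    FD_ZERO(&readable);
    FD_SET(fd, &readable);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    n = select(fd + 1, &readable, NULL, NULL, &tv);
    if (n < 0)
        return -1;
    /* On timeout select() clears the set, so this yields 0. */
    return FD_ISSET(fd, &readable) ? 1 : 0;
}
```

A timeout of 0 turns the call into a pure poll, which is what a scheduler would use between two thread switches; a thread is only resumed to read() once the helper returns 1.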
- Dynamic code loading
-
We should support dynamic loading of modules,
and if possible modules in shared memory.
We should also support (as far as possible)
importing of standard low-level system modules
(have a look at how ELF works under Linux...).
- Threads
-
We shall port our usual Tunes thread system to POSIX,
and use SIGVTALRM to time-schedule our cooperative threads.
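As a rough sketch of how SIGVTALRM could drive cooperative threads: the handler only raises a flag, and threads poll that flag at safe points to decide when to yield. All names here (yield_requested, start_time_slicing, should_yield) are assumptions made for the example, and setitimer() is used because it is the classic way to get SIGVTALRM, although POSIX.1-2008 marks it obsolescent.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose sigaction/setitimer in strict -std builds */
#endif
#include <signal.h>
#include <string.h>
#include <sys/time.h>

/* Raised by the signal handler; cooperative threads poll it at safe points. */
volatile sig_atomic_t yield_requested = 0;

static void on_vtalrm(int signo)
{
    (void)signo;
    yield_requested = 1;    /* nothing but async-signal-safe work here */
}

/* Hypothetical helper: deliver SIGVTALRM every slice_usec microseconds of
 * CPU time used by this process; 0 disarms the timer.
 * Returns 0 on success, -1 on error. */
int start_time_slicing(long slice_usec)
{
    struct sigaction sa;
    struct itimerval it;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_vtalrm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGVTALRM, &sa, NULL) != 0)
        return -1;

    it.it_interval.tv_sec = slice_usec / 1000000;
    it.it_interval.tv_usec = slice_usec % 1000000;
    it.it_value = it.it_interval;
    return setitimer(ITIMER_VIRTUAL, &it, NULL);
}

/* Called by a thread at a safe point: returns 1 (and clears the flag)
 * if its time slice has expired and it should yield, 0 otherwise. */
int should_yield(void)
{
    if (!yield_requested)
        return 0;
    yield_requested = 0;
    return 1;
}
```

Because the timer counts only CPU time consumed by the process, a Tunes session blocked in select() is not charged for the wait.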
- PTY handling
-
This is a very useful package, and not only to Tunes:
once it's done, we automatically obtain the capability to
dynamically combine and recombine inputs and outputs of pty's:
choose I/O channels independently; remap them; split them; join them;
merge them; filter them; cook them; buffer them; translate between low-level
and high-level representations in either direction; do any kind of processing
on them before passing them on.
-
This is much better than what, say, screen does,
because we can independently remap input and output,
and combine them in infinitely many useful ways.
Because we can control pty's, we can fully control Unix programs,
and allow these programs to interface Tunes nicely.
With this, we can automatically have powerful
text window systems and subsystems under Unix.
-
Of course, all that depends on there being some reliable pty daemon;
until Tunes is very stable,
a particular session can be dedicated to that
(as Tunes is fine-grained, there's no memory problem);
most probably, separate from others
(communicating by messages as usual).
-
To read: man tty, man pty, man select, more /usr/include/**/*.h,
the Emacs info pages, etc.
- X
-
We should try to reuse some existing X toolkit package and interface it
to Tunes. Tk is said to be good...
- Other
-
More generally than the above SIGVTALRM, we should provide a
standard Tunes interface to signals.
-
All the fork, popen, etc. primitives should be available too, in a
straightforward way. Processes are viewed as multiple Tunes "hosts"
in a distributed system; as for performance, it should be made clear that
forking won't increase the total machine speed
(though it may increase the share we have ;),
so there is no reason to migrate "on the same machine" unless it
eases communication; merging two servers into one active and one asleep
may be good, unless there are security or time-share competition reasons.
-
Network communication is available, too.
Apart from its lower security and efficiency,
it is the same as any other Tunes
communication subsystem.
Implementation ideas
- System configuration
- Existing "POSIX-compliant" systems vary a great deal:
word size, endianness, stack growth direction,
possible assembly (and register) optimizations,
availability of useful (GCC) extensions to ANSI C
such as first-class labels or the typeof() operator,
names and flags of compiler tools,
available libraries and corresponding header files to include,
etc.
- Thus we must have some automatic portable configuration subsystem.
caml-light does things entirely automatically, and determines
each feature of the system. PFE uses some kind of database.
Surely we should use a mix of these,
where there is a system to generate configuration files,
and some precomputed configuration files for known systems.
- Anyone willing to manage that?
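To give an idea of what the automatic half of such a configuration subsystem could probe, here is a small sketch that determines word size, endianness and stack growth direction at run time. The structure and function names are invented for the example; the stack test is only a heuristic that an optimizing compiler may defeat by inlining, so a real configuration step would build it without optimization.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical probe results; names are not taken from any Tunes source. */
struct sys_config {
    int word_bits;          /* width of a pointer-sized word, in bits */
    int little_endian;      /* 1 if the least significant byte comes first */
    int stack_grows_down;   /* 1 if a callee's locals sit below its caller's */
};

static int probe_little_endian(void)
{
    const uint32_t one = 1;
    /* Look at the first byte in memory of the 32-bit value 1. */
    return *(const unsigned char *)&one == 1;
}

/* Heuristic: compare the address of a local here with one in the caller. */
static int callee_frame_is_lower(const volatile char *caller_local)
{
    volatile char callee_local = 0;
    return (uintptr_t)&callee_local < (uintptr_t)caller_local;
}

struct sys_config probe_system(void)
{
    struct sys_config c;
    volatile char here = 0;

    c.word_bits = (int)(sizeof(void *) * CHAR_BIT);
    c.little_endian = probe_little_endian();
    c.stack_grows_down = callee_frame_is_lower(&here);
    return c;
}
```

The output of such a probe could be written once into a generated header, which is also the natural format for the precomputed files shipped for known systems.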
- C as a portable assembler
- Some say C is a portable assembly language.
Why not take them at their word,
and produce some assembly-like C source,
with global variables, labels, jumps in one big procedure,
using m4 as a macro-processor, and/or outputting C from the HLL?
This way, we will be really building a generic assembly
implementation of the Tunes LLL, not having to build a
completely different C based one, while still taking advantage
of C optimizing compilers, and we can use our specific calling
and multithreading conventions without interfering with the C
calling stack (still useful for I/O).
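For illustration, this is roughly what such assembly-like C could look like for a trivial loop: "registers" are global variables and control flow uses only labels and gotos inside one procedure, so no C call frame is pushed between LLL-level routines. The register names and the run_sum_to entry point are made up for the example; real output would come from m4 macros or from the HLL, and could use GCC's first-class labels for computed jumps.

```c
/* Global "registers" of a hypothetical LLL machine. */
static long reg_acc;    /* accumulator */
static long reg_cnt;    /* loop counter */

/* One big procedure: every "routine" is a label, every transfer a goto. */
long run_sum_to(long n)
{
    reg_acc = 0;
    reg_cnt = n;
    goto loop_test;

loop_body:
    reg_acc += reg_cnt;
    reg_cnt -= 1;
    goto loop_test;

loop_test:
    if (reg_cnt > 0)
        goto loop_body;
    goto halt;

halt:
    return reg_acc;     /* the C stack is only touched on entry and exit */
}
```

Calling run_sum_to(10) walks the loop ten times and returns 55; the C compiler is still free to keep the two globals in machine registers inside the procedure.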
- C and memory management
- Efficient garbage collection and persistent memory management are
particularly difficult if standard C libraries are used at the same
time. These keep their own uncontrollable data structures in
malloc'ed memory.
- All persistence and garbage collection should thus be done outside
of this malloc'ed memory:
this should be done by mmap'ing some persistent file into a
system-dependent address space zone out of reach of malloc
(say, at least 0.5 GB), and/or mprotect()ing it so that we'll be
warned in case malloc() reaches it.
- A static mapping allows many more optimizations, and far
easier saving/loading of a persistent memory image.
If it's not possible, restoring (not saving) the image would have to
relocate pointers whenever the actual dynamic address is different...
- Shall we use the C call stack pointer as the heap pointer?
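The mapping idea above can be sketched as follows, assuming the image lives in an ordinary file and that we ask for a preferred address without forcing it. The map_image function, the image_map structure and the idea of returning a relocation delta are assumptions for the example; MAP_FIXED is deliberately not used, since it would silently replace whatever the C library had already mapped there.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose POSIX declarations in strict -std builds */
#endif
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result of mapping the persistent image. */
struct image_map {
    void *base;         /* where the image landed, or MAP_FAILED */
    intptr_t delta;     /* base - wanted; non-zero means "relocate pointers" */
};

struct image_map map_image(const char *path, size_t size, void *wanted)
{
    struct image_map m = { MAP_FAILED, 0 };
    struct stat st;
    int fd = open(path, O_RDWR | O_CREAT, 0600);

    if (fd < 0)
        return m;
    /* Grow a new or short file so the whole mapping is backed by it. */
    if (fstat(fd, &st) == 0 &&
        (st.st_size >= (off_t)size || ftruncate(fd, (off_t)size) == 0)) {
        /* "wanted" is only a hint: the kernel may pick another address. */
        m.base = mmap(wanted, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (m.base != MAP_FAILED)
            m.delta = (intptr_t)m.base - (intptr_t)wanted;
    }
    close(fd);          /* the mapping stays valid after the close */
    return m;
}
```

When delta comes back as 0 the image can be used as saved; otherwise every pointer stored in it has to be adjusted by delta on restore, which is the relocation case mentioned above.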
O'TOP Difficulties
- See LLL difficulties...
- Having some hardware independence, while limiting inefficient code to the
least possible.
- Points where POSIX particularly sucks
- To be sure that buffers are flushed, we must fsync();
but this is a blocking call.
Hence, fsync()ing and similar calls should be
delegated to a fork()ed process that will queue blocking calls
and warn the main process back when they are finished.
- Modifying the memory mapping tables to do garbage collection
is very slow, as it involves a syscall for each operation,
and leads to big memory mapping tables for which POSIX
implementations are not optimized at all.
Hence, we can't use the mmap() mechanism for fine-grained
memory mapping.
Microkernels might allow user-level pagers, though.
- There is no standard or reliable way to reserve a large chunk of
address space without cluttering up swap memory and
without creating a mess with the C library, particularly
if we want to reuse addresses from a swap file across
sessions. On each architecture, we will have to add some
architecture-dependent memory mapping instructions.
Most likely, we will mmap() a huge sparse file on a filesystem
that supports it; but then there's block leakage,
and the only standard way to reclaim blocks is
to erase the file and begin again
once the filesystem overflows.
- When resuming connections, there is absolutely NO WAY to trust
the files to be in the right format; we must check, recheck,
and re-recheck permanently, or blindly trust.
The persistent files must be
in read-only mode, and be changed to read/write mode just
when opened by the right process (use the permission mode as
an open indicator?)
- To allow multithreading, special care will have to be taken
to do only non-blocking I/O with select() and such,
to queue requests, etc. Yuck.
- There is no way to have safe real-time response. Too bad for
games, animations, etc.
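The fsync() delegation described in the list above could look something like this sketch. It is simplified: it forks one child per request instead of keeping a long-lived process with a queue, and the child reports completion through a pipe whose read end the main process can hand to select(). The function names and the one-byte status protocol are assumptions for the example.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose POSIX declarations in strict -std builds */
#endif
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical: fork a child that fsync()s fd. Returns the non-blocking read
 * end of a pipe that becomes readable when the child is done, or -1. */
int start_background_fsync(int fd, pid_t *child)
{
    int pipefd[2];
    int flags;

    if (pipe(pipefd) != 0)
        return -1;
    *child = fork();
    if (*child < 0) {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }
    if (*child == 0) {                      /* child: block here, not in Tunes */
        unsigned char failed = (unsigned char)(fsync(fd) != 0);
        close(pipefd[0]);
        _exit(write(pipefd[1], &failed, 1) == 1 ? 0 : 1);
    }
    close(pipefd[1]);
    flags = fcntl(pipefd[0], F_GETFL);
    if (flags != -1)
        fcntl(pipefd[0], F_SETFL, flags | O_NONBLOCK);
    return pipefd[0];
}

/* Returns -1 while the fsync is still running, 0 if it succeeded, 1 if it
 * failed; once finished it closes done_fd and reaps the child. */
int poll_background_fsync(int done_fd, pid_t child)
{
    unsigned char failed = 1;
    ssize_t n = read(done_fd, &failed, 1);

    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR))
        return -1;
    close(done_fd);
    waitpid(child, NULL, 0);
    return (n == 1 && !failed) ? 0 : 1;
}
```

This works because fsync() acts on the file rather than on the calling process, so the child flushes data that the parent wrote before the fork.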
- Garbage Collecting
- Paging under POSIX is too slow and messy. Should be done
as seldom as possible (only at minor and major GC)
- We'll sacrifice the low bit of words,
but use macros to choose whether integers or pointers
will be unchanged.
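One possible reading of the low-bit scheme, sketched with macros: a compile-time switch decides which kind of value keeps a zero low bit, and the other kind carries the tag. The switch name TAG_POINTERS and all macro names are hypothetical, and "unchanged" is interpreted here as "has a clear low bit" (pointers as-is thanks to alignment, or integers stored as 2n).

```c
#include <stdint.h>

typedef uintptr_t word;     /* one machine word holding either kind of value */

#ifndef TAG_POINTERS        /* hypothetical build switch */
/* Pointers are stored as-is (aligned, so their low bit is 0);
 * integers are stored as 2n+1. */
#define IS_INT(w)     (((w) & 1u) == 1u)
#define MAKE_INT(n)   ((((word)(intptr_t)(n)) << 1) | 1u)
#define INT_VALUE(w)  (((intptr_t)(w) - 1) / 2)
#define MAKE_PTR(p)   ((word)(p))
#define PTR_VALUE(w)  ((void *)(w))
#else
/* Integers are stored as 2n (low bit 0), so adding two tagged integers
 * gives a correctly tagged sum; pointers carry a 1 in the low bit. */
#define IS_INT(w)     (((w) & 1u) == 0u)
#define MAKE_INT(n)   (((word)(intptr_t)(n)) << 1)
#define INT_VALUE(w)  ((intptr_t)(w) / 2)
#define MAKE_PTR(p)   ((word)(p) | 1u)
#define PTR_VALUE(w)  ((void *)((w) & ~(word)1))
#endif
```

Either way integers lose one bit of range, and the garbage collector can tell a pointer from an integer by testing a single bit with IS_INT().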
To Do on this page
- A lot of work...
- Use fsync() and fdatasync() to be sure
that the mmap()ed memory image file is properly updated
on disk. The file should always contain two states, a current one and
a synchronized one (see persistency).
- Where shall I talk about the way we can use Tunes to build
wrappers around a POSIX emulator,
so as to achieve better, more finely configurable black boxes
than a stupid paranoid Unix kernel can ever achieve?
Page Maintainer:
Faré
-- rideau@clipper.ens.fr