Search logs:

channel logs for 2004 - 2010 are archived at http://tunes.org/~nef/logs/old/ (cannot be searched)

#osdev2 = #osdev @ Libera from 23may2021 to present

#osdev @ OPN/FreeNode from 3apr2001 to 23may2021

all other channels are on OPN/FreeNode from 2004 to present


http://bespin.org/~qz/search/?view=1&c=osdev&y=19&m=2&d=22

Friday, 22 February 2019

12:00:30 <jmp9> okay
12:00:42 <jmp9> how to prevent gcc from passing arguments via registers
12:00:45 <jmp9> and only stack
12:00:53 <jmp9> (i need this for printf)
12:04:17 <nyc> There are function attributes for varargs/printf affairs.
12:04:45 <nyc> https://gcc.gnu.org/onlinedocs/gcc-3.1/gcc/Function-Attributes.html
12:04:55 <jmp9> void __attribute__((cdecl)) kprintf(const char* fmt,...)
12:04:56 <jmp9> doesn't work
12:05:51 <nyc> cdecl wouldn't be the attribute.
12:06:42 <nyc> format is the attribute and it needs some extra args.
12:07:29 <klange> Looks like Travis CI is shutting down, so I'll have to move my automated builds somewhere else...
12:07:51 <nyc> extern int my_printf (void *my_object, const char *my_format, ...) __attribute__ ((format (printf, 2, 3)));
12:16:23 <jmp9> okay
12:16:35 <jmp9> any ideas how to implement printf on x86-64?
12:16:48 <jmp9> because arguments in registers, not stack
12:17:11 <Mutabah> jmp9: stdarg.h
12:17:20 <jmp9> i can't use libc
12:17:27 <Mutabah> it's freestanding
12:17:43 <jmp9> anyway
12:25:37 <aalm> u lazy?
12:25:49 <aalm> r u
12:32:42 <Mutabah> jmp9: You can easily use stdarg.h with just a standard freestanding cross-compiler
12:34:19 <jmp9> of course
12:34:33 <jmp9> but i want write it myself
12:34:38 <jmp9> try my skills, you know
12:34:49 <Mutabah> I don't think you can with non-stack calling conventions
12:35:56 <doug16k> jmp9, the x86_64 elf abi documents it
12:36:10 <doug16k> you want to implement printf in assembly though? really? why?
12:36:16 <jmp9> i did it on 32 bit
12:36:29 <jmp9> on 64 bit i'm stuck because arguments are in registers, not in stack
12:36:39 <jmp9> ah
12:36:44 <jmp9> i've implemented printf in C
12:36:52 <jmp9> and year ago i also did it in assembly
12:37:47 <doug16k> all you need to do is translate it into a problem you'd like to solve, lay down those register parameters in memory and index into that area
12:38:15 <jmp9> i came to the idea that i can basically push registers to stack
12:38:22 <doug16k> does va_arg abi even use registers? let me refresh, I should know that off the top of my head
12:38:23 <jmp9> before call printf
12:38:45 <Mutabah> doug16k: va_list is a pointer to a memory blob, `...` functions still use registers
12:38:51 <jmp9> or made function that access registers for indexes below 6
12:42:07 <doug16k> ya, you just lay down the 6 register parameters somewhere, and index into them appropriately, when they run out, you transition to the stack arguments
12:43:16 <doug16k> it's just a bit more complex to step to the next argument in your asm version of getting the next var arg
12:43:59 <doug16k> since you probably can ignore the sse registers, it should be about that hard
12:44:42 <doug16k> plus your code would have to "know" that a 64 bit thing can't be half in a register and half on the stack, so go get the whole thing off the stack and the last register was wasted
12:44:58 <doug16k> not "64 bit thing", I mean a thing double the size of a register
12:56:28 <doug16k> hey! this looks very nice -> https://www.mjmwired.net/kernel/Documentation/virtual/kvm/msr.txt#244
12:56:51 <doug16k> paravirtualized eoi
12:57:57 <doug16k> could be a 3rd implementation of my xapic/x2apic abstraction, just hooking the eoi write operation to redirect it to paravirt eoi
01:03:55 <jmp9> my code works
01:04:13 <jmp9> but gcc prefers to save registers to local stack variables and fuck them
01:05:16 <jmp9> fuck i get annoyed
01:15:01 <Mutabah> It's trying to be fast
01:22:46 <jmp9> okay
01:22:53 <jmp9> i wrote my va_list implementation
01:23:01 <jmp9> and it works perfectly
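[For illustration, a minimal formatter along the lines Mutabah suggested, using only the freestanding <stdarg.h>. The names kvsnprintf/ksnprintf and the set of supported conversions are invented for this sketch; va_arg is how the compiler hands over the register-passed x86-64 arguments without touching the ABI by hand.]

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical minimal kprintf-style formatter: handles %s %c %d %x %%.
 * Only <stdarg.h> is needed, and it is available freestanding. */
static size_t kvsnprintf(char *buf, size_t n, const char *fmt, va_list ap)
{
    size_t i = 0;
#define PUT(ch) do { if (i + 1 < n) buf[i] = (ch); i++; } while (0)
    for (; *fmt; fmt++) {
        if (*fmt != '%') { PUT(*fmt); continue; }
        switch (*++fmt) {
        case '%': PUT('%'); break;
        case 'c': PUT((char)va_arg(ap, int)); break;
        case 's': { const char *s = va_arg(ap, const char *);
                    while (*s) PUT(*s++);
                    break; }
        case 'd': { int v = va_arg(ap, int);
                    unsigned u = (unsigned)v;
                    char tmp[12]; int t = 0;
                    if (v < 0) { PUT('-'); u = 0u - u; }  /* avoids signed overflow */
                    do { tmp[t++] = (char)('0' + u % 10); } while (u /= 10);
                    while (t) PUT(tmp[--t]);
                    break; }
        case 'x': { unsigned u = va_arg(ap, unsigned);
                    char tmp[9]; int t = 0;
                    do { tmp[t++] = "0123456789abcdef"[u % 16]; } while (u /= 16);
                    while (t) PUT(tmp[--t]);
                    break; }
        case '\0': fmt--; break;              /* stray '%' at end of format */
        default:  PUT('%'); PUT(*fmt); break; /* unknown conversion: echo it */
        }
    }
#undef PUT
    if (n) buf[i < n ? i : n - 1] = '\0';
    return i;
}

size_t ksnprintf(char *buf, size_t n, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    size_t r = kvsnprintf(buf, n, fmt, ap);
    va_end(ap);
    return r;
}
```

[Writing the whole thing in C on top of va_list sidesteps the register-vs-stack question entirely; only a hand-rolled va_list, as jmp9 did, has to know about the six-register save area.]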
02:00:16 <nyc> J. K. Ousterhout, “Scheduling techniques for concurrent systems”. In 3rd Intl. Conf. Distributed Comput. Syst., pp. 22–30, Oct 1982.
02:35:55 <mobile_c> how do i port X11 to android as when ever i try to search for it i just get SSH and port forwarding crap
03:01:57 <klys> mobile_c, the hard way; one lib at a time. for reference you might look at a graphics lib that has been ported to android such as allegro 5.
03:03:14 <mobile_c> ok
03:05:03 <mobile_c> i dont think allegro is what i want
03:05:29 <klys> see if sdl2 has been ported, I haven't looked myself
03:27:40 <klys> mobile_c, have you seen this one: https://play.google.com/store/apps/details?id=au.com.darkside.XServer&hl=en
03:28:46 <mobile_c> no
03:29:05 <mobile_c> i DO NOT want anything that requires VNC
03:29:34 <mobile_c> NOR do i want anything that creates an environment that is SEPERATED from android
03:37:40 <klys> would that mean you need transparent bitmaps that screenshot apps behind them?
03:59:50 <ybyourmom> It's DA WEEKEND
04:00:21 <ybyourmom> In 2 hours anyway
04:00:27 <ybyourmom> (In Australia, but the rest of you will still be working like peasants in the fields)
04:00:49 <klange> We've got another 4~5 here.
04:01:31 <nyc> I'm about to drop. Bikeshedding about licensing is putting me to sleep.
04:20:00 <nyc> Well, I checked in code to a private github repo, so at least that base is covered.
04:43:02 <nyc> I think I have bikeshedded my way to the French government's version of BSD redone so it works internationally with WIPO (World Intellectual Property Organization) and European countries' limitations on liability disclaimers: CeCILL-B. Worst case I can relicense it later.
05:44:58 <nyc``> I got a hold of a programming language implementor. They're a lot more interested in 35-year-old scheduling algorithms than in my VM stuff. My guess is that the scheduling issues have known benefits, or maybe even benefits known to be more influential than TLB.
05:57:45 <nyc``> I watched a FOSDEM presentation today that said the entire performance difference observed with database workloads with large pages was attributable to the pagetable memory conserved to map a massively shared buffer pool shared memory segment.
05:59:50 <lsneff> Is anyone here familiar with how jit runtimes switch a function to a more-optimized version of said function?
05:59:55 <lsneff> Do they rewrite all the calls to that function?
06:02:43 <yrp> uh, not exactly that, but iirc jitted functions generally will have a plt-like setup where the arguments types are checked against jitted implementations
06:02:49 <yrp> at least for dynamically typed jitted languages
06:02:57 <nyc``> i.e. there's been no progress in mainline since '05 ... AIUI Oracle is carrying shpte out-of-tree which solves the issue as does reclaiming pagetables, albeit at the cost of minor faults.
06:03:41 <lsneff> yrp: How could it be implemented efficiently for statically-typed jitted (well, aot) languages?
06:04:58 <yrp> dont know, i could make shit up but i think youd be better off reading luajit or similar
06:05:28 <lsneff> I've been trying to find documentation on v8, but it's been somewhat lacking
06:05:59 <yrp> https://doar-e.github.io/blog/2019/01/28/introduction-to-turbofan/
06:06:13 <yrp> that comes at it from a different angle than the one you want
06:06:19 <yrp> but has a bunch of useful debugging info for playing with it
06:06:52 <yrp> the previous post on the blog explains some of the other parts for firefox
06:14:39 <nyc``> I think the upshot is that head-to-head performance comparisons aren't going to be possible without knocking some other things out of the way.
08:45:38 <mrvn> nyc``: scheduling depends on your goal and your environment.
08:46:57 <klys> situational scheduling, except there are no situational schedulers
08:48:33 <nyc``> They're language runtimes. M:N threads, aio, and gang scheduling are their obvious needs.
08:52:02 <mrvn> If you have aio then you shouldn't need M:N threads.
08:52:53 <mrvn> As for gang scheduling: Isn't that bad half the time?
08:54:06 <mrvn> If you have lock contention then the more threads of a process you ran in parallel the worse it will get.
08:54:45 <mrvn> Or if you have message passing and aren't cpu bound then running the threads serial on the same core will improve cache performance.
08:57:39 <mrvn> nyc``: when you implement AIO make sure you can set events on a file descriptor for read to be level triggered and write to be edge triggered.
08:58:44 <nyc``> I would say to try it and see if that weren't so strongly influenced by preconceptions.
09:02:10 <mrvn> nyc``: My scheduler is simple. Tasks are in a doubly linked ring. Every time the kernel loops around the ring it increments a generation counter. Each task has a timeslice and a last_generation. The active task gets rescheduled when its timeslice is up. Then on every syscall/interrupt if a task gets woken up and has time left in the timeslice or has last_generation < generation then it preempts the current task.
09:03:21 <bcos_> For scalability I prefer "per CPU queues"
09:03:29 <mrvn> last_generation < generation indicates the task has slept long and it gets a new timeslice.
09:03:42 <bcos_> ..with each CPU having its own timer and doing its own thing
09:04:08 <mrvn> bcos_: yeah, that scheduler should run per cpu with the occasional migrating from busy to idle cpus.
09:05:12 <mrvn> The gang scheduling decision would belong to the migration code I think.
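[A rough, single-CPU sketch of the ring/generation scheme mrvn describes above. SLICE, should_preempt, and dispatch are invented names, and the ring walk itself is elided; only the preemption and refresh rules are modeled.]

```c
#include <stdbool.h>

/* Tasks sit in a doubly linked ring; each full lap around the ring bumps
 * a global generation counter. A freshly woken task preempts the current
 * one if it still has timeslice left, or if it slept for at least a full
 * lap (last_generation < generation), in which case it also gets a fresh
 * slice when dispatched. */

#define SLICE 10  /* ticks per timeslice, arbitrary for this sketch */

struct task {
    int time_left;            /* remaining timeslice ticks */
    unsigned last_generation; /* generation when the task last ran */
};

static unsigned generation;

/* Called on syscall/interrupt when a task wakes while another runs:
 * should the woken task preempt the current one? */
bool should_preempt(const struct task *woken)
{
    return woken->time_left > 0 || woken->last_generation < generation;
}

/* Called when the scheduler hands the CPU to 'next'. */
void dispatch(struct task *next)
{
    if (next->last_generation < generation)
        next->time_left = SLICE;  /* slept at least a lap: fresh slice */
    next->last_generation = generation;
}

/* Called each time the ring walk wraps back to its head. */
void lap(void) { generation++; }
```

[As bcos_ notes, this state would be per-CPU in a scalable design, with occasional migration between queues.]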
09:05:15 <nyc``> When syscalls are issued as message passing as if over a network, their results can be reaped like replies over a network. Things like polling need a little more sophistication than just how they're triggered.
09:06:05 <bcos_> mrvn: I can't help but imagine a system with a single timer where a single "reschedule_all_the_things()" function decides what all CPUs should do next
09:06:39 <mrvn> bcos_: the timer only triggers when you have a task that never sleeps.
09:06:48 <nyc``> (See standard discussions about select()/poll() for more details.)
09:07:07 <mrvn> nyc``: kill select/poll. epoll is the way to go
09:07:10 <bcos_> (I mean, can't help but imagine gang scheduling as..)
09:08:07 <mrvn> bcos_: yeah. you would have to coordinate between cores so that threads of the same process run at the same time.
09:08:56 <nyc``> bcos_: A good motivating case is a system simulator running a kernel that uses spinlocks.
09:09:06 <mrvn> I don't have shared memory and only message passing so that naturally syncs threads.
09:10:09 <mrvn> nyc``: that would make each thread sticky to one core
09:10:43 <nyc``> mrvn: The standard commentaries on select()/poll() include commentaries on epoll.
09:11:04 <klys> sigio -> increment; main() -> for(;;) -> if( sem > 0 ) nonblocking select(); handle(); sched_yield(); // can you improve on this idea of a main loop?
09:11:45 <mrvn> klys: epoll + epoll_wait. select is O(n)
09:12:49 <mrvn> klys: With sigio also take care to handle syscalls getting interrupted by signals very consistently.
09:13:19 <mrvn> while(true) { epoll_wait(); handle; }
09:14:24 <mrvn> I actually noticed an improvement with while(true) { epoll_wait(); handle(); handle(flags=NON_BLOCKING); }
09:14:28 <nyc``> mrvn: Threads being sticky to cores or at least spread out away from each other happens and is desirable.
09:15:54 <nyc``> (And seriously, look up the commentaries on the connection count problem from other OS's or that compare the approaches of various OS's.)
09:16:06 <mrvn> In the time handle() runs a busy socket will get new data so I read from the FDs speculatively before doing another epoll call. Only makes sense with few sockets that are mostly busy.
09:16:33 <mrvn> or just one high-bandwidth socket in my case.
09:17:53 <mrvn> nyc``: The problem is busy waiting for locks. If you don't run all threads at the same time you quickly run into cases where all a thread does is wait for a lock and then gets scheduled.
09:18:07 <mrvn> potentially forever because the lock is only free when the task isn't running.
09:19:34 <mrvn> Luckily linux allows hot plugging the spinlocks so for an emulator you can replace them with primitives optimized to your native system.
09:20:22 <nyc``> It most typically manifests as poor performance but can theoretically lead to starvation, yes.
09:21:29 <mrvn> nyc``: It's one of the reasons why I think |threads| > |cores| is a bad idea and I don't like shared memory over message passing.
09:22:36 <nyc``> That would be Linux itself. Other guests may not be as capable etc. In general, userspace may want to use spinlocks itself.
09:23:14 <mrvn> nyc``: they should use a primitive provided by the OS, Like a futex on linux.
09:25:52 <nyc``> I think there are also wait-free lockless algorithms that get unhappy about randomly being descheduled, though maybe not as dramatically as spinlocks.
09:25:54 <mrvn> Hopefully they use higher level abstraction in the language runtime or posix stuff. So you only have the interface to the OSes native primitives once.
09:27:18 <mrvn> nyc``: a wait-free lockless algorithm will lose some time when another thread intervenes but hopefully they will complete their work in <1/2 timeslice. Otherwise you can get into cases where a thread is constantly interrupted.
09:27:51 <mrvn> nyc``: good algorithms have the interrupter finish the work of the interrupted.
09:27:58 <klys> mrvn, suppose though that your main loop also needs to fill some periodic function like retrace or redrawing a clock. epoll?
09:28:24 <mrvn> klys: epoll has a timeout. Or add a timerfd to the epoll.
09:29:26 <nyc``> mrvn: I guess if you don't like gang scheduling you can leave it for other people to implement.
09:29:27 <mrvn> klys: a timerfd will be more regular. Timeouts easily drift or are inaccurate.
09:30:32 <mrvn> nyc``: what I don't like about it is that it makes core wait for each other. But your example of a emulator running a foreign kernel clearly can benefit from it. It's just something I want to avoid needing in native code.
09:31:04 <mrvn> nyc``: I don't have shared memory and I don't even have threads. Only processes. So the emulator is screwed already.
09:33:17 <nyc``> I'm just throwing bones at language implementors in the hopes they'll use the original ideas (virtual memory, not scheduling or aio) somewhere too.
09:34:11 <immibis> mrvn: you know, something seems very off about designing an algorithm so that if Thread A is descheduled in favour of Thread B, B does the work A was going to do
09:34:21 <immibis> it's a sign that threads are a bad abstraction
09:34:38 <immibis> ideally if there's work to do you just do the work, nobody cares which thread the work is done on, only which CPU
09:34:58 <immibis> (in fact you probably don't care which CPU, only that it's A CPU)
09:35:18 <mrvn> immibis: yes, my thought. But if you have threads > cores then who knows when the original thread will be back to complete its work?
09:35:35 <immibis> <immibis> it's a sign that threads are a bad abstraction
09:35:50 <immibis> if only there was an OS that allowed you to not have that situation ever come up
09:36:17 <immibis> maybe it can detect that B is waiting for A and *not* unschedule A to schedule B
09:36:34 <mrvn> Kind of makes me want to say that gang scheduling solves a problem the bad threaded design invented in the first place.
09:36:55 <mrvn> immibis: wake up A and lend it the timeslice from B.
09:37:06 <klys> a sched_yield() for related active pids?
09:37:49 <mrvn> immibis: wait queues can work that way easily. If B waits for a lock and has time left then the lock holder can run for the remaining time until it frees the lock.
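[The "lend the slice" idea above as a toy bookkeeping model; all names are invented, and a real kernel would do this inside its wait-queue code rather than as free functions.]

```c
#include <stdbool.h>

/* When B blocks on a lock held by A, B's remaining timeslice is donated
 * to A, so A can keep running until it releases the lock and unblocks B. */
struct ltask {
    int  time_left;   /* remaining timeslice ticks */
    bool blocked;
};

/* waiter blocks on a lock owned by holder: donate waiter's time. */
void block_on_lock(struct ltask *waiter, struct ltask *holder)
{
    waiter->blocked = true;
    holder->time_left += waiter->time_left;
    waiter->time_left = 0;
}

/* holder releases the lock: waiter becomes runnable again. */
void lock_released(struct ltask *waiter)
{
    waiter->blocked = false;
}
```

[This is the same intuition as priority inheritance, applied to remaining CPU time instead of priority.]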
09:39:04 <klys> the worst thing I usually experience with multiple processes running is that a memory hog can and will take over and thrash, and oom kill just doesn't happen.
09:39:30 <mrvn> klys: because it's swapping and swapping and never truly runs out?
09:39:51 <klys> mrvn, what puzzles me about it is that I have no swaps set up.
09:40:10 <immibis> you can't really thrash without swap
09:40:15 <mrvn> klys: everything mmaped SHARED is swapped out by simply forgetting it.
09:40:17 <klys> true.
09:40:25 <mrvn> immibis: yes you can
09:40:59 <mrvn> Or rather you have to consider every SHARED mmap as a bit of swap.
09:42:43 <mrvn> I would love to tell Linux to lock all binaries and libs into memory and only swap malloc()ed and other manually mmap()ed stuff.
09:43:09 <immibis> bet someone is relying on them being swappable
09:43:11 <nyc``> Communicating full processes can still busywait on shared memory and just plain pipeline processing can easily want to run in parallel or not at all (the pipeline will block anyway, so...).
09:43:31 <immibis> like maybe a game with a 1G data section of textures (hah, like games are ever that self-contained)
09:46:34 <mrvn> immibis: they usually don't have them linked in directly. But even if, so what? They need that 1G of textures. Leave it in memory. Swap the other 60GB of ram.
09:46:57 <immibis> mrvn: they're copied to the GPU and then they don't need to be in main RAM
09:47:14 <immibis> they might also be compressed in the executable (and yet they still manage to have GBs upon GBs)
09:47:15 <mrvn> immibis: depends. Does the gpu even have its own memory?
09:47:33 <immibis> even if it doesn't, you can't specify that the GPU reads from the mmap'ed copy
09:48:09 <immibis> your graphics API either makes a copy, or it has to be stored in memory you obtain from that API so you have to make a copy yourself
09:48:22 <mrvn> immibis: you could have some flags on the elf segment marking it for use of the GPU. So the kernel would load it directly to the gpu.
09:48:32 <immibis> which GPU?
09:48:48 <mrvn> that's for the GPU driver to then implement.
09:49:12 <immibis> my system has 2, I have another 2 in boxes if I wanted to properly stress test this idea
09:49:26 <immibis> (only 1 is actually in use)
09:50:01 <mrvn> the far easier thing is to say: Don't link your GPU textures into the binary. Load them at runtime and free the memory if they get copied to the GPU.
09:50:13 <immibis> then your executable is no longer self-contained
09:50:32 <immibis> i'm sure most people agree executables should be more self-contained than they actually are
09:51:15 <mrvn> immibis: 1) nothing moddable is, 2) that's what docker images and the like are for. A shared binary is never self contained anyway.
09:51:34 <immibis> would you recommend distributing games as docker containers?
09:51:50 <immibis> i wouldn't. for one thing, the graphics driver is a vendor-specific shared library
09:52:25 <mrvn> the image wouldn't bring its own X.
09:52:47 <immibis> there is a client-side driver library which gets loaded into the process.
09:52:54 <immibis> for OpenGL/Vulkan/etc
09:53:09 <mrvn> The only game I bought in the last decade is factorio and that comes as tar.xz. Unpack, run, done.
09:53:12 <immibis> this minimizes context transition overhead
09:54:44 <immibis> also infiniband drivers are similar
09:55:08 <mrvn> It could use some fancy stuff like a desktop file so it has an icon on the desktop. But for me it's ideal. I start stuff from console anyway.
10:00:57 <mrvn> Apropos factorio: I can't recommend it. It will leave you no time to work, eat, sleep or do any OS Dev work.
10:01:08 <mrvn> Huge distraction.
10:01:13 <mrvn> 8-P
10:05:49 <sginsberg> hi
10:06:12 <immibis> hi
10:06:22 <immibis> did you manage to interrupt unto yourself successfully?
11:10:54 <Pyjong> Is there any name for when a serial link sends characters back as a form of acknowledgement?
11:17:32 <sginsberg> yes immibis.
11:30:38 <mrvn> an echo?
11:34:28 <Pyjong> mrvn: yes, thanks..
11:39:46 <klys> local echo on/off
11:54:47 <mrvn> klys: local echo is more of a software thing of the terminal emulator. It simply prints what you type.
03:16:07 <bodo44> Hey guys
03:16:18 <bodo44> there is something i don't get during interrupt handling
03:16:34 <bodo44> do you always store the current registers in the task structure?
03:16:53 <bodo44> or do you only copy them when switching tasks or switching to a system call?
03:18:59 <bodo44> and when a system call is made. do you do it like: -> save userspace registers -> set task registers so it starts running in kernel space within the system call handler function -> once system call is finished, set registers back to userspace registers? or did i get this wrong. or do you handle syscalls directly before sending end of interrupt?
03:21:49 <mrvn> yes, no, yes, yes
03:22:24 <mrvn> and that makes no sense. syscalls aren't interrupts and have no EOI
03:23:02 <mrvn> To your earlier question: if you like and if you like.
03:23:26 <bodo44> Thanks so far. Is it a big performance drawback to always store the entire register state?
03:23:50 <mrvn> I always save all regs to the task struct (as opposed to the kernel stack and opposed to saving only a few). Makes debugging and task switching simpler for me.
03:24:14 <mrvn> Hard to inspect the user space regs when you don't know where they are saved.
03:24:34 <bodo44> You said syscalls arent interrupts, okay, so no EOI is needed - but i implemented them by doing a software interrupt
03:25:09 <bodo44> Do you then let the thread continue running in kernel space? or do you directly handle it in the syscall handler
03:26:06 <mrvn> bodo44: I do all my syscalls directly but I only have 5 of them.
03:26:48 <bodo44> okay :)
03:27:07 <mrvn> alloc(), sleep(), sendmsg(), recvmsg(), exit()
03:27:50 <mrvn> I might join sleep(), sendmsg(), recvmsg() into one call.
03:29:49 <mrvn> bodo44: On x86/x86_64 you let syscalls always run in kernel space. On ARM you might change modes.
03:30:10 <mrvn> bodo44: the big question is whether you enable interrupts or not.
03:33:09 <bodo44> my question is more like: does the scheduler (timer) interrupt a syscall that takes long
03:33:36 <bodo44> if it should be able to, that means that the syscall handler should be finished asap, and the task should just continue executing in kernel space
03:33:40 <mrvn> bodo44: do you want to?
03:34:31 <mrvn> bodo44: The syscall handler can be interrupted when you enable interrupts. Nothing to do with being in the syscall handler or not.
03:35:18 <mrvn> "int 80" I think always disables interrupts. For syscall you can configure it. And you can always enable them once you saved the regs.
03:36:44 <bodo44> mrvn: but while im in my interrupt handler - until i do iret - no other interrupt can interrupt me right?
04:38:52 <c32> hello, i'm in unreal mode, transfer a binary , go in Protected mode and jump to it but it doesn't work, qemu restarts in a loop
04:38:59 <c32> i verified that the binary is correctly loaded into the right address and jump to it with ljmp *0x100000
04:39:14 <c32> what could be going wrong ?
04:39:47 <c32> i tried debugging with gdb but i couldn't get it to work with 16bit code yet
05:04:13 <c32> IT WORKS! It was that jmp!
05:04:27 <c32> it should've used segment 1
05:04:51 <c32> ljmp $0x08,$0x100000
05:05:32 <doug16k> c32, to debug real mode code, first make sure it isn't x86_64 target, currently only i386 debugs real mode and 32 bit code correctly
05:05:52 <doug16k> c32, then, to enable real mode debugging, use this command: set architecture i8086
05:06:10 <c32> qemu-i386?
05:06:19 <doug16k> qemu-system-i386 will work
05:06:32 <c32> doug16k: Thanks a lot!
05:06:43 <doug16k> qemu-system-x86_64 will debug all wonky because of a workaround for a gdb bug in qemu
05:06:55 <doug16k> when debugging real or protected mode
05:07:39 <c32> ok, thanks!
06:00:46 <jmp9> okay i fixed kprintf for x86-64
06:00:54 <jmp9> format functions now work fine
06:35:52 <Ameisen> I like this installer. "Choose the geographic area where you live." ... America ... US
06:35:56 <Ameisen> those are both accurate.
06:36:57 <renopt> United States of Antarctica
06:37:19 <renopt> (possibly a thing in the not too distant future?)
06:42:55 <Ameisen> Ah, yes, the Great City of the Ice Shelf
06:43:54 <Ameisen> The Most Serene Grand Republic of Amundsen-Scott
06:44:20 <Ameisen> So... thought I had, maybe someone has some good input. My previous emulator (and current ones) have the restriction that I need to be able to execute N instructions, and then return.
06:44:36 <Ameisen> As far as I can tell, this prevents most intraprocedural optimizations, as I cannot 'smear' the results of instructions
06:45:51 <Ameisen> now, the naive solution to this would be to create N versions of each block/trace, so depending on how many instructions are left to call, I would call into a Nl % N version of that function which can be optimized on that basis
06:45:59 <Ameisen> that's going to be pretty size-consuming (and time consuming for AOT)
06:46:23 <Ameisen> and still isn't perfect, of course, it just makes it so a certain number of instructions can be optimized together for each variant
06:46:50 <Ameisen> the other alternative, in my mind, would be an equivalent to an 'unwind table' that lets the state be 'reset' to where it would be when the caller needs to return
06:46:59 <Ameisen> not sure if I'm missing something.
08:14:47 <nyc``> I need to nowebify all my source.
08:37:04 <knebulae> @nyc: what does nowebify mean?
08:49:17 <knebulae> can you believe this motha-fucking-shit? https://www.i-programmer.info/news/149-security/12556-google-says-spectre-and-meltdown-are-too-difficult-to-fix.html
08:49:27 <knebulae> BS. I've fixed it. Lol.
08:50:15 <knebulae> All those smart people and no one's thinking. Par for the course.
08:50:25 <FreeFull> knebulae: How did you fix it?
08:50:41 <FreeFull> By using a CPU that doesn't do speculative execution?
08:51:47 <nyc> IA64 sounds like a good start.
08:51:57 <FreeFull> Itanium?
08:52:06 <nyc> Yes, Itanium.
08:52:06 <FreeFull> There aren't many people that use that any more
08:52:19 <nyc> Do you understand why Itanium is immune?
08:52:32 <FreeFull> The compiler decides the scheduling?
08:52:49 <nyc> Well, to the extent there is speculation, it's all compiler-driven.
08:53:15 <FreeFull> The thing is, nobody is using Itanium on their devices
08:53:19 <FreeFull> It's all either x86 or ARM
08:53:26 <nyc> They should start.
08:53:36 <nyc> In order to fix Spectre/Meltdown.
08:54:02 <FreeFull> Who's going to start making the new Itanium chips? It doesn't seem like Intel wants to.
08:54:16 <FreeFull> Also, people like having existing software keep working
08:57:19 <knebulae> @FreeFull: no system code ever executes on an application processor. No usermode (well, non admin) access to HPC or RDTSC. That doesn't mean applications can't compromise each other though.
08:57:37 <doug16k> nyc, it's not immune. not unless there is no cache and no way to do precise timing
08:58:17 <nyc> doug16k: There is no automatic speculation.
08:58:25 <FreeFull> knebulae: You want to cram multiple processors with different architectures into the same device?
08:58:34 <FreeFull> Kernel code running on one, and usermode software on another?
08:58:42 <doug16k> it's probably easier on ia64 - you can mark an instruction as expecting an exception and it will not crash until you try to use the value
08:58:50 <knebulae> @FreeFull: that too, but it's possible heterogeneously.
08:58:59 <knebulae> Not as safe
08:59:40 <doug16k> so, the ia64 will happily let you do illegal accesses, because it is by design - it is there so the compiler can aggressively schedule loads ahead of the point where it is sure it is safe
09:01:57 <doug16k> so `if (thing) other = *thing;` can schedule that *thing load way before the if
09:02:06 <FreeFull> You can get CPU that does no speculation, just don't be surprised when it's slow
09:03:18 <doug16k> I'm not saying I'm sure ia64 isn't immune, but I suspect that that unsafe load thing can be exploited
09:03:49 <knebulae> @FreeFull: my concept of interrupt sleds (direct interrupt to usermode) has proven to be performant once you pass 16 cores / 32 threads.
09:05:31 <FreeFull> A very large chunk of CPUs currently used do not have that many cores
09:05:43 <knebulae> @FreeFull: I'm not writing a kernel for today
09:06:27 <FireFly> nyc: as far as I understand, Meltdown doesn't necessarily depend on speculation..
09:07:53 <nyc> Eh, it's yet another instance of the same speculation leak pattern that happens to involve the cache.
09:09:25 <nyc> ```Oracle has stated that V9 based SPARC systems (T5, M5, M6, S7, M7, M8, M10, M12 processors) are not affected by Meltdown, though older SPARC processors that are no longer supported may be impacted.``` <--- I'm not sure whether to actually believe this.
09:09:56 <doug16k> nyc, amd x86 isn't affected by meltdown either
09:10:14 <doug16k> Advanced Micro Devices
09:10:40 <knebulae> @FreeFull: I'm not sure exactly what mental model you have of a kernel that's fully dialed-in on its cpu, but with the design I have in mind, at least the way I visualize the "flow" is that it's an inside-out kernel.
09:10:41 <nyc> doug16k: AMD is more credible than Oracle. The security bigwig at Oracle has a history of bad behavior.
09:11:01 <doug16k> yes, gotta keep customers afraid enough to keep paying for the expensive one
09:11:20 <CompanionCube> *cough* That One Blog Post *cough*
09:11:49 <knebulae> Instead of the flow being through the system ring out to the user ring, it flows from the user ring into the kernel ring, and since that ring is isolated (the bsp never leaves ring0), it is free to handle its own tasks without interruption.
09:12:17 <FreeFull> Fuck Oracle
09:12:37 <FreeFull> Their whole business relies on screwing people over
09:12:50 <knebulae> And without processes knowing what it's doing.
09:13:27 <nyc> I'm probably not concerned with quite the same issues, but I'm very unhappy that Oracle is near-killing if not actually killing SPARC.
09:13:33 <knebulae> APs at that point become "Just Another System Resource"
09:13:45 <CompanionCube> nyc: killing, present tense?
09:13:47 <knebulae> Macro-scheduling for manycore becomes trivial
09:13:51 <FreeFull> As soon as Oracle bought Sun, a lot of people were not happy
09:14:00 <FreeFull> Because they knew what it meant
09:14:02 <knebulae> O.R.A.C.L.E.
09:14:26 <nyc> CompanionCube: There are news stories that all hardware engineers i.e. SPARC designers had been laid off.
09:14:38 <CompanionCube> exactly.
09:14:48 <CompanionCube> which makes the usage of the present tense questionable
09:15:02 <nyc> CompanionCube: There are more recent reports than http://fortune.com/2017/09/05/oracle-layoffs-hardware-solaris/
09:17:25 <renopt> of everyone that could have bought sun, why did it have to be oracle
09:17:26 <knebulae> It's kind of a complete re-think of a kernel with pencil and paper from the Intel & AMD x64 manuals, my knowledge of hand assembly, and 17 years of hanging around here. 4 OS textbooks. Just saying f*ck it all, and literally throwing all the conventional wisdom out the window.
09:18:02 <knebulae> And I made a non-microkernel, microkernel.
09:18:13 <knebulae> I.e. all of the isolation, none of the shit.
09:18:34 <doug16k> nyc, you can use paging attributes to make all code pages uncacheable. that way, every code fetch is from UC memory, so every instruction is a serializing instruction. it won't speculate one instruction ahead. it will stall every instruction until retirement
09:20:35 <knebulae> I had the idea in 1997 when playing with the flux-os toolkit from the guys in Utah. But life happened. And I was not a very good developer back then. I still don't think I'm a good developer, but I sure as hell am a lot better at reading, locating and understanding the information I need; plus the internet in general is a lot better now too.
09:21:50 <bleb> does anyone have tips for finding jobs to do OS development? feels like all the jobs are webshit these days
09:22:23 <nyc> It's a little inside information, but: https://www.thelayoff.com/t/U5uIwh1
09:22:25 <knebulae> @bleb: embedded & iot. c and c++.
09:22:35 <doug16k> so, you'll get something like 15 uncore cycles per instruction (~375 core cycles?), so it would run (very) approximately 10.6 MIPS on a 4GHz machine?
09:22:54 <knebulae> @doug16k: is that directed towards me?
09:23:15 <nyc> That basically means Fujitsu is abruptly EOL'ing their SPARC lines.
09:23:17 <doug16k> nyc. you can utterly kill perf and completely fix speculation :D
09:23:43 <knebulae> @doug16k: nvm.
09:23:51 <nyc> bleb: Not me. I haven't been able to get a kernel hacking job since 2009.
09:25:49 <doug16k> I gotta time some UC code to see exactly how terribly it does (or how well, knowing those guys!)
09:26:04 <doug16k> and gals! :P
09:27:53 <doug16k> many years ago I screwed up linux kernel so it never cached anything ever. when booting my drive made such terrifying horrible grinding noises I didn't have the heart to let it go on. now scale that up 1000x and that's UC code
09:28:17 <doug16k> disk cache I mean
09:29:32 <doug16k> deliberate experiment, it showed me the utter necessity of a bit of disk caching
09:30:11 <doug16k> that drive was a 10k rpm drive too, so it seeked hard, really hard
10:14:10 <klys> woot sparc
10:39:48 <geist> nyc: fujitsu has invested in their arm64 core, maybe that's replacing it?
10:40:10 <klys> https://www.ebay.com/itm/Sun-SPARC-T5220-Server-1x-8-Core-1-2GHz-32GB-RAM-DVD-No-HD-or-Rails-/312452561274
10:41:40 <nyc> geist: It's sad that yet another architecture is lost.
10:46:31 <geist> yep
10:46:38 <Ameisen> yay, I need to find a new mob.
10:46:40 <Ameisen> job*
10:46:43 <geist> though i personally never had much love for sparc
10:47:48 <nyc> Well, you know my personal commitment to SPARC.
10:50:29 <geist> not particularly, no.
10:51:40 <geist> i've always thought of sparc as one of the most boring of all risc machines
10:51:57 * Ameisen isn't sure if any of y'all are hiring
10:52:01 <geist> only thing vaguely interesting about it is the register window, and i think most modern cpu architecture folks consider that a long term mistake
10:52:08 <klys> https://www.ebay.com/itm/Sun-Oracle-SPARC-T3-1-16-Core-1-65GHz-2x-1200W-PSU-Enterprise-Server-NO-RAM-HDD/132941907886
10:52:13 <klys> >16 cores
10:52:53 <geist> well, == 16 cores
10:53:00 <geist> and no ram. i bet ram may be hard to find
10:53:13 <klys> there was plenty of ram when I searched
10:54:17 <geist> cool
10:54:37 <geist> i think the T3 cores are pretty derpy, but it would be fun to fiddle with
10:54:53 <klys> does that count as sparc64
10:55:42 <geist> i never completely got the difference (if any) between sparcv9 and sparc64
10:56:02 <geist> it may be something like ppc and power, where they're basically the same thing, though in different contexts they may differ
10:56:38 <nyc> sparc64 was never an official name.
10:56:45 <geist> it's my completely uninformed recollection that the fujitsu stuff was always called sparc64 whereas sun stuff followed one of the sparc specs (7 8 and 9)
10:57:00 <geist> yah may just be a trademark/marketing thing
10:57:27 <geist> also see altivec vs vmx vs velocity engine
10:59:04 <nyc> The Linux sparc32 maintainer collection that I had in my possession was lost during my worst MRSA attack in 2008, so my relationship with SPARC has been tragic for a while.
10:59:33 <klys> > 131.50 total 16 core: https://www.ebay.com/itm/Sun-Microsystems-Oracle-SPARC-T3-1-Server-16-Core-1-65GHz-No-RAM-or-No-HD/382797060794?epid=1283622801
11:00:59 <klys> oh nuts that's an auction
11:02:00 <geist> also seems to be missing drive trays
11:02:06 <klys> well some of those other deals, unless you want to wait for things to really start coming unglued
11:02:07 <geist> wonder how easy it would be to get one
11:02:27 <geist> the only little sparc machine i ever had much love for was a spunky little Netra T1 that a buddy of mine and I jointly bought
11:02:37 <geist> it was pretty slow, but it served static web pages like a champ
11:03:24 <geist> https://www.ebay.com/itm/Sun-Netra-T1-105-Rack-Mount-Server-w-2-4GB-Hard-Drives/232931525338
11:04:04 <klys> yeah that looks like one
11:05:46 <geist> it was part of a push sun was doing around 2000 or 2001 where they were selling cheapo sparcs
11:06:00 <geist> blade 100 was also a cheapo one. about $1k. i had one of those too
11:06:08 <klys> oh that explains why the cpu specs aren't listed
11:06:21 <geist> it's a ultrasparc IIi, single core. 440Mhz
11:06:35 <geist> there was a run of low end ultrasparcs there. IIi, IIe
11:07:11 <geist> the blade 100 was interesting: it was basically a ultrasparc IIe, which was basically kind of modern x86 soc looking. memory controller on board + PCI controller
11:07:25 <geist> so they glued it to a standard PC southbridge, and so you have a PC looking architecture + sparc
11:08:00 <geist> there were some PPC based machines a little later that looked pretty similar. pegasos and whatnot
11:08:41 <geist> https://www.ebay.com/itm/Sun-Blade-100-502MHz-256MB-Ram-CD-No-NVRAM/292263260046
11:08:50 <geist> if you want a cheapo sparcv9 to hack on, that's it
11:12:36 <geist> if i didn't have a desire to not pick up more crap, i'd totally impulse buy the blade 100
11:14:31 <klys> https://opencores.org/projects/s1_core
11:17:13 <klys> geist, the last thing you picked up was that trs-80 model 100, right?
11:18:41 <klys> I'm here next to three largely empty server/router cabinets
11:24:29 <geist> klys: yes
11:24:39 <geist> it was a real hit at the office
11:25:06 <klys> I guess that has pretty much a built-in os
11:26:52 <klys> have you hooked it up to a serial port?
11:39:18 <doug16k> klys, the original basic rom was written by the designer but later basic was microsoft's basic @ 12KB
11:40:15 <doug16k> that would be its OS
11:40:49 <klys> doug16k, I'm p.sure it can run a cp/m ho
11:40:53 <klys> though*
11:41:31 <doug16k> being a Z80 that wouldn't be a surprise
11:42:02 <klys> it's an 8085
11:42:08 <doug16k> n what
11:42:22 <klys> predecessor to the z80
11:42:25 <klys> intel chip
11:42:34 <doug16k> trs-80 had z80
11:42:42 <doug16k> that's why there is an "80" in the name
11:43:15 <klys> I mean, it has p.much the same instruction set
11:43:27 <doug16k> oh similar yes
11:43:37 <Ameisen> On sites like Google Jobs... does it even matter what position you apply for, or does your resume just go into a virtual list somewhere
11:43:50 * Ameisen questions if he's qualified to work at Google anyways
11:44:04 <Ameisen> I'm surprised that the Z80 is still widely used
11:44:27 <azonenberg> Ameisen: i've never got a job by submitting my resume to a giant pile
11:44:32 <azonenberg> not even an interview
11:44:36 <klys> ameisen, usually if you want a z80 you run an emulator like zsim, etc.
11:44:46 <azonenberg> it's always been knowing somebody and sending it to them directly
11:44:57 <azonenberg> then if they like it, send a formal application in later on to make HR happy so they can proceed with interviews
11:44:57 <Ameisen> that's also been my experience, except in the case of _one_ job
11:45:19 <Ameisen> HR contacted me for a QA/tester position, I said I was a programmer, got interviewed and hired.
11:45:33 <Ameisen> but I already had experience before that.
11:45:47 <Ameisen> I've just been out of the 'finding work' game for like three years
11:45:57 <Ameisen> and haven't solidly made any connections lately.
11:46:08 <doug16k> what you do is impress the hell out of everyone at the moderately good job until someone with a friend at some kick ass place gets a big referral bonus for suggesting you :D
11:46:10 <azonenberg> Did you walk into a bar and order -1 beers, followed by 65536 beers, and then an octopus?
11:46:11 <azonenberg> :p
11:47:08 <immibis> did you remember to order a ?
11:47:18 <immibis> and a ' OR 1 = 1 OR '' = '
11:47:24 <azonenberg> oh and then some under-21 walks into the bar and orders a glass of water
11:47:29 <azonenberg> the bartender explodes, killing everyone in the room
11:47:44 <azonenberg> because the testers only tried beer :p
11:47:53 <klys> doug16k, I stand corrected. the model 100 did not have a cp/m written for it.
11:48:00 <immibis> order a bartender
11:48:18 <Ameisen> azonenberg - I ended up getting interviewed as a programmer
11:48:23 <doug16k> klys, it was probably feasible to do it though
11:48:23 <Ameisen> which was an interesting interview.
11:48:52 <Ameisen> then from there I got my last job.
11:49:03 <Ameisen> medical reasons and such aside, I no longer have said job.
11:49:09 <Ameisen> still have a house, though
11:49:12 <Ameisen> and a family
11:49:37 <Ameisen> our QA testers at that company were never very good at trying completely random things to break stuff. _I_ was better at that with testing my own stuff.
11:50:16 <Ameisen> we had a bug in control code where trying a diagonal movement on a dpad didn't work. Nobody ever caught that. Most of the devs didn't use the dpad.
11:50:38 <Ameisen> it was _never_ reported to me. I only found out because someone mentioned it on a forum. Wasn't allowed to patch it. Fixed it in the next game.
11:51:53 <doug16k> yeah. everyone tests 5 divided by zero, and check for error. but do they ever press "plus 5" and press equals at that point? there aren't enough negative tests
11:53:03 <Ameisen> I did have a bug that was 'Wonder Woman's Invisible Plane doesn't have a texture and is invisible'
11:54:35 <Ameisen> we were also bad at tracking what builds QA was using (at some point I just started embedding a bunch of build/svn information into logs so I knew which commits were in builds)
11:54:42 <Ameisen> for some reason management was hesitant to do that officially.
11:54:58 <nyc> I honestly don't think I'll ever have or be able to get a job again. I mostly think my efforts will be some kind of postmortem talent display like TempleOS only nowhere near as impressive and attracting zero notice from anyone --- basically a dead github repository nobody will ever look at for a dead programmer no one will remember.
11:55:22 <doug16k> a bit fatalistic no?
11:56:37 <klys> nyc, if you're targeting all-platforms, you should perhaps write an intel-syntax assembler, that could get popular.
11:56:45 <nyc> doug16k: It's rather optimistic about how much I'll actually accomplish. Being more realistic it will probably be an embarrassment that I'll never be able to get anywhere with and never have good excuses for where the time went or why I haven't fulfilled any of the promises of eventual code delivery.
11:57:49 <doug16k> don't worry about other people, do it for your own enjoyment
11:58:44 <klys> doug16k, have you worked on your todo/wishlist file yet
11:58:45 <nyc> And, for that matter, an embarrassment about how I failed to live up to the standard I set in my former career.
11:59:40 <nyc> (which, in all actuality, wasn't that impressive in the first place)