channel logs for 2004 - 2010 are archived at http://tunes.org/~nef/logs/old/
#osdev is the only channel with logs prior to 2004
Wednesday, 20 February 2019
12:17:51 * nyc reads the seL4 manual.
12:18:42 <mahaxemu> anything good in there?
12:21:48 <nyc> mahaxemu: Well, it's a microkernel, so a lot of kernel projects could be done as e.g. memory managers that run as applications of a sort, or userspace things that would be drivers.
12:22:24 <mahaxemu> so like multi-user DOS ?
12:23:45 <nyc> mahaxemu: Hmm, not entirely.
12:27:26 <nyc> Ouch, scheduling is a bit weak.
12:27:49 <nyc> ```seL4 uses a preemptive round-robin scheduler with 256 priority levels. When a thread creates or modifies another thread, it can only set the other thread’s priority to be lower than or equal to its own. Thread priority can be set with seL4 TCB Configure() and seL4 TCB SetPriority() methods.```
12:27:58 <nyc> (from http://l4hq.org/docs/manuals/seL4-1_2.pdf)
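A minimal sketch of the rule quoted above, not seL4's actual code: one round-robin queue per priority level, highest level served first, and a creator that cannot hand out a priority above its own. The class and method names are hypothetical, and treating a larger number as a higher priority is an assumption of this sketch.

```python
from collections import deque

NUM_PRIORITIES = 256  # number of levels, taken from the quoted manual text


class RoundRobinScheduler:
    """Toy model: one FIFO ready queue per priority level."""

    def __init__(self):
        self.queues = [deque() for _ in range(NUM_PRIORITIES)]

    def add(self, name, prio, creator_prio=NUM_PRIORITIES - 1):
        # A creator may only assign a priority lower than or equal to its own.
        if not 0 <= prio <= creator_prio:
            raise ValueError("priority out of range for this creator")
        self.queues[prio].append(name)

    def pick_next(self):
        # Assumption: larger number means higher priority, so scan downwards.
        for prio in range(NUM_PRIORITIES - 1, -1, -1):
            queue = self.queues[prio]
            if queue:
                name = queue.popleft()
                queue.append(name)  # rotate within the level: round-robin
                return name
        return None


sched = RoundRobinScheduler()
sched.add("a", 10)
sched.add("b", 10)
sched.add("c", 5)
order = [sched.pick_next() for _ in range(4)]
print(order)  # ['a', 'b', 'a', 'b']
```

In this toy model "c" never runs while "a" and "b" stay ready, and there is no notion of CPUs, gangs, or NUMA nodes, which is the limitation nyc goes on to describe.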
12:29:32 <mahaxemu> what's weak about it, the round-robin part?
12:30:07 <mahaxemu> iirc it has to have real-time latency guarantees
12:32:13 <nyc> mahaxemu: It doesn't allow one to write a scheduler and/or process manager esp. when it comes to SMP and NUMA affairs e.g. gang scheduling virtual machines.
12:33:27 <nyc> ```seL4 does not allow Page Tables to be shared, but does allow pages to be shared between address spaces.``` <--- that also hurts because sharing pagetables for arches that use the things is critical to workload feasibility in many instances.
12:36:33 <mahaxemu> how does something like that work, the data in the page is shared but the virtual addresses are different?
12:38:31 <Mutabah> mahaxemu: Same physical page, same virtual address, but the paging structures are different instances
12:45:09 <nyc> I'm not sure if mahaxemu was asking about what the seL4 docs said or what I mentioned about shared pagetables.
12:46:12 <mahaxemu> i'm not sure why shared pagetables is such a big deal
12:46:51 <mahaxemu> but i know little of paging so that's probably why
12:49:11 <Mutabah> nyc: is that just an efficiency hit?
12:49:44 <nyc`> mahaxemu: Large-scale sharing of large memory regions duplicates the pagetables covering the shared memory a large number of times. Typical is 1000+ processes sharing regions that are 2GB or more.
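Back-of-the-envelope arithmetic for the figures nyc gives (1000 processes, a 2GB shared region). The 4 KiB page size and 8-byte page-table entry are assumptions (typical of x86-64, not stated in the log), and only last-level entries are counted.

```python
def leaf_page_table_bytes(region_bytes, page_bytes=4096, pte_bytes=8):
    """Bytes of last-level page-table entries needed to map one region."""
    pages = -(-region_bytes // page_bytes)  # ceiling division
    return pages * pte_bytes


GIB = 1 << 30
per_process = leaf_page_table_bytes(2 * GIB)  # one private copy of the tables
total_unshared = 1000 * per_process           # every process has its own copy

print(per_process // (1 << 20), "MiB per process")         # 4 MiB per process
print(round(total_unshared / GIB, 2), "GiB across 1000")   # 3.91 GiB across 1000
```

Under those assumptions the page tables alone cost roughly twice the size of the region they map, which shared page tables would reduce to a single 4 MiB copy.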
12:50:55 <knebulae> nyc: I used to be obsessed with seL4
12:51:29 <nyc> knebulae: Interesting. How do the capabilities work?
12:52:52 <knebulae> Basically (this is from about 9 years ago, so my memory is very poor), what they proved is that the microkernel does not crash and that it does not have security violations. Meaning, if the kernel grants a capability (which is essentially an access token) to or for some resource, that that capability can not be impersonated, and there is no viable code path where the kernel will grant access to an invalid capability.
12:53:38 <nyc> knebulae: I'm sort of lost as to how they get used to control or represent anything.
12:53:40 <knebulae> And by does not crash, I mean its interrupt delivery, task switching and fault handling code were complete and error free.
12:54:01 <knebulae> So it basically never crashes. But it also doesn't do anything. So some academics have said it's worthless.
12:54:19 <knebulae> You can run linux personalities on it, and the machine will never crash, but the linux personalities can crash.
12:54:33 <knebulae> @nyc: they are just an abstract concept
12:54:42 <knebulae> the capabilities represent whatever you program
12:54:53 <mahaxemu> how can they not be impersonated?
12:54:56 <knebulae> it makes more sense to think of it as an access token
12:55:03 <knebulae> @mahaxemu: you
12:55:16 <knebulae> @mahaxemu: you'd have to read their papers. It's been far too long for me.
12:55:29 <mahaxemu> good call, that's interesting if true
12:55:58 <nyc> What on earth do they represent?
12:56:18 <knebulae> @nyc: access to a shared resource, or the right to receive an irq for example.
12:56:28 <knebulae> All drivers were usermode. True microkernel.
12:56:47 <knebulae> They used the capabilities to maintain the integrity of the system.
12:57:14 <knebulae> And had a basic scheduler and a very limited libc.
12:57:20 <nyc> Okay, right to receive an interrupt makes sense. A shared resource is a bit abstract.
12:57:23 <knebulae> I hacked on it before they got the x64 port going.
12:57:31 <knebulae> @nyc: a disk controller
12:57:37 <knebulae> @nyc: a display adapter
12:57:41 <knebulae> etc.
12:58:14 <knebulae> The reason the kernel didn't crash is because all it did was verify capabilities and pass messages.
12:58:22 <knebulae> The verified version is like 8800 lines of C code.
12:58:42 <knebulae> x86 32-bit at least
12:59:56 <nyc> The seL4 approach to SMP is sort of a non-starter for me.
01:00:05 <knebulae> Plus, they got as fast as you can get with a microkernel (although the commercial pistachio guys may be faster, but their L4 kernel is in C++).
01:00:38 <knebulae> It's not open source, or at least wasn't the last time I cared to check.
01:01:41 <knebulae> @nyc: seL4 was an academic exercise. SMP performance was not their goal. Their goal (well, Dr. Liedtke's goal) was to provde that a microkernel could be made bulletproof and fast, and to prove that the mach guys just did it wrong.
01:01:55 <knebulae> s/provde/prove/
01:02:38 <knebulae> and he largely succeeded. But now the problems moving forward with manycore are just different.
01:02:43 <nyc> Before I got buried in Linux, I was in AIX and before that DYNIX/ptx and before that nowhere near kernels apart from a class with XINU ... no idea about other kernels micro or otherwise.
01:03:30 <knebulae> @nyc: well, I'm not a historian either. I just had a soft spot for seL4, the little kernel that could.
01:06:08 <nyc> It looks like capabilities are destinations for messages.
01:06:27 <knebulae> @nyc: it should be noted that they took a ton of liberties (cough shortcuts) with the verified implementation of seL4 that made it very problematic to port even to x86-64. It took several years to get that going.
01:06:49 <knebulae> @nyc: right. again, long time.
01:07:04 <knebulae> but yes, if you are not entitled to send a message to a destination.... etc. microkernel.
01:10:41 <nyc> The part that's harder to figure out how to do somehow anyhow is CPU scheduling and L4 basically completely dodges the issue by being a UP multikernel.
01:16:31 <knebulae> @nyc: and it scales really well
01:16:58 <knebulae> lol. I told you. 20+ years of writing slot machines. Lottery scheduling is the only way unless you want to dedicate CORES to scheduling only.
01:17:17 <knebulae> for manycore, not today
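For readers unfamiliar with the term, a minimal sketch of lottery scheduling in general (not knebulae's scheduler, which the log does not show): each ready thread holds some tickets, and the next thread to run is the holder of a randomly drawn ticket. The function name and the ticket counts are hypothetical.

```python
import random


def hold_lottery(tickets, rng=random):
    """Pick a thread name with probability proportional to its ticket count."""
    total = sum(tickets.values())
    if total <= 0:
        raise ValueError("no tickets in play")
    winner = rng.randrange(total)  # index of the winning ticket
    for name, count in tickets.items():
        if winner < count:
            return name
        winner -= count  # skip past this thread's block of tickets


rng = random.Random(1234)  # seeded only so that runs are repeatable
tickets = {"a": 3, "b": 1}
wins = sum(hold_lottery(tickets, rng) == "a" for _ in range(4000))
print(wins, "of 4000 draws went to 'a'")  # expect roughly three quarters
```

The draw needs no per-thread history, which is the appeal when decision time matters more than decision quality.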
01:18:01 <nyc> knebulae: That doesn't count as scaling so much as artificial disaggregation.
01:18:17 <knebulae> @nyc: you say potato. it worked.
01:18:44 <knebulae> The cpu doesn't know the difference.
01:19:03 <knebulae> Everything else is an artificial abstraction
01:20:04 <knebulae> @nyc: come on nyc, expand yo' mind... lol.
01:20:25 <nyc> When you have actual shared memory parallelism to implement e.g. big fat number crunching with real cross-thread memory reference these kinds of cop-outs do not fly.
01:21:03 <knebulae> @nyc: cop outs? the kernel's job is to establish the communication and get the f*ck out of the way.
01:21:25 <knebulae> Not play fancy games
01:21:31 <knebulae> and guess on behalf of the user
01:21:49 <knebulae> if the user wants better scheduling for a starved thread, he can adjust the priority
01:22:33 <nyc> Partitioning the system in order not to have to implement SSI scheduling and more is not providing SSI scheduling and shared memory.
01:24:05 <knebulae> @nyc: I'm not sure we're on the same wavelength. I'm not suggesting that SSI services not be provided.
01:24:24 <nyc> SSI == Single System Image
01:24:42 <knebulae> I'm saying that we spend too much time in the kernel on decisions (like which thread to run), that are not really going to be as relevant as they were in 1-4/8 core designs.
01:24:45 <knebulae> I know
01:25:10 <nyc> Multikernel means software partitioning and so not being SSI.
01:25:20 <knebulae> When you have more cores than threads, who cares how fancy the scheduling algorithm is?
01:26:02 <knebulae> I'm not suggesting to be multikernel (at least my design)- I'm kind of speaking to two different (but not unrelated things).
01:26:23 <nyc> seL4 is multikernel AFAICT
01:26:36 <knebulae> First, seL4 and their just wantonly running a kernel per cpu approach. I was more interested in their work in making microkernels fast.
01:27:27 <knebulae> That led me to the point that we are wasting time with fancy scheduling and ML scheduling when it won't matter in 10 years. Which is related, but not the same thing as my seL4 comment. Sorry about that.
01:27:54 <knebulae> I muddied the waters.
01:29:29 <nyc> It actually takes some load balancing to spread tasks out well.
01:29:53 <knebulae> @nyc: depends on how far into the future you're talking
01:29:53 <nyc> Not sure what you mean by ML scheduling.
01:30:24 <knebulae> @nyc: scheduling based on observed behavior (statistical, or as it is likely sold- machine learning).
01:31:26 <nyc> That must not really be used by people with any AI background.
01:32:30 <knebulae> @nyc: I have no idea. I played with NN crap in C maybe in 2002 or 2003. I wasn't really at a level that I could wrap my head around it then.
01:33:02 <nyc> AI people will pretty much think of ML like Moscow ML, Standard ML of New Jersey, Caml / Ocaml, etc.
01:33:40 <knebulae> right
01:37:56 <nyc> http://www.cs.utah.edu/flux/papers/atomic-osdi99.ps.gz looks promising, though I had to convert it to PDF to read it.
01:38:42 <knebulae> @nyc: those guys did the flux-os toolkit. I was messing with that back in 97
01:38:55 <knebulae> I was such a newb I couldn't even get it to compile.
01:39:20 <knebulae> I had just taken a semester of C++ on Sun machines using gcc (first exposure to GNU). Lol. Blast from the past.
01:39:46 <nyc> Fluke is interesting because of its use of interrupt model programming.
01:40:04 <knebulae> @nyc: I am unfamiliar.
01:41:59 <nyc> Instead of having threads that each have a stack it just has one stack per CPU and deferred execution is structured as queued-up callbacks.
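A rough illustration of the interrupt model as nyc describes it, not Fluke's code: a single run-to-completion loop per CPU, where an operation that would block queues a continuation and returns instead of sleeping on its own stack. All names here are hypothetical.

```python
from collections import deque


class Cpu:
    """One queue of deferred callbacks per CPU; no per-thread stacks."""

    def __init__(self):
        self.pending = deque()

    def defer(self, callback, *args):
        self.pending.append((callback, args))

    def run(self):
        # Each callback runs to completion on the CPU's single stack.
        while self.pending:
            callback, args = self.pending.popleft()
            callback(*args)


log = []


def start_read(cpu, block):
    log.append(f"issue read of block {block}")
    # Instead of sleeping here, queue the rest of the work and return.
    cpu.defer(finish_read, block)


def finish_read(block):
    log.append(f"read of block {block} done")


cpu = Cpu()
cpu.defer(start_read, cpu, 7)
cpu.defer(start_read, cpu, 9)
cpu.run()
print(log)
```

Both reads are issued before either completes, yet only one stack is ever in use; the state a blocked thread would keep on its stack lives in the queued callback's arguments instead.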
01:43:28 <knebulae> @nyc: interesting. monolithic or microkernel?
01:43:39 <nyc> Fluke was a microkernel.
01:43:42 <knebulae> ok
01:46:38 <nyc> The important parts are async IO, low kernel thread overhead, and support for M:N threads.
01:46:38 <doug16k> sounds like you can put a few kV into that OS and it won't even blow it up
01:46:48 <nyc> doug16k: kV?
01:46:57 <doug16k> fluke multimeter joke. sorry
01:47:06 <doug16k> kilovolts
01:47:15 <knebulae> lol
01:54:27 <knebulae> @nyc: I'm just curious, because you mention M:N threads frequently, why is this such a big deal? I mean it's not really a challenging feature to implement on the OS side, and much like any feature the usermode guys get, they're clearly going to f*ck it up.
01:54:55 <knebulae> Provide an API and let them shoot themselves in the foot.
01:55:49 <nyc`> knebulae: Kernel threads don't scale because they consume pinned memory.
01:56:30 <knebulae> again, what is the problem with pinning memory and getting the affinity of the thread right in the first place?
01:56:38 <knebulae> s/getting/setting/
01:57:26 <knebulae> As I understand it, Linux does everything in its power to get it right, and also to make sure it doesn't have to migrate threads between cores.
01:57:30 <knebulae> I'm not sure what NT does.
01:57:58 <knebulae> Unless you're only talking about threads that share memory
01:58:17 <knebulae> which is dangerous anyway in kernel
01:59:26 <knebulae> maybe I'm taking a juvenile approach, or my understanding just isn't as complete as I think it is.
02:01:18 <nyc`> Hmm? Linux is terrible at all this. It doesn't even have gang scheduling at all.
02:01:55 <knebulae> I think I just missed it because I'm tired. You want M:N scheduling of kernel threads.
02:02:03 <knebulae> Duh. My bad.
02:02:34 <knebulae> I wasn't putting 0 and 1 together.
02:04:56 <knebulae> @nyc: a lot of these issues just really stop being issues when you get to manycore. My kernel is like Oprah. "You get a CORE" and "you get a CORE," and, well, you get the idea.
02:06:02 <nyc`> knebulae: Kernel threads differing from user threads is the point of M:N threads. You can't have them entirely in one or the other.
02:11:18 <knebulae> @nyc: by "kernel threads" I meant threads that run in kernel space only, not threads that use the operating system provided threading mechanism vs their own internal mechanism (stack-switching).
02:13:02 <knebulae> I hadn't considered M:N threading in a purely kernel threaded context, but depending upon the security model of the operating system, it could certainly benefit in the same way that userspace threads do, and that's generally to cut down on the address space switches and limit the switch to the stack.
02:14:01 <nyc`> knebulae: Well, you'll in general want to keep track of which CPU's have loaded up their TLB's and caches from where.
02:15:00 <knebulae> @nyc: if you avoid affinity changes (process migration) the importance of tracking that data becomes less important, no?
02:16:03 <knebulae> @nyc: and doug16k maybe can chime in, the cpu is damn good at keeping that sh*t hot.
02:16:37 <nyc`> Where something is supposed to wake up is the question.
02:17:02 <knebulae> @nyc: you have a thread that is blocked; and thread that is idle; or a thread that needs to run;
02:17:49 <knebulae> @nyc: sorry; you have threads that are blocked, threads that are idle, and threads that are ready to run.
02:18:07 <knebulae> Every nanosecond you spend determining which thread to run is too long.
02:18:31 <knebulae> With proper headroom.
02:19:31 <knebulae> Now, I have never gotten heavy into where what you've encountered matters most, and I've got to believe that's networking.
02:19:56 <knebulae> Tons of little messages that have to get delivered, and tons of threads blocking on the network receive.
02:20:03 <nyc`> There's more like whether it's push or pull load balancing.
02:20:24 <knebulae> @nyc: right, I see.
02:22:02 <knebulae> See, my thought is to provide userspace the tools to make their own decisions and craft their own process w.r.t. scheduling. We have gone bare, bare, bare on most boxes with virtualization (software-wise), so fully configuring a system's scheduling and execution policies should be well within the reach of a competent system administrator. "Let them tweak..."
02:22:39 <knebulae> Not you West Virginia. Sit back down.
02:22:58 <knebulae> just kidding :) I love WVa!
02:25:07 <nyc`> MCS locks were originally done in userspace on BBN Butterflies.
02:25:54 <knebulae> I will have to read about that.
02:26:28 <nyc`> Spinlocks aren't very good in userspace without gang scheduling.
02:26:57 <knebulae> I honestly believe if you write a great TINY os, with some really unique system layout, memory sharing, whatever else system calls, and really prove what you can do with userspace libraries, that people will love the flexibility such a system offers.
02:27:11 <knebulae> @nyc: you should never have to use a spinlock on SMP
02:27:50 <knebulae> @nyc: there should be a system call to wait on an event.
02:28:10 <knebulae> Probably the best (and one of the worst features) of NT. WaitForSingleObject
02:28:17 <nyc`> The point was that such an ancient system had these kinds of features.
02:29:12 <knebulae> @nyc: sometimes you need to take a step back and really (and I mean really) think about what you're doing. Think about what the cpu sees, and see if some of our long-held assumptions or just "this is the way we do it" still makes sense given the hardware advances of the last 40 years.
02:29:58 <knebulae> There's a lot of cruft. A lot of ideas that were not practical at the time, but might make sense now. Assumptions that are no longer true. Etc.
02:32:26 <knebulae> I remember a time when everybody re-wrote a bunch of code in the RISC days because it was faster to use linked lists rather than arrays. Then cpus changed, x86 moved to micro-arches with uops, and then it was faster to use indexes again, so arrays were all the rage again. The point is, sometimes you have to revisit things from time to time. Kick things around. Go back to brass tacks. Break out the pencil and paper and the trusty intel or AMD
02:32:26 <knebulae> manual. Remember that what your computer is doing is not on the screen. It's in the box next to you.
02:35:21 <nyc`> knebulae: Hmm? I'm not planning on an x86 port until after I load userspace on the startup arches.
02:37:46 <knebulae> @nyc: gotcha. I guess my point is I kind of feel where doug16k is coming from. A cpu is like a buzzsaw. Everything we, as systems programmers do, that interrupts or gets in its way, just slows it down in the long run.
02:38:31 <knebulae> We can slow it down as eloquently and as minimally as we like, but the effect is the same.
02:39:17 <knebulae> only amplified x100 or x1000 on manycore
02:40:37 <nyc`> Threads matter, too, and are interesting at least with more than 2.
02:41:32 <knebulae> @nyc: and that's something I hadn't considered, and that's "hyperthreading," which is clearly the rage at the high end as my Sun investigation over the last few days confirmed. POWER is big on it too.
02:41:54 <knebulae> x64 just doesn't have anything like it (well, more than 2), so I haven't given it proper consideration, and how that impacts scheduling.
02:42:24 <nyc`> They're sort of anti-affinity domains. You can pack things in them, but you would do best to minimize or spread out across cores.
02:42:25 <knebulae> or I should say my haphazard lottery scheduling (with a few rules)
02:43:50 <knebulae> @nyc: plus, I believe the level of support and the characteristics of each "more logical cores than physical cores" varies across architectures.
02:44:06 <nyc`> Another thing is that floating point heavy threads and integer heavy threads interfere with each other less, so there's even more to look at.
02:44:09 <knebulae> I didn't want to use the Intel trademarked term.
02:44:46 <nyc`> I thought Alpha called it SMT.
02:44:56 <knebulae> @nyc: yes
02:45:17 <nyc`> SPARC is beautiful with 8 threads per core.
02:45:37 <knebulae> @nyc: but when you start talking about a 32-core x64, it's pretty easy to put integer code on one core and fp code on another.
02:46:22 <knebulae> @nyc: yes, but x64 will have more physical cores- cheaper and sooner. And it's not getting closer.
02:46:46 <knebulae> Again, TODAY, that 32-core/64-thread x64 chip is $1300.
02:47:27 <knebulae> @nyc: but I'm not harping on architectures. And I wasn't harping on arches earlier either, sorry if you mistook.
02:47:34 <knebulae> I'm just in awe at what we're getting.
02:47:37 <nyc`> No, the point is that when you have to put threads together on a core, threads that use different functional units are best to group together.
02:48:20 <nyc`> Well, x86 needs to die. Hopefully Apple switches to ARM.
02:50:01 <nyc`> And Chromebooks switch to ARM too.
02:50:50 <nyc`> And the Chinese MIPS laptops go somewhere.
02:52:09 <nyc`> And Fujitsu takes over server space with its SPARC boxen.
02:52:46 <nyc`> And IA64 gets resuscitated.
02:54:11 <nyc`> And IBM goes BlueGene crazy with POWER everywhere for cheap enough to make a big hole in x86 space.
02:56:50 <nyc`> You know it's some bloated stuff when it takes multiple embedded systems of other architectures to do power management etc. for it.
03:04:58 <nyc`> I actually like SMT. Just get a big fat bag of functional units and issue whatever you've decoded to whatever's available so variations in it all esp. with cache misses can be amortized.
03:07:51 <ebrasca> Hi I need help with ext4 extents.
03:08:15 <ebrasca> What they are and where to find some documentation?
03:10:47 <doug16k> knebulae, the intel xeon phi is an x86 with an extremely large number of SMT threads per core
03:11:56 <doug16k> well maybe not extremely large number, 4
03:14:11 <doug16k> can get 256, 272, or 288 thread versions of their latest one
03:14:24 <doug16k> 72 core = 288 threads
03:15:08 <doug16k> > 3.4 TFLOPS they say
03:16:41 <doug16k> 400+ GB/s 16GB MCDRAM BW, 102.4GB/s DDR4 BW
03:20:11 <doug16k> oops, ddr4 is in bits
03:23:23 <doug16k> nyc`, ya, now there are so many functional units that it's a tad of a waste to not have SMT on because one instruction stream probably won't be able to use enough
03:25:03 <doug16k> two loads and a store every cycle is nuts. they'll be that good or better
03:29:49 <doug16k> ebrasca, do you have this? https://www.kernel.org/doc/html/latest/filesystems/ext4/index.html
03:30:35 <doug16k> ebrasca, are you asking in general what "extents" means?
03:31:26 <doug16k> anyway, more specifically, https://www.kernel.org/doc/html/latest/filesystems/ext4/dynamic.html#extent-tree
03:33:57 <geist> i wonder if i can get ahold of one of the later xeon phis
03:34:19 <ebrasca> doug16k: I need to implement extents for reading ext4 files.
03:36:03 <ebrasca> doug16k: For now my implementation can read from ext2 and ext3.
03:37:56 <doug16k> ebrasca, section 4.2.3 of that book I pointed out explains it pretty thoroughly. there is a header, an interior node format, and a leaf node format
03:37:58 <knebulae> I got knocked out guys and was away for a few minutes. @doug16k, the phi is very interesting to the way I solved the microkernel problem.
03:38:10 <doug16k> it's not clear but it sounds like they are describing a bit of a B+ tree
03:38:43 <doug16k> knebulae, oh good I was wondering how much you missed :)
03:39:36 <knebulae> @doug16k: a lot I think. I'll read the archive later.
03:40:42 <knebulae> @doug16k: around 12 minutes worth of convo from > 3.4 TFLOPS until you asked ebrasca "are you asking in general what "extents" means?" So yeah, apparently a bunch :/
03:42:02 <doug16k> ebrasca, i.e. each interior node of the tree has a sorted list of key+child_pointer items that tell you the lowest value in each subtree. you binary search those to find the right descendant to descend into
03:42:13 <doug16k> then once you reach the leaf, you binary search for the seek position
03:43:34 <doug16k> each "extent" is just an LBA+length pair. so many blocks starting at some LBA
03:44:16 <doug16k> ah, the interior node keys would be the file offsets. so they'd describe the lowest file offset you'll find if you descend into that child
03:49:57 <doug16k> it's this data structure but they use the variation where there isn't one more child pointer than keys in interior nodes, so it is slightly more redundant than the textbook B+ tree -> https://en.wikipedia.org/wiki/B%2B_tree
03:50:45 <doug16k> slightly simpler though
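The lookup doug16k describes can be sketched on an in-memory model of the tree. This is an illustration under assumptions, not an ext4 reader: interior nodes are lists of (lowest_file_block, child) pairs, leaves are lists of (first_file_block, length, first_disk_block) tuples, and the function name is hypothetical.

```python
from bisect import bisect_right


def find_disk_block(node, depth, file_block):
    """Map a file block to a disk block, or None if it falls in a hole."""
    while depth > 0:
        # Interior node: binary search for the last key <= file_block.
        keys = [key for key, _child in node]
        i = bisect_right(keys, file_block) - 1
        if i < 0:
            return None
        node = node[i][1]
        depth -= 1
    # Leaf: binary search for the extent that could contain file_block.
    keys = [first for first, _length, _start in node]
    i = bisect_right(keys, file_block) - 1
    if i < 0:
        return None
    first, length, start = node[i]
    if file_block >= first + length:
        return None  # between extents: a hole in the file
    return start + (file_block - first)


leaf_a = [(0, 8, 1000), (8, 4, 5000)]
leaf_b = [(100, 16, 9000)]
root = [(0, leaf_a), (100, leaf_b)]  # one key per child, as described above

print(find_disk_block(root, 1, 3))    # 1003
print(find_disk_block(root, 1, 9))    # 5001
print(find_disk_block(root, 1, 105))  # 9005
print(find_disk_block(root, 1, 50))   # None
```

Rebuilding the key list on every visit keeps the sketch short; a real implementation would binary search the on-disk records directly.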
03:55:32 <ebrasca> Is this B+ tree inside some file?
03:55:49 <doug16k> it replaces the logical block map
03:55:58 <knebulae> @ebrasca: on disk structure
03:56:15 <knebulae> @ebrasca: beneath the fs layer
03:56:30 <doug16k> you have an extent tree instead of a logical block map tree
03:56:30 <knebulae> well, *in* the fs layer
03:59:12 <doug16k> ebrasca, the logical block map tree is repetitive and a bit silly when describing an unfragmented file. extents allow each fragment to be up to 32768 blocks, in one extent tree leaf record
04:00:36 <doug16k> the old logical block map is a bit silly because it ends up using significant space to describe sequentially increasing LBA numbers
04:02:17 <doug16k> the old way isn't silly when the drive is fragmented though, in that case it doesn't cause an increase of metadata because of fragmentation so slightly better in that way
04:02:47 <doug16k> with unfragmented drive, extents blow away block map trees
04:02:57 <knebulae> on the whole rust discussions from yesterday, I just have to throw this one out there: RAII, not borrow-checking. 'nuff said.
04:04:53 <ebrasca> I think I get it, is it i_block from inode structure?
04:07:57 <doug16k> ebrasca, yes
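A hedged sketch of decoding the root of the extent tree from the 60-byte i_block field, using the on-disk layout given in the kernel ext4 documentation linked earlier (12-byte header, 12-byte records, little-endian). It handles only a depth-0 root and treats every record as an ordinary extent; the function name and the sample bytes are made up for illustration.

```python
import struct

EXT4_EXT_MAGIC = 0xF30A
# eh_magic, eh_entries, eh_max, eh_depth, eh_generation
HEADER = struct.Struct("<HHHHI")
# ee_block, ee_len, ee_start_hi, ee_start_lo
EXTENT = struct.Struct("<IHHI")


def parse_leaf_root(i_block):
    """Return [(first_file_block, length, first_disk_block), ...]."""
    magic, entries, _max, depth, _generation = HEADER.unpack_from(i_block, 0)
    if magic != EXT4_EXT_MAGIC:
        raise ValueError("inode does not use extents")
    if depth != 0:
        # Interior node: the records are index entries, not extents.
        raise NotImplementedError("only a leaf root is handled in this sketch")
    extents = []
    for n in range(entries):
        offset = HEADER.size + n * EXTENT.size
        block, length, start_hi, start_lo = EXTENT.unpack_from(i_block, offset)
        # Sketch only: lengths above 32768 (unwritten extents) are not decoded.
        extents.append((block, length, (start_hi << 32) | start_lo))
    return extents


# Hypothetical sample: one extent, file blocks 0..7 stored at disk block 1000.
sample = HEADER.pack(EXT4_EXT_MAGIC, 1, 4, 0, 0) + EXTENT.pack(0, 8, 0, 1000)
sample = sample.ljust(60, b"\x00")
print(parse_leaf_root(sample))  # [(0, 8, 1000)]
```

With a 12-byte header and 12-byte records, the 60-byte field has room for four records, so small unfragmented files need no extra metadata blocks.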
04:24:29 <ebrasca> doug16k , knebulae : Thank you!
04:25:48 <knebulae> @ebrasca: thank doug man! lol.
04:26:01 <knebulae> I mean, you did, but I don't deserve it.
04:26:57 <ebrasca> knebulae: I give thanks to both.
04:27:14 <knebulae> @ebrasca: right on.
04:47:35 <ybyourmom> I won again
04:47:36 <ybyourmom> kek
04:51:06 <knebulae`> up on the lappy; been meaning to do this for weeks.
10:38:21 <mrvn> wake-on-key-pressed should work with USB keyboards, right?
10:53:06 <bcos_> mrvn: "should" - sure
11:09:40 <renopt> hmmmmm, gcc's stddef.h defines size_t to be unsigned long, musl defines it to be unsigned int...
11:09:43 <renopt> T.T
11:11:39 <lkurusa> for what platform?
11:12:38 <renopt> x86, building a cross compiler
11:31:28 <knebulae> @renopt: there's nothing wrong with that. on x32 gcc, unsigned long is 32-bit and unsigned int is also 32-bit.
11:32:16 <knebulae> @renopt: it's the same data type to the compiler
11:35:01 <nyc`> I remember back when 32-bit ABI's were being said to be unnecessary because the 64-bit affairs alleviated so much register pressure it overwhelmed the hits from larger immediates and operands. Now it's doing the same as every other 64-bit arch, only vastly dirtier.
11:41:11 <nyc`> That is, for x86-64.
11:44:31 <nyc`> I think Fluke approached SMP by virtual processors but am still a bit foggy.
11:46:12 <nyc`> I might just drop the microkernel design at this point.
11:51:37 <renopt> knebulae: gives an error about redefining the typedef though
11:52:40 <knebulae> @renopt: then don't include gcc's stddef.
11:53:10 <knebulae> @renopt: I don't think musl requires it as a standalone libc written to compile under gcc and others.
12:12:53 <mrvn> knebulae: are you sure long is 32bit on x32? x32, not ia32.
12:15:18 <knebulae> I was speaking of ia-32
12:16:05 <knebulae> early morning shorthand
12:16:14 <lkurusa> IA-32 is ILP32 I believe
12:16:33 <mrvn> x32 could be IP32L64
12:17:01 <mrvn> Not that the L matters in any way. It's a horrible broken type and should never ever be used.
12:17:31 <lkurusa> I don't think I ever really use long, it's always either `int` or `uintXX_t`
12:17:39 <mrvn> lkurusa: good
12:18:04 <klys> bring it up to the iso c standards committee :.)
12:18:04 <lkurusa> (and their unsigned/signed counterparts respectively, depending on requirements, etc)
12:18:28 <knebulae> right. so stupid. so int got extended to 32, but not 64. ok. but long int is 32-bit on msvc (probably to maintain backwards 16-bit code compatibility), and long int is 64-bit on gcc. But gcc moved to long long, then to uint64_t/int64_t. I prefer the stated sizes.
12:18:50 <nyc`> Long is register size, int is the indeterminate one.
12:18:51 <mrvn> klys: the standard doesn't say int is 32bit or long long is 64bit. That's just common similarity on every 32+ bit system. But long is the odd one out with being different everywhere.
12:19:17 <mrvn> nyc`: no. On x86_64 on Windows long is 32bit. On Linux it is 64bit.
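The disagreement over long comes down to which data model a platform uses. A small table of three common models, written as a sketch; the dictionary and function names are hypothetical, and the platform examples in the comments are typical cases rather than an exhaustive list.

```python
# Widths in bits of the basic C types under three common data models.
DATA_MODELS = {
    "ILP32": {"int": 32, "long": 32, "long long": 64, "pointer": 32},  # e.g. IA-32
    "LLP64": {"int": 32, "long": 32, "long long": 64, "pointer": 64},  # e.g. 64-bit Windows
    "LP64": {"int": 32, "long": 64, "long long": 64, "pointer": 64},   # e.g. 64-bit Linux
}


def long_can_hold_pointer(model):
    """True if a pointer fits in a long under the given data model."""
    widths = DATA_MODELS[model]
    return widths["long"] >= widths["pointer"]


for name in DATA_MODELS:
    print(name, long_can_hold_pointer(name))
```

Only int and long long keep the same width across all three rows, which is why code that cares about size is better served by the fixed-width types from stdint.h.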
12:19:34 <knebulae> @mrvn: right. That's the problem. That's why one of the first things I always like to have is my own stdint.h in place for the compiler I'm using. just makes it easier.
12:19:53 <moondeck> mrvn: Is that for compatibility?
12:20:14 <mrvn> moondeck: very likely.
12:21:20 <mrvn> nyc`: int is in practice never larger than 32bit or 16bit if the cpu doesn't have 32bit registers. int > 32bit would leave a gap with char=8, short=16, int=64. No type left for 32bit.
12:21:20 <nyc`> Then there is no safe ground to stand on anywhere in C's native types.
12:22:02 <lkurusa> just use the fixed-width types :)
12:22:04 <mrvn> That said: If the size matters use intXX_t.
12:22:08 <lkurusa> or if you don't care that much, just use int
12:22:36 <nyc`> C is actually really bad about its types' relationships to hardware.
12:22:39 <mrvn> I haven't worked with any system that had 8 bit ints. Anyone?
12:23:07 <lkurusa> oh, well a system like that would break my code
12:23:18 <lkurusa> thankfully nobody's ever going to run my code on a system like that
12:23:24 <mrvn> lkurusa: obviously.
12:23:32 <klys> I tried to compile a small-c for cp/m-86, though it couldn't go through. so, never got to work with it.
12:23:42 <lkurusa> i have implicit assumptions that int should be more than 'a reasonable size' and that's not 8 bits
12:24:14 <lkurusa> most of the code where i anticipated someone going over the limit have checks for wrap around
12:24:27 <lkurusa> i.e. pid_t
12:24:32 <mrvn> I would expect int to always be 16+ bit. The compiler just has to emulate it with 2 bytes when the hardware doesn't have 16+bit registers.
12:25:01 <mrvn> lkurusa: can't have a negative pid.
12:25:31 <klys> well there were 8-bit mpus that had c compilers, just that they have fallen out of popularity.
12:27:20 <klys> for example, geist has a 6809 board, and yeah it can represent a 16-bit integer, just that there aren't really enough registers to play with. so it had two stacks.
12:28:39 <mrvn> The C64's CPU had an address mode for the zero page. Kind of pseudo registers.
12:29:00 <klys> really nobody with an x86 chip has any business going "push al; push cl; push dh; pop ah"
12:29:12 <klys> I don't think that would even work
12:29:29 <mrvn> huh? Why not?
12:29:50 <klys> it's a 16 bit stack
12:30:04 <mrvn> So? Iirc it may or may not put padding
12:30:41 <mrvn> But as long as you push and pop the same type that shouldn't matter.
12:31:05 <mrvn> "push al; push ah; pop ax;" on the other hand ...
12:33:38 <mrvn> I'm not well versed on 16bit mode. Did exceptions align the stack to 16bit or would an unaligned stack double fault?
12:33:50 <klys> those aren't instructions iirc
12:36:43 <klys> "For maximum performance, data structures (including stacks) should be designed in such a way that, whenever possible, word operands are aligned at even addresses and doubleword operands are aligned at addresses evenly divisible by four. Due to instruction prefetching and queuing within the CPU, there is no requirement for instructions to be aligned on word or doubleword boundaries." - i386 manual
12:39:39 <mrvn> klys: That doesn't answer the question. Say your stack is odd will the CPU subtract 1 from it before it pushes the exception state onto it?
12:40:01 <klys> no, that's a performance optimization that amounts to a sanity check
12:40:53 <mrvn> Has a big impact on exception/syscall/interrupt speed though.
12:41:45 <mrvn> Still is in 32/64bit mode. It's a good idea (required on !x86 even) to align the stack to 8/16 byte boundaries before calling C code.
12:50:41 <ybyourmom> gang gang
12:52:40 <moondeck> ay
02:16:19 <nyc`> I think stack alignment ends up being a byproduct of load/store alignment constraints on most sane architectures.
02:43:42 <nyc`> I'm guessing that some sort of abstraction besides a thread may make more sense to invoke when scheduling threads. That may be part of the trouble with microkernel schedulers.
03:37:50 <renopt> nyc`: yeah, I've been thinking about that, how you could schedule groups of threads better
03:39:46 <renopt> current thoughts are since I'm planning to have a token for thread privileges anyway, could just make that more general, so it would be a group token
03:40:15 <renopt> then when picking the next thread give a higher priority to threads in the same group as the current thread
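renopt's idea can be sketched in a few lines. This is one possible reading of it, with hypothetical names: the ready list holds (thread, group) pairs in queue order, and the picker takes the first thread from the current group before falling back to the head of the queue.

```python
def pick_next(ready, current_group):
    """Remove and return the next (thread, group) pair to run, or None."""
    for i, (_thread, group) in enumerate(ready):
        if group == current_group:
            return ready.pop(i)  # prefer a thread from the running group
    return ready.pop(0) if ready else None  # otherwise plain FIFO order


ready = [("t1", "db"), ("t2", "web"), ("t3", "db"), ("t4", "web")]
print(pick_next(ready, "web"))  # ('t2', 'web')
print(pick_next(ready, "web"))  # ('t4', 'web')
print(pick_next(ready, "web"))  # ('t1', 'db')
```

As written, a group that always has a ready thread would starve every other group, so a real policy would need some bound on how long the preference applies.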
03:40:33 <nyc`> renopt: Well, there's more of an issue with how one provides a policy-free mechanism for it.
03:42:41 <renopt> hmm, how so
03:44:33 <nyc`> renopt: That is AIUI what microkernels try to do.
03:47:06 <renopt> with thread groups? not a very original idea I admit :P
03:47:54 <renopt> what kind of abstractions are you thinking of?
03:48:12 <renopt> just having processes as a kernel concept?
03:51:33 <nyc`> renopt: I don't know if thread groups are an issue. I'm just sort of head scratching about the right thing to even try to do. They often want to punt memory management and file management and more off to threads. That sort of construct just seems inapt for scheduling.
03:53:26 <nyc`> renopt: seL4 punts completely and just has a hardwired rather braindead scheduler and literally partitions systems into uniprocessor systems.
03:55:24 <renopt> oh, yeah no bueno
03:57:15 <nyc`> renopt: I'm still trying to fully decipher Fluke, but am hoping it's significantly different and better.
04:01:17 <nyc`> renopt: I'm not focused on clustering or anything, but am hoping I can get some clustering capability for free out of going the microkernel route. Amoeba and Sprite (I think) provided SSI clustering, and I think Sprite even carried out cross cluster node load balancing.
04:02:26 <nyc`> A lot of it is a byproduct of when you provide services via IPC they're relatively easy to migrate to RPC.
04:04:49 <nyc`> renopt: I'd probably mostly be focused on memory management and esp. how well it can utilize multiple page sizes in different situations like steady spectra or isolated with large gaps.
04:07:26 <nyc`> Cluster-related VM issues obviously turn up like replication and large page utilization of memory replicated across nodes whose page size spectra are heterogeneous.
04:08:54 <renopt> memory consistency mediated over an unreliable channel :D
04:09:40 <nyc`> Migration issues would likely also turn up like recognizing that processes with significant contiguity established would likely lose the contiguity if migrated etc.
04:11:26 <nyc`> renopt: Sharing memory read-write across cluster nodes probably can't be done apart from some turn-taking with write access for fairness.
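A rough sketch of the turn-taking nyc` mentions, assuming a single write token that is handed round-robin to the next node asking for write access. `next_writer` and the `wants` array are hypothetical, and the network protocol that would actually move the token and the page contents is left out:

```c
#include <assert.h>
#include <stdbool.h>

/* Pick the node that gets the write token after `holder`.
 * Scans the other nodes in round-robin order and considers the current
 * holder last, so a busy writer cannot starve the rest.
 * Returns -1 if no node is requesting write access. */
static int next_writer(int holder, const bool *wants, int nnodes)
{
    assert(nnodes > 0 && holder >= 0 && holder < nnodes);
    for (int step = 1; step <= nnodes; step++) {
        int cand = (holder + step) % nnodes;
        if (wants[cand])
            return cand;
    }
    return -1;
}
```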
04:19:54 <renopt> so what you're saying is it's kinda complex
04:23:02 <nyc`> renopt: It's a simple matter of programming.
04:25:29 <renopt> yeah, just a bit of typing
04:27:30 <nyc`> Simple matters of programming are cliches of a sort for intractable problems of the ages.
04:32:26 <mrvn> nyc`: you want to cluster a heterogeneous cluster? How do you want to migrate tasks?
04:34:26 <mrvn> sharing memory across a cluster is a very bad idea. sharing file access is already a total bottleneck.
04:34:38 <mrvn> performance wise that is
04:34:41 <nyc`> mrvn: There are limits to cross-migration. In the above mention, TLB differences in what page sizes are supported would be the only difference.
04:35:04 <mrvn> nyc`: you can't migrate an x86 task to an arm cpu.
04:35:45 <nyc`> mrvn: Obviously, so to be relevant that isn't happening.
04:35:45 <mrvn> or even an SSE3 using amd64 task to an amd64 with just SSE2.
04:36:10 <mrvn> So basically you can only cluster homogeneous systems
04:38:18 <mrvn> nyc`: I would design for a message passing approach and forget about shared memory, at least cluster wide. The performance is just too bad as soon as one thread writes.
04:38:38 <nyc`> mrvn: DYNIX/ptx would migrate tasks to where CPU features used were available. For instance, on a 386 with no FPU it would catch FPU usage exceptions and migrate the task to CPUs that had FPUs.
04:38:56 <mrvn> nyc`: that works.
04:39:06 <mrvn> .oO(But you can never go back :)
04:39:54 <mrvn> nyc`: did it notice when a process stopped using the FPU and migrate it back?
04:40:17 <nyc`> I did FibreChannel boot for DYNIX/ptx. If it hadn't been relegated to pure legacy, I probably would have done more work on it.
04:41:57 <nyc`> mrvn: I sort of doubt it. I didn't get to look around that much before everyone got moved to AIX Monterey.
04:42:41 <mrvn> it's unlikely that a process stops using the fpu. so might not have bothered.
04:43:20 <mrvn> and it's not like you can ever forget the fpu state once it was used.
04:43:44 <nyc`> My suspicion is that it noticed if tasks had ever used the FPU when trying to migrate to nodes without, but didn't try very hard otherwise.
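The trap-and-migrate behaviour discussed above might look roughly like the sketch below. It is not DYNIX/ptx code. `FEAT_FPU`, `struct cpu`, `struct task`, `find_cpu` and `on_feature_trap` are all invented for illustration, and the "needed" mask is assumed to be sticky, matching the guess that a task which has ever used the FPU is never moved back:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FEAT_FPU (1u << 0)   /* hypothetical feature bit */

struct cpu  { uint32_t features; };          /* what this CPU provides */
struct task { uint32_t needed; int cpu; };   /* what the task has used so far */

/* First CPU whose feature set covers everything in `needed`, or -1. */
static int find_cpu(const struct cpu *cpus, int ncpus, uint32_t needed)
{
    for (int i = 0; i < ncpus; i++)
        if ((cpus[i].features & needed) == needed)
            return i;
    return -1;
}

/* Called from a hypothetical "feature unavailable" trap handler.
 * Records the feature as needed (never cleared afterwards) and moves the
 * task to a CPU that has it. Returns the target CPU, or -1 if none exists. */
static int on_feature_trap(struct task *t, const struct cpu *cpus,
                           int ncpus, uint32_t feature)
{
    assert(t != NULL && cpus != NULL);
    t->needed |= feature;
    int target = find_cpu(cpus, ncpus, t->needed);
    if (target >= 0)
        t->cpu = target;
    return target;
}
```

Because `needed` only ever grows, later migrations can reuse `find_cpu` to avoid CPUs that lack a feature the task has already touched.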
04:44:00 <mrvn> nyc`: I think nowadays the most inhomogeneous cluster you need to support is big/LITTLE systems.
04:44:39 <nyc`> No idea what you mean.
04:45:19 <mrvn> nyc`: There are ARM systems that have cores with different power. Some slow ones that use little power and some fast ones that eat power for breakfast.
04:45:56 <mrvn> So when the system is idle you just churn away at the low power cores. But when the user starts Candy Crush you power up the big cores.
04:46:36 <nyc`> I don't think it takes that much to do the sort of task scheduling we just mentioned.
04:47:28 <mrvn> my concern is about it being useful
04:48:34 <nyc`> Gang scheduling is the only really difficult part.
04:49:31 <mrvn> sure you can do it. But at 20ms per write access nobody sane will use it.
04:49:54 <mrvn> and all the insane people will scream about it being so slow.
04:50:11 <nyc`> I think anti-affinity for competition for functional unit resources on SMT might occasionally be helpful.
04:50:39 <mrvn> nyc`: no idea what that has to do with sharing memory across a cluster
04:53:45 <nyc`> mrvn: I was saying that even the complex scheduling things weren't that far out. Memory sharing is a lot more complicated.
04:56:52 <nyc`> mrvn: I don't think full read-write is going to have much expected of it. Replication is probably the majority of what there is to help performance.
04:59:40 <mrvn> nyc`: one thing it will teach you is how to optimize for distributed locks and how to avoid them.
04:59:44 * Jarizh thinks drinking Coke beats smoking Tobacco
05:00:00 <mrvn> The delay between systems is even
05:00:08 <mrvn> worse than that between NUMA regions
05:00:11 * renopt vapes
05:00:31 * mrvn plays factorio
05:04:51 <nyc`> mrvn: Oh for sure. Remote cluster node access latencies are literally as bad as device IO and then some e.g. copying received packets to strip protocol headers off them.
05:06:08 <nyc`> Assembling things from multiple packets too.
05:08:13 <nyc`> There's supposed to be some kind of RDMA to help with these things, but I have doubts it's gone anywhere.
05:10:12 <nyc`> Realistically all I'm going to be able to do is fire up a handful (under 5) of qemu processes and get them to talk to each other.
05:14:13 <nyc`> Hopefully I'll have something worth putting up on GitHub before my laptop stops working.
05:21:12 <doug16k> it doesn't have to work to be on github
05:21:56 <geist> yah, *always* back up your shit
05:21:58 <geist> *always*
05:22:19 <geist> put it on google drive or something. there's no excuse for not keeping working backups of your stuff
05:24:21 <doug16k> putting something up makes it almost officially not a throwaway project
05:24:29 <doug16k> it motivates a bit
05:25:23 <geist> yes. and plus i think you get like one private repo on github, i believe
05:25:55 <geist> the instant you say something like 'oh i'll put it up when it works' you tempt the fates to come strike down your laptop
05:26:58 <doug16k> ^
05:27:03 <nyc`> I guess I'll use the private repo once I get to the Wi-Fi downstairs.
05:28:22 <doug16k> what you are doing there is prophesying that it's dying sometime in the near future, at some deep subconscious level you know it
05:28:53 <doug16k> don't ignore that
05:29:37 <doug16k> I don't base deadlines on the life left in my laptop :)
05:30:32 <nyc`> I don't honestly think I'm going to get anything done anyway. It's mostly a timekiller while I can still do anything.
05:33:43 <geist> nyc`: irrelevant. back up your stuff
05:33:49 <geist> you will always wish you had
05:34:20 <geist> doesn't matter what you feel about it right now or what it is. there is always a good reason to back it up. you don't know what happens next month or 5 years from now where you wish you had saved a thing
05:34:21 <eryjus> ^^ +1
06:26:44 <sginsberg> yo
08:34:07 <knebulae> I hate to be the guy that always slides stuff this way from reddit, but someone posted a cpu micro-benchmarking tool (userspace and kernel module for linux obviously) for x64. Might help some people here: https://github.com/andreas-abel/nanoBench
08:34:26 <knebulae> not mine btw
08:34:47 <knebulae> it's very cool. a nano-benchmarker
08:39:58 <knebulae> damn, the website is even better. Hope I didn't get suckered into anything. It's by Saarland Informatics Campus. Never heard of them.