Search logs: #osdev - 1 February 2019

#osdev @ OPN/FreeNode from 3apr2001 to 23may2021

http://bespin.org/~qz/search/?view=1&c=osdev&y=19&m=2&d=1

Friday, 1 February 2019

01:04:37 <geist> aww, RIP itanium
01:11:14 <klange> F
01:13:05 <aalm> ?
01:13:34 <klange> https://www.anandtech.com/show/13924/intel-to-discontinue-itanium-9700-kittson-processor-the-last-itaniums
01:19:24 <shakesoda> F
01:20:39 <nyc> Itanium was a great idea. I'm sorry to see it go. It would've been far better than 64-bit extensions of x86.
01:27:01 <geist> yah i feel bad about it really
01:27:14 <nyc> Hopefully embedded RISC undercuts the 64-bit x86 extensions.
01:27:23 <geist> i'm sure a lot of people really put a lot into it
01:32:06 <nyc> I'm surprised ARM laptops aren't going around more than they are with the additional battery life they would have to offer.
01:32:47 <geist> may be in the laptop form factor the cpu isn't really the dominant battery draw for everyday use
01:32:52 <geist> backlight on the display, storage, etc
01:33:03 <geist> so though you might be able to eke out a little more, it may not be significant
01:33:25 <geist> there are ARM based chromebooks though, but they're usually fairly low end
01:34:19 <knebulae> from my perspective there's a couple of gaps developing in the market.
01:35:29 <knebulae> we need a "workstation" architecture again. machines meant for getting sh*t done.
01:36:00 <knebulae> I guess that's xeon now
01:36:10 <nyc> I'll have to go downstairs and see what the breakdown of power draws is. I need to go fishing for PCI on MIPS.
01:36:41 <Ameisen> hmm
01:36:54 <Ameisen> msvc has a lot of extra keywords/attributes that are actually really useful. clang even supports most of 'em
01:37:26 <nyc> knebulae: Floating point will make you cry even on Xeon.
01:38:20 <knebulae> @nyc: perhaps b/c everybody's hopped on the ML train and is using GPUs nowadays. Gotta chase that $$.
01:38:49 <geist> yeah get the chedda
02:31:15 <mischief> there are workstation laptops
02:31:18 <mischief> like P52
02:52:28 <Jmabsd> i'll ask you about weak memory ordering here (realizing #asm may not be optimal for it)
02:52:51 <Jmabsd> here's the value slot: int v = 10; /* original value */
02:52:52 <Jmabsd> then core A does: v = 20; /* #0 */ v = 25; /* #1 */ v = 30; /* #2 */ v = 35; /* #3 */
02:52:52 <Jmabsd> now say core B has: while (true) { printf("%i\n",v); }
02:53:27 <Jmabsd> v is stored on an aligned address, and this means that the store operations will reach core b *ATOMICALLY*.
02:53:36 <knebulae> @mischief: looks nice. has a retro thinkpad look to it.
02:53:39 <Jmabsd> so v will be either of 10, 20, 25, 30 and 35 at any given time, nothing else.
02:54:52 <Jmabsd> now, while the pace of the stores dropping in to B is undefined, then,
02:55:08 <Jmabsd> still for stores to an aligned memory slot, they will still arrive in order right??
02:55:22 <Jmabsd> meaning, what happens on B will be to actually print 10, 20, 25, 30, 35 IN ORDER
02:55:40 <Jmabsd> maybe any of 10, 20, 25 and 30 will be SKIPPED because the stores dropped in so fast
02:55:53 <Jmabsd> however there will be no situation where the prints would be done in another order.
02:56:23 <Mutabah> yes, they should never happen out of order (as they're all write to the same address)
03:04:04 <geist> but if it were writing to another variable, w, at a different address, then the relative ordering between them is undefined
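The single-variable case above can be sketched in C++. This is a minimal sketch, not Jmabsd's actual code: `observe_stores` is a hypothetical helper, and `v` is made a `std::atomic<int>` because concurrent access to a plain `int` would be a data race in C++. Even with relaxed ordering, the reader can skip values but should never see them go backwards, since all accesses hit the same location.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: core A's four stores and core B's polling loop.
// Returns the distinct values the reader observed, in observation order.
std::vector<int> observe_stores() {
    std::atomic<int> v{10};                 // original value
    std::vector<int> seen;

    std::thread writer([&] {
        for (int x : {20, 25, 30, 35}) {
            v.store(x, std::memory_order_relaxed);
        }
    });

    // Reader: record each new value until the final store becomes visible.
    int last = v.load(std::memory_order_relaxed);
    seen.push_back(last);
    while (last != 35) {
        int cur = v.load(std::memory_order_relaxed);
        if (cur != last) {
            assert(cur > last);             // same location: never backwards
            seen.push_back(cur);
            last = cur;
        }
    }
    writer.join();
    return seen;
}
```

Intermediate values may be missing from the result, which matches the "skipped" case described above.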
03:04:15 <graphitemaster> Ameisen, sorry nope
03:04:53 <geist> with regards to the stores reaching core B 'atomically' that's a different story
03:05:02 <geist> each architecture has a notion of atomicity of memory stores
03:05:47 <geist> but in general, at least modern cores in usual memory situations (ie, cache enabled, cpus fully coherent, etc) you can generally expect aligned words to appear atomically
03:06:01 <geist> ie, you do a 32bit store on a 32bit aligned address and it'll work. if it's unaligned, usually all bets are off at that point
03:06:50 <geist> on x86, for example, this is fairly well documented. AMD has more complete description of the memory order and store atomicity
03:07:02 <geist> but it's fairly concrete, at least since say 486 or so
03:08:20 <geist> fairly certain it breaks down when you get up to 128 bit and above vector stuff though. i dont think there's any guarantee that a 256 or 512 bit vector is stored atomically
03:15:42 <Jmabsd> geist: exactly, correct.
03:16:15 <Jmabsd> geist: yeah memory ordering on strongly ordered architectures is clear.
03:16:28 <Jmabsd> on weak architectures like ARM64, the normal strategy is to sugar your code with memory barriers isn't it??????
03:17:02 <Jmabsd> geist: right, the biggest value type for which there's normally atomic accesses, is the architecture's native word size, e.g. 64bit for ARM64.
03:19:11 <nyc> That could probably go up to a cacheline without much pain on the hardware implementation.
03:20:19 <geist> could, but it doesn't necessarily. it should be documented however. ARM has about a 10 page section on it
03:20:37 <Jmabsd> to get terminology right: the problem of a newly allocated value slot containing TRASH, is called the "dangling pointer" problem?
03:21:04 <Jmabsd> so on core A you do, say, int* i = malloc(8); *i = 12345; and then you pass the pointer to another CPU core.
03:21:32 <Jmabsd> the other CPU core gets your pointer, and reads the value, but your initialization - here to 12345 - hasn't reached the other core yet, so it sees UNINITIALIZED memory which could be private or random data!
03:25:04 <Jmabsd> geist: correct?
03:28:49 <geist> correct
03:29:43 <geist> that's an actual real problem if you construct an object and it runs the constructor and then it's passed over
03:30:04 <geist> you have to have some sort of barrier there to ensure the other side sees it
03:30:43 <geist> even if you do something like store the address of the newly constructed object into a global pointer, and then assume the pointer is valid on the other side
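A minimal sketch of that publication problem, assuming C++11 atomics: the writer initializes the object, then stores the pointer with release ordering; the reader loads it with acquire ordering, so the initialization is visible once the pointer is. `Widget` and `publish_and_read` are hypothetical names for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct Widget { int value; };               // hypothetical payload

// Publish a freshly constructed object to another thread and return
// the value that thread read through the published pointer.
int publish_and_read() {
    std::atomic<Widget*> published{nullptr};
    int result = 0;

    std::thread consumer([&] {
        Widget* p = nullptr;
        // Acquire load: pairs with the release store below.
        while ((p = published.load(std::memory_order_acquire)) == nullptr) {
            std::this_thread::yield();
        }
        result = p->value;                  // sees the initialized value
    });

    Widget* w = new Widget{12345};          // "constructor" runs first
    published.store(w, std::memory_order_release);

    consumer.join();
    delete w;
    return result;
}
```

With both orderings weakened to relaxed, a weakly ordered CPU would be allowed to show the reader the pointer before the initialized contents.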
03:31:05 <Jmabsd> geist: so the scientific term for this is the "dangling pointer" problem????
03:31:25 <geist> well, i think that's a fairly generic term that can mean a lot of things
03:31:28 <geist> it's a memory order thing
03:31:32 <Jmabsd> ah, and what about the problem that core B can see a partial view only of a set of stores to multiple slots that core A did, does that have a name?
03:31:39 <Jmabsd> or it's just the "memory ordering problem" in a wide sense
03:31:44 <geist> note that x86 goes to great pains to make sure that A's writes show up in order to B
03:31:56 <Jmabsd> yeah i know
03:32:08 <Jmabsd> that's strong ordering
03:32:18 <geist> it's just weakly ordered cpus like ARM make no such guarantees. essentially the cpu will be coherent to itself, but 'external observers' may not observe any particular order, or any order at all (it can sit in cpu A's write buffer forever)
03:32:29 <Jmabsd> years ago I read an Intel manual where they talked of a lower level kernel memory mode where you switch the CPU into weak ordering, but anyhow yeah userland X86 is always strongly ordered. (y)
03:32:43 <Jmabsd> exactly.
03:32:47 <geist> note in practice this isn't a big deal since most of the time you grab/release a mutex and whatnot and those have implicit barriers in them
03:33:00 <geist> and of course atomic instructions or code sequences that explicitly have memory barriers
03:33:38 <geist> but then ARM at least has a wide variety of atomics with differing levels of barrierness
03:33:48 <geist> ie, you can have a weakly ordered atomic, which is pretty cool
03:34:03 <geist> or an acquire or release semantic atomic. useful for implementations of mutexes or spinlocks
03:34:08 <geist> which only barrier in one direction
03:34:29 <geist> a 'full barrier' ensures that all memory accesses before the barrier and after the barrier have occurred in that order
03:34:43 <geist> whereas something like an acquire or release barrier only ensures one of those directions
03:34:49 <Jmabsd> geist: aren't memory barrier operations still Veery Veeery Expensive?
03:35:03 <geist> yes, but welcome to SMP
03:35:11 <_mjg> is it even worth it though? memory barriers can be mind bending and the most popular platform (x86) makes it even easier to screw up without other archs to test
03:35:20 <Jmabsd> geist: right.
03:35:24 <geist> the tradeoff is that otherwise the cpu can be very out of order, so it's pretty neat
03:35:35 <Jmabsd> geist: yeah normally the barriers are of three types - bidirectional, and "write barrier", and "read barrier", right
03:35:40 * Griwes pops in
03:35:41 <geist> yah acquire/release
03:35:47 * Griwes sees a memory ordering discussion
03:35:51 <geist> that's the c++ nomenclature at least
03:35:57 * Griwes decides it's probably time to go to bed
03:36:01 <geist> ARM decided to follow the c++ model in the way they name their stuff
03:36:05 <Jmabsd> Griwes: lololol yes probably, spare your mind
03:36:14 <Griwes> lol I'm not spared anyway
03:36:22 <Jmabsd> geist: wait, can you define "acquire" and "release".
03:36:38 <Jmabsd> geist,Griwes: i talk to a project that defines some of this itself, and i like to understand how THEY solved it
03:36:44 <Griwes> I've been a witness to some memory ordering discussions in the concurrency study group of the C++ committee
03:36:49 <Jmabsd> to understand that, i need to understand the problem surface well first, this is why i doublecheck it with you.
03:36:50 <geist> note i only understand this to the point of being pretty dangerous. almost no one really really internalizes it. i always have to go back and think through it from a computer architecture point of view
03:36:52 <Jmabsd> the problem surface.
03:37:04 <Jmabsd> Griwes: WOW, cool!
03:37:06 <Griwes> there's like 4 people who can talk about this from a position of authority
03:37:07 <Jmabsd> what was it like?
03:37:12 <geist> yah, exactly
03:37:27 <Jmabsd> Griwes: PM.
03:37:38 <Jmabsd> Griwes,geist: memory ordering is one of the hairier aspects of software.
03:37:46 <geist> i like to have a practical knowledge of things. it gets really wishy washy when you start talking about it from a standards and hypothetical virtual machine point of view
03:37:49 <Jmabsd> Griwes: only four, why?
03:37:51 <Griwes> if only just of software, lol
03:37:51 <Jmabsd> and why them
03:38:10 <Griwes> because only they _really_ understand the spec and the issues
03:38:22 <geist> this comes up from time to time at work, and it generally ends up devolving into 'the c++ model if you pedantically follow it disallows you to write system software'
03:38:37 <Griwes> geist, well, armv8 does pretty much implement the C(++)11 memory model though, right?
03:38:54 <geist> Griwes: yep, that's what i was saying when you popped in. they went ahead and followed the acquire/release semantics
03:39:00 <Griwes> I know that our (Nvidia) hardware is also trying to implement it as much as possible
03:39:35 <Jmabsd> Griwes: you mean C++ memory model correlated with hardware model?
03:39:39 <Jmabsd> WHY so hard to understand !?!?!?
03:39:46 <Griwes> you'll literally see `.acquire.` as part of PTX instructions
03:39:55 <Jmabsd> err.
03:40:03 <Jmabsd> geist,Griwes: can you define acquire and release please.
03:40:15 <Griwes> but it helps when one of our hardware architects is one of those 4 or 5 people I mentioned before :D
03:40:26 <geist> yep. load/store with acquire/release barrier
03:40:50 <Jmabsd> geist: please map acquire and release to "rbarrier()", "wbarrier()" and "barrier()"?
03:41:04 <Jmabsd> Griwes: oh, btw, your employer would make a positive difference in this world if they opensourced their drivers.
03:41:23 <geist> uh. i think they're not precisely the same thing
03:41:34 <Jmabsd> AMD's drivers are a bit disastrous, also their performance, like, take a Vega 64 on a Thunderbolt 3 link with Windows, it'll give useless performance at least on really high res.
03:41:34 <geist> SEQ_CST is full barrier for sure
03:41:45 <Jmabsd> meanwhile Nvidia performs ultra-gracefully. anyhow that was OT. /
03:42:01 <Jmabsd> geist: so can you define acquire and release?
03:42:03 <geist> ACQUIRE is backwards from what you think it is, and i think it means 'everything after must happen after the barrier'
03:42:09 <Jmabsd> so those are memory synchronization constructs?
03:42:11 <geist> and RELEASE is the opposite of that
03:42:32 <geist> as in if you use acquire/release for grabbing and releasing a lock, the stuff in the middle will happen definitively between the lock
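The lock case can be sketched as a tiny spinlock, assuming C++11 atomics. `SpinLock` and `locked_increments` are hypothetical names. Taking the lock is an acquire operation, so the critical section cannot move above it; dropping the lock is a release operation, so the critical section cannot move below it.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

class SpinLock {
    std::atomic<bool> locked_{false};
public:
    void lock() {
        // Acquire: accesses after this point stay after it.
        while (locked_.exchange(true, std::memory_order_acquire)) {
            std::this_thread::yield();
        }
    }
    void unlock() {
        // Release: accesses before this point stay before it.
        locked_.store(false, std::memory_order_release);
    }
};

// Increment a plain (non-atomic) counter from several threads,
// protected only by the spinlock, and return the final count.
long locked_increments(int threads, int iters) {
    SpinLock lock;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iters; ++i) {
                lock.lock();
                ++counter;                  // the "stuff in the middle"
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Neither side needs a full barrier here: each operation only has to fence in one direction.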
03:42:38 <Jmabsd> geist: ah, so acquire and release regard both writes and reads?
03:42:48 <Griwes> Jmabsd, allow me to not comment on that comment about drivers ;p
03:42:53 <geist> i'm not willing to definitively say yes or no on that
03:42:56 <geist> but probably
03:43:03 <geist> this is where i encourage you to play with godbolt
03:43:05 <Jmabsd> Griwes: lol. you mean maybe you don't agree with employer, or such. yeah sure no need to comment.
03:43:17 <geist> and get familiar with c++ atomics
03:43:25 <Jmabsd> Griwes: just please add some more votes to your employer, on releasing OS, as they're keeping too big parts of the computing community hostage today
03:43:35 <Griwes> no, I mean I won't comment
03:43:45 <Jmabsd> Griwes: an empty aluminium can is more useful than your company's 5000 USD price tag hardware, in many usecases
03:43:52 <Jmabsd> at least you'll get five cents for the aluminium can lol.
03:43:54 <geist> Jmabsd: suggestion: please dont tell people how much you do or dont like their company
03:44:02 <geist> it gets old really really fast. i can tell you from experience
03:44:04 <Jmabsd> Griwes: sure nw.
03:44:22 <Jmabsd> geist: godbolt?
03:44:41 <geist> https://godbolt.org/
03:44:44 <Jmabsd> geist: re like company, ok - anyhow also it's off topic to this conversation. nw.
03:45:06 <geist> it's pointless and annoying to tell someone that their company they work for sucks because X and Y
03:45:18 <Griwes> Jmabsd, generally you can find some spec that defines which ordering is which here: http://eel.is/c++draft/intro.races and in sections linked to from there, if you want a definition (though you'll have to piece it together from how the standard is written)
03:46:08 <geist> https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html is interesting, because it's the underlying builtins that gcc and clang use to implement atomics
03:46:18 <geist> and you can see the __ATOMIC_ACQUIRE/ACQ_REL/etc there
03:46:20 <Griwes> but what geist wrote is probably a good way to go about it
03:46:25 <Jmabsd> geist,Griwes: *why* did C++ find itself in the need of coming up with new memory synchronization primitives?
03:46:29 <Jmabsd> or did they?
03:46:36 <Griwes> because C++11 introduced threads
03:46:45 <Griwes> and the spec had to talk about how threads interact with each other
03:46:54 <Griwes> (same for C11 really)
03:47:23 <Griwes> there was no standardese, so standardese was created; new architectures generally try to map onto that model, because it turns out that that is useful
03:47:40 <Griwes> x86 is too strong, which means some operations aren't as efficient as possible
03:48:09 <Jmabsd> Griwes,geist: oh interesting, so SMP was out-of-spec to C++ before C++11?
03:48:31 <Jmabsd> you mean x86 is enormously popular,
03:48:41 <Jmabsd> so performance of C++11 stuff on some other architectures may not be optimal?
03:48:59 <Griwes> there's some architectures (alpha I think?) that are _not_ strong enough, and turns out that doing synchronization when you are weaker than the API model programs are written against is even _more_ inefficient, because you need to do barriers that do too much
03:49:14 <Griwes> C++ doesn't know what "SMP" is
03:49:37 <Griwes> C++ didn't know what a thread is before C++11, and pretended your code always has a single order of instructions when run
03:49:37 <geist> https://godbolt.org/z/O6vdMT is a starting point
03:49:47 <geist> you see that x86 does the same thing, but the arm64 codegen is subtly different
03:50:20 <geist> specifically if the ldxr instruction has the 'a' or if the stxr instruction has 'l' in it
03:50:24 <Griwes> which is of course not true - but that's what is modeled by memory ordering called "sequentially consistent" - which basically means that the behavior of the program is as if all instructions were executed in some arbitrary sequential order
03:50:28 <geist> since those act like acquire and release barriers
03:50:33 <geist> both of them together acts as a full barrier
03:51:13 <Griwes> but as I said - on some architectures there are more efficient ways to implement things that use atomics
03:51:21 <Griwes> especially when you start dealing with lock-free programs
03:51:38 <Griwes> which is where brains usually _really_ start melting
03:51:44 <geist> https://godbolt.org/z/CyisXr the third compiler is using the new single instruction arm64 atomics, but those have acquire/release semantics too
03:51:56 <nyc> Griwes: Garbage collection is nasty in the kernel.
03:52:59 <Jmabsd> Griwes: thank you for pointing out the standardization thing.
03:53:06 <Jmabsd> geist: 'a' vs 'l' what?
03:53:10 <Griwes> nyc, s/ in the kernel//, ftfy
03:53:17 <geist> i've told you, acquire and release
03:53:24 <geist> those are barriers, built into the instruction
03:53:29 <geist> i dunno why its 'l'
03:53:41 <nyc> Griwes: It's all cool in userspace. Lockfree affairs are nasty because they demand GC, though.
03:54:07 <Jmabsd> nyc: GC in kernel, what?
03:54:48 <Jmabsd> nyc: on weakly ordered architectures you still need some barrier to ensure the whole structure arrived to the other core, don't you
03:55:11 <Jmabsd> geist: oh dear, so, how do you define 'an atomic'?
03:55:28 <Jmabsd> i know it's a C++ term, apparently some efficient abstraction, but I don't know what about it.
03:56:04 <Griwes> nyc: what.
03:57:12 <geist> in this case it's a variable that can be modified by multiple threads and/or cpus without any fear of corruption
03:57:33 <geist> specifically in this case the code is doing an addition to a variable, which aside from everything else requires a read, +, write
03:57:57 <Jmabsd> Griwes: not only C++ but also C did *not* have parallelism in the spec before C++11 no?
03:58:00 <Jmabsd> is there a "C11" spec?
03:58:03 <geist> if you have multiple cpus, or in the case of arm where it takes multiple instructions, it can get interrupted and corrupted if multiple cpus/threads access it at the same time
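A minimal sketch of the read, add, write case with `std::atomic`, assuming C++11. `atomic_count` is a hypothetical helper. `fetch_add` performs the whole read-modify-write as one indivisible step, so no increments are lost even though several threads hit the same variable.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Each thread adds 1 to a shared counter `iters` times.
// Returns the final count, which should be threads * iters.
int atomic_count(int threads, int iters) {
    std::atomic<int> counter{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iters; ++i) {
                // Atomic read + add + write; relaxed is enough because
                // nothing else is being ordered against the counter.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter.load();
}
```

With a plain `int` and `++counter` the same loop would be a data race, and the total would typically come up short.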
03:58:06 <Jmabsd> maybe C still doesn't care about parallelism?
03:59:26 <geist> there is a C11 spec. and it has a notion of threads
03:59:44 <Jmabsd> geist: so C started standardizing parallelism with C11 too yes?
03:59:59 <Griwes> Concurrency, but yes.
04:00:17 <Griwes> The two groups worked together to specify a single memory model
04:01:20 <Jmabsd> http://eel.is/c++draft/intro.races says "A synchronization operation on one or more memory locations is either a consume operation, an acquire operation, a release operation, or both an acquire and release operation.".
04:01:39 <Jmabsd> cool ok
04:02:43 <Jmabsd> would you mind telling me some more authoritative reading references on this matter?
04:03:41 <Griwes> That is *the* authoritative reference. It doesn't get more authoritative than the normative text :P
04:03:59 <Jmabsd> aha so, https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html says that the old __sync builtins - that was the read/write/readwrite barriers right -
04:04:06 <Jmabsd> should now be replaced with the __atomic builtins
04:05:18 <Griwes> If you're looking for things that are more of explanations than definitions, watch https://youtu.be/A8eCGOqgvH4 and part 2 of that (linked as the next suggested video)
04:05:53 <Jmabsd> aha i see about acquire and release, from https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html talking about __ATOMIC_ACQUIRE and __ATOMIC_RELEASE, that as you said they're temporal constructs - they regard stuff happening before or after the instruction.
04:07:07 <Jmabsd> interesting. and, those two videos have only like 2000 views lol
04:07:12 <Jmabsd> everyone is trying to shy away from this topic, haha =)
04:07:30 <Jmabsd> does ACQUIRE and RELEASE map exactly to RBARRIER and WBARRIER?
04:08:05 <geist> as much as you want us to map that, i dont think any of us want to
04:08:11 <Jmabsd> by the way, do you know what's the history behind BARRIER / RBARRIER / WBARRIER?
04:08:20 <Jmabsd> are they products from the 1960:ies or so and just kind of stuck around?
04:08:20 <geist> it may or may not, or they may be orthogonal concepts that roughly equal the same thing
04:08:29 <geist> i have no idea where you're getting them from, so i dunno
04:08:40 <Jmabsd> interesting.
04:08:42 <geist> though linux has wmb() rmb() mb(), which is likely to be what you're thinking
04:08:48 <Jmabsd> yeah.
04:09:06 <Jmabsd> also some C compilers have inline macros with that meaning don't they
04:09:06 <geist> and there is a mapping, but i dont know it off the top of my head. it's more like those map to underlying instructions which may or may not directly be acquire/release
04:09:06 <Jmabsd> sec
04:09:13 <Jmabsd> wait restarting client, hold sec
04:09:22 <geist> most likely the compiler doesn't. the framework of the code does
04:09:34 <geist> and most likely those map to different instructions, like MFENCE, or 'dmb' on ARM
04:09:37 <Jmabsd> wait wait
04:09:44 <geist> which are roughly analogous to it, but... it's complicated
04:09:58 <geist> basically the whole damn thing is far more complicated than it takes to describe on irc
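The closest C++ analogue to a standalone barrier call is `std::atomic_thread_fence`. The sketch below is only an illustration under that assumption and does not claim an exact mapping to Linux's `wmb()`/`rmb()`/`mb()`. `fence_message_passing` is a hypothetical helper. The flag itself is accessed with relaxed ordering, and the fences on either side provide the ordering for the plain `payload` variable.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Pass an ordinary int between threads using relaxed flag accesses
// plus explicit fences. Returns the value the reader observed.
int fence_message_passing() {
    int payload = 0;                        // plain, non-atomic data
    std::atomic<bool> ready{false};
    int seen = -1;

    std::thread reader([&] {
        while (!ready.load(std::memory_order_relaxed)) {
            std::this_thread::yield();
        }
        // Acquire fence: reads below cannot move above the flag load.
        std::atomic_thread_fence(std::memory_order_acquire);
        seen = payload;
    });

    payload = 42;
    // Release fence: writes above cannot move below the flag store.
    std::atomic_thread_fence(std::memory_order_release);
    ready.store(true, std::memory_order_relaxed);

    reader.join();
    return seen;
}
```

How each fence is lowered (for example to `dmb` on ARM) is up to the compiler and target, which is where godbolt is useful.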
04:10:23 <Jmabsd> hm
04:10:25 <Griwes> Yep
04:10:33 <Jmabsd> geist: so FENCE is another assembly instruction used for memory ordering?
04:10:43 <Griwes> There's a lot of moving parts, and those parts are moving at different time
04:10:47 <Jmabsd> far more complicated - yes i see.
04:10:51 <Griwes> Some are moving during compilation
04:10:52 <Jmabsd> griwes, yes i see.
04:11:03 <Griwes> Some are moving during execution
04:11:05 <Jmabsd> during compilation, yes i totally see that -
04:11:09 <Jmabsd> the compiler could optimize in given ways
04:11:11 <Jmabsd> exactly.
04:11:13 <Griwes> Watch the videos I linked
04:11:13 <geist> it's understandable, but frankly you need more fundamental knowledge of the way these things work
04:11:16 <Jmabsd> the compiler could reorder accesses.
04:11:23 <Griwes> They are quite comprehensive
04:11:31 <geist> yes you need some more basic backgrounds, and/or someone that can describe it much better than us
04:11:34 <geist> with pictures and words
04:11:39 <Jmabsd> geist, i think i follow the BARRIER stuff, and i'm clear about how a compiler is affected also
04:11:53 <Jmabsd> i'll put on my TODO to see that video you referenced, all of it
04:11:54 <Jmabsd> ~3h.
04:12:11 <geist> i've been dealing with this stuff for years and i still dont completely understand it
04:12:22 <geist> or at least i have to sit there a few minutes and internalize it
04:12:28 <Jmabsd> what part of the C11 standard deals with concurrency?
04:12:35 <Jmabsd> geist, yes i totally see.
04:12:42 <geist> and i know enough to know that i dont completely understand it
04:12:55 <geist> that and crypto. best thing you can do there is understand where your limits are and dont think you know what you're doing
04:12:58 <geist> or you'll screw up
04:13:10 <Jmabsd> geist,griwes: the stuff i like to understand now is mostly, knowing at core B that a structure from core A has arrived in full,
04:13:18 <Jmabsd> and, what are the memory primitives for efficient core interoperation
04:13:35 <Jmabsd> yeah i know.
04:13:58 <Jmabsd> exactly. so the work on memory ordering is supposed to yield a small number of language primitives,
04:14:03 <Jmabsd> that put you in a place where you know what you do is safe.
04:22:02 <Jmabsd> so, back.
04:22:04 <Shockk> err hmm this isn't exactly an osdev-related question but I figure there are smart people in here
04:22:13 <klange> That remains to be seen.
04:22:16 <Shockk> lol
04:22:42 <Shockk> in C++, is it possible to do something like the following, or at least something that would kind of achieve the same effect
04:22:58 <Jmabsd> geist,Griwes: would it be correct to say that the 'old' BARRIER based way of concurrency was correlated with the operation of the computer's memory system, while the new acquire/release way instead is motivated by the purpose of the running code?
04:23:19 <geist> i have absolutely no idea what the motivation behind their naming convention is
04:23:21 <Jmabsd> BARRIER means, "dear compiler and CPU, please ensure that at this exact spot, all writes have been punched to all the other cores, because this is concurrent stuff and we need it over there now, thanks".
04:23:37 <Shockk> template<typename T> enum class E { ... }; and then specialize that template for certain types, so that E<sometype> would have different enumeration values?
04:23:53 <Jmabsd> acquire/release means, "dear compiler and CPU, please understand that my code just before this/just after this requires this-particular-kind of memory sync with other cores, so please accord please, thanks"?
04:25:41 <Jmabsd> oh, "barrier" and "fence" is the same, interesting.
04:31:46 <jjuran> Shockk: No idea, but you can always have an enum member of a class template.
04:31:58 <Shockk> jjuran: ah right, and specialize the class template right?
04:32:03 <jjuran> yup
04:32:17 <Shockk> hmm that's a possibility, thanks
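jjuran's suggestion can be sketched as follows. A bare `template<typename T> enum class E` is not valid C++, but an enum nested in a class template can be specialized per type, and an alias template recovers the `E<sometype>` spelling. `Codes` and the enumerator names are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// Primary template: the default set of enumerators.
template <typename T>
struct Codes {
    enum class E { Generic };
};

// Specializations give certain types their own enumerators.
template <>
struct Codes<int> {
    enum class E { Small, Large };
};

template <>
struct Codes<float> {
    enum class E { Finite, Infinite, NaN };
};

// Alias template so callers can write E<int>, E<float>, and so on.
template <typename T>
using E = typename Codes<T>::E;

// Each specialization yields a distinct enum type.
static_assert(!std::is_same<E<int>, E<float>>::value,
              "E<int> and E<float> are different enums");
```

Usage looks like `E<float> e = E<float>::NaN;`, while `E<double>` falls back to the primary template and only has `Generic`.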
04:33:13 <Jmabsd> geist,Griwes: right, Wikipedia comments "The Java Memory Model was the first attempt to provide a comprehensive threading memory model for a popular programming language. After it was established that threads could not be implemented safely as a library without placing certain restrictions on the implementation and, in particular, that the C and C++ standards (C99 and C++03) lacked necessary restrictions, the C++ threading subcommittee set to work on suitable
04:33:13 <Jmabsd> memory model", neat. ( https://en.wikipedia.org/wiki/Memory_model_(programming) )
04:35:23 <Jmabsd> according to Wikipedia this short document http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2429.htm is the C++ memory model.
04:35:40 <Jmabsd> by Clark Nelson and Hans Boehm, that's the guy who wrote the garbage collector isn't it. :))
04:40:32 <Jmabsd> err, memory barrier operations regard *all* memory. do acquire/release regard only given value slots?
05:02:57 <Jmabsd> geist,Griwes: very well - thank you very much for taking the time to discuss concurrency with me. i'll watch the videos you pointed to, and read the C11 spec (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf ) on related topics that's chapter 7.17.
05:09:10 <CrystalMath> this is going to be a super simple question, but i want to be sure i know all the existing techniques before i compare some new ideas
05:09:15 <CrystalMath> so...
05:09:37 <CrystalMath> on a task switch, how do the kernel page tables remain up to date?
05:10:14 <geist> usually you map the kernel identically in all address spaces
05:10:31 <CrystalMath> right, but it might change in a particular page directory, because the kernel changed some maping
05:10:33 <CrystalMath> *mapping
05:10:37 <geist> so in the case of x86 you simply point the top level page table at the same set of secondary tables that cover the kernel
05:10:47 <geist> ah ha. and yes. that's a huge PITA
05:10:59 <geist> sometimes the answer there is just preallocate all of the 2nd level page tables so you dont have to do that
05:11:15 <CrystalMath> right... but what if a kernel wants a part of its space to be per-process?
05:11:16 <geist> otherwise yes, if you modify a top level entry in the kernel you have to go synchronize all of the active page dirs
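The synchronization step can be sketched with plain arrays standing in for page directories. This is a simulation under stated assumptions, not real paging code: a 32-bit non-PAE layout with 1024 entries, the kernel owning the upper half, and hypothetical names throughout. A real kernel would also need locking and TLB invalidation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kEntries = 1024;      // 32-bit non-PAE page directory
constexpr std::size_t kKernelStart = 512;   // assumption: kernel owns upper half

using PageDirectory = std::array<std::uint32_t, kEntries>;

// Write one kernel-half entry into the master directory and into every
// process directory, so all address spaces agree on the kernel mapping.
void set_kernel_pde(PageDirectory& master,
                    const std::vector<PageDirectory*>& processes,
                    std::size_t index, std::uint32_t pde) {
    assert(index >= kKernelStart && index < kEntries);
    master[index] = pde;
    for (PageDirectory* pd : processes) {
        (*pd)[index] = pde;
    }
}

// Small demonstration: user entries stay private, kernel entries match.
bool demo_kernel_sync() {
    PageDirectory master{}, proc_a{}, proc_b{};
    proc_a[0] = 0x1000u | 0x7u;             // a private user mapping
    std::vector<PageDirectory*> processes{&proc_a, &proc_b};

    set_kernel_pde(master, processes, 600, 0x00400000u | 0x3u);

    return proc_a[600] == master[600] && proc_b[600] == master[600] &&
           proc_a[0] == (0x1000u | 0x7u) && proc_b[0] == 0;
}
```

Preallocating every kernel second-level table up front, as suggested above, makes this loop unnecessary because the top-level entries never change after boot.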
05:11:24 <geist> Dont Do That
05:11:32 <geist> or at least, just think of it as part of the process
05:11:55 <CrystalMath> why not? i want to change my kernel so that memory block descriptors are per-process
05:11:59 <geist> if you wanna put supervisor only pieces in a user process, put it in the user side of the address space
05:12:06 <geist> why not because i just told you. it's hard to do
05:12:06 <CrystalMath> well yes
05:12:22 <geist> and it's a security issue, etc. no sane os does that
05:12:30 <CrystalMath> it would be on the kernel half
05:12:34 <geist> plus some architectures actually disallow it
05:12:39 <CrystalMath> but it'd be per-process
05:13:03 <geist> ARM would give you trouble because of the ASID thing, and many arches like MIPS have hard notions that high addresses are supervisor only. hard coded, can't override it
05:13:24 <CrystalMath> but they would still be supervisor-only
05:13:28 <CrystalMath> just not global
05:13:40 <geist> well, okay. but still it's a bad idea
05:14:02 <CrystalMath> yeah it seems kinda hard to update
05:14:02 <geist> but you can do it. it's just hard and will cause complications keeping it synced up
05:14:23 <CrystalMath> but otherwise i have a very limited space for all the memory blocks ever
05:15:30 <CrystalMath> on 32-bit, i mean
05:15:34 <CrystalMath> on 64-bit there is really no problem
05:15:41 <CrystalMath> and this would indeed be crazy
05:18:17 <geist> yah just go 64bit and forget about 32 :)
05:18:25 <geist> it really makes things simpler, for precisely this sort of thing
05:18:35 <CrystalMath> but i promised 486 support forever
05:18:44 <CrystalMath> besides, this is just for fun
05:19:00 <geist> sure, i was hacking on a 386 this xmas too. it's fun!
05:19:07 <geist> retro computing for fun
05:19:23 <CrystalMath> does linux/NT have any non-global supervisor pages?
05:19:31 <CrystalMath> i suppose i could check
05:19:46 <CrystalMath> (run in a VM, info tlb)
05:19:53 <geist> i think the main reason you'd do that is if you have a per cpu thing
05:20:02 <geist> you'd probably map those as non global
05:20:07 <geist> or if you're using the recursive page table thing
05:23:17 <CrystalMath> okay there are such pages
05:23:57 <CrystalMath> (windows 32-bit)
05:24:06 <CrystalMath> precisely at the location where "hyperspace" is
05:24:35 <CrystalMath> i just used info tlb
05:24:43 <CrystalMath> then deleted all < 0x80000000
05:24:46 <CrystalMath> from the output file
05:24:57 <CrystalMath> (info tlb in qemu's monitor)
05:25:14 <CrystalMath> so there's a few that are not marked global
05:25:21 <Jmabsd> geist: a GCC wiki discussion of memory models, https://gcc.gnu.org/wiki/Atomic/GCCMM/AtomicSync , neat.
05:26:39 <Jmabsd> geist: actually the weak ordering makes more theoretic sense, as they allow more wheels to spin freely from each other in a computer, so to speak
05:26:52 <Jmabsd> weak ordering should be more energy-efficient than strong ordering!
05:27:12 <Jmabsd> the strong ordering must require a lot of data to go between CPU cores all the time
05:27:26 <Jmabsd> "i'll read this and this and this memory range now, will you write to it?" kind of questions
05:28:13 * geist nods
05:33:14 <CrystalMath> geist: so maybe i'll do this: if the page is not global, recurse into it, everything global gets updated
05:33:30 <CrystalMath> but only considering the upper half
05:33:59 <CrystalMath> so like... with PAE there's 4 PDPTEs, 2 will be checked
05:34:09 <CrystalMath> chances are one will be global and just stored in the other page directory
05:34:30 <CrystalMath> the other might not be (signifying that it has non-global stuff inside)
05:34:41 <CrystalMath> then it recurses into it, and copies all the global PDEs
05:34:55 <CrystalMath> then the same for PTEs
05:35:01 <CrystalMath> should be fine... but i wonder how fast
05:37:03 <CrystalMath> geist: i actually have a limit for this "private page area"
05:38:45 <CrystalMath> can't span more than 8 PDEs
05:39:17 <CrystalMath> so that's 8192 PTEs checked at most
05:39:33 <CrystalMath> .... on every task switch though
05:41:49 * geist nods
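[editor's sketch] The scan CrystalMath describes can be pictured roughly as below for the 32-bit non-PAE case: walk a table and carry over only the entries that are present and marked global. The helper names and the 8-PDE limit constant are illustrative assumptions taken from the discussion, not code from any real kernel; the bit positions (P = bit 0, G = bit 8) are the standard x86 ones.

```c
#include <stddef.h>
#include <stdint.h>

#define PTE_PRESENT       (1u << 0)  /* P bit */
#define PTE_GLOBAL        (1u << 8)  /* G bit */
#define ENTRIES_PER_TABLE 1024u      /* 32-bit non-PAE: 1024 entries per table */
#define MAX_PRIVATE_PDES  8u         /* limit mentioned in the discussion */

/* Hypothetical helper: copy every present, global entry from src to dst
 * and return how many were copied. Other dst entries are left untouched. */
size_t copy_global_entries(uint32_t *dst, const uint32_t *src, size_t count)
{
    const uint32_t want = PTE_PRESENT | PTE_GLOBAL;
    size_t copied = 0;

    for (size_t i = 0; i < count; i++) {
        if ((src[i] & want) == want) {
            dst[i] = src[i];
            copied++;
        }
    }
    return copied;
}

/* Worst-case number of PTEs inspected on a task switch. */
size_t worst_case_ptes(void)
{
    return (size_t)MAX_PRIVATE_PDES * ENTRIES_PER_TABLE;
}
```

With 8 PDEs of 1024 entries each, the worst case is the 8192 PTEs mentioned above, which is why the cost per task switch is the open question.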
05:46:17 <CrystalMath> can i set a PTE/PDE's "global" flag, even if global pages aren't supported?
05:46:26 <CrystalMath> just wondering, PTE bits are scarce
05:50:07 <geist> no
05:50:33 <geist> you have to be careful there. iirc, AMD is much pickier about that, and they will slap you if you set a global bit for anything but a leaf node
05:50:44 <geist> ie, a page table entry, or a large page table entry
05:50:48 <geist> inner nodes are MBZ
05:50:53 <geist> for unused bits
05:51:04 <CrystalMath> well, then there goes my last avail bit
05:51:31 <CrystalMath> oh wait, nevermind, i only used 1
05:51:34 <CrystalMath> still have 1 more then
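[editor's sketch] On 32-bit non-PAE paging, bits 9 to 11 of an entry are ignored by the hardware and left to software, which is where the "avail" bits in this exchange come from. A minimal way to pack software flags into them, assuming hypothetical helper names, might look like this:

```c
#include <stdint.h>

#define PTE_AVAIL_SHIFT 9u
#define PTE_AVAIL_MASK  (0x7u << PTE_AVAIL_SHIFT)  /* bits 9..11 */

/* Store a 3-bit software value in the available bits, leaving the
 * hardware-defined bits of the entry unchanged. */
uint32_t pte_set_avail(uint32_t pte, uint32_t flags)
{
    return (pte & ~PTE_AVAIL_MASK) | ((flags << PTE_AVAIL_SHIFT) & PTE_AVAIL_MASK);
}

/* Read the 3-bit software value back out. */
uint32_t pte_get_avail(uint32_t pte)
{
    return (pte & PTE_AVAIL_MASK) >> PTE_AVAIL_SHIFT;
}
```

Anything wider than three bits is silently truncated here, so a design that needs more per-page state has to keep it outside the entry.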
05:58:12 <doug16k> real amd cpus only treat the pml4 G bit as reserved. kvm's nested paging does inappropriately enforce too many levels' G bits as reserved on amd though
06:01:15 <doug16k> I was supposed to submit a kvm patch for it but never got the build working well enough to test and never finished it :(
06:03:12 <doug16k> CrystalMath, how many available bits are you using?
06:03:14 <doug16k> total
06:04:07 <doug16k> there are 10 available on x86_64
06:05:07 <doug16k> and that's all maxed out. I believe with pkey off you get much more
06:06:51 <geist> a few of us at work just found an interesting KVM/SVM bug on linux 4.15
06:07:19 <CrystalMath> doug16k: but this is 32-bit, so there are 3
06:07:22 <CrystalMath> doug16k: i'm using 2
06:07:35 <geist> basically if you MMIO fault in CPL=3 on a Zen with SMAP enabled, the decode assist doesn't work on Zens
06:07:44 <doug16k> CrystalMath, ah, and not PAE. sorry I assume 64 bit for some reason
06:07:59 <geist> and there's a piece of code added in 4.15 for SEV purposes that causes it to not attempt to decode the instruction, just restart
06:08:07 <geist> so you drop into a hard loop in kvm
06:08:15 <doug16k> ouch
06:08:42 <CrystalMath> doug16k: yeah when PAE is enabled i get more
06:08:55 <geist> the hack was added to essentially retry the NPF if the decode assist failed, because it's assumed you're using SEV, where you can't reach into the guest if you wanted
06:09:12 <geist> but in this case, i think the errata/bug/etc is that SMAP + CPL=3 causes the decode assist to fail too
06:09:24 <geist> presumably because the hardware is trying to fetch from user space through SMAP
06:10:01 <geist> interesting that with SEV enabled, essentially you need full decode assist, which looks pretty nifty
06:10:07 <doug16k> SMAP as in clac stac alignment check thing?
06:10:38 <geist> yah
06:11:01 <geist> reason we hit it is our ACPI code is in user space, and in this particular bytecode is trying to read from one of the HPET registers
06:11:02 <doug16k> but in CPL=3 won't that just mean do alignment check? I must have got confused somewhere
06:11:14 <geist> so we map in HPET space into ring 3 and touch it
06:11:24 <geist> no, SMAP recycles the AC bit to mean something different
06:11:30 <doug16k> right
06:11:32 <geist> it disallows ring0 code from accessing ring3
06:11:35 <doug16k> but it means alignment check in cpl=3
06:12:00 <geist> this is true, but presumably the decode assist hardware is fetching into user space as if it were supervisor mode
06:12:06 <geist> and thus SMAP kicks in
06:12:18 <doug16k> in cpl < 3 alignment exception can't happen even if you want to, SMAP makes AC useful in that case
06:12:31 <doug16k> ah
06:12:51 <doug16k> wow, very subtle bug then
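[editor's sketch] The SMAP rule being discussed can be modelled as a small predicate: a supervisor-mode data access to a user page is blocked when CR4.SMAP is set, unless it is an explicit access made with EFLAGS.AC = 1. Implicit supervisor accesses are blocked regardless of AC. The function name is hypothetical, and whether the Zen decode-assist fetch really behaves like an implicit supervisor access is geist's presumption, not something this model proves.

```c
#include <stdbool.h>

/* Simplified model of the SMAP check for a data access to a user-mode page.
 * implicit_supervisor marks accesses the CPU performs on its own behalf at
 * supervisor privilege, whatever the current CPL is. */
bool smap_allows_user_page_access(bool cr4_smap, unsigned cpl,
                                  bool eflags_ac, bool implicit_supervisor)
{
    bool supervisor_access = (cpl < 3) || implicit_supervisor;

    if (!cr4_smap || !supervisor_access)
        return true;   /* SMAP off, or an ordinary user-mode access */
    if (implicit_supervisor)
        return false;  /* implicit supervisor accesses ignore EFLAGS.AC */
    return eflags_ac;  /* explicit supervisor access needs AC=1 (stac) */
}
```

Under this model an ordinary CPL=3 load passes, while a fetch issued "as if supervisor" while the guest sits at CPL=3 fails, which matches the symptom described above.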
06:13:02 <geist> there's a note in the comment, lemme find it
06:14:06 <geist> https://github.com/torvalds/linux/commit/00b10fe1046c4
06:14:35 <geist> it's that line at 4960 that causes it to fail. the line is trying to work around another problem, which is that you cannot directly reach into guest memory if SEV is enabled
06:15:06 <geist> but in our case we actually need to fall through, because we got the same failure but for a different reason, and restarting wont fix it
06:15:55 <geist> we know it's SMAP because if we simply dont use it it all works fine
06:16:03 <geist> it's when we activate SMAP that it all goes wrong
06:19:34 <doug16k> hmmm. I guess everything should have a chicken bit eh?
06:20:56 <doug16k> https://en.wiktionary.org/wiki/chicken_bit
06:20:59 <geist> i see, it only saves the bytes from the instruction on nested #PFs
06:21:15 <geist> kind of neat. it's one of the decode assists: up to 15 bytes of the instruction so that you dont have to read it yourself
06:22:52 <doug16k> that's cool. avoids you reading code in code cache as data
06:23:17 <geist> yah
06:29:15 <doug16k> should I expect to find relocations for every got entry when loading an elf (x86_64) shared library?
06:30:56 <geist> probably
07:00:04 <Jmabsd> where is there some nice document about the new C11 extensions relating to atomics (and acquire/release??)?
07:00:47 <Jmabsd> i basically like to update myself with the new theory in the memory ordering domain, that came with C11, just to understand how this affects how C programs should be written.
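[editor's sketch] For readers following the acquire/release question, the core C11 pattern is small: a writer fills in plain data and then does a release store to a flag; a reader that sees the flag with an acquire load is guaranteed to see the data too. The names `publish` and `try_consume` are made up for this illustration; the `<stdatomic.h>` calls are standard C11.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;                /* plain, non-atomic data */
static atomic_bool ready = false;  /* flag that publishes the payload */

/* Writer side: write the data, then release-store the flag so the data
 * write cannot be reordered after the flag becomes visible. */
void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Reader side: acquire-load the flag; only if it is set is it safe to
 * read the payload. Returns false when nothing has been published yet. */
bool try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;
    *out = payload;
    return true;
}
```

On a strongly ordered machine these orderings typically cost little beyond restricting the compiler, while on weakly ordered ones they map to barriers or special load/store instructions, which is the trade-off discussed earlier in the log.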
07:04:44 <nyc> I've been tied up with BS all night.
07:14:14 <geist> :(
08:43:47 <nyc> I got no coding done tonight and no significant research, either.
08:45:51 <klys> well that often happens to folks using irc
08:47:43 <nyc> I got wrapped up in something that was not even computer-related, never mind IRC.
09:30:04 <aalm> mm, w/irc one has /ignore avail. atleast.. :[
10:29:17 * renopt kicks the ioapic
01:02:58 <nyc> renopt: IO-APIC's were numerous enough on NUMA-Q's to trip over hardcoded limits etc. IIRC Unisys had systems with similar issues, but didn't go about porting things to them quite as early.
01:08:34 <nyc> I think it was one per-node. We never did get to try out 16-node systems, though. That would have taken some fancy footwork to reprogram things so that the broadcast cluster ID clashed with the local cluster on every node. I never really looked hard into how it was done in the DYNIX/ptx source.
01:10:42 <bcos_> Back then, 15 nodes would've been the limit
01:11:55 <nyc> No, there was some trick to have a different node to cluster ID assignment on every node.
01:11:59 <bcos_> (8-bit APIC ID split into a 4-bit "CPU within node" and a 4-bit "node/cluster number"; where "all bits set in node/cluster number" implies "broadcast to all nodes/clusters")
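[editor's sketch] The split bcos_ describes can be written down as below. The assumption that the cluster number lives in the high nibble and the CPU field in the low nibble is mine for illustration; the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define CLUSTER_BROADCAST 0xFu  /* all bits set in the cluster field */

/* Cluster (node) number: assumed to be the high 4 bits of the 8-bit ID. */
unsigned apic_cluster(uint8_t id)        { return (unsigned)id >> 4; }

/* "CPU within node" field: assumed to be the low 4 bits. */
unsigned apic_cpu_in_cluster(uint8_t id) { return id & 0xFu; }

/* True when the ID addresses every cluster rather than a single one. */
bool apic_is_broadcast(uint8_t id)
{
    return apic_cluster(id) == CLUSTER_BROADCAST;
}

/* 16 encodings minus the one reserved for broadcast. */
unsigned usable_clusters(void)
{
    return 16u - 1u;
}
```

That reserved encoding is where the 15-node figure comes from; the trick nyc mentions amounts to giving each node a different mapping so the reserved value never collides with a remote node.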
01:13:00 <Griwes> 15 nodes should be enough for everyone
01:13:02 <nyc> The broadcast cluster ID was set up to clash with the local node.
01:16:02 <bcos_> Realistically; once you need to exceed ~8 nodes you probably should be pushing the "many computers on a fast LAN" model
01:17:00 <nyc> Sequent actually sold these things.
01:17:29 <bcos_> When?
01:18:16 <nyc> They even ran one internally with a legacy node of 386's to stress migration for floating point usage code in the kernel.
01:18:22 <bcos_> Historically there's been a few "not PC compatible, but with 80x86 CPUs" systems with far more CPUs than the PC compatible systems supported
01:18:58 <nyc> bcos_: They were one.
01:19:20 <bcos_> (possibly the oldest I came across was a rare "8 Pentium CPUs" thing from someone - can't quite remember the name - something like "avalon" maybe)
01:20:51 <nyc> 8 was well within architectural limits.
01:21:28 <nyc> Sequent literally had 386's without FPU's strung together.
01:22:13 <bcos_> Within theoretical limits. Never worked in practice (bandwidth limits of "everything connected to front-side bus") without very specialised (and completely non-standard) "magic routers" rammed into the chipset
01:25:14 <bcos_> Hrm. I think we might be talking about the same thing (Aviion = my "sounds like avalon" = your Sequent)
01:25:19 <bcos_> https://en.wikipedia.org/wiki/Aviion
01:25:53 <nyc> I'm not sure what you're calling a standard is anything more than what was getting hawked to the lowest common denominator. Standards organizations etc. are mostly silent on architectures beyond PCI and such.
01:29:04 <bcos_> Standard (for PC) was "N CPUs and memory controller on same front-side bus". Non-standard is "proprietary interconnect/directory cache/who-knows-what rammed into the middle to make it work for more CPUs"
01:29:35 <bcos_> - looks like there might have been 3 competing attempts - SCI, whatever DG used and whatever DEC used
01:29:48 * bcos_ shrugs
01:31:04 <nyc> Sequent used SCI rings and did a lot of wild things in the kernel to make it all run fast with high remote node latencies.
01:31:18 <bcos_> Of course now that standard PCs have shifted to hyper-threading or quickpath (and got rid of FSB) NUMA is part of the standard for PCs
01:31:33 <bcos_> D'oh - hypertransport or quickpath
01:31:50 * bcos_ nods
01:32:15 <bcos_> Also needed a specially modified OS to support it (couldn't just use a normal "Windows NT")
01:32:32 <nyc> Hypertransport still has some sort of nasty architectural limit.
01:32:52 <nyc> bcos_: Who, when, on what?
01:33:31 <nyc> Sequent never even attempted to run Windows on its boxen.
01:34:09 <bcos_> I'm talking of DG's Aviion (which was a competitor to Sequent)
01:34:31 <nyc> They literally predated Windows 95.
01:34:57 <bcos_> ?
01:35:25 <nyc> Sequent predated Windows 95.
01:35:26 <bcos_> "Some versions of these later Intel-based machines ran Windows NT, while higher-end machines ran the company's flavor of Unix, DG/UX." (from 1st paragraph of wikipedia page)
01:36:56 * bcos_ remembers because I saw one of these "8 pentium chip" things on eBay about 15 years ago and wanted it; but did a pile of research to figure out if it'd work with my software and found that it doesn't work with any normal "off the shelf" software
01:42:45 <nyc> Yeah, they need vendor kernels.
02:06:05 <knebulae> what's stymying the world on this frigid friday?
02:12:34 <knebulae> 25 years ago I learned to program on one of these: http://www.salicontech.com/microprocessor-microcontroller-trainer.html / yesterday I fought for 1.5 hours with a printf error. Let's hope today is better. :)
02:17:51 <knebulae> That unit even came with a stack of pre-printed worksheets to helpfully assist you in converting your written assembly mnemonics into the corresponding cpu instruction values. It even had the most common mnemonics across the top. ;)
02:18:23 <knebulae> Then you could type in your entire program 4 bits at a time.
02:20:13 <knebulae> To be fair, that unit was old as h*ll then tho...
02:21:02 <knebulae> I think that was about the time of the fsb decoupling (DX2 chips?)
02:41:28 <nyc`> My laptop is misbehaving again with the network card going invisible. Bluetooth is showing up, which is odd, but consistent with the other reports of network problems with this laptop model (Yoga 2 Pro).
02:46:27 <nyc`> It's supposedly some kind of hardware issue with rfkill switching.
02:51:39 <nyc`> I may very well try to port the Israeli guy's driver hack, but doubt it'll help because ideapad_laptop blacklisting doesn't help.
02:55:39 <nyc`> Closing the lid and taking it back upstairs will probably do more for it than the dozen or so reboots between Windows and Linux I've done.
02:57:11 <nyc`> If I had the tools to open the case, I'd try pulling the CMOS battery.
02:57:19 <knebulae> @nyc: I've got a case of netgear n to ethernet adapters. I'll ship you one.
02:58:54 <nyc`> knebulae: A USB Wi-Fi adapter is probably all that can be done. There's no Ethernet port on the laptop.
03:00:26 <knebulae> Gotcha. No ethernet port? Damn.
03:02:29 <nyc`> Lenovo Yoga 2 Pro's were trying to get a low weight and slim form factor to compete with Ultrabooks or whatever they were called.
03:03:16 <nyc`> No serial, no Ethernet, no swappable battery packs.
03:04:42 <knebulae> @nyc: I also have a spare USB 3.0 wireless-AC adapter. I haven't used it in 18 months. I'll donate that to your cause.
03:05:03 <knebulae> I usually get 600+Mb/s out of it.
03:05:10 <nyc`> knebulae: Merci beaucoup, mon frère.
03:05:33 <knebulae> nebulae⊙no
03:06:59 <knebulae> @nyc: it's this guy: https://www.netgear.com/home/products/networking/wifi-adapters/a6210.aspx
03:07:15 <knebulae> see if that's suitable for your needs
03:08:54 <nyc``> knebulae: That'll be fantastic, thank you so much!
03:09:22 <nyc``> knebulae: I emailed with my address.
03:09:35 <knebulae> @nyc: I'll get it out to you today.
03:13:57 <nyc``> I need to get a job so I can buy a new laptop.
03:14:45 <knebulae> @nyc: I responded, but some mailservers don't like mine (geist... j/k lol).
03:19:12 <nyc``> Excelente!
03:29:49 <knebulae> from my cashless card system vendor: "we are implementing changes that help recover lost transactions..."; If you're losing your transactions, then I weep for what this update is actually papering over.
03:31:30 <knebulae> And then I weep when I remember what we paid for said system.
03:33:12 <nyc`> I used to eBay E3K's and AlphaServer 1200's etc.
03:33:53 <knebulae> was there any money in it?
03:34:40 <knebulae> When I was in undergrad, I would've killed for an alpha or mips box. Closest I came (due to age more than anything) was the Sun workstations we used in the labs. I don't even know what they were.
03:34:51 <nyc`> knebulae: I was the buyer. I kept them and wrote Linux kernel code on them.
03:35:04 <knebulae> @nyc: ahhh, gotcha.
03:35:25 <nyc`> knebulae: That was the 90's?
03:35:30 <knebulae> 96
03:36:18 <nyc`> No way I could've gotten my hands on those things then. They were too current and my career hadn't taken off yet.
03:36:31 <nyc`> My career has since ended.
03:37:12 <knebulae> @nyc: I remember. I spent north of $5k on a 486dx2-50 w/8mb in 1994. Workstation hardware was very expensive.
03:37:29 <knebulae> Maybe it was '92
03:37:50 <knebulae> *lowly 486dx2-50
03:38:33 <knebulae> Good ole' Ted Waitt and his spotted cow boxes.
03:39:19 <knebulae> Then that pot-head steven had to come along and ruin everything. Dude, I don't *want* a Dell. ;)
03:40:05 <eryjus> the same gateway computers that McDonald's bought for their In-Store Processors (or ISPs)... thus the cow painted boxes..
03:40:18 <w1d3m0d3> does anyone here happen to know how to get 1MHz accurate timers in Linux
03:40:57 <knebulae> @eryjus: I thought the cow boxes were because they were in a field in South Dakota
03:41:37 <nyc`> w1d3m0d3: HPET? I don't know if interrupt processing can be done at that rate.
03:41:52 <eryjus> i worked in a store at that time and IIRC they re-branded when they picked up the McD's contract...
03:42:04 <knebulae> @eryjus: gotcha
03:42:06 <w1d3m0d3> nyc`: but in linux userspace
03:42:18 <w1d3m0d3> since I want to emulate a 6502 CPU
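[editor's sketch] One common way to pace an emulated 1 MHz CPU from Linux userspace is not to wake up once per cycle but to run a batch of instructions, count the emulated cycles, and sleep until the absolute time at which the real chip would have finished them. The sketch below assumes that approach; `deadline_after` and `sleep_until` are hypothetical names, while `clock_nanosleep` with `TIMER_ABSTIME` is the standard POSIX call.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

#define EMU_CLOCK_HZ 1000000ull     /* 1 MHz target clock (assumed) */
#define NS_PER_SEC   1000000000ull

/* Absolute time at which `cycles` emulated cycles after `start` are due. */
struct timespec deadline_after(struct timespec start, uint64_t cycles)
{
    /* Sub-second part of the cycle count converted to nanoseconds,
     * added to the nanoseconds already present in `start`. */
    uint64_t ns = (uint64_t)start.tv_nsec
                + (cycles % EMU_CLOCK_HZ) * (NS_PER_SEC / EMU_CLOCK_HZ);
    struct timespec d;

    d.tv_sec  = start.tv_sec + (time_t)(cycles / EMU_CLOCK_HZ)
              + (time_t)(ns / NS_PER_SEC);
    d.tv_nsec = (long)(ns % NS_PER_SEC);
    return d;
}

/* Sleep until an absolute CLOCK_MONOTONIC time; returns 0 on success. */
int sleep_until(const struct timespec *deadline)
{
    return clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, deadline, NULL);
}
```

A loop would take `start` from `clock_gettime(CLOCK_MONOTONIC, ...)` once, then after each batch call `sleep_until` on `deadline_after(start, total_cycles)`. Using absolute deadlines keeps scheduling jitter from accumulating, though it does not make each individual cycle land on a 1 µs boundary.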
03:43:07 <nyc`> I could've learned a lot more if I could've actually gone to class instead of spraying toxic goo on the underside of car bodies and chipping overweld off of train cars.
03:44:28 <knebulae> @nyc: I was not so diligent a student back then. If I only knew then.... yadda, yadda...
03:44:45 <eryjus> knebulae checking wikipedia it might have been coincidental timing
03:45:23 <knebulae> @eryjus: right; no worries.
03:50:00 <nyc`> knebulae: If I knew better then, I would've gone into something other than EE or CS. 8% retention over 10 years is a big red sign saying Do Not Enter.
03:56:18 <knebulae> @nyc: I have some thoughts on that.
03:56:32 <knebulae> Is it really only 8%?
03:56:48 <nyc> So said the IEEE.
03:57:10 <knebulae> I have to say, despite my thoughts, that 8% number has me flabbergasted
03:57:49 <knebulae> anoecdotal I know, but none of my tech friends have left tech.
03:57:56 <knebulae> *anecdotal
03:58:03 <nyc> I've basically been on the rocks ever since a bad case of MRSA I had 11 years ago.
03:58:37 <knebulae> @nyc: that'll do it
03:58:48 <nyc> So it's not necessarily because they're unfit to actually do it.
04:00:24 <nyc> I came in slightly before the wave that graduated into joblessness with the dot com bust.
04:00:27 <knebulae> Well, some people have "it" and some people don't. And I'm not talking about code golf, or the ability to transcend mere mortal thought while coding. The "it" factor is the passion.
04:01:04 <knebulae> Lots of people become doctors or lawyers too for reasons other than their love of practicing law or medicine.
04:02:04 <nyc> The industry is crap about abusing salary as unpaid overtime.
04:02:56 <lkurusa> Ah OpenBSD, love you.
04:03:13 <knebulae> There's just a lot of people whose parents told them "go into computers," without really giving it that much thought. And not really thinking about what that meant they would be doing with their lives for the next 50 years. On top of that, the job of a day-to-day software developer (unless your in Silicon Valley) is likely to be substantially more mind-numbingly boring than you were expecting.
04:03:45 <knebulae> *you're
04:04:36 <knebulae> @nyc: and yes, they work the kids like dogs
04:05:19 <knebulae> But it's always been like that throughout history; it's not unique to IT.
04:05:25 <nyc> knebulae: Not just the kids.
04:06:52 <knebulae> @nyc: I get it. I'm not going to cry for the kids that overachieved in order to land those jobs. That's the deal. If you don't want to play that game, there's plenty of other places to get your beak wet.
04:07:43 <nyc> knebulae: Computer programming affairs are in a category with fast food management and such for uncompensated overtime.
04:08:44 <knebulae> Legacy, unfortunately.
04:08:56 <knebulae> I have confidence the rules will change.
04:09:19 <knebulae> Maybe not to where they need to be, but certainly to a place that's more palatable to the industry workers.
04:10:31 <knebulae> Meanwhile, we have folks in my area who think <insert language here> gurus should cost $50k.
04:11:33 <nyc> I don't think I'll ever be hired to work as a programmer of any kind again. It's unclear what I can do at this point since education appears unavailable and my physical abilities are lacking.
04:12:31 <Shockk> sorry, really quick question for C++ gurus; if I have an empty std::vector and call v.resize(size), will all new entries be default-constructed?
04:12:55 <knebulae> @nyc: education unavailable? I'm re-doing my comp-sci education as sort of a 20-year refresher. Between MIT OCW and Khan (so-so), EdX and the like, there has never been more amazing educational material available. It's truly remarkable.
04:13:39 <knebulae> Plus I actually give a sh*t this time around. :)
04:13:50 <nyc> knebulae: I mean actual credentials.
04:14:04 <knebulae> @nyc: well, those cost money
04:14:36 <knebulae> @nyc: sorry; in my situation, I already have the BSc.
04:15:23 <knebulae> Which is all but worthless by itself anyway.
04:16:04 <nyc> knebulae: Hence unavailable to those trying to dig their ways out of deep deep holes. I have a BSc (CS & math from Purdue) that's worthless and that keeps me from any aid for getting a new one.
04:16:23 <knebulae> @nyc: B1G represent!
04:16:42 <knebulae> I have a buddy who did MechE at Purdue.
04:18:00 <knebulae> @nyc: I can certainly empathize with your deep holes comment; perhaps more than most. But just keep at it. If you have skills, you can certainly make a living. I've never been asked for my credentials; but I was also not seeking permanent employment.
04:18:47 <knebulae> You have to gig it up; clients will take a short-term risk on you that they wouldn't take for a longer term commitment on their part.
04:19:13 <nyc> knebulae: I'm plotting to do some sort of kernel code to scare up interest.
04:20:11 <knebulae> @nyc: that's what I'm doing. I've been sitting on this for 20 years. In the last 26 days I've made more progress than I ever did before. And here's the scary part. It was *easy*. It was so, much, harder in 2003-2005.
04:20:59 <nyc> knebulae: I've got cpumask_t and pgcl on my resume and am getting zero hits, so I doubt a NIC driver for somebody's hobby kernel is going to actually help much.
04:22:44 <knebulae> @nyc: I don't want a driver from you
04:23:25 <nyc> knebulae: Or a USB stack for the Hurd.
04:23:42 <knebulae> @nyc: I hurd that's almost finished
04:23:53 <knebulae> @nyc: anyhow, I see your point.
04:31:21 <nyc> knebulae: Maybe a more aggressive goal is needed?
04:32:42 <knebulae> @nyc: The great irony is that there are people, right now, looking for YOU (well, someone who has your skills).
04:32:56 <knebulae> @nyc: it happens all the time.
04:33:28 <knebulae> @nyc: my biggest problem as a manager, and other managers I converse with, is very simple: "How do you find good people?"
04:33:56 <knebulae> @nyc: and every good person I talk to says man, it seems like no one wants my skills.
04:34:38 <knebulae> So despite all of our advances in algorithms, job matching, etc. it all still sucks *ss.
04:34:52 <lkurusa> i'm not a manager, but my two cents are: nothing beats person-to-person recruiting
04:35:19 <lkurusa> literally go to conferences, or other events and talk to people, i got all my gigs that way
04:35:24 <knebulae> @lkurusa: I can tell in 5 (usually under 2) if you are appropriate for a role.
04:35:32 <lkurusa> never applied specifically to a company
04:35:39 <knebulae> I could read a 1,000 page resume and still not know.
04:36:05 <lkurusa> knebulae: right, so clearly the solution is to ask your recruiters to talk to more people :)
04:36:19 <nyc> If you have to apply, it's game over anyway.
04:36:25 <knebulae> that'd be my kids' solution.
04:36:32 <lkurusa> just kidding, but i guess if you're looking for good people you should go where the good people are and not wait until algorithms find them i guess
04:36:57 <lkurusa> tautology, but you get what i mean
04:37:29 <knebulae> @lkurusa and @nyc: this was my point; get around people and talk the talk; people like to work with others they get along with and that they know can pull their own weight.
04:37:56 <knebulae> the pot will not stir you
04:38:29 <nyc> AFAICT hiring managers only ever look for two things, both TLA's: NCG's and H1-B's.
04:39:00 <knebulae> @nyc: only BigCo
04:39:49 <lkurusa> what's NCG?
04:39:55 <knebulae> @nyc: and if you're applying for a job that's bringing in a bunch of H1B's who are wet behind the ears, do you really want to do that on a day to day basis for 60+ hours per week?
04:40:15 <nyc`> New College Grad.
04:40:19 <lkurusa> Ah
04:40:34 <lkurusa> i doubt that's the case
04:40:37 <knebulae> See, IT has become infected with the same disease that infected the legal profession
04:41:13 <lkurusa> in fact, the company i work for definitely doesn't just hire ncg/h1b people
04:41:18 <nyc`> knebulae: If I were on vacation, I'd work at least 80 hours a week.
04:41:44 <knebulae> @nyc: me too
04:41:53 <knebulae> @nyc: I inherited it from my father.
04:42:30 <nyc`> knebulae: They can do their 60 hours and go home. I'll have another 60 hours to actually get work done.
04:43:20 <knebulae> @nyc: yes; however I feel like this is punishment for how much of a smart-*ss I was in my 20s.
04:44:21 <nyc`> Ikarusa: I haven't had an inside look at the profession in 10 years. It was actually mostly word of mouth hiring people who someone knew.
04:46:13 <nyc`> Ikarusa: Occasional crops of NCG's were taken on cold, but that was IBM that just lays off the ones that don't make the grade in a couple of years.
04:47:42 <lkurusa> it's lkurusa :p
04:47:44 <lkurusa> but yeah
04:48:58 <nyc`> Ikurusa: IBM mostly set up whole campuses in Bangalore and Shenzhen and such as opposed to bringing in H1-B's in general. Various people from other sites would get transfers though, and places in the world without IBM sites too.
04:50:40 <nyc`> Other places without the international presence and size to do things like set up whole sites in India and China bring in more H1-B's.
04:52:57 <nyc`> What would I have to do to compete with cpumask_t and pgcl?
04:54:21 <nyc`> (I may have to explain what those are.)
04:57:39 <nyc`> cpumask_t changed the Linux kernel from using unsigned longs to track CPUs to using bitmap data structures. That was to get new IBM pSeries boxen with 256 CPUs running Linux with all of them being used.
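[editor's sketch] The idea behind moving from a single unsigned long to a bitmap can be shown in a few lines: one long holds only 32 or 64 CPU bits, while an array of longs scales to any CPU count. The type and function names here are illustrative and are not the kernel's cpumask API; the 256-CPU figure is the one from the message above.

```c
#include <limits.h>
#include <stdbool.h>

#define NR_CPUS       256u
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define MASK_LONGS    ((NR_CPUS + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* A CPU set stored as an array of longs instead of a single long. */
typedef struct {
    unsigned long bits[MASK_LONGS];
} cpu_bitmap_t;

/* Mark a CPU as a member of the set. */
void cpu_bitmap_set(cpu_bitmap_t *m, unsigned cpu)
{
    m->bits[cpu / BITS_PER_LONG] |= 1ul << (cpu % BITS_PER_LONG);
}

/* Check whether a CPU is a member of the set. */
bool cpu_bitmap_test(const cpu_bitmap_t *m, unsigned cpu)
{
    return (m->bits[cpu / BITS_PER_LONG] >> (cpu % BITS_PER_LONG)) & 1ul;
}

/* Count how many CPUs are in the set. */
unsigned cpu_bitmap_weight(const cpu_bitmap_t *m)
{
    unsigned n = 0;
    for (unsigned cpu = 0; cpu < NR_CPUS; cpu++)
        n += cpu_bitmap_test(m, cpu) ? 1u : 0u;
    return n;
}
```

Hiding the representation behind a type and accessors is what lets the CPU limit change without touching every caller that previously did raw shifts on a long.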
05:02:14 <nyc`> pgcl used a larger kernel allocation unit, with ABI emulation to keep it from being observable as an mmap() alignment or size requirement. The point was to reduce the memory footprint of the big array of per-page data structures so 64GB x86-32 could boot and run while Linus was vetoing XKVA (what mingo later called 4/4).
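[editor's sketch] The footprint argument is simple arithmetic: the per-page array has one descriptor per allocation unit, so a larger unit shrinks the array proportionally. The 32-byte descriptor size and the 32 KiB unit used in the note below are hypothetical round numbers chosen for illustration, not figures from the pgcl patches.

```c
#include <stdint.h>

/* Bytes needed for an array holding one descriptor per allocation unit. */
uint64_t page_array_bytes(uint64_t mem_bytes, uint64_t unit_bytes,
                          uint64_t desc_bytes)
{
    return (mem_bytes / unit_bytes) * desc_bytes;
}
```

Under those assumptions, 64 GiB tracked in 4 KiB units needs 512 MiB of descriptors, while the same memory tracked in 32 KiB units needs 64 MiB, which matters a great deal when the kernel's directly mapped memory on x86-32 is well under 1 GiB.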
05:03:45 <nyc`> pgcl was actually originally Hugh Dickins' idea. He called his larpage.
05:04:16 <nyc`> A lot happened between 2.4.7 and 2.5.59.
05:04:51 <nyc`> Those were pretty earth-shattering affairs.
05:07:41 <nyc`> So what do I do for the next bullet point on my resume after those things, inspired by #osdev?
05:27:07 <knebulae> @nyc: I think you're looking at it from the wrong perspective
05:28:31 <knebulae> @nyc: those types of projects (depending upon your level of involvement) prove that you have the engineering chops to work on basically anything. I mean, if you're looking for actual os development work, the truth is, there isn't a lot of that going on, aside from the obvious few places. Unless you just want to work on Linux and write drivers for the rest of your life.
05:29:09 <knebulae> And I'm sure there's a TON of shops who could use a linux driver writer.
05:29:17 <knebulae> Just not in NYC for a F500.
05:30:50 <lkurusa> you in NYC?
05:31:18 <knebulae> Automotive, CNCs, PLCs, etc. There's so much equipment that needs good low-level C coders. There's money everywhere for stuff like this. Companies just can't find the expertise. Or we, as programmers, consider it a failure if we're not in on the latest and greatest sh*t.
05:31:34 <knebulae> @lkurusa: no. murder mitten.
05:31:52 <lkurusa> Ah
05:31:54 <nyc`> Beyond my initials being NYC (Nadia Yvette Chambers) I'm also in NYC the city.
05:31:55 <lkurusa> it must be cold up there
05:32:16 <knebulae> @nyc: I hadn't noticed that. Lol.
05:33:08 <lkurusa> great!
05:33:25 <lkurusa> i'm a junior new yorker
05:33:34 <lkurusa> or rather, wanna-be lol
05:33:50 <nyc`> I think it's -7.8°C / 18°F.
05:34:33 <kingoffrance> geist: y'all around/awake?
05:35:30 <nyc`> Maybe doing a whole OS is what I need to be ambitious enough to get people's attention.
05:36:38 <knebulae> @nyc: it was -22 here last night.
05:36:40 <knebulae> :/
05:36:42 <knebulae> F
05:37:18 <knebulae> @nyc: did you get my email? again, cough, cough, G**gle doesn't like my mail server.
05:37:39 <aalm> [spam]
05:37:56 <knebulae> @aalm: but I don't spam.
05:38:04 <nyc`> knebulae: I got it and replied.
05:38:07 <knebulae> It's finicky.
05:38:16 <knebulae> @nyc: ok
05:38:34 <knebulae> @nyc: I meant the latest
05:38:45 <aalm> knebulae, you need to sacrifice something to please the ggods
05:38:54 <nyc`> knebulae: I see the tracking number. Merci beaucoup !
05:39:00 <knebulae> @nyc: very good
05:39:09 <knebulae> @aalm: that's what it seems like.
05:39:27 <aalm> and that's why i have @gmail.com :p
05:39:44 <knebulae> @aalm: getting people to add me to their contacts list appears to help dramatically. As well as associating the domain/website with a physical address on google maps.
05:40:03 <knebulae> @aalm: I do too; just not for here and a couple of other side projects.
05:40:10 <nyc`> I need to do something earth-shattering again.
05:41:06 <knebulae> @nyc: nobody ever mentions when you're in the middle of "it," that this might be your *it*. Lol.
05:42:43 <knebulae> Most people never get to do anything earth-shattering once. At least not in their professional lives. You should consider yourself blessed. Little solace, I know, but it's more than most.
05:47:15 <knebulae> I mean, you've got to throw some sh*t against the wall and see what sticks-- you can't wait for the world to come to you. I have it on good authority that the "baby-steps" method works wonders. (I'm only half joking)
05:53:19 <knebulae> @nyc: if you're looking for a gig, I have an easy project that I've put out to our normal overseas guys, but... it's small. Rent money for a month over there.
05:54:00 <knebulae> @nyc: linux & RPi
05:54:07 <knebulae> cake
05:54:13 <knebulae> I just don't have the time.
05:54:27 <knebulae> translation: I'm taking some *me* time.
05:55:09 <aalm> let them eat cake!
05:56:25 <knebulae> @aalm: my man!
05:56:32 <knebulae> @aalm: speaking my language
05:59:04 <nyc`> knebulae: I need to avoid significant income for the moment.
05:59:14 <knebulae> @nyc: perfect!
05:59:38 <knebulae> @nyc: my boss will love you!
05:59:40 <knebulae> j/k
06:00:08 <aalm> but RPi? o_O
06:00:21 <knebulae> @aalm if you see what it is, you'll understand.
06:00:34 <knebulae> @aalm: doesn't have to be RPi.
06:00:40 <knebulae> Just a small SoC
06:00:54 <aalm> sounds better
06:00:56 <knebulae> That's what our previous vendors used.
06:01:55 <knebulae> Nothing like spending tens of thousands of dollars to get a few hundred bucks worth of raspberry pis and someone's nephew's C# code on a micro-SD card.
06:02:19 <aalm> aww
06:02:31 <knebulae> All running on a micro instance powered backend at AWS.
06:02:46 <knebulae> I was doing more professional sh*t when I was 12.
06:04:02 <nyc`> I was never a hobbyist. I was a math person who got dragged into CS.
06:05:05 <nyc`> I wasn't going for teaching so it was actuarial affairs or programming.
06:05:10 <knebulae> @nyc: my math skills are not strong. It's my biggest regret.
06:05:12 <aalm> i'm still only bsd hacker/hobbyist who fell in love w/embedded stuff
06:06:32 <knebulae> I wound up being a hired gun; Unable to show my code to anyone.
06:06:58 <knebulae> I have a knack for deconstruction
06:07:44 <knebulae> Then went into finance for a decade; Bottom fell out in 2008, went back to IT, finished my BSc.
06:08:51 <knebulae> And here I am, running a different IRC client, with a different nic, in roughly the same place I was in 1997. Only 2 miles down the road too. Lol.
06:09:07 <aalm> lol
06:09:09 <nyc`> People say I should try to go into quant stuff, but I'm underqualified for it.
06:09:49 <knebulae> @nyc: do people who actually *do* quant stuff say that, or just people who know you're better than the average person at math?
06:10:30 <knebulae> I only ask, because there's a big difference, especially in nyc.
06:10:51 <nyc`> knebulae: I say it to the people who know only that I'm above average at math.
06:11:47 <knebulae> @nyc: there's definitely chedda in that game.
06:11:52 <aalm> anything over binary is too much numbers for me:p
06:11:58 <aalm> 01 is enough
06:12:52 <nyc`> knebulae: It's there, but they want Ph.D.'s.
06:13:41 <knebulae> @nyc: did you check your couch for an extra $150k?
06:13:49 <knebulae> @nyc: and 3 years?
06:14:30 <nyc`> knebulae: I don't even have a couch.
06:14:49 <knebulae> @nyc: since you are a mathematician (by my definition anyway), can you recommend a good resource for refreshing on Calc 3 & 4 & Linear Algebra?
06:15:43 <nyc`> knebulae: Apostol, Rudin, and Hoffman & Kunze.
06:15:54 <knebulae> most of the courses get frustrating about half way through because the stuff starts to come back to me and get repetitive.
06:16:18 <knebulae> @nyc: textbooks, authors, professors?
06:16:25 <knebulae> nvm. my fingers ain't broke.
06:16:46 <nyc`> knebulae: Textbooks named by authors.
06:16:53 <knebulae> @nyc: ok
06:17:43 <nyc`> Whittaker & Watson is an oldie but goodie.
06:17:58 <aalm> so, what's the thing you need linux on a small SoC for?
06:18:57 <kingoffrance> thats what i was gonna ask knebulae. im neither qualified nor looking, but what is rpi being used for (if you are at liberty to speak, very high level) ?
06:19:21 <knebulae> @aalm: cashless card systems for video arcades
06:19:40 <knebulae> guys, not high level at all.
06:19:41 <knebulae> lol
06:19:55 * bcos_ tries to get the image of "elephant riding a small skateboard" out of his brain
06:20:06 <aalm> :D
06:20:10 <knebulae> We're not targeting casinos yet; this is for internal use.
06:21:12 <knebulae> When I was 20 and lived on Mt. Dew and Dominos, this would've been done over a weekend.
06:22:10 <kingoffrance> what do you mean "hired gun unable to show code" -- NDAs, just hard to put details on "resume" because it is so project specific?
06:22:29 <program> Does anybody know about some nice and simple front end for Arm GDB for windows?
06:22:49 <knebulae> The offer stands to anyone who's following along too. Hit me up: nebulae⊙no I'd rather give this to someone who hangs out here or on AoD than the overseas guys (no offense to anyone overseas- this is a specific group of guys).
06:22:56 <program> And my gdb is not compiled with tui option unfortunately
06:23:19 <knebulae> @kingoffrance: I wrote code for the finance industry (hence my segue into that industry).
06:23:35 <knebulae> @kingoffrance: and no specific identifiable job title
06:23:36 <kingoffrance> i am very much noob, but that is why i hated "web" stuff -- i do consider myself slightly above joe average php. not that he is necessarily "bad coder" but everything i ever did professionally, was all throwaway, "we are rewriting everything in a year or two"
06:24:00 <kingoffrance> its like building/designing a house "by the way, we will demolish it all in 2 years" :/
06:24:21 <kingoffrance> my hope is lower-level "OS" stuff is a little slower moving, built to last, etc.
06:24:42 <aalm> kingoffrance, that's why you don't do linux
06:24:58 <kingoffrance> yes, bsd has a reputation anyway of slower moving
06:25:39 <aalm> (less useless/annoying churn)++
06:27:20 <aalm> program, then i'd look into virtualizing something better than windows for the given task, if you can't do w/o the windows host:]
06:27:39 <kingoffrance> thats also probably why i tend towards oo-ish, even pseudo. because in real world i have seen, there is next to nothing "reuse", very much hype
06:33:56 <knebulae> @kingoffrance: that's why there's this big push to use composition
06:35:49 <kingoffrance> a good example: look at nl_langinfo() manpage, i feel less bad about "rolling my own" e.g. looks to be hardcoded 7 days per week, 12 months. i dont *need* anything different, but it seems impossible to use posix and c "standards" e.g. for 13-month lunar calendar. nevermind "mars time" (not that i work for nasa)
06:36:08 <knebulae> I know, we didn't re-use all this object code, because the objects became too unique, so let's just instead just re-use the function code; wait! we've got to call it something snazzy! Well, it definitely resembles function composition in mathematics if you squint right... You get the point.
06:36:12 <kingoffrance> that is what i would call "halfway vaguely reusable abstraction, but still have to roll your own if you step too far outside"
06:37:20 <knebulae> You can't really re-use code. Not unless it's a library, and that was the code's intent from the start.
06:39:03 <knebulae> s/unique/specialized
06:41:06 * bcos_ thinks code reuse is a mistake
06:42:44 <kingoffrance> the model i like, that doesnt even need an oo-language per se, is similar to freebsd geom, "stackable" components. then many components are specialized, but you can combine/chain them for more options. similar to e.g. vlc or mplayer or winamp "chain multiple effects plugin".
06:43:54 <kingoffrance> its not really true polymorphism or inheritance (some might say that is a good thing), but less complex, and similar effect perhaps
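[editor's sketch] The "stackable components" model kingoffrance describes can be illustrated without any OO language features. The following is a minimal sketch in C, not FreeBSD GEOM's actual API: `struct stage`, `run_chain`, `scale` and `offset` are hypothetical names invented for illustration. Each stage is specialized, and behaviour comes from the order in which stages are chained.

```c
#include <assert.h>
#include <stddef.h>

/* One link in a chain: a specialized transform plus its private context.
 * All names here are hypothetical; this is not the GEOM interface. */
struct stage {
    int (*apply)(int value, void *ctx);
    void *ctx;
    struct stage *next;   /* next stage in the chain, or NULL at the end */
};

/* Two deliberately narrow stages. */
int scale(int value, void *ctx)  { return value * *(int *)ctx; }
int offset(int value, void *ctx) { return value + *(int *)ctx; }

/* Push a value through every stage in order; an empty chain is the identity. */
int run_chain(const struct stage *s, int value)
{
    for (; s != NULL; s = s->next) {
        assert(s->apply != NULL);
        value = s->apply(value, s->ctx);
    }
    return value;
}
```

Chaining `scale` (by 2) in front of `offset` (by 3) turns 5 into 13; swapping the order would give 16. There is no inheritance involved, only function pointers and a linked list.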
06:45:38 <knebulae> @kingoffrance: I think we need people who understand what it is that their computers are actually doing so we don't end up with all this godforsaken ambiguity in telling a f*cking computer what to do.
06:45:40 * bcos_ nods. I want/prefer re-usable services (implemented with "optimised for the specific purpose, not re-usable at all" code)
06:50:18 <kingoffrance> well, im a noob to hardware, but i see many times it seems due to ignorance or only knowing one language, ppl unconsciously "mirror" something at a higher level, because they had no idea there was a lower interface for similar thing. then the low interface rots and everyone recreates it higher. i see it like property brothers or something, they discover an unused buried sewer system in the...
06:50:20 <kingoffrance> ...backyard, and now they have 2 to maintain
06:50:32 <knebulae> Go play with one of those 8085 boards for 2 or 3 days and then come back to the PC. You'll be a 1000x better developer.
06:51:11 <knebulae> @kingoffrance: not YOU specifically, I mean in general. For people who want to become better.
06:53:21 <kingoffrance> what terrifies me is the "3d scene in only 5 lines of JS" hype ... i think "you have no idea how many lines of code your browser is, do you?"
06:56:42 <aalm> mm, when i got my first arduino, it took me like 2hrs of C++ while testing out the demos, and i was already looking at how to for C; lean&mean ftw.
07:05:17 <knebulae> @aalm: I long for C plus. They can keep the other plus.
07:33:48 * geist yawns
07:37:35 <kingoffrance> geist, what did you mean other day osdev sometimes gets "eccentrics" ? ppl who dont seem serious about OS, they just have lots of vague ideas but no (lowlevel) code, or just random "trolls" who just want to chat ?
07:39:18 <Ameisen> hmm... still not able to get my local build of ruby to outperform the distro's build
07:39:25 <Ameisen> even with O2/O3 and march=native and LTO
07:39:33 <Ameisen> could be that I'm using clang
07:47:05 <geist> kingoffrance: basically
07:50:52 <knebulae> @kingoffrance: I'm not sure if you've noticed or not, but no one has really asked an os development related question in a few days.
07:51:15 <kingoffrance> a great irony, is it is not something i need, but gettext and catopen both seem to assume hosted C implementation and filesystem
07:51:30 <kingoffrance> do kernels really even do "localization" or leave that for userland?
07:51:52 <knebulae> @kingoffrance: Windows probably does.
07:52:01 <kingoffrance> sure, you probably arent doing currency in kernel, but error messages, etc.
07:52:27 <knebulae> @kingoffrance: it uses and understands Unicode (UTF-16, 16-bit wchar_t) in kernel.
07:52:34 <knebulae> That's far too much already.
07:52:56 <geist> knebulae: it's not really true, maybe just when you werent there
07:53:17 <knebulae> @geist: fair enough
07:53:36 <geist> but it comes in waves. sometimes you get a lot of new folks with lots of ideas, sometimes you get new folks that are starting from scratch, and sometimes the more experienced folks get a vacation or something and go on a tear
07:54:08 <knebulae> @geist: irc has always been hot & cold, at least in my experience
07:54:14 <geist> usually new folks wander off after a while, but a few of them stick around, and thus the population of regulars changes over time
07:54:27 <geist> yep. i think there's definitely a cycle based on when school starts and lets out
07:54:31 <geist> or around finals time
07:55:18 <knebulae> It's nice to have a sounding board. My girlfriend isn't really in to osdev, nor are my teenage daughters. :)
07:55:50 <kingoffrance> i myself wont do translation in kernel, but my charsets are functionally similar to gettext msgids, so my kernel/drivers/etc. would be "neutral" and userland can "Translate"
07:56:25 <kingoffrance> knebulae: yeah, i hate wide chars as much as unicode :)
07:56:58 <knebulae> @kingoffrance: I know they're not the same; but I didn't feel like going down that rabbit hole.
07:57:40 <geist> generally kernels dont do any internationalization
07:57:55 <knebulae> @geist: my comment about windows was a joke
07:59:33 <geist> okay
08:00:02 <knebulae> @geist: I guess making windows jokes on irc isn't as cool as it used to be ;)
08:00:22 <geist> i've seen far wider tangents that folks have gone down, so i have to assume even if it's a joke that someone will take it seriously
08:00:33 <knebulae> poe's law
08:01:03 <geist> ah yes. never read that before, but that's a pretty good explanation
08:02:19 <kingoffrance> thats the thing knebulae, if you say "windows probably does x in kernel" noone knows if you are joking :/
08:02:37 <knebulae> Prepare for a massive case of the baader-meinhof-phenomenon (for those unfamiliar: https://science.howstuffworks.com/life/inside-the-mind/human-brain/baader-meinhof-phenomenon.htm)
08:03:34 <geist> and since a fair amount of the UI was traditionally implemented in the kernel it may actually be the case anyway
08:08:55 <kingoffrance> thats kinda scary. const FcCharSet * FcLangGetCharSet (const FcChar8 *lang); (scary because, it may just be a wrapper around catopen and/or gettext, but if its not, we might have 3 competing "standards" now)
08:20:29 <knebulae> @geist: I'm struggling with a concept, and I'm looking for some guidance; what I really want is a deterministic method for calculating at what point it is appropriate to perform work as a batch, or to amortize that work over smaller units of computation. This is important to the batching of syscalls question that comes up every so often, but is widely applicable.
08:22:04 <knebulae> my ideas center around instrumentation and intelligent code-paths through the kernel
08:28:03 * geist nods
08:31:13 <knebulae> Well, if the kernel knows (through benchmarking or otherwise) what the cost of various operations are on its host cpu (think like address space switches w/privilege change, address space switch without privilege change, or atomic vs lock-free operations), then it could dynamically determine, based on system load and current and anticipated demand, what the best "tasking" strategy would be. Sort of dynamically adapting to the current demands of the
08:31:13 <knebulae> user, whoever that may be.
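[editor's sketch] One way to make the "when is batching worth it" question concrete is a break-even calculation over costs measured at boot. The sketch below is an assumption-laden illustration, not anything from an existing kernel: `struct op_costs`, its fields and `batch_break_even` are hypothetical. The model assumes n separate syscalls cost n * (mode_switch + work), while one batched call costs mode_switch + batch_setup + n * (work + per_item_queue). The per-item work cancels, so batching wins once n * (mode_switch - per_item_queue) exceeds mode_switch + batch_setup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU costs in cycles, e.g. benchmarked once at boot. */
struct op_costs {
    uint64_t mode_switch;     /* one user -> kernel -> user round trip */
    uint64_t batch_setup;     /* fixed cost of submitting a batch      */
    uint64_t per_item_queue;  /* extra cost to enqueue one batch item  */
};

/* Smallest batch size at which one batched call is strictly cheaper than
 * one syscall per item under the model above. Returns 0 if batching can
 * never win (queuing an item costs at least as much as a mode switch). */
uint64_t batch_break_even(const struct op_costs *c)
{
    assert(c != NULL);
    if (c->mode_switch <= c->per_item_queue)
        return 0;

    uint64_t saved_per_item = c->mode_switch - c->per_item_queue;
    uint64_t fixed = c->mode_switch + c->batch_setup;

    /* Smallest integer n with n * saved_per_item > fixed. */
    return fixed / saved_per_item + 1;
}
```

The model ignores latency, cache effects and contention, which is exactly where real measurements would be needed; it only shows that the threshold can be derived from a handful of measured constants rather than being hard-coded.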
08:31:49 <geist> possibly
08:32:07 <geist> generally speaking though most kernels are fairly static in the way their code flow works
08:32:17 <knebulae> @geist: naturally
08:32:20 <geist> it's hard enough to make a design that is solid and secure
08:32:35 <geist> having it dynamically decide what it's doing adds another dimension to it
08:32:49 <knebulae> @geist: right;
08:32:57 <geist> in general, i think it's also a slight mistake to focus too much on the hardware
08:33:00 <geist> that changes, over time
08:33:16 <geist> designing the kernel around the requirement of the hardware currently will eventually bite you
08:33:21 <geist> or at least not giving yourself an out
08:33:30 <geist> but then in your case you're actually advocating making it more flexible
08:34:57 <knebulae> @geist: I just want to avoid situations where the "crosstalk" in kernel becomes overwhelming as the core count increases. Long term, that is "a" problem to solve. How will 10k cores talk to one another? Will they need to?
08:35:32 * geist nods
08:35:48 <geist> sadly this is a tough problem to solve as a hobbyist
08:35:58 <geist> unless you have a large chunk of cash to buy high core count machines
08:36:12 <geist> 10-15 years ago even having SMP was already out of the reach of most folks
08:36:16 <knebulae> @geist: in addition, when I said dynamic, I should have said at boot, not on each function invocation.
08:36:19 <geist> now having >64 cores or so is still generally hard
08:37:23 <knebulae> @geist: I built a threadripper for a client a few weeks ago for <$2k. 64GB 16core/32thread. It's coming soon; at least on hobbyist timelines.
08:38:48 <geist> yes. i knew you were going to say that, but frankly almost no one here has $2-$3k to dump on their hobby os dev machine
08:38:55 <geist> that's also why i pointed out >64
08:39:26 <kingoffrance> it almost sounds like you have a legitimate reason to have currency (cost of operation) in kernel knebulae :) cpu time "dollars" and such
08:39:34 <geist> this will change of course, but i was generally pointing out that unless you want to dump a lot of money on it it's going to be hard to have a good test rig for high core count design
08:39:54 <geist> and thus this sort of thing has traditionally been hard for hobby os dev
08:40:08 <geist> but you're right, AMD has put high core count in the reach of at least regular people
08:40:15 <geist> used to require a $10k dual xeon box or whatnot
08:40:16 <knebulae> @geist: well, neither do I. This machine was for actual work. That being said, I suppose microsoft does something similar, but with separate kernels, but AFAICT (or remember), the end user has zero control over which kernel is chosen, so it's not really the same.
08:40:36 <geist> nah, that's easy. you can have a loader pick a kernel if you want
08:40:51 <geist> even the BSDs usually have a .up vs .mp kernel that gets picked, at least at install time
08:41:01 <geist> linux distros used to do the same thing, but generally i've seen them not bother with up
08:41:12 <geist> you could argue that you could build a .numa kernel or whatnot
08:41:16 <knebulae> @geist: its time has passed
08:41:24 <knebulae> w.r.t. .u
08:41:49 <geist> really i think the state of the art is to simply do it dynamically
08:42:59 <knebulae> @kingoffrance: not really. my code is public. it sucks. But I'm doing it right this time; and by right, I mean right for a hobbyist.
08:45:26 <bcos_> Wait.. AMD released a "EPYC 3451" chip for embedded stuff (like smartphones, etc), with 16 cores and 100W TDP??
08:46:48 <geist> what core is it? zen or puma or excavator?
08:46:57 <bcos_> It's Zen
08:47:06 <geist> it's probably not for smartphones, it's for network routers and whatnot
08:47:09 <geist> like intel Rangeley
08:47:29 <geist> https://en.wikichip.org/wiki/amd/epyc_embedded/3451 'dense servers and edge devices'
08:48:14 <geist> my router is a 4 core 8 thread rangeley, similar idea. soldered to the board, 'embedded' server chip with lots of built in NICs
08:50:17 <bcos_> That's messed up though (it's based on a HPC part - their "2 * 32-core chips" stuff)
08:50:46 <geist> has to be or you wouldn't get 64 PCI lanes. a regular Zeppelin has 32 pci lanes
08:51:05 <knebulae> @geist: I did go a bit overboard with the use of the word dynamic earlier. I wanted to clarify that I'm not talking about inserting some type of code in the actual context-switching hot-path to do statistical calculations or anything like that.
08:51:06 <geist> so it must be a dual zeppelin design, probably downclocked to give it a reasonable TDP
08:51:30 <geist> knebulae: yah. we're actually toying with the idea of using some sort of machine learning in the zircon scheduler
08:51:34 <geist> so time will tell
08:52:26 <geist> bcos_: oh i didn't see it was a 16, yeah that has to be a downclocked regular dual zeppelin design
08:52:28 <knebulae> @geist: but a watcher thread to "keep an eye on things" and potentially swap an allocator or a scheduler (in a many-core design) should be ok, right :)
08:52:29 <geist> basically it's a low end EPYC
08:53:37 <geist> the difference may be the socket (or lack of)
08:53:38 <aalm> s/low end/embedded/ there, fixed. :p
08:53:59 <geist> yah, that's a BGA. that is probably where 'embedded' really comes in
08:54:16 <kingoffrance> knebulae: sorry if im not helpful. i meant abstract currencies for various devices/latency/etc. in other words "statistical calculations". disk quotas already do "grace period" like a loan or line of credit, a user can temporarily exceed their limit for awhile, at which point if they remain in default, no more loans :/
08:54:37 <bcos_> Might make sense for high density blades if you can figure out how to cool a 1.21 gigawatt rack.. ;-)
08:54:40 <kingoffrance> old mainframe OS probably did much more of such, to bill ppl
08:54:47 <clever> geist: there was a recent linus techtips, where they took a laptop cpu (bga package), that they found on ebay, it had a desktop socket attached to it...
08:54:48 <geist> bcos_: i think that's well understood
08:54:52 <knebulae> @kingoffrance: right; situational awareness.
08:55:04 <geist> clever: makes sense. it's likely that the pinout of the bga is roughly the same as a socket
08:55:11 <clever> yeah
08:55:13 <geist> even closer once cpus lost their pins 5-10 years ago
08:55:24 <geist> they may be basically the same package
08:55:58 <clever> well, there could be BGA pads directly on the raw die
08:56:11 <clever> and they BGA that onto the carrier board, that acts as the socket interface
08:56:18 <geist> probably not, the die is generally much smaller than the package
08:56:19 <knebulae> @kingoffrance: by the code I mean.
08:56:21 <clever> and laptops skip the middleman
08:56:24 <kingoffrance> sad thing is if i go to courthouse and ask them to print a public record, they charge $1-$2 per page, supposedly based on cost of database accesses (old IBM something or other)...i have no idea how they figure the costs, but their database and/or OS keeps tabs
08:56:33 <geist> but the bringout of the pins on the die are most likely roughly near where the pads are brought out
08:56:36 <geist> so that the traces are short
08:57:27 <kingoffrance> courthouse records == one place they still use actual green screens, in past 5 years anyway
08:58:33 <ProfessorHaxaemo> i've never seen a real green console, only amber :(
08:58:43 <geist> ah real green screens are fantastic
08:58:57 <geist> though i did also use an amber plasma screen when i was little. Toshiba T3100 'laptop'
08:59:08 <geist> it's actually fairly pleasant
08:59:20 <knebulae> very easy on the eyes
09:00:08 <knebulae> Last terminal I used was a wyse. I can't remember if it was green or orange. I think I was 7 or 8. :/
09:00:16 <doug16k> I've used green monitors at libraries in the late 80's
09:00:22 <geist> yep. i have a wyse50 terminal here
09:00:23 <knebulae> I just remember the owl (I think)
09:00:26 <doug16k> I wonder what they were using
09:00:37 <geist> nice green screen, though this one has some sort of vhold issue, needs to get looked at
09:01:27 <geist> years ago i had a little 13" vga monitor i got from someone that had a 3 way switch on the side
09:01:32 <geist> 'regular, amber, and green'
09:01:51 <geist> it basically skewed the color of white
09:02:07 <geist> kinda wish i kept that thing, it was pleasant to run DOS in amber or green mode
09:02:45 <knebulae> Ok, an OS related question. How long do you think it took someone to write DOS?
09:03:09 <knebulae> I mean, things were much, much different back then, but...
09:03:39 <ProfessorHaxaemo> QDOS?
09:03:58 <knebulae> I was thinking MS-DOS, being the only DOS I am familiar with.
09:04:12 <ProfessorHaxaemo> qdos was what they bought and turned it into ms-dos
09:04:19 <knebulae> ahh. Ok.
09:04:29 <knebulae> CP/M right?
09:05:12 <knebulae> I mean, there's just not that much there, as far as an operating system goes, even for its day.
09:05:49 <knebulae> nvm; it was a stupid question.
09:06:27 <ProfessorHaxaemo> i was just confused because you said "someone"
09:06:41 <ProfessorHaxaemo> qdos was written by one person, ms-dos by a team ( i imagine )
09:11:52 <knebulae> @ProfessorHaxaemo: I knew DOS was written by one guy for CP/M as per my recollection. But the details are hazy. So I wondered just how long it actually took. I'm not sure any account I've ever read talks about it.
09:14:08 <ProfessorHaxaemo> demonstrated june 1979, shipped in november. that's all I got to work with here.
09:14:35 <knebulae> @ProfessorHaxaemo: thanks man
09:14:37 <ProfessorHaxaemo> https://en.wikipedia.org/wiki/86-DOS
09:15:39 <knebulae> Wow, my memory is bad. Lol.
09:15:43 <kingoffrance> ProfessorHaxaemo: you can buy em on ebay. i had a green and amber but gone now. i even saw a light blue one once :/ there may be red too.
09:16:15 <aalm> https://twitter.com/whitequark/status/1091391736616759307 :_D:D:D
09:16:51 <aalm> awesome.
09:18:22 <ProfessorHaxaemo> april 26 1981 was version 1.0
09:19:38 <ProfessorHaxaemo> err 28, wow
09:20:16 <ProfessorHaxaemo> in their defense, they didn't have osdev.org back then
09:21:37 <ProfessorHaxaemo> thanks guize :)
09:22:12 <knebulae> @aalm: so they're using qemu to execute the bios/firmware code from x86 peripherals?
09:22:25 <knebulae> on AARCH64?
09:22:59 <kingoffrance> knebulae: probably not helpful, maybe such is already done. i did notice for my "sort" library (wrapper around qsort etc.) the manpages all listed o^n etc. complexity, so i made that another "charset" (or enum, but probably need some way for "custom" specifiers as well) so a program at runtime can choose algorithm/backend based on type of data to sort (really, ill probably make another...
09:23:01 <kingoffrance> ...backend for that). that can be extended to anything, if one has the time to "tag" all their code thusly. you could simply tag things "good algorithm for < 8 cores" "good for > 64" etc.
09:23:19 <kingoffrance> that would be better than nothing anyway, without having to take "statistics"
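[editor's sketch] The tagging idea above (label each backend with the range it is good for, then pick one at runtime) might look roughly like the following. This is a sketch under assumptions: `struct backend_tag` and `pick_backend` are hypothetical names, and the range is expressed in item counts, though the same table could just as well be keyed on core counts ("good for < 8 cores", "good for > 64").

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A hand-written tag describing where a backend is expected to do well.
 * Hypothetical structure; bounds are inclusive. */
struct backend_tag {
    const char *name;
    size_t min_items;
    size_t max_items;   /* SIZE_MAX means "no upper bound" */
};

/* Return the first backend whose tagged range covers nitems,
 * or NULL if no tag matches. First match wins, so table order
 * doubles as a priority order. */
const struct backend_tag *pick_backend(const struct backend_tag *tags,
                                       size_t ntags, size_t nitems)
{
    assert(tags != NULL || ntags == 0);
    for (size_t i = 0; i < ntags; i++) {
        if (nitems >= tags[i].min_items && nitems <= tags[i].max_items)
            return &tags[i];
    }
    return NULL;
}
```

A table such as `{"insertion", 0, 15}, {"quick", 16, 65535}, {"merge", 65536, SIZE_MAX}` would then route 10 items to the first entry and a million items to the last. The thresholds are the part that needs care: they are guesses unless backed by measurement on the target machine.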
09:23:44 <knebulae> @kingoffrance: right
09:23:49 <Ameisen> why do some programming languages allow utf-8 identifiers...
09:23:51 <Ameisen> such a terrible idea
09:24:01 <Ameisen> but, hey, zalgo identifiers are fnu
09:24:03 <Ameisen> fun*
09:34:21 <knebulae> the reason I speak to a "dynamic" (loaded word, sorry) approach to algorithm selection (and scheduling/tasking "strategy" really) is that from a conceptual approach, I am moving up one level w.r.t. allocation of processing resources. Kinda conceptually similar to a 4th level page map. So certain blocks of the system can have varying execution profiles; one block may be appropriate for daemons, while another is suitable for presenting the user with a
09:34:21 <knebulae> terminal or desktop. But ultimately all under the watchful eye of a "master" kernel. Non-homogeneous architectures (true AMP) could be managed by such a kernel.
09:35:35 <knebulae> Usermode "sleds" could allow applications to choose the most suitable execution profile provided by the system.
09:36:07 <knebulae> In my mind, a "sled" is what I would provide to Intel NICs to handle their business their own way in userland with kernel intervention.
09:36:24 <knebulae> I'm sorry if that's not the current/correct nomenclature.
09:37:07 <knebulae> s/with/without kernel intervention
09:37:20 <knebulae> s/nomenclature/terminology
09:37:35 <knebulae> my keys are typing faster than my brain is moving. sorry.
09:39:37 <immibis> geist: cut off the red and blue pins on your VGA plug? :P
09:41:09 <klys> geist, herc 720x384 green screen?
09:41:37 <klys> oh it's a wyse50 terminal. nm.
09:42:14 <immibis> also a lot of monitors have a colour temperature option, that basically tints the whole screen, and some let you define a custom tint as RGB values
09:42:32 <immibis> (i.e. it doesn't have to match any valid colour temperature)
09:43:13 <doug16k> reminds me of the vt100 manual. the author's tone slips into boasting about how awesome it is here and there, rightfully so
09:43:58 <Ameisen> hmm, my build is actually significantly faster in some of the tests.
09:43:59 <immibis> bcos_: "embedded stuff" could be SDN routers
09:44:02 <Ameisen> in others, significantly slower
09:49:52 <klys> s/384/348/
09:55:42 <kingoffrance> knebulae: i dont know proper terminology any better than you, havent even approached a scheduler. that example was closest ive gotten to such things. why i like "stackable" backends.
09:56:26 <kingoffrance> it of course could get time-consuming tagging things since many algorithms are likely one formula below number of items, then another above, so that complicates things
09:57:50 <knebulae> @kingoffrance: well, I certainly wouldn't want to categorize or tag anything from userspace.
10:03:39 <knebulae> @geist: not sure if you're still listening; if not, I get it. But it's important to note that some of the things I'm looking at are not feasible right now with core counts and workloads as they exist. The things I want to do become much more feasible when it's acceptable to just "throw away" or "not care" about a non-trivial (2 or more) number of cores.
10:04:17 <knebulae> and until the average person has at least 16, I don't see that as being a cost anyone would be willing to give up.
10:07:12 <knebulae> At that point it would be acceptable to have exclusive processors for user/supervisor mode.
10:13:03 <ProfessorHaxaemo> hmmm https://en.wikipedia.org/wiki/Jerry_Pournelle#Pournelle's_first_law
10:14:35 <knebulae> it opens a lot of doors with "trusted" execution of untrusted native code, which I feel is critical moving forward.
10:28:38 <knebulae> I will just pose this final q today: what happens to operating systems when the core count exceeds (maybe even vastly) the number of threads? So much code, and so much effort to assemble a symphony.
10:29:12 <kingoffrance> knebulae: i meant "can be time-consuming for programmer even to tag by hand" i.e. add an "attribute" to each "class" or "object" if one was doing this for broad swaths of code, not just "scheduler implementations" or whatever
10:29:40 <knebulae> @kingoffrance: gotcha
10:31:10 <kingoffrance> thinking of having to add such "annotations" to all of libc or boost :/
10:44:37 <immibis> knebulae: the whole point of user mode is to execute untrusted native code
10:45:13 <immibis> the fact that we can't run untrusted native code in user mode on any widely used OS is a major design flaw in those OSes
10:46:35 <knebulae> @immibis: fair enough; but I consider what I'm talking about and usermode to be semantically different.
10:47:29 <knebulae> but perhaps that distinction was made for me where one previously did not exist (through fundamental architectural missteps)
10:48:02 <immibis> I don't even know what you are talking about
11:11:43 <geist> i'm starting to agree with immibis here
11:11:55 <immibis> geist: I don't even know what knebulae is talking about
11:11:56 <geist> you're talking about things in a highly abstract way that is frankly a little hard to follow
11:12:08 <geist> or at least hard to follow as the nuts n' bolts kind of guy i am
11:19:12 <kingoffrance> knebulae: i dunno hardware or software wise, but human/language wise, re: what happens to operating systems when the core count exceeds (maybe even vastly) the number of threads?
11:19:50 <knebulae> @geist: I want to run a full time system processor and keep other cores in usermode. High bandwidth peripherals like GPUs and NICs can use the "interrupt to usermode" method (sled). Never leaving the kernel on the bsp, and maintaining full control over the APs.
11:20:17 <knebulae> Throw in a foreign arch or two on a high speed bus
11:20:25 <knebulae> Then I'd be cooking
11:20:51 <kingoffrance> same thing that happened when "computers" are no longer people, but mostly machines. the definitions change, or new terms are invented. same with now there are millions of kings ("we the people" in u.s.) ... the language will adapt, if unconsciously
11:21:10 <kingoffrance> i can guarantee the language/terms will warp at that point :/
11:24:15 <knebulae> but right now IPC through IPI is just not performant
11:27:11 <Griwes> that's not a word
11:29:14 <immibis> knebulae: sounds like the kernel is just another task and should be scheduled as such
11:29:22 <immibis> if you have less tasks than cores, then by all means dedicate a core to it
11:30:25 <knebulae> @Griwes: you and I must not share a common definition of the word "word" :)
11:30:46 <Griwes> "performant" is not a word
11:30:47 <knebulae> @Griwes: it may be slang or improper, but it is certainly a word.
11:30:50 <Griwes> nope
11:30:59 <immibis> kingoffrance: then everyone gets more parallel until the number of threads exceeds or adapts to the number of cores
11:31:07 <immibis> kingoffrance: same way people utilize GPUs now
11:32:04 <knebulae> @Griwes: and even worse, you knew what I meant; so it actually fulfilled its intended purpose.
11:32:43 <Griwes> it's still not a word
11:32:55 <kingoffrance> "c" used to be "high level language"...modern books, not so much
11:44:40 <nyc`> Idle threads? Wait for interrupt etc.?
11:46:15 <nyc`> kingoffrance: High-level language is pretty squishy. One minute it's a macro assembler, the next it's Haskell.
11:57:18 <immibis> your party receives 50 XP from this encounter. your language levels up! your language is now a high level.
11:58:09 <clever> immibis: https://twitter.com/dmofengineering