Search logs: #osdev2 - 9 June 2022

#osdev2 = #osdev @ Libera from 23may2021 to present

#osdev @ OPN/FreeNode from 3apr2001 to 23may2021

http://bespin.org/~qz/search/?view=1&c=osdev2&y=22&m=6&d=9

Thursday, 9 June 2022

00:23:00 <radens> Is there something like the bochs magic breakpoint for qemu?
00:27:00 <zid`> All fixed, anyway
00:27:00 <zid`> Turns out the mistake was dumb, and I absolutely should have known where to look
00:27:00 <zid`> given I *knew* where I'd made code changes previously
00:27:00 <gog> mew
00:28:00 <mrvn> radens: that int 3 thing?
00:28:00 <radens> mrvn:
00:29:00 <zid`> The chain is restored
00:29:00 <zid`> https://cdn.discordapp.com/attachments/465586075830845475/984252575413075978/unknown.png
00:29:00 <zid`> for when you absolutely need to chain windows -> linux -> boros -> gameboy together
00:29:00 <heat> sup gog
00:29:00 <radens> mrvn: xchgw bx, bx will break to the bochs debugger and is a nop for normal software. I've seen arm hint space instructions used similarly
00:30:00 <zid`> I do kinda miss the bochs magic breakpoint
00:30:00 <heat> sometimes it do be like that
00:30:00 <heat> get a little tipsy, call her, tell her you're sorry and that you miss her
00:30:00 <gog> yeah afaik there's nothing like that in qemu, you actually have to do int3 and have a debugger attached
00:31:00 <radens> Oh so if you have gdb attached to the qemu remote and you do int3 it will break to gdb?
00:31:00 <heat> i don't think so?
00:32:00 <zid`> or just throw a completely invalid instruction in or something and -d int and wait for the fault to happen :P
00:32:00 <gog> no i think there's more to it than that
00:32:00 <mrvn> shouldn't be too hard to make the TCG translate xchgw bx, bx into int3
00:32:00 <zid`> void debug_magic(long n){ asm volatile("mov %0, %%rax; ud2" :: "r"(n) : "rax"); }
00:32:00 <radens> I mean I could patch xchgw bx, bx to call into the gdb stub
00:32:00 <heat> xchg breaks to the emulator's debugger, int3 doesn't work here
00:33:00 <heat> int3 will just trigger a VM-internal trap
00:33:00 <zid`> you can even pass a fault code that way :p
00:34:00 <heat> linux's BUG_ON kinda works like that
00:34:00 <mrvn> zid`: but that fails when you don't have a debugger attached
00:34:00 <heat> https://elixir.bootlin.com/linux/latest/source/arch/x86/include/asm/bug.h#L66
00:34:00 <bslsk05> elixir.bootlin.com: bug.h - arch/x86/include/asm/bug.h - Linux source code (v5.18.2) - Bootlin
00:34:00 <zid`> nod, it makes a lot of sense to do it like that irl
00:34:00 <zid`> if you don't have spooky magic
00:35:00 <heat> as far as I can see it's just so you have a fully consistent register dump
00:35:00 <heat> instead of doing $weird_stuff
00:36:00 <heat> you'd still need to pretend it's an interrupt entry and exit and push/pop stuff
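[editor's sketch] The Bochs magic breakpoint radens describes can be wrapped in a macro. This is a minimal sketch, assuming GCC or Clang inline asm and that Bochs is configured with `magic_break: enabled=1`; the names `MAGIC_BREAK`, `checkpoint` and `checkpoints_hit` are made up for illustration. On real hardware and in QEMU the instruction just swaps a register with itself, so the code keeps running.

```c
#include <assert.h>

#if defined(__x86_64__) || defined(__i386__)
/* xchg bx, bx: architecturally a no-op; Bochs treats it as a breakpoint
 * when magic_break is enabled, everything else just executes it. */
#define MAGIC_BREAK() __asm__ volatile("xchgw %%bx, %%bx" ::: "memory")
#else
/* Not x86: compile the marker away (assumption made for portability). */
#define MAGIC_BREAK() ((void)0)
#endif

/* Hypothetical counter so the effect is observable without a debugger. */
static int checkpoints_hit;

/* Marks a point of interest and hands the id back unchanged. */
int checkpoint(int id)
{
    assert(id >= 0);
    MAGIC_BREAK();
    checkpoints_hit++;
    return id;
}
```

Calling `checkpoint(7)` under Bochs should drop into its debugger at the `xchg`; anywhere else it only bumps the counter.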
04:20:00 <zid`> Pushed an update to my gameboy code, figured it deserved it after I spent so much time staring at it in qemu
04:20:00 <zid`> It now runs an additional bizarre scene demo!
04:20:00 <zid`> without any corruption on the last few rows
04:53:00 <Jari--> hi
04:53:00 <Jari--> zid`
08:45:00 <mrvn> is anyone using 8.8.8.8 for dns and has slow lookups?
12:56:00 <ddevault> I can allocate memory in userspace :D
12:58:00 <zid`> I can't, allocators are annoying :P
13:06:00 <mrvn> ddevault: (s)brk or something sensible?
13:06:00 <ddevault> something sensible
13:06:00 <ddevault> mmap, essentially
13:07:00 <ddevault> the kernel design is based on seL4
13:07:00 <ddevault> s/based on/inspired by/
13:40:00 <heat> brk go brrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrk
14:20:00 * Andrew pushes themselves onto the stack
15:02:00 <ddevault> bloody PIT
15:11:00 <ddevault> hmmm
15:11:00 <ddevault> the PIT stops firing when I jump to userspace
15:11:00 <j`ey> accidentally left IRQs masked?
15:12:00 <ddevault> if they were masked wouldn't they not create interrupts in the kernel, either?
15:23:00 <j`ey> ddevault: yeah afaik, but it sounds like you aren't getting interrupts at all..?
15:23:00 <ddevault> no
15:23:00 <ddevault> I am getting PIT interrupts before jumping to userspace, but not after
15:25:00 <ddevault> afk
15:28:00 <j`ey> if you're using qemu or something, maybe you can check if interrupts are enabled
16:09:00 <heat> maybe you're setting the wrong eflags
16:13:00 <heat> does anyone know of a way to measure tlb shootdowns for a process in Linux?
17:15:00 <geist> mrvn: did your dns resolve?
17:15:00 <geist> i use 8.8.8.8 but dont seem to have particularly slow lookups right now
17:15:00 <geist> ddevault: when you jump to user space you load a new eflags via the iret or sysexit
17:15:00 <geist> you can easily accidentally leave IRQs masked
17:57:00 <FatAlbert> i guess this is what a channel looks like when 90% of the people in it know what they're doing
17:58:00 <heat> we pretend, yes
17:59:00 <FatAlbert> i think i'll study CS in Uni
18:00:00 <FatAlbert> as opposed to #linux where 90% of the people there probably use windows anyway and that's why the chat is complete nonsense
18:01:00 <FatAlbert> at least here when people talk ... i learn something
18:06:00 <FatAlbert> so don't get me wrong but i hope your computer will break in some spectacular way
18:30:00 <sham1> Let's not hope that the breakage is too spectacular
18:30:00 <sham1> That might easily lead to you being admitted to a hospital
20:00:00 <heat> you should all use 1.1.1.1
20:00:00 <heat> just saying
20:00:00 <sortie> heat, enough! 1.1 will NEVER happen
20:01:00 <sortie> No way you're getting a 1.1.1 patch release
20:01:00 <sortie> And 1.1.1.1 is a pipe dream
20:01:00 <heat> :D
20:01:00 <heat> if 1.1.1.1 is a pipe dream, what's 192.168.0.1?
20:01:00 <sortie> /16 like god intended
20:02:00 <gamozo> I use IPV6 but then use /128 to make sure it's pointless
20:02:00 <gamozo> I change my IP every packet to prevent being hacked
20:08:00 <mrvn> geist: https://paste.debian.net/1243576 dig works fine but ping sleeps 5s for some reason.
20:08:00 <bslsk05> paste.debian.net: debian Pastezone
20:09:00 <mrvn> I don't even get why it's doing the same request 3 times
20:11:00 <ddevault> geist: hm, aight
20:12:00 <mrvn> Same issue with 1.1.1.1 by the way.
20:13:00 <heat> glibc right?
20:13:00 <mrvn> me?
20:13:00 <heat> i've seen some pretty weird behavior wrt nss when it was misconfigured
20:15:00 <mrvn> normal Debian install. I only configured dhcp to leave the resolv.conf alone.
20:17:00 <mrvn> It looks like the first message to 8.8.8.8 contains 2 DNS queries and only gets one reply. Waiting for a second times out and then it sends each request again separately. Do I see that right?
20:18:00 <heat> that looks correct yes
20:20:00 <heat> let me trace mine
20:20:00 <heat> mrvn, how does your resolv look?
20:20:00 <mrvn> Do you get the same when you strace "ping www.debian.org"?
20:20:00 <bslsk05> www.debian.org: Debian -- The Universal Operating System
20:20:00 <mrvn> "nameserver 8.8.8.8"
20:20:00 <heat> ok lets see
20:20:00 <mrvn> I'm guessing the 2 dns requests are IPv4 and IPv6
20:22:00 <heat> mine only sends a single DNS request it seems
20:22:00 <heat> sendto(5, "J\254\1\0\0\1\0\0\0\0\0\0\0017\0017\0010\0010\0010\0010\0010\0010\0010\0010"..., 90, MSG_NOSIGNAL, NULL, 0) = 90
20:22:00 <heat> then a poll, FIONREAD, recvfrom and close
20:23:00 <mrvn> 90 bytes seems long
20:23:00 <mrvn> 48, 60 or 108 byte reply?
20:24:00 <heat> 122
20:24:00 <heat> lets wireshark it
20:25:00 <heat> ok....
20:25:00 <heat> i'm wrong
20:25:00 <heat> it's 2 40 byte packets
20:26:00 <heat> and then i get two responses
20:27:00 <heat> aha I was looking at the wrong thing
20:27:00 <heat> its connecting to 8.8.8.8 twice
20:27:00 <heat> wait, makes sense, it's for the reverse lookup of the IP addr
20:28:00 <heat> so, two 32 byte msgs sent by sendmmsg, no timeouts and a 48 byte response and a 60 byte response
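[editor's sketch] The "two 32 byte msgs" figure can be checked by encoding the queries by hand: a DNS query is a 12-byte header, the name as length-prefixed labels plus a root byte, then a 2-byte type and a 2-byte class. A minimal sketch, assuming one question per message and the recursion-desired flag set (as in heat's `sendto` dump); `dns_build_query` and the enum names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { DNS_TYPE_A = 1, DNS_TYPE_AAAA = 28, DNS_CLASS_IN = 1 };

/* Encodes a single-question query into buf.
 * Returns the number of bytes written, or 0 if the name is malformed
 * or does not fit in cap bytes. */
size_t dns_build_query(uint8_t *buf, size_t cap, uint16_t id,
                       const char *name, uint16_t qtype)
{
    assert(buf != NULL && name != NULL);
    size_t n = 12;                        /* fixed header size */
    if (cap < n)
        return 0;
    memset(buf, 0, 12);
    buf[0] = (uint8_t)(id >> 8);
    buf[1] = (uint8_t)(id & 0xff);
    buf[2] = 0x01;                        /* flags 0x0100: recursion desired */
    buf[5] = 1;                           /* QDCOUNT = 1 */

    while (*name) {                       /* one length-prefixed label per part */
        const char *dot = strchr(name, '.');
        size_t len = dot ? (size_t)(dot - name) : strlen(name);
        if (len == 0 || len > 63 || n + 1 + len > cap)
            return 0;
        buf[n++] = (uint8_t)len;
        memcpy(buf + n, name, len);
        n += len;
        name += len;
        if (*name == '.')
            name++;
    }
    if (n + 5 > cap)
        return 0;
    buf[n++] = 0;                         /* root label terminates the name */
    buf[n++] = (uint8_t)(qtype >> 8);
    buf[n++] = (uint8_t)(qtype & 0xff);
    buf[n++] = 0;
    buf[n++] = DNS_CLASS_IN;
    return n;
}
```

For `www.debian.org` the name takes 16 bytes (1+3, 1+6, 1+3, plus the root byte), so both the A and the AAAA query come out at 12 + 16 + 4 = 32 bytes.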
20:28:00 <heat> everything works here
20:30:00 <heat> mrvn, how does your wireshark look?
20:30:00 <heat> if you don't get two responses, you may have a broken firewall or router in the way I guess
20:35:00 <mrvn> standard query for A, AAAA, response for A, 5s pause, query for A, response A, query AAAA, response AAAA
20:36:00 <mrvn> It looks like the second query is lost because it's sent as a separate frame
20:41:00 <heat> hm?
20:42:00 <heat> it's not lost
20:42:00 <heat> you're just losing the response
21:13:00 <zid`> It's not lost, heat just can't find it and doesn't know where it is
21:14:00 <zid`> ddevault: Did you fix your EFLAGS?
21:15:00 <ddevault> I will be looking into that tomorrow
21:16:00 <zid`> It should just be that the stack frame you constructed that iret pops hasn't got interrupt enable bit set in eflags, from what you said
21:16:00 <ddevault> that's probably it
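[editor's sketch] The fix zid` and geist point at is setting the interrupt-enable flag (IF, bit 9) in the flags image that `iret` pops. A minimal sketch, assuming a 64-bit kernel and the `iretq` frame layout; ddevault's actual code is not shown in the log, so `struct iret_frame`, `make_user_frame` and the selector values passed in are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define RFLAGS_RESERVED1 (1ull << 1)   /* bit 1 is reserved and reads as 1 */
#define RFLAGS_IF        (1ull << 9)   /* maskable interrupts enabled */

/* What iretq pops in 64-bit mode, lowest address first. */
struct iret_frame {
    uint64_t rip;
    uint64_t cs;
    uint64_t rflags;
    uint64_t rsp;
    uint64_t ss;
};

/* Builds the frame for the first jump to user mode. Leaving RFLAGS_IF out
 * here means the timer keeps ticking in the kernel but goes silent as soon
 * as user code runs. */
struct iret_frame make_user_frame(uint64_t entry, uint64_t stack,
                                  uint16_t user_cs, uint16_t user_ss)
{
    /* User selectors are expected to carry RPL 3. */
    assert((user_cs & 3) == 3 && (user_ss & 3) == 3);

    struct iret_frame f = {
        .rip    = entry,
        .cs     = user_cs,
        .rflags = RFLAGS_RESERVED1 | RFLAGS_IF,
        .rsp    = stack,
        .ss     = user_ss,
    };
    return f;
}
```

The resulting flags value is 0x202; a frame built with 0x2 or 0 would match the symptom described above.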
22:42:00 <gorgonical> Do I "have" to store the address of the per_cpu data area in a register?
22:42:00 <gorgonical> I guess I do, don't I?
22:43:00 <mjg_> what arch is this? amd64?
22:43:00 <gorgonical> risc-v
22:43:00 <mjg_> oh, no opinion :)
22:44:00 <gorgonical> I'm trying to figure out how Linux does this: I think linux uses tpidr_el0 for TLS, and then tpidr_el1 for PDA, and then I forget how/where the task_struct is stored
22:45:00 <gorgonical> RISC-V so far I think uses tp register for TLS, then CSR_SCRATCH for task_struct, and I have no idea where the PDA is stored
22:46:00 <mrvn> Do you want to access the per_cpu data area often and fast?
22:47:00 <gorgonical> Even if I didn't, wouldn't I still need some way for the CPUs to know their index in the central directory?
22:47:00 <mrvn> the cpu normally has an opcode to get the ID
22:49:00 <gorgonical> There *is* a uscratch register that maybe I can shove this into. That saves one dereference
22:49:00 <gorgonical> My fear is that libc or something already uses uscratch
22:49:00 <mrvn> How many registers do you have that the kernel can write to but user can't?
22:50:00 <heat> gorgonical, there's a tls register iirc
22:50:00 <gorgonical> Yeah, usually tp, I think that's like x4
22:51:00 <heat> yeah
22:51:00 <heat> I use it for my percpu data in the kernel since it's, well, my tls for all intents and purposes
22:51:00 <mrvn> can't userspace write to x4?
22:52:00 <gorgonical> But in user-mode thats tls storage. The kernel exception handler I'm stealing from linux shoves the task_struct* into the tp register
22:52:00 <gorgonical> mrvn: yes but that's why the kernel stores whatever it wants in one of these scratch registers
22:52:00 <heat> you keep it in the scratch register and swap it
22:52:00 <heat> it's like the swapgs stuff
22:53:00 <heat> https://github.com/heatd/Onyx/blob/master/kernel/arch/riscv64/interrupts.S#L16
22:53:00 <bslsk05> github.com: Onyx/interrupts.S at master · heatd/Onyx · GitHub
22:53:00 <heat> https://github.com/heatd/Onyx/blob/master/kernel/arch/riscv64/scheduler.cpp#L225-L228
22:53:00 <bslsk05> github.com: Onyx/scheduler.cpp at master · heatd/Onyx · GitHub
22:54:00 <mrvn> For my kernel design I need at least 2 kernel only scratch registers.
22:56:00 <gorgonical> Do you have to make a call to the SBI firmware to read the hart id?
22:56:00 <gorgonical> Manual says hartid is machine read-only
22:57:00 <mrvn> don't you have to call the firmware for everything on risc-v?
22:57:00 <gorgonical> what do you mean
22:57:00 <mrvn> nothing, just hearsay
22:59:00 <heat> yes
23:00:00 <heat> you need to call sbi
23:00:00 <heat> mrvn is mostly right :)
23:00:00 <gorgonical> well that's not gonna work for percpu then is it
23:00:00 <gorgonical> a call down to sbi to fetch cpuid to index a percpu pda directory
23:02:00 <gorgonical> okay so the chip does not have the N extensions so it does not have uscratch
23:03:00 <gorgonical> arm really gets out easy here because it has a tpidr_el0 that riscv doesn't have a direct equivalent for
23:04:00 <gorgonical> i guess maybe you could stash the pda ptr in mscratch, but sbi probably won't allow that
23:04:00 <heat> what are you doing
23:05:00 <heat> whats a "pda"?
23:05:00 <gorgonical> per-cpu data
23:05:00 <mrvn> public display of affection
23:05:00 * heat kisses mrvn
23:05:00 <klange> We'll have none of that here!
23:05:00 <gorgonical> personal digital assistant even
23:05:00 <heat> gorgonical, for the kernel?
23:05:00 <gorgonical> yes
23:05:00 <klange> I only do Pocket PCs.
23:06:00 <heat> gorgonical, why can't you use sscratch?
23:06:00 <gorgonical> sscratch already has the kernel task_struct* in it
23:06:00 <gorgonical> for context switching and all that
23:06:00 <heat> if you're using linux, they solved that problem I guess
23:06:00 <mrvn> why not store per core data there and task_struct in per core data
23:06:00 <gorgonical> they did, but I can't figure out how. The per-cpu code is very abstracted
23:06:00 <mrvn> ?
23:07:00 <klange> I can walk through what I'm doing in misaka
23:07:00 <gorgonical> mrvn: that's probably a good idea
23:07:00 <gorgonical> that would require updating the task_struct variable whenever you switch_to though
23:07:00 <heat> klange, you have riscv now?
23:07:00 <klange> no, aarch64
23:08:00 <heat> this is riscv
23:08:00 <mrvn> gorgonical: obviously
23:08:00 <heat> gorgonical, btw, you're wrong (as I suspected) https://elixir.bootlin.com/linux/latest/source/arch/riscv/kernel/entry.S#L28
23:08:00 <bslsk05> elixir.bootlin.com: entry.S - arch/riscv/kernel/entry.S - Linux source code (v5.18.3) - Bootlin
23:08:00 <klange> > tpidr_el0
23:08:00 <klange> this isn't
23:08:00 <heat> wait maybe you're not wrong sorry
23:08:00 <heat> i didn't read everything that went after that
23:08:00 <heat> but they do their loading right there
23:08:00 <gorgonical> I only sort of understand how this is done on ARM64 so my understanding of how they solved this problem is vague
23:08:00 <gorgonical> heat: what do you mean about the loading?
23:09:00 <heat> loading of the tp
23:10:00 <gorgonical> Yeah. tp in userland points to tls. In kernel land they want it to point to the task_struct. So first insn is to swap them. CSR_SCRATCH contains the task_struct ptr. Then all the context switching is with the kernel-tp
23:11:00 <gorgonical> I've had a busy day so maybe I overlooked it, but the thread_info struct at the start of the task_struct struct doesn't appear to contain the pda
23:11:00 <mrvn> why should the task struct have anything about the per core data?
23:12:00 <gorgonical> It shouldn't, but the thread_info might point to it I suppose
23:12:00 <klange> On ARM64, I tell the compiler x18 is reserved and then stick the per-core pointer in there. I stole that from geist. That's only in the kernel; x18 is swapped on context switch with everything else
23:13:00 <gorgonical> klange: That might be the thing to do
23:13:00 <zid`> sounds like mips where k0 is free for the kernel or whatever
23:13:00 <klange> In userspace, tpidr_el0 is the thread pointer. This is fully controlled by userspace and dutifully restored by the kernel. Then gcc's built-in understanding of thread-locals takes over.
23:13:00 <klange> The important thing to note is the difference between per-core and per-thread. An execution context that is per-thread does not unexpectedly change between function calls, but a per-core one definitely can if one of those function calls is a context switch.
23:14:00 <klange> So trying to convince the compiler to use thread-local stuff for your per-core stuff is a no-no, as it could elide a load after a function call; it can't do that if you tell it to always reference based on the register
23:15:00 <mrvn> Another thing that's nicer without per task kernel stack. per-core never changes while inside the kernel.
23:15:00 <gorgonical> I see. I mean the naive way is to use cpuid as an index, but I think that's pretty slow on riscv
23:15:00 <klange> The option to tell gcc and clang that x18 is reserved is `-ffixed-x18`. You can also make that part of your ABI spec and bake it in... which Apple does in userspace on macOS for stupid legacy reasons.
23:16:00 <heat> gorgonical, the proper way is to keep the percpu data pointer in sscratch and tp
23:16:00 <mrvn> gorgonical: and every other archs too
23:16:00 <klange> It's slow everywhere. Reserving a register is better because register-based addressing is universally faster.
23:16:00 <gorgonical> heat: what do you mean?
23:17:00 <klange> It's even faster than the tpidr lookups on ARM, since those still need a cycle or two to pull the msr out into a general register anyway - which is why gcc and clang will happily elide that operation when they can (*if they are doing it as part of native TLS)
23:17:00 <heat> gorgonical, erm. just keep it there
23:18:00 <heat> what more do you need?
23:18:00 <klange> I recently did a thing on macOS to manually do TLS operations for my interpreter because macOS's default is always calling out to library functions and using (basically) GOT callbacks
23:18:00 <gorgonical> and then put a ptr to the currently executing task in the percpu?
23:18:00 <heat> yes
23:18:00 <heat> that's what I do
23:19:00 <gorgonical> I think that's the best option I have unless I discover linux does something really smart
23:19:00 <gorgonical> thank you
23:19:00 <klange> I put something a level above the task, but maybe that's a design mistake on my part
23:19:00 <heat> i mentioned this option like 20 minutes ago xD
23:19:00 <gorgonical> you did but I got confused about who mentioned what
23:19:00 <mrvn> You can chain it all from bottom to top: core -> thread -> process -> group
23:20:00 <klange> fun fact, for the userspace thread pointer macos uses tpidrro_el0 instead of tpidr_el0
23:20:00 <klange> UNLIKE EVERYONE ELSE
23:20:00 <heat> what's tpidrrrrrrrrrrrrrrrrrrrrrro
23:20:00 <klange> which means __builtin_thread_pointer does the wrong thing
23:20:00 <klange> it's like tpidr but read only (and also it's a different register)
23:20:00 <heat> sounds like __builtin_thread_pointer needs to be fixed for the darwin targets
23:21:00 <klange> yeah, not sure why it hasn't been fixed to return the right thing
23:21:00 <mrvn> That sucks, no user space threads that way.
23:21:00 <klange> I assume because it's part of a gcc compatibility thing that they only care about on linux
23:21:00 <klange> it's only read-only in the direct sense
23:21:00 <klange> you can still use a syscall to set it
23:22:00 <klange> and they use it the same way in the end
23:22:00 <klange> _except_ that they push it all behind hooks that the dynamic linker sets up
23:22:00 <klange> rather than actually linking slot lookups
23:22:00 <klange> so everything is slow as hell
23:22:00 <heat> 10 bucks on there being a massive exploit in the silicon and they abstracted it that way so you can't set bad tp values
23:22:00 <klange> and thus we come to this fantastic bit of code: https://github.com/kuroko-lang/kuroko/blob/master/src/kuroko/vm.h#L219-L223
23:22:00 <bslsk05> github.com: kuroko/vm.h at master · kuroko-lang/kuroko · GitHub
23:23:00 <mrvn> kind of defeats the purpose of user space threads if you have to syscall to swap threads.
23:23:00 <klange> This inlines the thread slot lookup _like every other platform does normally_, so thread-local storage is just as fast as it is on Linux, or ToaruOS.
23:23:00 <heat> you almost always have to use a syscall to swap threads
23:24:00 <heat> fsgsbase is super recent in the grand scheme of things
23:24:00 <mrvn> heat: tls too
23:24:00 <heat> fsgsbase is like 2014 recent
23:24:00 <heat> not 2001 recent
23:25:00 <mrvn> post c99, basically still wet paint :)
23:25:00 <heat> huh
23:25:00 <heat> how old is tls actually?
23:26:00 <heat> like as an actual concept
23:26:00 <mrvn> I would use tpidrro_el0 for the shared kernel/user thread pointer and tpidr_el0 for tls.
23:26:00 <heat> i used ~linux's nptl date
23:26:00 <heat> mrvn, you'll leak a kernel address that way
23:27:00 <mrvn> https://en.cppreference.com/w/cpp/keyword/thread_local c++11
23:27:00 <bslsk05> en.cppreference.com: C++ keywords: thread_local (since C++11) - cppreference.com
23:27:00 <klange> tpidr_el0 on macOS _appears_ to be the core ID.
23:27:00 <mrvn> heat: obviously. It's shared. Things like the pid and tid.
23:27:00 <klange> Not the thread ID, not a _pointer_ to a core struct. Just a number for the core you are on.
23:27:00 <klange> The most useless thing ever.
23:28:00 <mrvn> So if userspace writes to tpidr_el0 the core suddenly is a different core to the kernel?
23:29:00 <klange> tpidrro_el0 is the base of the thread-local data, which uses the descriptor slot approach; the rest of the thread struct is behind it at a fixed offset (basically it points into the middle of a struct, where a big array of pointers happens to start, which is typical)
23:29:00 <heat> can't you switch tls models?
23:30:00 <heat> or do they just not support it
23:30:00 <klange> models? there are no models on macOS
23:30:00 <klange> The only thing that has different TLS models is ELF.
23:30:00 <mrvn> Another of those optimizer things. Setting the tpidrro_el0 is slow so you make it point to a pointer to thread local.
23:30:00 <gorgonical> Update: it actually seems like Linux does the simple thing and just takes the processor ID as an index
23:30:00 <klange> (I _wish_, initial-exec is what I want for my interpreter, of course)
23:30:00 <heat> gorgonical, that's horrific
23:30:00 <klange> (inlined dynamic isn't really that slow, though, if done properly)
23:30:00 <heat> the x86 code doesn't
23:31:00 <gorgonical> In the default case it does anyway
23:31:00 <gorgonical> https://elixir.bootlin.com/linux/latest/source/include/asm-generic/percpu.h
23:31:00 <bslsk05> elixir.bootlin.com: percpu.h - include/asm-generic/percpu.h - Linux source code (v5.18.3) - Bootlin
23:31:00 <mrvn> gorgonical: sure that isn't just the fallback in case it has no spare scratch register?
23:31:00 <heat> I was looking at that a few minutes ago and it seems they only bothered to optimise the x86 code
23:31:00 <gorgonical> There's no percpu.h in several of the arch's
23:31:00 <gorgonical> mrvn: That's what I think
23:31:00 <heat> no, you're correct
23:31:00 <gorgonical> And riscv doesn't seem to have a percpu.h def so I guess that's what it does here
23:31:00 <mrvn> if you have no other way using the cpu ID as index into an array works universally.
23:32:00 <heat> https://elixir.bootlin.com/linux/latest/source/arch/riscv/include/asm/smp.h#L60
23:32:00 <bslsk05> elixir.bootlin.com: smp.h - arch/riscv/include/asm/smp.h - Linux source code (v5.18.3) - Bootlin
23:32:00 <mrvn> damn, I wanted to do some more work on my kernel this week and it's friday already.
23:32:00 <klange> Symbol points to the descriptor, descriptor has key (index into the thread-local pointer array) + offset (because keys can be shared by many thread-locals), then you do tp[key]+offset and try to inline that as best as you can...
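[editor's sketch] klange's `tp[key]+offset` lookup, written out. A minimal sketch, assuming the thread pointer addresses an array of per-thread block pointers; the `struct tlv_descriptor` layout here is simplified and hypothetical (a real descriptor may carry more fields, such as a fallback function), and returning NULL for an unallocated slot stands in for whatever slow path a real runtime would take.

```c
#include <assert.h>
#include <stddef.h>

/* One descriptor per thread-local variable (simplified, hypothetical). */
struct tlv_descriptor {
    unsigned long key;     /* index into the per-thread pointer array */
    unsigned long offset;  /* byte offset inside the block that key selects */
};

/* Inlined lookup: tp[key] + offset.
 * tp is the per-thread slot array the thread pointer register refers to. */
static inline void *tlv_address(void *const *tp, const struct tlv_descriptor *d)
{
    assert(tp != NULL && d != NULL);
    char *block = tp[d->key];
    if (block == NULL)
        return NULL;       /* slot not allocated yet for this thread */
    return block + d->offset;
}
```

Several variables can share one key and differ only in offset, which is why the descriptor carries both.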
23:33:00 <klange> anyway I made my interpreter speedy on macos by abusing knowledge of how the thread local storage model works and it's great, the end
23:33:00 <klange> thank you apple for at least having these parts of macos be open-source
23:33:00 <heat> yes, that's the standard for dynamically linked objects in linux as well afaik
23:34:00 <gorgonical> heat: so linux does what you suggested; store the cpuid that the thread is currently running on and use that
23:34:00 <klange> (also for various reasons they can't change this abi, so this inlining of what dyld's hooks do is totally safe)
23:34:00 <mrvn> Looking back over how long this TLS discussion is running I really have to ask: Aren't threads more trouble than they are worth?
23:34:00 <gorgonical> or what mrvn suggested, I literally can't remember
23:34:00 <heat> gorgonical: no, i didn't suggest that, because that's slow
23:34:00 <klange> mrvn: probably
23:34:00 <heat> mrvn, no?
23:34:00 <heat> what's the alternative?
23:35:00 <klange> more processes
23:35:00 <mrvn> just imagine all the data races you avoid by not having threads :)
23:35:00 <klange> "I don't see the problem with that." - CPython
23:35:00 <gorgonical> heat: what's the alternative? This is faster than getting the cpuid directly isnt it
23:35:00 <mrvn> heat: factory model with message passing works nicely.
23:35:00 <gorgonical> Oh you're suggesting to directly put the ptr in, not the index
23:35:00 <gorgonical> yes that is better strictly
23:35:00 <heat> gorgonical, yes
23:36:00 <mrvn> gorgonical: if you store the ID in a register you might as well just store a pointer
23:36:00 <mrvn> one less memory access and addition
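[editor's sketch] The two schemes being compared, side by side. This is a userspace simulation: the two `reg_*` variables stand in for a kernel-only register (a reserved general-purpose register or a scratch CSR), and every name is hypothetical. Keeping the index costs a table load on each access; keeping the pointer does not, and the current task then hangs off the per-core struct as heat suggests.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4

struct task;                          /* opaque for this sketch */

struct percpu {
    unsigned cpu_id;
    struct task *current_task;        /* updated on every context switch */
};

static struct percpu percpu_area[NR_CPUS];
static struct percpu *percpu_dir[NR_CPUS];   /* the "central directory" */

/* Stand-ins for a register only the kernel can write. */
static unsigned reg_cpu_index;
static struct percpu *reg_percpu_ptr;

void percpu_init(void)
{
    for (unsigned i = 0; i < NR_CPUS; i++) {
        percpu_area[i].cpu_id = i;
        percpu_area[i].current_task = NULL;
        percpu_dir[i] = &percpu_area[i];
    }
}

/* Simulates "this code is now executing on core `cpu`". */
void run_on_cpu(unsigned cpu)
{
    assert(cpu < NR_CPUS);
    reg_cpu_index = cpu;
    reg_percpu_ptr = percpu_dir[cpu];
}

/* Index in the register: one extra load from the directory per access. */
struct percpu *this_cpu_via_index(void) { return percpu_dir[reg_cpu_index]; }

/* Pointer in the register: no table lookup at all. */
struct percpu *this_cpu_via_pointer(void) { return reg_percpu_ptr; }

void set_current(struct task *t) { this_cpu_via_pointer()->current_task = t; }
struct task *get_current(void) { return this_cpu_via_pointer()->current_task; }
```

Both accessors return the same struct; the difference is only the cost of getting there.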
23:36:00 <gorgonical> Makes me wonder why linux does it this way then
23:36:00 <heat> https://github.com/heatd/Onyx/blob/master/kernel/include/onyx/riscv/percpu.h
23:36:00 <bslsk05> github.com: Onyx/percpu.h at master · heatd/Onyx · GitHub
23:37:00 <mrvn> maybe some archs don't have enough bits for a pointer in the scratch register but enough for the core ID?
23:37:00 <heat> they just didn't bother optimizing any other arch
23:37:00 <heat> probably because their percpu code is so fucking obtuse
23:37:00 <heat> macros on top of macros
23:38:00 <j`ey> linux is all macros
23:38:00 <heat> the kernel is just a giant macro
23:39:00 <heat> by the way, re: my previous question, perf has a tlb shootdown tracing thing
23:39:00 <heat> perf has everything
23:39:00 <heat> it's almost as good as a io_uring eBPF folio
23:40:00 <klange> I should do a riscv port of Misaka.
23:40:00 <klange> I hear it's very similar to aarch64 anyway.
23:40:00 <heat> it's a very simple arch
23:41:00 <heat> my x86_64 code = 7110 lines, riscv64 = 2882 lines
23:53:00 <geist> yay fun discussions
23:53:00 <geist> and yeah riscv is fun in that it's the bare minimum, no fuss. just gets it dun
23:53:00 <geist> sometimes that's annoying, but most of the time it's just straightforward
23:55:00 <heat> sup geist
23:56:00 <geist> not much. workin. meetings
23:56:00 <geist> the usual
23:57:00 <heat> sounds horrible
23:57:00 <heat> i've spent the last week doing my onboarding
23:57:00 <mrvn> always remember: This too shall pass
23:58:00 <zid`> are you officially a whatever now?
23:58:00 <heat> cloudflare yes
23:58:00 <zid`> officially a cloud
23:58:00 <heat> an orange one, yes
23:58:00 <zid`> just don't get it confused with being a spider
23:58:00 <zid`> they're the same word in japanese
23:59:00 <heat> spiderflare sounds great
23:59:00 <zid`> spiderflare is my metal band
23:59:00 <zid`> The first single on our LP is called Inferno Venom
23:59:00 <heat> anyway if I have to go through another 1hr onboarding session I'll unlive myself
23:59:00 <heat> let me engineer pls
23:59:00 <zid`> are you going to write their cool blog posts
23:59:00 <heat> hope so