Search logs: #osdev - 8 February 2019

#osdev @ OPN/FreeNode from 3apr2001 to 23may2021

http://bespin.org/~qz/search/?view=1&c=osdev&y=19&m=2&d=8

Friday, 8 February 2019

12:00:02 <klange> Did you look at DT_INIT's value? Did you look at what was at DT_INIT's value + base address after you loaded the phdrs? What are you actually calling? Did you try to trace with gdb to see what was happening?
12:04:50 <mobile_c_> INIT = 0x55fac2275220
12:05:21 <mobile_c_> library[library_index].base_address = 0x7f693fdf2000
12:05:29 <mobile_c_> library[library_index].base_address + INIT->d_un.d_ptr = 0x7f693fdf4120
12:05:48 <mobile_c_> INIT->d_un.d_ptr = 0x2120
12:07:48 <nyc> I'm surprised at how difficult a time I'm having with this.
12:08:10 <nyc> This shouldn't be difficult at all.
12:09:08 <nyc> Maybe I've suffered an unnoticed blow to the head.
12:36:39 <doug16k> mobile_c_, assuming your compiler even uses that section, this should do it: __attribute__((__constructor__)) void test_constructor() {}
12:37:01 <mobile_c_> ok
12:37:31 <doug16k> it may not be, it may be configured to use DT_INITARRAY
12:39:05 <doug16k> if you have .init_array, ld will make a DT_INITARRAY record in the .dynamic section. I'm guessing having a .init output section would put a DT_INIT in there
12:39:36 <doug16k> I configured gcc to use init_array so I'm not sure. I definitely get DT_INITARRAY in my libraries
12:47:16 <doug16k> objdump -p dumps the .dynamic section at the end of its output
12:47:39 <doug16k> you don't even need to debug anything to see if init is there
12:50:42 <doug16k> then objdump -S and find text with the dt_init value
12:51:06 <doug16k> drop leading zeros
12:54:09 <doug16k> if you don't have a DT_* record you expect to have, you probably need to put that information in its own output section with the standard name
12:54:28 <doug16k> I didn't get any DT_INITARRAY until I moved it out of .rodata where I threw it into .init_array
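
The check described above (read the .dynamic entries, then compare DT_INIT plus the load base against what is actually being called) can be sketched in Python. This is a minimal sketch, assuming a little-endian ELF64 .dynamic section handed over as raw bytes. The function names are hypothetical. The tag values come from the ELF specification, where the array tag is spelled DT_INIT_ARRAY.

```python
import struct

# Dynamic tag values from the ELF specification.
DT_NULL = 0
DT_INIT = 12
DT_INIT_ARRAY = 25
DT_INIT_ARRAYSZ = 27

def parse_dynamic(section: bytes) -> dict:
    """Decode little-endian Elf64_Dyn entries (d_tag, d_un) up to DT_NULL."""
    entries = {}
    for offset in range(0, len(section) - 15, 16):
        tag, value = struct.unpack_from("<qQ", section, offset)
        if tag == DT_NULL:
            break
        entries[tag] = value
    return entries

def runtime_address(base: int, d_ptr: int) -> int:
    """Address of the init function once the library is mapped at base."""
    return base + d_ptr

def objdump_style(runtime: int, base: int) -> str:
    """Link-time offset as hex without leading zeros, for searching objdump -S output."""
    return format(runtime - base, "x")
```

With the values pasted earlier, objdump_style(0x7f693fdf4120, 0x7f693fdf2000) gives the string to search for in the disassembly, which avoids doing the hexadecimal subtraction by hand.
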
01:01:37 <mobile_c_> how do i get glibc to map x lib to a specific address
01:04:58 <klange> I'm not sure you do.
01:08:57 <nyc`> -fPIC is needed at minimum.
01:09:01 <doug16k> may I ask why you want x lib at a specific address?
01:14:14 <nyc`> I'm mostly unhappy about my code not working on MIPS.
01:14:53 <nyc`> This is not something that should be difficult.
01:18:26 <mobile_c_> doug16k: cus it makes it easy asf to compare the addresses then
01:18:51 <mobile_c_> i dont wanna do leads of hexadecimal subtraction ;-;
01:18:58 <mobile_c_> loads*
01:26:00 <mobile_c_> fuck my life
01:26:10 * mobile_c_ dies
01:27:05 * klange goes to mobile_c_'s funeral
01:51:55 <geist> nyc`: my experience when getting LK working on mips is that the toolchain is more fiddly than it should be
01:52:11 <geist> primarily because it has a lot of options, and i found the .sbss and .sdata stuff to be annoying
01:52:37 <geist> https://github.com/littlekernel/lk/blob/master/arch/mips/rules.mk#L39 i think helped somewhat
01:53:04 <geist> it turned off something annoying, may have been sdata/sbss
01:53:25 <geist> though i did end up trying to call it out https://github.com/littlekernel/lk/blob/master/arch/mips/linker.ld#L48
01:58:03 <geist> https://github.com/littlekernel/lk/blob/master/scripts/do-qemumips#L4 is the machine i ended up using
01:58:16 <geist> iirc it's a weird M14k + a sort of PC looking bus
03:06:21 <doug16k> geist, what does %KERNEL_BASE% do? what are the % signs?
03:08:42 <doug16k> ah I see, elsewhere sed replaces them
03:08:46 <klange> It's run through sed as a preprocessor, those are replaced with values from the Makefile...
03:09:04 <doug16k> yes. thanks
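
The %KERNEL_BASE% substitution described above amounts to a small template expansion. A rough Python equivalent of that sed step might look like this. It is a sketch only: the function name, the second placeholder, and the value are made up for illustration and are not taken from the LK build.

```python
import re

def expand_placeholders(template: str, values: dict) -> str:
    """Replace each %NAME% with values["NAME"]; unknown names are left as-is."""
    def substitute(match):
        name = match.group(1)
        return values.get(name, match.group(0))
    return re.sub(r"%([A-Z_][A-Z0-9_]*)%", substitute, template)

# Hypothetical linker-script line and value, for illustration only.
line = ". = %KERNEL_BASE% + %SOME_OFFSET%;"
expanded = expand_placeholders(line, {"KERNEL_BASE": "0x80000000"})
```

Leaving unknown placeholders untouched makes a missing Makefile variable show up as a literal %NAME% in the generated linker script, where the linker will reject it.
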
03:17:30 <eryjus> OMG!!! This cannot be happening!
03:18:05 <klange> Be more specific.
03:18:10 * eryjus finally has tasks swapping on his pi!
03:18:19 <klange> \o/
03:18:29 <eryjus> that was painful!!
03:18:39 <Telyra> congratulations on making code run on that nightmare of a platform
03:19:03 <bluezinc> \o/
03:19:15 <bluezinc> eryjus: Congratulations!
03:20:20 <zhiayang> Telyra: the pi, or arm in general?
03:20:27 <Telyra> The pi
03:20:31 <bluezinc> zhiayang: the pi is terrible.
03:20:33 <eryjus> Looking back through my notes, I started rpi back on 11-Nov.
03:20:36 <eryjus> the pi
03:20:54 <bluezinc> I almost ended up having to teach a course in bare-metal pi...
03:20:56 <zhiayang> what's bad about it?
03:20:59 <eryjus> it's my first attempt at anything ARM, but the pi was a bitch
03:20:59 <Telyra> Any platform whose system bus is USB is already suspected of high treason against the computer kingdom
03:21:04 <zhiayang> LOL
03:21:19 <bluezinc> zhiayang: entire thing is proprietary, and there's basically no docs.
03:21:38 <zhiayang> are there better arm boards to get, then?
03:21:42 <bluezinc> because broadcom are a bunch of f*ers...
03:21:51 <Telyra> Everything useful is covered in more NDAs than the Secure Enclave
03:22:31 <Telyra> Broadcom basically took one of their normal ARM SoCs and twiddled enough bits for their usual docs to be useless
03:22:37 <bluezinc> zhiayang: most of them are based around proprietary broadcom chipsets.
03:22:41 <Telyra> Then sold that to the Raspberry Pi "Organization"
03:23:12 <Telyra> I want to get a hold of a DIY platform with a BCM switch ASIC in it tbh
03:23:15 <bluezinc> Telyra: the Raspberry Pi guys are also broadcom developers.
03:23:27 <Telyra> Yeah the RPi org is *bad*
03:23:30 <bluezinc> otherwise, they'd never get a hold of the docs.
03:23:42 <Telyra> iirc they're considered a non-profit somehow, despite making $lodsamoney
03:23:43 <zhiayang> so what i'm feeling is that most arm boards are bad, but rpi is especially evil?
03:23:56 <bluezinc> to be fair, they basically went, "look at this cool chip we made"
03:24:00 <Telyra> ODROID is a good ARM board
03:24:08 <bluezinc> zhiayang: beaglebone is based on a sitara, odroid is an exynos.
03:24:57 <bluezinc> both are better than broadcom, which is like saying they've killed less people than Genghis Khan.
03:25:48 <bluezinc> the beagle looks pretty OK.
03:26:14 <Telyra> I just want AMD to put out a friggin Zen-based successor to the G series
03:26:30 <bluezinc> also ended up almost teaching a class on using that one bare-metal. Fortunately I ended up with a more traditional microcontroller.
03:27:28 <bluezinc> if I had to pick an arm platform today, with no further research, I'd probably go for the STM32 series.
03:27:34 <eryjus> thanks everyone for the cheers and accolades!
03:27:58 <bluezinc> eryjus: so whats next?
03:29:51 <eryjus> Checking my notes as we speak... the arch and platform abstractions are practically non-existent so I will probably take that on before I get into 64-bit
03:31:04 <eryjus> bluezinc, but most likely a cocktail to celebrate
03:31:40 <bluezinc> eryjus: well-deserved.
03:32:34 <bluezinc> eryjus: also, out of curiosity, how much do you have in the way of notes?
03:32:50 <bluezinc> I've never really been a fan of keeping them myself.
03:32:59 <eryjus> LMAO!!! https://github.com/eryjus/century-os/blob/master/JOURNAL.md
03:33:54 <eryjus> It's more like "how not to write a hobby OS" but there are relevant things in there I refer back on frequently
03:33:55 <bluezinc> oh.
03:34:06 <bluezinc> that's a lot more than I expected.
03:34:32 <bluezinc> to be fair, I never really was big on notes at university either.
03:35:33 <eryjus> neither was i .... i used to have institutional recollection of every conversation I ever had... then I got old and now I have learnt to take lots of notes
03:40:53 <klange> eryjus: jeebus that is a lot of text
03:43:22 <eryjus> I'm going to have to split that up a bit -- my editor is starting to choke on it.
03:44:54 <zhiayang> must be a pretty crappy editor then
03:45:24 <doug16k> eryjus, needs more journal.md though
03:45:40 <klange> mine handles it just fine
03:46:37 <eryjus> vscode cannot keep up with spell checking and when I update the .md it spins the cpu..
03:49:35 <CrystalMath> emacs FTW
03:50:50 <klange> If you can write your own OS, you can write your own text editor.
03:51:07 <klange> And every good OS needs a good text editor, so ipso facto if you want to write your own OS you need to write your own text editor.
03:51:28 <CrystalMath> or you can port emacs
03:51:37 <CrystalMath> it shouldn't really need much more than EmacsLisp working
03:52:01 <CrystalMath> when you have emacs running
03:52:04 <CrystalMath> you have the entire userland
03:53:19 <klange> But if you port emacs are you really building your own OS? :thinking:
03:55:58 <mischief> nice os, shame it lacks a good text editor
03:58:16 <zhiayang> nice emacs, shame it lacks a good os
04:01:26 <klange> I should check on my vim port, see if it builds under my libc...
04:01:38 <klange> I don't really care, my own editor does Enough™
06:19:19 <mawk> what happens with linux if I use "KVMKVMKVMKVM" cpu name instead of "GenuineIntel" ?
06:19:29 <mawk> wikipedia says it's a "common" cpu name for kvm
06:19:37 <mawk> or that it's known, at least
06:23:07 <immibis> probably not much, why would it?
06:24:11 <mawk> I found this inside linux source: « static const char *cpu_vendor_table[X86_VENDOR_MAX] = {"Unknown", "GenuineIntel", "AuthenticAMD"}; »
06:24:17 <mawk> so it's not supported
06:24:36 <mawk> well I thought that would enable linux to be more lax about its environment, or make use of special hypervisor calls or stuff
06:24:45 <mawk> to let me do less work in order to make linux work under KVM
06:25:03 <mawk> but linux probably has other ways of knowing it's under KVM
06:25:48 <klange> Part of the point of KVM is that you don't need to know you're running under KVM.
06:26:03 <klange> Which is why it works just fine running things that existed long before it did.
06:26:14 <klange> That was the big advantage of KVM over Xen, which required specific support.
06:27:16 <mawk> I see
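
On the "other ways of knowing" point: KVM reports its signature through the hypervisor CPUID leaf (0x40000000), separately from the vendor string in leaf 0, so the vendor can stay "GenuineIntel" while a guest still finds "KVMKVMKVM". A minimal Python sketch of decoding both strings from register values follows. The function names are hypothetical, and the register values used in the usage note are the well-known encodings of those two strings.

```python
import struct

def vendor_string(ebx: int, ecx: int, edx: int) -> str:
    """CPUID leaf 0: the vendor string is stored in EBX, EDX, ECX order."""
    return struct.pack("<III", ebx, edx, ecx).decode("ascii")

def hypervisor_signature(ebx: int, ecx: int, edx: int) -> str:
    """CPUID leaf 0x40000000: the signature is stored in EBX, ECX, EDX order."""
    raw = struct.pack("<III", ebx, ecx, edx)
    return raw.rstrip(b"\0").decode("ascii")
```

For example, vendor_string(0x756e6547, 0x6c65746e, 0x49656e69) decodes to "GenuineIntel", and hypervisor_signature(0x4b4d564b, 0x564b4d56, 0x4d) decodes to "KVMKVMKVM".
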
06:29:37 <mawk> so I enabled the IRQ chip thing which includes an ioapic, there's also the common intel somethingsomething PIT, there's the TSC counter
06:29:52 <mawk> and I'm writing the common serial port over I/O ports
06:30:00 <mawk> after all of that linux should work fine right ?
06:30:06 <mawk> all considerations of hard drive or networking aside
06:30:38 <klange> Are you implementing a hypervisor?
06:30:42 <mawk> but I didn't enable KVM guest support in linux, because I don't know how this program will be tested
06:30:49 <mawk> just a KVM client
06:30:55 <mawk> like qemu
06:33:13 <mawk> https://paste.serveur.io/KiyLZwGL.cpp
06:33:17 <mawk> it's pretty easy so far
08:16:27 <mobile_c_> hi
08:16:40 <mobile_c_> [11:27] * klange goes to mobile_c_'s funeral
08:16:42 <mobile_c_> 0.0
08:17:09 <klange> * mobile_c_ dies
08:17:13 <klange> it made sense in context!
08:17:28 <mobile_c_> :P
08:17:50 <mobile_c_> didnt know id hv a funeral :P
08:50:18 <mobile_c_> ok this is weird
08:51:25 <mobile_c_> all my attempts to call DT_INIT suddenly work now o.o
08:51:57 <mobile_c_> unfortunately puts() still hangs
10:00:14 <mobile_c_> https://paste.pound-python.org/show/V8oQU2vPapYEQlvKRQiJ/
10:00:24 <mobile_c_> what am i doing wrong? ;-;
10:14:51 <mobile_c_> https://paste.pound-python.org/show/Ub5nUelt1ePsBpO9JjPQ/ *
10:56:55 <klange> I should finish my text markup thing...
12:25:27 <zhiayang> anyone know of a way to filter qemu's -d int option by the interrupt vector?
12:27:29 <klange> grep
12:28:17 <knebulae> @klange: I was going to say that, but didn't want to sound like a dick
12:28:35 <zhiayang> klange has no qualms
12:28:40 <klange> that's my serious answer
12:28:41 <knebulae> :)
12:28:56 <zhiayang> right, i actually only need the very last interrupt
12:29:03 <zhiayang> the issue is qemu goes ham and throws an error
12:29:14 <zhiayang> like before i even get the tianocore splash screen
12:29:26 <zhiayang> so i can't do anything
12:29:44 <klange> I think you can turn on the equivalent of the -d option from the monitor after tianocore or whatever
12:29:50 <zhiayang> ooh
12:30:00 <klange> I would not recall the particular commands
12:30:03 <knebulae> Probably misconfigured; or maybe a build with a buggy uefi/bios/firmware (I guess depending upon arch)?
12:30:08 <Mutabah> zhiayang: tail?
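
The grep and tail suggestions can be combined in a short script when the log is too noisy to read by hand. The sketch below assumes each x86 "-d int" record starts with a header line of the form "<count>: v=<hex vector> e=<error code> ...", followed by register dump lines. That format is an assumption about QEMU's output and may differ between versions, and the function names are hypothetical.

```python
import re

# Assumed shape of an x86 "-d int" record header, e.g.
#   "     1: v=0e e=0002 i=0 cpl=0 IP=0008:... pc=... SP=..."
HEADER = re.compile(r"^\s*\d+: v=([0-9a-fA-F]+) ")

def split_records(lines):
    """Group log lines into records, each starting at a header line."""
    records = []
    for line in lines:
        if HEADER.match(line):
            records.append([line])
        elif records:
            records[-1].append(line)
    return records

def filter_by_vector(lines, vector):
    """Keep only the records whose header names the given interrupt vector."""
    return [
        record for record in split_records(lines)
        if int(HEADER.match(record[0]).group(1), 16) == vector
    ]

def last_interrupt(lines):
    """Return the final record in the log, or None if there is none."""
    records = split_records(lines)
    return records[-1] if records else None
```

Run against a saved log (for example one written with QEMU's -D option), last_interrupt gives the record just before the crash, and filter_by_vector(lines, 0x0e) isolates the page faults.
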
12:31:16 <knebulae> Tianocore uefi firmware shouldn't freeze prior to getting to uefi shell.
12:33:07 <zhiayang> idk man
12:34:45 <zhiayang> klange: ok, i can apparently do 'log int' in the monitor
12:34:52 <klange> cool
12:34:56 <knebulae> when I've had freezes under Tiano, it was usually due to a mismatched firmware. Are you using the latest version? I am successfully developing under the latest version with no trouble.
12:35:08 <zhiayang> yea i am
12:35:25 <zhiayang> i mean it's a few weeks old but it's not months old or anything
12:35:29 <knebulae> @zhiayang: What's your command line options?
12:35:46 <zhiayang> $(QEMU) $(QEMU_FLAGS) $(QEMU_E9_PORT_FILE) -s -S -monitor stdio -d cpu_reset
12:35:52 <zhiayang> QEMU_FLAGS = -smp 1 -vga std -m $(MEMORY) $(QEMU_UEFI_BIOS) $(QEMU_DISK_IMAGE) -no-shutdown -no-reboot
12:36:06 <zhiayang> QEMU_UEFI_BIOS = -bios utils/ovmf-x64/OVMF-pure-efi.fd
12:37:36 <knebulae> @zhiayang: -S -gdb tcp::9000 -drive file=..\..\qemu_boot\OVMF_CODE-need-smm.fd,if=pflash,format=raw,unit=0,readonly=on -drive file=..\..\qemu_boot\OVMF_VARS-need-smm.fd,if=pflash,format=raw,unit=1 -drive file=fat:rw:..\..\k0\$(Platform)\$(Configuration)\,media=disk,if=virtio,format=raw -drive file=..\..\qemu_boot\UefiShell.iso,format=raw -m 512 -machine q35,smm=on -nodefaults -vga std -global driver=cfi.pflash01,property=secure,value=on -global
12:37:36 <knebulae> ICH9-LPC.disable_s3=1 -cpu Skylake-Client
12:38:02 <zhiayang> owo
12:38:04 <knebulae> @zhiayang: this is mine (sorry no nice env vars) - I think you are missing the SMM component of tiano/uefi
12:38:31 <klange> that's a heck of a qemu command line
12:38:42 <zhiayang> so instead of ovmf-pure-efi.fd i should be loading ovmf-need-smm.fd?
12:38:43 <zhiayang> or both?
12:38:49 <knebulae> It came from Alex Ionescu's VisualUefi.
12:39:00 <knebulae> I would've never been so detailed with qemu.
12:39:07 <klange> qemu-system-x86_64 -cdrom image.iso -serial mon:stdio -m 1G -soundhw ac97,pcspk -enable-kvm -rtc base=localtime -bios /usr/share/qemu/OVMF.fd
12:39:30 <klange> 'course, no kvm if you want -d int to work properly...
12:39:44 <zhiayang> i don't think kvm is going to work on wsl
12:39:48 <zhiayang> but good to know :D
12:39:58 <klange> if you have a recent build, haxm will
12:40:04 <klange> I think?
12:40:07 <knebulae> @zhiayang: are you running qemu under wsl?
12:40:11 <zhiayang> yes
12:40:12 <klange> Hm, might need a native Windows build.
12:40:20 <knebulae> You need a native windows build
12:40:26 <knebulae> Like, absolutely.
12:40:29 <zhiayang> oh i mean
12:40:30 <klange> Do a native windows build with haxm support.
12:40:34 <zhiayang> i'm running the exe from wsl
12:40:38 <zhiayang> so i guess it's a windows build
12:40:53 <klange> -vga std has been the default for years
12:40:55 <knebulae> I recall there being an issue with haxm under windows, but I'm not sure if that's just related to the android emulator.
12:40:56 <klange> -smp 1 is redundant
12:41:09 <zhiayang> oh yes don't mind the smp 1
12:41:10 <klange> no sound chips? tsk tsk
12:41:38 <zhiayang> -d cpu_reset was getting a bit spammy with the AP going nuts
12:41:41 <zhiayang> dk what's going on there either
12:41:51 <klange> look at these -drive lines on knebulae's! wowza
12:42:15 <klange> gods if they deprecate -cdrom I'm gonna have to break into the committer's house and slap them
12:42:21 <knebulae> @klange: I distribute my code as part of a visual studio project for beginners.
12:42:30 <zhiayang> klange: qemu people do stupid things
12:42:34 <knebulae> It's self-contained, so you can build, debug, etc. from the IDE.
12:42:51 <zhiayang> i've had to include a patch in the osx version because they added a popup window to confirm quitting, but the default button is cancel
12:42:59 <knebulae> So the command line is a bit unwieldy
12:43:04 <zhiayang> and i can't tab -- so i have to move my mouse and click
12:43:04 <zhiayang> ugh
12:45:41 <klange> Anyyyway, you don't need the SMM stuff at boot, you can provide just the base image. You want it if you need persistence...
12:45:51 <zhiayang> meh, don't really need that
12:45:57 <klange> Just -bios ... *should* work, so I don't think that's it.
12:46:17 <klange> It may actually be qemu tripping over itself debugging interrupts from ovmf :)
12:46:24 <zhiayang> would not be surprised
12:46:33 <zhiayang> i'm still actually quite upset that ovmf takes so damn long to boot
12:47:06 <klange> yeah, no fricken clue what that is about
12:54:22 <zhiayang> bleh
12:54:35 <zhiayang> well at least i know why i'm triple faulting
12:56:19 <mrvn> all good things come in triplet
01:15:40 <graphitemaster> https://twitter.com/x86instructions/status/1093603428037648385
01:20:44 <zhiayang> oh no my page allocator is not returning page-aligned addresses
01:35:57 <zhiayang> does this make sense to any of you
01:36:07 <zhiayang> Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
01:36:11 <zhiayang> LOAD 0x004440 0x0000000000604440 0x0000000000604440 0x000028 0x000250 RW 0x200000
01:36:23 <zhiayang> how can align be 0x200000 (which i need to change to be 0x1000 anyway)
01:36:29 <zhiayang> but the virtaddr is 0x604440
01:36:36 <zhiayang> wtaf
01:43:25 <klange> you misunderstand how alignment works in ELFs :)
01:45:44 <zhiayang> the spec isn't very clear on this
01:45:50 <zhiayang> well it's not clear to me
02:01:25 <mrvn> On amd64 the default page size for elf files is 2MB pages.
02:01:44 <mrvn> And the 604440 will match the offset in the file
02:01:57 <mrvn> 0x4440 bytes from the start probably.
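
The rule behind this is that p_align does not require p_vaddr to be a multiple of the alignment. It requires p_vaddr and p_offset to be congruent modulo p_align, so a loader can map the file page containing the offset at the page containing the address. A small Python sketch using the numbers from the program header above (function names are hypothetical):

```python
def is_congruent(p_offset: int, p_vaddr: int, p_align: int) -> bool:
    """ELF rule: p_vaddr and p_offset must be equal modulo p_align."""
    if p_align in (0, 1):
        return True  # 0 and 1 both mean "no alignment required"
    return p_vaddr % p_align == p_offset % p_align

def mapping_start(p_vaddr: int, page_size: int) -> int:
    """Round the segment address down to the page the loader actually maps."""
    return p_vaddr & ~(page_size - 1)
```

For the LOAD entry above, is_congruent(0x004440, 0x604440, 0x200000) holds because both values leave a remainder of 0x4440. The same pair is also congruent modulo 0x1000, and with 4 KiB pages the mapping would start at mapping_start(0x604440, 0x1000), which is 0x604000.
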
02:21:26 <knebulae> anyone here responsible for the wiki?
02:25:04 <Lowl3v3l> knebulae: technically the forums and wiki are completely separate. What do you want to know?
02:25:41 <knebulae> I just have gotten to a point that I wanted to submit a certain commit as a uefi barebones. I feel it would be much more helpful than what's there.
02:26:21 <zhiayang> i would maybe put your content on your user page, and link to it from one of the existing uefi pages
02:26:28 <knebulae> Ok
02:27:00 <zhiayang> once it's fleshed out there's no issue with just appending it to one of the existing pages, or creating a new (non-user) page and linking it from somewhere (eg. the uefi page, or even the front page)
02:27:15 <knebulae> I've never been real active on there as I was an AoD guy, so I'm just not sure what you guys prefer.
02:27:21 <zhiayang> the existing uefi barebones page is absolute trash though, i personally don't have an issue if you just throw it away entirely
02:27:27 <zhiayang> but that's not up to me to decide :D
02:27:51 <zhiayang> hm wait
02:27:58 <zhiayang> https://wiki.osdev.org/UEFI_Bare_Bones this one is fine
02:28:01 <zhiayang> i was thinking of another one...
02:28:18 <knebulae> gotcha
02:29:00 <knebulae> I think what mine does that some of the others don't is actually integrate everything (i.e. most uefi you follow, then switch over to a regular tutorial at a certain point-- mine doesn't require that).
02:29:12 <zhiayang> ah, this is the dumpster fire: https://wiki.osdev.org/Uefi.inc
02:29:45 <zhiayang> (related: do you think there's any merit in having an osdev-tailored uefi bootloader?)
02:29:54 <knebulae> absolutely
02:29:57 <knebulae> You can do so much.
02:30:00 <zhiayang> (i know there were previous projects to do the same with a bios bootloader, but it totally fell through)
02:30:31 <knebulae> UEFI is so much easier.
02:30:42 <knebulae> I mean, it wouldn't even really be OS dev to do a loader.
02:30:53 <knebulae> UEFI has a pretty decent library.
02:31:13 <zhiayang> (unrelated: what happens if you need to send an EOI to the lapic? do oses usually map the ioapic + lapic mmios to all processes?)
02:32:21 <knebulae> @zhiayang: either make sure it's mapped in the processes vmap, or you need to make a system call.
02:34:04 <zhiayang> right, i feel like switching cr3 to ack the lapic is a tremendous waste of resource (wrt. tlb destruction)
02:34:19 <knebulae> @zhiayang: it is
02:34:40 <zhiayang> i'll just map it as non-user then
02:34:47 <knebulae> @zhiayang: but I'm not sure of your system's design, so I can't comment on the ramifications of making the ioapic/lapic available outside of a kernel context.
02:35:06 <zhiayang> look at you, thinking my system has a "design"
02:35:08 <zhiayang> :D
02:35:56 <zhiayang> the timer irq would need to be cpl=0, which is fine i guess
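
Mapping the local APIC as non-user in every address space comes down to the flag bits in the page-table entry. The Python sketch below builds such an entry for a 4 KiB page. It assumes the standard x86 flag layout and the architectural default LAPIC base of 0xFEE00000 (a real system should read the base from the IA32_APIC_BASE MSR or the ACPI MADT instead). The helper name is hypothetical.

```python
# Low flag bits of an x86 page-table entry.
PTE_PRESENT = 1 << 0
PTE_WRITABLE = 1 << 1
PTE_USER = 1 << 2           # left clear: the page is supervisor-only
PTE_WRITE_THROUGH = 1 << 3
PTE_CACHE_DISABLE = 1 << 4  # MMIO should not be cached

LAPIC_DEFAULT_BASE = 0xFEE00000  # architectural default, may be relocated
LAPIC_EOI_OFFSET = 0xB0          # EOI register offset within the LAPIC page

def supervisor_mmio_pte(phys: int) -> int:
    """Build a present, writable, uncached, supervisor-only 4 KiB PTE."""
    if phys % 0x1000 != 0:
        raise ValueError("physical address must be page-aligned")
    return phys | PTE_PRESENT | PTE_WRITABLE | PTE_CACHE_DISABLE
```

With the same entry installed in every address space, an interrupt handler running at CPL 0 can write the EOI register at the mapped base plus LAPIC_EOI_OFFSET without switching CR3.
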
02:36:20 <knebulae> @zhiayang: :)
02:37:44 <zhiayang> let me know if you're done with the uefi stuff btw, i'd love to take a look
02:39:08 <knebulae> @zhiayang: https://github.com/nebulaeonline/nebulae k0 is the directory you want.
02:41:11 <knebulae> @zhiayang: This is basically at the commit where I start making "my" os. All of this is scaffolding, which is why I wanted to give it as a UEFI barebones. As far as scaffolding goes, it's pretty complete.
02:41:26 <zhiayang> your scaffolding seems pretty damn robust
02:41:35 <knebulae> I've been doing this for 22 years.
02:41:40 <zhiayang> in contrast my scaffolding is a bunch of sticks tied together with rubber bands
02:42:42 <knebulae> I actually can't wait to release to all you guys. I have the biggest fucking surprise in the world for you. :)
02:42:51 <zhiayang> :D:
02:43:10 <knebulae> Geist will faint @ google :)
02:43:41 <knebulae> @geist: j/k man. lol.
03:03:54 <nyc`> knebulae: Spiffy!
03:08:09 <nyc`> I probably have to face the rather unpleasant fact that I've deteriorated from ten years of disuse.
03:09:37 <nyc`> zhiayang: It would make sense to map those things in the u area.
03:10:11 <nyc`> Per-process kernel mappings.
03:10:54 <zhiayang> hm? sorry that's a bit confusing
03:11:05 <zhiayang> so you're saying yes, i should map it in every address space
03:11:38 <nyc`> zhiayang: Elder UNIX called per-process kernel mappings the u area.
03:12:55 <nyc`> zhiayang: It put a lot of things there so it could service many interrupts and syscalls without a full address space switch.
03:14:31 <nyc`> zhiayang: e.g. file handles came out of there so lseek() didn't need a full address space switch.
03:18:24 <zhiayang> nyc`: alright, makes sense
03:18:30 <zhiayang> doesn't modern linux have the vdso or something
03:18:30 <nyc`> zhiayang: They might have to map things in to drop refcounts on inodes/vnodes for close(), but AIX, for instance, paged most kernel memory, so a full address space switch might still not be necessary.
03:18:54 <zhiayang> (somewhat related: process switching works now)
03:18:56 <zhiayang> (yay)
03:19:48 <nyc`> zhiayang: That's just code mapped into user address spaces.
03:22:17 <nyc`> zhiayang: Linux isn't how I'd do a lot of things.
03:23:28 <nyc`> zhiayang: TLB management is not Linux' strong point IMNSHO.
03:37:07 <nyc`> To wit, RCU was in part inspired by the TLB management needs of DYNIX/ptx (McKenney was a former coworker).
03:39:01 <nyc`> Life is better with ASN's.
03:58:10 <nyc`> It's still unclear what's going wrong with the MIPS hello world.
04:04:37 <nyc`> Forget AP bringup, I'm not even taking faults or interrupts yet.
04:19:03 <nyc> Hmm, do I need xksseg?
04:41:34 <nyc> Maybe I have to use the 32-bit compat segments?
04:48:44 <zhiayang> "In Figure 4-21, gray shading indicates the fields that are ignored in 64-bit mode"
04:48:53 <zhiayang> cb then why do u gpf when i don't set the bit
04:48:54 <zhiayang> ._.
04:53:20 <knebulae> Ok, I added a link to that branch on the UEFI page. I didn't actually add it to the front page though under the barebones section. I didn't want to overstep.
04:53:40 <knebulae> https://wiki.osdev.org/UEFI#Articles
05:01:15 <nyc> I think xkseg needs the TLB to map it out.
05:02:40 <nyc> kseg0 may be what I need.
05:04:14 <nyc> There might be a subrange of xkphys too.
06:06:26 <hxclurk> Hey
06:07:26 <hxclurk> Can you guys recommend any good courses I can take online that has exercises like these http://www.cs.cmu.edu/~410-s07/projects.html
06:07:52 <hxclurk> I have to do something this weekend
06:12:08 <knebulae> @hxclurk: I have looked, but beyond an actual systems class at a college/university, there's not a lot of edX or other type of os dev classes that I've been able to find.
06:13:02 <nyc> hxclurk: I don't know of many universities that do similar. Carnegie Mellon is probably among very few universities that still have real kernel hacking classes left.
06:13:41 <doug16k> ya they seem to focus on low level stuff there
06:14:27 <hxclurk> How the fuck am I supposed to get my kernel hacking fix then, I can't even kernel hack
06:15:24 <nyc> hxclurk: The basic boot affairs are formulaic, and you're following someone's directions regardless. It doesn't really hurt to just copy apart from copyright.
06:15:35 <knebulae> @hxclurk: practice, practice, practice. Reading, reading, reading. Failing, failing, failing. :)
06:16:10 <hxclurk> I've been enjoying the idea of kernel hacking for 2 years now
06:16:18 <hxclurk> Haven't actually kernel hacked till then
06:16:31 <nyc> hxclurk: Once you get past basic booting it's mostly low-level programming in C or Ada or whatever you use.
06:18:02 <nyc> There's no grand secret, it's just a lot of work.
06:18:31 <hxclurk> i just want to play and eat sugar
06:18:44 <hxclurk> can't control myself
06:18:45 <doug16k> do all wireless keyboards drop keystrokes or just this temporary piece of crap from walmart
06:19:05 <nyc> hxclurk: Well, people are desperate for kernel hackers in general. Learn how to do drivers and you'll get jobs.
06:19:24 <nyc> doug16k: My Bluetooth keyboards did not drop keystrokes.
06:19:56 <hxclurk> Do driver-makers start with osdevving?
06:20:07 <knebulae> @hxclurk: or vice-versa. It helps.
06:20:20 <hxclurk> thanks, i kiss you
06:24:26 <nyc> hxclurk: OS's don't get far without drivers.
06:25:22 <knebulae> @nyc: you don't need nearly as many nowadays to get very, very, far. :)
06:25:29 <doug16k> I have the opposite problem though. too much hardware support done, not enough userland done
06:25:55 <nyc> doug16k: Not enough userland in what sense?
06:27:00 <doug16k> incomplete libc
06:27:33 <nyc> doug16k: Are you lacking the syscalls behind it?
06:28:07 <doug16k> I have syscalls behind what's done
06:28:12 <geist> knebulae: hmm?
06:28:24 <geist> re: fainting
06:28:53 <nyc> doug16k: My wild guess is that the parts that aren't done may not be in part because the syscalls aren't there.
06:29:48 <doug16k> example: I recently added socket, connect, accept, listen, sendto, send, recvfrom, recv, but there isn't a socket to be found in the kernel layer, and there is a NIC driver layer and implemented NIC driver :D
06:29:52 <knebulae> @geist: I did it. But I can't say what "it" is right now. But I think everyone here will appreciate it :)
06:30:28 <doug16k> hardware support way ahead of APIs
06:31:01 <nyc> doug16k: The net stack involves significant pain.
06:31:02 <knebulae> @geist: if nothing else, it will be one hell of an "odd" kernel :)
06:31:45 <geist> ah
06:32:45 <nyc> Google interviews me on an almost annual basis, but turns me down every time. My wild guess is that code churn of files with my name on them gets recruiters' attention.
06:33:20 <knebulae> @nyc: Google wouldn't even let me valet cars. ;)
06:33:56 <Telyra> If they open a datacenter or ISP in Vancouver and need sysadmins or network engineers, I'm game
06:34:21 <hxclurk> i work a job none of you would believe even exists
06:34:28 <hxclurk> its so bizarre
06:35:47 <nyc> knebulae: They've got akpm and hugh, so they aren't remotely interested in me.
06:36:52 <knebulae> @hxclurk: I write firmware for low-level children's amusement video games :/
06:37:09 <knebulae> @hxclurk: word jumble; I hope you knew what I meant.
06:37:16 <hxclurk> i write network drivers of flying dildos
06:37:18 <nyc> knebulae: You know the story about hugh and sunil with the race between PPro CPU's updating dirty bits for pagetables in hardware?
06:37:23 <Telyra> I build networks and stuff
06:37:31 <nyc> hxclurk: That is impressive.
06:37:55 <knebulae> I do not. I don't follow linux closely other than to be a competent admin when needed.
06:38:23 <nyc> knebulae: That's the hugh that Google has. The guy who did the original far better working patch for page clustering that my 64GB x86-32 code was based on.
06:39:23 <knebulae> @nyc: so your "inkling" is they don't need you.
06:39:37 <nyc> knebulae: Pretty much.
06:39:44 <knebulae> @nyc: gotcha
06:40:51 <hxclurk> Can you basically say fuck it and not even logically separate your bootloader from your os?
06:41:11 <knebulae> @hxclurk: you can do whatever you want; you're the programmer.
06:41:34 <hxclurk> Don't steal my idea
06:41:49 <knebulae> @hxclurk: I have my own ;)
06:42:08 <hxclurk> I'll be a billionaire by not separating bootloader from the os
06:42:50 <knebulae> @hxclurk: I don't think an operating system is going to make anyone a billion dollars again anytime soon; It's just not as critical as it used to be.
06:42:55 <doug16k> 80's games often did that - they had to boot from their floppy
06:43:24 <hxclurk> you dont understand, this is not just os, its bootloader AND os
06:43:26 <knebulae> @hxclurk: unless you're talking big(ger) iron.
06:43:39 <knebulae> @hxclurk: oh my bad; proceed.
06:43:41 <hxclurk> its os that can boot itself
06:44:00 <hxclurk> so you can use your btrfs and usb driver to boot from btrfs usb drive
06:44:33 <doug16k> my bootloader is an OS. it provides memory allocation and I/O services to the boot code. :)
06:44:53 <doug16k> it just loads a much bigger better os and transfers control to it
06:45:25 <hxclurk> HAH, you just reinvented GRUB
06:45:41 <knebulae> @doug16k: I read your first statement and said "that sounds like DOS." Then I read your second statement and said "nevermind."
06:46:26 <hxclurk> my bootloader + os will compute in long mode but switch to real mode every time so that the bios routines are not there doing nothing
06:46:27 <knebulae> @doug16k: :)
06:46:53 <doug16k> it abstracts PXE downloads as files that you open/pread/close. you can pread an offset and it will download and cache up to there and copy it over at the right offset
06:47:23 <hxclurk> well my os will boot over bluetooth
06:47:24 <doug16k> it has a physical allocator for loading stuff and a small object heap to keep stack usage minimal
06:47:41 <doug16k> not saying it is great. just saying you could define it as an OS layer
06:48:38 <hxclurk> x86 sucks, we need better physics
06:49:03 <Telyra> My boot process is different per platform (all the same arch, x86). On -generic, it's the usual "firmware loads sector, sector loads rest of bootloader, bootloader loads OS" process.
06:49:10 <nyc> I'm trying to get to hello world on MIPS, SPARC, RISC-V, OpenRISC, ARM, et al.
06:49:39 <hxclurk> my os just boots full linux first then boots itself over it
06:49:50 <Telyra> On -pce, the bootloader and optionally a whole OS image is a coreboot payload
06:50:19 <hxclurk> my os bootloader is just a systemd task
06:50:55 <hxclurk> installation is as easy as systemctl enable myos.service
06:51:43 <hxclurk> doesnt even need drivers
06:52:17 <doug16k> hxclurk, mine's better than grub. it supports x86-64. when it enters my kernel it has mapped everything in high half (KASLR capable)
06:52:31 <hxclurk> i see you are something of a hacker yourself
06:52:43 <hxclurk> can you tell me how to get a local admin at my work windows laptop
06:52:49 <hxclurk> they have bitlockered the drive
06:53:01 <doug16k> mine also uses the memory map for physical allocation and can tolerate arbitrary things in the way. when it enters the kernel paging has compensated for any weirdness in the physmem
06:53:29 <hxclurk> mine lets bluetooth devices DMA
06:55:41 <doug16k> for some very strange reason, gdb will never support x86 properly, and grub will never support x86-64 properly :D
06:56:03 <hxclurk> gdb supports x86
06:56:20 <hxclurk> do you even anti-debugging bro?
06:56:20 <doug16k> and with the rumors of dropping real mode going around, gdb might make it past the end of real mode's life and will never have supported it and never will
06:56:45 <doug16k> hxclurk, try making cs nonzero in real mode and then let me know how well it works
06:56:48 <hxclurk> why are they dropping real mode, i just learned segmentation
06:57:00 <Telyra> In fairness, they've been saying "we're dropping real mode" for like, 20 years now
06:57:18 <hxclurk> who will reward me for knowing segmentation?
06:57:25 <hxclurk> how do i redeem the spoils?
06:57:38 <hxclurk> am i destined to the lifetime of java and js?
06:57:39 <Telyra> maybe that rdos dude who genuinely believes 32-bit segmentation is cool and good
06:57:45 <knebulae> @hxclurk: there are rumors of segmentation returning as well.
06:57:45 <hxclurk> LOL
06:57:50 <hxclurk> L0L
06:57:58 <hxclurk> you've got to be kidding me
06:58:04 <hxclurk> where do you get these rumours from
06:58:12 <nyc> I think it's being used as an idiotic way to have thread-local storage.
06:58:15 <knebulae> the interwebz
06:58:29 <hxclurk> i'd rather have you remove threads than return segmentation
06:58:49 <Telyra> FS and GS accesses for TLS are common
06:58:58 <knebulae> There's also the whole shadow-stack in supervisor mode that will be verified against to usermode stack pointer to ensure no ROP shenanigans. That's going to be a PITA (maybe).
06:59:03 <hxclurk> just because its common doesnt mean its not gay
06:59:09 <nyc> Sane architectures have enough registers and just reserve a register for pointers to thread-local storage.
06:59:26 <Telyra> And if you happen to have an old x86 chip that you want to do W^X on you can structure your address space properly to let you sort of hack it in with CS limits
06:59:50 <doug16k> nyc, instead of having a zero-penalty way to access it without consuming a register?
07:00:06 <hxclurk> register are cheap to implement
07:00:13 <hxclurk> cheaper than all those opcode storage
07:00:16 <doug16k> ISA register bits aren't
07:00:36 <hxclurk> most expensive part of isa register bits is manual paper costs
07:00:54 <nyc> doug16k: When architectures have 32 registers, they're usable.
07:02:03 <hxclurk> god approved os should have processes with no threads and they should have message passing
07:02:10 <knebulae> Sorry, proper name is Intel CET Control Flow Enforcement Technology: https://software.intel.com/sites/default/files/managed/4d/2a/control-flow-enforcement-technology-preview.pdf
07:02:55 <nyc> hxclurk: Careful there or you might reinvent TempleOS.
07:03:21 <hxclurk> terry has no sense of code quality
07:03:46 <hxclurk> check his holyc files
07:03:59 <hxclurk> look like literal text files that nothing on god's green earth parses
07:04:53 <doug16k> to force gdb to disassemble the current instruction in real mode -> x /1i (($cs << 4) + $eip)
07:05:07 <doug16k> gdb has no clue about real mode segments, not one
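The gdb expression above is the real-mode address calculation done by hand. A minimal sketch of the same arithmetic, with a hypothetical helper name; it masks both values to 16 bits and does not model A20 wraparound:

```python
def real_mode_linear(segment: int, offset: int) -> int:
    """Linear address for a real-mode segment:offset pair."""
    # The segment base is the 16-bit segment value times 16.
    base = (segment & 0xFFFF) << 4
    # The offset (IP) is treated as 16 bits wide here.
    return base + (offset & 0xFFFF)


# 07C0:0000 and 0000:7C00 name the same byte.
print(hex(real_mode_linear(0x07C0, 0x0000)))  # 0x7c00
print(hex(real_mode_linear(0x0000, 0x7C00)))  # 0x7c00
```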
07:05:28 <hxclurk> i wish i was half the man terry was
07:05:36 <hxclurk> https://github.com/minexew/TempleOS/blob/archive/Compiler/PrsExp.HC
07:05:50 <hxclurk> who the fuck writes that
07:05:57 <hxclurk> like how is that even possible to write
07:06:12 <hxclurk> you cant even read it
07:06:50 <hxclurk> how do i crack this code?
07:07:04 <nyc> It is possible to read. He's basically hand-written a parser.
07:07:27 <hxclurk> why am i such a colossal fucking noob
07:08:41 <nyc> hxclurk: You're not, really. Just work at coding things and the practice will add up.
07:09:08 <hxclurk> thanks i kiss you
07:09:08 <nyc> hxclurk: You're probably more learned and practiced than some people who've graduated with CS degrees.
07:15:59 <knebulae> @nyc: lol; I went to a middle of the road state school, and our CS guys were ok. The real problem is that some people just don't have the mind for programming, even if they possess the raw scholastic aptitude to complete a CS undergrad. Outside the math, and basics, it's only like 12 courses.
07:16:31 <knebulae> A dedicated student could do it self-study in a year at 8-10 hours per day.
07:17:31 <nyc> The big nasty parts of CS curricula were historically writing a compiler and writing a kernel.
07:17:46 <nyc> They don't actually have people do those things anymore.
07:18:17 <knebulae> @nyc: The compiler I'd be in the dark on, although I did read the dragon book around 2004 (I think). I didn't know enough then to retain any of it tbh.
07:18:50 <nyc> knebulae: They actually had people write whole compilers in a semester from the dragon book.
07:19:31 <knebulae> @nyc: I miss those "skunk-works" style classes. They were always so much fun.
07:20:18 <knebulae> That's where you could really tell the wheat from the chaff if you catch my drift.
07:21:02 <nyc> I actually did two compiler classes, one undergrad and one upper-level with pre-qualifier grad students and advanced undergrads.
07:21:48 <knebulae> I would kill to do an advanced degree; I think I'm past that point in my life though... :(
07:21:55 <nyc> The codegen was macro assembler quality.
07:22:14 <knebulae> @nyc: I wouldn't expect llvm in a semester :)
07:22:29 <doug16k> compilers are great for students because you have to make it work with an incomprehensible level of input complexity. things can be arbitrarily nested and expressions can be arbitrarily complex. it's probably the first time they dealt with unbounded complexity
07:22:34 <nyc> knebulae: I'm still up for a Ph.D. if I could get recommendations for it.
07:22:54 <nyc> knebulae: I mostly can't pass admissions.
07:22:57 <knebulae> @nyc: email me
07:23:32 <nyc> knebulae: I don't think just anyone can write a meaningful recommendation.
07:23:44 <knebulae> @nyc: you think? ;)
07:25:10 <nyc> knebulae: I can hand it to wherever I'm applying, but they'll go about figuring out who the people are who wrote the recommendations, and unless they're sufficiently impressed, the recommendations won't be worth anything.
07:25:56 <nyc> knebulae: If I could get two of akpm, hugh, Paul McKenney, and Hubertus Franke to write recommendations for me, I'd be set.
08:01:08 <nyc> 0x9800000000000000 might be where to load?
08:12:07 <mawk> [20:18:50] <nyc> knebulae: They actually had people write whole compilers in a semester from the dragon book.
08:12:15 <mawk> we did something like that in my school
08:12:39 <nyc> mawk: I was around when Purdue stopped doing that for people in semesters after me.
08:13:05 <knebulae> @nyc: were you at Purdue when they still used the LON-CAPA system over telnet?
08:13:28 <nyc> knebulae: No, I was there in the 90's.
08:13:51 <knebulae> @nyc: I didn't even start uni until 1996.
08:13:59 <nyc> mawk: Did you use the Appel book?
08:14:06 <mawk> yes
08:14:16 <nyc> mawk: My upper-level compiler class used the Appel book.
08:14:39 <geist> is that like a powaerbook or a maekbook?
08:14:52 <mawk> this is the language derived from the one presented by Appel that we made: https://www.lrde.epita.fr/~tiger/tiger.html
08:14:56 <mawk> it looks like ugly caml
08:15:05 <mawk> we did it in C++ with bison/flex
08:15:08 <knebulae> @nyc: https://www.cs.princeton.edu/~appel/modern/c/ ?
08:15:35 <nyc> geist: http://gen.lib.rus.ec/book/index.php?md5=255A64EDE60074D32FAFABEB1B37C492
08:15:37 <knebulae> or is that not the one?
08:15:38 <mawk> we had an optional class that made us have the llvm ir as a target
08:15:45 <mawk> but I didn't take it
08:15:58 <nyc> knebulae: That's probably the one, yes.
08:40:16 <nyc> Linux uses kseg0.
09:02:04 <mawk> so physical addresses only ?
09:02:10 <mawk> so it's different than with x86
09:02:19 <mawk> or you mean it uses it for some purpose
09:08:00 <nyc> Well, it's more like a hardwired BAT.
09:08:09 <nyc> (Block Address Translation)
09:11:56 <mawk> but I thought kseg0 can't be mapped
09:12:27 <mawk> so it's 1:1 mapping to physical
09:12:27 <mawk> no ?
09:12:42 <nyc> Yes, it's 1:1 to physical.
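The 1:1 relationship can be written down directly. A minimal sketch assuming the 32-bit MIPS layout, where kseg0 covers 0x80000000 to 0x9FFFFFFF and maps onto the low 512 MiB of physical memory; the helper names are hypothetical, and the 0x9800000000000000 address mentioned earlier is a 64-bit address that this sketch does not cover:

```python
KSEG0_BASE = 0x8000_0000
KSEG0_SIZE = 0x2000_0000  # 512 MiB window


def kseg0_to_phys(vaddr: int) -> int:
    """Physical address behind a kseg0 virtual address (fixed offset, no TLB)."""
    if not KSEG0_BASE <= vaddr < KSEG0_BASE + KSEG0_SIZE:
        raise ValueError("address is not in kseg0")
    return vaddr - KSEG0_BASE


def phys_to_kseg0(paddr: int) -> int:
    """kseg0 virtual address for a physical address in the low 512 MiB."""
    if not 0 <= paddr < KSEG0_SIZE:
        raise ValueError("physical address is not reachable through kseg0")
    return paddr + KSEG0_BASE
```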
09:26:04 <lukos> hello
10:02:01 <knebulae> @lukos: what up?
10:07:39 <lukos> @knebulae rewriting my memory manager
10:09:24 <knebulae> @lukos: fun or necessity?
10:11:00 <lukos> @knebulae fun i'm using a singly linked list to allocate the memory, but am thinking about using a bitmap instead
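For comparison with the linked-list approach, a bitmap frame allocator can be sketched in a few lines. This is a toy model, not lukos's code: the class name is hypothetical, one bit stands for one frame, and allocation is a plain first-fit scan.

```python
class BitmapFrameAllocator:
    """One bit per physical frame: 1 = in use, 0 = free."""

    def __init__(self, frame_count: int):
        self.frame_count = frame_count
        self.bitmap = bytearray((frame_count + 7) // 8)

    def alloc(self):
        """Return the lowest free frame number, or None when full."""
        for byte_index, byte in enumerate(self.bitmap):
            if byte == 0xFF:
                continue  # all eight frames in this byte are taken
            for bit in range(8):
                frame = byte_index * 8 + bit
                if frame >= self.frame_count:
                    return None  # only padding bits remain
                if not byte & (1 << bit):
                    self.bitmap[byte_index] |= 1 << bit
                    return frame
        return None

    def free(self, frame: int) -> None:
        """Mark a frame free again; reject frames that were not allocated."""
        if not 0 <= frame < self.frame_count:
            raise ValueError("frame out of range")
        if not self.bitmap[frame >> 3] & (1 << (frame & 7)):
            raise ValueError("double free")
        self.bitmap[frame >> 3] &= ~(1 << (frame & 7)) & 0xFF
```

The bitmap costs one bit per frame regardless of how fragmented memory is, but a first-fit scan is linear in the worst case, whereas popping the head of a free list is constant time.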
10:11:22 <nyc`> I need to get to hello world on my various arches, then take interrupts, then bring up AP's etc.
10:15:12 <nyc`> There are reasons to go other directions with the memory manager, e.g. in order to reduce the number of cachelines touched.
10:20:54 <nyc`> One might hope to enable large amounts of memory to be freed in time asymptotically smaller than the amount of memory involved.
10:21:46 <nyc`> ZFOD makes the converse difficult.
10:27:30 <nyc`> Reputedly there are cache issues with zeroing in the idle thread, and things want to save power anyway.
10:28:47 <Ameisen> https://en.wikipedia.org/wiki/Sketchpad
10:28:55 <Ameisen> the fact that that existed in 1963... hurts my brain
10:29:43 <geist> yah i dont think that's a general solution any more
10:30:15 <Ameisen> I feel like all the engineers back then were much smarter than the ones of today, though I believe that it's just due to them being very good general engineers - once computers became more powerful, people just became more specialized.
10:30:17 <geist> that being said some arches have no-cache zero operations, which at least dont trash the cache while writing zeros
10:30:42 <geist> but someone was saying that that has some numa cross-socket load issues
10:31:10 <Ameisen> geist - I know that clang, at least, has builtins for non-temporal load/stores (for architectures that support it)
10:31:12 <geist> though i guess if you're locally zeroing most of the time it should minimize that
10:33:01 <knebulae> @Ameisen: they had to learn with pencil and paper, with very few abstractions. Doing things by hand over and over creates a sort of "intuition" about an architecture that I think people are far abstracted away from nowadays.
10:33:22 <geist> knebulae: agreed
10:33:37 <Ameisen> knebulae - well, I was even just reading about how magnetic-core memory was just invented by one guy who needed faster memory for a military project, as well.
10:33:48 <Ameisen> because he saw an advertisement for ceramic ferrites
10:33:51 <Ameisen> and was inspired.
10:34:32 <knebulae> I hate to tell you guys, but that kind of sh*t is everywhere around us. Companies can get blinded by little things or gloss over them because that's just not "their business right now."
10:36:18 <Ameisen> I imagine core memory, despite the time/labor going into winding it, was probably a lot cheaper to make than the mercury delay lines or the electrostatic memory
10:36:30 <nyc`> Linux mysteriously zeros out pagetables during teardown. I slabified them and it got backed out because AMD AGP GART's needed some extra handholding for it.
10:37:06 <Ameisen> nyc` - there's an option in the config system for Linux controlling that.
10:37:07 <nyc`> I think ppc64 still uses the code, though.
10:37:09 <Ameisen> you can disable it.
10:38:31 <Ameisen> I'm guessing to really understand core memory (or at least be able to think up making it in the first place) you'd need to have some physics background
10:39:10 <knebulae> @Ameisen: well, yes, you are not just going to stumble onto things. You still have to be educated.
10:39:22 <Ameisen> Sure, but these days most people aren't well-educated in... _everything_
10:39:36 <knebulae> Well, that's the problem. There's just too much.
10:39:38 <Ameisen> so, what are the odds we're missing out on some awesome inventions because people aren't well-knowledged in all the fields anymore (and can't really be)?
10:39:50 <nyc`> Hint: Maxwell's equations just transform to a wave equation.
10:39:58 <Ameisen> Also, the fact that they miniaturized core memory so much in the PDP-8 is neat.
10:40:01 <Ameisen> https://upload.wikimedia.org/wikipedia/commons/thumb/e/ee/PDP-8_core_memory.jpg/1920px-PDP-8_core_memory.jpg
10:40:21 <Ameisen> the traces on those boards are also neat to see
10:40:37 <Ameisen> presumably they were hand-etched rather than done by a computer
10:40:44 <geist> nyc`: also re: zeroing out page tables, it depends on how they're torn down. if they're torn down by unmapping everything, then the code will likely go through and systematically take down every mapping
10:41:01 <geist> so it'd only really be in the 'fast path' where it would want to just toss the entire PT structure at once
10:41:19 <geist> but that's generally difficult to do, because you probably still want to harvest page table bits (M, A)
10:41:34 <Ameisen> nyc` - makes me realize that if you threw me back to 1948 or so... I'd be completely useless to... basically every field.
10:41:36 <geist> if you just threw it out you might miss some entries that were modified,etc, for shared mem
10:41:40 <Ameisen> possibly an impediment.
10:41:42 <geist> so it's fairly subtle there
10:43:17 <Ameisen> the most I could probably do is tell them the structure of DNA 5 years in advance.
10:43:44 <Ameisen> just weird to think about.
10:43:46 <knebulae> @Ameisen: which would've gotten you a Nobel
10:43:57 <Ameisen> "How did he discover it?"
10:44:01 <knebulae> LSD
10:44:02 <Ameisen> "He claims to be from the future."
10:44:11 <Ameisen> "Give him a nobel and put him in an asylum."
10:44:21 <knebulae> No, I mean the guy who discovered DNA was on LSD
10:44:30 <Ameisen> "He's rambling about page tables and compilers now."
10:44:50 <Ameisen> franklin, watson, and crick?
10:45:14 <nyc`> geist: Well, it also assumes radix tree pagetables.
10:45:18 <Ameisen> Rosalind Franklin was the one who took the actual crystallography images of DNA, and Watson and Crick were the ones who analyzed it and came up with the actual structure.
10:45:55 <knebulae> That's my worry; when I'm in the nursing home and don't have my faculties, I'm going to be yelling about cache coherence, large page tables, and the limitations of Intel's TLB implementation.
10:46:10 <Ameisen> And by that point, all of that will be common knowledge/general ed
10:46:48 <knebulae> @Ameisen: the orderlies will bring me my Z80 board to play with.
10:47:00 <Ameisen> Well, they'll bring you a Z80 emulator.
10:47:03 <geist> nyc`: yah. with something like PPC/POWER you have no choice but to iterate
10:48:39 <Ameisen> I find it funny that the normal strategy for core memory was _to keep it hot_, by putting it in an oven or hot oil bath.
10:48:46 <Ameisen> as opposed to anything modern
10:49:12 <knebulae> after all these years, I actually just repeated a stupid internet rumor. Man, I have to get my sh*t together today. http://realitysandwich.com/314873/francis-crick-dna-lsd/
10:52:24 <nyc> Yeah, forcing the kernel to iterate is bad hardware/hypervisor design.
10:59:22 <doug16k> how many iterations though? if few, it's super fast
11:00:20 <nyc> doug16k: The idea is that it's base page by base page iteration through the address space.
11:00:48 <doug16k> you can binary search then?
11:02:02 <nyc> doug16k: It's not a search process, but process teardown.
11:02:37 <doug16k> ah, I thought you had to do software page faults and deal with page table lookup yourself
11:02:55 <doug16k> er, software tlb miss I mean
11:03:00 <knebulae> You guys see this on HN? https://abopen.com/news/building-a-risc-v-pc/ :drools:
11:03:50 <doug16k> I'm not really following the threads properly
11:04:30 <doug16k> chat threads
11:06:20 <nyc> https://wiki.osdev.org/CFE_Bare_Bones
11:06:36 <geist> yes but like i'm saying you kind of have to
11:06:37 <nyc> doug16k: It was me and geist about background page zeroing.
11:06:49 <geist> at least for any pages that are mapped that may be potentially modified
11:06:56 <geist> on an arch like x86 that have a M bit
11:07:06 <geist> you have to iterate and harvest the M bit as you unmap the page
11:07:14 <geist> so it's not necessarily okay to just toss the entire address space
11:07:43 <geist> you could probably save zeroing out the page tables, but that's probably not substantial compared to actually walking the tables
11:10:35 <geist> knebulae: neat!
11:10:59 <nyc> Well, if you're going to obey a slab discipline, it's a waste to not actually deal with them as a slab cache.
11:12:11 <geist> doug16k: yah my particular thread is on process teardown: whether or not you need to actually iterate over the page tables of the process and unmap them, or can you simply free all the page tables to the pmm and move on
11:12:38 <geist> and my argument is in simple systems you can, but once you start tracking things like M and A bits you need to harvest each entry on the way out, as if they were actually unmapped
11:13:10 <nyc> If you know you're in the teardown process, you don't need to actually modify the pte's.
11:13:56 <doug16k> gotcha
11:13:57 <nyc> The accessed and modified bits harvesting process needs only go back to the struct page or other metadata associated with the physical page.
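nyc's point, that teardown only needs to read the PTEs and carry the bits over to per-page metadata, can be sketched as follows. This is a toy model under stated assumptions: x86 4 KiB leaf PTEs (present = bit 0, accessed = bit 5, dirty = bit 6), an address space no CPU is still running on, and a hypothetical `frame_info` dict standing in for struct page.

```python
PTE_PRESENT = 1 << 0
PTE_ACCESSED = 1 << 5
PTE_DIRTY = 1 << 6
PTE_FRAME_MASK = 0x000F_FFFF_FFFF_F000  # physical address bits 12..51


def harvest_on_teardown(leaf_ptes, frame_info):
    """Copy accessed/dirty state into per-frame metadata without writing any PTE."""
    for pte in leaf_ptes:
        if not pte & PTE_PRESENT:
            continue  # nothing mapped here, nothing to harvest
        frame = (pte & PTE_FRAME_MASK) >> 12
        info = frame_info.setdefault(frame, {"dirty": False, "referenced": False})
        # A frame shared by several mappings is dirty if any mapping dirtied it.
        info["dirty"] |= bool(pte & PTE_DIRTY)
        info["referenced"] |= bool(pte & PTE_ACCESSED)
```

Because the loop only reads each entry, it avoids the atomic read-modify-write that clearing a live PTE would need; the page-table pages can then be freed as a whole.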
11:15:09 <nyc> There is a nasty trick with the modification at least on x86en where they have to be atomic not to race with remote (and possibly speculation on local) CPU's.
11:16:23 <nyc> So it's actually worth avoiding the atomic update.
11:16:24 <doug16k> it's worse than that. their TLB might have it cached with modified bit set and it won't RMW set that bit in that pte because it thinks it is already set
11:16:50 <doug16k> you have to ensure they have invalidated that tlb entry, to make them stop assuming the dirty bit is set already
11:17:41 <doug16k> no matter how atomically you clear the pte modified bit
11:18:01 <nyc> It's unclear exactly what they do, but they did things to make pagetable sharing work on SMP.
11:19:18 <doug16k> it works because you have to do the invalidations they state in the manual, which includes invalidating the tlbs of other cpus to reliably use the dirty bit
11:20:01 <doug16k> and/or accessed bit
11:20:27 <nyc> doug16k: There's no way they need a TLB shootdown just for that.
11:20:30 <doug16k> missing an access is not a big deal. missing a dirty is unacceptable
11:21:11 <doug16k> how else would it work? the TLB isn't coherent at all
11:22:08 <doug16k> let me see if I can cite intel manual section I am referring to...
11:23:05 <nyc> doug16k: I am pretty sure that when a CPU updates the accessed and modified bits, it uses atomic operations.
11:23:27 <doug16k> sure, it does
11:23:50 <nyc> doug16k: And I think one needs to use atomic operations when modifying a pte on x86.
11:24:18 <doug16k> BUT, only when the TLB entry it has says it needs to do that change. once it makes that change or it pulls in a TLB entry then it will remember the initial state of A and D and elide pte updates that are not needed
11:24:54 <doug16k> so, if core B has a TLB entry that says that D bit is set, then that cpu can write there all day and never update pte
11:25:12 <doug16k> then if core A sneaks in and clears D bit, core B won't know and will continue to skip setting D bit
11:25:57 <nyc> doug16k: I'm not 100% clear on what happens then.
11:26:40 <doug16k> core A must coerce core B into invalidating its TLB entry for that address, then clear D bit in PTE.
11:26:57 <doug16k> if core B's next insn writes there, it will RMW the D bit to 1 again
11:27:09 <doug16k> i.e., won't miss a write
11:27:57 <doug16k> if you didn't invalidate the other core's tlb entry, the other core would continue to assume D bit is already set and won't set it
11:29:43 <doug16k> see intel manual volume 3, 4.10.4 Invalidation of TLBs and Paging-Structure Caches
11:29:43 <nyc> doug16k: My wild guess is that that might happen.
11:34:17 <doug16k> 4.10.4.3: "If software modifies a paging-structure entry that identifies the final physical address for a linear address (either a PTE or a paging-structure entry in which the PS flag is 1) to change the dirty flag from 1 to 0, failure to perform an invalidation may result in the processor not setting that bit in response to a subsequent write to a linear address whose translation uses the entry. Software cannot interpret the bit being clear as an
11:34:17 <doug16k> indication that such a write has not occurred."
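The core A / core B scenario can be replayed with a toy model. This is an illustration of the argument, not a description of real hardware: `ToyCore` is a hypothetical class whose TLB caches the dirty flag and skips the PTE update whenever the cached flag is already set, which is the behaviour the manual says software must allow for.

```python
class ToyCore:
    """A core whose TLB caches each page's dirty flag."""

    def __init__(self, page_table):
        self.page_table = page_table  # shared: page number -> {"dirty": bool}
        self.tlb = {}                 # page number -> cached dirty flag

    def write(self, vpn):
        if vpn not in self.tlb:
            # TLB fill: remember the PTE's current dirty state.
            self.tlb[vpn] = self.page_table[vpn]["dirty"]
        if not self.tlb[vpn]:
            # Cached flag is clear, so update the PTE and the cached copy.
            self.page_table[vpn]["dirty"] = True
            self.tlb[vpn] = True
        # Otherwise the core assumes the PTE is already dirty and skips it.

    def invalidate(self, vpn):
        self.tlb.pop(vpn, None)


page_table = {0: {"dirty": False}}
core_b = ToyCore(page_table)
core_b.write(0)                    # sets the dirty flag in the PTE
page_table[0]["dirty"] = False     # core A clears it without a shootdown
core_b.write(0)                    # missed: the stale TLB entry hides the write
missed = not page_table[0]["dirty"]
core_b.invalidate(0)               # what the shootdown would force
core_b.write(0)                    # now the PTE is marked dirty again
```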
11:34:56 <nyc> It's not 100% clear about the role of SMP in all of this.
11:35:45 <doug16k> there's no TLB coherency on one cpu, let alone multiple. they are all equally incoherent
11:40:16 <doug16k> I'm not saying you blindly IPI every cpu every time. you can be clever and know which other cpus (if any) may have had that address space scheduled
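One way to avoid interrupting every CPU is a per-address-space mask of CPUs that may still hold TLB entries for it. A minimal sketch with hypothetical names; a real kernel also has to order the mask updates against context switches, which this ignores:

```python
class AddressSpace:
    """Tracks which CPUs may have cached translations for this address space."""

    def __init__(self):
        self.cpu_mask = 0  # bit n set: CPU n has run this space since its last flush

    def note_scheduled(self, cpu: int) -> None:
        self.cpu_mask |= 1 << cpu

    def note_flushed(self, cpu: int) -> None:
        self.cpu_mask &= ~(1 << cpu)


def shootdown_targets(space: AddressSpace, current_cpu: int):
    """CPUs that need an invalidation IPI; the caller flushes its own TLB locally."""
    mask = space.cpu_mask & ~(1 << current_cpu)
    return [cpu for cpu in range(mask.bit_length()) if (mask >> cpu) & 1]
```

If the address space has only ever run on the current CPU, the list is empty and no IPI is sent.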
11:46:04 <doug16k> I suggest you make your memory manager timid at first then turn on the lazy awesome coolness gradually
11:46:52 <nyc> doug16k: I think the idea was more that during x86 address space teardown that there's performance trouble with pte modification because it's typically atomic to avoid racing with CPU's that might be updating the pte at the same time.
11:47:52 <nyc> So a read-only pte access would be notably less intensive than an RMW to zero it.
11:47:56 <immibis> why would you be tearing down an address space that still has threads running on it?
11:49:00 <nyc> immibis: It wouldn't have threads running on it. The messages from the CPU's that had threads running on it might still be in transit.
11:49:56 <nyc> immibis: Also, speculation.
11:50:38 <doug16k> speculation is limited to the imagination of a core, the effects aren't globally visible
11:51:04 <immibis> you certainly can't speculate writes... right?
11:51:15 <doug16k> the effects become visible at retirement, in order
11:51:52 <knebulae> @doug16k: well that *was* the contract...
11:52:05 <immibis> still, why would you tear down an address space that is scheduled on another core, without having a message from that core confirming the thread is no longer using the address space? that seems like a blatant concurrency bug going on there
11:52:38 <doug16k> knebulae, lol, yeah. my use of "effects" doesn't include whether a cache line is cached ;)
11:52:54 <nyc> It wouldn't be scheduled on the other core.
11:55:27 <geist> re: A and M bits getting modified on x86, it's pretty fast. there's not a requirement that you flush the TLB to get update
11:55:42 <geist> basically the A bit gets set the moment you bring the TLB into existence, if it's not already set
11:56:02 <doug16k> yep sets it then and never again
11:56:04 <geist> the TLB doesn't need to actually track the A bit since it's essentially implicit that the cpu wrote it back
11:56:26 <doug16k> and only if it was 0 when it loaded it
11:56:31 <geist> M bit is more complicated, but due to the strongly ordered nature, it pretty much writes it back immediately
11:56:51 <doug16k> yep and remembers it is 1 now and never sets it again even if you clear pte M bit
11:57:30 <geist> there's some SMP racy bits there, i think where cpu A can atomically swap a page table entry with 0 (or otherwise disable it), and then cpu B tries to write back the M bit
11:57:49 <geist> the AMD manual is a little clearer on this, it seems to imply that it'll only write back to 'valid' page table entries
11:58:21 <geist> so presumably the cpu does a RMW of the PTE to or in the M bit, and it only does it if it's still an active entry. unclear if it checks that it still matches the TLB entry it is dealing with
11:58:27 <geist> the intel manual seems pretty unclear on this
11:58:28 <doug16k> you mean it cmpxchg the entry?
11:58:41 <geist> that's what the AMD manual implies, but it's not extremely clear
11:58:50 <doug16k> that would be cool :D
11:58:59 <geist> basically it says something pretty terse like it writes back the M bit to the page table if it's a 'valid' entry
11:59:17 <geist> well, it makes sense, otherwise if it blindly just did an atomic or on the PTE it could trash a new entry
11:59:18 <doug16k> still would only have an effect if the current tlb entry says a bit change is needed
11:59:26 <geist> oh absolutely
11:59:39 <geist> which is why when you unset the M bit you have to flush all the TLB entries on all cpus, etc