#osdev @ OPN/FreeNode from 3apr2001 to 23may2021



http://bespin.org/~qz/search/?view=1&c=osdev&y=19&m=5&d=3

Friday, 3 May 2019

12:11:01 <elderK> doug16k: No x86_32 gcc?
12:11:06 <elderK> (Compiler Explorer)?
12:11:10 <elderK> It seems massively useful
12:11:47 <elderK> Also, is there a page that, like, summarizes what GCC does in what cases? Like, when does it auto-vectorize and generate SSE/MMX instructions for things like printf? :)
12:11:49 <zid> I keep meaning to get around to installing an x32 vm
12:12:01 <elderK> That way, I can get to like, understanding when I need to use what flags, like general-regs-only
12:12:18 <elderK> zid: I thought CE was C++ only? It seems to support C?
12:12:23 <zid> what's CE
12:12:27 <zid> godbolt?
12:12:36 <zid> it supports everything gcc supports
12:12:39 <zid> either via dropdown or -x lang
12:12:42 <elderK> Yeah, Compiler Explorer.
12:14:28 <doug16k> elderK, use -mx32
12:15:54 <doug16k> https://godbolt.org/z/gYtkN4
12:17:08 <doug16k> elderK, callers to stdarg functions don't do a special thing, they call it normally, using the usual register parameters, spilling to stack, etc
12:17:59 <doug16k> the compiler does it a bit lazy. it just pumps the register parameters into an array. it does the same for sse registers, just in case
12:18:16 <doug16k> then, va_arg can just choose integer or sse array, and just grab it from there
12:18:27 <doug16k> once it runs out of those it starts taking them from the stack
12:18:58 <doug16k> so, printf will end up doing a bunch of movaps to a stack buffer, just in case you va_arg something floating point
12:22:18 <doug16k> xmm0 through xmm7 will be saved to that buffer, just in case
12:22:38 <doug16k> additional ones are put on the stack
12:23:05 <doug16k> up to 8 floating point parameters can be passed in xmm registers
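A minimal sketch of the mechanism described above, assuming an x86_64 System V target. The helper name `sum_mixed` and its format string are hypothetical, made up for illustration. Each `va_arg` picks the integer or the SSE save area depending on the requested type, and falls back to the stack once the registers of that class are used up.

```c
#include <stdarg.h>

/* Hypothetical helper: walks `kinds` and adds up one argument per
   character. 'i' means the next argument is an int, 'd' a double. */
double sum_mixed(const char *kinds, ...)
{
    va_list ap;
    double total = 0.0;

    va_start(ap, kinds);
    for (const char *k = kinds; *k != '\0'; ++k) {
        if (*k == 'i')
            total += va_arg(ap, int);     /* integer class: register save area, then stack */
        else if (*k == 'd')
            total += va_arg(ap, double);  /* SSE class: xmm save area, then stack */
    }
    va_end(ap);
    return total;
}
```

Calling `sum_mixed("ddddddddd", 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0)` passes nine doubles, so on that ABI the ninth should travel on the stack rather than in an xmm register; the C code does not need to care either way.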
12:31:15 <doug16k> neat! seems gcc is being smarter than I thought it was -> https://godbolt.org/z/78DadU
12:31:29 <doug16k> if you take out the double one it removes the xmm stores
12:32:14 <doug16k> I wonder if it is because it thinks my va_list escapes
12:34:58 <doug16k> here I'm making it escape deliberately. see what it does with the register parameters?
12:34:59 <doug16k> https://godbolt.org/z/K5Cy57
12:38:14 <elderK> What is the best way to learn these details? Just disassembling such calls / functions?
12:38:27 <doug16k> x86_64 elf ABI specification
12:39:16 <doug16k> this is almost newest: http://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf
12:39:19 <doug16k> probably close enough
12:39:59 <elderK> Thank you doug16k.
12:40:14 <doug16k> tells you everything about the initial state of the stack and machine, details about dynamic linking, TLS, function calls, etc
12:46:17 <doug16k> I lied when I said it calls stdarg functions the normal way. it is the normal way except it also expects you to pass the number of floating point arguments in register %al
12:52:34 <elderK> :) I'll have to check out the ABI in detail some time.
12:59:03 <clever> elderK: http://dbp-consulting.com/tutorials/debugging/linuxProgramStartup.html also has some related info
01:00:05 <clever> elderK: and http://www.muppetlabs.com/~breadbox/software/tiny/teensy.html then abuses the ELF specs by abusing the fact that absent fields are 0 (if the executable is SMALLER than an ELF header!!)
01:00:13 <clever> and then putting assembly into the elf header itself, lol
01:00:44 <clever> in the end, they made a 45 byte executable, that is a "working" ELF binary
01:02:02 <elderK> Thank you, clever :)
01:05:14 <gog> tfw your massively overhauled code compiles
01:05:26 <jjuran> :-)
01:05:30 <gog> i have yet to test it
01:05:35 <gog> i guarantee it doesn't work
01:06:40 <gog> i'm reluctant to start a debugging marathon when i don't have any cigarettes lmao
02:41:00 <eddyb> ooops I forgot to join this
02:41:17 <Mutabah> Welcome
02:41:18 <eddyb> whenever I switched to IRCCloud
02:41:23 <eddyb> Mutabah: wait
02:41:42 <eddyb> I wonder how many names I should've recognized over the past 5 years and I hadn't
02:41:49 <Mutabah> :D
02:42:02 <Mutabah> Well, when I first joined #rust I was still using thePowersGang here
02:42:23 <eddyb> why does that sound familiar?
02:42:39 <eddyb> anyway I was looking at a random channel I haven't touched in ages and, in the flags, well
02:42:52 <eddyb> alex from exclaim had an entry
02:43:10 <eddyb> how would I even figure out what he's up to lately lol
02:44:41 <eddyb> oh, aejsmith :D
02:45:15 <eddyb> sorry, that was easier than I expected, thanks to the forums (may they never die)
02:48:23 <eddyb> Mutabah: wow I just looked at the nick list and maybe I remember 2 names (tk tech and com buster), wow
02:48:53 <eddyb> I guess a lot changes in a decade
03:25:17 <geist> oh hey eddyb, haven't seen you in a while
03:26:20 <geist> doug16k: re: the SSE thing in va_arg. iirc x86-64 abi has a hidden bool in rax that says whether or not there are any fpu args present
03:26:34 <eddyb> been a bit busy, heh
03:26:37 <geist> there's a switch to disable it, which is a tiny micro-optimization if you're not using float
03:27:00 <eddyb> nowadays I am mostly moving mountains in Rust
03:27:05 <eddyb> with spoons
03:27:06 <geist> word.
03:27:14 <geist> but i'm sure it's very safely done
03:27:19 <eddyb> makes me wish I had more than two hands
03:28:08 <eddyb> geist: I mean, in the implementation. so, we have to be extra careful to not break the safety :P
03:29:37 <geist> ah
03:30:08 <geist> i should fiddle with it at some point but i haven't had the energy to devote to a proper language thrashing
03:33:01 <eddyb> the story is that, more than 5 years ago, miselin (of the Pedigree project) and I were trying to build a tiny kernel in it, and I was bothered that inline assembly outputs weren't treated exactly like an assignment
03:34:04 <eddyb> like, they were treated read/write even when write-only, so you had to initialize your "inport" or w/e result, even if the instruction would immediately overwrite it
03:34:28 <eddyb> so I dug into the compiler and found it stupidly easy to modify to get the right behavior
03:35:04 <eddyb> so that's how I got stuck in compiler land, where I will probably spend the rest of my life
03:46:27 <geist> yah that happens sometimes
05:04:12 <ZetItUp> https://www.youtube.com/watch?v=rywLHa1i9yk cool
05:24:26 <doug16k> ZetItUp, it's actually x86_64
05:25:37 <doug16k> 2W 2core 2MB cache 2.24GHz boost, 2 memory channels, 200MHz graphics base, 2 pcie lanes. were they joking around with all the 2's? :)
05:25:50 <doug16k> https://ark.intel.com/content/www/us/en/ark/products/85474/intel-atom-x5-z8500-processor-2m-cache-up-to-2-24-ghz.html
05:27:13 <doug16k> $27 eh? neat
05:27:25 <ZetItUp> yeah was about to say haha
05:27:59 <geist> very cute. the last little x86 embedded thing i got was an intel galileo and a joule. one of them they cancelled
05:48:13 <doug16k> heavy disk I/O is causing audio skip in chrome? oh now typing is hanging up a bit. seriously?
05:52:17 <bcos> doug16k: On which OS?
05:53:53 <bcos> (it's a situation where thread priorities and IO priorities have a major effect, but...)
05:55:08 <eddyb> have they changed the default IO scheduler in linux yet?
05:55:37 <eddyb> I keep being reminded about... bfq was it? every year or so
05:58:00 <bcos> For a lot of Linux stuff there's no sane way to specify the priority, so you're mostly screwed before you start ;-)
06:08:14 <doug16k> bcos, ubuntu
06:10:03 <doug16k> it's a huge "mv" of a large subtree of stuff
06:10:12 <doug16k> from one filesystem to another
06:10:49 <doug16k> the pause feeling reminds me of garbage collectors
06:12:55 <doug16k> 6.6G mem used, 9.7G "cache", swap: 0kb
06:20:43 <ZombieChicken> doug16k: renice and ionice might help?
06:25:47 <doug16k> ah ionice. put the mv in idle class. will see
06:26:16 <doug16k> nope. horrible ~800ms pause of sound
06:26:34 <doug16k> wow, 3.5 second pause
06:26:50 <doug16k> yay, going to be eeeeasy to beat this interactive perf lol
06:27:26 <ZombieChicken> doug16k: Did you pass ionice the right process ID?
06:28:09 <doug16k> https://gist.github.com/doug65536/1b57929679d723c58876c2603bbc9d47
06:28:41 <ZombieChicken> try ionice -c3 -p $PID?
06:28:44 <ZombieChicken> that's what I use
06:29:18 <doug16k> done
06:29:26 <ZombieChicken> that worked?
06:30:03 <doug16k> no still had huge gap in audio after that
06:30:07 <doug16k> but it succeeded
06:30:35 <ZombieChicken> weird
06:33:29 <doug16k> oh... sudo perf top is telling me that make is doing a lot. I'm not building anything
06:33:52 <doug16k> I think that is eclipse being its bloated ass self spinning in some loop of crap in the background forever
06:34:02 <doug16k> I uninstalled it
06:34:48 <ZombieChicken> that would certainly cause problems
06:38:30 <doug16k> while true; do sudo killall make && echo 'killed!' ; done <- prints killed! rapidly
06:39:08 <doug16k> so a minigun firing kill signals won't stop it
06:39:30 <doug16k> oh, -9
06:41:49 <doug16k> killall firefox <-- nothing happens
06:42:08 <doug16k> killall -9 firefox <-- oh suddenly not so badass eh firefox?
06:42:08 <ZombieChicken> pkill -KILL make?
06:42:58 <doug16k> won't stop
06:43:05 <ZombieChicken> yeah
06:43:17 <ZombieChicken> sometimes processes in Linuxland don't want to go away
06:43:57 <doug16k> linux scheduling and paging is so bad I can't see not doing better than this
06:43:58 <doug16k> sorry
06:46:57 <bcos> Hrm - "recursive make fork-bomb", with race conditions (killall gets a list of processes, then make forks to recurse, then killall kills the old processes but not the newly forked ones)
06:47:16 <bcos> ^ not necessarily easy to prevent
06:48:45 <ZombieChicken> wb
06:48:53 <ZombieChicken> had to reboot to fix the problem?
06:51:27 <doug16k> yes it was getting too laggy
06:52:29 <ZombieChicken> I wonder whether or not Linus & co. have run into an unfixable bug yet
06:54:21 <doug16k> it's something to do with the kernel. no way another program's disk I/O file moves should be skipping audio in firefox
06:54:51 <doug16k> it just did it again. and typing hung for about 1.2 sec and it suddenly woke up
06:55:59 <doug16k> I had stats for nerds open. last pause it had 120 seconds buffered
06:57:53 <immibis> ZombieChicken: spectre
07:00:47 <ZombieChicken> immibis: Isn't that a CPU problem?
07:01:46 <immibis> it's an unfixable bug in the CPU
07:02:18 <ZombieChicken> yeah, I'm talking about a bug with the kernel itself
07:02:37 <ZombieChicken> not something wrong with literally every 'modern' CPU in current existence
07:02:51 <ZombieChicken> (though I think ARM has a 'solution' for that now?)
07:10:01 <doug16k> IBPB solves enough of the problem
07:10:54 <doug16k> it's an MSR you write, Indirect Branch Prediction Barrier
07:11:13 <doug16k> branch histories from before the barrier can't affect jumps after the barrier
07:11:56 <doug16k> also SSBD has been implemented as microcode fix for a variant
07:12:21 <doug16k> STIBP mitigates the SMT thread vulnerability
07:13:15 <doug16k> SSBD = Speculative Store Bypass Disable
07:13:25 <doug16k> STIBP = single thread indirect branch predictor
07:13:35 <doug16k> s/single/separate/
07:14:27 <doug16k> and failing all that, don't run untrusted and trusted code on same core and use retpoline
07:14:59 <retpoline> please use me, thank you
07:15:12 <retpoline> .theo
07:15:13 <glenda> Bye bye.
07:15:13 <doug16k> slowdown is welcome to me. it was cheating
07:16:47 <doug16k> the mitigated x86 is the correct speed
07:18:34 <doug16k> obviously blindly speculating into everything and letting everything share branch history arbitrarily is faster. and wrong
07:18:54 * bcos is still planning to have "how trusted" and "how sensitive is the data" values associated with processes; to control (enable, disable, determine how much) which security mitigations are needed
07:20:46 <bcos> (without something like that, you have to use permanent "slowdowns for worst case assumptions")
07:28:20 <geist> the ghost of retpoline speaks
07:28:56 * retpoline jumps around
07:29:27 <bcos> retpoline: No, you don't jump around - your return around instead..
07:29:34 <bcos> *you
07:30:11 <geist> jump jump around jump around
07:30:15 <retpoline> i was thinking of Kris Kross
07:30:20 <retpoline> :)
07:30:22 <retpoline> exactly
07:30:33 <geist> house of pain had it all figured out
07:30:36 <Kazinsal_> jump up jump up and get down
07:42:19 <doug16k> they were trying to tell us about spectre and we were too dumb to realize it!
07:44:06 <doug16k> listen to song choruses. there may be a CVE in there if you really listen
07:47:56 <doug16k> "jump around" (warm up BTBs) "jump around" (warm up BTBs some more) "jump up jump up" execute kernel gadget, "and get down" transfer the data down to your low privilege level through sidechannel
08:00:13 <mquy90> I wonder what will happen when returning a struct (stack) in c?
08:04:05 <mquy90> I guess there is copying after ret, because struct address is on eax
08:06:11 <lava> bcos: i also think we need to revisit this idea of annotating secrets
08:06:40 <lava> if we would have this information we could do much more and much more efficiently
08:07:35 <lava> and yes, it would boil down to people performing second-order attacks, but they might not leak the actual data, and surely there will be loads of "bugs" where data is incorrectly annotated... but those will be add-one-keyword fixes
08:07:36 <Mutabah> mquy90: Depends on the size of the struct (and the specific ABI)
08:08:07 <Mutabah> mquy90: if it's "large" (-er than 2 pointers) then the ABI is usually to pass a pointer to the function, and that function writes to the return value via that pointer
08:11:06 <mquy90> pass a pointer to the function, even function without arguments?
08:13:01 <bcos> lava: Just try to imagine an "all memory is non-volatile" future (and the opportunities for cold-boot attacks)
08:13:13 <lava> yes
08:13:32 <lava> gives more and more reasons
08:18:56 <Mutabah> mquy90: Yes. The compiler will do this internally
08:20:20 <mquy90> I did objdump, and I got something similar to this SO, https://stackoverflow.com/questions/2155730/how-do-c-compilers-implement-functions-that-return-large-structures
08:20:32 <mquy90> @Mutabah, it seems that it doesn't do anything special?
08:38:20 <geist> mquy90: the ABI specifically states what happens when you do that
08:38:47 <geist> generally speaking if it's above a certain size the caller will pass a pointer to a buffer that the callee fills in
08:40:05 <geist> which is what Mutabah wrote a while ago. i should read the backlog more before answering :)
08:56:04 <mquy90> @geist, do you have the link, just to check that specification
09:10:28 <doug16k> shouldn't we do fsbase and gsbase such that if the user mode code saves and restores the value of fs and gs the bases come back?
09:10:54 <doug16k> touch-and-you-are-screwed seems a bad design
09:10:58 <Mutabah> mquy90: "The link" doesn't exist, there's lots of different ABIs
09:11:33 <Mutabah> mquy90: You might want to see if you can find documentation on the "sysV" x86 ABI
09:12:44 <doug16k> ah, can't be done unless fs and gs base are < 4GB
09:14:17 <mquy90> ah thanks @Mutabah, @geist
09:17:11 <doug16k> has anything used call gates? seems like they would certainly be faster than interrupts
09:19:28 <doug16k> interrupt would have to read GDT anyway when it loads cs. why not touch the same line adjacent to it in the GDT and go faster(?)
09:21:06 <doug16k> a syscall gate right there in the gdt next to the hot cs/ds entries seems optimal
09:22:50 <doug16k> is the cost of loading cs from gdt really that bad vs syscall cheating hardcoded flat values?
09:23:52 <doug16k> can't do in 64 bit mode of course, far call is #UD
09:24:13 <doug16k> I think intel allows it but amd64 spec says it is invalid
09:27:04 <doug16k> I guess in syscall you can schedule in some of your call overhead stuff into getting the stack pointer set up and calling the syscall handler. interrupt handling stalls until it has that all completely figured out and does nothing until then
09:46:52 <geist> mquy90: depends on which ABI you're using
09:47:02 <geist> are you x86-64? using the SVR4 abi?
09:47:16 <geist> sysv yah
09:47:56 <geist> if so, then iirc the abi for that says up to 2 words of struct are returned via rax/rdx and after that there's a hidden first argument pushed which is a pointer to the return spot
09:48:06 <geist> but you should consult the abi for specifics
10:03:23 <geist> off topic but it's fascinating: https://www.staff.ncl.ac.uk/daniel.nettle/PowellRobertsNettle.pdf
10:03:51 <geist> TL;DR adding googley eyes to a supermarket donation bucket increased donations
10:05:58 <geist> there's some prior research to this too apparently. the idea is that we're probably tuned to the idea of 'being watched' as a push to conform or be prosocial, as the paper put it
11:25:10 <doug16k> mquy90, yes, as geist mentioned, in general returned objects up to 128 bits will be returned in registers (rdx:rax). Non-trivial objects will not be "returned" really, the called function will receive a pointer into which it constructs the return value in-place
11:29:30 <mquy90> 128bits, you mean 64bits computer?
11:30:38 <mquy90> or I might be wrong, I thought that eax, edx are 2 words -> 64bits
11:31:07 <doug16k> if the caller said something[42] = some_fn() and say some_fn returns a structure, then when it calls some_fn it will really pass a pointer to something[42] into it, and the function will write its "return value" in place -> https://godbolt.org/z/U2e4Uy
11:31:41 <doug16k> it also returns the pointer to the return value
11:31:48 <doug16k> mquy90, I meant on x86_64
11:32:03 <doug16k> on i386 it is the same idea, but yeah, edx:eax -> 64 bit max register return
11:33:24 <mquy90> ah ah, just know about godbolt.org :+1:
11:33:34 <doug16k> https://godbolt.org/z/oz8lw6 <-- note 64 bit "structure" return value in rax
11:34:30 <doug16k> i386 isn't being aggressive -> https://godbolt.org/z/z0Bka1
11:34:50 <doug16k> ret_small is doing the same thing with the pointer to the return structure
11:35:43 <doug16k> i386 never does it -> https://godbolt.org/z/x46O19
11:37:52 <doug16k> compare 32 bit and 64 bit -> https://godbolt.org/z/6aKLZA
11:38:08 <doug16k> x86_64 is awesome in comparison
11:55:05 <doug16k> anyway, TLDR is, i386 pushes a pointer to storage for the return value for structures, x86_64 will use registers up to 128 bits wide then kick over to same as i386
11:55:36 <doug16k> in x86_64 case, first parameter is the pointer to storage, it also returns that pointer
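A small sketch of the two cases summarized above, assuming an x86_64 System V target. The struct and function names are hypothetical, and the register comments describe what that ABI specifies rather than anything the C code itself can observe. A 16-byte struct fits in rdx:rax, while a 32-byte one is written through a hidden pointer supplied by the caller.

```c
#include <stdint.h>

/* 16 bytes: small enough to come back in rdx:rax on x86_64 System V. */
struct small { uint64_t lo, hi; };

/* 32 bytes: too big for registers, so the caller passes a hidden
   pointer to storage and the callee fills it in. */
struct large { uint64_t v[4]; };

struct small make_small(uint64_t x)
{
    struct small s = { x, x + 1 };
    return s;
}

struct large make_large(uint64_t x)
{
    struct large l = { { x, x + 1, x + 2, x + 3 } };
    return l;
}

uint64_t demo(void)
{
    struct large slots[4] = { { { 0 } } };

    /* The compiler may hand make_large a pointer straight to slots[2],
       or to a temporary that it copies afterwards. */
    slots[2] = make_large(10);

    struct small s = make_small(1);
    return slots[2].v[3] + s.hi;   /* 13 + 2 */
}
```

Compiling this with `-O2` for both `-m64` and `-m32` and comparing the output, as in the godbolt links above, is one way to see the difference between the two ABIs.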
11:56:40 <ZetItUp> https://gyazo.com/e3e591fb3b6c6733b43614be2c6e5847
11:56:47 <ZetItUp> typos in the Intel manual :P
11:57:07 <doug16k> \\ ?
11:57:12 <ZetItUp> ye :)
11:58:05 <ZetItUp> saw it in this video: https://www.youtube.com/watch?v=LA_DrBwkiJA
11:58:12 <ZetItUp> which is pretty interesting btw
12:06:18 <ZetItUp> that... was the best rickroll ever.
12:23:34 <doug16k> ZetItUp, unicode would enable more values than 7-bit printable ascii
12:24:03 <zid> not that ascii has 7 printable bits, more like 3
12:25:10 <zid> 5, final offer
12:25:32 <doug16k> geordi, << log(126-32+1)/log(2)
12:25:33 <geordi> 6.56986
12:25:48 <zid> and as we all know, 6.5 to the nearest multiple of 5 is 5
12:32:32 <ZetItUp> hehe
12:32:54 <doug16k> I always forget log2 is a function
12:33:14 <zid> anything is a function if you're brave enough
12:34:19 <doug16k> it's more precise too usually
12:35:43 <doug16k> geordi, << "wow: " << setprecision(20) << log2(126-32+1) - (log(126-32+1)/log(2))
12:35:44 <geordi> wow: 0
12:36:34 <bauen1> ._.
12:36:40 <bauen1> do we have a c++ bot ?
12:36:49 <zid> geordi just hangs around freenode
12:37:04 <zid> there's a good C bot too but it's hoarded by pragma-
12:37:30 <doug16k> bauen1, geordi is a C++ bot
12:37:39 <bauen1> i need to continue work on my brainfuck irc bot and add some useful commands and multi channel support
02:24:01 <Pyjong> Hey guys! Would someone perhaps know what is the Pci Express way of enumerating bus?
02:24:20 <zid> mmio config space stuff?
02:24:33 <zid> generally the region location is provided by acpi
02:25:31 <Pyjong> I rather meant device discovery aspect of the enumeration
02:26:21 <Pyjong> Say if I have a root complex and a bunch of switches, how do I find out what downstream ports actually have devices connected on them
02:26:32 <zid> ah, ask elderK ;)
02:26:51 <bcos> Pyjong: It's essentially the same. You can even use the same "access mechanism #1" (with IO port 0x0CF8, etc) - mostly because of backward compatibility with older operating systems
02:27:38 <zid> Will there just be devices every 0x1000 in the mmio range or whatever?
02:27:43 <bcos> ..the newer memory mapped PCI config space is better (faster) and unlocks larger PCI config spaces (4096 bytes rather than 256 bytes)
02:28:06 <zid> if they exist, are they packed, and are they all there, could I probe for vendor:device ids, etc?
02:28:18 <bcos> ..but it's mostly (intentionally) optional - the extra space isn't used for essential stuff
02:29:04 <Pyjong> Emm ok so.. how does it work though? I have the rootcomplex config header, how do I find out about devices on its downstream ports?
02:29:54 <bcos> zid: Can't remember the exact formula; but it's something like "physical_address_in_mapping = mapping_base + (((bus_number - first_bus_for_mapping) * 32 + device_number) * 8 + function_number) * 4096;"
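The formula quoted above can be sketched in C as follows. The function name `ecam_address` and its parameter names are hypothetical; the mapping base and first bus number would normally come from the ACPI MCFG table, which this sketch simply takes as inputs.

```c
#include <stdint.h>

/* Physical address of the 4096-byte config space of one function
   inside a memory-mapped PCIe configuration region.
   32 devices per bus, 8 functions per device, 4096 bytes each. */
uint64_t ecam_address(uint64_t mapping_base, uint8_t first_bus,
                      uint8_t bus, uint8_t device, uint8_t function)
{
    uint64_t index = ((uint64_t)(bus - first_bus) * 32 + device) * 8 + function;
    return mapping_base + index * 4096;
}
```

With this layout each bus covers 1 MiB of address space, so bus 1, device 0, function 0 sits exactly 0x100000 bytes above the mapping base.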
02:30:07 <zid> oh right I remember that yea
02:30:09 <zid> it's on the wookie
02:30:12 <zid> I remember seeing it semi-recently
02:30:29 <zid> presumably I could probe from that though, was the real question?
02:31:53 <bcos> Mostly I'd do an abstract "get_dword_from_PCI(bus, device, function, offset) { if(using_PCIE_stuff() ) { use PCIE stuff } else {..." and not have to care what the access mechanism is anywhere else
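For the legacy side of such an abstraction, here is a hedged sketch of the value written to IO port 0x0CF8 under "access mechanism #1", plus the usual presence test on the dword read back from offset 0. The function names are hypothetical and the port I/O itself is left out so the snippet stays runnable in user space.

```c
#include <stdint.h>

/* Builds the CONFIG_ADDRESS value: enable bit, bus, device, function,
   and a dword-aligned register offset (legacy space is 256 bytes). */
uint32_t pci_config_address(uint8_t bus, uint8_t device,
                            uint8_t function, uint8_t offset)
{
    return 0x80000000u
         | ((uint32_t)bus << 16)
         | ((uint32_t)(device & 0x1F) << 11)
         | ((uint32_t)(function & 0x07) << 8)
         | (uint32_t)(offset & 0xFC);
}

/* A vendor ID of 0xFFFF in the low half of the first dword means
   nothing responded at that bus/device/function. */
int pci_function_present(uint32_t id_dword)
{
    return (id_dword & 0xFFFFu) != 0xFFFFu;
}
```

Enumeration then amounts to probing each bus/device/function this way and, when a function turns out to be a PCI-to-PCI bridge, continuing on the secondary bus number found in its header. Downstream ports of a root complex or switch show up as such bridges.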
02:32:21 <zid> clearly you should just patch the function call instead of using an if()
03:00:41 <Pyjong> you guys missed it by an inch xD
03:11:20 <bauen1> as it turns out, allocating the first bit of memory is actually really really complicated to do properly
03:13:09 <bcos> bauen1: Yes, best to allocate groups of 8 bits ;-)
03:13:25 <zid> I entirely ignore lowmem
03:13:28 <zid> it's just not worth the hassle
03:14:39 <bauen1> even if you ignore lowmem, you have to find a place in memory above 1mb
03:15:16 <bauen1> and doing it properly involves checking the mmap, because technically the kernel and modules could take up the free memory and then be followed by reserved memory
03:15:45 <bauen1> multiboot2 also doesn't sort the module tags which just makes everything even more annoying
03:15:48 <zid> My place above 1MB is "1MB" :p
03:15:59 <zid> if there's no room for my kernel there, grub wouldn't put me there
03:16:25 <zid> although technically I could OOM if there's like, only 1/2 pages after the end of my kernel instead of the like 10 I end up using
03:16:32 <bauen1> yes
03:16:35 <zid> but who has an amd64 machine with 2MB of total ram
03:16:43 <bauen1> there could be a hole right there
03:17:00 <zid> If there is, that hardware needs a special bootloader fix
03:17:06 <zid> the general case should entirely ignore that possibility imo
03:17:40 <bauen1> well, i'm trying to do this properly :D
03:17:47 <zid> My way is proper, imo
03:17:53 <zid> For a much more important definition of proper
03:17:57 <bauen1> true
03:18:10 <bcos> For UEFI; there's no guarantee that the whole first 4 GiB isn't used by firmware
03:18:11 <zid> supporting common hardware quickly and effectively, rather than hardware that almost certainly doesn't exist, and supporting it may introduce bugs into the normal case
03:18:58 <bcos> (although faulty RAM is also an option)
03:19:30 <zid> imo it's on you to fix your grub to load me at a different address, if the address you put me at is broken
03:19:39 <zid> not me to support badblocks inside the bootloader
03:19:40 <bauen1> the problem isn't actually the size of the kernel, but what starts happening when you start loading modules
03:20:02 <zid> modules are pre-loaded by grub
03:20:10 <zid> and I don't care where they are
03:20:17 <zid> if they're at 1GB that's fine
03:20:33 <zid> but I do need some free pages at the end of it, technically, to make page tables out of, I don't purely allocate out of bss
03:20:39 <bauen1> i guess you just hardcoded the pmm bitmap size then for 32 bit ?
03:20:51 <zid> bitmap is handled by the kernel, not the bootloader
03:21:26 <bauen1> yes, but you need a place for it, you could be on a machine with 32mb ram or 4gb
03:21:30 <zid> bootloader is just doing free += 4096; return free; more or less, so that it can install the bare minimum page tables to install the kernel into virtual memory
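A minimal sketch of the kind of bump allocator described above, assuming the loader only needs a handful of pages right after the loaded image and never frees them. The names `bump_init` and `bump_alloc_page` are hypothetical, and unlike the one-liner above this version returns the page before advancing. It does not consult the memory map, which is exactly the shortcut under discussion.

```c
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Next page to hand out; set once from the end of the loaded image. */
static uintptr_t next_free;

/* Rounds the end of the image up to a page boundary. */
void bump_init(uintptr_t image_end)
{
    next_free = (image_end + PAGE_SIZE - 1) & ~(uintptr_t)(PAGE_SIZE - 1);
}

/* Hands out one page and advances. There is no free and no bounds
   check, so it silently walks into whatever follows the image. */
uintptr_t bump_alloc_page(void)
{
    uintptr_t page = next_free;
    next_free += PAGE_SIZE;
    return page;
}
```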
03:21:42 <zid> kernel then actually checks the e820
03:21:50 <bauen1> so you just have a big bss ?
03:21:58 <zid> making the bootloader pull free += 4096 from the e820 would just be a pain
03:22:10 <zid> if you don't have 40kB free after the end of the kernel image, you have issues with your machine :P
03:22:27 <bcos> That's most likely in use by modules
03:22:29 <bauen1> ^
03:22:39 <zid> what modules?
03:22:39 <bauen1> grub just puts the modules after your kernel
03:22:46 <zid> no, the kernel *is* a module, in my case
03:22:49 <zid> the bootloader is the kernel
03:22:52 <bauen1> oh
03:23:00 <bauen1> that is interesting
03:23:08 <zid> grub 0.97 won't load ELF64
03:23:18 <bcos> So, 1st is bootloader, 2nd is kernel, 3rd is initRD
03:23:21 <zid> but, module kernel-64.elf; kernel boot32.elf
03:23:22 <zid> works great
03:23:44 <zid> I don't have an initrd, and my bootloader would need changing to support it if I did
03:23:52 <zid> so it's not a problem
03:24:24 <bcos> Ah - you load boot splash screen as a separate file (not in "initRD")?
03:24:37 <zid> I don't 'load' anything from the bootloader
03:24:49 <zid> all it does is install about 10 pages worth of mappings (for the kernel), enters long mode, and jumps
03:24:59 <zid> I could allocate out of bss if I really wanted to
03:25:36 <zid> but it's going to be an arbitrary number of pages, depending on the alignment of the kernel, I might need two PML4 level entries, probably only need 1
03:25:47 <zid> and size
03:25:58 <zid> well, pml4 is a bit high, doubt the kernel is 512GB big :p
03:26:05 <zid> might need multiple PD mappings, might not
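To see how many entries at each level a mapping touches, the virtual address can be split into its four 9-bit table indices. This sketch assumes 4-level x86_64 paging with 4 KiB pages; the struct and function names are hypothetical.

```c
#include <stdint.h>

struct pt_indices {
    unsigned pml4;  /* bits 47..39, each entry covers 512 GiB */
    unsigned pdpt;  /* bits 38..30, each entry covers 1 GiB   */
    unsigned pd;    /* bits 29..21, each entry covers 2 MiB   */
    unsigned pt;    /* bits 20..12, each entry covers 4 KiB   */
};

struct pt_indices split_vaddr(uint64_t va)
{
    struct pt_indices i = {
        (unsigned)((va >> 39) & 0x1FF),
        (unsigned)((va >> 30) & 0x1FF),
        (unsigned)((va >> 21) & 0x1FF),
        (unsigned)((va >> 12) & 0x1FF),
    };
    return i;
}
```

For a kernel linked at 0xFFFFFFFF80000000 (the top 2 GiB), this gives PML4 index 511 and PDPT index 510, so a kernel smaller than 1 GiB stays within one PML4 entry and one PDPT entry and only the number of PD and PT entries varies with its size.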
03:27:50 <zid> .bss: { * (.bss); _bss_free; . += 40*1024; } would probably be plenty, and I'd just change the code to free = _bss_free;, but that can break just as easily as ignoring the e820, so I went with the easier option
03:28:15 <zid> checking the e820 would be a pain, so I just didn't bother, the kernel already checks the e820 so duplicating all the code would be meh
05:25:37 <androidirc> Hi, i have a question. Im trying to enable long mode, and goes as well as it could be until i load the gdt. When i fix cs, i call ret and system triple faults (bochs throws a #GP fault), the execution scheme is this: a c function which sets up lmode, calls the lgdt for loading a 64-bit gdt, then loads data segments with 0x10, and i do a far jmp for fixing cs. But, when i get to the fix_cs function and i do 'RET', i got a #GP. What could it be? i tried cha
05:28:33 <bauen1> androidirc: does the #GP happen on the long jump instruction or the ret instruction (most debuggers place the arrow on the line that follows) ?
05:29:16 <androidirc> On the ret instruction
05:29:42 * bcos guesses it's 32-bit protected mode code where the call pushes a 32-bit "return EIP"; but it tries loading a 64-bit code segment so the return expects a 64-bit "return RIP"; so after RET the RIP ends up corrupt and...
05:31:28 <androidirc> so... ret does return eip, so i need retfq?
05:32:01 <bcos> Honestly; I'd do it in assembly to start with. C can't handle CPU mode switches in the middle of a function
05:32:42 <bcos> ..and make it all a linear piece of code (no calls, no rets - just a 64-bit "jmp" at the end)
05:33:17 <androidirc> Ok. So, instead of calling kmain directly from grub, first i should set up lmode and anything else. Sounds good :-)
05:33:39 <androidirc> And also call the kmain from the far jmp(?)
05:34:02 <bcos> It's more worserer than that maybe
05:34:40 <androidirc> Ok. Or could it be because im compiling for 32-bit code the far jmp part
05:34:51 <androidirc> I think im doing that, if i remember correctly
05:35:05 <bcos> Typically (for 64-bit kernel) you want the kernel mapped at a virtual address like 0xFFFFFFFF80000000; but that creates a big mess with linker (and because some versions of GRUB don't support 64-bit ELF)
05:35:26 <zid> which means you end up with 'cool' solutions like mine ;)
05:35:48 <bcos> ..so it can be more fun to have GRUB start a piece of code that sets up long mode and maps kernel; and then have the kernel as a 64-bit module loaded separately by GRUB
05:36:02 <zid> ^
05:36:05 <androidirc> @zid sound great, whats your solution?
05:36:23 <zid> what bcos just said
05:36:23 <androidirc> Bcos, i want also to have compatibility with 32-bit pmode
05:36:38 <androidirc> @yes :-)
05:36:44 <zid> @yes? @zid?
05:36:46 <zid> is this twitter?
05:37:13 <bcos> Although this isn't the only way - I think at least one person I know leaves kernel code at 0x00100000 (and puts kernel data in the higher half) so there's a small unusable blob at the start of user-space
05:37:17 <bcos> Hrm
05:37:21 <androidirc> typo, i was going to do @zid yes
05:37:24 <androidirc> Anyways
05:37:32 <zid> 'zid: ' is the preferred style
05:37:37 <androidirc> Ok
05:37:46 <bcos> Might also be fun to think about security (e.g. meltdown, and if you want kernel to be in its own separate virtual address space)
05:37:54 <androidirc> Ok
05:38:01 <bcos> (..and also things like KASR)
05:38:07 <bcos> *KASLR
05:38:12 <androidirc> Whats that?
05:38:18 <bcos> (Kernel Address Space Layout Randomisation)
05:38:51 <androidirc> it seems to be like a dynamic addr space allocation
05:38:54 <bcos> ..where the idea is that stuff in kernel space gets a randomised address to make various attacks (rowhammer) harder
05:39:15 <androidirc> Oh. Ok
05:39:29 <androidirc> Thanks
05:39:39 <zid> bcos: That isn't really the main attack it mitigates, fwiw
05:39:42 <zid> it's "any buffer over flow"
05:40:01 <zid> because you can't know where your shellcode is to jump to it, or where any kernel functions are to do rops, etc
05:40:26 <zid> can't even exploit write gadgets
05:40:26 <bcos> Just get your shellcode to jump to a known location in user-space..
05:40:41 <bcos> *return to
05:42:17 <zid> anyway https://github.com/zid/boros/blob/master/boot/long.asm That's what I wrote to do this
05:42:29 <zid> C calls this function, and magic happens, then the kernel is running up at -2GB
05:43:04 <bcos> "androidirc has left this server"
05:43:14 <zid> bah good timing
05:44:52 <zid> That looks like line 44 has the stack address wrong, neat
05:45:05 <zid> 10000 instead of 1000 is probably correct
05:45:28 <zid> although I guess only allocating a 4k stack would help me find funny code, maybe I did intend that
05:49:32 <IRCMonkey> Please read ~> https://www.welivesecurity.com/wp-content/uploads/2016/06/windows-10-security-privacy.pdf
05:49:44 <zid> no
05:49:45 <bcos> Why?
05:49:46 <IRCMonkey> Windowz 10 is most virus-safe OS ever…
05:49:55 <zid> please suck a dic
05:49:55 <IRCMonkey> Device Guard, EMIT, Defender…
05:50:31 <IRCMonkey> zid » Suck a flip slit !
05:50:38 <zid> lol
05:52:01 <aalm> .theo
05:52:01 <glenda> Are we finished here?