#osdev2 = #osdev @ Libera from 23may2021 to present

#osdev @ OPN/FreeNode from 3apr2001 to 23may2021

all other channels are on OPN/FreeNode from 2004 to present


http://bespin.org/~qz/search/?view=1&c=osdev&y=18&m=11&d=18

Sunday, 18 November 2018

12:02:10 <andrewrk> mrvn, do you know what that concept is called? powering down a core?
12:03:34 <mrvn> no. Maybe your cpu doesn't even have it. I think the RPi3 are crappy in that regard..
12:04:03 <andrewrk> would it be a feature associated with the processor (Cortex-A53), or with the system itself?
12:04:17 <mrvn> system
12:04:22 <andrewrk> that's a good clue. thanks
12:05:06 <mrvn> or the mpcore trm
12:05:46 <mrvn> did you look in the linux kernel if it has a core_die() function?
12:06:15 <froggey> afaik the standard way for doing power management on aarch64 is via the PSCI firmware
12:07:15 <andrewrk> I'll look for that
12:18:09 <andrewrk> mrvn, the Auxiliary Control Register controls access to "implementation defined" registers. does that mean the system could map one of these registers to control power-state of the CPU cores? is that what you were hinting at?
12:19:15 <andrewrk> I'll have a look at PSCI
12:22:12 <andrewrk> this looks right. I see source for it in both linux and freebsd
12:29:17 <mrvn> I'm stuck for some stupid reason trying to enable the MMU using 1MB sections on an ARMv7. Can anyone take a loog and spot a problem? https://gist.github.com/mrvn/b4eda5e397f4368fdf34de5af2aa3bc1
12:29:35 <mrvn> look even
12:29:57 * radens loogs
12:31:53 <radens> mrvn: hm haven't looked at 32 bit arm assembly in a while. I have some similar code lying around I think and the worlds shittiest patch to qemu for dumping the arm page tables, if that will help.
12:33:31 <mrvn> Args, I've been looking at this for an hour and as soon as I pasted the URL I spotted the problem. The identity mapping for the peripherals is off. It maps 0xC0000000 -> 0xF0000000 instead of identity. Never mind, solved.
12:33:56 <aalm> :]
12:34:04 <radens> amazing how fresh eyes can fix things
12:34:55 <mrvn> Is there some trick to make gcc emit code for X/10 and X%10 instead of calling the libgcc divmod function?
12:35:20 <radens> -ffreestanding doesn't do it?
12:36:43 <mrvn> no
12:37:08 <radens> what are your gcc flags?
12:37:10 <mrvn> currently my kprintf has this code: https://gist.github.com/mrvn/469301051ae4a71fde6af86df8d2f423
12:37:46 <mrvn> Would be nice if gcc would do that code on it's own.
12:38:34 <mrvn> radens: added to the gist
12:40:30 <radens> lol is that newton's method unrolled?
12:40:39 <radens> uh yeah never had that problem?
12:41:23 <radens> CXXFLAGS = $(COMMON_FLAGS) -std=c++17 -Wall -Wextra -ffreestanding -fno-exceptions -fno-rtti $(INCLUDES) -march=armv7-a
12:41:24 <mrvn> It's from HackersDelight. It multiplies with the inverse and then error corrects at the end.
12:41:28 <radens> that's what I had.
12:42:04 <radens> are you tied to that march? What happens when you swap it with mine and rebuild everything?
12:42:07 <mrvn> And since 1/10 in binary has a nice repeating pattern doing the mul as shift+add comes out nicely.
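A minimal sketch of the technique mrvn describes: multiply by an approximation of 1/10 using shifts and adds, then correct the error with the remainder. It is adapted from the `divu10` routine in Hacker's Delight and widened to 64 bits. The name `udivmod10` and its signature are hypothetical, and the code in mrvn's gist may differ.

```c
#include <stdint.h>

/* Hypothetical helper: divide n by 10 without a hardware divide or a
 * libgcc call. Returns the quotient and stores the remainder in *rem. */
static uint64_t udivmod10(uint64_t n, uint32_t *rem)
{
    /* Approximate n * 0.8 with shifts and adds.
     * 0.8 is 0.110011001100... in binary, a repeating pattern. */
    uint64_t q = (n >> 1) + (n >> 2);   /* n * 0.75 */
    q += q >> 4;
    q += q >> 8;
    q += q >> 16;
    q += q >> 32;
    q >>= 3;                            /* about n / 10, never too large */

    /* Error correction: the truncating shifts can leave q one too small,
     * which shows up as a remainder of 10 or more. */
    uint64_t r = n - ((q << 3) + (q << 1));   /* n - q * 10 */
    if (r > 9) {
        q += 1;
        r -= 10;
    }
    *rem = (uint32_t)r;
    return q;
}
```

Because the correction step already computes the remainder, a decimal print loop gets both the digit and the next value from one call.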
12:42:35 <radens> (you probably chose that for a reason, but I'm curious)
12:42:50 <mrvn> arm1176jzf-s is what the RPi has. Could change that for the helios4 build.
12:44:24 <radens> https://github.com/iankronquist/kernel-of-truth/blob/788b08d90a5e83a58061c141d93397647625ba3d/kernel/core/format.c#L46
12:44:49 <radens> that's my equivalent code. It doesn't have any similar problems?
12:46:13 <radens> Or, try replacing your while loop with my for loop and see what happens.
12:46:16 <mrvn> No armv7-a in my gcc
12:46:23 <radens> :(
12:47:47 <mrvn> I tried cortex-a9 but that gets divmod too: 2000564: eb000463 bl 20016f8 <____aeabi_uldivmod_veneer>
12:48:10 <mrvn> 020016f8 <____aeabi_uldivmod_veneer>: 20016f8: e51ff004 ldr pc, [pc, #-4] ; 20016fc <____aeabi_uldivmod_veneer+0x4> 20016fc: 82003000 .word 0x82003000
12:48:26 <mrvn> And that is a trampoline to libgcc in higher half. So no use before the MMU is up.
12:48:45 <radens> are you bringing up the MMU in c code?
12:48:53 <radens> I usually bring it up right away
12:49:06 <radens> saves a lot of hassle and you can make your kernel relocatable
12:49:21 <mrvn> radens: I wanted to so I could use printf to debug stuff. But now I do it right at boot.
12:49:54 <mrvn> I hate that gcc can't generate actual PIC/PIE code. Those trampolines are a pain.
12:51:02 <mrvn> Is there any linker trick one can use to tell that a piece of code is mapped at multiple places?
12:51:30 <mrvn> Like the libgcc. It's at 0x82003000 but also at 0x02003000. Pick the one that's closer.
12:54:12 <radens> https://gist.github.com/iankronquist/61909689ea6696bac930e553eb2e5a44
12:54:48 <radens> could just use clang instead
01:02:30 <radens> (that code is heavily influenced by LK, by the way, so credit where it's due)
01:04:52 <aalm> .theo
01:04:53 <glenda> I said no the first time, and provided a detailed explaination.
01:11:11 <mrvn> How do I cross compile with clang? The docs say to use -target but: clang: error: unknown argument: '-target=armv7a-pc-none-eabi'
01:12:36 <klange> It's --target
01:12:47 <radens> I think it's --target $(TRIPLE)
01:13:01 <mrvn> No, -target $(TRIPLE), the = was the error
01:13:27 <radens> don't you love the consistency of it all?
01:13:45 <klange> "--target=<value> Generate code for the given target"
01:14:09 <mrvn> klange: right, that work too. WTF
01:14:10 <radens> new clang may give you fits over -march, so use -mcpu=$(CPU)+nofp instead.
01:14:33 <klange> If '-target <value>' works it's an alias, and an apparently broken one if '-target=<value>' doesn't :)
01:14:59 <mrvn> Looks like the include files are broken: http://paste.debian.net/1052161/
01:15:36 <klange> I think you may need copious amounts of -nostdinc and -I depending on the target.
01:17:20 <mrvn> I used the same flags as for gcc other than -target. I would expect it to output something even if unsuitable for a kernel. Not have the include files redefine macros they defined themself.
01:19:27 <mrvn> That file can never ever work with __INT64_TYPE__ defined. And I didn't define that.
01:20:31 <mrvn> or at least __INT32_TYPE__ shouldn't be define then.
01:23:32 <mrvn> I give up on clang. Switching would be a bigger project.
01:24:43 <mrvn> Rewriting the Makefile to get the whole LTO for modules and libraries working with clang would be a pain as well.
01:27:22 <radens> you don't need LTO with clang, but it's a plus
03:02:05 <geist> i've bare metal ltoed with gcc too. it works pretty well
03:02:15 <geist> just pass -flto to gcc and then use it for linking (instead of using ld)
03:03:59 <sortie> Evening
03:06:18 <geist> hola
03:06:32 <geist> you still in the states?
03:09:39 <sortie> Yeah in MTV now
03:09:54 <geist> ugh, smoke!
03:10:32 <sortie> I walk around in this thinking "this isn't so bad" because the sky seems blue, but turns out the air is unhealthy and everyone is wearing masks. I'm really not used to this kind of pollution.
03:15:07 <geist> yeah smoke like that is extremely fine. funny thing is most masks dont really help
03:15:13 <geist> you need one with an airtight seal
03:15:29 <sortie> Yeah, I read that, still people seem to think lesser masks will do
03:26:06 <sortie> So I guess I'll do some osdev
03:46:17 * sortie apt upgrades the local Linux kernel while building his own kernel, to motivate the Linux upgrade
03:46:41 <radens> it's a race!
03:47:30 <sortie> Linux won. I'll get it next time!
03:47:37 <Mutabah> Hmm... who will win, downloads or ... oh
03:47:49 <sortie> autoconf for ports lost :)
03:48:03 <aalm> .theo
03:48:03 <glenda> Your process here is really strange.
03:48:20 <CompanionCube> accurate
03:49:29 <radens> is there a way to feed the output of nm to gdb as a list of symbols?
03:53:53 <sortie> For what purpose? To put breakpoints on, or for gdb to use in traces etc?
03:54:27 <ybyourmom> Today I didn't even have to use my AK
03:54:34 <ybyourmom> I gotta say it was a good day
03:55:32 <radens> sortie: breakpoints and stack traces
03:55:50 <radens> I can also translate manually but that's dumb
03:56:36 <sortie> radens: Shouldn't gdb just do that automatically? Maybe there's something extra to your case?
03:57:27 <klange> if you have something to run nm on, you have something you should be able to give to gdb...
03:57:39 <radens> sortie: ah, so I'm debugging linux. I tried to traverse the apt package repo manually, but the symbols are off by a little. So I'm wondering if I can just use /proc/kallsyms which is basically the output of nm.
04:00:01 <sortie> Ah
04:05:04 <radens> it would be cool if there was a symbol server and gdb knew how to fetch the symbols automatically, like some other operating systems.
04:05:45 <sortie> My gut feeling is that gdb can use custom symbol tables
04:06:11 <radens> yeah I don't see anything in the man page or some light googling of the docs
04:06:50 <radens> One could totally write a program to transform the output of nm back into an elf, but that's effort.
04:09:06 <klange> here you go https://github.com/wapiflapi/wsym
04:10:02 <radens> thanks
04:10:10 <radens> literally what I was thinking of doing.
04:13:09 <sortie> Hey klange
04:13:17 <geist> oh that's cute
04:15:00 <pixelherodev> LTO with GCC in bare metal is even easier with CMake
04:15:34 <geist> the tricky part is making sure it doesn't throw symbols away that it thinks are unreferenced and whatnot
04:15:42 <pixelherodev> Just add `-flto` to CMAKE_C_FLAGS
04:15:45 <geist> but it's usually not a big deal, just occasionally fights you
04:15:51 <pixelherodev> Seems fine so far
04:16:02 <geist> i think it's much smarter than it used to be, yes
04:16:06 <sortie> Hey pixelherodev
04:16:15 <geist> i fiddled with it recently on a fairly trivial project and it worked fine
04:16:51 <pixelherodev> Hello?
04:16:56 <pixelherodev> Yeah, just tested it on the game
04:17:11 <pixelherodev> ~50 bytes smaller (which proves it's doing *something*), and nothing's broken
04:17:48 <pixelherodev> (50 bytes smaller *at runtime* based on the kernel_end symbol my linker script adds to the very end of the kernel - might be even smaller on disk)
04:20:28 <geist> yeah
04:20:47 <geist> the usual obvious thing it'll do is inline things that are only called once
04:20:54 <geist> which usually shaves a bit off, depending on the code
04:23:53 <mrvn> It also inline things from different .c files. If you split things up a lot that is a major bonus.
04:24:10 <geist> exactly
04:24:11 <mrvn> And it optimizes functions that are called with constants
04:24:22 <geist> except where it makes it hard to debug, but that's why you should make it an option
04:25:06 <mrvn> The hard part for me was that I have a microkernel. Every driver is supposed to be a separate process with its own page tables. So code shouldn't be shared between drivers.
04:27:26 <geist> single binary microkernel?
04:27:40 <mrvn> geist: so far.
04:28:48 <geist> yah. i worked on something like that. it was interesting. you can basically pull it off if you dont rely on things like thread local storage
04:28:58 <geist> since the single binary will unify the TLS for all 'processes'
04:30:27 <mrvn> On that note I need something to convert an elf file into basically a struct { void *code; void *rodata; void *data; size_t bss_size; uint8_t image[]; }
04:31:19 <mrvn> preferably in the form of an .o file. Don't want to hexdump the binary and compile it again.
04:31:35 <mrvn> Ideas for a quick hack?
04:32:24 <radens> mrvn: in gas I think there's an .includebin operation
04:33:22 <klange> sortie: hi
04:33:54 <klange> working on dumb stuff / retreading old ground, implementing some widget toolkit stuff again https://i.imgur.com/lpyiN9b.png
04:33:54 <sortie> What's up?
04:34:06 <mrvn> /home/mrvn/src/moose/helios4/boot/boot.S:361: Error: unknown pseudo-op: `.includebin'
04:34:15 <radens> mrvn: another idea is to just cat it to the end of the binary and have your OS loader know about them
04:34:25 <sortie> klange: Mmm looking great. Like the wallpaper too
04:34:39 <radens> or use multiboot modules
04:34:49 <mrvn> radens: That's basically what I want to do. But if I just do that then it overlaps with the .bss
04:35:06 <mrvn> radens: and I still need to know the section sizes.
04:35:39 <radens> mrvn: I've had success before with multiboot modules
04:35:40 <geist> moose!
04:36:15 <mrvn> geist: My Own Operating System Environment.
04:36:27 <mrvn> geist: and yes, the logo is a moose. :)
04:36:31 <radens> mrvn: is it incbin?
04:36:36 <geist> boris, we have found moose
04:37:06 <klange> sortie: shot that in a new zealand earlier this year, outside of Auckland https://www.flickr.com/photos/k-lange/41884953774/in/datetaken-public/
04:38:44 <sortie> Beautiful
04:38:44 <mrvn> radens: yes
04:38:51 <sortie> Almost has a XP vibe to it, but better
04:48:19 <aalm> .theo
04:48:19 <glenda> That doesn't make sense.
04:48:39 <ybyourmom> ur face doesn't make sense
04:53:17 <geist> :(
05:03:48 <mrvn> Does GNU binutils have some format that's like a.out but has 1) .rodata and .init_array (on arm), 2) doesn't place the image at 0
05:03:51 <mrvn> ?
05:13:54 <mrvn> This sucks. I think I need to go straight to an ELF loader in the core kernel to load drivers.
05:14:07 <mrvn> or make my own format.
05:38:36 <radens> mrvn: I do believe that you will need an elf loader.
05:40:03 <radens> if you do make one be aware of your pointer arithmetic.
05:40:13 <radens> they're a great way to pop a kernel
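A small sketch of the kind of check radens is warning about. An ELF loader takes offsets and sizes from an untrusted file, so `offset + size` can wrap around and pass a naive bounds test. The struct below is a hypothetical subset of a program header (field names follow the ELF specification), and `range_ok` and `segment_ok` are illustrative names, not part of any existing loader.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of an ELF32 program header. */
struct phdr_view {
    uint32_t p_offset;  /* where the segment starts in the file */
    uint32_t p_filesz;  /* bytes present in the file */
    uint32_t p_memsz;   /* bytes occupied in memory, must be >= p_filesz */
};

/* True if [off, off + len) lies inside a buffer of `size` bytes.
 * Written so that off + len is never computed and cannot wrap. */
static bool range_ok(uint64_t off, uint64_t len, uint64_t size)
{
    return off <= size && len <= size - off;
}

/* Validate one segment against the size of the file image. */
static bool segment_ok(const struct phdr_view *ph, size_t file_size)
{
    if (ph->p_filesz > ph->p_memsz)
        return false;   /* file data would overflow the memory range */
    return range_ok(ph->p_offset, ph->p_filesz, file_size);
}
```

A real loader would apply the same style of check to the destination address range and to the program header table itself.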
05:43:31 <andrewrk> mrvn, regarding udivmod - why not link against libgcc (or compiler_rt if using clang)?
05:44:00 <andrewrk> as far as I'm aware, both compilers rely on being able to make those lib calls and lack ability to emit the instructions inline
05:44:23 <andrewrk> even in zig, because we use llvm, we have to build compiler_rt from source for the target so that the llvm-generated code can rely on udivmod etc being available
05:45:12 <geist> take off every zig?
05:45:59 <andrewrk> also, even if you hand code the division operation, sometimes the optimizer is so $@&*% clever that it changes the code into a libcall into compiler_rt
05:46:19 <andrewrk> I haven't seen it with division but I've seen it with a for loop memory copy changing into a memcpy call
05:46:25 <radens> it's kind of BS that you need to do that.
05:46:35 <geist> yah in general, especially architectures like pre-arm64, you basically need libgcc/compiler-rt for the math routines
05:52:32 <geist> compiler-rt is a bit more annoying in general, since it includes more bits than just the math routines
05:52:35 <geist> libgcc is almost entirely standalone
05:53:05 <geist> we avoided the problem in zircon much like how linux does: dont allow any operations that result in a libgcc call
05:53:19 <geist> but that's pretty easy if you're 64bit only
05:53:26 <rakesh4545> I am always amazed how giest knows everything.
05:53:45 <geist> I dont know everything. I only know what I know.
05:54:03 <rakesh4545> Relative to me its everything.
05:56:41 <pixelherodev> geist, doesn't 64-bit division / modulus emit libgcc calls?
05:56:51 <geist> not on a 64bit machine. but yes it does on a 32
05:56:59 <pixelherodev> Ah
05:57:01 <pixelherodev> Thanks
05:57:02 <pixelherodev> That explains it
05:57:19 <geist> linux however generally forbids dynamic divides and mods
05:57:34 <geist> which is actually not a bad idea, since a slow divide/mod can be a terrible performance hit
05:59:25 <geist> it's not a bad strategy: much of the time you're either dividing by a constant that can be optimized by the compiler, or you're dividing by a power of 2, in which case you can write code to actually acknowledge this and use a shift
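A sketch of the second case geist mentions: when the divisor is known to be a power of two at setup time, the code can record a shift and a mask once and never issue a divide afterwards. All names here (`pow2_div`, `pow2_div_init`, and so on) are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Precomputed form of a power-of-two divisor. */
struct pow2_div {
    unsigned shift;   /* log2(divisor) */
    uint32_t mask;    /* divisor - 1 */
};

static bool is_pow2(uint32_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}

/* Returns false if the divisor is not a power of two. */
static bool pow2_div_init(struct pow2_div *d, uint32_t divisor)
{
    if (!is_pow2(divisor))
        return false;
    d->shift = 0;
    while ((divisor >> d->shift) > 1)   /* count up to log2(divisor) */
        d->shift++;
    d->mask = divisor - 1;
    return true;
}

static uint32_t pow2_quot(const struct pow2_div *d, uint32_t n)
{
    return n >> d->shift;   /* n / divisor */
}

static uint32_t pow2_rem(const struct pow2_div *d, uint32_t n)
{
    return n & d->mask;     /* n % divisor */
}
```

A typical use would be a block or page size chosen at boot: initialize once, then use `pow2_quot` and `pow2_rem` in the hot path.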
06:03:24 <pixelherodev> What is the loopback test referenced by the Serial Ports page?
06:03:43 <geist> i think it's just a feature of original 8250s
06:03:52 <geist> lets you in the chip redirect tx to rx
06:07:23 <aalm> does fifo size guessing code use that? i've forgotten how it's done
06:09:41 <geist> dunno, would have to go check the 8250 manual (which i highly recommend)
06:12:31 * klange hacks a tga header into screenshot output
06:13:44 <geist> tga is the best kind of img
06:14:11 <klange> not as nice as when I could just straight up save a PNG, but, I can take a screenshot of a window in my OS, run `nc -l 9999 > screenshot.tga; convert screenshot.tga screenshot.png` on the host, and `cat /tmp/screenshot.tga > /dev/net/10.0.2.1:9999` on the VM and get a nice PNG with alpha channel out of it.
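For reference, the header klange is talking about is only 18 bytes, which is why it is easy to bolt onto a raw framebuffer dump. The sketch below fills one in for an uncompressed 32-bit image stored top row first. The function name `tga_header` is hypothetical, klange's actual code is not shown in the log, and the pixel data that follows the header is assumed to be in BGRA byte order.

```c
#include <stdint.h>

enum { TGA_HEADER_SIZE = 18 };

/* Fill in a TGA header for an uncompressed true-color image with alpha. */
static void tga_header(uint8_t out[TGA_HEADER_SIZE],
                       uint16_t width, uint16_t height)
{
    for (int i = 0; i < TGA_HEADER_SIZE; i++)
        out[i] = 0;          /* no image ID, no color map, origin 0,0 */

    out[2]  = 2;             /* image type 2: uncompressed true-color */
    out[12] = (uint8_t)(width & 0xFF);    /* width, little-endian */
    out[13] = (uint8_t)(width >> 8);
    out[14] = (uint8_t)(height & 0xFF);   /* height, little-endian */
    out[15] = (uint8_t)(height >> 8);
    out[16] = 32;            /* bits per pixel */
    out[17] = 0x28;          /* 8 alpha bits, bit 5 set: top-left origin */
}
```

Writing these 18 bytes followed by `width * height * 4` bytes of pixel data gives a file that common converters accept.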
06:17:31 <klange> https://i.imgur.com/coW1Mao.png
06:18:37 <mischief> fixed a funny bug in 9front's bridge driver earlier.
06:19:10 <mischief> linux sends unpadded arp over wifi which the bridge driver just dropped on the floor
06:19:37 <geist> btw, got my ttl computer thing put together: https://photos.app.goo.gl/rUEo9xxNSxfHcf9W6
06:19:45 <mischief> and even if you changed the size check, it still wouldn't forward them because the ethernet driver checked for minimum transmission unit
06:19:58 <geist> https://photos.app.goo.gl/abqXL5Cr6g3y25t49 was fun to assemble. lots of soldering
06:20:29 <klange> so how long until you have it running a port of lk :)
06:20:44 <geist> quite some time. it's a wonky little architecture
06:21:07 <klange> Do you have the updated ROM?
06:21:13 <geist> yah
06:21:29 <mischief> geist: port retrobsd :)
06:21:51 <geist> it's a strange little architecture, kind of fun to read
06:23:13 <pixelherodev> Is this true? ` Writing to some I/O ports can permanently change the internal configuration of your computer`
06:23:30 <geist> sort of, kinda, but probably not
06:23:46 <geist> what they're probably referring to is wiping out and corrupting your CMOS data
06:23:55 <geist> which used to matter a lot more, and is easy to accidentally stumble across
06:24:14 <geist> now almost nothing in there matters, or is only really just emulated by the rom
06:24:43 <geist> all that aside most old machines would work fine if your cmos were corrupted. they'd just boot up and complain about it, and then you go in the bios and reset it
06:25:07 <geist> but it's possible there were old machines that could boot and crash before you could intercept the bios setup screen
06:25:13 <geist> thus you'd have to somehow clear the cmos, etc
06:25:30 <bcos> Computer has to work fine with messed up firmware due to "CMOS battery flat"
06:25:34 <graphitemaster> how do CPUs typically implement floating point ALUs, like a float att, does it just change one of the floats so they have matching exponent, do a traditional addition and then fixup the exponent?
06:25:39 <graphitemaster> I can't find any literature on this
06:25:44 <graphitemaster> s/att/add
06:25:58 <geist> bcos: right. question is were there ever computers that would go apeshit if you corrupted it in a 'sane' way
06:26:08 <geist> vs it clearly being hosed
06:26:14 <bcos> pixelherodev: Yes, it's true in the same way that "you can buy one lottery ticket and win millions of $$" is true
06:26:26 <pixelherodev> Thanks
06:26:30 <graphitemaster> can you even add floats by treating exponent as one add then mantissa as another, with standard ripple-carry circuts
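graphitemaster's question goes unanswered in the log. In outline, the first guess is the right one: the exponents are not added, they are compared, the smaller operand's mantissa is shifted right to line up with the larger one, the mantissas go through an ordinary integer adder, and the result is renormalized. The sketch below shows only that core for IEEE 754 single precision. It assumes both inputs are positive, normal and finite, it truncates instead of rounding to nearest, and it ignores overflow, so it is a teaching model and not how any particular FPU is built. The name `soft_add_pos` is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Add two positive, normal, finite floats using integer operations only.
 * Simplifications: no signs, no subnormals, no NaN/inf, truncating. */
static float soft_add_pos(float a, float b)
{
    uint32_t ua, ub;
    memcpy(&ua, &a, sizeof ua);
    memcpy(&ub, &b, sizeof ub);

    /* For positive floats the bit patterns order like the values,
     * so this makes ua the operand with the larger magnitude. */
    if (ua < ub) {
        uint32_t t = ua;
        ua = ub;
        ub = t;
    }

    uint32_t ea = ua >> 23;                      /* biased exponents */
    uint32_t eb = ub >> 23;
    uint32_t ma = (ua & 0x7FFFFF) | 0x800000;    /* restore implicit 1 */
    uint32_t mb = (ub & 0x7FFFFF) | 0x800000;

    /* Step 1: align the smaller operand to the larger exponent. */
    uint32_t shift = ea - eb;
    mb = shift < 24 ? mb >> shift : 0;

    /* Step 2: plain integer addition of the mantissas. */
    uint32_t sum = ma + mb;

    /* Step 3: renormalize if the sum carried into bit 24. */
    if (sum & 0x1000000) {
        sum >>= 1;
        ea += 1;
    }

    uint32_t ur = (ea << 23) | (sum & 0x7FFFFF);
    float r;
    memcpy(&r, &ur, sizeof r);
    return r;
}
```

Real hardware keeps extra guard bits during the alignment shift so that it can round correctly, and it handles subtraction, where the result may need a left shift to renormalize.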
06:26:50 <geist> similarly there were old monitors in the pre-multisync days that you could hypothetically damage if you programmed the display controller to generate bad syncs and whatnot
06:27:03 <geist> but anything modern wont care, it'll just say it is a unrecognized display mode
06:27:30 <klange> A lot of arcade cabinets store critical data in volatile battery-backed storage.
06:27:43 <klange> In the 2000s, usually encryption keys.
06:27:47 <geist> smartphones and all that do too
06:28:01 <geist> theres usually some place somewhere at the bottom that has essentially factory data
06:28:08 <geist> that if you were to wipe it out may render the device into a brick
06:28:12 <klange> Your battery dies? Cabinets fucked until you get it "served" (replace the battery-backed data).
06:28:22 <klange> serviced*
06:28:23 <graphitemaster> there's some arcade cabinets that won't work if you let the internal data wipe itself
06:28:30 <graphitemaster> basically become a brick
06:28:40 <bcos> pixelherodev: In theory (with extremely low probability) you can brick firmware by accidentally flashing the ROM, might be able to damage the monitor (ancient CRTs that don't handle SVGA modes and extremely cheap & nasty notebook screens), and might be able to cook things if you mess up power management badly
06:28:44 <klange> ... I literally just said...
06:29:07 <bcos> pixelherodev: But most of that is impossible on most computers
06:29:16 <geist> right
06:29:32 <pixelherodev> Got it
06:29:33 <geist> at least on desktop PCs
06:29:35 <graphitemaster> the worst is arcade cabinets that use EPROMS and some dingle bat forgot to put the back back on the cabinet and the cabinet is facing a window and over time the sun actually erases them
06:29:41 <bcos> ..and even if it is possible and you were deliberately trying to break things you'd still probably not get it to break
06:29:49 <geist> which are generally designed to be somewhat more tolerant to getting their config/disks messed up
06:30:22 <geist> it's sort of great it went that way. since the IBM PC was an open architecture (essentially) it was generally assumed that all the interesting OS bits come from the disk
06:30:43 <geist> and thus a new PC is basically a blank slate, and it more or less always remained that way all the way through the years
06:31:29 <klange> In the case of arcade cabinets, it's an intentional measure to make sure the cabinet becomes useless when your service agreement for it runs out, so you can't effectively sell it to some rando on the Internet.
06:31:47 <geist> it isn't really implied that computers work that way. there are plenty of examples of proprietary machines (arcade cabinets included) where there was no intention of making it open
06:31:50 <geist> right
06:32:03 <graphitemaster> arcade cabinets still very much are a bit of a legal grey area these days
06:32:05 <geist> and basically most smartphones/tablets are more or less the same way
06:32:26 <graphitemaster> you can have one in a commercial setting on free play but if you start asking money to play the games then everything tumbles
06:32:41 <graphitemaster> because these cabinets are still the property of the company who made them or something
06:32:48 <graphitemaster> and you need to "license" them
06:33:16 <graphitemaster> even old crap that you can't even get serviced anymore
06:33:31 <graphitemaster> really wish they would cleanup the laws there
06:35:57 <mrvn> andrewrk: because libgcc is linked against the higher half part of the kernel.
06:37:41 <mrvn> geist: It's too bad a uint64_t/10 still does udivmod
06:38:30 <geist> yeah, 10 is surprisingly hard to do
06:38:47 <geist> you think it'd be a simple series, but that's one of the ones where the usual tricks dont work
06:39:32 <mrvn> geist: it's still quite simple. Multiply by the inverse using shift+add and then a little error correction
06:39:53 <geist> so why do you think the compiler doesn't do it?
06:40:52 <mrvn> geist: because the error correction is a 64bit multiply and branch and it's still a lot of instructions.
06:41:03 <geist> the error parts, yeah
06:41:39 <mrvn> Maybe I should limit my boot kprintf to 32bit.
06:41:44 <geist> and anyway the usual trick is to do essentially a higher bitness level multiply of a fixed point .number
06:42:07 <mrvn> geist: but try doing 64*64=128 bit multiply on a 32bit arm
06:42:09 <geist> but yeah that's exactly where i hit it too. the LK printf, which i use all over the place, including in zircon kernel, just has a single decimal print routine
06:42:14 <geist> and it basicaly always does it as 64bit
06:42:22 <geist> so on32bit machines it's always going through libgcc
06:42:28 <geist> mrvn: exactly
06:42:58 <mrvn> 64*64=128 would need 3 or 4 muls so I think 5 shift+add is faster there
06:43:33 <mrvn> And I need the modulo so the error correction code is basically free as that gives me modulo.
06:44:34 <mrvn> One thing I wondered is whether LTO would figure out that I call cprint_int() only with base=2,8,10 and 16 and optimize for those cases. But that seems too complex for it.
06:45:26 <mrvn> maybe if I use enum Base { BIN=2, OCT=8, DEC=10, HEX=16}?
06:48:07 <geist> if you're in cpp you could easily templatize it and forrce the generation of those
06:48:35 <geist> also i really gotta get that r key fixed
06:49:21 <mrvn> geist: it's not a constexpr. Depends on the % specs. I don't want to switch(base) myself.
06:50:03 <mrvn> I could pass a divmod lambda. :)
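For comparison, this is roughly what the explicit `switch (base)` that mrvn would rather avoid looks like. Each case divides by a literal constant, so the compiler is free to replace it with shifts or a multiply by the reciprocal instead of a library call. The sketch uses a 32-bit value, since the log notes that 64-bit division by 10 still ends up in libgcc on 32-bit ARM. `fmt_u32` is a hypothetical name and is not taken from mrvn's `cprint_int()`.

```c
#include <stddef.h>
#include <stdint.h>

/* Write v in the given base into buf (at least 33 bytes).
 * Returns the number of digits, or 0 for an unsupported base. */
static size_t fmt_u32(char *buf, uint32_t v, unsigned base)
{
    static const char digits[] = "0123456789abcdef";
    char tmp[32];
    size_t n = 0;

    do {
        unsigned d;
        /* Every divisor below is a compile-time constant. */
        switch (base) {
        case 2:  d = v % 2;  v /= 2;  break;
        case 8:  d = v % 8;  v /= 8;  break;
        case 10: d = v % 10; v /= 10; break;
        case 16: d = v % 16; v /= 16; break;
        default: return 0;
        }
        tmp[n++] = digits[d];
    } while (v != 0);

    /* Digits were produced least significant first; reverse them. */
    for (size_t i = 0; i < n; i++)
        buf[i] = tmp[n - 1 - i];
    buf[n] = '\0';
    return n;
}
```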
06:50:25 * geist nods
06:58:11 <mrvn> geist: also thanks for %a for floats. Doing %f is a pain in the kernel.
06:58:27 <geist> yah it's actually pretty useful. clearly the first one i implemented
06:58:54 <mrvn> great for scanf too. lossless and without the stupid rounding rules
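A quick illustration of why `%a` is attractive for serialization, written against a hosted C99 library rather than a kernel printf. The hex format writes the mantissa bits out directly, so reading the string back gives the same value without any decimal rounding. `hexfloat_roundtrip` is a hypothetical helper name.

```c
#include <stdio.h>
#include <stdlib.h>

/* Print x with %a, parse it back, and report whether the value survived. */
static int hexfloat_roundtrip(double x)
{
    char buf[64];
    snprintf(buf, sizeof buf, "%a", x);   /* e.g. a string of the form 0x1.8p+0 */
    return strtod(buf, NULL) == x;
}
```

The exact text produced by `%a` may differ between C libraries, which is fine as long as the parser accepts it and the value comes back unchanged.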
06:59:27 <geist> yah. %f would be great if you're using decimal floats, but no hardware supports that. i think it's an IEEE standard though
07:00:10 <mrvn> it is. And defined down to the last bit how to parse it so that its numerically least off.
07:00:17 <mischief> intel has recently proposed a 16 bit float standard
07:00:34 <geist> yah gpus love it. ARM has some support for it i think
07:00:42 <geist> or at least specced out in their arch
07:01:24 <geist> mrvn: if you looked at my printf i just flat out gave up on denormalized numbers
07:01:36 <geist> from looking at other 'real' implementations at least half the code is dealing with that
07:02:04 <geist> also i take the 'use float math to print floats' strategy, which may not be specifically correct i guess
07:02:32 <geist> most real implementations i've seen like glibc or whatnot never use float to print float. they do all the math in integer land. accordingly their code is far more complex
07:02:45 <mrvn> but then gcc will save the fpu regs to the stack on function entry. Not just when it hits %f.
07:03:07 <geist> ah. so there's a non ABI compliant hack on x86 for that....
07:03:21 <geist> linux kernel uses it
07:03:43 <geist> https://fuchsia.googlesource.com/zircon/+/master/kernel/arch/x86/rules.mk#146
07:04:24 <geist> well, okay, so that's a different one. but basically the x86-64 ABI has a trrick that stores the number of floats (or a bool, i forget) in rax to a varargs routine
07:04:25 <mrvn> [81323.252605] systemd-logind[4136]: Failed to start user service, ignoring: Unknown unit: user@65534.service
07:04:37 <mrvn> WTF is a 65534.service?
07:04:43 <geist> that switches says, dont bother with that, there's no floats here
07:05:41 <mischief> mrvn: its a template.
07:06:04 <mischief> it's user@.service, templated with uid 65534
07:06:23 <mrvn> mischief: right. I thought "user" was the arg
07:06:29 <mischief> no.
07:06:39 <mrvn> so something is trying to become nobody
07:07:08 <mischief> if that's the nobody uid.
07:10:05 <graphitemaster> well printing of floats is real difficult because it has to be bit convertable too
07:10:35 <graphitemaster> as in the printed result of a float when passed into strtof has to produce identical results from the initial float that produced it
07:10:42 <graphitemaster> I find that requirement in C pretty crazy
07:10:43 <mrvn> sscanf(sprintf("%f", f)) == f must hold true
07:11:03 <geist> graphitemaster: how is that even possible for irrational floats?
07:11:03 <mrvn> given the right precision flags anyway
07:11:11 <geist> which is i think quite possible when trying to print it decimal
07:11:20 <mrvn> geist: you print enough digits that the scanf will round to the same bit pattern
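mrvn's point about printing "enough digits" can be checked directly on a hosted C library. For IEEE 754 single precision, 9 significant decimal digits are enough to identify every value uniquely, so printing with `%.9g` and parsing with `strtof` gives back the same float, assuming the library's conversions are correctly rounded. The default `%f` precision of 6 fractional digits does not guarantee this. `decimal_roundtrip` is a hypothetical helper name.

```c
#include <stdio.h>
#include <stdlib.h>

/* Print f with 9 significant digits, parse it back, compare. */
static int decimal_roundtrip(float f)
{
    char buf[64];
    snprintf(buf, sizeof buf, "%.9g", f);   /* f is promoted to double */
    return strtof(buf, NULL) == f;
}
```

For doubles the equivalent digit count is 17.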
07:11:32 * geist nods
07:11:47 <graphitemaster> the standard puts a huge emphasis on hex-floats being perfectly representable too
07:12:02 <graphitemaster> as in a hex float as a string must produce the same bit-pattern everywhere regardless of system
07:12:05 <mrvn> which also means you have to use more precision for the mantissa when scanning. can't use floats there
07:12:12 <graphitemaster> so hex-float string is the best serialization of floats
07:12:28 <mrvn> graphitemaster: well, but that part is hard to screw up even if you try.
07:12:41 <graphitemaster> glibc has screwed it up ;-)
07:13:10 <mrvn> WTF? That's just scanning in a hex, then normalizing it and adding the exponent.
07:13:36 <mrvn> did they screw up the number-of-leading-zeroes or what?
07:13:49 <graphitemaster> they failed the conversion from float to hex-float string many times
07:13:55 <graphitemaster> not so much the hex-float string to float
08:44:44 <klange> old (cairo+python) button design: https://i.imgur.com/YX06c1v.png vs. new (native graphics library) button design: https://i.imgur.com/1xebhfk.png
08:45:00 <klange> former is based on the old Clearlooks theme, new one is based on some more modern themes
08:45:58 <diodesign> looks nice and clean :)
08:50:13 <radens> how do I learn what cpu features qemu supports and which are enabled?
08:51:35 <klange> In what context?
08:51:57 <radens> like I think that there's paravirt stuff turned on which I want to turn off.
08:52:04 <geist> yah i'm still happy with clearlooks
08:52:21 <geist> radens: on which platform?
08:52:24 <radens> and I think I need to pass -cpu host,-SOMEFEATURE
08:52:28 <radens> x86_64 today
08:54:50 <radens> linux knows it's running under kvm so when it goes to wait on a spinlock it *disables interrupts* and halts so KVM knows to deschedule it (I think), but when I hyperjack KVM it hangs because, well hlt with no interrupts.
09:04:33 <bauen1> radens: did you find any code for that in the linux kernel (i would be interested to take a look at it)
09:04:59 <radens> bauen1: only linux as guest
09:05:09 <radens> not the other half of the equation
09:05:46 <radens> kvm_wait in arch/x86/kernel/kvm.c circa 661
09:06:22 <radens> https://elixir.bootlin.com/linux/latest/source/arch/x86/kernel/kvm.c#L776
09:07:13 <radens> note the "if ints are disabled, halt unsafely" bit
09:14:15 <bauen1> ty
09:15:33 <radens> I think kvm_kick_cpu is the bit for waking up the guest
09:15:37 <radens> not sure though
10:12:24 <AIOP> New to ARM could someone explain to me what ldmia and stmia does exactly? http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0068b/BABEFCIB.html says `Registers are loaded stored and in numerical order, with the lowest numbered register at the address initially in Rn.`
10:12:30 <AIOP> but loaded stored isn't english
10:13:17 <ybyourmom> Think about how pushing and popping on a stack do
10:13:22 <AIOP> All I can get out of this is that `ldmia r0!, {r1,r2,r3,r4}` is valid syntax and r0 gets incremented 4*registers_listed
10:13:44 <AIOP> So does this mean use r0 as a stack push counter and push r1, r2, r3, r4?
10:13:45 <ybyourmom> Pushing to the stack is a sub followed by several stores
10:13:54 <ybyourmom> So, store multiple, decrement before
10:14:02 <ybyourmom> That's push, but for multiple values
10:14:13 <AIOP> I see
10:14:23 <ybyourmom> And popping multiple values is reading the values currently on the stack, and then incrementing the stack pointer
10:14:28 <AIOP> So the stack address should be in r0?
10:14:32 <ybyourmom> Aka, load multiple, increment after
10:14:42 <ybyourmom> No, you can use any register
10:14:50 <ybyourmom> Usually you use the stack pointer
10:14:57 <AIOP> I know but in the example
10:15:04 <AIOP> Oh I see
10:15:06 <ybyourmom> But you can also use those instructions to do things other than stack manipulation
10:15:25 <AIOP> Right like if you have a memory region, like a struct?
10:15:33 <ybyourmom> yes
10:15:56 <ybyourmom> Or if you're for example, writing register context when doing a context switch in kernel, etc
10:16:11 <ybyourmom> The exclamation mark means "write back"
10:16:37 <AIOP> You would use this when you need to push registers before a function call?
10:16:39 <ybyourmom> Because by default the operation won't destructively modify the operand pointer register
10:16:56 <ybyourmom> You could, if you're passing values on the stack
10:17:23 <AIOP> so why the fuck does the official documentation not actually say what its doing
10:17:52 <ybyourmom> I'm sure it's written somewhere
10:18:34 <AIOP> `writes/loads contents of the address at 'reg' into a list of registers`
10:19:02 <ybyourmom> this is why embedded development is a nice secure niche
10:19:05 <ybyourmom> be happy then
10:19:21 <AIOP> lol
10:19:27 <ybyourmom> it's arcane knowledge passed on to the few so that you may earn a high income and eat of the fat of the land
10:19:58 <AIOP> Our pay is higher to pay homage to those who work with their hands.
10:20:09 <AIOP> Isn't that right Debrah
10:21:23 <AIOP> I am working with a rbpi and because I know this knowledge now
10:21:37 <AIOP> I know I am writing over the .text section -.-
10:22:02 <AIOP> Because some idiot thought it was smart to offset the binary data as a whole by 0x8000 just to have a stack using a linker script
10:26:45 <AIOP> I didn't understand what you meant about the ! ybyourmom
10:27:32 <AIOP> Do you mean ! activates the ia part of stmia part?
10:31:22 <ybyourmom> AIOP: No, it writes back the value after the IA part
10:31:26 <ybyourmom> Into R0
10:31:35 <ybyourmom> Otherwise the IA part will be like a No-op
10:31:42 <ybyourmom> after the instruction executes
10:32:22 <AIOP> you mean yes then?
10:32:52 <ybyourmom> You right, come to think of it
10:32:56 <ybyourmom> Sure
10:34:34 <AIOP> I am right in other words. That ! is kinda odd. Why not just stm
10:35:49 <ybyourmom> It writes back the value, otherwise, the value in r0 would not be updated, that's all
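[Editor's note: the write-back behaviour discussed above can be modelled in plain C. This is only a sketch under stated assumptions: `cpu_t`, `ldmia` and `stmia` are hypothetical names for a toy model and not an ARM API, memory is a small word array indexed by byte address / 4, and the case where the base register is also in the register list with write-back is rejected rather than modelled.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy register file: r[0]..r[15]. */
typedef struct { uint32_t r[16]; } cpu_t;

/* Model of `ldmia rn{!}, {reglist}`: load the listed registers, lowest
 * numbered register from the lowest address, starting at the address in rn.
 * With writeback (the `!`), rn ends up pointing past the last word read. */
static void ldmia(cpu_t *cpu, const uint32_t *mem, size_t mem_words,
                  unsigned rn, uint16_t reglist, int writeback)
{
    uint32_t addr = cpu->r[rn];
    assert(addr % 4 == 0);
    assert(!(writeback && (reglist & (1u << rn)))); /* not modelled here */
    for (unsigned i = 0; i < 16; i++) {
        if (reglist & (1u << i)) {
            assert(addr / 4 < mem_words);
            cpu->r[i] = mem[addr / 4];
            addr += 4;                  /* "increment after" each transfer */
        }
    }
    if (writeback)
        cpu->r[rn] = addr;              /* without `!`, rn keeps its old value */
}

/* Model of `stmia rn{!}, {reglist}`: same addressing, but registers are
 * stored to memory instead of loaded from it. */
static void stmia(cpu_t *cpu, uint32_t *mem, size_t mem_words,
                  unsigned rn, uint16_t reglist, int writeback)
{
    uint32_t addr = cpu->r[rn];
    assert(addr % 4 == 0);
    assert(!(writeback && (reglist & (1u << rn)))); /* not modelled here */
    for (unsigned i = 0; i < 16; i++) {
        if (reglist & (1u << i)) {
            assert(addr / 4 < mem_words);
            mem[addr / 4] = cpu->r[i];
            addr += 4;
        }
    }
    if (writeback)
        cpu->r[rn] = addr;
}
```

[In this model, `ldmia r0!, {r1,r2,r3,r4}` corresponds to `ldmia(&cpu, mem, n, 0, 0x001E, 1)`: four words are read and r0 advances by 16 bytes, while the same call with `writeback` set to 0 leaves r0 untouched.]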
10:48:29 <klange> new file manager https://i.imgur.com/ZPcl3Ix.png
10:59:55 <AIOP> I found a CLI program for file management on github. I got to thinking, how could it be better than what we have now (aka the cmd line is all about file management).
11:01:34 <AIOP> I absolutely hate graphical user interfaces that tween. Oh how I hate those so. That and OSes and their 'who gets window focus' complex.
11:03:55 <AIOP> and when people say gooey instead of G U I
11:04:14 <AIOP> but I respect what you do klange!
03:38:48 <andrewrk> does this mean that arm64 does not support low power mode cpu cores? https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/include/asm/cpuidle.h
03:40:38 <andrewrk> oh, I'm an idiot. completely failed to notice the macro
03:40:49 <andrewrk> ignore me
04:03:32 <andrewrk> if you are writing a bare bones application for a SoC, is there ever a reason to drop from a higher EL to lower EL?
04:03:53 <andrewrk> e.g. all code is trusted
07:05:42 <mrvn> Is there an attribute for gcc to mark a function as "all registers are caller saved"?
07:08:02 <mrvn> .oO(Actually that's something LTO should optimize for. Ignore the calling convention and figure out which register to use to minimize register saving on calls)
07:11:06 <geist> yah
07:14:48 <mrvn> I'm thinking my kernel core is so simple that all the interrupt handler and syscalls could function without saving any registers. The exception handler saves all user registers once, that should be enough.
07:15:31 <geist> that's generally all you need, yes
07:16:04 <geist> where it gets complicated that forces you to save more (or have some mechanism) is when debugging comes along and forces you to allow user space to modify register state of blocked threads
07:16:20 <geist> then you will need to dump an entire frame down at kernel entry (or find a complex second solution)
07:16:56 <mrvn> at kernel entry I dump the whole user register set. So modifying a user register is current_thread->regs.r2 = 17;
07:17:42 <mrvn> No thread ever blocks in kernel and interrupts are disabled in kernel.
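[Editor's note: a minimal sketch of the scheme mrvn describes, where the whole user register set is saved once at kernel entry so a debugger-style poke is a plain struct write. The layout of `struct regs`, and the helpers `kernel_entry` and `kernel_exit`, are assumptions made for illustration; only the expression `current_thread->regs.r2 = 17` comes from the log. A real entry path would do the save and restore in assembly.]

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical saved user context for a 32-bit ARM thread. */
struct regs {
    uint32_t r0, r1, r2, r3, r4, r5, r6, r7, r8, r9, r10, r11, r12;
    uint32_t sp, lr, pc, cpsr;
};

struct thread {
    struct regs regs;   /* full user register set, saved at every kernel entry */
};

static struct thread *current_thread;

/* On entry: dump the complete user register set into the current thread. */
static void kernel_entry(const struct regs *user)
{
    assert(current_thread != 0);
    current_thread->regs = *user;
}

/* On exit: reload everything from whichever thread is current now.
 * If current_thread changed in between, this is the task switch. */
static void kernel_exit(struct regs *user)
{
    assert(current_thread != 0);
    *user = current_thread->regs;
}
```

[Because the saved frame is always complete, changing a user register between entry and exit needs no unwinding: the next `kernel_exit` simply reloads the modified value.]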
07:24:13 <mrvn> Maybe 'naked'
07:24:31 <mrvn> "This attribute allows the compiler to construct the requisite function declaration, while allowing the body of the function to be assembly code. The specified function will not have prologue/epilogue sequences generated by the compiler. ..."
07:27:17 <mrvn> geist: have you used 'void f () __attribute__ ((interrupt ("IRQ")));' on arm before?
07:28:18 <bcos> mrvn: Interrupts are disabled in kernel?
07:29:32 <bcos> (so.. if an IRQ occurs while running kernel code; it waits until there's an (expensive) switch to user-space and causes an immediate (expensive) switch back to kernel?)
07:37:24 <geist> mrvn: i have not
07:37:55 <geist> but like i said, in general you only need to save the callee trashed regs (the opposite of saving callee saved)
07:38:07 <geist> as long as you don't mind having a partial iframe for fiddling with user space
07:44:41 <mrvn> bcos: I should check the interrupt pending register before return and just stay in kernel. But so far I just assumed interrupts happen infrequently and don't collide (often).
07:45:40 <bcos> Ideally (assuming "single kernel stack per CPU" design) I'd want to check for/handle IRQs before deciding which task kernel should return to (before returning to user-space)
07:45:46 <bcos> Hrm
07:46:12 <bcos> ..also assuming micro-kernel (where IRQ handler mostly causes message/task switch to user-space handler)
07:46:29 <mrvn> geist: An interrupt or syscall nearly always means a task switch so to keep things simple I save everything on entry and reload everything on exit. If current_task has changed in between then it switches, otherwise it returns.
07:47:10 <mrvn> bcos: exactly. The IRQ handler is trivial and usually means some other task gets woken up and preempts.
07:52:42 <mrvn> bcos: actually I already do that. Because I have a while(pending) { wake_up_lowest_irq(pending); }
07:52:44 <geist> we do in zircon now too (except fpu state), but that's because the user space debugger wants to come along and fiddle with blocked threads
07:52:53 <mrvn> (do that == check for pending irqs before returning)
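[Editor's note: the `while(pending)` loop quoted above only terminates if the pending set shrinks on each pass. The sketch below assumes one way that could work: `wake_up_lowest_irq` handles the lowest set bit and returns the remaining mask. That return value, the `woken` log, and `drain_pending` are hypothetical; mrvn's actual function is not shown in the log, and a real kernel would more likely re-read the interrupt controller's pending register each time round.]

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record of which IRQ numbers were woken, for illustration. */
static unsigned woken[32];
static unsigned woken_count;

/* Wake the handler task for the lowest-numbered pending IRQ and return
 * the mask with that IRQ's bit cleared. */
static uint32_t wake_up_lowest_irq(uint32_t pending)
{
    unsigned irq = 0;
    assert(pending != 0);
    while (!(pending & (UINT32_C(1) << irq)))
        irq++;                          /* find the lowest set bit */
    assert(woken_count < 32);
    woken[woken_count++] = irq;         /* stand-in for waking the task */
    return pending & (pending - 1);     /* clear the lowest set bit */
}

/* Check for pending IRQs before returning to user space. */
static void drain_pending(uint32_t pending)
{
    while (pending)
        pending = wake_up_lowest_irq(pending);
}
```

[Draining everything before choosing which task to return to avoids the pattern bcos describes, where an IRQ raised while in the kernel would otherwise cost a switch to user space followed immediately by a switch back.]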
07:57:33 <mrvn> geist: It might even be faster that way because you have all the saving in a single stm instead of saving them by bits and pieces. That is if you can stop the compiler from saving stuff again later.
08:03:50 <geist> yah, i think i timed it and it was fairly inconsequential
08:03:57 <geist> at least for arm64, which has a lot more crap to save
08:04:29 <geist> it's clearly a bit slower, but only by a handful of cycles
08:04:33 <geist> so... it's a tradeoff
08:04:49 <geist> the more complex solution is what i think linux does on some architectures, where it dumps a partial frame, but saves space for the full frame
08:05:15 <geist> and then if you want to debug the thread or something to put it into a debuggable state, you wake it up in kernel space, it goes to the exit and then instead of returning to user space it dumps the frame and stops
08:06:06 <geist> or dumps the frame and reenters the deep part of the kernel and blocks on what it was doing already
08:06:23 <geist> key is you have to unwind the stack naturally so the state of the registers is known
08:07:51 <geist> i suppose you could try to throw some sort of dynamic DWARF debugging unwind logic to it and actually reverse what the state of the registers are at system input
08:07:57 <geist> but i don't wanna do that
08:17:59 <mrvn> On ARM all that you can avoid saving is r0-r3. If your more complex logic requires more time than saving those 4 regs then it should be faster.
08:18:03 <mrvn> right?
08:20:23 <geist> other way around. remember on irq/syscall entry you save the callee *trashed* registers
08:20:31 <geist> the opposite of what you save when doing a usual function call
08:21:01 <geist> well, syscall you may do a bit differently, but irq/exception entry is involuntary so you have to save the registers that the C code would just start trashing
08:21:12 <geist> so iirc that's r0-r4, r12
08:21:12 <mrvn> ups, yeah. So r4-r13. Ok, that's a bit more.
08:21:34 <mrvn> geist: r12?
08:21:48 <mrvn> geist: r0-r3 are arguments, r4 not iirc.
08:21:53 <geist> r12 is a scratch register on arm32. sometimes called the 'ip' register
08:22:02 <geist> interprocedural link register or something
08:22:10 <geist> and yeah, r0-r3,r12
08:22:16 <mrvn> ahh, yes. something to do with leaving a register to do trampoline magic and such.
08:22:29 <geist> yah. it's basically a scratch register unused for anything else, for precisely that
08:22:36 <geist> on arm64 it's x16,x17
08:24:18 <mrvn> anyway. On task switch I have to save those and my feeling is that I almost always switch tasks. .oO(Should put a counter in there to confirm).