#osdev is the only channel with logs prior to 2004

Saturday, 16 February 2019

12:05:24 <Ameisen> this is a weird bit of shell script I wrote
12:05:50 <Ameisen> "$(colorize light_red [ERROR]) $@)"
12:05:58 <Ameisen> er, no ) at the end
12:06:04 <Ameisen> I'm surprised I got that function working
12:18:21 <jmp9> ok
12:18:31 <jmp9> mov esp,ebp; pop ebp vs leave
12:25:05 <jmp9> why not use enter and leave?
12:25:48 <jmp9> hey gus
12:25:52 <jmp9> guys
12:29:54 <Mutabah> speed I guess
12:30:13 <Mutabah> the simpler instructions tend to be more efficient (due to pipelining)
12:31:20 <Telyra> Definitely was the case for quite a while at lesat
12:31:24 <Telyra> least*
12:34:12 <Telyra> I wouldn't be surprised if macro-op fusion does wonders to the separate instructions though
12:34:58 <Telyra> micro-op fusion*
12:39:38 <doug16k> jmp9, enter is complex, leave isn't
12:39:49 <doug16k> push and mov are unbeatable
12:40:01 <doug16k> both highly optimized
12:41:16 <doug16k> there's a "stack engine" that can handle the stack pointer changing multiple times and not stall to wait
12:44:19 <doug16k> well, you could beat it by not having any frame pointer, but with a frame pointer enabled, it's highly optimized by the architecture for obvious reasons
12:47:30 <Telyra> Looking at cycle tables for the 486, "enter" is significantly slower than the equivalent movs, maths, and pushes
12:48:39 <Telyra> enter n, 0; is 14 cycles on the 486. push ebp; mov ebp, esp; sub esp, n; is 3.
12:48:43 <doug16k> yeah enter takes an argument to setup multiple nested frames, that nobody ever uses, but slows down enter
12:49:05 <Telyra> So my guess is that "enter" exists less for speed and more for code size.
12:49:08 <doug16k> you'd only want enter for that obscure nested frames thing
12:49:28 <doug16k> and it'd be a ton of cycles if you did use it for that
12:49:38 <Telyra> enter n, q; is 23 cycles
12:50:29 <Telyra> Whereas I guess you'd incur the 3 cycles for the frame itself, a cycle for removing the nested frames from the stack frame
12:50:29 <doug16k> there'd be a multiplication in the cycle count somewhere scaling by the number of frames you asked for
12:50:45 <Telyra> And however many cycles push [ebp-2] costs
12:50:48 <Telyra> er, -4
12:50:49 <doug16k> nobody asks for multiple, just pointing out that enter can become bad
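
The frame setup being compared above can be modelled in a few lines. This is a toy sketch, not real CPU behaviour: the `Cpu` class and its fields are invented for illustration, and only nesting level 0 is covered. It shows that `enter n, 0` leaves the same state as `push ebp; mov ebp, esp; sub esp, n` (14 versus 3 cycles on the 486, per the numbers quoted in the chat), and that `leave` undoes it.

```python
class Cpu:
    """Toy 32-bit stack model; registers and memory are plain Python values."""

    def __init__(self, esp=0x1000, ebp=0x2000):
        self.esp = esp
        self.ebp = ebp
        self.mem = {}  # address -> 32-bit value

    def push(self, value):
        self.esp -= 4
        self.mem[self.esp] = value

    def pop(self):
        value = self.mem[self.esp]
        self.esp += 4
        return value


def enter_level0(cpu, n):
    """State change of `enter n, 0`, written as its three-instruction equivalent."""
    cpu.push(cpu.ebp)   # push ebp
    cpu.ebp = cpu.esp   # mov ebp, esp
    cpu.esp -= n        # sub esp, n


def leave(cpu):
    """State change of `leave`: mov esp, ebp; pop ebp."""
    cpu.esp = cpu.ebp
    cpu.ebp = cpu.pop()
```

Calling `enter_level0(cpu, 16)` followed by `leave(cpu)` returns both registers to their starting values. The nested-frame form (`enter n, q` with q > 0) is deliberately left out.
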
12:50:59 <eryjus> doug16k -- curious about where you found the documentation on the clock cycles per instruction
12:51:23 <doug16k> nowadays you'd look at agner fog pdfs
12:51:34 <doug16k> but it has been common knowledge that enter is slow for a while
12:51:58 <Telyra> Anything P5 and later, Agner Fog is your slightly obsessive PhD friend
12:52:13 <eryjus> understood; looking to bookmark it for my own use
12:52:15 <doug16k> eryjus, this ->
12:52:23 <Telyra> Earlier than that usually Intel would have the magic numbers in the CPU docs
12:53:02 <doug16k> eryjus, and more generally, that whole page of links ->
12:54:11 <eryjus> doug16k thank you
12:54:13 <Telyra> Appendix E of the i486 Programmer's Reference Manual has timings for both cache hits and misses
12:54:48 <Telyra> Which is something that's not as easy to measure these days but, again, Agner Fog is your friend
12:59:58 <eryjus> thanks again for the reference. will be studying that for sure!
01:00:40 <doug16k> np, everyone needs to look over that stuff
01:01:28 <doug16k> it improves your mental model of what is really happening
01:29:50 <jmp9> ok guys
01:29:59 <jmp9> SATA identify command works fine
01:30:00 <jmp9> on qemu
01:30:03 <jmp9> oh wait
01:30:07 <jmp9> on laptop it get page fault
01:30:11 <jmp9> something went wrong
01:30:12 <jmp9> ok ok
01:39:09 <CompanionCube> so has there been any good comments on MS implementing 9P for WSL?
01:40:29 <jmp9> i have question
01:40:34 <jmp9> when page fault happens
01:40:41 <jmp9> how do i know which address caused page fault?
01:40:44 <jmp9> how get that address?
01:41:06 <knebulae> faulting address is in cr2
01:41:08 <eryjus> cr2
01:41:29 <jmp9> THANKS MAN
02:14:02 <jmp9> okay
02:14:09 <jmp9> how to access ahci memory safely
02:14:16 <jmp9> which addresses i should map?
02:24:36 <jmp9> heyyy
02:27:15 <jmp9> oh yes
02:27:18 <jmp9> my mapping is too small
02:29:47 <klys> graphitemaster around?
02:32:38 <klys> has anyone here used the virgl/virgil/virtio-gpu ?
02:35:48 <klys> it seems mesa is giving me software rendering, which naturally is blocky, minimal quality, except virgl uses mesa as a driver I hear. so i've been testing it by playing videos in qemu with a debian-unstable vm.
03:18:15 <jmp9> I have problem
03:18:27 <jmp9> SATA drive on my laptop isn't responding to IDENTIFY command
03:18:29 <doug16k> klys, I have
03:18:40 <klys> doug16k, hey thanks
03:18:51 <doug16k> it works perfectly in my experience
03:19:00 <klys> my debian vm has just as good 3d with -vga cirrus as it did with -vga virtio -display gtk,gl=on; am using xserver-xorg-video-fbdev; both work with compton (meaning I have gl/mesa).
03:19:07 <jmp9> plz help
03:19:14 <doug16k> I have a driver for it in my OS project. it also works flawlessly with ubuntu. performance is excellent because it doesn't trap video memory accesses ever
03:19:35 <doug16k> the driver has to explicitly push new bits to the screen
03:19:59 <klys> doug16k, whic xserver-xorg-video-diver do you use with ubuntu?
03:20:14 <klys> which* driver*
03:20:32 <doug16k> it would be the non-free vendor video drivers if that's what you mean
03:20:44 <klys> on the guest.
03:20:58 <doug16k> on the guest I do nothing and it just works on ubuntu, natively autodetected
03:21:16 <doug16k> in the OS driver you mean?
03:21:40 <klys> could I have a paste of your Xorg.0.log if you don't know (from your guest ubuntu image)
03:22:01 <doug16k> ok, 1 sec
03:22:20 <jmp9> any suggestions?
03:22:43 <klys> jmp9, is it supposed to respond via interrupt request?
03:22:54 <klys> idk
03:22:56 <jmp9> i set interrupt respond to zero
03:22:59 <doug16k> jmp9, are you writing a real driver, following the specification, or just hacking a few pokes at it in and hoping it is ok?
03:23:05 <jmp9> i polling instead
03:23:33 <jmp9> i dunno
03:23:46 <jmp9> xd
03:23:52 <doug16k> the specification spells out how to initialize and use it. if you pointed to where you have a failure in code it would help. you'd have to narrow it down to something
03:24:40 <jmp9> it worked on qemu
03:25:11 <doug16k> if it worked on qemu and not on real machine, you are probably not properly acknowledging IRQs. qemu cheats the ahci IRQs and wrong code will work
03:25:38 <zhiayang> something working on qemu is rarely an indication of anything
03:25:55 <doug16k> qemu only cares if correctly functioning drivers work. you could do all sorts of nonsense and qemu would be ok and real machine not ok
03:26:23 <doug16k> qemu doesn't go out of its way much to tell you your driver is bad
03:26:33 <doug16k> maybe a little if you enable tracing, but otherwise, no
03:28:26 <doug16k> jmp9, the spec will document several W1C (or RW1C) register bits. those are "read/write, and write-one-to-clear"
03:28:43 <doug16k> there are registers like that involved with telling the device that you handled that IRQ and now you want another one if it happens
03:29:28 <doug16k> writing 1 to that bit position clears it. writing 0 leaves it unaffected
03:29:42 <doug16k> they need to be zeroed properly to get any (or another) IRQ
03:30:38 <doug16k> qemu doesn't enforce that properly at all. your real machine is probably enforcing it perfectly
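
The write-one-to-clear behaviour described above can be sketched as a small model. The class and function names are hypothetical and this is not the AHCI register layout; it only illustrates the semantics: writing 1 clears a bit, writing 0 leaves it alone.

```python
class W1CRegister:
    """Model of a 32-bit write-one-to-clear status register (hypothetical)."""

    def __init__(self):
        self.value = 0

    def hardware_set(self, mask):
        # The device raises status bits when an event occurs.
        self.value |= mask & 0xFFFFFFFF

    def read(self):
        return self.value

    def write(self, data):
        # Every 1 bit written clears that status bit; 0 bits are unaffected.
        self.value &= ~data & 0xFFFFFFFF


def acknowledge(reg):
    """Read the pending bits and write the same value back to clear exactly those."""
    pending = reg.read()
    reg.write(pending)
    return pending
```

A common mistake this makes visible: writing 0 to "zero" the register does nothing, so the pending bit stays set and no further interrupt arrives.
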
03:33:42 <jmp9> what is a "Cold Presence"
03:34:25 <jmp9> you know, i'm not native english-speaker
03:36:02 <doug16k> I think that refers to fancy physical drive bays that can tell if a drive is in that bay or not, from software
03:37:05 <doug16k> there are things in there for big fancy controllers that can use a huge number of drives too, called port multipliers
03:37:15 <doug16k> you don't need to worry about that (yet!)
03:38:33 <jmp9> my laptop has 1 hardrive and 1 cdrom so why i need those
03:39:06 <jmp9> my os is especially for my laptop
03:39:22 <aalm> big fancy? i've got really cheap arm sbcs w/sata supporting pmp
03:41:09 <doug16k> aalm, nice
03:41:43 <doug16k> helpful when you don't have a lot of space for sata ports
03:42:24 <jmp9> i didn't had real pc for my entire life
03:42:26 <jmp9> only laptops
03:42:27 <jmp9> xd
03:50:52 <klys> doug16k, what are ye up to
03:52:03 <doug16k> tracking down an issue that only happens on my fat32 kernel module, problem with static initialization
03:52:25 <doug16k> in DT_INITARRAY calls
03:55:13 <doug16k> aalm, the other end of that pmp cable is the big fancy part, special enterprise drive enclosures
03:59:22 <doug16k> the issue I'm having is with some gross raw assembly it spat out to initialize something and there are no symbols for anything in that region
04:00:04 <doug16k> other than the super helpful name __static_initialization_0
04:00:27 <doug16k> I can see it in objdump -S though, so I know what is happening. don't see why yet
04:00:57 <klys> can ye tell what generated this code
04:01:06 <doug16k> yes
04:01:20 <doug16k> it's initializing a static pool
04:03:11 <doug16k> I can see the backtrace pretty well. there are a couple of ? in the generated stuff but works well enough and traces through them
04:03:29 <doug16k> actually no, I fixed that, now I see perfect backtrace
04:14:59 <zhiayang> how do microkernels pass messages?
04:15:15 <zhiayang> is it possible to say pass msgs from A to B without going through the kernel
04:15:22 <zhiayang> i would doubt so?
04:16:22 <immibis> might be, if they have shared memory set up
04:18:06 <zhiayang> right, my thought was that you’d need to find the target process somehow first
04:18:27 <zhiayang> and somehow tell it about the shared memory
04:18:42 <zhiayang> (presuming this isn’t some case where everything is hardcoded)
04:20:31 <immibis> well if it's your kernel then you get to decide. you could have a system call that maps shared memory into another process and somehow tells it the address. probably not very good for security
04:23:05 <klys>
04:24:16 <zhiayang> i’m not sure that’s a helpful link
04:24:28 <zhiayang> unless i’m missing something
04:25:30 <klys> it seems I was reading somewhere that message passing was preferred over shared memory, and that was the place I remember reading it.
04:25:49 <zhiayang> right
04:26:32 <zhiayang> my forthcoming implementation would probably have processes negotiate the details of shared memory through messages
04:26:41 <zhiayang> then use the shared mem for more efficient communication
04:26:53 <klys> I guess klange's kernel implements message passing, in case he's around.
04:26:57 <zhiayang> unless message passing can be done more efficiently
04:27:09 <zhiayang> eg. if you don’t need to enter the kernel to send/receive messages
04:27:10 <klange> ~
04:27:13 <zhiayang> hence my original question
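
One way to picture the idea discussed above (negotiate a shared region once, then exchange messages without entering the kernel) is a single-producer, single-consumer ring in a shared buffer. This is a minimal sketch under assumptions: a `bytearray` stands in for the mapped region, the layout and names are invented, and a real implementation would need memory barriers and counter wraparound handling.

```python
import struct

SLOT_SIZE = 16   # one length byte plus up to 15 payload bytes
SLOTS = 4
HEADER = 8       # two little-endian u32 counters: head (consumer), tail (producer)


def new_region():
    """Stand-in for a memory region mapped into both processes."""
    return bytearray(HEADER + SLOT_SIZE * SLOTS)


def _counters(buf):
    return struct.unpack_from("<II", buf, 0)


def send(buf, payload):
    """Producer side: copy the payload into the next slot, then publish it."""
    head, tail = _counters(buf)
    if tail - head == SLOTS or len(payload) > SLOT_SIZE - 1:
        return False  # ring full or message too large
    off = HEADER + (tail % SLOTS) * SLOT_SIZE
    buf[off] = len(payload)
    buf[off + 1:off + 1 + len(payload)] = payload
    struct.pack_into("<I", buf, 4, tail + 1)  # publish only after the data is written
    return True


def receive(buf):
    """Consumer side: return the oldest message, or None if the ring is empty."""
    head, tail = _counters(buf)
    if head == tail:
        return None
    off = HEADER + (head % SLOTS) * SLOT_SIZE
    length = buf[off]
    payload = bytes(buf[off + 1:off + 1 + length])
    struct.pack_into("<I", buf, 0, head + 1)  # release the slot
    return payload
```

Neither `send` nor `receive` makes a system call here; the kernel would only be involved in setting up the mapping and, typically, in waking a sleeping receiver.
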
04:49:54 <ryoshu> hi
04:50:09 <ryoshu> is it possible to enable CR0_PG and disable CR4_PAE?
04:50:41 <ryoshu> and then read IA32_EFER?
04:50:44 <ryoshu> with rdmsr
04:52:25 <ryoshu> I have a case that hypervisor treats this scenario in a guest as invalid
04:52:45 <ryoshu> and it emits exception General Protection fault
04:56:14 <ryoshu> hmm for i386 cpu
04:56:46 <klys> do you mean the hypervisor was emulating a 386 ?
04:56:50 <ryoshu> yes
04:56:56 <klys> well okay
04:58:31 <ryoshu>
04:58:50 <ryoshu> r = 1; causes GP
04:58:53 <ryoshu> exception
04:59:40 <klys> well I would say some of those features, CR4_PAE, IA32_EFER, should not be available on a plain 386.
05:00:09 <ryoshu> do you mean that they are amd64 only?
05:00:24 <klys> I'm not sure about the second one
05:00:42 <ryoshu> openbsd uses that
05:00:42 <klys> PAE is sometimes used on 32-bit chips, though
05:00:56 <ryoshu> probably as a hack for w^x, I recall something about it
05:01:13 <klys> just that PAE is not an i386 feature
05:02:13 <ryoshu>
05:02:18 <ryoshu> slide 7
05:03:25 <klys> there's an i386 manual packed up for online viewing here:
05:04:32 <ryoshu> is this a floppy image?
05:04:34 <ryoshu> for dos?
05:04:39 <ryoshu> $ file i386doc.144
05:04:39 <ryoshu> i386doc.144: DOS/MBR boot sector, code offset 0x3c+2, OEM-ID "FRDOS5.1", root entries 224, sectors 2880 (volumes <=32 MB), sectors/FAT 9, sectors/track 18, serial number 0xa669ac4d, unlabeled, FAT (12 bit), followed by FAT
05:18:08 <klange> ryoshu: It's a bootable FreeDOS floppy with an Intel manual in text form.
05:19:29 <ryoshu> is this 1988 80386 manual?
05:19:55 <klange> It says 1986 at the top.
05:20:53 <ryoshu> I have got a .txt copy, but thanks
05:23:40 * eryjus might actually have an original bound copy of that in a box somewhere...
05:24:13 <ryoshu>
05:24:20 <ryoshu> movl $MSR_EFER, %ecx
05:24:25 <ryoshu> rdmsr
05:25:06 <ryoshu> with PAE disabled
05:25:06 <ryoshu> HAXM protests with an exception
05:27:02 <immibis> klys: shared memory is used to implement more efficient message passing
05:27:13 <klys>
05:27:15 <immibis> zhiayang: ^
05:45:06 <doug16k> ryoshu, you describe it as enabling paging then disabling pae. is that really what you mean? in that order?
05:45:48 <doug16k> I'd be surprised if changing PAE with PG=1 ever didn't #GP
05:46:14 <ryoshu> it looks like openbsd 1. enables PG 2. enables NXE in EFER 3. enables PAE
05:46:58 <ryoshu> doug16k: is this sane behavior?
05:47:08 <ryoshu> or works by an accident
05:47:15 <doug16k> no it isn't sane
05:47:17 <ryoshu> due to illegal/UB operation in cpu
05:47:24 <doug16k> you must be misinterpreting what happens
05:47:29 <doug16k> or what order it applies it
05:47:49 <ryoshu>
05:48:44 <ryoshu> HAXM generates GP for MSR_EFER read with !PAE && PG
05:48:48 <doug16k> that can't possibly be right. how could PAE page tables ever work for those cycles between PG=1,PAE=0 and later PAE=1. it magically did page table walks with PAE at the wrong value?
05:49:14 <klys> EFER doesn't exist on a 386?
05:49:17 <doug16k> or does it set up a recursive trick where everything maps back to the same place
05:50:27 <doug16k> if you set up a recursive mapping that maps the 0th physical address and the 2nd 32 bits of the PTE are all zeros, it has a chance of making sense.
05:50:31 <ryoshu> doug16k: please see the code
05:50:51 <ryoshu> ENTRY(cpu_paenable)
05:51:14 <ryoshu> it reloads PDPT
05:51:53 <klys> ryoshu, which cpu are you emulating now?
05:52:10 <ryoshu> i386
05:52:16 <ryoshu> on amd64
05:52:24 <klys> and cpuid tells you it's one?
05:52:25 <doug16k> orl $0xfe0, %edi ???
05:55:32 <klys> the i386 had no rdmsr instruction.
05:55:45 <ryoshu> comments on this 0xfe0?
05:57:07 <ryoshu> klys: if we want to be strict with i386 === then there is no cpuid on it
05:57:11 <ryoshu> klys: just 32-bit x86
05:57:47 <doug16k> you can see if cpuid exists by seeing if the eflags ID flag can be changed
05:58:01 <doug16k> if it changes when you change it, then cpuid exists
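
The detection trick described above can be sketched against a fake CPU. The ID flag is bit 21 of EFLAGS; everything else here (`FakeCpu`, `write_eflags`) is a hypothetical model, since real code would use `pushf`/`popf` in assembly.

```python
EFLAGS_ID = 1 << 21  # ID flag: software can toggle it only if CPUID is supported


class FakeCpu:
    """Hypothetical CPU model whose ID flag is either writable or stuck."""

    def __init__(self, id_flag_writable):
        self.id_flag_writable = id_flag_writable
        self.eflags = 0x2  # bit 1 always reads as 1

    def write_eflags(self, value):
        if not self.id_flag_writable:
            # Older CPUs ignore writes to the ID bit: keep the old value.
            value = (value & ~EFLAGS_ID) | (self.eflags & EFLAGS_ID)
        self.eflags = value | 0x2


def has_cpuid(cpu):
    """Flip the ID bit, see whether the change sticks, then restore EFLAGS."""
    original = cpu.eflags
    cpu.write_eflags(original ^ EFLAGS_ID)
    changed = (cpu.eflags ^ original) & EFLAGS_ID
    cpu.write_eflags(original)
    return changed != 0
```
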
05:58:18 <ryoshu> cpuid was added with the 80486 (and not on all models)
05:58:27 <ryoshu> but it's not important now.
05:58:36 <klys> rdmsr requires i586.
05:58:41 <ryoshu> I'm trying to understand this openbsd hack
05:58:47 <ryoshu> and unbreak HAXM emulating it
05:59:04 <doug16k> I detect all the way back to 8088 in my bootloader (mostly for a laugh) ->
06:00:10 <doug16k> realistically you'd have to go out of your way to get that disk content onto the disk then have trouble connecting that drive to something that old
06:00:48 <ryoshu> I'm not attached to using real hw
06:00:57 <doug16k> can be done of course. there's a market for retro hardware
06:02:03 <klys> all you've said is you got a #GP, so what kind?
06:02:20 <ryoshu> #GP is just inserted by HAXM
06:02:37 <ryoshu> hax_inject_exception(vcpu, VECTOR_GP, 0);
06:02:49 <ryoshu> just after returning from the linked function
06:04:01 <klys> "The general protection exception is a fault. In response to a general
06:04:01 <klys> protection exception, the processor pushes an error code onto the exception
06:04:04 <klys> handler's stack."
06:14:35 <doug16k> interesting that it explicitly fills the ignored bits of cr3 with 1's with a dedicated or instruction
06:14:38 <doug16k> I can't get over that
06:15:10 <doug16k> line 1626
06:16:08 <ryoshu> without writing ignored 1s what would be the value?
06:16:49 <klys> zero
06:16:51 <doug16k> those bits of cr3 (11:5) are irrelevant and ignored by the cpu
06:16:53 <klys> well
06:17:07 <doug16k> unless(!) PCID is enabled.
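
As a quick check of the arithmetic, the constant in `orl $0xfe0, %edi` is exactly the mask for bits 11:5. The helper name below is made up for illustration, and the sample CR3 value is arbitrary.

```python
def bit_range_mask(high, low):
    """Mask with bits low..high (inclusive) set."""
    return ((1 << (high - low + 1)) - 1) << low


MASK_11_5 = bit_range_mask(11, 5)  # equals 0xFE0

# OR-ing the mask into an arbitrary, page-aligned CR3 value changes only bits 11:5.
sample_cr3 = 0x00123000
with_ones = sample_cr3 | MASK_11_5
```

Here `with_ones` is `0x00123FE0`: bits 31:12 and bits 4:0 of the sample value are untouched.
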
06:17:32 <doug16k> you're sure that CR0.PG is 1 at entry to paeenable?
06:17:38 <doug16k> can't be
06:17:55 <ryoshu> doug16k I will check it
06:19:52 <doug16k> well, PG doesn't matter unless CR0.PE is also 1
06:20:00 <ryoshu> [ 35321,860271] CR4_PAE=0 CR0_PG=1
06:20:10 <ryoshu> during rdmsr()
06:20:21 <doug16k> and PE?
06:20:51 <klys> that would be the least significant bit of cr0
06:21:22 <klys> reason for gpf 13. Loading CR0 with PG=1 and PE=0.
06:22:29 <doug16k> ryoshu, if you run latest (master) qemu, then my patch enables print $cr0 to display it decoded
06:22:36 <ryoshu> [ 35479,018447] CR4_PAE=0 CR0_PG=1 CR0_PE=1
06:22:36 <doug16k> it was applied recently
06:23:03 <doug16k> (from gdb I mean)
06:23:25 <ryoshu> thanks, for now I'm printing in my dmesg
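
The flags being printed above can be decoded with a few masks. The bit positions are the architectural ones (CR0.PE is bit 0, CR0.PG is bit 31, CR4.PAE is bit 5); the function names are invented, and long mode (EFER.LME) is ignored in this sketch.

```python
CR0_PE = 1 << 0    # protected mode enable
CR0_PG = 1 << 31   # paging enable
CR4_PAE = 1 << 5   # physical address extension


def describe(cr0, cr4):
    """Decode the three flags from the discussion into booleans."""
    return {
        "PE": bool(cr0 & CR0_PE),
        "PG": bool(cr0 & CR0_PG),
        "PAE": bool(cr4 & CR4_PAE),
    }


def cr0_load_faults(cr0):
    """Loading CR0 with PG=1 and PE=0 raises #GP, as klys quotes."""
    return bool(cr0 & CR0_PG) and not (cr0 & CR0_PE)


def paging_mode(cr0, cr4):
    """32-bit view only; long mode is not modelled here."""
    if not (cr0 & CR0_PG):
        return "paging off"
    return "PAE" if (cr4 & CR4_PAE) else "32-bit (non-PAE)"
```

With the state ryoshu logged (PG=1, PE=1, PAE=0), `paging_mode` reports "32-bit (non-PAE)", which is the state in which HAXM rejects the EFER read.
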
06:25:15 <doug16k> I guess I'd see if intel actually forgot to disallow that and it somehow has initial page tables that make enough sense in non-PAE mode for it to survive until it turns on PAE. I would expect that to #GP though. weird that the code does that
06:25:16 <ryoshu> and qemu 3.0.0 as it's more stable with haxm/netbsd than 3.1.0 (haxm to be blamed!)
06:28:52 <doug16k> I wouldn't even try to change the interpretation of the page tables while they are being used. it's just asking to break
06:29:16 <doug16k> working or not on real machines, it's ridiculous
06:30:06 <ryoshu> with this I can get openbsd to keep booting...
06:30:10 <doug16k> it's a silicon bug if that works on real processors, no sane os would want that
06:30:44 <ryoshu> however it breaks later, not sure whether due to this reason or not
06:30:50 <ryoshu> with: [ 32796,795571] haxm_panic: Unexpected page fault, kill the VM!
06:32:29 <ryoshu> doug16k: can I mention you in a PR on haxm?
06:32:32 <ryoshu>
06:32:56 <doug16k> no need
06:33:04 <doug16k> fine either way
06:33:27 <ryoshu> anyway virtualization shall reproduce hw bugs
06:37:03 <ryoshu> I won't comment whether I treat o.bsd sane ;)
06:37:12 <doug16k> :
06:37:14 <doug16k> D
06:40:31 <doug16k> I guess they already had the CR4.PGE thing flushing the TLB, maybe they thought it was clever to just TLB flush the PAE 0 to 1 transition too and call it a day
06:42:13 <doug16k> or transition in either direction actually
06:42:48 <ryoshu>
06:47:45 <ryoshu> it's 'Switch over to PAE page tables' function
07:01:02 <doug16k> opengrok needs to check location.hash after populating the content if they want to cheat and defer loading the content and still support hashes :P
07:01:11 <doug16k> either that or fix your hashchange event handler. sheesh
07:01:41 <doug16k> going back to it is broken and getting really irritating
07:03:33 <doug16k> back as in the back button, returning to it after following a link to a type or define or something
07:07:36 <klys> doug16k, about that virtio setup....
07:07:50 <klys> :)
07:15:54 <doug16k> klys, oh 1 sec :D
07:21:44 <doug16k>
07:23:18 <klys> thx
07:23:38 <doug16k> auto resizes beautifully, mouse works totally natively and no weird laggy feel
07:24:03 <doug16k> fullscreen is perfect - ctrl-alt-F
07:24:32 <doug16k> no click to capture or uncapture
07:26:09 <doug16k> klys, also,
07:26:36 <doug16k> those 3 things at the top make the sound have no pops in pulseaudio
07:27:32 <klys> okay mebby playing videos online isn't the best way to test for 3d accel
07:28:09 <doug16k> oh, you want 3d?
07:28:27 <klys> yeah I have DRI2, let me paste mine
07:28:33 <doug16k> then you need to add ,gl=on after sdl no space
07:29:00 <doug16k> that works for me in my OS
07:29:18 <doug16k> I have partially done upgrade of my 2d virtio-vga to 3d-gpu
07:29:27 <doug16k> my driver finds it
07:29:38 <doug16k> it = 3d capability
07:30:21 <doug16k> I should have ,gl=on in my config. I guess I forgot it! it's awesome without it though
07:30:50 <doug16k> I can watch videos nearly perfect in mine. it's a haswell that turbos pretty much continuous 3.6GHz
07:31:01 <doug16k> when pegged
07:31:11 <klys>
07:32:07 <doug16k> this is my laptop btw, not that it matters much, it's gtx 1060m with non-free linux nvidia drivers
07:32:18 <doug16k> er 860m sorry
07:32:23 <klys> okok
07:32:44 <klys> I have a gtx 960 somewhere, just using this tablet-pc notebook for now.
07:33:11 <klys> anyways, it says some things about the gpu in the log I posted
07:33:50 <doug16k> yep, looks fine
07:33:51 <klys> just I need something to test it with?
07:34:07 <klys> like, a "hello world" shader program
07:34:40 <klys> something that should not work if I pass in a cirrusfb card
07:34:50 <doug16k>
07:35:17 <doug16k> no way that will run > 0 fps on cpu with no accel
07:36:31 <doug16k> and if that one seemed too easy on the gpu:
07:37:18 <klys> how to compile it?
07:37:31 <doug16k> it just runs right away behind the editor
07:37:41 <doug16k> your browser is failing it I guess or webgl is off
07:37:52 <klys> oh I see
07:37:56 <doug16k> try it on a fully setup machine if you have one
07:38:32 <doug16k> you realize that that render is just a quad right? and the shader is figuring out what color to make each pixel with the same program for every pixel?
07:39:17 <doug16k> there's no geometry besides the two giant triangles to cover the screen
07:41:26 <doug16k> every run of the shader gets its position and place on the screen as incoming parameters. it runs lots in parallel. my fan goes up to full when it runs
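
The "same program for every pixel" idea can be imitated on the CPU. This is only an analogy in Python, not GLSL: `shade` plays the role of the fragment shader and `render` stands in for the GPU running it over the full-screen quad. Both names and the colour formula are made up.

```python
def shade(x, y, width, height):
    """Per-pixel program: receives only its own position and returns an RGB colour."""
    u = x / (width - 1)    # normalised horizontal position, 0.0 to 1.0
    v = y / (height - 1)   # normalised vertical position, 0.0 to 1.0
    return (round(255 * u), round(255 * v), 128)


def render(width, height):
    """Run the same shade() for every pixel; a GPU does this in parallel."""
    return [[shade(x, y, width, height) for x in range(width)]
            for y in range(height)]
```

The sketch assumes width and height are at least 2. A software renderer does essentially this loop, which is why a heavy shader drops to a fraction of a frame per second without acceleration.
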
07:42:04 <klys> oo
07:42:12 <klys> it runs in the browser
07:42:25 <klys> now to turn off virgl and test
07:42:28 <doug16k> yes, webgl
07:42:34 <doug16k> oh nice!
07:42:41 <klys> :)
07:44:58 <klys> Loading...
07:45:59 <klys> even compton doesn't get it going
07:46:23 <klys> may have spoken too soon
07:47:01 <doug16k> you lost me at compton
07:47:03 <klys> oo slow
07:47:13 <klys> it's the software renderer
07:47:29 <doug16k> it software runs the shader? wow!!
07:47:39 <doug16k> that 2nd one I gave will kill it to well under 1 fps
07:47:44 <klys> yeah, at a frame every 2 seconds
07:49:32 <doug16k> 0.5fps is pretty impressive for cpu. seriously
07:49:41 <klys> the amiga one is running ~5 fps
07:49:47 <doug16k> they did a really good job on that software shader
07:49:52 <doug16k> ...engine
07:49:54 <klys> yeah
07:50:59 <klys> do you have a webgl interface by chance (i feel dumb)
07:52:06 <doug16k> this is the most popular webgl lib afaik:
07:52:17 <doug16k> you could take one of their examples and go I think
07:52:33 <klys> I mean, DGOS
07:52:53 <klys> if you have virtio-gpu support, what kind of test do you have?
07:53:10 <doug16k> oh I only have a little 3d done. the 2d is fully done but 3d only has a bunch of the type declarations and some helpers for issuing them properly
07:53:38 <klys> so you can test 2d acceleration with a supplied program?
07:53:57 <doug16k>
07:54:24 <doug16k> the "acceleration" you get with virtio 2d is that it doesn't trap every framebuffer memory access or have to do any polling or anything
07:54:57 <klys> is there a demo
07:54:59 <doug16k> you attach backing to the screen with your guest memory and you tell it when you need to copy new things to the screen (because you changed that region)
07:55:39 <doug16k> so, you can do a bazillion memory accesses filling, drawing, scrolling, whatever, then, when you have something to present to the user, you tell virtio to update some region of the screen from your backing
07:55:50 <klys> yeah
07:55:55 <doug16k> then qemu can read it once and update the screen cleanly
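
The draw-then-present flow described above can be sketched with a dirty rectangle. This is a toy model, not the virtio-gpu command set: the `Display` class and its methods are hypothetical, with `backing` standing for guest memory and `screen` for what the host shows.

```python
class Display:
    """Guest draws into `backing`; `flush` copies only the changed region."""

    def __init__(self, width, height):
        self.backing = [[0] * width for _ in range(height)]  # guest memory
        self.screen = [[0] * width for _ in range(height)]   # host-side image
        self.dirty = None   # (x0, y0, x1, y1) with exclusive upper bounds
        self.flushes = 0

    def put_pixel(self, x, y, color):
        # Plain memory write: nothing is trapped, nothing reaches the host yet.
        self.backing[y][x] = color
        if self.dirty is None:
            self.dirty = (x, y, x + 1, y + 1)
        else:
            x0, y0, x1, y1 = self.dirty
            self.dirty = (min(x0, x), min(y0, y), max(x1, x + 1), max(y1, y + 1))

    def flush(self):
        """Tell the host which region changed; it reads that region once."""
        if self.dirty is None:
            return None
        x0, y0, x1, y1 = self.dirty
        for y in range(y0, y1):
            self.screen[y][x0:x1] = self.backing[y][x0:x1]
        self.dirty = None
        self.flushes += 1
        return (x0, y0, x1, y1)
```

Any number of `put_pixel` calls cost only memory writes; the host sees the result after one `flush`.
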
07:56:24 <klys> that sounds very sanitized...
07:56:31 <doug16k> it's pretty nice
07:56:35 <doug16k> you have the spec right?
07:57:04 <klys> I still don't have allocation in my project, I'm like halfway there if that.
07:58:21 <doug16k>
07:58:28 <graphitemaster> does anyone know how to get gcc not to generate these endbr64 (branch target instructions)
07:58:46 <klys> I was reasoning with virgl because I had to recommend it to someone else a few weeks ago, and decided to test it myself if it's so great.
07:58:51 <graphitemaster> they're driving me nuts when reasoning about disassembly
08:00:00 <doug16k> klys, there's a doc somewhere that explains how to use the 3d portion (the pdf documents 2d)
08:00:32 <klys> doug16k, yeah just I'm getting way ahead of myself
08:00:36 <doug16k>
08:00:55 <doug16k> let it be motivation to get past the stuff you're at now
08:02:30 <doug16k> klys, as far as I have seen virgl works fine and is awesome
08:02:57 <klys> doug16k, so do you have a 2d accel test program or not...
08:03:00 <doug16k> haven't seen far enough to be sure yet though
08:03:11 <doug16k> ya 1 sec
08:04:35 <doug16k> I made this a while ago. idk if it will use 2d or 3d though ->
08:04:52 <doug16k> it figures out how many balls can bounce at 60 fps
08:05:24 <klys> doug16k right, and can dgos test this too?
08:05:35 <doug16k> no
08:05:38 <klys> kk
08:06:42 <doug16k> I did a full screen clipped blt of a big png file a while back to see how fast it could go in a vm. got it to do about 1200 fps
08:06:53 <doug16k> which is my memory bandwidth
08:07:11 <doug16k> approx
08:07:11 <klys> that answers my question
08:07:24 <doug16k> ya it's heavily optimized
08:07:32 <doug16k> the vm end I mean
08:07:45 <klys> well let's see it, do you still have the benchmark
08:08:18 <doug16k> I'd have to go back tons of commits
08:09:03 <doug16k> if you really want to see, look for a link I posted here a long time ago showing a funny looking bug in my png decoder and that'd be the approx date to checkout from git
08:09:16 <doug16k> pic of a mountain
08:10:37 <klys> yeah this thing is getting ~50fps in virgl+xorg+gnu+linux-4.19+qemu
08:10:55 <doug16k> it always gets about 60 fps
08:10:58 <doug16k> how many balls?
08:11:07 <klys> ~420
08:11:09 <doug16k> "target" value
08:11:15 <klys> at 640x480
08:11:39 <doug16k> ah, that's decent. my 860m in firefox is getting ~960
08:11:46 <doug16k> on real machine
08:12:06 <klys> now it's about 60 at 1024x768
08:12:43 <doug16k> it tracks 60. it adjusts target until it is just about to lose 60 then backs off to 60, then overshoots it... it keeps trying to add more balls without losing fps
08:12:51 <doug16k> look at how many it can do
08:12:59 <klys> ~450
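
The tracking behaviour described above (add balls while 60 fps holds, back off when it drops) might look roughly like this. It is a guess at the control loop, not the page's actual code; the step size and the `simulated_fps` machine model are assumptions.

```python
def adjust_target(target, measured_fps, goal_fps=60.0, step=10):
    """Add balls while the goal frame rate holds, remove some when it drops."""
    if measured_fps >= goal_fps:
        return target + step
    return max(0, target - step)


def simulated_fps(target, capacity, goal_fps=60.0):
    """Hypothetical machine: holds goal_fps up to `capacity` balls, then degrades."""
    if target <= capacity:
        return goal_fps
    return goal_fps * capacity / target


def run(capacity, frames=200):
    """Let the controller settle and return the final ball count."""
    target = 0
    for _ in range(frames):
        target = adjust_target(target, simulated_fps(target, capacity))
    return target
```

With a simulated capacity of 450, the target climbs and then oscillates within one step of 450, which matches the "overshoots, then backs off" description.
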
08:13:20 <doug16k> this is more of a torture test of the browser layout engine than a fill test though
08:13:35 <klys> it's cool because it's in the background behind my konsole
08:13:49 <doug16k> the "balls" are a bunch of divs with giant border radius
08:14:12 <klys> divides?
08:14:19 <klys> oh
08:14:23 <klys> the DOM element
08:14:43 <doug16k> ya, <div></div> kind
08:14:57 <klys> right-o
08:19:19 <klys> fri 03 mar 2017: 14:34 < doug16k_> bug in my png loader :D
08:19:25 <klys> dead link, though
08:20:59 <klys> meanwhile the earliest commit on is 13 jul 2018
08:22:27 <doug16k> no go back more than that
08:22:46 <doug16k> you don't even need virtio gpu - the fillrate test was against a vesa LFB
08:23:32 <doug16k> hey, the commit that adds background.jpg would be near the right place in time
08:26:01 <doug16k> I'm going to be ready to start hammering virtio-gpu soon, I'm doing the windowing/compositing stuff right now. I got virtio to the point where I can draw to it and it works and it magically resizes when the window size changes, and automatically rearranges the scanlines so only the newly exposed region is blank
08:26:16 <doug16k> or truncating and moving scanlines appropriately when reducing window size
08:26:52 <klys>
08:28:15 <doug16k> more like here or soon after
08:28:57 <doug16k> I tried to maximize transfer bandwidth from the backbuffer to the framebuffer with write combining
08:29:45 <klys> yeah I was just looking at that, now trying to find the tree
08:30:19 <klys> "browse files" seems to work
08:31:11 <klys> "clone or download" is just for the main branch, however.
08:31:30 <doug16k> do you have a git clone of it somewhere?
08:31:30 <klys> oh
08:31:34 <klys> got it
08:32:02 <klys> the "download zip" thing works for the commit
08:32:52 <doug16k> it will print performance information to the debug output at the end of the test
08:35:30 <doug16k> the code for the balls thing in case you care about js at all ->
08:35:48 <doug16k> that's my site btw, it's like jsfiddle with no spyware or ads and it doesn't inject crap into your result
08:36:25 <Prf_Jakob> doug16k: write combining? On real gpu hardware or virtio?
08:36:47 <doug16k> Prf_Jakob, to real vesa LFB
08:36:53 <Prf_Jakob> Ah ok
08:37:00 <doug16k> but ya inside qemu I do it there too as if it is real
08:37:39 <doug16k> now it will be virtio and I'll just be updating system memory backing
08:38:03 <doug16k> so no need for fancy WC
08:41:02 <doug16k> ran into an interesting issue there and with keyboard and mouse. I might use VBE to set up the initial GPU state in bootloader, then later, discover the virtio-gpu, then have to hijack the screen and stop using vesa lfb and start using virtio. similar issue hijacking mouse and keyboard in my usb class drivers. I don't really do it formally or properly
08:41:50 <doug16k> when I wrote i8042 driver, the last thing I expected was unloading it :)
08:43:19 <doug16k> not a big deal, just in there with all the other todos
08:43:35 <nyc`> I need to be able to put in more time at the computer for the binutils ELF target bughunt. I'm going to go out to try to take care of some medical issues today so I'll be able to do that. It's an odd way to spend my birthday, but I'm far from enthused about being another year older anyway.
08:44:08 <klys> doug16k: cc1: error: code model kernel does not support PIC mode
08:45:31 <nyc`> Aren't people using PIE in-kernel for security reasons anymore?
08:48:06 <mischief> openbsd does something like that
08:48:12 <klys> so I just add -fno-PIC
08:49:14 <doug16k> nyc`, I'm using PIE in my kernel
08:49:34 <doug16k> now, not way back then
08:50:00 <nyc`> I think the kernel address space layout randomization is somehow tied to Spectre/Meltdown.
08:50:11 <doug16k> bootloader can throw it wherever and it just works (tm)
08:50:27 <klys> and now it's: ld: example.o: relocation R_X86_64_32S against `.data' can not be used when making a PIE object; recompile with -fPIC
08:50:57 <doug16k> klys, man, I don't know what to tell you. it worked with whatever I was building with then. nothing proper by that point
08:51:11 <klys> yeah
08:51:30 <doug16k> the mysteries of code rot
08:52:10 <doug16k> klys, I'll have a perf test soon for my windowing changes. should work then
08:53:11 <nyc`> I mostly remember hearing about abusing link loading relocation processing at boot time for micro-optimizing boot-time -initialized constants.
08:54:26 <doug16k> you can probably pull that off using ifunc relocations without too much trouble
08:55:25 <doug16k> plt still spectre vulnerable though, no retpoline in sight and using indirect calls. idk what to do there other than kill perf altogether by using __attribute__((__noplt__)) and have it stall on a retpoline for what should be a direct call
08:56:03 <nyc`> I'm not big on micro-optimizing things like that or inline functions or inline assembly.
08:57:12 <klys> doug16k, is this 32-bit or 64-bit code?
08:57:20 <doug16k> 64
08:57:23 <klys> kk
08:58:48 <nyc`> doug16k: Well, I'm foggy on how the PLT is a Spectre vulnerability.
08:59:17 <doug16k> nyc`, all indirect calls and indirect branches are the beginning of a spectre vulnerability
08:59:47 <doug16k> need a way to train that branch to go to a gadget and somehow infer what the gadget did
09:00:10 <doug16k> and you're done, exploit complete
09:00:16 <doug16k> no indirect branch no vulnerability
09:01:08 <nyc`> I think using the indirect branch as an exploit needs a little bit more.
09:01:13 <doug16k> retpoline fixes it by making it mispredict every time into a safe code path and stall until it realizes it should have gone to the correct place
09:03:56 <nyc`> I think it takes more than knowing where you're going to branch to in order to exploit a system.
09:08:26 <nyc`> It isn't even necessarily giving away mildly privileged information that I can tell, never mind granting access to privileged functions or data.
09:09:57 <doug16k> the branch prediction mechanism uses essentially a hash table to store the prediction information, and that table has collisions. user code can control the predicted next instruction for any branch in the kernel
09:10:15 <nyc`> I may be failing to understand.
09:11:06 <doug16k> you can use that to make the cpu mispredict into an attacker controlled location in the kernel with potentially some control over the registers, and have it perturb the cache in a way that you can use later to see if the access was fast or slow
09:11:23 <doug16k> not any branch sorry. any indirect branch
09:11:58 <doug16k> if the access was fast your gadget touched there, otherwise, didn't
09:12:35 <nyc`> That's getting closer to trouble.
09:12:39 <doug16k> because a miss will be 1000+ cycle stall and a hit (that hit because the gadget touched it) won't
09:14:05 <doug16k> you can trick the kernel into perturbing one cache line or another or one of a series of cache lines based on the value of something you shouldn't be able to access
09:14:20 <doug16k> then elsewhere infer those secret values from which lines are quick and aren't a cache miss
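The inference step doug16k describes (the gadget touches one cache line chosen by a secret, and the attacker later times every candidate line) can be sketched as a toy simulation. This is only a model: ToyCache, gadget, infer_secret and both latency constants are made up for illustration, nothing here measures real hardware, and the branch-predictor training part of the attack is not modelled at all.

```python
# Toy model only: the "cache" is a set of line indices and the latencies
# are invented constants, not measurements.
HIT_CYCLES = 40      # hypothetical latency of an access that hits
MISS_CYCLES = 1000   # hypothetical miss latency, echoing the "1000+ cycle" figure in the chat


class ToyCache:
    def __init__(self):
        self.lines = set()

    def flush(self):
        """Evict everything, so the next access to any line is a miss."""
        self.lines.clear()

    def access(self, line):
        """Touch a line and return the simulated latency of that access."""
        latency = HIT_CYCLES if line in self.lines else MISS_CYCLES
        self.lines.add(line)
        return latency


def gadget(cache, secret_byte):
    """Stand-in for the mispredicted code: touches one line picked by the secret."""
    cache.access(secret_byte)


def infer_secret(cache):
    """Attacker side: time each candidate line once; the fast one was touched."""
    timings = {line: cache.access(line) for line in range(256)}
    return min(timings, key=timings.get)


cache = ToyCache()
cache.flush()
gadget(cache, 0x5A)
print(hex(infer_secret(cache)))  # prints 0x5a
```

The attacker never reads the secret directly; it only learns which of the 256 lines was fast, which is the point made above about inferring values from which lines are not a cache miss.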
09:14:47 <nyc`> I'm still a bit short of connecting it to privilege escalation or password leaks, but still.
09:15:19 <doug16k> I don't claim to be the least bit black hat.
09:15:42 <doug16k> I understand the principle behind it though
09:16:27 <nyc`> I'm not a security person either. I have rarely had direct contact with userspace.
09:18:19 <doug16k> my blackest hat days were back when windows 95 shared your entire root drive read write with the world if you didn't set up the internet properly, and thousands of people's entire drives and printers were right there
09:18:58 <doug16k> I didn't touch anything, I found them by accident
09:19:25 <nyc`> I didn't have personal ownership of computers until 2000.
09:21:10 <nyc`> They were things I went to labs to log into on Wyse50's or Visual Graphics X-19's.
09:24:15 <doug16k> I almost made a program that was going to connect to every one of them and print out a message that their entire hard drive and printer is shared with the internet and they need a firewall or someone to fix their setup. I didn't because that alone would be an illegal use of their computer, so I abandoned my little altruistic endeavour
09:25:08 <doug16k> print out on their printer that is
09:25:58 <doug16k> I'm sure many less noble souls put nasty programs and did writes to programs left and right
09:26:38 <ryoshu> what is intel trm?
09:27:30 <ryoshu> I got a reference to trm 4.1.2
09:28:20 <doug16k> short for technical reference manual?
09:28:37 <doug16k> doubt though, the manual title is "Intel® 64 and IA-32 Architectures Software Developer’s Manual"
09:28:37 <nyc`> I remember the encroachment of Microsoft affairs on the university. It was resented etc. because it rendered the systems useless in many respects, e.g. compilers and ways to remote access to other systems online were unavailable on the platforms.
09:28:38 <ryoshu> =SDM?
09:28:40 <doug16k> SDM would be that
09:29:59 <doug16k> ryoshu, paging related?
09:30:19 <ryoshu> yes
09:30:31 <doug16k> ya that's volume 3 of SDM
09:30:35 <doug16k> 4.1.2 Paging-Mode Enabling
09:31:32 <ryoshu> looking! thanks
09:32:08 <klys> doug16k: ever get this one: ld: final link failed: file truncated
09:32:33 <doug16k> not that I remember
09:32:37 <doug16k> what if you make again
09:32:56 <doug16k> it might have concurrency bugs in that old build
09:32:59 <nyc`> I'm getting emulations not being recognized.
09:33:31 <ryoshu> doug16k: there is switch PG=1 PAE=0 LME=0 -> PG=1 PAE=1 LME=0
09:33:45 <klys>
09:34:12 <doug16k> ryoshu, ya, so it just tlb flushes like toggling CR4.PGE and yay just works
09:34:14 <doug16k> very strange
09:34:31 <nyc`> The mips64-elf and mips64el-elf targets are only picking up a single 32-bit emulation.
09:35:24 <doug16k> I guess they figured the catastrophe would come soon enough if you accidentally did it in a kernel :)
09:36:00 <doug16k> no need to early #GP it, the world ends if you didn't mean it
09:36:09 <klys> doug16k, if dot moved backwards, does that mean you've subtracted from the program counter?
09:36:20 <doug16k> klys, it means something didn't fit
09:36:25 <klys> ah
09:36:29 <doug16k> that was my crazy 16 bit bootloader that barely fit
09:36:41 <doug16k> now it can be giant 32 bit program
09:36:46 <nyc`> ld/configure.tgt seems to need some kind of support code and errors happen even if it's modified to accept more emulations on the command line.
09:37:22 <doug16k> klys, needs smallest code in bootloader probably
09:37:29 <doug16k> 1st guess anyway
09:37:33 <klys> kk
09:37:42 <doug16k> and bootloader must be optimized
09:37:48 <ryoshu> doug16k: "Software can make transitions between 32-bit paging and PAE paging by changing the value of CR4.PAE with MOV to CR4."
09:38:44 <doug16k> pretty awesome but I'd rather have it trip #GP than run in crazy mode for a short time if it were done erroneously when you are changing something that is on
09:39:32 <doug16k> they probably figured they already are wired up to flush on other CR4 bit changes so why not just flush for that one and have neato change on the fly
09:40:20 <doug16k> CR4.PGE changes flush the TLB
09:40:28 <ryoshu> so the question is whether rdmsr() is legal with !PAE && PG
09:40:29 <doug16k> already
09:40:41 <doug16k> sure it is, why wouldn't it be?
09:40:42 <nyc`> I need to be able to be rested so I can sit down and trace the steps binutils goes through to find where I have to add something for the ${arch}-elf "triplets."
09:41:01 <ryoshu> doug16k: for some reason HAXM rejects it with #GP
09:41:02 <doug16k> I read MSRs without PAE in my bootloader
09:41:26 <doug16k> I think!
09:42:07 <ryoshu> I mean, reading EFER with rdmsr
09:42:11 <nyc`> Is PAE using more bits for physical addresses now that 64-bit is there as a fallback?
09:42:41 <doug16k> of course you can, how else do you set LME?
09:43:02 <doug16k> ah you need PAE on first actually there
09:43:15 <ryoshu> PG=1 PAE= and setting LME results in #GP
09:43:20 <doug16k> you're saying rdmsr is banned unless EFER.LMA == 1???
09:43:32 <doug16k> oh PAE
09:43:45 <doug16k> you can read MSRs before x86_64
09:43:51 <doug16k> they existed long before amd64
09:44:05 <ryoshu> HAXM rejects reading EFER with PAE=0 PG=1
09:44:11 <ryoshu> looking for rationale for this
09:44:21 <doug16k> the rationale is you always have PAE on there
09:44:34 <doug16k> read the instructions for transition to IA32e mode in SDM
09:45:09 <doug16k> steps: 1) turn on PAE, 2) read EFER, 3) set LME, 4) set PG=1, 5) now you are in long mode, yay!
09:45:42 <ryoshu> this is i386 mode kernel
09:45:42 <doug16k> if you attempt to set LME without PAE you get #GP
09:45:51 <ryoshu> right
09:45:57 <doug16k> so they kinda jump the gun and stop you from even reading EFER
09:46:46 <doug16k> 3.5 above) write EFER with wrmsr (of course)
09:47:27 <doug16k> it's one of the many cases where virtualization does it totally wrong and it always works in the common use case
09:47:45 <doug16k> my kernel would boot, because I have PAE on before I read EFER
09:48:03 <doug16k> probably because intel and/or amd said to do it that way
09:48:16 <doug16k> I don't do rebel crap, I do it as compatible as possible
09:48:26 <ryoshu> is this I see
09:48:40 <ryoshu> looking for startup notes, I saw them today somewhere in SDM
09:50:27 <doug16k> ryoshu, you probably want to see SDM volume 3 9.8.5 Initializing IA-32e Mode
09:50:37 <doug16k> IA-32e is intel's way of saying long mode
09:50:58 <klys> doug16k,
09:51:04 <ryoshu> thank!
09:51:05 <ryoshu> s
09:52:05 <nyc`> I'll add an x86 port to TsaiOS or whatever the kernel I'm working on will end up named after I get ARM, MIPS, OpenRISC, RISC-V, and SPARC (more properly, their 64-bit variants) going.
09:54:58 <nyc`> (Tsai Lun traditionally has/had the invention of paper attributed to him, though it actually seems to be a couple hundred years older.)
09:56:22 <nyc`> Maybe IBM POWER too.
09:59:56 <doug16k> what a blundering design to allow PAE to commit going on and off while paging is on with no GP. if you accidentally did that you've lost all hope of regaining any control over the CPU and you could be doomed to be wedged in a fluke infinite loop due to whatever nonsense the wrong-pae interpretation says
10:00:18 <doug16k> the wrong pae interpretation of the page tables that is
10:01:03 <ryoshu> I think it's typical for booting
10:01:05 <doug16k> nothing can make CR3 change in long mode. IST won't help
10:01:21 <ryoshu> to close eyes and try to make step forward
10:01:21 <doug16k> change when a catastrophe happens I mean. no handler can be done. no task gates
10:02:11 <doug16k> hmm I guess you're safe there, you can't possibly turn off PAE in long mode. but in protected mode, my gripe stands
10:02:29 <doug16k> hmm, there you can task gate. gripe withdrawn :D
10:03:13 <doug16k> ah but those need to use the page tables. still screwed if you don't place them where the page tables make enough sense in wrong pae
10:04:47 <doug16k> maybe a truly bulletproof protected mode kernel would have to be able to withstand PAE being turned off and have enough mapping in misinterpreted tables to be able to run a task gate into a handler
10:05:23 <ryoshu> too much information, reading SDM
10:05:36 <doug16k> pondering too much. will shut up
10:05:38 <doug16k> :D
10:34:43 <nyc`> x86 actually has a big gap between page sizes that's worth showing how well my algorithms handle. I think ARM and POWER actually operate in a similar fashion, apart from the base page size having a few different configuration options on ARM. I could be misunderstanding ARM, though.
10:36:42 <nyc`> POWER actually has a couple of base page sizes, too.
10:37:05 <ryoshu> IA32_EFER.LMA and LME looks redundant
10:39:16 <immibis> doug16k: your kernel doesn't need to be that bulletproof if it refrains from shooting itself, though
10:42:42 <nyc`> I think the first larger page size is 16MB on POWER. ARM I think has its first larger page sizes vary with granule but they're a smaller multiple than IBM POWER has.
10:58:12 <doug16k> ryoshu, LME is only consulted when PG is transitioned from 0 to 1
10:58:20 <nyc`> So IBM POWER is really the architecture to demonstrate handling realistic page size gaps. One could artificially make gaps in MIPS' page size spectrum, but it's better not to have to fake it.
10:58:29 <doug16k> when PG is transitioned from 0 to 1 when LME is 1, then LMA transitions to 1 and you enter long mode
10:58:59 <doug16k> when you turn off PG when LMA is 1, then LMA transitions from 1 to 0 too
10:59:08 <doug16k> LME = long mode enabled, LMA = long mode active
11:00:21 <doug16k> so, PG=0, LME=1, LMA=0 is going to transition to PG=1, LME=1, LMA=1 when you set PG to 1
11:01:07 <doug16k> there's a state machine that restricts what order you can turn PAE, LMA, and PG on and off
11:01:23 <doug16k> er I mean LME. you never change LMA
11:01:41 <doug16k> it's implied when PG transitions
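The LME/LMA/PG relationship described above can be written down as a small model. This is a rough sketch, not a reference: ModeModel and GeneralProtectionFault are hypothetical names, the three #GP conditions are my reading of the SDM rather than anything verified on hardware, and details such as CS.L, CR0.PE and reserved bits are ignored.

```python
class GeneralProtectionFault(Exception):
    """Stands in for #GP in this model."""


class ModeModel:
    """Tracks CR4.PAE, CR0.PG, EFER.LME and EFER.LMA only."""

    def __init__(self):
        self.pae = False
        self.pg = False
        self.lme = False
        self.lma = False  # never written directly; follows PG transitions

    def set_pae(self, value):
        # Assumption: PAE cannot be cleared while long mode is active.
        if self.lma and not value:
            raise GeneralProtectionFault("clearing PAE with LMA=1")
        # With LMA=0 the toggle is allowed even while PG=1, per the SDM quote in the chat.
        self.pae = value

    def set_lme(self, value):
        # Assumption: LME cannot change while paging is enabled.
        if self.pg and value != self.lme:
            raise GeneralProtectionFault("changing LME with PG=1")
        self.lme = value

    def set_pg(self, value):
        if value and not self.pg:
            # Assumption: enabling paging with LME=1 requires PAE=1.
            if self.lme and not self.pae:
                raise GeneralProtectionFault("PG=1 with LME=1 and PAE=0")
            self.lma = self.lme  # LME is consulted only on the 0 -> 1 transition
        elif not value and self.pg:
            self.lma = False     # turning paging off drops LMA as well
        self.pg = value


cpu = ModeModel()
cpu.set_pae(True)   # 1) turn on PAE
cpu.set_lme(True)   # 2-3) read EFER, set LME, write it back
cpu.set_pg(True)    # 4) enable paging
print(cpu.lma)      # prints True
```

The usage at the bottom follows the order doug16k listed earlier (PAE, then LME, then PG).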
11:02:50 <nyc`> That's pretty ugly. I guess following directions is the only way.
11:02:59 <mrvn> But unlike the specs suggest you can transition directly from 16bit to long mode.
11:03:28 <doug16k> you can yes but I wouldn't deviate that far from documented procedures no matter how well it works
11:04:12 <mrvn> It's perfectly valid. It's just not what you normally do. Normally you have a bootloader that switches to protected mode and then the kernel switches to long mode.
11:04:36 <doug16k> your definition of validity doesn't include the procedure documented in the intel SDM?
11:04:53 <mrvn> doug16k: valid as in an allowed state change
11:04:54 <doug16k> if it is nothing like what they describe I don't see how that meets the definition of valid
11:05:17 <nyc`> Does grub transition into long mode or just 32-bit?
11:05:23 <doug16k> I agree though, the manual seems excessively cautious, but there must be a reason
11:05:24 <mrvn> nyc`: 32bit only
11:06:08 <mrvn> nyc`: UEFI can do long mode
11:06:16 <doug16k> mrvn, that SDM section I pointed out, it tells you that you need to be in protected mode before step 1 IIRC. why would it say that?
11:06:27 <mrvn> doug16k: no idea.
11:06:30 <doug16k> section describing entering IA-32e mode
11:06:47 <doug16k> that's why I'm afraid to go from real mode to long mode
11:07:49 <mrvn> I'm never in real mode so that worry doesn't apply
11:08:09 <nyc`> 32-bit is cleaner than 16, I guess. I'd prefer to punt all the legacy modes to the bootloader anyway.
11:09:01 <mrvn> nyc`: 16bit mode is horrible. Try loading a 2MB kernel image in 16bit mode.
11:09:45 <nyc`> mrvn: Overlays are a lot of drudge work.
11:19:34 <doug16k> that's what I do. my bootloader will boot BIOS or UEFI from hd/cd/pxe and sets up paging and everything per the program headers and relocations and enters the kernel with everything mapped, doing physical allocation using the memory map or UEFI, and scatter load of the kernel sections into physical pages
11:19:58 <doug16k> not one instruction of my kernel needs to be identity mapped
11:20:20 <doug16k> bootloader handles trampoline of APs into kernel too
11:20:35 <doug16k> it had to do BSP, why not leverage codepath for APs
11:30:20 <nyc`> I'm surprised, but glad the bootloader handles AP bringup.
11:32:53 <nyc`> Sadly, not every architecture has as good of bootloader support.
11:36:27 <nyc`> I guess the fault with some is the incompleteness of their open source firmware clones and lack of kernel loading support in their simulators.
11:40:07 <nyc`> It's probably easier to do kernel loading in their simulators if I'm going to write it myself.
11:51:49 <nyc`> AIX made you pin everything in memory before accessing it, but I don't think it evicted the kernel code itself.
11:54:34 <nyc`> DYNIX/ptx had a thing where it tried to service system calls out of per-process kernel address space and you had to switch to the full kernel address space before accessing global data.
11:58:31 <nyc`> I don't know the rationale for AIX swapping kernel memory. DYNIX/ptx was trying to function on 64GB x86-32.
11:59:01 <mrvn> nyc`: That's his bootloader. Not the x86 bootloader. No such thing.
11:59:54 <mrvn> nyc`: they probably didn't want to swap address spaces and kill all the caches for something that doesn't need high memory.
12:00:03 <nyc`> mrvn: Oh, I've never looked into grub on SMP.
12:00:06 <mrvn> nyc`: something between vdso and full kernel
12:00:21 <mrvn> nyc`: grub doesn't do smp nor any other bootloader for x86.
12:00:40 <mrvn> Does UEFI have callbacks to bring up more CPUs?
12:01:14 <mrvn> Note to self: Never trust advertising.
12:01:22 <nyc`> mrvn: What's this? DYNIX/ptx managed its address space very differently from Linux.
12:01:52 <mrvn> My Intel M.2 pci SSD supposedly manages 1800 MB/s. dd says it actually manages 2.0GB/s.
12:02:03 <mrvn> nyc`: what's what?
12:02:15 <nyc`> mrvn- I'm not in a position to look up EFI specs.
12:03:26 <nyc`> mrvn: There wasn't a direct equivalent of Linux-like high memory on DYNIX/ptx.
12:03:40 <mrvn> nyc`: high memory = 64bit memory
12:04:02 <mrvn> or 48 bit or whatever you had
12:04:32 <mrvn> nyc`: no point switching address modes or paging in other physical banks when you don't need to.
12:05:30 <nyc`> 36 by dint of Intel limits. They probably could've scaled up to 1TB on 32-bit.
12:08:13 <nyc`> If 32-bit mode on x86-64 has more PAE bits someone in DYNIX/ptx legacy support might get bored and try bringing it up with more than 64GB.
12:08:57 <mrvn> nyc`: 32bit mode on x86_64 has the full address space. It uses the 64bit page tables.
12:09:24 <mrvn> You need to run a 64bit kernel for that. Not just add a few PAE bits.
12:12:37 <nyc`> mrvn: Maybe I'm not describing it clearly. I should probably go look at the architecture manuals myself. It's a bit of an academic question because the TLB overhead of windowing as 32-bit would need is a big performance hit.
12:13:24 <bcos_> For 32-bit kernel using PAE running on a "64-bit capable" CPU; physical addresses are the same "up to 52 bit" as they are in long mode
12:14:42 <bcos_> (where "up to 52 bit" depends on the CPU's physical address max. reported by CPUID, which can be as small as 32 bits, even for long mode)
12:14:52 <nyc`> bcos_: Sounds like DYNIX/ptx could be tried with more than 64GB RAM on a 32-bit kernel, then.
12:16:24 <mrvn> bcos_: didn't PAE just add 4 bit to the address space, i.e. 64GB total?
12:16:50 <bcos_> mrvn: Originally, yes; but when long mode got added they extended it
12:16:56 <nyc`> I only ever did Fibre Channel boot on DYNIX/ptx, so I didn't see much of how it did VM.
12:18:07 <nyc`> The 32-bit PAE PTE's were 64 bits wide and had room for more physical address bits than the 36 specified.
12:20:10 <nyc`> The lower 12 were all that the PTE's reserved for purposes other than the page frame number.
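To make the bit arithmetic in this exchange concrete, here is a minimal sketch of pulling the physical address out of a 64-bit PAE PTE. The function names are hypothetical, and maxphyaddr is assumed to be the physical address width the CPU reports via CPUID (36 for the original PAE limit, up to 52 as bcos_ says). Bits above that width, including the no-execute bit at the top on CPUs that have it, are simply masked off here.

```python
def pte_physical_address(pte, maxphyaddr=52):
    """Return the 4 KiB-aligned physical address stored in a 64-bit PAE PTE.

    Bits 0-11 are flags; bits 12 .. maxphyaddr-1 hold the page frame address.
    """
    low_flags_mask = (1 << 12) - 1
    addr_mask = ((1 << maxphyaddr) - 1) & ~low_flags_mask
    return pte & addr_mask


def max_physical_bytes(maxphyaddr):
    """Bytes of physical memory addressable with the given address width."""
    return 1 << maxphyaddr


# Example PTE: top bit set, frame address 0x1_2345_6000, low flag bits 0x067.
pte = (1 << 63) | 0x1_2345_6000 | 0x067
print(hex(pte_physical_address(pte, maxphyaddr=36)))  # prints 0x123456000
print(max_physical_bytes(36) // 2**30, "GiB")         # prints 64 GiB
print(max_physical_bytes(40) // 2**40, "TiB")         # prints 1 TiB
```

A 36-bit width gives the 64GB mentioned above, and 40 bits gives the 1TB figure.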
12:23:17 <nyc`> Linux used to ignore the upper 32 bits of PAE PTE's for swap PTE's, but I added the relatively trivial code to use the rest of the bits at some point.
12:25:10 <nyc`> It enabled using more and bigger swap files and partitions for e.g. non-overcommit.
12:25:51 <mrvn> nyc`: I really see no point in that. Any hardware you get nowadays either is 32bit or has long mode. Getting a PAE only cpu would be totally stupid.
12:26:08 <mrvn> and 64bit kernel is faster.
12:26:50 <nyc`> (Where you have no intention of ever actually touching swap, but have to have lots of it there to guarantee that allocations won't fail and so have to account reservations against it.)
12:27:39 <mrvn> And a 64bit kernel will do that and more.
12:30:01 <nyc`> mrvn: From my POV making kernels work on different systems is interesting. Physical greater than virtual raises issues that are interesting to solve.
12:31:36 <mrvn> I tried to avoid problems that don't have to exist.
12:32:19 <nyc`> mrvn: Why write an OS? They already exist.
12:33:04 <mrvn> too many problems in those
12:34:53 <nyc`> Computers aren't really that necessary for day-to-day affairs. Problems with computers don't have to exist for people not doing science or engineering and a lot of times not even then.
12:37:24 <nyc`> I'm keeping myself busy for a few years if nothing else.
12:41:56 <nyc`> I remember back in the day computers and lasers were all the rage in science fiction, like Weird Science. The guys summoning Kelly LeBrock instead of me was a problem that didn't have to exist.
12:42:38 <mrvn> who would want you over Kelly LeBrock?
12:43:10 <nyc`> mrvn: No one, that was the problem that didn't have to exist.
12:46:17 <jmp9> thanks for advice
12:46:29 <jmp9> i've read initialization in sata specification
12:46:32 <jmp9> but one problem
12:46:50 <mrvn> only one? Lucky you
12:47:12 <jmp9> where i can find how to send commands to sata and wait for them
12:47:13 <jmp9> in spec
12:47:33 <mrvn> that would depend on the controller
12:47:48 <jmp9> AHCI
12:49:38 <jmp9> because i noticed problem on my laptop
12:49:44 <jmp9> that it doesn't wait for command completion
12:53:16 <jmp9> it looks like it doesn't run command
12:53:23 <jmp9> and doesn't set busy & drq flags
12:57:57 <nyc`> (Evil robotic clones of Superman and US-Soviet nuclear annihilation were other problems that according to the movies didn't have to exist, but needed to be solved because of computers.)
01:12:12 <nyc`> I actually think 32-bit is probably going to last as long as computers will, but it's more debatable whether PAE-like affairs will persist in general purpose operating systems or will need to.
01:16:13 <nyc`> Persisting in special purpose systems is probably a safe bet. It wouldn't take much in the way of an operating system to use a bunch of RAM for caches or buffers in an IO device.
01:19:43 <nyc`> My personal inclination is to regard the variety of situations and target systems the kernel works and works well on as a figure of merit or quality metric, so I'm in another world as far as design philosophy goes.
01:19:44 <mrvn> nyc`: why should they? Special purpose systems are cheaper when they use common purpose cpus.
01:20:22 <nyc`> mrvn: Power draw is the most likely concern.
01:20:29 <mrvn> PAE comes from a time when the 32bit cpu was cheaper than the 64bit or didn't even exist.
01:22:25 <mrvn> nyc`: power consumption is more affected by how fast you get back to sleep / how infrequent you wake up. You want a real good sleep mode that doesn't draw any power.
01:23:04 <mrvn> Which is something modern code is sadly bad at. They constantly busy loop somewhere.
01:24:28 <nyc`> x86 PAE was probably made for a small number of vendors like Sequent and Unisys. PC's and their operating systems were largely incapable of physically installing it and had operating systems that couldn't use it well when they could.
01:24:51 <mrvn> Case in point: yesterday I installed Debian stable with Desktop / xfce (big mistake selecting that in the installer :). So after install and reboot you get a graphical login from sdd-greeter. That eats up 10% CPU time constantly waiting for you to log in. WTF?
01:26:56 <nyc`> So it's not like x86 PAE was ever in broad use.
01:27:09 <Galaxor> I'm trying to use int 15 e820 to get a memory map. It's returning an error (CF set, AX=8620). How do I figure out why?
01:28:57 <nyc`> If you want to argue from end-user userbase, there's not only no reason to touch anything but x86, there's no reason to develop operating systems or even use anything but Windows.
01:30:57 <nyc`> The same petard hoists every argument from popularity and practicality and such.
01:32:28 <Galaxor> If I'm reading RBIL correctly, error code AH=86 means "function not supported". But this is qemu-system-x86_64, I have a hard time believing that e820 is not supported.
01:33:01 <mrvn> nyc`: another point against it. It's too rare. ARM has a PAE mode too which needs to die too.
01:33:32 <mrvn> nyc`: any savings you get from PAE you lose in the extra complexity of phys>virt.
01:34:24 <mrvn> Galaxor: getting the memory map and actually getting a correct memory map is hard. why not use the one grub already fixed for you?
01:35:17 <mrvn> *lunch* wave
01:36:18 <nyc`> mrvn: I covered rarity. Operating systems are complex, and as far as problems we don't need, witness non-x86, non-Microsoft, and even computers as a whole.
01:36:28 <Galaxor> mrvn: The whole point of this is to learn how to do ridiculous nonsense like this. I guess I could read grub's code to see what they do. But that would teach me how to do it the right way. I was hoping to learn how to do some troubleshooting.
01:37:54 <nyc`> Galaxor: I think there are EFI/ACPI memory tables.
01:38:56 <nyc`> Galaxor: MP tables might have some memory affairs, too.
01:40:17 <Galaxor> nyc`: Hm. Like, maybe qemu doesn't support e820 and I should find a different method to get memory maps, involving acpi? (I'm not using efi at this time).
01:41:10 <nyc`> Galaxor: ACPI has memory tables for sure. I think there might be MP tables, too.
01:41:33 <Galaxor> nyc`: I don't even know what MP is.
01:42:50 <nyc`> There are other system description tables from an older specification called that.
01:46:15 <Galaxor> Ohh, maybe I should watch grub run. I'd like to see if qemu really doesn't support e820, or if I'm just doing it wrong.
01:52:41 <mrvn> Galaxor: grub certainly works in qemu. So there must be a way to get the memory map.
01:56:56 <Galaxor> mrvn: I guarantee I'm just doing something wrong. The map is placed at ES:DI. Maybe I picked a bad value of ES:DI or something.
02:01:03 <mrvn> "This old beast has a new heart beating inside her. It has 512 GW of ram, an 80something harddrive, ..."
02:03:51 <rakesh4545> can you explain descriptor tables to me?
02:08:24 <Galaxor> Hmm, I found one thing I was doing wrong: cx needs to contain 20. I converted that to hex wrong in my head, and I was setting it to 0x12 (18) instead of 0x14 (20). It's still erroring, but now ax=0x8600 instead of ax=0x8620, so it's a different error.
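The 20 here is the size of one basic memory map entry: a 64-bit base, a 64-bit length and a 32-bit type. A minimal sketch of decoding such an entry once it has been copied out of the ES:DI buffer, assuming that basic layout (no ACPI 3.0 extended attributes field); parse_e820_entry and the sample bytes are invented for illustration, and type 1 is usable RAM.

```python
import struct

# Little-endian, no padding: u64 base, u64 length, u32 type.
E820_ENTRY = struct.Struct("<QQI")


def parse_e820_entry(raw):
    """Decode the first 20 bytes of a buffer as one basic E820 entry."""
    base, length, kind = E820_ENTRY.unpack(raw[:E820_ENTRY.size])
    return {"base": base, "length": length, "type": kind}


print(E820_ENTRY.size, hex(E820_ENTRY.size))  # prints 20 0x14

# Made-up sample entry: 1 MiB of usable RAM (type 1) starting at 1 MiB.
sample = struct.pack("<QQI", 0x100000, 0x100000, 1)
print(parse_e820_entry(sample))
```

The first print is the conversion that went wrong above: 20 decimal is 0x14, while 0x12 is 18.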
02:10:52 <nyc`> I think I'm going to punt using ${CPU}-elf target "triplets" for binutils and gcc to some eventual future because it's nicer to have more immediate forward progress. I'll call it technical debt and get around to it whenever. -ffreestanding -nostdlib -nostdinc on architectures where toolchains, simulators, and bootloaders all work out of the box for kernel dev will have to do for the moment.
02:13:43 <mrvn> nyc`: you fail to build a cross toolchain?
02:14:12 <nyc`> ppc64 is probably the worst loss on that front for my purposes.
02:14:48 <mrvn> I fail to see how one triplet is any different from other common triplets.
02:15:13 <nyc`> mrvn: They built, but default binary emulation options are screwed up.
02:16:00 <mrvn> binary emulation options?
02:16:19 <jmp9> i found the problem
02:16:23 <jmp9> when i send command to AHCI
02:16:30 <jmp9> on qemu PxTFD sets BSY and DRQ
02:16:30 <jmp9> but
02:16:33 <jmp9> on laptop
02:16:35 <jmp9> it doesn't
02:18:50 <nyc`> Yes, like elf64btsmip.
02:24:00 <nyc`> Basically the multiple emulation and esp. 64-bit emulation output drivers are tied to the target OS part of the triplets.
02:29:55 <nyc`> I'll figure out what hit me later. I should be able to do things with the stock toolchains, simulators, and bootloaders for enough arches to get started even if a fresh toolchain would be cleaner.
02:30:31 <jmp9> okay
02:30:34 <jmp9> qemu is redarted
02:30:38 <jmp9> retarded
02:30:41 <jmp9> i set GHC.AE to zero
02:30:44 <jmp9> and it also works
02:32:34 <aalm> you shouldn't really dev hw-support against qemu
02:32:51 <jmp9> i don't
02:32:56 <jmp9> i'm testing it on laptop
02:33:03 <aalm> good
02:33:13 <jmp9> but i don't know why it doesn't execute commands
02:33:26 <rakesh4545> is there a community for compiler devs?
02:33:30 <jmp9> FR and CR in PxCMD is 1
02:33:37 <jmp9> so it should run command
02:34:16 <nyc`> comp.compilers I think used to be before USENET died.
02:36:31 <jmp9> i'm trying to run identify command
02:36:35 <jmp9> and i get zeroed buffer
02:37:44 <X-Scale> nyc`: still has some activity ->!forum/comp.compilers
02:40:54 <rakesh4545> no problemo.
02:41:33 <rakesh4545> I will find a teacher anyway.
02:46:04 <zhiayang> there's #compilers and #proglangdesign on here
02:48:10 <rakesh4545> tanks :) I will now bug em up!!
03:46:22 <c32> hello
04:41:40 <nyc`> c32: Hello.
04:44:14 <c32> nyc`: hi
04:44:57 <c32> i'm really inspired to make a bootloader like tccboot but better
04:45:27 <c32> but i don't know much about making bootloader, i hope to learn as i go
04:55:06 <nyc`> My next thought for the name is YèOS.
04:57:38 <aalm> Y?
04:58:13 <aalm> i might be missing some char there due font or something.
05:02:18 <nyc`> Ye with ` as an accent.
05:03:20 <c32> for my bootloader?
05:05:43 <aalm> nope, i think
05:05:47 <ashkitten> yeetos
05:06:01 <ashkitten> also hi
05:06:08 <aalm> also lo
05:06:20 <c32> hi
05:06:27 <Daouki> hello :)
05:07:08 <Daouki> i've got a pretty basic problem and i thought that maybe you'll be able to help me out: my interrupt handlers won't get called and i have no idea why
05:07:23 <Daouki> heres the idt code:
05:07:39 <Daouki> would say anything more than that but im completely lost
05:09:22 <aalm> oh, but now you get to learn the power of polling xD
05:09:44 <aalm> "don't interrupt me, i'm busy"
05:10:51 <eryjus> Daouki is paging enabled?
05:11:23 <eryjus> Pack your structure
05:11:32 <nyc`> ashkitten: YeOS (ignoring accents)
05:12:08 <nyc`> ashkitten: YèOS with the accent.
05:12:17 <ashkitten> i know, i'm joking
05:13:19 * ashkitten wonders where "yeet" came from
05:13:48 <nyc`> I'm guessing it'll sound like "Yes."
05:14:35 <Daouki> im using identity paging for the first 1 gig of memory
05:14:39 <Daouki> pack which structure?
05:15:08 <Daouki> idtr is packed, size of rest is just simply checked with static asserts
05:24:12 <nyc`> Time where I can sit down and concentrate is scarcer nowadays than ever before in my life. I seem to need to budget my programming time in ways I never had to before. I've been out all day today doing medical garbage and won't be back to where I can sit down and program for probably hours.
05:29:35 <nyc`> I'm not going to be able to do as many ports as I'd like.
05:37:18 <nyc`> It is not in my power to NIH the world, and almost certainly never was.
07:13:17 <nyc> Daouki: Pack the IDTEntry struct.
08:09:56 <renopt> grave warnings from the compiler on this day
08:10:18 <renopt> it just couldn't go on, it says
08:10:20 <renopt> poor thing
08:43:06 <doug16k> Daouki, still there? I don't see anything that is handling some interrupts pushing an error code and some not
08:44:09 <mrvn> nyc: why should one pack the IDTEntry? It's perfectly aligned.
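mrvn's point can be checked by computing the sizes with and without padding. The sketch below assumes 32-bit protected-mode layouts, since Daouki's actual code is not shown here and may differ. It uses Python's struct module, where "@" applies native C alignment and "=" applies none; the sizes in the comments are what common ABIs such as x86 and x86-64 give.

```python
import struct

# Assumed 32-bit IDT gate: offset_low u16, selector u16, zero u8,
# type_attr u8, offset_high u16.
ENTRY_FIELDS = "HHBBH"
# Assumed 32-bit IDTR image: limit u16, base u32.
IDTR_FIELDS = "HI"

entry_aligned = struct.calcsize("@" + ENTRY_FIELDS)  # native alignment
entry_packed = struct.calcsize("=" + ENTRY_FIELDS)   # no padding
idtr_aligned = struct.calcsize("@" + IDTR_FIELDS)
idtr_packed = struct.calcsize("=" + IDTR_FIELDS)

print("entry:", entry_aligned, entry_packed)  # 8 and 8 on common ABIs
print("idtr: ", idtr_aligned, idtr_packed)    # 8 and 6 on common ABIs
```

Every field of the entry already sits on its natural boundary, so packing it changes nothing. The IDTR image is the one that grows two bytes of padding between limit and base unless it is packed.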
08:44:22 <geist> renopt: it just gave up
08:44:30 <geist> sat down and said "no more"
08:44:47 <geist> quietly watched the sunset on its process
08:45:57 <mrvn> geist: I think I will sit next to you. I've spent the last 6 hours implementing a maze algorithm and now I see that it can't work. It leaves parts of the maze unreachable.
08:46:51 <geist> that reminds me, one of my bucket list tasks is to finally decode and understand the algorithm that the classic basic maze drawing program used
08:47:28 <geist> the particular one i'm thinking of was a CP/M basic maze thing and the code is completely incomprehensible, just layers of ifs and gotos and opaque variables
08:47:34 <geist> but it makes the best mazes
08:47:44 <Ameisen> hrmm... what's the difference between zlib and zlib-gnu debug data compression
08:47:48 <Ameisen> can't find any concrete details
08:49:50 <doug16k> this maze generator video is aimed more at beginner programmers but it's not bad ->
08:51:52 <mrvn> geist: The simplest simply outputs random() < MAX_INT / 2 ? '/' : '\'
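That one-liner can be sketched as a small function. slash_maze and its parameters are names invented here, and a seeded generator stands in for the random() call so a given seed reproduces the same picture.

```python
import random


def slash_maze(width, height, seed=None):
    """Return rows of randomly chosen '/' and '\\' characters."""
    rng = random.Random(seed)
    return ["".join(rng.choice("/\\") for _ in range(width))
            for _ in range(height)]


for row in slash_maze(40, 8, seed=2019):
    print(row)
```

The output only looks like a maze; nothing guarantees a path through it.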
08:55:03 <geist> yah i know which one you're talking about
08:55:16 <geist> but the maze program i knew of would guarantee exactly one path, etc
08:55:21 <geist> it was a pretty good algorithm whatever it was
08:55:36 <mrvn> Try
08:56:20 <geist> it may have been
08:56:28 <geist> that's the original one from creative computing
08:56:52 <mrvn> The hard part is that the maze must be endless and generated as you explore. It should also turn out the same no matter what order you explore in.
08:57:14 <geist> yep
08:57:26 <geist> ensuring a single solution is hard
08:57:27 <mrvn> and the chunks it generates at a time shouldn't be obvious. So nothing that leaves patterns
08:57:54 <mrvn> urgs, I see what you mean by ifs and gotos.
08:58:24 <mrvn> Maybe try turning that into a state machine.
08:58:39 <geist> it's clear that it builds the entire maze as a 2d array, then from about 1010 on it's just outputting it
08:58:56 <geist> i remember on the old kaypro II it would take a good 5 minutes or so to generate like a 100x100 maze
08:59:03 <geist> used to print them out on an old epson MX100 printer we had
08:59:44 <mrvn> that basic seems to only have conditional goto. So if then else is if + 2 gotos
09:00:43 <geist> well, no. there are some straight gotos in there
09:00:51 <mrvn> Have you tried turning all line numbers into labels and some defines so it compiles as C code?
09:00:57 <geist> shouldn't be too hard. i should first verify that this is the right maze program
09:01:10 <mrvn> ON X GOTO 790,820,860
09:01:13 <mrvn> jump table
09:01:14 <geist> it's from creative computing though, that was a famous collection of basic programs printed in some books
09:01:38 <geist> in the mid 70s. i think it was largely a collection of basic stuff that had been floating around the mainframe and minicomputer world
09:02:39 <geist> was the book. i have an early version of it too
09:03:41 <geist> ah yes, the one i have is from 1978 and is the 'microcomputer edition'
09:03:55 <mrvn> I'm going back to letting corrals grow. Do you know that algorithm?
09:04:03 <geist> presumably ported to microsoft basic or whatnot, which was likely to be less powerful than versions found on some mainframes
09:05:26 <mrvn> You pick a random spot on the border of your maze and if that cell can be added to your maze without creating a loop then you add it. Repeat.
09:06:37 <geist> hmm, may be. i'm always fascinated by little clever algorithms like that
09:06:49 <geist> i'd think you start off with a fully closed set of cells
09:06:57 <geist> maybe you knock a wall out and see if that makes a loop?
09:06:59 <geist> and keep trying
09:07:11 <mrvn> geist: That's the way it usually goes.
09:07:14 <geist> yah
09:07:26 <geist> i could see where it's completely N^2
09:07:33 <mrvn> The difference is just in what order you knock them down.
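The growth procedure discussed above can be sketched in Python roughly as follows. This is one interpretation, assuming a grid of cells separated by walls and starting from the middle: pick a random cell on the border of the maze, knock out exactly one wall to a cell already in the maze (so no loop can form), and repeat. It is essentially randomized Prim's algorithm; the name `grow_maze` and the representation of passages as pairs of cells are assumptions made for the example.

```python
import random

def grow_maze(width, height, seed=None):
    """Return the set of opened walls, each a frozenset of two adjacent cells."""
    rng = random.Random(seed)

    def neighbours(cell):
        x, y = cell
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < width and 0 <= ny < height:
                yield (nx, ny)

    start = (width // 2, height // 2)      # seed cell in the middle
    in_maze = {start}
    passages = set()
    frontier = list(neighbours(start))     # border cells; may contain duplicates

    while frontier:
        # Remove a uniformly random frontier entry in O(1) by swapping it to the end.
        i = rng.randrange(len(frontier))
        frontier[i], frontier[-1] = frontier[-1], frontier[i]
        cell = frontier.pop()
        if cell in in_maze:
            continue                       # stale duplicate, already added
        # Join to exactly one maze cell, which keeps the result a tree.
        joined = rng.choice([n for n in neighbours(cell) if n in in_maze])
        passages.add(frozenset((cell, joined)))
        in_maze.add(cell)
        frontier.extend(n for n in neighbours(cell) if n not in in_maze)

    return passages

if __name__ == "__main__":
    print(len(grow_maze(20, 20, seed=1)))  # number of opened walls
```

Since every added cell opens exactly one wall, the result is a spanning tree of the grid: one path between any two cells, which matches the single-solution property geist mentions.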
09:07:57 <geist> wonder if it matters where you start? can you start in the middle?
09:08:09 <geist> and would that tend to vary the shape of the maze?
09:08:20 <mrvn> For the corral growth you get a perfect tree with a cross in the middle. You better start in the middle.
09:08:40 <mrvn> it more or less grows in a circle outwards.
09:08:52 <geist> i suppose you could then add some heuristics to try to fiddle with being twisty or more straight
09:09:10 <mrvn> If you start on a side-ish then you get shorter paths on that side and longer on the other.
09:09:12 <geist> by biasing towards which wall(s) you knock out based on what walls are already open
09:09:32 <geist> my guess is you pick a different seed cell randomly
09:09:42 <geist> because otherwise wouldn't that mean the seed cell always has to be on the critical path?
09:09:54 <geist> if you always picked the middle wouldn't that mean the critical path always has to go through it?
09:10:43 <mrvn> I'm doing multiple passes at the moment. First I pick some spots for caves. Then I create paths between the caves using random walks. Gives nice twisty ways and makes it all connected. And last I want something to grow confusing stuff around the paths that has no loops.
09:11:00 * geist nods
09:11:34 <mrvn> geist: the seed cell is always the center and the critical part. Remove it and you get 2-3 more or less equal parts
09:11:41 <mrvn> 2-4
09:12:08 <mrvn> (with the corral). But it's a tree. any cell that isn't a dead end is a critical path.
09:27:37 <GwenNelson> i'm now implementing a vt100 console TTY for my kernel rewrite
09:27:52 <GwenNelson> anyone know of any existing libs easy to use in kernel mode?
09:27:57 <GwenNelson> was thinking of porting libvterm
09:30:25 <geist> well, depends a lot on what you're porting
09:30:41 <geist> as in yes, there are some libs that can work in kernels, but it depends on a huge number of factors
09:31:04 <geist> like what the programming environment is in your kernel, the shape of the kernel's libc, if the lib needs any other libs, etc
09:31:27 <GwenNelson> i'm aiming for a boring old traditional *nix clone, assume i can bring in any standard libc functions needed
09:31:46 <GwenNelson> libvterm seems like it'll be easy to port, just looking if there's alternatives around
09:31:56 <geist> yes, but kernel programming environment is completely undefined
09:32:10 <geist> as in that's entirely up to you, 'unix' doesn't really mean anything as to what the kernel looks like on the inside
09:32:23 <GwenNelson> it does imply certain things
09:32:28 <geist> it's not in any way assumed that kernel programming and user space programming are even remotely similar
09:32:35 <geist> oh? like what?
09:33:34 <GwenNelson> implies a certain model of computation, implies stuff like stdin/stdout, forking processes, certain syscalls, etc etc
09:33:49 <geist> yes, but again none of that necessarily applies to kernel programming at all
09:34:18 <geist> linux for example has no concept of stdin/stdout in the kernel, you can't syscall from the kernel, and there's no fork from within the kernel
09:34:34 <GwenNelson> of course
09:34:45 <geist> kernel programming frequently uses a completely different threading model and is usually a small subset of what you get in user space
09:35:00 <GwenNelson> but i'm saying i want to implement a kernel-mode terminal that interprets stuff like a standard *nix terminal
09:35:04 <geist> so based on that it's entirely up to the kernel design as to whether or not a given user space lib is portable to the kernel
09:35:30 <geist> that's great, but like i said, it's entirely up to a bunch of specific details that you have to decide about your kernel programming model to see if any given lib is easily portable to the kernel or not
09:35:57 <geist> so you really need to nail those down first
09:36:23 <geist> and/or pick an extremely simple lib. stuff like libz and whatnot usually are pretty portable because they usually don't rely on much from libc
09:37:05 <geist> something that generically took a stream of bytes, interpreted the vt100 bits, maintained that state, and then made a series of abstract function calls based on various things (put x,y, clear screen, etc) would be pretty portable to any environment
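The shape geist describes (bytes in, parser state kept internally, abstract calls out) can be sketched as below. This is an illustration in Python, not a kernel-ready library and not libvterm's API: the class name `Vt100Parser` and the three callbacks are hypothetical, and only two control sequences are handled, cursor position (`ESC [ row ; col H`) and erase in display (`ESC [ n J`). Everything else is passed through or dropped.

```python
class Vt100Parser:
    """Tiny escape-sequence state machine that reports events through callbacks."""

    def __init__(self, put_char, move_cursor, clear_screen):
        self.put_char = put_char          # put_char(ch)
        self.move_cursor = move_cursor    # move_cursor(row, col), both 1-based
        self.clear_screen = clear_screen  # clear_screen(mode), e.g. 2 = whole screen
        self.state = "ground"
        self.params = ""

    def feed(self, data):
        """Consume a chunk of bytes; sequences may be split across calls."""
        for b in data:
            ch = chr(b)
            if self.state == "ground":
                if b == 0x1B:
                    self.state = "escape"
                else:
                    self.put_char(ch)
            elif self.state == "escape":
                if ch == "[":
                    self.state = "csi"
                    self.params = ""
                else:
                    self.state = "ground"  # unsupported escape: dropped in this sketch
            else:  # inside a CSI sequence
                if ch.isdigit() or ch == ";":
                    self.params += ch
                else:
                    self._dispatch(ch)
                    self.state = "ground"

    def _dispatch(self, final):
        nums = [int(p) if p else 0 for p in self.params.split(";")]
        if final == "H":                   # cursor position, omitted values default to 1
            row = nums[0] or 1
            col = (nums[1] if len(nums) > 1 else 0) or 1
            self.move_cursor(row, col)
        elif final == "J":                 # erase in display
            self.clear_screen(nums[0])
        # any other final byte is ignored here

if __name__ == "__main__":
    parser = Vt100Parser(
        put_char=lambda ch: print("put", repr(ch)),
        move_cursor=lambda r, c: print("move", r, c),
        clear_screen=lambda mode: print("clear", mode),
    )
    parser.feed(b"\x1b[2J\x1b[3;5Hhi")
```

Because the parser touches nothing but its own state and the callbacks it was given, the same structure carries over to an environment with no libc at all, which is the portability point made above.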
09:37:35 <GwenNelson> that's exactly what i'm after
09:37:47 <geist> yah, that would be handy
09:38:12 <GwenNelson> so, know of any libs besides libvterm that do that?
09:39:11 <geist> mrvn: neat! i loaded up that maze.bas into my altair clone and ran it against microsoft basic
09:39:29 <geist> that's absolutely the one i remember. it takes about 2-3 minutes to do a 20x20 maze like that
09:39:51 <geist> looks better without so much line spacing there
09:44:18 <geist> i personally do not, but that would be a handy thing to use
09:44:39 <geist> i have written pieces of vt100 parsing a number of times, and it'd be nice to not have to do that again
09:50:17 <mrvn> geist: it might be Eller's maze algorithm:
09:51:21 <geist> ah yeah that may be true. that sounds a lot like something that would be needed to generate with basic
09:51:33 <geist> also it doesn't print it one row at a time, but it always immediately prints the top row
09:51:40 <geist> so it's already decided on an entry point
09:52:34 <mrvn> geist: if everything is connected every entry point will work. So it probably just picks one randomly and ignores it.
09:53:09 <geist> good point
09:54:17 <mrvn> hehe, I just had a thought. I'm doing the maze for a game called factorio. The goal is to build and launch a rocket. I'm thinking of placing the rocket silo you need to build the rocket right next to the starting point across a wall. You can see it right there but then you have to find the way to the rocket silo.
09:54:38 <geist> oh that's mean
09:54:45 <mrvn> :)
09:55:19 <mrvn> You still need a ton of stuff to build rocket parts to feed into the silo so it isn't that bad. But yeah. mean. :)=
09:58:12 <mrvn> Poll: Is a stack a LIFO or a FILO?
09:58:26 <CompanionCube> LIFO?
09:58:46 <mrvn> last in first out or first in last out?
09:58:56 <CompanionCube> i mean
09:58:58 <CompanionCube> i vote for LIFO
10:01:17 <doug16k> LIFO
10:01:36 <doug16k> if only to rhyme with FIFO
10:02:34 <doug16k> I'd say stack for LIFO though
10:03:00 <eryjus> I vote LIFO
10:04:03 <eryjus> IMHO, FILO does not describe that happens between the last item added and the last item removed.
10:04:15 <eryjus> s/that/what/
10:13:52 <bluezinc> mrvn: LIFO.
10:36:11 <doug16k> I'd use stack for LIFO and queue for FIFO and wouldn't generally use any "IFO" unless it were the term for a piece of hardware like a UART
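The distinction doug16k draws can be shown in a few lines of Python: a stack hands items back last-in-first-out, a queue first-in-first-out. The helper names are made up for the example.

```python
from collections import deque

def drain_stack(items):
    """Push every item, then pop until empty: last in, first out."""
    stack = list(items)
    out = []
    while stack:
        out.append(stack.pop())
    return out

def drain_queue(items):
    """Enqueue every item, then dequeue until empty: first in, first out."""
    queue = deque(items)
    out = []
    while queue:
        out.append(queue.popleft())
    return out

if __name__ == "__main__":
    print(drain_stack([1, 2, 3]))  # [3, 2, 1]
    print(drain_queue([1, 2, 3]))  # [1, 2, 3]
```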
10:51:18 <nyc> I occasionally hear it used to describe VM replacement policies etc.
11:56:04 <nyc> Okay, I've reverted to the stock Ubuntu cross compilers. Next up, ARM64 / AArch64 (sparc64 is on hold until the situation with -kernel and/or the bootloader is resolved). But it'll probably be tomorrow because I'm wiped out tonight.