Search logs: #osdev2 - 6 June 2022

#osdev2 = #osdev @ Libera from 23may2021 to present

#osdev @ OPN/FreeNode from 3apr2001 to 23may2021

http://bespin.org/~qz/search/?view=1&c=osdev2&y=22&m=6&d=6

Monday, 6 June 2022

01:32:00 <gog> interesting
01:36:00 <geist> yah interesting read. now i really know what tile based gpus actually means
01:37:00 <gog> it seems like buffer handling is a weird thing to delegate to firmware
01:46:00 <mrvn> sounds more like it's the driver dealing with it.
01:47:00 <mrvn> Like "snprintf" the firmware tells you when the buffer is insufficient. Then instead of rendering the frame again with a larger buffer the driver says: "Hey, nobody will notice a broken frame. Let's ignore it and fix it for the next frame."
01:49:00 <mrvn> rendering a frame twice would probably mean the frame gets skipped altogether as you exceed the rendering time. Which would be a much bigger error in the average case with just a bunch of triangles missing. It's only the step from 0 to 10000000 triangles that makes it obvious.
01:49:00 <Mutabah> Not quite from my reading?
01:49:00 <Mutabah> It overflows, then restarts rendering and merges the results together
01:50:00 <Mutabah> So, modulo shader bugs, you get a complete frame (just degraded performance for a bit until the driver/firmware expands buffers)
01:50:00 <mrvn> Mutabah: that's the part where it renders the image in chunks.
01:51:00 <mrvn> I'm talking just about the expanding part
01:51:00 <Mutabah> Sure.
01:51:00 <mrvn> Wasn't the problem that the "broken" driver didn't do that render+merge loop?
01:52:00 <Mutabah> I think it just didn't upload the right shader to prepare for the subsequent render attempts
01:53:00 <Mutabah> So the tiles that had to render twice were cleared for the subsequent attempts instead of being initialised with the results of the last pass
05:00:00 <zid> My faulting instruction causing a page fault is
05:00:00 <zid> imul r12, r12, 0x77747af5
05:00:00 <zid> *wat*
05:00:00 <zid> CR2 is.. 0
05:02:00 <Mutabah> Something's strange there.
05:02:00 <Mutabah> Sure it's a PF and not another fault?
05:02:00 <geist> yep. absolutely
05:02:00 <Mutabah> Or an interrupt
05:02:00 <geist> or is that the right instruction?
05:03:00 <Mutabah> To qemu's debug trace!
05:03:00 <zid> That's what I used
05:04:00 <zid> It's an instruction trace, cpu goes from 205fd2 to fff...2032
05:04:00 <mrvn> or another fault that causes a pf?
05:04:00 <zid> imul to 'push r11' in my pf handler
05:04:00 <Mutabah> Qemu's interrupt trace?
05:04:00 <Mutabah> Or are you looking at another one?
05:04:00 <zid> log cpu
05:04:00 <mrvn> Is this 32bit?
05:05:00 <zid> qemu set to -singlestep, gdb attached with break int_pf then c
05:05:00 <Mutabah> `-d int` is nice for tracing errors
05:05:00 <zid> it doesn't print both, unfortunately
05:05:00 <zid> but the traces I was getting without -singlestep (so -d int worked) were showing RIP=0
05:06:00 <mrvn> That matches you seeing "CR2 is.. 0"
05:07:00 <zid> things that also might explain it.. that 205xxx page having permission issues
05:07:00 <mrvn> What fault caused RIP=0?
05:07:00 <zid> but the instruction before 205fd2 is.. 205fc8
05:07:00 <zid> which doesn't fault
05:07:00 <mrvn> and the one after?
05:09:00 <zid> I also tested for silent register corruption via IRQs, loaded every reg with a constant and spun it in a while(1); nothing happens, all regs stay fine
05:09:00 <geist> the heavy hammer is -d exec,cpu,int i think
05:09:00 <geist> it should dump the state of the registers at interrupt time
05:10:00 <mrvn> What is the instruction after imul and what did the trace show before RIP=0?
05:10:00 <geist> actually might not need exec and cpu in that case
05:10:00 <zid> -d int dumps the regs anyway
05:10:00 <geist> i dunno what log cpu does precisely
05:10:00 <zid> shows the register contents every time
05:10:00 <zid> stepi -> qemu spits out every single register
05:15:00 <zid> I'm starting to think qemu is wrong, which is silly
05:16:00 <geist> could be you're not running the instruction you think you are
05:17:00 <geist> usually sets off warning signs when you PF on instructions that don't seem to touch memory
05:17:00 <zid> the imul is the result of x /1i in qemu
05:17:00 <zid> it aligns with objdump -d
05:17:00 <geist> what about the instructions around it?
05:17:00 <geist> would they have trapped with that address?
05:17:00 <zid> mov rax, imul, cdq
05:17:00 <zid> mov rax, r12, so not from memory
05:18:00 <geist> yah so that's really really odd
05:20:00 <zid> Just did another session, single stepped through that area fine
05:20:00 <zid> so it isn't the only time this codepath is hit, this is frame 0, it died on frame 352 before
05:22:00 <Mutabah> Run with just `-d int` and use an external disassembler
05:22:00 <zid> like objdump -d?
05:22:00 <Mutabah> That'll tell you exactly what instruction caused the interrupt, and what vector, and if it's chaining
05:22:00 <Mutabah> Yep
05:22:00 <zid> If I run it without -singlestep it faults at rip=0
05:22:00 <zid> most of the time at least
05:23:00 <geist> theory: you have some sort of memory corruption of your text segment, and -singlestep changes the granularity at which it decodes and runs instructions
05:23:00 <zid> I've lost a window somewhere that's pinging the VM..
05:23:00 <zid> I can see the irq spam in the console
05:24:00 <zid> okay crashed
05:24:00 <zid> v=0e, e=0014, cpl=3, ip=23:0, pc=0, cr2:0
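For readers following the trace line above: `v=0e` is vector 14 (#PF) and `e=0014` is the page-fault error code. The sketch below decodes the low five bits of that code as documented in the Intel and AMD manuals. The function and table names are made up for illustration, and the higher bits (protection keys, shadow stack, SGX) are deliberately left out.

```python
# Low bits of the x86 page-fault error code: (bit, meaning if set, meaning if clear).
# Names here are illustrative; only bits 0-4 are covered.
PF_BITS = [
    (0, "protection violation on a present page", "page not present"),
    (1, "write access", "not a write"),
    (2, "user-mode access", "supervisor-mode access"),
    (3, "reserved bit set in a paging entry", None),
    (4, "instruction fetch", None),
]


def decode_pf_error(code):
    """Return human-readable facts encoded in a #PF error code."""
    facts = []
    for bit, if_set, if_clear in PF_BITS:
        if code & (1 << bit):
            facts.append(if_set)
        elif if_clear is not None:
            facts.append(if_clear)
    return facts


if __name__ == "__main__":
    for fact in decode_pf_error(0x14):
        print(fact)
```

Read this way, `e=0014` describes a user-mode instruction fetch from a non-present page, which is consistent with `cpl=3`, `pc=0` and `cr2:0` on the same line.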
05:24:00 <mrvn> is your .text mapped read-only?
05:25:00 <geist> are you using a bus mastering ethernet device?
05:25:00 <zid> 200000-20a000 a000 ur-
05:25:00 <geist> possible you're dmaing over text?
05:25:00 <zid> ooh DMA
05:25:00 <zid> I had not in fact considered DMA
05:26:00 <zid> DMA won't trip W^X or anything right?
05:26:00 <mrvn> only with iommu
05:26:00 <mrvn> without it's plain physical memory
05:26:00 <geist> yah, and how exactly qemu emulates dma trashing text i dunno
05:26:00 <zid> Would need an iommu for that
05:27:00 <geist> it may or may not cause it to instantly decode instructions differently, or may (probably) wait until the next trace point
05:27:00 <geist> so singlestep vs non would have different behavior there
05:27:00 <mrvn> the dynamic translations probably act similar to an icache
05:27:00 <geist> so if you wrote 0s to it, it'd for example start decoding it as the canonical x86 0 instruction
05:27:00 <geist> which is something dumb like `mov eax, 0(eax)` or some shit like that
05:28:00 <zid> same crash without the e1000 connected
05:28:00 <mrvn> zid: isn't there an interrupt before "v=0e, e=0014, cpl=3, ip=23:0, pc=0, cr2:0"?
05:28:00 <geist> what about without even initializing the e1000?
05:28:00 <zid> It's not connected to the machine
05:28:00 <geist> got it
05:28:00 <zid> It'd struggle to do DMA from the source repository only
05:29:00 <zid> instead of while it's emulating a device
05:29:00 <zid> It switched it to some 8086:10d3 at least, which I don't have an if() for, only 100E
05:30:00 <geist> usually i'd think a bug would be for example a double allocate of the pages behind the text segment when setting up the TX/RX queue in the driver
05:30:00 <FatAlbert> the books are catalog fro mmots recommended eto least ?
05:30:00 <FatAlbert> from most
05:30:00 <zid> Yea that sounds good
05:30:00 <zid> I mean, bad for me, but it's a very good idea
05:31:00 <zid> that I reused a physical address somehow, combined with qemu not showing the corrupt memory when in singlestep
05:31:00 <geist> i've personally made that mistake before. forgetting to account for text segment pages when setting up the pmm
05:31:00 <geist> and then working fine for a while until something double allocates the kernel pages
05:31:00 <zid> This code is old and was working fine before, is the problem
05:31:00 <geist> yah but those kinda bugs can hang around for a long time
05:32:00 <zid> I'm not sure I even remember how any of this memory code works anymore :P
05:32:00 <zid> but I do notice I have some memory marked user that totally shouldn't be
05:33:00 <mrvn> I had a bug once that only triggered when using the last page of memory because that was double used.
05:33:00 <zid> info tlb is sadly useless for me, I touch basically every 4k page at boot and it's so long it's taller than my "Infinite scrollback" option on my terminal >_<
05:33:00 <mrvn> zid: I always wish for that to display contiguous ranges compactly.
05:34:00 <zid> oh duh, the machine boots fine with 32M of ram, let's just do that
05:37:00 <zid> user text is mapped to 0x454000 physical, and appears precisely twice, once there, and once again in kernel space at.. fffff...454000 which is an identity offset
05:44:00 <zid> weird my X flag got knocked off on the user binary at some point
05:48:00 <Andrew> Bochs :)
05:50:00 <zid> oh, I think info tlb is just inverted because it's showing the *nx* bit
05:50:00 <zid> so that was a lovely goose which was wild
05:53:00 <geist> yah lots of times it's a goose chase, but not bad to do every once in a while
05:54:00 <geist> causes you to reevaluate or refresh your memory on stuff you hadn't thought about in a while
05:54:00 <zid> I mean, it would be nice, if I had fixed it
05:54:00 <zid> but I am literally less close to a solution the more I look at it because I've ruled more things out
06:55:00 <mrvn> “When you have eliminated all which is impossible, then whatever remains, however improbable, must be the truth.” ~ Arthur Conan Doyle, The Case-Book of Sherlock Holmes
06:55:00 <mrvn> cosmic rays?
08:55:00 <Andrew> "The system is a thing; "programs" are patches to the thing"
09:01:00 <FatAlbert> where's GeDaMo ?
09:02:00 <FatAlbert> seems to me like you guys need a bit of GeDaMo
09:06:00 <FatAlbert> :-)
09:06:00 <FatAlbert> you guys can proceed :D
09:07:00 <FatAlbert> don't let me interrupt
09:07:00 <FatAlbert> ok im leaving ..
10:47:00 <stephe> i have a dumb question about setting up the page tables when entering long mode described here https://wiki.osdev.org/Setting_Up_Long_Mode : it says that each page table entry in long mode is 8 bytes long, but the asm seems to be doing mov dword, won't that just store in the first 4 bytes?
10:47:00 <bslsk05> wiki.osdev.org: Setting Up Long Mode - OSDev Wiki
11:07:00 <pwng> Hello - for the QEMU virtio-blk device, I am getting one request done, and then I catch the virtio interrupt and, presumably, properly acknowledge the interrupt via both VirtIO INTERRUPT_ACK MMIO reg and the interrupt controller (RISC-V PLIC), I then can get other requests done, but I stop receiving the interrupt after the first request
11:09:00 <pwng> I can't spot the problem, any ideas what the problem might be? I don't think the virtio-blk initialization or sending the request is causing a problem as the requests are done exactly as I want them, but the interrupts are suppressed for some reason
11:09:00 <pwng> https://gist.github.com/i3abghany/0b2d50c3e927be0639ae7986c821b646 that's my code for virtio-blk, and I think the most important part is `virtio_blk_isr`
11:09:00 <bslsk05> gist.github.com: virtio_blk.c · GitHub
11:15:00 <pwng> There's a part of the standard that mentions "used buffer notifications suppression"; I can suppress them by setting the `flags` field in the "used" area of a virtqueue to 1, and I don't set it anywhere, and at the time of writing to QUEUE_NOTIFY, it's set to 0.
11:29:00 <pwng> Ahh yes -- turns out I unintentionally negotiated the VIRTIO_F_EVENT_IDX feature...
11:31:00 <pwng> The feature bit was only mentioned in the virtq implementation appendix of the standard, and not in the Feature Bits section :D
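A common way to avoid negotiating a feature by accident is to mask the device's offered bits against an explicit list of what the driver implements. The sketch below is not pwng's code (the gist is not reproduced here); the function name and the sample feature sets are assumptions, and the MMIO register accesses that read and write the feature words are omitted. The bit numbers 29 (event index) and 32 (VIRTIO_F_VERSION_1) are taken from the VirtIO specification.

```python
VIRTIO_F_EVENT_IDX = 29   # enables the used_event/avail_event fields
VIRTIO_F_VERSION_1 = 32   # non-legacy device


def bit(n):
    """Feature mask with only bit n set."""
    return 1 << n


def negotiate(device_features, driver_supported):
    """Accept only offered feature bits that the driver actually implements."""
    return device_features & driver_supported


if __name__ == "__main__":
    # Hypothetical example: the device offers both bits, the driver only knows VERSION_1.
    offered = bit(VIRTIO_F_EVENT_IDX) | bit(VIRTIO_F_VERSION_1)
    supported = bit(VIRTIO_F_VERSION_1)
    accepted = negotiate(offered, supported)
    print(hex(accepted))
```

With the event index feature accepted, the device consults the `used_event` field instead of the flag-based suppression, so a driver that never updates that field can plausibly see exactly the "one interrupt, then silence" symptom described above.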
12:43:00 <mrvn> stephe: the upper 4 bytes are always 0
12:45:00 <mrvn> stephe: "First we will clear the tables"
12:56:00 <stephe> mrvn: aha, i was thinking it should be mov qword ... but of course that's not allowed until you're running in long mode?
12:58:00 <mrvn> yes. And rather than alternate writing 4 byte low and 4 byte high (0) they clear the tables and then just write the low bits.
12:58:00 <mrvn> notice how the address is incremented by 8
13:00:00 <stephe> makes sense now, thanks
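The clear-then-write-low-half pattern discussed above can be simulated in a few lines. This is only a model of the idea, not the wiki's assembly: a `bytearray` stands in for the page table, the helper names are invented, and the flag value 0x3 (present + writable) with an identity mapping of 512 pages is an assumption for illustration.

```python
import struct

ENTRY_SIZE = 8      # bytes per long-mode page table entry
ENTRIES = 512       # entries in one 4 KiB table
PRESENT_RW = 0x3    # present + writable flag bits


def build_identity_pt():
    """Zero the table, then store only the low 32 bits of each entry."""
    table = bytearray(ENTRY_SIZE * ENTRIES)   # "First we will clear the tables"
    phys = 0
    for i in range(ENTRIES):
        # 32-bit store, like `mov dword [edi], ebx`; the offset advances by 8.
        struct.pack_into("<I", table, i * ENTRY_SIZE, phys | PRESENT_RW)
        phys += 0x1000
    return table


def read_entry(table, i):
    """Read entry i back as the full 64-bit value the CPU would see."""
    return struct.unpack_from("<Q", table, i * ENTRY_SIZE)[0]


if __name__ == "__main__":
    pt = build_identity_pt()
    print(hex(read_entry(pt, 1)))
```

Because the upper four bytes of every entry were zeroed up front, the 64-bit read returns the same value that a single 8-byte store would have produced.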
13:58:00 <mrvn> it's always fun seeing people trying so hard to parallelize their O(n^3) matrix multiply using a 512 core cluster when they could get the same speedup on a single core using Strassen's algorithm or a >1000 times speedup by using Alman-Williams and then go on and parallelize that.
14:02:00 <GeDaMo> "However, the constant coefficient hidden by the Big O notation is so large that these algorithms are only worthwhile for matrices that are too large to handle on present-day computers." ? https://en.wikipedia.org/wiki/Matrix_multiplication_algorithm#Sub-cubic_algorithms
14:02:00 <bslsk05> en.wikipedia.org: Matrix multiplication algorithm - Wikipedia
14:04:00 <mrvn> That might be true for Alman-Williams but certainly not for Strassen's.
14:04:00 <GeDaMo> It was the Alman-Williams I hadn't heard of before :P
14:05:00 <mrvn> It's from 2020, so not that surprising.
14:06:00 <mrvn> It's not really any improvement above the 1990 Coppersmith, Winograd algorithm if you look at the graph on wikipedia: https://en.wikipedia.org/wiki/Computational_complexity_of_matrix_multiplication#/media/File:MatrixMultComplexity_svg.svg
14:06:00 <bslsk05> en.wikipedia.org: Computational complexity of matrix multiplication - Wikipedia
14:09:00 <mrvn> Reminds me of the best algorithm for sorting numbers in parallel: if (N < 1000000) { sort any which way, that's a trivial problem } else { split into sqrt(N) parts and sort them in parallel }
14:09:00 <mrvn> So with 32bit you do at most one split, with 64bit at most 2 splits.
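The "at most one split with 32bit, at most 2 with 64bit" arithmetic can be checked directly. The sketch below only counts how many times the size has to be reduced to its square root before it drops under the 1000000 threshold; it does not sort anything, and the function name is invented for illustration.

```python
import math

THRESHOLD = 1_000_000   # below this, "sort any which way"


def split_depth(n):
    """Count how many sqrt(N) splits happen before a part is under THRESHOLD."""
    depth = 0
    while n >= THRESHOLD:
        n = math.isqrt(n)   # each part holds roughly sqrt(n) elements
        depth += 1
    return depth


if __name__ == "__main__":
    print(split_depth(2**32), split_depth(2**64))
```

For N = 2^32 the parts have 65536 elements after one split; for N = 2^64 the first split leaves 2^32 per part, and a second split brings that down to 65536.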
14:51:00 <_73> I had a discussion here a few days ago on the idea of kernel threads executing a syscall being given a higher priority than user space threads. I was told that this would be a problem because simply a call to getchar() would DOS the system. However, is this still the case if the kernel uses a turnstile data structure to manage blocked threads? In the getchar() case the syscalling thread would be inserted into the turnstile as waiting
14:51:00 <_73> for input from STDIN, and its high priority would be irrelevant because it would have to just sleep until STDIN got input. Is this correct or incorrect?
14:52:00 <mrvn> obviously it assumes it doesn't block
14:52:00 <_73> I don't know what you mean.
14:53:00 <mrvn> Instead of getchar you can call read on a non-blocking socket. Or any other syscall. The point was that the program could just waste time in syscalls and since it gets priority that would block any real work getting done.
14:54:00 <_73> So in the getchar() it would be ok for the syscalling thread to have a higher priority?
14:54:00 <mrvn> Anything that unconditionally gives priority will be used as a DoS eventually.
14:55:00 <mrvn> _73: no. You could just make 2 processes connected by pipe that getchar() / putchar()
14:55:00 <_73> Oh wow I see the problem there.
14:56:00 <_73> Ok I understand now
14:56:00 <mrvn> In older unixes programs that received input from stdin were given priority. So users waiting for their programs to finish would hit "space" every few seconds so their program would run faster.
15:13:00 <kingoffrance> that...makes me think of at least win9x or so, does active window get priority
15:13:00 <kingoffrance> "at least" later there was some setting favor "background" or "active" things IIRC
15:14:00 <sbalmos> gave rise to the server OSs. You could still change it in the Computer Advanced Properties IIRC. It was a radio button switch to favor interactive desktop apps vs server processes
15:17:00 <kingoffrance> hmm, i guess one could simulate it too, with renice and ionice or something -- for X, screen, whatever. :D
15:28:00 <mrvn> You have to consider 2 fundamental states for processes: idle or burning CPU. If everything is idle then whenever some process wakes up you just run it. If everything burns CPU you want to give every process the same fraction of time to run to be fair. (subject to manual priorities like not having sound/video stutter).
15:28:00 <mrvn> The only interesting case is if you have a mix.
15:30:00 <mrvn> In many systems you have a metric called "load" being the number of processes not blocked waiting for something, the number of awake processes. My aim in the scheduler is to minimize that load by running the process likely to go to sleep the quickest.
15:33:00 <mrvn> My simplistic heuristic for that is: Does the process being woken up have time left in its time slice or slept longer than 1 loop of the round-robin scheduler?
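A minimal sketch of that wakeup test follows. The field names, the tick units, and the assumption that a "yes" answer means the woken process gets to run right away are all hypothetical; mrvn's actual scheduler is not shown in the log.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    slice_left: int    # ticks remaining in the current time slice (hypothetical field)
    slept_ticks: int   # ticks spent asleep before this wakeup (hypothetical field)


def run_on_wakeup(woken, round_length_ticks):
    """True if the woken task should be favoured under the heuristic above.

    round_length_ticks is the time one full pass of the round-robin takes.
    """
    return woken.slice_left > 0 or woken.slept_ticks > round_length_ticks


if __name__ == "__main__":
    kbd = Task("keyboard driver", slice_left=0, slept_ticks=500)
    print(run_on_wakeup(kbd, round_length_ticks=40))
```

The idea is that a task which sleeps a lot, or gave up its slice early, is likely to go back to sleep quickly, so running it first keeps the load low.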
15:38:00 <kingoffrance> "running the process likely to go to sleep the quickest." makes sense, for permanent sleep (end) too :)
15:42:00 <mrvn> Hard to predict anything for processes or threads that run less than a timeslice.
15:42:00 <kingoffrance> agree, just had similar thoughts if a specialized case (real time) you knew
15:43:00 <mrvn> With `fork()` there is the assumption that likely the child will `exec()` and free all the COW stuff. So it's run first generally.
15:43:00 <mrvn> If a process keeps starting threads that quickly quit you can maybe record that in the process. Keep track of avg. thread lifetime.
15:44:00 <kingoffrance> yeah, as cool as it sounds, there is surely a sweet spot of "when is this overhead worth it?"
15:45:00 <mrvn> All stuff you should measure first to see what the actual usage pattern is on your OS before you think of schemes to optimize something that might not be there in the end.
15:45:00 <kingoffrance> ^^^
15:47:00 <mrvn> Example: Linux has a huge and complex `O(1)` fair scheduler that decides what to run next. I have a round-robin scheduler that just runs "task->next". When I switch to e.g. the keyboard driver the process goes back to sleep before Linux has even made up its mind what process to run next.
15:47:00 <mrvn> sometimes less is more.
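The "task->next" scheduler mentioned above amounts to walking a circular linked list. The sketch below shows that structure only; the class and function names are invented, and sleeping or blocked tasks are not modelled.

```python
class Task:
    """One node of a circular run queue."""

    def __init__(self, name):
        self.name = name
        self.next = self   # a lone task points at itself


def enqueue(current, task):
    """Insert task right after current in the ring."""
    task.next = current.next
    current.next = task


def schedule(current):
    """Pick the next task to run: a single pointer dereference."""
    return current.next


if __name__ == "__main__":
    a, b, c = Task("a"), Task("b"), Task("c")
    enqueue(a, b)
    enqueue(b, c)
    running = a
    for _ in range(4):
        running = schedule(running)
        print(running.name)
```

The decision costs one pointer load, which is the trade-off being described: no fairness bookkeeping, but almost no scheduling overhead either.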
17:40:00 <geist> yah i find it hard to optimize that stuff until you have actual real load
17:54:00 <zid> I have a load for you, mov rax, [rsi]
17:54:00 <zid> Keep it, I have a bunch
18:30:00 <mrvn> zid: but that's a virtual load