Search logs:

channel logs for 2004 - 2010 are archived at http://tunes.org/~nef/logs/old/ ·· can't be searched

#osdev2 = #osdev @ Libera from 23may2021 to present

#osdev @ OPN/FreeNode from 3apr2001 to 23may2021

all other channels are on OPN/FreeNode from 2004 to present


http://bespin.org/~qz/search/?view=1&c=osdev&y=19&m=5&d=6

Monday, 6 May 2019

01:39:23 <geist> yah iirc the 386 has like 16 TLB entries?
01:39:35 <geist> so it already had a decent amount out the gate when they added it
05:07:40 <doug16k> .pushsection / .popsection nice
05:41:43 <geist> doug16k: yeah you can do some interesting crimes with that
06:26:29 <ybyourmom> we've got ourselves a badass here
06:48:39 <doug16k> geist, I decided against #UD hack thing and will just emit fixups to a special section with that :)
06:49:14 <doug16k> linker will put together an array of pointers to instructions that may need fixup
06:49:23 <doug16k> then sanely iterate up front
06:56:36 <doug16k> I made a little macro that pushes section to a special input section name, generates a unique label with \@ then .quad emits a pointer and pops section
06:57:05 <doug16k> placing it before an instruction emits a fixup
06:57:12 <doug16k> on the line before
06:57:57 <doug16k> I don't need parameters, I can look at the instruction and infer all
06:59:12 <geist> yah, that's where i've mostly seen it done. great for building arrays of pointers to things
07:02:41 <doug16k> eventually I want to add the capability to context switch performance counters and debug registers
07:07:28 <doug16k> I only save fsgsbase when coming from user mode, and only set fsgsbase when returning to user mode, but I swap gs so I have kernel gs actually, and fs of the last user mode thread that ran on this cpu
07:08:11 <doug16k> I realized that I'd save and restore performance counters and debug registers right there too
07:08:27 <doug16k> hypothetically
07:09:58 <doug16k> (only when coming from / going into user code)
07:10:44 <olsner> I have some code that generates all the IRQ entry points during init now... used to have a list of entries of a known size (to avoid the separate array of pointers), but that broke when I added more entry points since the jump offset to the common IRQ handler didn't fit in the same instruction anymore
07:12:31 <doug16k> a short jump can only get over 7 aligned entry points
07:12:58 <doug16k> where "aligned" for x86 code fetch is 16 byte aligned
07:14:19 <doug16k> I did a thing where I put each entry point in its own section then sort it into position, then my code assumes they are 16 bytes apart. they end up that way because .balign 16
07:15:00 <doug16k> 256 16 byte entry points is interestingly one 4KB page long
07:15:26 <olsner> how much of a difference does that alignment make?
07:15:45 <doug16k> quite a bit. depends
07:16:32 <doug16k> the way x86 fetch works is, it fetches a 16 (or 32 on latest) byte block at a 16 byte aligned boundary. that is the only way it fetches code, aligned at 16 byte boundaries
07:17:35 <doug16k> the consequence of that is, for example, if your label is 4 or 5 bytes away from the end of the 16 byte window, it will fetch just those 4 or 5 bytes, then the next cycle it will get a full fetch window
07:17:44 <olsner> so I guess the main thing is to not need two fetches for the IRQ entry before it jumps away
07:17:56 <doug16k> on the last couple of generations it is a 32 byte window so the worst case fetch block is 17 bytes
07:18:24 <doug16k> on older (haswell or so) it will do a 1 byte fetch if RIP least significant hex digit is 0xF
07:19:02 <doug16k> that is why compilers actually align everything by default. it matters enough
07:19:21 <doug16k> only "smallest" optimization turns off alignment
07:20:33 <doug16k> yes, you will guarantee a single code fetch cycle if you 16-byte align every ISR entry point
07:21:48 <doug16k> miracle will occur in fetch stages of pipeline and it will see the jmp and fetch there immediately on the next fetch cycle
08:54:50 <mquy90> I am using two level tables for paging. I wonder how it works for allocating on the fly because at this time, everything is virtual address
08:55:33 <mquy90> I use identity mapping to work around but I am not sure it is a right way or not?
08:55:47 <mquy90> identity mapping on the first 4MB
09:06:58 <Mutabah> You want to keep a way of accessing at least one page table (i.e. have the table itself mapped as a data page)
09:07:19 <Mutabah> Then you can use that to map new pages into the address space, and manipulate them (e.g. manipulate other page tables)
09:07:58 <Mutabah> Allocating the pages themselves is easy, that's just managing which pages are in use (metadata) - the bit you're probably asking about is "how do you add a new table once paging is enabled"
09:17:44 <mquy90> I am a bit confused, for example. I have page directory and one page table -> paging is enabled -> I want to add a new table which is allocated at xxx, but xxx does not exist in mapping
10:30:12 <doug16k> mquy90, you discovered the chicken and egg problem
10:31:02 <doug16k> easy solution is to reserve a 4MB region of address space and set it to the same physical address as CR3 says
10:31:16 <doug16k> by setting that slot of page directory to that
10:32:07 <doug16k> if you point a slot of the page directory at the page directory, it makes the page tables themselves appear in that 4MB region
10:32:31 <doug16k> as an array of 1048576 32-bit entries, one per 4KB region
10:33:01 <doug16k> they won't all "be there", not unless you created a page table page and set that entry of page directory to point to it
10:34:13 <doug16k> harder solution is to set aside some number of page table entries (while you are still identity mapped) and have them already mapped too (so you can change them to point to different addresses)
10:34:55 <mquy90> I think it works well for kernel, but is it the same when creating user process?
10:34:59 <doug16k> when you need to bootstrap yourself out of "how do I map the page tables so I can map the page tables" you can get out of it by pointing one of those entries at some arbitrary physical page and that page appears there
10:36:02 <doug16k> if you go with the easy solution, you switch to their page directory, then do your thing. all user page directories map the kernel the same
10:36:35 <doug16k> their page tables will appear in the page table mapping region
10:37:05 <doug16k> that trick is only good on x86 though. it's the easy way to go if you want it to just work asap
10:37:53 <doug16k> it makes paging insanely easy though once you really get what it does
10:38:26 <doug16k> it makes the page tables a sparse array in that 4MB region
10:38:42 <doug16k> you can O(1) instantly locate where a PTE is for a given virtual address
10:39:10 <doug16k> nearly "one instruction" I should say
10:40:16 <doug16k> page_tables[addr >> 12]. done. page dir entry? no problem, page_dir[addr >> 22]
10:41:11 <mquy90> if we share the same 4MB region for both kernel and user processes, is there any problem with overlapping?
10:43:26 <doug16k> how much of the top of space is your kernel? 0xc0000000-0xFFFFFFFF ?
10:43:39 <mquy90> yeah
10:43:58 <doug16k> then the last 256 entries of every user page directory are pointing to the same page tables
10:44:21 <doug16k> when you switch to another address space, none of them even changed, even if it went and looked
10:45:06 <doug16k> no problem with overlapping. at one moment it is one mapping, another moment it is another, no moment of overlap
10:45:51 <doug16k> that is one of the reasons why it is multi-level like that. you can share a whole 4MB region by sharing the entire page table
10:46:25 <doug16k> put that same page table phys addr in multiple user process page directories, and they all see the same thing within that 4MB
10:46:41 <doug16k> even different 4MB regions (different slots of page dir)
10:47:05 <doug16k> you are doing that for the top 1/4 of the slots across all user page dirs, for the kernel
10:48:06 <mquy90> let me reread what you said, I don't fully understand :D. Besides that approach, are there any other ways?
10:48:07 <doug16k> (4MB because one page table maps 1024 4KB pages, 4096KB = 4MB)
10:48:51 <doug16k> would you like all virtual address space to just appear as an array of 1048576 array slots but not take the whole 4MB?
10:49:00 <doug16k> one array slot per 4KB
10:50:53 <doug16k> the other way is to set up a page table in your early setup code which maps some region of the page tables themselves
10:51:08 <mquy90> I understand the benefits of using two level tables, I mean that instead of 4MB identity region
10:51:28 <doug16k> you can modify those to point anywhere, then use that to edit the page tables arbitrarily
10:52:06 <doug16k> that way, to edit page tables, first you edit page tables so those page tables appear somewhere, then you do your actual edit
10:53:42 <doug16k> easy mode is: pte = (page_dir[addr >> 22] & PTE_PRESENT) ? &page_tables[addr >> 12] : null
10:57:54 <doug16k> mquy90, you are trying to properly implement paging and protection right? not a flat identity map or anything right?
10:58:30 <doug16k> if I understood you correctly earlier you realized you had no way of adding more page tables because you didn't have enough page tables mapped
10:58:46 <doug16k> enough of the page tables*
10:59:40 <doug16k> the problem is not being able to access the page tables. one approach is to use an x86 specific trick that makes all the page tables magically appear in a 4MB region. another more portable solution is to give yourself some way of accessing arbitrary physical addresses
10:59:51 <mquy90> normal paging, where each virtual address can map to any physical address, I am not sure what its name is?
11:00:09 <doug16k> normal is good enough :)
11:00:36 <mquy90> and two level, directory and table :D
11:00:47 <doug16k> yes, that's dictated by the hardware though
11:01:13 <doug16k> do you want to do the nicer more portable way, or less nice far easier x86 only way
11:02:13 <mquy90> maybe, easier way first and also know that there is a better way (I am not sure we have)
11:02:31 <doug16k> nicer way will require you to break the chicken and egg problem by mapping in some range of page table entries somewhere when you initialize paging
11:03:09 <doug16k> using that region of page table entries you can edit them to point anywhere. the region they represent will point to whatever physical address you put in the page table entry
11:03:57 <doug16k> easy way is very easy.
11:05:37 <doug16k> pick a slot to put your tables. 1023 is not bad. now page_tables = (uint32_t*)0xFFC00000
11:05:43 <doug16k> (top 4MB)
11:06:19 <doug16k> you put the page directory at 1023 so the top of the top of that is the page directory: page_dir = (uint32_t*)0xFFFFF000
11:06:51 <doug16k> now to map a page, you first see if there is a page table there yet:
11:07:05 <doug16k> if (page_dir[addr>>22] & PTE_PRESENT)
11:07:30 <doug16k> if there isn't, then you need to allocate a page, and stick it in page_dir[addr>>22] and make it present and writable
11:07:39 <doug16k> if there is, do nothing
11:08:14 <doug16k> now you know it is safe to do next step, the pte for the virtual address will be at page_tables[addr>>12]
11:08:54 <doug16k> if you want to make that address present, allocate a page, set page_tables[addr>>12] to that phys page addr plus make it present / writable
11:10:51 <doug16k> again, you check page_dir[addr >> 22] to either A) see if any page table exists for that 4MB region, and/or, B) create a new page table and set page_dir[addr >> 22]
11:11:32 <doug16k> once page_dir[addr>>22] has its present bit set, then it is safe to access page_tables[addr >> 12], which is the PTE that maps virtual address "addr"
11:18:34 <mquy90> when allocating a page, does that page have to be in that 4MB region? otherwise it doesn't exist when accessing it
11:33:12 <retpoline> .theo
11:33:12 <glenda> You are way out of line with the software development community.
11:37:21 <lkurusa> .ken
11:37:21 <glenda> I just hate to be pushed around by some @#$%^& machine.
11:38:41 <doug16k> mquy90, each slot of the page directory represents a 4MB region. it points to a page table
11:38:55 <doug16k> page table is 1024 slots of 4KB pages
11:39:30 <doug16k> sorry, page table is 1024 page table entries, each of which maps 4KB
11:40:24 <doug16k> if you have a function that makes some 4KB aligned region map some 4KB aligned physical address, you first make sure that virtual address even has a page table, by checking the appropriate slot of the page directory
11:40:40 <mquy90> yup, I understand that but when allocating page table, where should it be located?
11:41:09 <doug16k> where?
11:41:37 <doug16k> you don't care where. you allocate it from your physical allocator
11:41:50 <doug16k> physical allocator tells you some physical address of a free page
11:42:02 <doug16k> you put that physical address in a page directory entry and mark it present and writable
11:42:12 <doug16k> now that 4MB region has a page table
11:42:32 <doug16k> now you can place 4KB pages in that page table, and those ranges of memory will go to that physical memory
11:42:49 <doug16k> sorry, place page table entries that map 4KB pages in that page table
11:43:22 <doug16k> "where" for the page directory entry for a given address is page_dir[addr >> 22]
11:43:54 <doug16k> "where" for the page table entry (after you have verified/assigned page_dir entry) is page_tables[addr >> 12]
11:44:26 <doug16k> when you set page_dir[x>>22] then a page is there at page_tables[x>>12]
11:46:52 <doug16k> conversely, if page_dir[x>>22] has bit 0 equal to zero, then accessing page_tables[x>>12] is a page fault because there are no page tables at that address
11:49:03 <doug16k> note that >> 22 is "divide by 4MB", and, >> 12 is "divide by 4KB"
11:49:08 <mquy90> for example, when creating a page table to assign to a page directory entry, that page table's address comes from the physical allocator
11:49:27 <doug16k> you assign it to page_dir[addr>>22]
11:49:38 <doug16k> addr is whatever address where you wish pages to appear
11:50:30 <doug16k> then, once you know you have a page there in page_dir[addr>>22], then you can go look at page_tables[addr>>12] and see if it has a page and assign one there
11:50:53 <doug16k> if not, get another physical page, set present, writable, whatever, write it to page_tables[addr>>12]
11:51:45 <doug16k> if you set the last slot of the page dir to point at the page dir, then page_tables is 0xFFC00000 and page_dir is 0xFFFFF000
11:51:45 <mquy90> but the problem is that accessing page_tables page faults
11:52:09 <doug16k> it won't if you read the part where I said to first check page_dir[addr>>22]
11:52:30 <doug16k> if bit 0 of that is not 1, then it will page fault if you access page_tables[addr>>12]
11:52:40 <doug16k> if it is 1, then it won't fault - page table is there
11:54:36 <mquy90> seems like I'm missing some points :D, will reread it :+1:
11:55:26 <doug16k> the main trick is using the page directory itself as a page table
11:55:49 <doug16k> it makes that 4MB region magically be a 1:1 mapping of the whole 4GB, one 32 bit entry per 4KB
11:56:56 <doug16k> and, the page directory will appear within there, linearly proportional to its place in the page directory. if you recurse the last slot then the last 4KB is the page directory (because that is where it lands when you interpret the page directory as a page table, last)
12:00:07 <doug16k> picture what would happen if the page directory were also a page table. in that case, each 4KB region is a page table. each 32 bit entry is a page table entry. in order. from 0 to 0xFFFFF000
12:00:55 <doug16k> and since the page directory is in there in that page table, it appears as a 4KB region in there, according to where it is
12:01:00 <doug16k> if last, then it is last in the 4MB region
12:07:49 <doug16k> kernel patching thing is working. I'm wondering whether it is better to do call disp32, then, 5-byte-nop, to replace 2 instructions with a single call, or would rep * 5, call disp32 be faster -> https://coliru.stacked-crooked.com/a/fbc7e81ddad82518
12:09:12 <doug16k> since it would be one instruction, and decode past call is pointless anyway
12:09:35 <doug16k> when it returns it would breeze right over that long nop eh?
12:17:25 <doug16k> found a very recent PCI class codes pdf the other day -> https://pcisig.com/sites/default/files/files/PCI_Code-ID_r_1_11__v24_Jan_2019.pdf
12:40:46 <pterp> I decided to show my paging code and see if anyone can find the bug. The bad function is paging_new_address_space, and the bad line is line 129. https://pastebin.com/fF1809Up
12:45:41 <doug16k> line 123 should be: asm volatile("movl %%cr3,%[dest]" : [dest] "=r" (cr3));
12:46:09 <pterp> Bit better, but the old line works fine.
12:47:13 <doug16k> line 118 is forcing it to use ebx? so it has to push and pop around it - ebx is callee saved
12:47:41 <pterp> that is actually straight from the inline assembly examples in the wiki
12:47:57 <doug16k> change b to r and it's perfect
12:48:07 <pterp> done.
12:48:57 <doug16k> why is line 129 "bad"
12:49:04 <doug16k> page faults there?
12:51:16 <pterp> no, just doesnt modify the structure map properly (structure map holds active page directory and page tables, but can be changed to an inactive one, which I do here). I can show the 768th entry in the passed in directory in physical memory and it doesn't change.
12:51:31 <pterp> * done by qemu's xp command
12:54:19 <doug16k> i*1024 eh? why? (line 128)
12:54:32 <doug16k> 0th entry?
12:54:48 <doug16k> not i*1024 + 1023
12:54:50 <doug16k> ?
12:55:20 <doug16k> not following NUM_KERN_DIRS
12:55:27 <pterp> First time round it's the address of the first 1024 entries (the first kernel page table), second time round the address of the second 1024, and so on.
12:55:29 <doug16k> why is there not an infinite number of dirs
12:55:43 <doug16k> limit*
12:56:17 <doug16k> kern_page_tables represents an array of page directories right?
12:56:19 <pterp> That really means the number of page tables for the kernel binary (it takes around 8MB right now)
12:56:36 <pterp> Currently set to 2
12:56:55 <doug16k> 1024 page directories each
12:57:02 <doug16k> er page directory entries each
12:57:06 <pterp> Yes.
12:57:15 <pterp> 2048 pages reserved for the kernel binary.
12:57:33 <doug16k> then i*1024 represents the page directory entry that represents the first 4MB.
12:58:00 <pterp> When i is 0 yes.
12:58:07 <doug16k> no every time
12:58:15 <doug16k> it's the first 4MB entry in each page directory
12:59:01 <doug16k> 0x00000000-0x003FFFFF entry
12:59:42 <doug16k> why is it reading slot 0 of each page dir then storing that in +768 of another
01:00:05 <pterp> When i is 0 we have the adress of the first 1024 entries in kernel_page_tables, which is mapped to the 768th page directory entry. When i is 1 we have the address of the next 1024 entries in kernel_page_tables, which is mapped to the 769th page directory entry.
01:00:30 <doug16k> oh. it's a whole bunch of recursive mappings then
01:00:36 <pterp> Kernel page tables is essentially page tables stored back to back.
01:00:54 <doug16k> potentially a bunch
01:01:02 <pterp> yes. right now 2.
01:01:06 <doug16k> per NUM_KERN_DIRS
01:01:25 <doug16k> ok I see now
01:01:41 <pterp> That code maps the first page table in there at the 768th entry, then maps the next at the 769th, and on and on.
01:01:58 <doug16k> yes. one 4MB window per dir
01:02:06 <pterp> right.
01:02:39 <doug16k> why all the invalidating?
01:03:10 <doug16k> there's guaranteed nothing to invalidate when it was not present 5 nanoseconds ago
01:03:21 <pterp> either way, it properly assigns when run manually in the debugger(p smap[i+768]=(entry_virt-0xC0000000)|0x7), but not when run without the debugger.
01:03:36 <doug16k> and never has been present since its universe began
01:03:43 <pterp> Also, that invalidation is done because it WAS present before.
01:04:00 <doug16k> it was and one instruction is invalidating 4MB?
01:04:24 <doug16k> flush the entire TLB before you use the changed dir
01:04:41 <pterp> just the first page in the structure map (a page directory, usually the active one)
01:05:27 <doug16k> what is 137 trying to do?
01:06:02 <doug16k> you want to forget mappings at virtual address &smap ?
01:06:20 <doug16k> next load from smap[0] will rewalk tlb. why>?
01:06:55 <pterp> line 137? the previous line resets the first mapping to the current page directory, line 137 needs to flush, otherwise the TLB is out of sync with the structures for that page.
01:07:49 <doug16k> where did the virt-to-phys for smap change?
01:08:13 <pterp> smap_page_tables[0]=cr3|0x3;
01:08:29 <doug16k> zero really?
01:08:29 <pterp> Line 136.
01:08:39 <doug16k> so null pointer access is page tables then
01:08:47 <pterp> No.
01:09:27 <pterp> The first page in smap_page_tables sets the mapping for the page at 0xFF800000.
01:09:32 <doug16k> smap is a pointer
01:09:39 <pterp> Right.
01:09:40 <doug16k> what good is invalidating your data section?
01:09:49 <doug16k> line 12?
01:09:55 <doug16k> it says &smap right?
01:10:06 <doug16k> remove the & is probably what you mean?
01:10:57 <doug16k> unless you wanted to "invlpg" a 4 byte pointer somewhere in .data
01:11:19 <doug16k> you want to invlpg the value in smap
01:11:28 <doug16k> right?
01:12:16 <doug16k> the address of smap is a pointer to a pointer
01:14:03 <pterp> It works!
01:14:14 <doug16k> nice
01:14:54 <doug16k> so much code and all that worked except one tiny character
01:15:04 <doug16k> this is why programming is fun
01:23:21 <pterp> And i now have my ELF loader mapping pages into a new address space and starting init as a separate task!
01:54:02 <doug16k> nice
01:56:48 <pterp> How can i test if i'm in user mode?
01:57:09 <doug16k> cs bit 0 and 1 will be 3
01:57:21 <doug16k> or ss
01:57:27 <pterp> ah, you can read from cs. Didn't know that.
01:57:39 <doug16k> no I mean the register value
01:57:46 <doug16k> you can copy it to a general register
01:58:30 <doug16k> "mov %%cs,%[cs]" : [cs] "=r")
01:59:22 <doug16k> oops, asm volatile("mov %%cs,%[cs]" : [cs] "=r" (cs));
02:01:51 <doug16k> bit 0 and 1 of selectors holds the RPL (requested privilege level)
02:02:01 <doug16k> on cs and ss that will be your privilege level
02:38:50 <pterp> I've just discovered a bit of a problem. When I yield it goes like this: kernel->yield call->interrupt->switches context to init task->init task yields->interrupt->switches context to kernel task->tasking_yield returns->iret->kernel yield->interrupt->switches context to init task->tasking_yield returns->page fault on address 0x23. Where is that page fault coming from?
02:39:27 <zid> misaligned stack and you retted to the segment selector? idk
02:41:00 <bcos> Um
02:41:32 <bcos> pterp: kernel->yield call->switches context to init task (without any interrupt because that's a massive mistake)?
02:43:15 <eryjus_> pterp, bcos: is yield implemented as a syscall?
02:43:22 <pterp> Yes.
02:43:38 <bcos> eryjus_: In my case, no, yield isn't implemented ;-)
02:43:48 <zid> yeild is old hat
02:44:16 <bcos> (for "highest priority thread runs" yeild is a no-op)
02:44:20 <pterp> Doesn't it have to be a syscall to change address spaces?
02:45:04 <bcos> pterp: In your case it would have to be a syscall; but..
02:46:10 <pterp> Now i think i have kernel->tasking_yield call->switches context to init task->init task yields->interrupt->switches context to kernel task->tasking_yield returns->kernel yield->interrupt->switches context to init task->tasking_yield returns->page fault on address 0x23
02:46:11 <bcos> There's a "common beginner's mistake" where task switches are conflated with interrupts, which often results in extremely slow and inefficient code, sometimes including people doing a "HLT" to wait for timer to do the task switch
02:46:36 <pterp> How would i make it not a syscall?
02:47:05 <bcos> ?
02:48:07 <zid> remove the code?
02:48:15 <zid> if nobody can do it, it's not a syscall, ez?
02:48:15 <pterp> that's what you're saying right? Move all the code into userspace?
02:48:28 <zid> what code
02:48:37 <pterp> The yield code.
02:48:41 <bcos> pterp: Is it more like "yield syscall to kernel (which is implemented as a software interrupt) -> switches context to init task"?
02:48:51 <pterp> Yes.
02:48:58 <pterp> That's what it is.
02:49:33 <bcos> Ah, OK. Your earlier version made it sound like you were in the kernel already, then called "yield()", then did an interrupt
02:49:51 <pterp> the kernel yield was an interrupt but i removed that.
02:50:00 <pterp> Still getting the page fault
02:50:05 <zid> what could the kernel possibly yield to?
02:50:18 <pterp> The init task.
02:50:25 <zid> That's not a yield, that's a run
02:50:31 <pterp> ?
02:50:42 <pterp> It creates that task, then yields control to that task.
02:50:45 <zid> It should just be an 'iret', not a 'yield'
02:51:01 <zid> the kernel is 'running the process', not yielding to it
02:51:19 * bcos assumes "kernel stack per thread; init task is thread in user-space"
02:51:28 <pterp> Huh? You create the task then yield to it.
02:51:46 <zid> Nobody else would use the word 'yield' there it doesn't really make sense to me
02:52:10 <pterp> Also, there's only one global kernel stack, and each task has a separate user stack.
02:52:34 <bcos> "yield" is like "sleep(0)" which is like "tell the scheduler to switch to some other task (any other task) for a minimum of zero seconds"
02:52:39 <zid> The kernel doesn't stop running so that init can run, the kernel is just running a task. You can even picture the execution environment to be the kernel and the process, (think of the page tables), and the process is just doing api calls into protected pages
02:52:46 <zid> and the process is just pretending to be the kernel for a bit during syscalls
02:54:33 <pterp> No, the kernel creates a separate address space for init, then initializes a task with that address space and the elf entry point as the starting point, then yields to it. Just like a normal user process creating a task and switching to it.
02:54:40 <zid> It can't be separate
02:54:57 <zid> else you'll never be able to get back to the kernel
02:55:02 <pterp> Init has a new address space, which does include the kernel of course.
02:55:03 <zid> interrupts and syscalls would see empty pages
02:55:33 <zid> right, so you can consider the execution environment to just be "process + kernel", and every process has its own kernel
02:55:41 <pterp> Then it yields to the task, switching to init's address space and setting eip to the entry point.
02:55:59 <bcos> pterp: For "kernel stack per CPU" (your "one global kernel stack" for single-CPU) the kernel can't do any task switches - mostly you save user-thread state when you switch from user-task to kernel and then when you return from kernel to (potentially different) user-task you load the (potentially different) user-task state
02:56:12 <zid> It doesn't matter if any one task is inside the kernel or user pages at any moment, it's still that process running
02:56:19 <zid> doesn't matter if it's in main() or sys_write
02:56:30 <zid> that's just a trivial matter of a couple of permission bits being set or not
02:56:45 <zid> the kernel isn't yielding to the process, it's retting to it
02:57:25 <zid> If you really wanted the concept of the kernel yielding, I'd say the yield happened when you reloaded cr3, not when you iret
02:57:36 <zid> because that's the point you switch execution environment
02:57:48 <bcos> pterp has left this server
02:57:51 <zid> heh
02:57:56 <zid> he does that, I have no idea why
02:58:27 <bcos> Usually when that happens it's an unreliable internet connection
02:59:02 * bcos waits for "pterp has joined"
02:59:35 <zid> I have a good diagram in my mind here but I'd never be able to draw it :P
02:59:52 <zid> multiple address spaces stacked in a horizontal row, with an upper/lower half to each for kernel and user pages
03:00:22 <zid> with a yield arrow going from lower half of one, through the thin barrier into kernel half, through the 'shared page' where cr3 swap happens
03:00:57 <bcos> "kernel stack per CPU" is very different to "kernel stack per task"
03:01:16 <zid> stacks aren't that relevant here
03:01:40 <zid> He just has a weird worldview and I'm trying to give him mine :p
03:02:04 <bcos> "kernel stack per CPU" is not like "multiple address spaces stacked in a horizontal row, with an upper/lower half to each for kernel" and is a lot more like "multiple address spaces stacked in a horizontal row, with kernel in one of them"
03:02:38 <zid> I don't see how a stack changes that, each 'column' is a cr3 change not an ss:rsp change
03:02:39 <bcos> ..not literally, but logically
03:03:39 <bcos> It doesn't matter. The problem is that kernel can't do a task switch of any kind (because it can't leave unexpected trash on the kernel stack)
03:04:06 <bcos> ..and that's why it ends up like the kernel is a task by itself
03:04:08 <zid> wow, I fucked up typing yield so hard just now
03:04:19 <zid> I managed to go for yiedl
03:09:51 * bcos would also suggest that "one kernel stack per CPU" isn't a good idea for beginners (or monolithic kernels)
04:31:26 <bauen1> amazingly, terminating firefox with 1440 tabs will saturate the entire disk io on this "crappy" ssd due to swapin
04:31:46 <bauen1> this was supposed to free up some ram goddamit
04:34:17 <bcos> bauen1: You created 1440 tabs to free up RAM?
04:37:20 <bauen1> lol no
04:37:26 <bauen1> ext4 performance is just terrible
04:37:33 <bauen1> i need to look into switching filesystems maybe
04:37:40 <bauen1> bcos: 1440 is my default profile
04:38:31 <bcos> ?
04:38:57 <bcos> bauen1: https://www.mayoclinic.org/diseases-conditions/hoarding-disorder/symptoms-causes/syc-20356056
04:40:13 <bauen1> i just closed firefox, how am i supposed to view that
04:41:08 <eryjus_> bcos, now why would you do that?? bauen1 would have 1441 open tabs....
04:42:42 * bcos hates tabs
04:43:17 <bcos> - can't see what anything is when you "alt+tab" (unlike separate browser windows)
04:43:44 <bcos> ..and the resource consumption (compared to bookmarks) is insane
04:45:22 <aalm> i often start hoping for a panic(); that never comes when i'm past 150 open tabs, tabs just become hard to navigate :(
04:46:04 <bauen1> ^ that's the only real issue (and i'm trying to bookmark stuff, it's just not working well enough)
04:47:00 <aalm> chromium is a trooper, when it comes to recovering a session with +100 open tabs :]
04:47:18 <aalm> because there's the one or two tabs you didn't bookmark .. :D
04:48:01 <aalm> i should buy a mouse w/o button on the wheel
04:48:02 <bauen1> i have yet to see someone using chrome with >1.5 open tabs
04:49:16 <aalm> well, i didn't mean open==visible
04:50:12 <zid> most tabs I ever have is like.. 8
04:50:24 <zid> if a project of 4 tabs gets interrupted by 4 more
04:50:26 <aalm> i wish... :S
04:52:36 <aalm> you must be laptop users or something
04:54:14 <bauen1> i'm on a laptop ...
04:54:28 <aalm> i don't reboot the browser every week even
04:54:35 <bauen1> yeah
04:54:46 <bauen1> i only restart once the swapping gets unbearable
04:55:06 <zid> I am on an 8 thread desktop with 24GB of ram
04:55:09 <zid> still don't need more than 8 tabs
04:55:21 <aalm> i've learned to avoid browsing during daily crons.. :D
04:55:34 <zid> would you have 200 pieces of paper on your real desk that you're actively looking at? I doubt it
04:55:53 <eryjus_> the problem with 1440 open tabs is that 1338 of them are going to have ads and automatic refreshes built in where it will consume all system resources... i think the high water mark for me is about 30 tabs and then my OCD kicks in...
04:56:18 <aalm> eryjus_, you browse wrong sites then :/
04:56:39 <aalm> i hear when site does something like that
04:57:48 <aalm> got only 5 sites i visit w/disabled js
04:58:12 <bauen1> eryjus_: that's what ublock origin is for
04:58:23 <aalm> i don't use browser addons at all either
04:58:33 <aalm> they consume unnecessary cycles
04:58:44 <bauen1> true, but ad blocking makes everything so much better
04:59:24 <aalm> or just avoiding the places where you need such 'betterness' :D
04:59:46 <bauen1> sadly not possible in the modern world :/
04:59:52 <zid> like they can't update their pages to add adverts
05:00:01 <aalm> use the phone for those, or something
05:00:16 <bauen1> no ...
05:00:20 <zid> I wouldn't use a website on a phone if I had ANY other choice
05:00:26 <bauen1> you can't block ads on mobile
05:00:30 <bauen1> ^^
05:00:32 <zid> you can it's just more annoying
05:00:49 <bauen1> it's not just annoying ._.
05:01:01 <zid> is there even a phone nobody has rooted yet
05:01:19 <aalm> don't all crap sites usually have their own apps?
05:04:46 <aalm> anyway, if i was into blocking ads, i think i'd go for blocking it on the resolver(not even running on the browsing client)
05:04:57 <zid> hence rooting your phone, and editing /etc/hosts
05:05:04 <zid> which is a popular option
05:05:10 <zid> just blackhole adsense and doubleclick etc
05:05:16 <bauen1> editing /etc/hosts doesn't really work too well anymore
05:05:21 <zid> it does.. okay
05:05:23 <aalm> oh, right
05:05:23 <zid> better than nothing
05:06:55 <aalm> there used to be some kind of a proxy possibly increasing privacy, is there no solution like that for ad blocking?
05:07:04 <ZombieChicken> /etc/hosts works to handle the domains ads are served from, but it doesn't deal with everything. You still need an adblocker to deal with other unwanted ad-related crap
05:07:12 <zid> >better than nothing
05:07:19 <bauen1> youtube ads can't be blocked with hosts
05:07:40 <ZombieChicken> aalm: Proxies are of limited value these days with everything going over HTTPS. The proxy straight up can't determine what to block
05:07:55 <zid> mitm yourself :P
05:07:58 <ZombieChicken> and I think you are thinking of Privoxy
05:08:12 <zid> add your proxy as a root ca
05:08:21 <zid> let it mitm every ssl connection
05:08:49 <notrackmo> muh super secret primez
05:10:07 <ZombieChicken> zid: That's certainly an idea
05:10:08 <bauen1> ^
05:10:12 <bauen1> *^^
05:10:51 <ZombieChicken> notrackmo: You mean the primes any major intelligence agency in the world likely has plastered all over everything?
05:13:28 <bauen1> bcos: i wouldn't call it a hoarding disorder, since i don't really care enough about my tabs, i just can't be bothered to sort them between bookmark and close
05:13:59 <bauen1> so really i'm just too lazy to close tabs
05:14:46 <notrackmo> does your browser at least shut down scripts running on tabs that are not currently visible to the user?
05:16:01 <bauen1> maybe
05:16:03 <bauen1> not entirely sure lol
05:16:10 <zid> They run at 10% speed in chrome
05:16:19 <bauen1> they just don't get loaded after a restart of firefox
05:16:22 <zid> setinterval has a higher minimum etc
05:16:50 <notrackmo> 10% is kind of arbitrary, why not 0.01% ?
05:17:05 <zid> because that's essentially 0
05:17:22 <notrackmo> which is closer to ideal than 10% wasted resources
05:17:30 <zid> except who says it's wasted
05:17:33 <notrackmo> me
05:17:36 <notrackmo> and any sane person
05:17:38 <zid> then close the tabs
05:18:16 <notrackmo> who says it needs 10%
05:18:55 <notrackmo> gotta get that ad revenue tho
05:20:30 <zid> Well, you enact that policy and deal with the angry people whose support chat disconnected because js stopped in unfocused tabs
05:20:43 <zid> ads don't even generally use js beyond loading
05:21:08 <ZombieChicken> yeah. I imagine the reason just comes down to "because 10% is actually still doing something" so shit won't always break
05:21:24 <notrackmo> you need 10% to keep a connection alive?
05:21:24 <zid> I can run js at 10x speed, I can't run it at 100000x speed
05:21:42 <notrackmo> maybe it's a winsocks problem
05:21:46 <notrackmo> who knows
05:21:47 <notrackmo> lol
05:21:51 <zid> I just don't think you understand enough to comment
05:21:59 <notrackmo> ooooh
05:22:01 <notrackmo> you got me there
05:22:03 <ZombieChicken> they probably assume (and I'd assume rightfully so) that if a tab is still open, the user wants it to keep doing its thing
05:22:16 <zid> hence <zid> then close the tab
05:22:35 <ZombieChicken> and 10% is the magical value where things seem to Just Work without bogging down the system
05:23:03 <notrackmo> is this a scientific fact?
05:23:18 <ZombieChicken> is anything dealing with browsers that rigorous?
05:23:38 <zid> That you're asking about "scientific facts" betrays your ignorance
05:23:50 <notrackmo> who says i'm ignorant other than you?
05:24:02 <ZombieChicken> this is tech we're talking about here; science means little. Marketing means a lot
05:24:11 <zid> google
05:24:34 <ZombieChicken> zid: What does Bing say about him?
05:24:39 <zid> Not sure what edge does
05:24:47 <zid> firefox also throttles, so mozilla thinks he's stupid too
05:25:07 <zid> he was given supporting evidence, still wants to argue, then comes back with non sequiturs about science and who else thinks he's dumb
05:25:18 <notrackmo> supporting evidence ?
05:25:28 <notrackmo> do you know how sockets work? they don't need 10% cpu to stay connected
05:25:48 <zid> I do, apparently you don't.
05:25:54 <notrackmo> ooooo man
05:25:59 <notrackmo> you're too good at this debating thing
05:26:09 <notrackmo> i must retire in light of such a supreme intellect
05:26:14 <zid> You'd actually have to have debate points for this to be a debate
05:26:22 <ZombieChicken> notrackmo: You'll find there is little 'scientific fact' in computing in the modern world outside of academics and R&D. There is a crazy amount of hand waving, cargo culting, and 'magical values' out there that work Well Enough to avoid having things explode (constantly) in people's faces
05:26:57 <notrackmo> we can measure observe and share results within a certain margin of error still though, right?
05:27:33 <ZombieChicken> 10% was likely a value set by a webdev on their 60th hour of work because it 'felt right' on their dev machine and since no bugs have been filed against it, that is the way of things now
05:27:48 <zid> ZombieChicken: That and javascript is fucking slow.
05:28:09 <ZombieChicken> I don't do any sort of web programming, so I can't really comment there
05:28:18 <zid> If I gave you 0.01% cpu time to run javascript in, even simple json pings will bog it down, much less swapping the content of the web page around after every one
05:28:22 <zid> like for an animated clock, say
05:28:45 <zid> the json responses would get lagged, and you'd ping out
05:28:52 <zid> but I don't know how sockets work so that can't be true
05:29:26 <notrackmo> the fact that you're pinging over JSON instead of using TCP to keep connection alive speaks volumes for your technical skills
05:29:30 <ZombieChicken> all I know about JS is that apparently it's weakly typed, was a shitty (re)implementation of Scheme when it was first (almost literally) thrown together as Netscape, and that I've yet to find anyone who isn't a web'dev' that takes JS seriously
05:29:41 <zid> Yes, it does, in that they're correct
05:29:45 <zid> good luck using bsd sockets from javascript
05:29:52 <zid> which is apparently what you believe happens
05:30:02 <zid> javascript does web requests, not sockets
05:30:04 <ZombieChicken> notrackmo: Probably using JS to handle creating those pings...
05:30:10 <notrackmo> websockets doesn'teven use JSON
05:30:14 <notrackmo> it uses http
05:30:38 <ZombieChicken> notrackmo: You found this channel by googling 'who do write OS', didn't you?
05:30:45 <ZombieChicken> s/who/how
05:30:53 <zid> nah, I think you got it right the first time
07:43:21 <geist> huh. reading more about the cray-1 architecture. it actually had a crapton of vector registers
07:43:39 <geist> v0-v7, each of which has 64 64-bit elements
07:44:18 <geist> so it seems the trick is you can do math like v0 = v1 + v2 (repeated for up to 64 elements), where there's a VL register that specifies the repeat count for subsequent instructions
07:51:53 <bcos> So you'd set VL to 1 million, then do one "add" instruction to pound through 512 MB of data?
07:52:26 <zid> If you could afford the billion dollars of ram first, ofc? :P
07:52:49 <geist> well, no up to 64
07:53:01 <geist> but i guess the point is the vector regs are really 4096 bits wide
07:53:08 <zid> bit like the avx ops with masks then
07:53:09 <geist> 64 elements of 64bits
07:53:25 <zid> where you can provide 11111000 to only do the operation on the top 5 dwords or whatever
07:53:49 <geist> well, it's not a mask. it's a count. it repeats the operations. the ALU itself is not doing it in parallel
07:53:58 <zid> oh it's actually a loop
07:54:23 <geist> yep. there's some wackiness that i dont understand yet. you can have different arguments to the vector ops count differently
07:54:51 <geist> like have inputA sit at a single digit, but inputB counts up
07:54:58 <bcos> Oh ("SIMD but not in parallel") - was thinking it was more like GPU or AVX512
07:55:02 <geist> or have inputA count to 4 and B count up to 64, where A will wrap around
07:55:31 <geist> nah, still way ahead of anything else at the time (1977)
07:55:46 <zid> yea it's still pretty cool
07:55:54 <zid> rep fadd :P
07:56:01 <geist> 80Mhz 2 FPU ops per cycle
07:56:13 <geist> the fact that it was 80mhz in 1977 is already pretty impressive.
07:56:23 <geist> 110 kW to run this thing too...
07:57:28 <geist> i was reading about it the other day, but the HW context switch mechanism is pretty nifty. basically like a simpler, more generic TSS call gate
07:57:57 <geist> as in all irqs and exceptions and monitor <-> user transitions are via one of these
07:58:34 <geist> it's basic model is that it 'exchanges' the current active registers with one in memory, and stores in a hidden register the last one it exchanged with
07:59:23 <geist> so when a process is started the kernel exchanges with the process descriptor, and then once the process is running it has a single user opcode that basicaly lets you return to kernel via exchanging back. it inverts the general flow that you get with modern machines
07:59:30 <zid> didn't it use like, super carcinogenic liquid coolant
07:59:32 <geist> ie, 'monitor' calls user space, user space returns
07:59:59 <zid> nic
08:00:00 <zid> how did I get a t
08:00:24 <geist> a timer?
08:00:57 <geist> interrupts are a forced exchange of whatever context is active with a dedicated one
08:01:21 <geist> key is instead of pushing stuff on a stack somewhere the hardware just does a context switch. i think it said it takes like 39 cycles
08:02:40 <geist> of course a machine like this it's probably a waste of money to run generic OSes on the main processor. i think it may be more common to run some sort of batch processing system on the main processor, and then offload most of the reading stuff from disk, user interaction, etc to secondary processors
08:03:26 <geist> it has some sort of thing like 16 generic io channels which can be connected to different other computer systems. the manual even defines that in your crate when you receive one of these there's a Data General minicomputer that comes with it
08:03:33 <geist> which is the thing you actually bootstrap the system with
08:04:18 <geist> i think there's some mention that channel 0 has some special control lines, perhaps a bit kind of like JTAG, in that the secondary processor can halt the main one and force its state
08:07:29 * CompanionCube assumes https://devblogs.microsoft.com/commandline/announcing-wsl-2/ has already been posted here
08:08:20 <geist> looks like a later MP version in 1982 actually got a unix port called UNICOS
08:08:27 <geist> which became the standard after a while
08:08:46 <geist> CompanionCube: oh neat
08:08:55 <CompanionCube> the interesting bits i'm curious about: "WSL 2 is a new version of the architecture that powers the Windows Subsystem for Linux to run ELF64 Linux binaries on Windows." "WSL 2 uses an entirely new architecture that uses a real Linux kernel."
08:08:56 <heat> CompanionCube: WSL 2: Electric boogaloo?
08:10:04 <geist> aah interesting. so looks like they moved away from attempting to emulate linux syscalls in kernel space
08:10:04 * CompanionCube will be interested to see what kind of paravirtualization(?) is involved here
08:10:15 <geist> they are running a heavily paravirtualized linux kernel
08:10:32 <zid> Not that they really emulated many, as there were no files or anything
08:10:35 <geist> in some way that's less interesting to me, but i guess it's probably more useful
08:10:43 <bauen1> which is interesting, wasn't WSL 1 a syscall compatibility layer thing ?
08:10:50 <geist> zid: that's not true at all, fs stuff worked fine
08:10:55 <geist> it just had limitations
08:11:05 <geist> bauen1: it was
08:11:15 <heat> it was much more than that
08:11:22 <heat> fun video: https://www.youtube.com/watch?v=36Ykla27FIo
08:11:31 <heat> although outdated now (thanks wsl2...)
08:11:37 <geist> yah in some sense i'm sad that they're taking this route, though it's likely to be a better solution
08:12:01 <geist> 'fuck it, linux is too hard to emulate, lets put a linux in there'
08:12:01 <zid> They should licence qemu and just integrate it into the main windows ui
08:12:16 <zid> get some nice backporting going on
08:12:28 <heat> geist: I mean, it's much easier
08:12:29 <geist> which is true. linux's syscall api is huuuuge, even more so if you consider /sys and /proc to be part of the system ABI
08:12:32 <CompanionCube> geist: there's still freebsd and smartos doing the thing though :p
08:12:51 <bauen1> windows is slowly turning into a linux distro
08:13:01 <bauen1> \/s
08:13:13 <geist> it's this topsy-turvy mixed up world we live in
08:13:19 <heat> but honestly, if wine folks would do this it would be amazing
08:13:24 <zid> and a lot of linux syscalls are weird compared to the windows alternatives, so wrapping them might not even really be feasible
08:13:42 <geist> oh totally. i was actually amazed at WSL1's ability to work as well as it did
08:13:43 <zid> if they want to expand compatibility it's sort of required
08:13:50 <geist> i have built stuff with it, and it was pretty okay
08:13:56 <heat> zid: :shrug:, they did keep a unix subsystem for a bunch of years
08:14:07 <geist> this is true
08:14:27 <geist> if i ever get a chance to meet the WSL1 folks i'd definitely have a beer with em
08:15:11 <heat> yeah, totally
08:15:13 <geist> presumably they'll still punch the FS calls through, so that'll probably still be the achilles heel
08:15:35 <geist> since fs performance in NT is just not tuned to the sheer amount of fs ops/sec that linux does
08:15:53 <heat> NT looks like a much better engineered thing than linux honestly
08:16:16 <heat> geist: I mean, they could get an ext4/whateverfs disk image and dynamically resize it
08:16:16 <geist> right. it may be that over the years it has picked up lots of cruft and hacks, but on paper it's a nice design
08:16:31 <heat> and have NT I/O as a separate, slower thing
08:16:34 <geist> heat: yah perhaps they'll do a hybrid where you can choose where it puts your linux stuffs
08:16:51 <geist> and probably still let you get to the native FS via /mnt/c or something
08:17:42 <geist> my experience with arbitrary things like building a lot of source with a lot of fork/execs is that in WSL1 it's about twice as slow as linux on the same machine
08:17:53 <geist> and predictably, pure cpu bound stuff is just as fast
08:18:14 <heat> only twice as slow?
08:18:17 <geist> so presumably in that case almost all the extra time is sys% and is fs and fork/exec behavior
08:18:35 <geist> yah about twice. but then obviously that's workload dependent. but really lots of stats and fork/exec is about as bad as you can get
08:18:45 <geist> i have no idea what the relative network performance is though
08:18:55 <heat> I still have flashbacks of me trying to get into osdev in 2015 using windows and mingw/cygwin and waiting HOURS for binutils to cross-compile
08:18:57 <geist> i'd assume it's not too terrible, since NTs net stack isn't awful
08:19:15 <geist> yes but cygwin is faaaar more of a hack. you should dig into what it does to emulate fork/exec
08:19:21 <geist> it's astonishing that it works at all
08:19:27 <zid> https://randomascii.wordpress.com/2019/04/21/on2-in-createprocess/
08:19:43 * CompanionCube looks at his compile times for binutils...2 minutes
08:19:48 <CompanionCube> can't imagine it taking hours.
08:19:50 <geist> it sits on top of win32 so it's much more of a making lemonade out of lemons
08:20:20 <CompanionCube> even on the old PC it only took 15-20m :p
08:20:49 <zid> "The easiest explanation for why there was so much contention for the lock around CreateProcess would be that CreateProcess was running slowly, and indeed it was taking about 320 ms to execute. "
08:20:51 <zid> notbad
08:21:22 <heat> CompanionCube: oh believe me, that was the worst
08:21:25 <geist> ah interesting, looks like some 'CFG bitmap'
08:21:42 <geist> some new security thing. that's exactly what i sort of expect in this new world as layers of security mitigations get crammed in
08:21:45 <heat> fortunately it didn't put me off osdev'ing
08:22:50 <geist> looks like in this case someone just blatted out some nested N^2 'initialize A by repeatedly searching in B'
08:23:01 <geist> probably assuming that N would never realistically be big
08:23:08 <geist> or more likely just did it and no one cared
08:23:37 <zid> https://randomascii.wordpress.com/2017/07/27/what-is-windows-doing-while-hogging-that-lock/ another cool article of his
08:24:13 <geist> yep, that's a really good one
08:24:24 <zid> don't be upset if createprocess is slow, because when closeprocess is slow your mouse dies, which is one of my favourite parts about windows
08:24:32 <zid> when it lags the UI completely dies so you can't tell if it's frozen
08:24:35 <geist> i even know exactly what hardware he was using: a HP Z840. it was standard high performance workstation at google back then
08:24:52 <zid> My PC is basically a HP Z something
08:25:31 <geist> anyway, this is too close to work and i'm on vacation!
08:25:35 * geist lalas and wanders off
08:25:36 <zid> The cpu I use is an OEM part for HP only used in a certain workstation, Z440 I think
08:27:54 <chrisf> geist: CFG might not have got the level of care it needs
08:28:02 <zid> https://randomascii.wordpress.com/2017/07/09/24-core-cpu-and-i-cant-move-my-mouse/ oh I linked the wrong page earlier, for those who give a shit
08:28:12 <geist> chrisf: yah that's my thought
08:28:26 <clever> zid: lots of great stories there
08:29:18 <geist> though lots of that stuff can strike anywhere. it's a cautionary tale of how monolithic locks can end up scaling really badly in a hurry on many core machines
08:29:40 <geist> you can easily hit points where the contention jumps up exponentially and you will not go to space that day
08:30:28 <heat> >looks at bkl
08:30:51 <geist> that's an obvious one, but so obvious it's not that interesting
08:31:07 <geist> what's more interesting are these sort of hidden, deep layered things that you dont think about until it's suddenly a problem
08:31:09 <zid> yea that's a very intentional "this code is not thread safe, sorry" case
08:31:21 <zid> not "Oh this complicated interaction is hard to resolve"
08:31:34 <_mjg> to be fair a lot of these cases can be easily weeded out
08:31:48 <geist> and it's exactly why you should
08:31:49 <_mjg> to the point where your average user(tm) does not run into them
08:32:09 <chrisf> presumably the windows team does dogfood on beefy machines
08:32:14 <_mjg> until next core count bump
08:32:29 <geist> but you can easily have a lot of distributed responsibility that can plague a team like this
08:32:33 <geist> ie, 'someone else is looking at it'
08:32:50 <geist> s/this/that (freudian slip)
08:32:58 <_mjg> there was this article how nobody fixes this stuff in ms because it only gets them in trouble
08:33:19 <geist> yah. that is the curse of large companies. you almost need a tyrant in charge of the team that pushes folks to a higher standard
08:33:28 <geist> but then that's not really kosher anymore
08:33:29 <zid> metrics
08:33:40 <zid> Who wants to fix a hard bug that takes weeks that nobody'll notice :P
08:33:42 <geist> but even gathering the metrics is a large undertaking
08:34:39 <geist> but, at least big companies have the ability to write the check for the hardware if they can just get themselves organized
08:34:54 <geist> whereas hobby and open source OS stuff is always plagued by not having the hardware to test
08:35:08 <_mjg> there is also the argument that there are bugs already reported and should be taken care of first
08:35:20 <geist> yah
08:35:32 <_mjg> i got slapped with it when fixing "random" bugs i ran into
08:35:49 <geist> i think the only reasonable way to do that is to statically partition some amount of the team so that you're always working a bit on the next thing and scalability, and all that
08:36:20 <geist> otherwise you'll always be chasing the top of the bug list, which will almost always be juicy, user-facing things
08:36:49 <_mjg> there are many simple tricks you can use to smp-ify code well enough for real-world use
08:36:50 <geist> but then that comes back around to what i said first: if you statically partition the team too much 'someone else is working on that'
08:36:58 <_mjg> you can educate teams to keep them in mind
08:37:00 <heat> yeah but I can't think of a juicy kernel bug honestly
08:37:15 <geist> _mjg: i think that's the biggest win really. educate educate educate
08:37:30 <geist> make sure folks are thinking about it
08:37:39 <_mjg> now that's the problem
08:37:46 <_mjg> your typical folks really don't care
08:37:57 <_mjg> i mean, for real, no fucks given
08:38:00 <geist> nor do they have the hardware to test even if they did
08:38:31 <heat> nothing better than doing a really shit job that will affect millions of people worldwide
08:38:34 <geist> i deeply care, but it's basically in my bones. i want the low level thing to run as good as it can. it feels wrong otherwise
08:38:42 <geist> i think i'm a true low level programer
08:38:44 <_mjg> yea
08:38:52 <heat> like seriously, make every algorithm O(n^2)
08:38:59 <heat> hell, O(n!) if you can
08:39:10 <geist> which is part of my mid-career crisis right now, frankly. i want to be a low level programmer, but it's harder to do it nowadays
08:39:27 <_mjg> i don't think they do O(n^2) that often. code copy-pasted from SO often avoids it as it is the one thing people know about
08:39:36 <geist> folks i think appreciate the perspective, but will then 10 minutes later go write terrible code in a high level language
08:40:11 <geist> and there's an undercurrent of 'just learn language X or framework Y and you'll be so much happier!'
08:40:23 <geist> languages and tools dont solve problems. people do
08:40:39 <_mjg> i'm annoyed by people "learning" languages as well
08:40:59 <geist> OTOH good for them, of course. i dont begrudge what they're doing
08:41:00 <_mjg> they learn enough syntax and standard library to express what they need
08:41:12 <zid> The only language I know is C
08:41:18 <_mjg> zid: that's bad
08:41:22 <zid> I could probably code in 20 others
08:41:24 <zid> I only *know* C
08:41:49 <geist> it's the language you dream in, or the language that other stuff gets translated to in your head :)
08:41:54 <zid> hah
08:42:33 <geist> i actually see thinking in C in my head if i ever sit down and try to write code in 8 bit assembly or whatnot
08:42:50 <geist> where code back then in asm was usually much more free form. calling convention (if there is one) is dynamic
08:42:52 <zid> C is the best possible IR :P
08:42:56 <geist> you can branch to the middle of routines, etc
08:43:12 <geist> whereas coming from C you still want to write a bunch of asm routines with a standard calling convention, etc
08:43:16 <zid> yea all my DOS assembly used "Whatever calling convention is most convenient"
08:43:29 <zid> because of the limited regs, mainly, though
08:43:31 <geist> and of course it sort of matters, since instructions aren't free
08:43:44 <zid> there wasn't really a way to formalise 'esi tends to contain the player struct' or whatever, it had to hop around
08:44:31 <geist> yah. thinking in terms of 'high level' languages like C tend to put a pattern on your stuff
08:45:00 <zid> I absolutely construct function bodies in C first in my mind though
08:45:10 <zid> And honestly, that's how they tend to work in practice too
08:45:18 <zid> 'declare some locals (adjust sp), do some work, return'
08:45:32 <zid> I will also comment assembly with C
08:46:10 <geist> and of course nowadays with modern machines intel and amd and arm have met you in the middle
08:46:18 <geist> by tuning their microarches to work that way
08:46:20 <zid> yup!
08:46:27 <heat> huh, weird
08:46:32 * heat doesn't work like that
08:46:32 <geist> call return caches, 'free' stack pointer manipulation
08:46:51 <zid> It also goes in reverse, I never read assembly for RE, I convert it to C
08:47:01 <_mjg> fuck man i'm so annoyed when people claim func calls on x86 are basically free
08:47:03 <zid> hoist all the operations in C operations, and then do transforms on the C as IR
08:47:07 <zid> They're basically free
08:47:08 <_mjg> because optimized
08:47:25 <zid> except on a couple of amd chips where there's an off by one bug
08:47:43 <zid> the branch predictor doesn't get the right return address inserted for the ret, it gets one frame off :P
08:48:00 <geist> _mjg: not all code needs to be equally blindingly fast
08:48:02 <zid> so it fetches the wrong instructions to decode then tosses them out
08:48:10 <heat> I mean technically if you just consider modern CPUs everything is basically free
08:48:36 <heat> except floating point ops, those aren't really free
08:48:36 <geist> there's just different levels of free. it's a big sea of free
08:48:37 <zid> On my cpu, every time you call, it remembers the return address, and then inserts the instructions for that return location into the instruction stream
08:48:59 <zid> so to the cpu it doesn't look like "call f; add rax, 5; f: blah; ret" It looks like call f; blah; add, rax 5
08:49:01 <zid> if that makes sense
08:49:13 <zid> it knows when it sees the ret where to fetch from already
08:49:18 <zid> 16 levels deep afaik
08:49:22 <geist> yah, that's how a lot of pipelined machines deal with branches
08:49:29 <heat> zid, what if you never ret?
08:49:36 <geist> they fold out the branch itself if they can compute it in the branch predictor
08:49:37 <_mjg> geist: no, but most of the time the code "not blindingly fast" is retardedly slow
08:49:46 <geist> _mjg: there is looots of grey area
08:49:49 <zid> heat: Then the buffer stays full? Either you branch another 16+ times deep and it falls out of the buffer, or it never matters
08:50:07 <_mjg> need to call a func? call a func. you can, but there is no need? don't.
08:50:09 <geist> anyway, i got some work to do around the house this afternoon though
08:50:22 <geist> so i gotta bail. pick it up some other time
08:50:40 <_mjg> well perf flamewars always end the same: everyone thinks they got the right approach
08:50:41 <heat> _mjg: I mean calling is better than inlining most of the time
08:51:37 <_mjg> i was after cases where executions of to be called code are spurious
08:51:55 <_mjg> calling vs inlining is another flamewar
08:54:24 <stisl> can anybody make a short codereview for me? https://pastebin.com/EN61f5GQ
08:54:37 <zid> sure
08:54:52 <zid> why forward declare update_pml4 instead of just moving it above mmap
08:55:05 <zid> newline on line 5 please
08:55:26 <zid> missing space on 19
08:55:44 <zid> This looks eerily similar to my map function
08:55:50 <zid> but I have a better cast
08:56:12 * bcos wonders why all the "size_t"
08:56:33 <zid> your loop.. are you sure you're not going to try write past the end of a PD with that?
08:56:36 <zid> or PT, whatever
08:56:42 <bcos> Eww - and never treat physical addresses as if they're pointers - they need to be a something like "unint64_t"
08:58:05 <zid> https://github.com/zid/boros/blob/master/boot/main.c#L95 Double casts baby :P
08:58:55 <stisl> zid, thanks
08:59:31 <bcos> zid: "u32 paddr"? WTF is wrong with you people?
08:59:46 <zid> hmm?
09:00:02 <zid> bc
09:00:03 <stisl> exactly the same I also don't understand
09:00:03 <graphitemaster> geist, I know you're on vacation, but this is for you https://cdn.discordapp.com/attachments/256212041390489601/538105781204025344/cursed.webm
09:00:10 <graphitemaster> geist, you will enjoy this piece of history
09:00:36 <pterp> I'm back. Couple things i forgot to mention are that when i say "kernel yields", that's a separate task with only kernel mappings. Also, interrupts have a stack separate from the kernel task's stack. (esp0 in TSS is not the same as the main kernel stack)
09:00:36 <bcos> zid: Physical addresses have been 36-bit or larger for 2+ decades, and you're squishing them into 32 bit integers?
09:00:52 <zid> bcos: My kernel is not loaded to >4GB
09:01:05 <zid> if it is, bochs is very naughty
09:02:28 <bcos> If your "map_page()" is only ever intended to be used by (a subset of) kernel; then it needs to make that clear (e.g. "map_page_borked_junk_for_silly_special_case_and_not_usable_for_almost_everything()")
09:03:00 <zid> it's not used by anyone but this code, it should actually be static
09:03:50 <stisl> my problem is that it looks like that I have not the newest data in the virtualaddress
09:04:14 <stisl> or that I cannot translate it back to a physical address
09:05:02 <stisl> for my framebuffer it works, but I don't know why
09:05:11 <stisl> for the kernel which should be loaded it doesn't work
09:08:19 <stisl> maybe the problem is somewhere else - I don't know
09:11:23 <bcos> stisl: If "pages = r / 0x1000" is > 1 you might need two or more new PML4s, PML3s or PML2s
09:11:31 <zid> which I also mentioned
09:11:42 <bcos> D'oh, OK
09:11:51 <zid> your description is maybe more understandable though
09:15:04 <stisl> I will check this bcos
09:15:07 <bcos> I'd also worry about anything not being aligned on a page boundary - e.g. if someone does "mmap(123, 456, 789, flags)" it'd cause all kinds of fun
09:15:41 <geist> oh this is fun. for lulz i got a little 20W solar panel, a charge controller and an inverter
09:15:55 <geist> charging the laptop on the patio with the SUN
09:16:03 <stisl> hmm, that could be an issue
09:16:49 * bcos would be tempted to "if( (physical | virtual | length) & 0xFFF != 0) { return ERR_dont_do_that; }"
09:17:18 <stisl> thanks I will make some assertions I think
09:19:14 <pterp> When you pass a structure directly into an assembly function, how are the elements pushed on the stack?
09:19:45 <bcos> pterp: How big is your structure (and which calling convention)?
09:20:44 <pterp> GCC x86 convention, 64 byte structure
09:21:07 <stisl> does the compiler generate a lot of copy code when the structure is for example bigger than 64 bit?
09:21:23 <zid> oh x86 it generally generates a hidden pointer param and fakes it
09:21:59 <pterp> Essentially passes &my_struct?
09:22:06 * bcos wonders if the compiler is smart enough to use pointer if the passed struct isn't modified - would assume not
09:22:52 <geist> it's less of that and what precisely the ABI says
09:22:59 <zid> depends if it's static or not geist :P
09:23:03 <geist> the compiler cannot (unless it's private to itself) violate those rules
09:23:15 <geist> so you should consult the ABI
09:23:28 <pterp> Where would the x86 ABi be?
09:23:32 <bcos> (e.g. if you do "int foo(mystruct bar) { bar.baz = 123; return 0; }" then it should only modify the local copy and not the caller's original so..)
09:23:33 <zid> I often pass internal structs around by value, knowing the compiler will cheat its ass off for me ;)
09:24:20 <geist> the rules tell you precisely what it can do. it can only relax them when it does its own escape analysis and knows nothing external can call it
09:24:47 <geist> bcos: in this case i am fairly certain that x86-64 abi says the caller made a copy that it passed a pointer to
09:24:51 <geist> but you'd have to look at the ABI for that
09:25:10 <pterp> I'm using 32-bit, not 64-bit.
09:25:11 <zid> amd64 has an entirely different convention where it'll split it amongst a bunch of regs if it wants to
09:25:32 <zid> and not even in an obvious way, because if you have like {int a; float b; int c;} it'll do like rsi = a, rdi = c, xmm0 = b
09:25:34 <bcos> Myself; I just never pass structures
09:25:53 <bcos> (always "pointer to structure" instead - I don't want any copying going on)
09:26:20 <geist> zid: not sure about that, i am fairly certain the x86-64 abi says it'll only pass up to 2 full 64bit words worth. though could be it only does it for all integer or all float
09:26:33 <zid> bcos: struct pair{} type things I often do by value
09:26:54 <geist> where it really gets handy is returning structs. you can at least with x86-64 and arm64 return up to two words of struct
09:27:00 <zid> Just so I can return 'two things' is the entire point of the struct in the first place
09:27:03 <geist> so it's actually pretty 'free' to return a { uint64; uint64; }
09:27:10 <geist> since there are two full return registers
09:27:16 <zid> yea rdx:rax gets it I assume?
09:27:16 <pterp> Hmm. All the SysV i386 supplement says is "structure and union arguments are pushed onto the stack in the same manner as integral arguments". What order are the values pushed? Last value on top? First value on top?
09:27:19 <geist> handy for error + value style return things
09:27:39 <geist> pterp: last first, so that it ends up in memory order on the stack
09:27:56 <geist> so in some sense x86-32 constructs the struct on the stack
09:29:09 <geist> and in general yes, last args first so that it ends up being that lower memory on the stack has the first args
09:29:32 <geist> varargs uses this to great effect, it ends up pushing essentially an array of values on the stack and the callee can just increment a pointer and read them off
11:31:21 <drakonis_> https://devblogs.microsoft.com/commandline/announcing-wsl-2/
11:31:22 <drakonis_> lol
11:31:39 <drakonis_> presented without comment
11:34:36 <_mjg> drakonis_: https://github.com/Microsoft/WSL/issues/873#issuecomment-425272829
11:36:43 <drakonis_> i'm aware
11:37:32 <drakonis_> it is funny
11:37:59 <drakonis_> we're inching closer to complete linux domination
11:38:03 <geist> what, people arguing on the internet?
11:38:35 <drakonis_> microsoft ships a linux kernel on wsl, no syscall translation
11:38:43 <drakonis_> its funny to behold
11:39:07 <_mjg> they are going to spawn a vm to run linux stuff
11:39:07 <drakonis_> their fix for the perf problems
11:39:26 <_mjg> wait a few years and it will be the other way around: windows vms for compatibility
11:39:43 <drakonis_> reactos lives
11:39:45 <_mjg> curious if wine works under this
11:39:46 <geist> oh i see, someone said something funny and/or insightful during an argument on the internet
11:40:15 <zid> no being funny!
11:40:41 <geist> funny bad!
11:42:39 <_mjg> i remember old flamewars, someone was adamant that linux is using windows code as it takes too much effort to write stuff from scratch
11:42:59 <drakonis_> lol
11:43:47 <zid> because windows didn't write it either
11:43:52 <zid> it was handed down as gospel from the heavens
11:44:36 <_mjg> cmpq $0x50, %rdx /* 80 */
11:44:39 <_mjg> fuck this comment
11:44:45 <zid> that's a good comment
11:44:51 <zid> except for that it isn't
11:44:56 <bcos> _mjg: I heard Microsoft used Windows code because it was too much effort to write Windows from scratch..
11:45:03 <zid> It's good enough
11:45:32 <_mjg> bcos: i hear they just kept modifying it
11:45:34 <zid> bcos: I heard microsoft used linux code they stole from windows because it was too much effort to write Windows from scratch
11:45:59 <bcos> I did hear that Microsoft slapped an entire Linux kernel into WSL...
11:46:17 <geist> nawww. that be crazy
11:46:17 <_mjg> more like they reworked wsl to just spawn a vm
11:46:29 <geist> you be on crack yo
11:46:36 <_mjg> acid
11:46:42 <_mjg> and what's wrong with that
11:46:46 <geist> in fact it's rather warm outside. i do not think hell hath frozen
11:47:13 <zid> I'd love UI integrated qemu
11:47:18 <_mjg> DEVELOPERS DEVELOPERS DEVELOPERS DEVELOPERS
11:47:23 <drakonis_> soon
11:47:23 <zid> I suppose you could do it trivially with X
11:47:45 <geist> oh it was such a simpler time to just hate on M$
11:47:46 <zid> Run X on the desktop, have the VM client render over ssh from the VM
11:47:55 <geist> we channeled all our hate towards one entity
11:47:59 <zid> maybe I should do that for my OS :P
11:48:12 <_mjg> oracle is not worth hating on
11:48:17 <geist> yah exactly
11:48:34 <geist> plenty of douchebag CEOs to hate on
11:48:57 <_mjg> ouch why are you calling out my CEO
11:49:53 <geist> well, sadly i think it generally is a positive attribute to be at least a bit of a douchebag in the current CEO market
11:50:30 <_mjg> i'm confident being a douchebag is necessary to hold a lead in the climbing-the-ladder war
11:50:33 <_mjg> for a time
11:50:42 <drakonis_> https://imgur.com/YKtH9vO
11:50:48 <drakonis_> its real
11:51:07 <zid> that penguin's feet haunt me
11:51:31 <_mjg> perhaps rms tribute