Search logs:

channel logs for 2004 - 2010 are archived at http://tunes.org/~nef/logs/old/ (can't be searched)

#osdev2 = #osdev @ Libera from 23may2021 to present

#osdev @ OPN/FreeNode from 3apr2001 to 23may2021

all other channels are on OPN/FreeNode from 2004 to present


http://bespin.org/~qz/search/?view=1&c=osdev2&y=21&m=11&d=6

Saturday, 6 November 2021

09:53:00 <GeDaMo> https://www.cnet.com/news/the-untold-story-behind-apples-13000-operating-system/
09:53:00 <bslsk05> ​www.cnet.com: The untold story behind Apple's $13,000 operating system - CNET
14:02:00 * wikan says hi to everybody
14:03:00 <ThinkT510> greetings
14:06:00 <wikan> how do people write their own OSes? Where do they usually get their knowledge from?
14:06:00 <wikan> i feel I need a strong theory first
14:07:00 <ThinkT510> the topic contains a wiki link
14:08:00 <ThinkT510> any particular language you're interested in developing in?
14:08:00 <wikan> well I finished my IT career so I decided to start with C and finish with my own language
14:09:00 <wikan> start with C because it is already ready to use ;)
14:09:00 <ThinkT510> you made your own language?
14:09:00 <wikan> great to learn os dev
14:09:00 <wikan> i have some os ideas and if it is possible to implement I will need my own language
14:10:00 <ThinkT510> this one is rather interesting: http://www.projectoberon.com/
14:10:00 <bslsk05> ​www.projectoberon.com: Home
14:11:00 <wikan> thank you, it looks interesting
14:13:00 <wikan> well, i decided to write my own libfuse driver first, because i want my own filesystem
14:13:00 <ThinkT510> I also like it when a language tries to improve itself based on lessons it learns trying to make an OS. here is one example in V: https://github.com/vlang/vinix
14:13:00 <bslsk05> ​vlang/vinix - Vinix is an effort to write a modern, fast, and useful operating system in the V programming language (60 forks/915 stargazers/GPL-2.0)
14:14:00 <wikan> thanks, i like your links ;)
14:14:00 <wikan> never heard about v
14:17:00 <ThinkT510> This one is rust specific but the author goes into a lot of generic background information: https://os.phil-opp.com/
14:17:00 <bslsk05> ​os.phil-opp.com: Writing an OS in Rust
14:18:00 <wikan> the best thing for me is as much information as possible about everything I have to master to be able to bring my own idea to life
14:19:00 <wikan> i must collect puzzles first
14:19:00 <wikan> because I have no idea even how to start, but the most important thing for me is to not copy blindly without understanding the details
14:20:00 <wikan> this is why I start with fuse for linux.
14:20:00 <wikan> will understand and get a little practice with the VFS concept
14:21:00 <wikan> and of course it will help me to copy files to my own partitions of my own os in the future :)
14:22:00 <wikan> or will not if I didn't understand fuse correctly
14:22:00 <ThinkT510> implementing a filesystem is one piece of the puzzle
14:23:00 <wikan> yea, but not sure if fuse gives me enough flexibility
14:23:00 <wikan> like custom fs attributes
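A minimal libfuse 3 skeleton, roughly the shape of driver wikan is describing, might look like the sketch below. The "/hello" file, its contents and the build line are illustrative placeholders, not anything from the discussion.

    /* Sketch only: one read-only file served through libfuse 3.
       Build (assuming libfuse 3 is installed):
         gcc hello.c -o hello `pkg-config fuse3 --cflags --libs`
       Run:  ./hello /mnt/point   */
    #define FUSE_USE_VERSION 31
    #include <fuse.h>
    #include <errno.h>
    #include <string.h>
    #include <sys/stat.h>

    static const char *msg = "hello from a toy fuse fs\n";   /* placeholder data */

    static int fs_getattr(const char *path, struct stat *st,
                          struct fuse_file_info *fi)
    {
        (void)fi;
        memset(st, 0, sizeof(*st));
        if (strcmp(path, "/") == 0) {
            st->st_mode = S_IFDIR | 0755;
            st->st_nlink = 2;
            return 0;
        }
        if (strcmp(path, "/hello") == 0) {
            st->st_mode = S_IFREG | 0444;
            st->st_nlink = 1;
            st->st_size = (off_t)strlen(msg);
            return 0;
        }
        return -ENOENT;
    }

    static int fs_readdir(const char *path, void *buf, fuse_fill_dir_t fill,
                          off_t off, struct fuse_file_info *fi,
                          enum fuse_readdir_flags flags)
    {
        (void)off; (void)fi; (void)flags;
        if (strcmp(path, "/") != 0)
            return -ENOENT;
        fill(buf, ".", NULL, 0, 0);
        fill(buf, "..", NULL, 0, 0);
        fill(buf, "hello", NULL, 0, 0);
        return 0;
    }

    static int fs_read(const char *path, char *buf, size_t size, off_t off,
                       struct fuse_file_info *fi)
    {
        (void)fi;
        size_t len = strlen(msg);
        if (strcmp(path, "/hello") != 0)
            return -ENOENT;
        if ((size_t)off >= len)
            return 0;
        if (off + size > len)
            size = len - off;
        memcpy(buf, msg + off, size);
        return (int)size;
    }

    static const struct fuse_operations ops = {
        .getattr = fs_getattr,
        .readdir = fs_readdir,
        .read    = fs_read,
    };

    int main(int argc, char *argv[])
    {
        return fuse_main(argc, argv, &ops, NULL);
    }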
14:27:00 <ThinkT510> your design considerations will certainly have a knock on effect on the rest of the system. monolithic kernel vs microkernel, realtime, specific vs generic, embedded etc.
14:29:00 <wikan> well and it is extremely hard for me because I want a micro- or nanokernel
14:29:00 <wikan> i think it is much harder to write
14:29:00 <ThinkT510> I am a fan of the microkernel design
14:30:00 <ThinkT510> this might be a good inspiration for you: https://github.com/managarm/managarm
14:30:00 <bslsk05> ​managarm/managarm - Pragmatic microkernel-based OS with fully asynchronous I/O (40 forks/737 stargazers/MIT)
14:32:00 <wikan> nice
15:55:00 <mjg> https://i.imgur.com/GpfsD4m.mp4
15:55:00 <bslsk05> ​'Imgur' by [idk] (--:--:--)
19:59:00 <vin> When would read be better than mmap?
20:01:00 <gog> when the descriptor can't be mapped for some reason
20:01:00 <gog> like a socket
20:01:00 <GeDaMo> Uh ... when the 'm' key doesn't work on your keyboard? :P
20:02:00 <vin> lol
20:02:00 <vin> when would read *perform* better than mmap though?
20:02:00 <gog> probably never
20:03:00 <junon> I've read that read is faster than MMAP in the case of sequential reads of block size
20:03:00 <junon> I don't know if that's true or not
20:03:00 <junon> Unless you mean a single call, in which case, it wouldn't.
20:03:00 <junon> Probably.
20:04:00 <zid> read is pretty well optimized for sequential
20:04:00 <zid> mmap's nice if you care about cache and stuff
20:05:00 <vin> what is "well optimized" zid? what more can read do than prefetch the next block in sequential access? I am sure linux prefetches blocks on sequential mmap accesses as well.
20:05:00 <zid> You're going to eat a lot of demand faults with mmap, compared to read, but once it's faulted in it should be nice and fast
20:05:00 <zid> vin: 0 copy shenanigans, no page faults, etc
20:06:00 <mjg> it really depends on the access pattern
20:06:00 <vin> So reading a small file sequentially would be faster with read? Just pay the syscall overhead but avoid page faults?
20:06:00 <mjg> one key point with mmap is that it costs to set it up
20:06:00 <mjg> and most notably if you have multiple threads you will be suffering from it to some extent
20:06:00 <zid> mmap's going to churn your TLB, cause a fuck lot of page faults, and a bunch of time setting up the page tables etc
20:06:00 <zid> it's basically read + brk combined
20:06:00 <zid> read is just read
20:07:00 <klange> mmap is good if you don't know what you want ahead of time
20:07:00 <zid> mmap's perfect for "My gap has a 1GB file it keeps in memory and uses random bits of it all the time"
20:07:00 <zid> gap? game.
20:07:00 <junon> I think a better question is, read+seek vs mmap
20:07:00 <junon> Also mmap is inherently thread safe, is it not? whereas a read on an FD is not.
20:07:00 <vin> junon: then wouldn't mmap be always better than read+seek
20:08:00 <klange> probably something you could do a bunch of benchmarking for on different kernels/platforms/disks...
20:08:00 <junon> I suppose I've never considered the thread safety of an mmapped region but I can't imagine it having issues.
20:08:00 <junon> Yes
20:08:00 <junon> vin: again it probably depends
20:08:00 <junon> Also read on io_uring (on linux) from the FS is going to be faster than read syscalls.
20:09:00 <junon> as io_uring was designed specifically for FS performance at facebook then was expanded out, IIUC
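For what it's worth, a single buffered read submitted through liburing (the userspace helper library for io_uring) looks roughly like the sketch below; the file name, queue depth and buffer size are arbitrary placeholders, and the program links with -luring.

    /* Sketch: one read() equivalent submitted through io_uring via liburing. */
    #include <fcntl.h>
    #include <liburing.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(void)
    {
        struct io_uring ring;
        if (io_uring_queue_init(8, &ring, 0) < 0) {     /* queue depth 8 is arbitrary */
            fprintf(stderr, "io_uring_queue_init failed\n");
            return 1;
        }

        int fd = open("somefile", O_RDONLY);            /* placeholder file */
        if (fd < 0) { perror("open"); return 1; }
        char *buf = malloc(4096);

        /* Queue one 4096-byte read at offset 0, submit, then wait for it. */
        struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
        io_uring_prep_read(sqe, fd, buf, 4096, 0);
        io_uring_submit(&ring);

        struct io_uring_cqe *cqe;
        io_uring_wait_cqe(&ring, &cqe);
        printf("read returned %d\n", cqe->res);
        io_uring_cqe_seen(&ring, cqe);

        io_uring_queue_exit(&ring);
        free(buf);
        close(fd);
        return 0;
    }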
20:13:00 <vin> Wait, both read and mmap need to copy from a kernel buffer to a userspace buffer, correct? Even with io_uring.
20:13:00 <vin> So a kernel bypass read should be better than io_uring? like spdk I guess
20:15:00 <klange> No, there doesn't need to be any buffer copying in a read if the conditions are right. Nicely page-and-sector-aligned reads can easily be zero-copy...
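One way to do that benchmarking on Linux is to time both paths over the same file; a rough sketch, with a placeholder file name, is below. Note that the read() pass warms the page cache for the mmap() pass, so a fair comparison needs the cache dropped between runs or separate invocations.

    /* Sketch: time sequential read() vs mmap() over the same file.
       "testfile" is a placeholder; results depend on kernel, FS, disk and
       page cache state, as noted above. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <time.h>
    #include <unistd.h>

    static double now(void)
    {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec + ts.tv_nsec / 1e9;
    }

    int main(void)
    {
        struct stat st;
        int fd = open("testfile", O_RDONLY);
        if (fd < 0 || fstat(fd, &st) < 0) { perror("testfile"); return 1; }

        /* Path 1: plain read() into a reusable 1 MiB buffer. */
        char *buf = malloc(1 << 20);
        unsigned long sum = 0;
        ssize_t n;
        double t0 = now();
        while ((n = read(fd, buf, 1 << 20)) > 0)
            for (ssize_t i = 0; i < n; i++) sum += (unsigned char)buf[i];
        double t1 = now();

        /* Path 2: mmap() and touch every byte (each new page demand-faults). */
        unsigned char *map = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (map == MAP_FAILED) { perror("mmap"); return 1; }
        double t2 = now();
        for (off_t i = 0; i < st.st_size; i++) sum += map[i];
        double t3 = now();

        printf("read: %.3fs  mmap: %.3fs  (checksum %lu)\n", t1 - t0, t3 - t2, sum);
        munmap(map, st.st_size);
        free(buf);
        close(fd);
        return 0;
    }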
20:29:00 <graphitemaster> Self plug: I updated my incbin header hack, it now does more, like text and works on Harvard architectures
20:29:00 <graphitemaster> https://github.com/graphitemaster/incbin
20:29:00 <bslsk05> ​graphitemaster/incbin - Include binary files in C/C++ (50 forks/578 stargazers/Unlicense)
20:30:00 <graphitemaster> I'm sure osdev people have uses for it XD
20:53:00 <sortie> https://sortix.org/man/man1/carray.1.html is my solution
20:53:00 <bslsk05> ​sortix.org: carray(1)
21:10:00 <graphitemaster> sortie, Yeah but that requires running a separate external tool, which is sadge
21:10:00 <zid> bin2o not being part of binutils is sadge
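The underlying trick in graphitemaster's header is the assembler's .incbin directive; a stripped-down, hand-rolled version for GCC/Clang on ELF targets might look like the sketch below ("blob.bin" and the symbol names are placeholders). carray takes the other route and generates a plain C array as source.

    /* Sketch: embed a binary file at build time via the assembler's .incbin
       directive. GCC/Clang, ELF targets; "blob.bin" must exist at compile time. */
    __asm__(".section .rodata\n"
            ".global blob_start\n"
            "blob_start:\n"
            ".incbin \"blob.bin\"\n"
            ".global blob_end\n"
            "blob_end:\n"
            ".previous\n");

    extern const unsigned char blob_start[];
    extern const unsigned char blob_end[];

    #include <stdio.h>

    int main(void)
    {
        printf("embedded %zu bytes\n", (size_t)(blob_end - blob_start));
        return 0;
    }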
21:18:00 <graphitemaster> sortie, your documentation on your site still mentions freenode! GASP
21:18:00 <sortie> graphitemaster, guess I just didn't regenerate that html file yet
21:42:00 <heat> re: read/mmap, mmap doesn't copy data, nor allocates memory, unless it needs to, which is only when COWing
21:43:00 <heat> with mmap you never actually allocate any pages beyond those in the inode's page cache unless there's actually a need for duplication, and because of that, you don't need to copy pages too
21:44:00 <heat> that's why it can be faster than read, especially since the kernel can do readahead pretty nicely when it sees you accessing the mmap region
21:45:00 <heat> oh and if the kernel starts to need to reclaim memory, in the read case it will write your file's data to swap, since it's anonymous memory; with mmap, it only writes back if there are dirty pages and there's no swap involved
22:02:00 <geist> it's all about the setup overhead of mmap vs the gains you get from 'zerocopy'
22:02:00 <geist> if it's a one time read it's probably not faster to mmap a thing and then read/write it
22:03:00 <geist> but if you are going to continually access it then it probably ends up being a win
22:04:00 <geist> also this is all assuming the OS is about as smart as it can be with this stuff
22:04:00 <geist> a map of a huge file that you then sequentially read is possibly pretty fast if the OS is going to observe what you're doing and start pre-fetching pages as you read through it and try to pave in front of your cursor
22:04:00 <geist> and thus making the overhead of the demand faults minimized
22:05:00 <geist> OTOH if you're just reading a file in then it's the same number of copies to go through a read() call, since the kernel is most likely going to just map the same backing pages on a mmap() as what it'd internally memcpy out to user space
22:05:00 <clever> i can see how read-ahead with mmap would perform faster than read-ahead with read(), even ignoring the copy costs
22:06:00 <geist> *potentially*, but that's only if the kernel really sees whats going on and gets ahead of the cursor, which it probably cant
22:06:00 <clever> the kernel could read-ahead with a second core, and fill the paging tables in ahead, so your thread never does a single context switch
22:06:00 <clever> while read() has to both context switch AND also copy
22:06:00 <geist> i doubt it can keep up. the page table overhead is immense vs the speed a cpu can read through stuff
22:06:00 <geist> this is of course assuming the file is already completely brought in (or it's some shared memory object that's already populated)
22:07:00 <clever> yeah, it would rely on you reading at less than 100% bus capacity
22:07:00 <clever> like doing compute as you read
22:07:00 <geist> right
22:07:00 <geist> but i think in the case where you're mapping in something you're hitting randomly and continually, makes a lot of sense
22:07:00 <geist> cache files, font files, databases, etc
22:08:00 <clever> i use rtorrent a lot, and its mmap based
22:08:00 <clever> and torrents are a very random load
22:08:00 <geist> oh yeah totally
22:08:00 <clever> but until you're closing things, you dont need to sync much
22:08:00 <clever> so the kernel can just flush the dirty data whenever it wants, and you dont really block on IO
22:10:00 <geist> but yeah everyone here has repeated the same thing with different variants: it depends and the setup is hard
22:10:00 <geist> this is a thing we see a lot in zircon actually, since the VM lets you do pretty much whatever you want by tossing a bunch of objects and tools at your feet
22:11:00 <geist> and so folks have been building all sorts of things out of it, some of which may or may not be fast or efficient
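On Linux, the usual way to help the kernel "pave in front of your cursor" for an mmap'd file is madvise(); a sketch, with a placeholder file name and Linux/BSD-style advice flags, might look like this.

    /* Sketch: hint the kernel about an mmap'd file's access pattern so it can
       enlarge readahead and prefetch ahead of the faulting cursor.
       "bigfile" is a placeholder. */
    #define _DEFAULT_SOURCE
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        struct stat st;
        int fd = open("bigfile", O_RDONLY);
        if (fd < 0 || fstat(fd, &st) < 0) { perror("bigfile"); return 1; }

        unsigned char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        /* We intend to scan this sequentially: readahead can run ahead of us. */
        madvise(p, st.st_size, MADV_SEQUENTIAL);
        /* Or, for a region we know we'll hit soon (even randomly), prefetch now: */
        madvise(p, st.st_size, MADV_WILLNEED);

        unsigned long sum = 0;
        for (off_t i = 0; i < st.st_size; i++)
            sum += p[i];
        printf("checksum %lu\n", sum);

        munmap(p, st.st_size);
        close(fd);
        return 0;
    }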
22:11:00 <heat> geist, I don't believe linux can do zero-copy read()
22:11:00 <geist> heat: i dont either, since by definition it's a copy
22:11:00 <clever> geist: does the TLB store negative results, like unmapped memory?
22:11:00 <heat> it can barely even do zero copy networking (send)
22:11:00 <geist> i consider the 'kernel moves data from internal buffer to user space' to be a copy
22:12:00 <geist> clever: negative. that's a strong design requirement in pretty much all arches i've seen: no negative TLBs
22:12:00 <clever> ah
22:12:00 <clever> so you can just fill the page table in as you read from disk, and not have to flush anything
22:13:00 <geist> there is even a fairly complex dance of precisely the state of a TLB when a cpu takes a page fault of various types. ie, did it just store that it couldn't do a TLB operation? did it flush the entry if it was a permission fault? etc
22:14:00 <geist> heat: if you dont count the actual user copyin/copyout then read() should be zero copy since the kernel can basically move directly from the backing page cache
22:14:00 <heat> ah
22:15:00 <heat> well you have a copy, I wasn't counting that as zero copy :)
22:15:00 <geist> but, mmap can be zero copy in the sense that you can directly manipulate the page cache. though at that point you have to consider whether or not a memcpy inside your application is a copy or not
22:15:00 <geist> if you mmap a file and then directly utilize the mapping without making a private copy of the data, say for an image file, then that's definitely less copies than a read()
22:15:00 <clever> i was investigating memory usage of wine many many years ago, and noticed that a chunk of the .exe file was anon memory, not mmap'd
22:16:00 <clever> as i dug more, i found that the LOAD commands in the .exe, wanted a chunk of the file to be mis-aligned in ram
22:16:00 <geist> clever: probably like in ELF where you either COW the .data segment, or you map in an anon chunk for bss
22:16:00 <geist> ah
22:16:00 <geist> yeah if it's misaligned the loader either has to give up or make an anon mapping and copy
22:16:00 <heat> oh no, misalignment D:
22:16:00 <clever> and you cant share the read-cache with the mmap, if they have different alignments
22:16:00 <clever> exactly
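A sketch of the loader decision geist describes: a segment can be mapped straight from the file only when its file offset and virtual address agree modulo the page size; otherwise the loader falls back to anonymous pages plus a copy, which is what shows up as anon memory in the wine case. The function and parameter names below are made up for illustration.

    /* Sketch: map 'seg_len' bytes from file offset 'seg_off' so they appear at
       virtual address 'seg_vaddr', the way an ELF/PE-style loader would. */
    #define _DEFAULT_SOURCE
    #include <stdint.h>
    #include <sys/mman.h>
    #include <unistd.h>

    void *map_segment(int fd, off_t seg_off, uintptr_t seg_vaddr, size_t seg_len)
    {
        size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);
        size_t slack  = seg_vaddr % pagesz;

        if ((size_t)(seg_off % (off_t)pagesz) == slack) {
            /* Offsets agree modulo the page size: the page cache pages can be
               shared (copy-on-write), no duplication needed. */
            return mmap((void *)(seg_vaddr - slack), seg_len + slack,
                        PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_FIXED,
                        fd, seg_off - (off_t)slack);
        }

        /* Misaligned: sharing with the read cache is impossible, so fall back
           to anonymous memory and copy the bytes in by hand. */
        void *p = mmap((void *)(seg_vaddr - slack), seg_len + slack,
                       PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
        if (p == MAP_FAILED)
            return MAP_FAILED;
        if (pread(fd, (void *)seg_vaddr, seg_len, seg_off) < 0)
            return MAP_FAILED;
        return p;
    }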
22:18:00 <heat> i've always thought about a (userspace) filesystem driver that just mmap'd the whole disk and read everything like regular memory
22:18:00 <heat> seems like a fun experiment
22:18:00 <clever> heat: definitely needs 64bit there!
22:18:00 <heat> yupp
22:19:00 <clever> and with 48bit virtual size, thats 256tb of address space
22:19:00 <clever> so as long as LVM isnt fusing devices into a larger unit, you shouldnt have any issues mapping the entire disk
22:19:00 <geist> heat: there's kinda something like that we're working with on fuchsia
22:19:00 <clever> but you may have issues mapping several disks in an array like zfs
22:19:00 <geist> since FS drivers are user space, and the FS drivers also act as a user pager
22:19:00 <geist> for stuff like metadata you can create a VMO that represents the entire disk
22:20:00 <geist> and then demand fault them in and use the user pager interface to let the kernel do writebacks and whatnot
22:20:00 <heat> the issue is that you get lots of page duplication between the disk's cache and the file's cache
22:20:00 <geist> also i think that's a similar interface to talk to block devices: you can have a VMO that represents the entire block device and then shuffle pages around with it
22:20:00 <geist> that *is* the disk cache
22:21:00 <heat> unless you're lucky enough that the partition is page aligned and the blocks are page-sized
22:21:00 <geist> oh sure. yeah if you didn't then that's bad
22:21:00 <geist> we at least have the advantage that we can make sure that is the case
22:21:00 <clever> heat: fdisk tries to make partitions 1mb aligned, to future-proof things like that
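A bare-bones version of the whole-disk mapping heat describes could, on Linux, be as simple as the sketch below; /dev/sdb is a placeholder, the ioctl is Linux-specific, and as clever notes it presumes a 64-bit address space (plus enough privileges to open the device).

    /* Sketch: map an entire block device read-only and walk it like memory,
       the way a userspace FS driver experiment might. */
    #include <fcntl.h>
    #include <linux/fs.h>      /* BLKGETSIZE64 */
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/dev/sdb", O_RDONLY);            /* placeholder device */
        if (fd < 0) { perror("/dev/sdb"); return 1; }

        uint64_t size = 0;
        if (ioctl(fd, BLKGETSIZE64, &size) < 0) { perror("BLKGETSIZE64"); return 1; }

        /* One flat read-only window over the whole device: on-disk structures
           can then be followed as ordinary offsets into 'disk'. */
        const uint8_t *disk = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
        if (disk == MAP_FAILED) { perror("mmap"); return 1; }

        /* Example: peek at the boot signature in sector 0. */
        printf("device is %llu bytes, boot signature %02x %02x\n",
               (unsigned long long)size, disk[510], disk[511]);

        munmap((void *)disk, size);
        close(fd);
        return 0;
    }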
22:22:00 <heat> i've done a fun experiment with linux: create a file, write to it, read from the disk at the block's location, flush, read again from /dev/sda and again, with O_DIRECT
22:22:00 <zid> 1MB? go for 2MB because of 2MB pages smh
22:22:00 <zid> actually, let's go for 1GB
22:22:00 <clever> heat: a major problem when you dont do that, is if you have say 4kb sectors with 512-byte emulation, and the partition wasnt 4k aligned, so every 4k write involves an 8k read, modify, 8k write!
22:22:00 <heat> data stops being coherent between reads to sda, the file, and sda with O_DIRECT
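The O_DIRECT side of that experiment just means reading the device with the page cache bypassed, which in turn requires sector-aligned buffers, offsets and lengths; a minimal sketch (the device path and the 4096-byte alignment are placeholders) might look like this.

    /* Sketch: a cache-bypassing read of sector 0 straight from the device. */
    #define _GNU_SOURCE        /* for O_DIRECT */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/dev/sda", O_RDONLY | O_DIRECT);
        if (fd < 0) { perror("open"); return 1; }

        /* O_DIRECT wants the buffer, offset and length aligned to the logical
           sector size; 4096 covers both 512e and 4Kn drives. */
        void *buf;
        if (posix_memalign(&buf, 4096, 4096) != 0) return 1;

        ssize_t n = pread(fd, buf, 4096, 0);
        printf("read %zd bytes straight from the device, skipping the page cache\n", n);

        free(buf);
        close(fd);
        return 0;
    }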
22:22:00 <geist> worse: classic DOS MBR would start the first partition in something like sector 63
22:22:00 <geist> for legacy dumb reasons
22:23:00 <clever> i think fat has a structure similar to the MBR in sector 0 of itself?
22:23:00 <geist> or 62, or whatever it was, one sector before where you think the sector should be
22:23:00 <clever> from the old pre-partition days
22:23:00 <gog> so it didn't span cylinders i think
22:23:00 <clever> and starting one too early, would put the bulk of the data where you would have expected it
22:24:00 <geist> oh possibly
22:24:00 <gog> because CHS addressing oddities
22:24:00 <geist> yay
22:24:00 <heat> chs aged very well
22:25:00 <gog> yes
22:25:00 <gog> much like meee
22:25:00 * gog crumbles into dust
22:25:00 <heat> tbh de_dust2 aged pretty well
22:26:00 <gog> i don't play video games
22:26:00 <heat> gog hates fun :(
22:26:00 <gog> that's right
22:27:00 <Arsen> ironic
22:28:00 <Arsen> :^) http://gog.com/
22:28:00 <bslsk05> ​gog.com: GOG.com
22:28:00 <clever> that ~1mb gap between sector 0 and partition 1, is also what grub and many other bootloaders abuse
22:28:00 <geist> side note random recent game plug: metroid dread. finished it a few days ago. good fun
22:28:00 <clever> a decent chunk of the bootloader gets shoved into that "unused" space
22:30:00 <clever> but with GPT using that region, and banning such hacks, legacy on GPT now uses a partition of type "bios boot partition" to hold the raw executable code
22:30:00 <clever> and the protective MBR in sector 0 still has the same 1st-stage as before
22:31:00 <gog> i was playing simutrans but i hate it because they adamantly refuse to make the interface less clunky
22:31:00 <heat> if we're plugging games I might as well do my classic "dark souls is the best game ever" plug
22:31:00 <gog> openttd isn't as fun and also it hard locks my machine for some reason
22:36:00 <heat> open source games are not very fun because most open source developers are bad with UX
22:49:00 <gog> openttd has a slightly better ui. slightly
22:50:00 <gog> but its passenger and goods transport isn't very realistic
22:50:00 <gog> too easy
23:53:00 <Maka_Albarn> yo. has anyone heard anything about what's going on with osdev.org? I know it's been down for a few days now, but does anyone know why?
23:54:00 <Maka_Albarn> ...more or less down.
23:57:00 <zid> nobody paid me $200 to threaten the guy hosting it not to stop
23:59:00 <vdamewood> zid: What if we paid the $200 to the hosting guy?