Search logs:

channel logs for 2004–2010 are archived at http://tunes.org/~nef/logs/old/ (can't be searched)

#osdev2 = #osdev @ Libera from 23may2021 to present

#osdev @ OPN/FreeNode from 3apr2001 to 23may2021

all other channels are on OPN/FreeNode from 2004 to present


http://bespin.org/~qz/search/?view=1&c=osdev2&y=21&m=10&d=3

Sunday, 3 October 2021

02:19:00 <c2a1> quick question, are far jumps only performed in kernel mode? where are they most common?
02:21:00 <Mutabah> In modern systems, they're just something done during early bring-up
02:22:00 <Mutabah> Far jumps are part of the x86's segmentation "feature", which was very commonly used in 16-bit "real mode", but nearly never used in 32-bit and larger modes
02:26:00 <zid> Yea you must first provide context like "Are we in 1970"
02:34:00 <vdamewood> For real!
02:34:00 <c2a1> trying to wrap my head around how context switching works
02:36:00 <heat> in x86 you essentially capture each thread's context in the stack and then basically pop + iret it all
02:36:00 <heat> when switching to a thread
02:37:00 <heat> iret does a sort of far return to whatever segment you have on the stack
02:37:00 <c2a1> what is a context typically composed of
02:37:00 <c2a1> just registers and memory locations?
02:38:00 <heat> basic registers, fpu state, segment registers (for user vs kernel mode), cr3 (for paging), stack
02:38:00 <Mutabah> register states
02:38:00 <zid> anything you don't want to have overwritten or leaked to a different process
02:38:00 <c2a1> like, the entire stack is copied every task switch?
02:39:00 <zid> like, you'd be mad if your eax and top of stack suddenly changed every time your process got a timeslice
02:39:00 <kazinsal> the stack pointer is just saved and restored
02:39:00 <zid> they each use their own stack, usually
02:39:00 <zid> so just yea, rsp gets saved/restored
02:39:00 <zid> rsp is the trickiest part of a context switch
02:39:00 <c2a1> ah
02:40:00 <c2a1> so the kernel reads the stack stored in user memory if i'm correct
02:40:00 <zid> Like, you enter an IRQ or such, and you can't touch any of the registers OR the stack without destroying them, good luck :P
02:40:00 <zid> kernel never reads user stack to my knowledge
02:40:00 <zid> unless you do a syscall and pass it a pointer to a struct allocated there I guess
02:41:00 <Mutabah> A simple task switch method is the following:
02:41:00 <Mutabah> Push all callee-save registers to the stack
02:41:00 <Mutabah> Save ESP
02:41:00 <Mutabah> Load a new CR3 (paging root register) and ESP
02:42:00 <Mutabah> Pop callee-save registers
02:42:00 <Mutabah> return (which returns to the saved state)
02:42:00 <zid> (Then reality appears and writing to arbitrary stack pointers like that is a massive vuln and also may fail and cause an exception). Osdev is hard.
02:43:00 <Mutabah> (The above assumes that you have a kernel stack for each thread, which means that user state was saved by the entry to kernel-space)
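A minimal sketch of the recipe Mutabah outlines, for 32-bit x86 with GCC-style inline assembly (the struct layout and the name switch_stacks are illustrative, not from the channel; a brand-new thread needs a hand-built stack, which this omits):

    struct thread {
        unsigned long esp;   /* saved kernel stack pointer (offset 0) */
        unsigned long cr3;   /* paging root for this thread (offset 4) */
    };

    /* void switch_stacks(struct thread *old, struct thread *next);
     * Push callee-save regs, save old ESP, load new CR3 + ESP, pop, ret. */
    __asm__(
        ".global switch_stacks\n"
        "switch_stacks:\n"
        "    pushl %ebx\n"
        "    pushl %esi\n"
        "    pushl %edi\n"
        "    pushl %ebp\n"
        "    movl 20(%esp), %eax\n"   /* eax = old (arg 1, past 4 pushes) */
        "    movl %esp, 0(%eax)\n"    /* save current stack pointer */
        "    movl 24(%esp), %edx\n"   /* edx = next (arg 2) */
        "    movl 4(%edx), %ecx\n"
        "    movl %ecx, %cr3\n"       /* switch address space */
        "    movl 0(%edx), %esp\n"    /* switch to next thread's stack */
        "    popl %ebp\n"
        "    popl %edi\n"
        "    popl %esi\n"
        "    popl %ebx\n"
        "    ret\n"                   /* 'returns' into the next thread */
    );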
02:44:00 <zid> broke his face
13:06:00 <nur> question about aarch64 mmio functions, the examples I've seen still use 32 bits to refer to the address space
13:06:00 <nur> why is this
13:07:00 <nur> looking at RPI bare bones in the wiki
13:08:00 <klange> There is no aarch64 raspberry pi bare bones.
13:08:00 <nur> https://wiki.osdev.org/Raspberry_Pi_Bare_Bones
13:08:00 <Oli> The RPI boot process begins with ARM 32 bit mode; you can change the configuration in the boot device's first partition FAT32 filesystem
13:08:00 <nur> in the "Writing a kernel in C" example there is a comment for raspi4
13:09:00 <Oli> to switch to 64 bit mode*
13:09:00 <nur> ah
13:10:00 <klange> As I said, there is no aarch64 raspberry pi bare bones on the wiki, that is still aarch32 code, and mmio addresses on a raspberry pi are "mmio addresses on a raspberry pi", not anything to do with aarch64 even if the particular raspberry pi is one of the newfangled aarch64 ones.
13:12:00 <klange> And that in itself is also an answer for you: The RPi 4 still supports 32-bit code, and much like any other backwards-compatible platform that supports 32-bit code, mmio is often mapped within the 32-bit range so that 32-bit code can still reach it
13:13:00 <nur> so the board can't access the mmio peripherals in 64 bit mode?
13:13:00 <klange> ?
13:13:00 <nur> you seemed to imply "mmio addresses on a raspberry pi" can only be done in 32 bit mode
13:13:00 <klange> Uh, no, I did not.
13:14:00 <nur> sorry I read it wrong then
13:14:00 <klange> Quite wrong.
13:14:00 <nur> so what you're saying is, the example code is 32 bit, and the newer aarch64 cpu also supports this mode
13:15:00 <klange> Yes.
13:15:00 <nur> and it boots into 32 bit mode by default
13:15:00 <klange> That's irrelevant.
13:15:00 <Oli> RPI boots into 32 bit mode by default
13:16:00 <nur> unless we change its config otherwise
13:16:00 <Oli> Exactly!
13:17:00 <nur> in which case, would we be using an mmio function that accepts a 64 bit pointer
13:17:00 <klange> It's MMIO. It's just an address.
13:18:00 <nur> so... it doesn't matter?
13:18:00 <nur> because the MMIO addresses are all in the 32 bit range anyway?
13:20:00 <klange> Sorry if I'm being curt, it's late here and I just spent the last hour dealing with the fallout from a bot getting access to my SMTP server and sending a million spam messages from a user's account before the mail log filled the entire disk.
13:21:00 <nur> that's all right
13:21:00 <klange> > the examples I've seen still use 32 bits to refer to the address space
13:21:00 <klange> > why is this
13:21:00 <klange> RPi 4 supports 32-bit code for backwards compatibility, so they need to map hardware to addresses that 32-bit code can reach.
13:21:00 <nur> right
13:23:00 <Oli> I think the thread on the page this next hyperlink leads to may be relevant: https://stackoverflow.com/questions/60220759/how-linux-arm64-switch-between-aarch32-and-aarch64
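On the width question itself, a common pattern is to write the accessors against uintptr_t, so the same code builds as aarch32 or aarch64 while the peripheral addresses stay below 4 GiB (a sketch, not taken from the wiki page):

    #include <stdint.h>

    static inline void mmio_write32(uintptr_t addr, uint32_t val)
    {
        *(volatile uint32_t *)addr = val;
    }

    static inline uint32_t mmio_read32(uintptr_t addr)
    {
        return *(volatile uint32_t *)addr;
    }

    /* uintptr_t is 32 bits on aarch32 and 64 bits on aarch64, but either
     * build reaches peripherals mapped inside the 32-bit range. */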
13:25:00 <clever> klange: the pi4 has both a low-peripherals and a high-peripherals mode
13:26:00 <clever> klange: so it can put the MMIO either at the top of the 32bit addr space, or way up in 64bit space
13:26:00 <klange> please stop pinging me
13:26:00 <klange> just, like, in general
13:26:00 <clever> ok
13:27:00 <nur> thanks everyone
13:28:00 <nur> clever, you can ping me anytime! :D
13:30:00 <clever> nur: for the entire vc4 series (pi0 to pi3), there is a dedicated MMU, that maps the "arm physical" space to the bus addr space, 64 pages of 16mb each, covering exactly 1gig of the arm physical space
13:30:00 <clever> nur: anything beyond that 1gig is a bus fault
13:30:00 <clever> for the pi4, i think that mmu still exists, but i need to research the hw more
13:30:00 <clever> the closed firmware allows moving the MMIO as i said above
13:30:00 <nur> ah
13:31:00 <nur> and I suppose it's different on different ARM boards
13:31:00 <clever> yeah
13:32:00 <nur> how does it work for say, if you launch an aarch64 VM with just virt IO
13:33:00 <clever> you would use an api like KVM to map guest physical to host physical, and setup the MMIO to just trap into the host
13:33:00 <clever> then any attempt to access MMIO triggers a fault into the host, where you can emulate it
13:33:00 <clever> https://david942j.blogspot.com/2018/10/note-learning-kvm-implement-your-own.html
13:34:00 <clever> nur: this is a guide on how to use the raw /dev/kvm api, to create your own vm
13:34:00 <nur> holy crap that's amazing
13:34:00 <nur> thanks!
13:34:00 <clever> KVM_SET_USER_MEMORY_REGION is used to map a range of the physical space in the guest, to a range of virtual space in the process managing it
13:34:00 <nur> is that your blog
13:35:00 <clever> nope, i just stumbled upon it one day
13:35:00 <j`ey_> nur: i skimmed it a few years ago, it looks good
13:36:00 <clever> nur: the blog example is for x86, but the only real difference is the fields in `struct kvm_regs` and what ISA you copy into the guest ram and point PC at
13:39:00 <nur> if my host is x86 though
13:39:00 <clever> kvm can only run a vm that is compatible with the host cpu
13:40:00 <clever> if you want to run arm on x86, then you need more of an emulator, like qemu's TCG
13:40:00 <klange> The far more interesting thing imo is how to actually set up the virtualization, not Linux's API around it...
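For reference, a hedged sketch of the KVM_SET_USER_MEMORY_REGION step clever describes, following the raw /dev/kvm flow from the linked blog post (error handling omitted; the 64 KiB size is an arbitrary example):

    #include <fcntl.h>
    #include <linux/kvm.h>
    #include <sys/ioctl.h>
    #include <sys/mman.h>

    int main(void)
    {
        int kvm = open("/dev/kvm", O_RDWR);
        int vm  = ioctl(kvm, KVM_CREATE_VM, 0);

        /* Back 64 KiB of guest-physical space with host process memory. */
        void *ram = mmap(NULL, 0x10000, PROT_READ | PROT_WRITE,
                         MAP_SHARED | MAP_ANONYMOUS, -1, 0);

        struct kvm_userspace_memory_region region = {
            .slot            = 0,
            .guest_phys_addr = 0x0,
            .memory_size     = 0x10000,
            .userspace_addr  = (unsigned long)ram,
        };
        ioctl(vm, KVM_SET_USER_MEMORY_REGION, &region);

        /* Guest accesses outside any mapped region (e.g. MMIO) exit back
         * to the host with KVM_EXIT_MMIO, where the VMM emulates them. */
        return 0;
    }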
14:05:00 <nur> well, if I used QEMU's full emulation mode there is no question of trapping to the host
14:06:00 <clever> qemu would be setting up the traps, and handling them
14:10:00 <nur> how do I find out what the addresses I need to write to are for virt IO. Some kind of PCI discovery? Is there a devicetree to be read somewhere? Is there... UEFI?
14:11:00 <j`ey_> depends!
14:11:00 <clever> nur: depends on what the vm is emulating
14:11:00 <clever> arm stuff typically uses device-tree
14:11:00 <clever> x86 typically uses acpi via bios
14:11:00 <clever> uefi can be enabled on both
14:11:00 <nur> aha
14:11:00 <nur> "can"?
14:11:00 <j`ey_> arm64 can use uefi+acpi too
14:11:00 <clever> just pass it a bios blob with -bios
14:12:00 <clever> the qemu console can also let you cheat, and just print the entire mmio tree
14:12:00 <nur> so as a kernel writer, do we need to anticipate _everything_?
14:12:00 <j`ey_> yep
14:12:00 <j`ey_> if you want it to work everywhere
14:12:00 <clever> or you can cheat, peek at the qemu console, and write a driver that will only work in that exact vm
14:13:00 <nur> but of course we want to dynamically configure these things
14:13:00 <clever> a lot of rpi baremetal is that form of cheating
14:13:00 <clever> just read the docs, write to the mmio, and ignore the device-tree
14:13:00 <nur> aha
14:13:00 <clever> then your code will break every time a new model comes out
18:16:00 <corecode> hi
18:16:00 <gog> hi
18:16:00 <Oli> Hello, corecode and gog!
18:20:00 <dzwdz> what do y'all think about plan9-style exit messages?
18:20:00 <dzwdz> basically when a process exits it can pass a string to the parent
18:20:00 <corecode> ah like an inverse argument?
18:21:00 <corecode> simplifies a lot of non-streaming IPC
18:21:00 <dzwdz> i don't think i get what you mean?
18:21:00 <dzwdz> like on unixes when a process exits it returns a byte
18:21:00 <corecode> you invoke a process with environment and arguments
18:21:00 <dzwdz> on plan9 it returns a string
18:22:00 <corecode> what i'm saying is returning an (array of) strings is sort of the inverse
18:22:00 <corecode> it's cute
18:22:00 <gog> returning an arbitrary object could be useful
18:22:00 <dzwdz> yeah, i really liked that idea first time i saw it
18:22:00 <corecode> of course you need to store that somewhere until the process is reaped
18:23:00 <dzwdz> but now i'm having second thoughts
18:23:00 <clever> corecode: a fragment of the argv is stored in a fixed-size array in the linux task structure
18:23:00 <gog> yeah that object and its lifetime would have to be managed by the process handling portion of the code
18:23:00 <dzwdz> that's very easy to implement
18:24:00 <dzwdz> i feel like returning a string instead of a number complicates stuff for not much in return
18:24:00 <dzwdz> you don't really gain anything by making processes return a string
18:24:00 <dzwdz> if there was an error, why not just print it to stderr
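For context, the Plan 9 convention under discussion looks roughly like this (Plan 9 C; exits() and Waitmsg are the real libc interface, while the message text and the fork error handling are glossed over):

    #include <u.h>
    #include <libc.h>

    void
    main(void)
    {
        switch(fork()){
        case 0:                    /* child: exit status is a string */
            exits("open: permission denied");
        default: {                 /* parent: the string comes back via wait() */
            Waitmsg *w = wait();
            print("child said: %s\n", w->msg);
            free(w);
            exits(nil);            /* nil or "" conventionally means success */
        }
        }
    }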
18:25:00 <clever> corecode: that is the difference between the truncated `Name:` in `/proc/PID/status` and the full args in some other entry
18:25:00 <clever> the truncated name is in a fixed-size field, that cant be swapped out
18:25:00 <GeDaMo> Return a string in which language?
18:25:00 <clever> while the full argv, is peeking into the actual userland stack of the proc, and can hit swap
18:25:00 <corecode> clever: yea
18:25:00 <dzwdz> i think it's either always english or localised
18:26:00 <Oli> If it were an error string handed to a parent process, I'd ponder an implementation supporting a multi-language system approach.
18:26:00 <corecode> clever: that's why returning stuff means either very small storage or having to hold on to a memory object that belongs to a dead process
18:27:00 <corecode> dzwdz: i think that many processes could then return their result (sort of like a function) instead of printing on stdout (procedure with side effects)
18:27:00 <heat> you can already return data like that using pipes
18:27:00 <corecode> no
18:28:00 <corecode> that's not returning
18:28:00 <heat> yes it is
18:28:00 <heat> write(...); return 0;
18:28:00 <corecode> so why are there process arguments
18:28:00 <heat> dunno
18:28:00 <dzwdz> it's just another way to pass stuff to a program
18:28:00 <heat> pipes also do that
18:28:00 <corecode> you could pass arguments in via a pipe
18:28:00 <heat> exactly
18:28:00 <heat> and that's what you do in the standard unix model
18:28:00 <gog> everything is a pipe
18:28:00 <corecode> because it is impractical, that's why
18:29:00 <heat> read from stdin, write to stdout
18:29:00 <Oli> Thank you for hinting at a process handing structured data to a parent upon its exit; that sounds practical to me.
18:29:00 <dzwdz> how about a parent passing structured data to the child?
18:29:00 <corecode> yea
18:30:00 <corecode> you mean a sequence of 0-terminated strings?
18:30:00 <corecode> :)
18:30:00 <dzwdz> or a struct or something
18:30:00 <heat> stdin
18:30:00 <heat> you can write anything to a pipe
18:30:00 <corecode> i like the argument passing
18:30:00 <corecode> and returning of data
18:31:00 <heat> argument passing in the argv[] way is there because it's relatively common and useful
18:31:00 <heat> returning strings isn't common nor useful
18:31:00 <corecode> [citation needed]
18:31:00 <heat> where would you use it?
18:32:00 <heat> and how is it not replaced by stdout?
18:32:00 <corecode> i would find it more useful to get data returned from a process than setting up pipes, capturing stdout, parsing it
18:32:00 <corecode> because that's what a lot of spawning processes does
18:32:00 <heat> no need to parse stdout
18:32:00 <dzwdz> i actually think that i'm going to remove support for returning strings from my kernel
18:32:00 <corecode> how do you use the output then?
18:33:00 <corecode> heat: i think you're too negative
18:33:00 <heat> you can pass binary through stdout
18:33:00 <corecode> not engaging with the merits
18:33:00 <dzwdz> you need to parse the data no matter how you get it
18:34:00 <corecode> oh
18:34:00 <corecode> maybe the parent could pass a buffer (page) during spawn()
18:35:00 <heat> how does that solve the issue?
18:35:00 <corecode> what issue?
18:35:00 <heat> parsing the data
18:35:00 <corecode> i'm not talking about parsing
18:35:00 <heat> also what's spawn()?
18:36:00 <corecode> syscall that creates a process with arguments etc
18:36:00 <Oli> If it isn't in a known structure, a key=value syntax is a way to go
18:36:00 <corecode> Oli: i'd go for opaque blob with a convention
18:36:00 <corecode> kernel doesn't have to care
18:37:00 <corecode> dzwdz: so why did you decide to remove returning of strings?
18:38:00 <dzwdz> it doesn't add anything useful
18:38:00 <corecode> i think it does, but okay
18:38:00 <dzwdz> and if i remove it i could simplify some of the code
18:38:00 <corecode> fair
18:38:00 <Oli> I would go for that too: I like its data-size compactness in comparison.
18:38:00 <corecode> it's a simple way for IPC
18:38:00 <dzwdz> for heavily limited ipc
18:38:00 <corecode> yea
18:39:00 <dzwdz> you'd need proper ipc anyways
18:39:00 <dzwdz> so why overcomplicate stuff with another shittier one
18:39:00 <heat> piping works so well because you can chain everything together very easily and stdout is always open (even if its not a pipe)
18:39:00 <corecode> there are advantages of having a simple solution that covers 80% of uses
18:39:00 <heat> assuming reading returned data would be something akin to waitpid(), it's just not a great fit in any model
18:40:00 <dzwdz> heat: it does
18:41:00 <corecode> i don't get the negativity
18:41:00 <corecode> not everything must be unix
18:41:00 <dzwdz> ^
18:41:00 <dzwdz> unix is overrated
18:41:00 <dzwdz> but that's not a discussion that i want to have rn
18:42:00 <heat> you can talk about plan9 without talking about unix i'd say
18:42:00 <corecode> i mean, by all means, have processes expose RPCs
18:42:00 <dzwdz> are any of y'all making oses that aren't unixy?
18:42:00 <corecode> everything is a server, and you have to do an RPC
18:42:00 <corecode> i do mostly embedded, so no unix there
18:43:00 <corecode> hm that makes it sound like android tho
18:43:00 <heat> no embedded is definitely not android
18:43:00 <heat> unless you work on the kernel that is
18:43:00 <corecode> no i mean everything is a server and you have to do RPC
18:43:00 <heat> no that sounds like a microkernel
18:44:00 <corecode> that's how android processes interact
18:44:00 <corecode> service and intents they are called, i think
18:44:00 <heat> how would they interact otherwise?
18:44:00 <corecode> ipc, pipes, parsing stdout
18:44:00 <heat> you need RPC to do remote calls
18:44:00 <corecode> but they don't
18:44:00 <heat> RPC is IPC but fancy
18:45:00 <corecode> and formalized
18:45:00 <corecode> permission controlled
18:45:00 <corecode> type safe, i suppose
18:45:00 <heat> IPC is $your_favourite_ipc_primitive (shared memory, pipes, sockets, probably something else)
18:46:00 <clever> heat: what would you classify shared memory with a hw FIFO (with irq) as?
18:46:00 <heat> weird doorbell thing
18:47:00 <clever> the FIFO is typically used to send over the physical addr of a packet
18:47:00 <clever> so the shared memory can exist literally anywhere in the addr space
18:47:00 <corecode> half microkernel
18:48:00 <corecode> clever: are you working with such a system?
18:48:00 <clever> corecode: except, its a channel between 2 kernels, running on different cpu cores, each kernel in control of different things
18:48:00 <clever> yes
18:48:00 <corecode> what system is that?
18:48:00 <clever> the rpi
18:48:00 <corecode> ah
18:48:00 <clever> https://github.com/raspberrypi/firmware/wiki/Mailbox-property-interface
18:49:00 <heat> ah yes, mailbox, that's the proper term
18:49:00 <clever> you put a structure into memory, that is a series of length-prefixed objects, each with a tag on it
18:49:00 <heat> the i915 also has mailboxes
18:49:00 <clever> and then you put the physical address of that structure into the mailbox
18:49:00 <clever> the mailbox itself, is simply a fifo, with an irq to wake the far end
18:49:00 <corecode> and i guess based on use you either share ownership or transfer it?
18:50:00 <clever> corecode: yeah, you flush the packet to ram, then post its addr to the mailbox, transferring ownership temporarily
18:50:00 <corecode> i think modern x86 does the same for inter processor communication
18:50:00 <clever> the firmware on the far end will read the packet, do something, modify the packet to hold a result, and then return the same addr, on a 2nd fifo
18:50:00 <corecode> yea, message queue with shared data
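A hedged sketch of that flow, based on clever's description and the linked wiki page (register offsets follow the public BCM2835 docs; the base address is the Pi 3's, used as an example, and cache maintenance / bus-address translation are glossed over):

    #include <stdint.h>

    #define MBOX_BASE   0x3f00b880u                 /* Pi 3 mailbox block */
    #define MBOX_READ   (*(volatile uint32_t *)(MBOX_BASE + 0x00))
    #define MBOX_STATUS (*(volatile uint32_t *)(MBOX_BASE + 0x18))
    #define MBOX_WRITE  (*(volatile uint32_t *)(MBOX_BASE + 0x20))
    #define MBOX_FULL   0x80000000u
    #define MBOX_EMPTY  0x40000000u

    /* 16-byte aligned property buffer: total size, request code, then
     * length-prefixed tags, terminated by a zero tag. */
    static volatile uint32_t __attribute__((aligned(16))) buf[8] = {
        8 * 4,       /* buffer size in bytes */
        0,           /* 0 = request */
        0x00010002,  /* tag: get board revision */
        4, 0, 0,     /* value size, request code, value slot */
        0, 0,        /* end tag + padding */
    };

    uint32_t mbox_property_call(void)
    {
        /* Low 4 bits carry the channel (8 = property tags, ARM -> VC). */
        uint32_t addr = (uint32_t)(uintptr_t)buf | 8;

        while (MBOX_STATUS & MBOX_FULL)
            ;                                /* wait for FIFO space */
        MBOX_WRITE = addr;                   /* post the packet address */
        for (;;) {                           /* wait for our reply */
            while (MBOX_STATUS & MBOX_EMPTY)
                ;
            if (MBOX_READ == addr)
                return buf[5];               /* value filled in by firmware */
        }
    }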
18:51:00 <clever> from what ive seen of arm inter-core stuff, its purely mutex primitives and a way to raise a special irq (ipi) on the far core
18:51:00 <corecode> plenty of opportunity to get race conditions or memory model issues
18:51:00 <clever> but there is no fifo, you have to build that yourself, using the mutex primitives
18:52:00 <clever> the rpi avoids most of the race conditions, by not having an array of messages, and using the mailbox to clearly define who writes when
18:52:00 <clever> so you just need to deal with the non-coherent caches
18:54:00 <clever> xhci has doorbells instead of mailbox, all messages are in a ring, with a special bool saying a given msg is valid
18:54:00 <clever> the doorbell is purely a mechanism to make the hw poll the array again
18:54:00 <corecode> well i came in here to talk about bloom filters
18:55:00 <clever> and to save writes, when the ring wraps around, it inverts the expected value of the "is valid" field
18:55:00 <clever> so all stale data, becomes invalid
18:55:00 <corecode> so you have another flag that you xor with to get validity?
18:55:00 <clever> yeah
18:55:00 <corecode> ok
18:56:00 <corecode> i guess that's an extra MSB on the pointer register
18:56:00 <clever> so the xhci buffer will basically look like AAAAABBBBB
18:56:00 <clever> and the A is the last valid msg
18:56:00 <clever> and it just keeps checking that first B, to see when it turns valid
18:57:00 <clever> and its not entirely a ring-buffer, the final msg in the list, is just a "goto X, B is now valid"
18:57:00 <clever> so then the array looks like AAAAAA, and the read pointer is at the front
18:57:00 <corecode> ok
18:57:00 <clever> and only when it becomes BAAAA, does it consider that valid, and move on
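A small sketch of that consumer-side cycle-bit logic as clever describes it (names are made up; real xhci TRBs are 16 bytes with the cycle bit packed into the control word):

    #include <stdbool.h>
    #include <stdint.h>

    #define RING_SIZE 16

    struct trb {                /* simplified ring entry */
        uint32_t data;
        bool     cycle;         /* producer writes its current cycle state */
    };

    struct ring {
        struct trb slots[RING_SIZE];
        unsigned   rd;          /* consumer index */
        bool       cycle;       /* expected value of a "valid" entry */
    };

    /* Returns true and fills *out once the next entry has become valid;
     * stale entries from the previous lap have the wrong cycle value. */
    bool ring_pop(struct ring *r, uint32_t *out)
    {
        struct trb *t = &r->slots[r->rd];
        if (t->cycle != r->cycle)
            return false;
        *out = t->data;
        if (++r->rd == RING_SIZE) {
            r->rd = 0;
            r->cycle = !r->cycle;   /* wrap: the validity sense inverts */
        }
        return true;
    }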
18:58:00 <clever> bloom filters, is that the thing where you have say a 9x9 matrix of multiplication factors? and for each pixel of input, you multiply that pixel by every cell in the matrix, and then add it to the output, centered on the input coord?
19:03:00 <c2a1> Hey why does the multiboot header on osdev lack a tags field? is it an older specification?
19:04:00 <c2a1> https://wiki.osdev.org/Bare_Bones_with_NASM
19:04:00 <c2a1> what i'm talking about
19:04:00 <corecode> clever: no, that's some convolution filter
19:04:00 <heat> c2a1, that's multiboot1
19:04:00 <heat> multiboot2 is the one that has tags
19:04:00 <c2a1> ok thanks
19:04:00 <corecode> clever: a bloom filter is a probabilistic data structure
19:04:00 <sortie> Yeah bare bones uses a Multiboot 1 header
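For comparison, the Multiboot 1 header on that page boils down to three words and no tags field (a sketch in C; the section name assumes a linker script that places .multiboot within the first 8 KiB of the image, where GRUB searches for the magic):

    #include <stdint.h>

    #define MB1_MAGIC 0x1BADB002u
    #define MB1_FLAGS 0x00000003u   /* page-align modules + memory info */

    struct multiboot1_header {
        uint32_t magic;
        uint32_t flags;
        uint32_t checksum;          /* magic + flags + checksum == 0 */
    };

    __attribute__((section(".multiboot"), used, aligned(4)))
    static const struct multiboot1_header mb_header = {
        .magic    = MB1_MAGIC,
        .flags    = MB1_FLAGS,
        .checksum = -(MB1_MAGIC + MB1_FLAGS),
    };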
19:05:00 <c2a1> do newer versions of grub support mb1
19:05:00 <heat> yes
19:05:00 <clever> corecode: ah
19:05:00 <c2a1> and does mb2 allow raw binaries?
19:05:00 <heat> yes? I think, not sure
19:06:00 * c2a1 reads
19:06:00 <heat> not sure why you'd use binary blobs with multiboot
19:06:00 <clever> c2a1: i think grub just looks for the multiboot magic# at an aligned addr, within the first X bytes of the binary
19:06:00 <clever> it doesnt care what other structures exist in the file
19:06:00 <heat> yes it does, it needs to load it
19:06:00 <sortie> c2a1, if at all feasible, I always recommend using ELF as a container format. It's easy to load and avoids sooo many pitfalls with flat binaries
19:08:00 <heat> you can also use PE if you want to do something like linux and have a UEFI stub
19:08:00 <c2a1> side question, have any of you tried using any of the bsd boot loaders with your os?
19:08:00 <c2a1> trying to avoid the gpl as much as possible
19:08:00 <heat> using grub doesn't bind you to the GPL
19:09:00 <heat> same with gcc
19:09:00 <c2a1> sortie, what are some pitfalls out of curiosity?
19:10:00 <heat> c2a1, flat binaries need to be contiguous in memory or have huge blobs of 0s in between
19:10:00 <c2a1> heat, even then it's somewhat relevant as they seem to be very cross platform
19:10:00 <c2a1> openbsd and netbsd's to be specific. not sure if they use the same header formats and whatnot though.
19:11:00 <sortie> c2a1, the files don't say where they are supposed to be loaded. That creates some ambiguity. They are difficult to inspect with debugging tools. Regions of zeros need to be stored in them, so they can't be sparse. It's easy to screw up their load address when making them.
19:11:00 <heat> they also don't have symbol information, relocation information, no dynamic linking(although that's not kernel relevant most of the time)
19:11:00 <clever> heat: i believe objcopy will deal with making them contiguous and filling in any holes between PT_LOAD's
19:11:00 <clever> but yeah, elf does make things so much simpler
19:12:00 <heat> if objcopy changed my load addresses I would be very angry
19:12:00 <sortie> c2a1, it turns out that ELF is actually super simple for loading a kernel. You read a struct header at the start. From that, you locate the segment headers. They are a list of memory locations with source offsets in the file and destination location in memory, and a trailing region to memset to 0.
19:12:00 <sortie> So basically you just iterate the program headers and do the memcpy/memset operations it tells you to do
19:13:00 <sortie> Dynamically loaded ELF is a lot more complex, but a statically linked ELF kernel is literally that simple.
19:13:00 <clever> https://github.com/littlekernel/lk/blob/master/lib/elf/elf.c can deal with loading static elf, without any relocations
19:14:00 <sortie> c2a1, my point here is that ELF might actually be simpler than flat binaries + issues getting them working correctly + drawbacks of being unable to inspect them with debugging tools.
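A sketch of the loop sortie describes, for a statically linked 32-bit kernel (assumes the whole image is already in memory and identity-mapped, and that p_paddr is the load target; error checks omitted):

    #include <elf.h>
    #include <stdint.h>
    #include <string.h>

    void *load_elf32(const uint8_t *image)
    {
        const Elf32_Ehdr *eh = (const Elf32_Ehdr *)image;
        const Elf32_Phdr *ph = (const Elf32_Phdr *)(image + eh->e_phoff);

        for (int i = 0; i < eh->e_phnum; i++) {
            if (ph[i].p_type != PT_LOAD)
                continue;
            /* Copy the file-backed bytes to the destination address... */
            memcpy((void *)(uintptr_t)ph[i].p_paddr,
                   image + ph[i].p_offset, ph[i].p_filesz);
            /* ...then zero the trailing region (typically .bss). */
            memset((uint8_t *)(uintptr_t)ph[i].p_paddr + ph[i].p_filesz,
                   0, ph[i].p_memsz - ph[i].p_filesz);
        }
        return (void *)(uintptr_t)eh->e_entry;  /* jump here to start it */
    }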
19:14:00 <heat> that reminds me
19:14:00 <heat> i have huge issues returning errors from the elf loader to the exec caller
19:14:00 <clever> a mem_alloc_hook() can be used to force some relocation (but not patching) upon what elf.c does
19:16:00 <heat> note that you can always flatten your elf files into binary if you need to
19:18:00 <c2a1> sortie: thanks
20:09:00 <geist> clever: oh hah, i'm not sure i'd use that ELF code as an example
20:10:00 <geist> it was kinda a weekend hack and written in a wonky way, kinda as an experiment (the whole callback thing)
20:14:00 <clever> well, it works.... lol
20:15:00 <clever> i feel like if you leave me alone too long, i might wind up adding a userland to my bootloader, with that elf code as the loader....
20:16:00 <clever> all the feature creep!
20:16:00 <clever> do i need a userland in my bootloader? lol
20:17:00 <heat> woah I found a new errno
20:17:00 <heat> ELIBBAD
20:22:00 <Oli> My ideal bootloader has a beepy tune playing out of the PC speaker and a sinusoidal-moving text scroller; userland in a bootloader sounds like UEFI at some level to me: it may serve well!
20:23:00 <clever> Oli: i already have opengl working....
20:30:00 <Oli> On the bootloader?
20:34:00 <clever> Oli: yeah
20:34:00 <clever> Oli: https://www.youtube.com/watch?v=BQyyVtmmVg8
20:35:00 <clever> also, the opengl is running outside of linux, so it can keep spinning, even after linux begins to boot
20:38:00 <Oli> Congratulations on implementing OpenGL in a bootloader, clever! I can feel the potential uses.
20:38:00 <clever> its kinda not even in the bootloader, its in the firmware that runs BEFORE the bootloader!
20:39:00 <heat> cursed
20:39:00 <clever> the arm core is not needed at all
20:39:00 <heat> isn't firmware supposed to be small
20:40:00 <clever> heat: this is small
20:40:00 <heat> opengl drivers are small?
20:41:00 <clever> heat: when you implement them from scratch, yes
20:41:00 <clever> text data bss dec hex filename
20:41:00 <clever> 5234 96 174 5504 1580 ./build-vc4-stage2/platform/bcm28xx/v3d.mod.o
20:41:00 <clever> 5kb of .text
20:41:00 <clever> thats the complete 3d demo
20:42:00 <clever> not counting shared logic, like libc and the 2d framework
20:43:00 <kingoffrance> "what do y'all think about plan9-style exit messages?" i have lots of ideas for strings...but have not gotten to that point yet. just fyi i am not opposed.
20:44:00 <kingoffrance> i will not say good or bad, just an experiment
20:44:00 <kingoffrance> and i wont know for a while how it turns out
20:44:00 <kingoffrance> gimme some months/years :)
20:44:00 <kingoffrance> to become "operational" lol
20:44:00 <kingoffrance> i am not opposed, anyway...
20:44:00 <Oli> Reading the repository linked in the description of the video you shared, I feel the desire to express my gratitude for your involvement in the creation of libre firmware for RPI systems!
20:46:00 <clever> Oli: some of the major tasks that remain but are relatively simple: hdmi out, proper arm SMP, pi3-aarch64 support, page flipping for linux, config files
20:46:00 <clever> each of those should be simple, the answers are known, it just has to be converted into source code
20:49:00 <clever> the more unknown things, are h264/mpeg2 accel, isp, camera, dsi
20:49:00 <Oli> *I gaze at you, smile, and inhale deeply*
20:51:00 <clever> SMP for example, the current problem is that LK claims all 4 arm cores for itself, with a fully working SMP scheduler
20:52:00 <clever> https://github.com/littlekernel/lk/blob/master/arch/arm/arm/arch.c#L305
20:52:00 <clever> Oli: arch_chain_load() is then used to execute linux on core 0
20:52:00 <clever> but the problem, is that no code exists, to hand off the other 3 cores
20:52:00 <clever> so the other 3, are still technically paused in LK's idle function, waiting for the LK scheduler to wake them
20:53:00 <clever> even after linux overwrites the LK kernel.....
20:53:00 <clever> ive formulated several plans of attack for that, but havent tested any yet
21:07:00 <c2a1> could one solve the problem of context switching overhead by isolating the kernel to one core and userspace to the others?
21:09:00 <clever> c2a1: its less about the switching overhead, and more about getting the core out of the scheduler, and passing it to another os
21:10:00 <clever> and yeah, one solution is to just hook the entry-point, and never let the LK scheduler get the core to begin with
21:10:00 <clever> pre-park it
21:10:00 <clever> but then i cant do anything fancy like multi-core gunzip
21:15:00 <clever> c2a1: also, aarch64 dropped support for linux to uncompress itself, so the bootloader is now responsible for the gunzip
21:16:00 <clever> or just dont compress
21:17:00 <corecode> hm, how do i map values uniformly to a smaller range? i guess i'll need some divisions
21:23:00 <junon> Perhaps the wrong channel but I'm going through a game right now and was instructed to write gates to negate a two-complement signed 8-bit byte. I just did a bitwise NOT and then added 1, but is there a way to do that without a full adder?
21:25:00 <corecode> probably more ##electronics
21:26:00 <corecode> no, i don't think you can do it without a full adder
21:26:00 <corecode> well, no
21:27:00 <corecode> maybe half adder?
21:27:00 <junon> The half adder just omits the carry gate right?
21:27:00 <corecode> well i'm thinking you only add carry
21:27:00 <corecode> you still need a carry chain
21:29:00 <junon> well you have to add carry chain all the bits up anyway, right?
21:49:00 <corecode> yes
21:49:00 <corecode> but all other data you just add 0
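In C terms the exchange boils down to this: negation is NOT plus an increment, and the increment only ever adds the carry, so half adders (sum = a XOR c, carry = a AND c) suffice (a sketch; the function name is made up):

    #include <stdint.h>

    uint8_t negate8(uint8_t x)
    {
        uint8_t inv = ~x;     /* NOT stage */
        uint8_t out = 0;
        int carry = 1;        /* the "+1" enters as the initial carry-in */

        for (int i = 0; i < 8; i++) {
            int a = (inv >> i) & 1;
            out |= (uint8_t)((a ^ carry) << i);   /* half-adder sum */
            carry = a & carry;                    /* half-adder carry */
        }
        return out;           /* equals (uint8_t)(~x + 1), i.e. -x */
    }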
22:39:00 <geist> clever: actually over in the lkuser repo i do have some code that uses lib/elf to load user space
22:41:00 <clever> ah, hadnt noticed that repo
22:41:00 <clever> guessing i can just add it as an overlay?
22:44:00 <clever> ah, found the core: https://github.com/littlekernel/lkuser/blob/master/sys/lib/lkuser/user.c
22:45:00 <clever> all looks fairly simple
23:14:00 <sortie> Oh wow
23:15:00 <sortie> I booted my Sortix 0.9 operating system, released December 30, 2014. This was the first technically self-hosting release, although it wasn't installable, and it's old and buggy, lacks lots of ports, and the ports it does have are older.
23:16:00 <sortie> Anyways I put the source code of Sortix 1.0, released March 2016, onto it via the ext2 filesystem support. With a few hacks, 0.9 was actually able to compile 1.0. I never tested that before and it was never intended to work.
23:17:00 <clever> sortie: nice!
23:17:00 <sortie> I just booted the Sortix 1.0 built under Sortix 0.9 and it actually works
23:17:00 <clever> was going to ask if it could build the new one, but then you went and answered that
23:18:00 <sortie> I previously confirmed that Sortix 1.0 is able to build the upcoming Sortix 1.1dev (2021 source code on the master branch), and actually get far bootstrapping the modern ports
23:19:00 <sortie> Note this 0.9 -> 1.0 jump stays with 0.9 ports, rather than 1.0 ports. 1.1dev is able to natively build all the ports, a state I think I can get to using only 1.0 ports, but bootstrapping that via 0.9 ports is going to be *difficult*.
23:19:00 <clever> ports?
23:20:00 <klange> Ported third-party software.
23:20:00 <clever> ah
23:20:00 <sortie> If I can find a solution for that though, then I can establish a 0.9 -> 1.0 (kinda, with 0.9 ports) -> 1.1dev (kinda, with 0.9 ports) -> ??? -> 1.1dev (with 1.1dev ports)
23:20:00 <sortie> That would vindicate my 2014 claim that I was self-hosting
23:21:00 <j`ey_> lol
23:21:00 <klange> I don't think I'm really any closer to building gcc natively than I was five years ago...
23:21:00 <clever> until i fix my cross-compiler setup to work in arm->vc4 mode, i cant possibly self-host
23:23:00 <klange> I haven't attempted a native build of much more than individual apps in a while, there's at least two Python scripts that need to be ported to Kuroko first. My aim at the moment should be to get my build down to _just_ gcc and binutils...
23:24:00 <klange> Which really means I need a DEFLATE compressor if I want to do it correctly...
23:30:00 <c2a1> are gzip and compress the same thing
23:30:00 <c2a1> looks like they use the same encoding
23:31:00 <klange> gzip is a wrapper around a DEFLATE payload
23:31:00 <clever> i was mildly surprised to see zfs gzip-9 compression using zlib functions
23:31:00 <clever> and yeah, deflate was a function within zlib
23:31:00 <clever> always fun when you call something by 3 different names!
23:31:00 <klange> It's not three different names.
23:32:00 <clever> lib/zlib_deflate/deflate.c:static block_state deflate_slow (deflate_state *s, int flush);
23:32:00 <clever> ah, its zlib_deflate, not just zlib
23:32:00 <klange> zlib is an implementation - the defacto one. gzip is a wrapper with a bit more information about the payload. DEFLATE is the name for the actual compression format inside of it.
23:32:00 <clever> ah
23:33:00 <klange> Other stuff also uses DEFLATE but not gzip, and of course there are other implementations, especially of the decompression side of things.
23:33:00 <clever> i feel like zfs doesnt need the gzip extras, since the fs layer gives that, but they are probably using the gzip name, because what you just said isnt as well known
23:33:00 <klange> I have my own decompressor, I use it for PNGs and compressed tarballs, and there's even a version of it in my kernel that unpacks compressed ramdisks on boot.
23:34:00 <clever> yeah, linux has a special build of it, that runs with the mmu off, and uses a dumber heap
23:35:00 <clever> that special one, is for the zImage, to unpack itself
23:35:00 <clever> seperate from the initrd one
23:35:00 <sortie> Hmm. make from Sortix 0.9 doesn't quite run on Sortix 1.0. There was an ABI change in how to read directories. I might need to patch the 1.0 kernel with some 0.9 compat.
23:35:00 <clever> the zImage one, also has to have relocation patching done with asm, because gcc cant generate 100% PIC objects
23:35:00 <sortie> At least 0.9 quake runs on 1.0
23:38:00 <clever> sortie: one idea, is if you build the kernel first, then reboot to run an 0.9 userland on a 1.0 kernel
23:38:00 <clever> then any tools built during the 1.0 userland build, can be ran immediately
23:38:00 <clever> ive thought of the same thing, when upgrading from 32bit linux to 64bit
23:40:00 <sortie> clever, basically I was able to cross-compile a whole pristine 1.0 system cleanly from 0.9, so I could build a kernel & userland with a matching ABI
23:40:00 <clever> ah, if you can do it in more of a cross fashion, you dont need the 1.0 kernel
23:40:00 <sortie> The trouble here is that the 1.0 kernel is actually not able to run 0.9 binaries perfectly (nor the other way around) because struct dirent changed
23:41:00 <clever> i was thinking more in a native fashion
23:41:00 <clever> yeah, thats an issue
23:41:00 <sortie> I might be able to build towards 1.1dev on a 0.9 kernel but honestly that kernel is damn old and lacks features
23:42:00 <sortie> A full 1.1dev bootstrap, I imagine, is best done with a 1.1dev kernel running a 1.1dev userland (bootstrapped from a 1.0 userland), with a patch to support 0.9 binaries
23:42:00 <sortie> Basically one big temporary hybrid
23:43:00 <sortie> Then hopefully 0.9 has enough ports -- I doubt it -- to actually bootstrap the core ports