#osdev2 = #osdev @ Libera from 23may2021 to present

#osdev @ OPN/FreeNode from 3apr2001 to 23may2021

all other channels are on OPN/FreeNode from 2004 to present


http://bespin.org/~qz/search/?view=1&c=osdev&y=19&m=8&d=10

Saturday, 10 August 2019

06:28:57 <klys> crickets and stars here, and a mountain...
06:31:38 <klys> today I found eisa treating one slot as usable for isa and one slot not thus usable.
06:32:11 <klys> on reflection, it may have just been my monitor, which would turn off after ten seconds.
06:32:59 <klys> new in the mail this week will be...another 15" crt display.
06:39:04 <zid`> I read a bit about iso9660, it's odd
06:39:15 <klys> odd in what way, zid?
06:39:16 <zid`> the fields are both little endian and big endian alternating
06:39:35 <zid`> and then it hangs two copies of every sub struct off the main one, in big and little endian
06:39:50 <zid`> so you could have a CD that contains totally different things depending on if the reader is big or little endian :D
06:40:02 <klys> do the fields have the same data recorded both ways?
06:40:11 <zid`> I mean.. I imagine they're supposed to ;)
06:40:17 <klys> right, right
06:41:07 <zid`> I couldn't get the psx bios to read files when I tried it, so I'm considering writing an iso9660 driver
06:41:16 <zid`> and just doing it manually
06:42:11 <klys> are you starting with grub2's iso driver?
06:43:08 <zid`> no I was just starting with a hex editor
06:43:15 <zid`> making sure it isn't insane and that I understand it
06:43:20 <zid`> it's pretty simple it turns out
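
The both-endian layout zid` describes can be checked mechanically. What follows is a minimal Python sketch, not zid`'s driver. It assumes the ISO 9660 convention that a 32-bit "both byte order" field occupies 8 bytes, with the little-endian copy first and the big-endian copy second. The function name `read_both_endian_u32` is hypothetical.

```python
import struct


def read_both_endian_u32(buf, offset):
    """Read an 8-byte both-endian field and check that the two copies agree."""
    # First 4 bytes: little-endian copy of the value.
    little = struct.unpack_from("<I", buf, offset)[0]
    # Next 4 bytes: big-endian copy of the same value.
    big = struct.unpack_from(">I", buf, offset + 4)[0]
    if little != big:
        # This is the "different contents per reader" case from the chat.
        raise ValueError(
            "both-endian mismatch: LE=%#x BE=%#x" % (little, big)
        )
    return little


if __name__ == "__main__":
    # A well-formed field holding the value 1234 in both byte orders.
    field = struct.pack("<I", 1234) + struct.pack(">I", 1234)
    print(read_both_endian_u32(field, 0))
```

A reader that only decodes its native half would never notice a mismatch, so comparing both halves is a cheap sanity check when inspecting an image by hand.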
07:11:40 <graphitemaster> I've come up with an informal proof on how to design a relatively simple file system that is lock-free
07:13:35 <aalm> ?
07:23:18 <zid`> is it to use a single core, that's my informal proof
07:29:54 <graphitemaster> no
07:31:03 <graphitemaster> not only is there an informal proof for lock-free, I even have an informal proof for some atomic guarantees even modern file systems can't provide
07:31:19 <graphitemaster> this would be hell to implement
07:32:39 <graphitemaster> it would also be mostly wait-free (no CAS loops)
09:55:24 <aalm> "this would be hell to implement" - sounds promising =]
13:21:56 <j`ey> I guess even this is some progress: 0b000101 Translation fault, level 1
13:54:36 <grimondo> Any qemu/kvm experts here? I'm trying to check cpu feature flags in my os, which is running in qemu with host or max cpu. I would expect this to pass through all cpu flags but it doesn't. Amongst others, it's not reporting 'est' (enhanced speedstep), although it's available on the host. Any ideas?
13:54:48 <grimondo> (I'm running qemu with kvm)
13:58:46 <heat> I'm no qemu/kvm expert but I believe there's certain flags that aren't passed because they're simply not implemented
13:59:31 <heat> in this case, power management
14:11:08 <grimondo> Hmm maybe it's as simple as that
14:14:26 <j`ey> damn, been trying to get the MMU started for about 4 hours now :(
14:28:13 <xenos1984> j`ey: Do you have your code in a repo?
14:29:02 <j`ey> xenos1984: not yet
14:29:09 <j`ey> (this is for aarch64 btw)
14:30:21 <xenos1984> j`ey: Hm... Otherwise I would have offered to have a look. I was reading channel conversations recently, so I figured out it's aarch64 from the context :D
14:30:39 <j`ey> I mean, if you know about aarch64, I could paste the code
14:31:01 <xenos1984> That could help indeed.
14:31:02 <j`ey> I guess Im guessing my l1 (l2?) table entries wrong
14:33:59 <xenos1984> Just for reference, this is the code that creates my page tables, if you want to compare stuff like flags: https://github.com/xenos1984/NOS/blob/master/kernel/vendor/raspi/Entry.a64.S#L82
14:34:22 <j`ey> thanks
14:34:24 <j`ey> I'll take a look
14:34:34 <j`ey> I dont really understand the t0sz stuff yet, and most of the registers
14:34:34 <xenos1984> Translation fault indeed sounds like trouble with page tables...
14:35:13 <xenos1984> It has a steep learning curve. I wouldn't say that I master it either :D
14:38:14 <j`ey> https://paste.rs/Ns5
14:38:18 <j`ey> not really the best format.. but
14:39:46 <j`ey> I was trying to make 2 tables, with one entry each
14:44:09 <xenos1984> I'm not really fluent in Rust... But indeed, that looks strange.
14:44:28 <xenos1984> Your two entries are the long binary numbers at the end of the paste?
14:44:41 <j`ey> yeah
14:45:04 <j`ey> theyre the same
14:46:13 <xenos1984> So the fault occurs when you try to access that memory? Or immediately after enabling MMU?
14:46:30 <j`ey> when trying to execute the next instruction I think
14:46:43 <j`ey> which is at 0x400019a4
14:47:09 <xenos1984> I see that in your identity mapping you set only the table flags...
14:47:56 <j`ey> those binary numbers end with 01.. isnt that for a block?
14:49:33 <xenos1984> Yes, for the two at the bottom. But I guess you don't even get to the point where they would become relevant if it faults already after enabling MMU. So I rather think the faulting part wants to read your identity mapped next instruction.
14:50:41 <j`ey> I should rename that identity_page_table, really it's just level1_page_table
14:51:04 <j`ey> with a single entry to my level2 table
14:51:35 <j`ey> hm, I should try mangling that entry on purpose, to see if I get a different fault
14:52:00 <xenos1984> I think you need to set a few more flags, such as access flag (bit 10).
14:52:31 <j`ey> yeah, if I remove my l1 entry, I get "0 Translation fault, level 0"
14:53:02 <j`ey> I had bit 10 set.. but then I removed it, will add it back
14:53:13 <xenos1984> Sounds promising at least...
14:53:52 <j`ey> so yes, maybe I need more flags..
14:55:02 <xenos1984> Also you might want to set flags on shareability at some point (but that should be less important for now).
14:56:30 <j`ey> access permission 00 seems to be read/write
14:56:50 <xenos1984> Yes.
14:58:05 <xenos1984> My flags are 0x703 btw - accessed, shared, page / page table.
14:59:22 <j`ey> that's for table entries, right?
14:59:32 <xenos1984> Yes.
15:00:09 <j`ey> https://armv8-ref.codingbelief.com/en/chapter_d4/d43_1_vmsav8-64_translation_table_descriptor_formats.html suggest the low bits are ignored
15:01:53 <xenos1984> Hm... But where do you fill the L2 table for the identity mapping?
15:02:15 <j`ey> thats the bit with the long binary numbers?
15:02:19 <j`ey> just 2 entries
15:03:11 <xenos1984> But that maps 0x0000000040000000, right? Where is your kernel loaded? Is it at that address?
15:03:37 <j`ey> yes
15:05:59 <xenos1984> Ah, I see. Yes, they are ignored if there is another level following, so they are ignored in L1, not in L2.
15:06:18 <xenos1984> So you need to set bit 10 (accessed) in L2 - the long numbers.
15:06:38 <j`ey> ok, I have done that again
15:08:55 <j`ey> geist: around?
15:10:45 <j`ey> that should just map a region from 0x4000... to +1GB right?
15:11:49 <xenos1984> That actually depends on the TCR settings...
15:12:08 <j`ey> yeah, I didnt quite understand those yet
15:13:18 <xenos1984> It is quite tricky. Took me a while to get the calculation right (mine is different from yours, I use less bits because my target platform has only 1-4GB).
15:14:02 <j`ey> I dont really care what I use, I just want to get it working now
15:19:45 <j`ey> maybe my MAIR is wrong
15:20:56 <xenos1984> Huh, you set it to 0.
15:21:21 <j`ey> I think it should be 0xff maybe
15:22:21 <xenos1984> That sounds better.
15:23:41 <j`ey> no change though
15:25:03 <xenos1984> I still suspect it's the L2 table...
15:25:19 <j`ey> it is
15:25:26 <xenos1984> Wait, you set the first entry of L1 and first of L2 table?
15:25:40 <j`ey> first of L1 and first two in L2
15:25:53 <xenos1984> Ah, I see.
15:26:09 <j`ey> (well that's what i was trying..)
15:29:13 <xenos1984> base << 12
15:29:24 <xenos1984> You shouldn't shift base by 12.
15:29:30 <j`ey> why not?
15:29:59 <xenos1984> It already has the lower 12 bits to 0. It's the aligned address of the page table.
15:30:21 <j`ey> hmm
15:30:44 <xenos1984> If your L2 table is at 0x4000 (for example), you want your L1 entry to be 0x4003. No need to shift.
15:30:54 <j`ey> ok, different error now
15:31:17 <xenos1984> I hope a better one :D
15:32:08 <j`ey> "0b010000 Synchronous External abort, not on translation table walk"
15:34:18 <j`ey> feels like backwards progress heh
15:35:27 <aalm> .theo unsafe{}
15:35:28 <glenda> Don't like it? Then walk away.
15:37:40 <xenos1984> Interesting... Are you testing on real hardware or on an emulator?
15:37:47 <j`ey> qemu
15:38:04 <xenos1984> Ah... Which machine type?
15:39:01 <j`ey> virt
15:39:08 <j`ey> -M virt -cpu cortex-a53
15:39:20 <xenos1984> Hm... Never used that one.
15:40:50 <xenos1984> Normally I would expect that external abort points to a hardware problem.
16:13:33 <j`ey> I really cannot find a good explanation of t0sz :(
16:15:31 <xenos1984> ARMv8 manual says that region size is 2^(64 - T0SZ) bytes.
16:15:57 <j`ey> so I could just set it to 0?
16:16:00 <xenos1984> T0SZ = 16 as you have means 2^48 bytes.
16:16:13 <xenos1984> There are certain min and max values.
16:16:35 <xenos1984> ...which depend on the particular implementation / CPU.
16:24:47 <j`ey> if I have it to 16, I get the "0b010000 Synchronous External abort, not on translation table walk", if I set it to 25 I get 0b000101
16:24:50 <j`ey> Translation fault, level 1
16:28:33 <xenos1984> Yes, 25 changes things a lot. You get one fewer level of translation tables as well, so I think then your L1 table entry should be what your L2 is now.
16:29:06 <j`ey> 0x400019a8 has 17 leading zeros
16:29:12 <j`ey> (if 48bits)
16:29:57 <j`ey> I havent yet found a good explanation of how the number of translation tables changes with t0sz, just that they change
16:32:07 <j`ey> "You can configure the level of translation table that is used for the first lookup. The full translation
16:32:10 <j`ey> process can require three or four levels of tables. You need not implement all levels. The first level
16:32:13 <j`ey> of lookup is, in effect, determined by the granule size and TCR_ELn.TxSZ fields."
16:32:26 <j`ey> if only it actually explained this :/
16:32:35 <xenos1984> Hm... I have a table D4-12 in section D4.2.6 of the ARMv8 manual...
16:33:31 <j`ey> D4 seems to be about pauth for me
16:33:50 <xenos1984> 4k granule size, T0SZ between 16 and 24 is 4 levels
16:34:05 <xenos1984> 25 to 33 is 3 levels
16:34:08 <j`ey> ok: https://armv8-ref.codingbelief.com/en/chapter_d4/table_d4_12.png
16:34:25 <j`ey> but I thought you could 'terminate' them early, right?
16:35:24 <xenos1984> You mean by using blocks? Such as 1GB?
16:35:35 <j`ey> yes
16:35:40 <xenos1984> Yes.
16:35:40 <j`ey> like what Im trying
16:36:08 <xenos1984> But with T0SZ = 25 you already start the lookup one level higher.
16:36:33 <j`ey> ok, so Ill keep the 16
16:36:33 <xenos1984> Basically it means you have only one single L2 table.
16:36:57 <xenos1984> You can, if you want to keep the table layout.
16:37:13 <j`ey> I just want to get it working anyway possible at this point
16:38:12 <j`ey> I dont really understand what it means to 'start' at a different level. you have to set ttbr0 to that level, it just ignores some of the higher bits, right?
16:38:38 <xenos1984> You can try to put 25 and give the address of your L2 table to the table base register.
16:38:51 <xenos1984> Yes, exactly.
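
The T0SZ arithmetic discussed above can be written out as a small calculation. This is a sketch in Python under stated assumptions: a 4 KB granule, 12 bits of page offset, and 9 bits of index per table level. It reproduces the ranges xenos1984 quotes (T0SZ 16 to 24 gives 4 levels, 25 to 33 gives 3 levels). The function name `levels_4k` is hypothetical, and the sketch does not check the implementation-specific minimum and maximum T0SZ values.

```python
def levels_4k(t0sz):
    """Return (va_bits, levels, start_level) for a 4 KB granule.

    Assumes 12 bits of page offset and 9 bits resolved per table level.
    """
    # Region size is 2^(64 - T0SZ) bytes, so this many address bits are used.
    va_bits = 64 - t0sz
    # Bits left over after the page offset must be covered by table indexes.
    index_bits = va_bits - 12
    # Ceiling division: a partially used level still needs a table.
    levels = -(-index_bits // 9)
    # With a 4 KB granule the last level is level 3, so the walk starts here.
    start_level = 4 - levels
    return va_bits, levels, start_level


if __name__ == "__main__":
    for t0sz in (16, 24, 25, 33):
        print(t0sz, levels_4k(t0sz))
```

With T0SZ = 25 the walk starts at level 1, which is why the table base register then has to point at what was the second table in the T0SZ = 16 layout.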
16:39:40 <j`ey> same error
16:39:51 <xenos1984> external abort?
16:41:04 <j`ey> yup Synchronous External abort, not on translation table walk
16:41:27 <j`ey> maybe I should have just downloaded qemu's source and added prints
16:42:22 <xenos1984> Hm... Could be something about caching or sharing... What happens if you set TCR and MAIR to the values I use in my code?
16:43:03 <j`ey> tried that before, can try again
16:43:20 <xenos1984> MAIR = 0x4404ff, TCR = 0xb5193519
16:45:04 <j`ey> nope, no luck
16:45:53 <xenos1984> Another thing I saw in my code... Apparently one needs to set bit 0 of TTBR0 to 1.
16:46:03 <xenos1984> So load table address + 1 instead.
16:46:19 <xenos1984> No idea why, seems not to be documented well...
16:46:24 <j`ey> ok....
16:46:33 <j`ey> dont think ive seen that in other code, will look
16:47:09 <j`ey> nope
16:48:20 <j`ey> (nope as-in, it didnt work)
16:48:49 <xenos1984> That bit is called CnP, apparently used by newer implementations.
16:48:53 <xenos1984> Hm...
16:49:48 <xenos1984> Are you sure the physical address is correct?
16:50:07 <j`ey> Im not sure of anything at this point :)
16:50:33 <j`ey> but yes, my linkerscript: . = 0x40000000;
16:51:11 <xenos1984> I meant in the page table... I'm tired of counting bits there :D
16:51:33 <xenos1984> What if you set the entry to 0x0000000040000401?
16:51:34 <j`ey> i have no idea, ive re-done it many times
16:51:45 <xenos1984> instead of binary
16:52:17 <j`ey> hmm data abort
16:53:12 <xenos1984> Interesting...
16:53:18 <j`ey> different place now
16:53:24 <j`ey> maybe that worked, maybe my binary was wrong
16:53:28 <xenos1984> Earlier or later?
16:53:39 <xenos1984> Yes, I counted, that binary is wrong.
16:53:45 <xenos1984> Too many 0's.
16:53:58 <j`ey> ffs
16:54:03 <j`ey> I went over it so many times
16:54:36 <j`ey> it had 64 digits.. im sure
16:55:37 <xenos1984> But the 1 in the wrong place?
16:55:48 <j`ey> hm
16:55:48 <xenos1984> Must be bit 29.
16:55:53 <j`ey> hm
16:56:11 <xenos1984> sorry, 30
16:56:51 <j`ey> ugh
16:56:59 <j`ey> I tried several different versions
16:56:59 <j`ey> hm
16:58:20 <xenos1984> So the external abort means you pointed the physical address somewhere outside of physical RAM.
16:59:10 <j`ey> ffs, I knew I should have just written a function for generating that value
17:00:08 <xenos1984> Probably :D But the MMU should work now, so I suppose that data abort has a different reason...
17:00:40 <j`ey> well I didnt map uart
17:00:46 <j`ey> that's probably it
17:01:03 <xenos1984> Sounds about right.
17:02:34 <j`ey> ugh, thanks a lot
17:02:44 <j`ey> I mean, thanks a lot, ugh, that was so stupid
17:03:20 <xenos1984> you're welcome :D well, that happens
17:03:43 <j`ey> could have been avoided if I just wrote a simple helper function :)
17:04:44 <xenos1984> Or the number in hex :D
17:05:03 <j`ey> heh yeah. I was trying to follow a diagram, and it felt easier to write it in binary :)
17:05:16 <xenos1984> I see ;)
17:06:00 <xenos1984> But the upshot - aarch64 exceptions tell you pretty well what's going wrong and where. Translation fault at which level, no physical memory at the place you want to access...
17:06:53 <xenos1984> That helps pretty well in debugging and finding the issue.
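
The helper function j`ey wishes for could look roughly like this. It is a sketch in Python for checking the arithmetic, not j`ey's Rust code. It only models the bits mentioned in the conversation: the low two bits (0b01 for a block, 0b11 for a table), the access flag at bit 10, and an attribute index at bits 4:2. The names `block_descriptor` and `table_descriptor` are hypothetical, and shareability and permission bits are left out.

```python
ACCESS_FLAG = 1 << 10  # AF, bit 10, as discussed in the chat

GIB = 1 << 30  # size of the 1 GB block used for the mapping


def block_descriptor(phys, block_size=GIB, attr_index=0, flags=ACCESS_FLAG):
    """Build a block entry mapping block_size bytes at physical address phys."""
    if phys % block_size != 0:
        raise ValueError("physical address is not aligned to the block size")
    if not 0 <= attr_index <= 7:
        raise ValueError("attribute index must fit in 3 bits")
    # The output address goes in unshifted; the low bits hold the flags.
    return phys | flags | (attr_index << 2) | 0b01


def table_descriptor(table_phys):
    """Build a table entry pointing at a 4 KB aligned next-level table."""
    if table_phys % 4096 != 0:
        raise ValueError("table address is not 4 KB aligned")
    # No shift by 12: the aligned address already has its low 12 bits clear.
    return table_phys | 0b11


if __name__ == "__main__":
    print(hex(block_descriptor(0x40000000)))
    print(hex(table_descriptor(0x4000)))
```

For the addresses in the conversation this yields 0x40000401 for the 1 GB block at 0x40000000 and 0x4003 for a table at 0x4000, matching the values xenos1984 suggests. The alignment checks would also have caught an address bit placed at the wrong position.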
17:08:29 <j`ey> think I'll not bother with getting uart working now :)
18:22:53 <Nizumzen> does anyone know if you can order printed copies of the intel developer manuals?
18:23:23 <zid`> Sometimes you can sometimes you can't, afaik
18:25:34 <zid`> http://www.lulu.com/spotlight/IntelSDM
18:25:40 <zid`> they're charging atm it seems
18:26:56 <Nizumzen> hmm might have to order them
18:27:05 <Nizumzen> not that worried about the cost
18:27:17 <zid`> I have free ones :D
18:27:23 <Nizumzen> nice :)
20:21:07 <mrvn> anyone storing kernel messages in memory to be recoverable after a crash?
20:30:16 <graphitemaster> Nizumzen, I tried to get Staples (Office Depot) to print them and bind the whole combined volume one, they couldn't
20:30:38 <Nizumzen> ah that is a shame
20:30:51 <bcos> mrvn: Tried that many years ago and found that firmware wipes memory during reboot/reset on most systems
20:33:04 <Nizumzen> graphitemaster: I might try again anyway - I'm not sure if it'll be any different in England but you never know
20:34:06 <bcos> mrvn: Alternatives include UEFI vars and non-volatile RAM; and some kind of "pre-prepared kexec()" (to avoid firmware being used during restart)
20:35:19 <mrvn> UEFI var? Are you insane? That bricks your efi
20:35:56 <bcos> It possibly shouldn't brick UEFI..
20:36:40 <mrvn> bcos: there are some EFI that won't boot anymore if the EFI vars ever fill the space.
20:41:54 * bcos doesn't really like having a dependency on firmware type for anything after boot anyway; but there's not many good/easy alternatives
20:42:56 <bcos> (e.g. the "kexec()" idea sounds nice until you try to bring everything back to a sane state after a kernel panic)
21:10:43 <mrvn> bcos: an alternative to rescue crash logs are QR codes
21:13:51 <j`ey> :O
21:18:19 <graphitemaster> morse code with PC speaker