00:46:01 Monster database of master branch on crawl.develz.org updated to: 0.35-a0-47-g35ae254118
04:39:01 Experimental (bcrawl) branch on underhound.eu updated to: 0.23-a0-5261-gd9800d219b
05:15:28 achylon (L14 ReHW) ASSERT(!invalid_monster(mon_act)) in 'state.cc' at line 443 failed. (Orc:1)
05:51:07 <dracoomega> https://archive.nemelex.cards/morgue/xombeh/crash-xombeh-20260208-101536.txt This is a baffling crash. (And it's repeatable, since this person hit it multiple times.) I can't tell entirely what is going on, but somehow _dgn_find_nearest_square has produced a vector of size 33554432 by doing a breadth-first search of a dungeon floor. Which, even if you assume that every square is reachable and iterated, has only 5600 spaces in total. None of this code has changed in a long time, and this result would seem to be impossible no matter what, when, or why it was searching, surely?
07:19:09 <Odds> The BFS does seem very weird... each iteration has the correct points for BFS, but repeated many times
07:20:45 <Odds> I think on the i^th iteration, points contains all points at distance i, each repeated as many times as there are paths of length i to it
07:22:38 <Odds> (We need to mark np as visited in the inner loop to avoid this)
07:34:46 <dracoomega> Wait, really? So are you saying this implementation of this has been super broken for ages and it just happens to rarely be visible?
07:35:00 fwiw, Claude figured it out pretty quickly...
07:35:16 <Odds> Or only to break on sufficiently large searches?
07:35:27 <dracoomega> I mean, that sounds pretty broken to me
07:35:29 > The bug: visited is set when a cell is processed (dequeued), but checked when enqueueing neighbors. This means if two cells A and B in the same BFS layer are both adjacent to cell C, then when processing A, C isn't visited yet so it gets enqueued. When processing B in the same layer, C still isn't visited (it won't be until the next layer), so it gets enqueued a second time. Those duplicates then each generate their own duplicates, causing exponential growth.
07:35:35 <Odds> Why this is happening on what appears to be a small level is confusing, but I haven't worked out the stack trace
07:35:38 <dracoomega> (To the point that I am surprised it wouldn't have come up years ago)
07:35:39 <Odds> It's functionally correct
07:35:50 <Odds> But yeah, I am also surprised
07:35:53 > In normal levels this bug is mostly harmless because the BFS finds an acceptable square quickly before the duplicates compound too badly. But in the crash case, _dgn_shift_item at line 1105 called with keep_visible=true first tries _item_visible_square (must be safe AND in player LOS AND not the player's position) with a traversability constraint.
07:35:59 > If the player can't see many valid squares (e.g., narrow corridors, walls everywhere), the BFS has to expand many layers before finding a match (or never finds one), and the exponential duplicate growth blows up to 33M+ entries before allocation fails.
07:36:25 <Odds> Hmmmm I think Claude is way over its skis in that last bit
07:36:32 > If you look at the coordinate data, you can see tons of duplicates (e.g., {x=27, y=12}, {x=59, y=10}, {x=58, y=12} appear over and over). This strongly suggests the visited check is failing to prevent re-visiting: either the BFS is checking visited after enqueueing rather than before, or the adjacency iterator (ai in the trace) is generating coordinates that map outside the visited grid bounds and wrapping around, bypassing the dedup.
07:37:50 <Odds> OOC, what did you give the bot? The stacktrace and the codebase?
07:38:06 Only the stacktrace first, then just a copy of the latest terrain.cc
07:38:24 https://claude.ai/share/775eb154-cdf8-4209-9c39-0b865cfe3a99
07:38:36 It even suggests the fix.
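For readers following the BFS discussion above, here is a minimal toy sketch of the duplicate-growth effect Odds describes. It is not the Crawl code: `neighbours`, `layer_sizes`, the open 21x21 grid, and the 8-way adjacency are all assumptions made for illustration. The sketch assumes a layer-by-layer search where, in the buggy mode, a point only counts as visited once its whole layer is being processed, while the fixed mode marks `np` as visited in the inner loop at the moment it is enqueued.

```python
from itertools import product


def neighbours(p, width, height):
    """Yield the in-bounds 8-way neighbours of p (hypothetical helper)."""
    x, y = p
    for dx, dy in product((-1, 0, 1), repeat=2):
        if (dx, dy) == (0, 0):
            continue
        nx, ny = x + dx, y + dy
        if 0 <= nx < width and 0 <= ny < height:
            yield (nx, ny)


def layer_sizes(start, width, height, layers, mark_on_enqueue):
    """Return the length of the work list at each BFS layer.

    mark_on_enqueue=False models the bug: points are only marked visited
    when their layer is processed, so a point reachable from several
    points in the previous layer is appended once per path.
    mark_on_enqueue=True marks np in the inner loop, so each point is
    appended at most once.
    """
    visited = {start}
    points = [start]
    sizes = []
    for _ in range(layers):
        sizes.append(len(points))
        if not mark_on_enqueue:
            visited.update(points)  # too late: duplicates are already queued
        next_points = []
        for p in points:
            for np in neighbours(p, width, height):
                if np in visited:
                    continue
                if mark_on_enqueue:
                    visited.add(np)  # the fix: dedupe at enqueue time
                next_points.append(np)
        points = next_points
    return sizes


fixed = layer_sizes((10, 10), 21, 21, 4, mark_on_enqueue=True)
buggy = layer_sizes((10, 10), 21, 21, 4, mark_on_enqueue=False)
```

Under these assumptions the fixed search visits 8·d points at distance d, while the buggy work list holds one entry per path and so outgrows the number of squares on the level after enough layers, even though the set of distinct points in each layer is still correct.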
07:38:59 fwiw, this is with latest Opus 4.6 model
07:40:03 If this crash is easily repeatable with this seed, then the fix should be easy to test, too.
07:40:37 <Odds> Like DO, I'm still quite curious why the crash is new/super rare
07:41:34 Is it? Do you/we have all crashes stashed somewhere where we can see those that have `_dgn_find_nearest_square` in the stacktrace?
07:42:04 <Odds> Well yeah they get posted here
07:42:15 <dracoomega> I mean, there's no automated way to search all the crashlogs for that, but I've been around a while and never seen it happen before
07:42:25 <dracoomega> At least that I remember
07:42:35 It's probably rare because it only happens on specific degenerate cases - in the general case, it works just fine because the search finds a candidate before it hits a loop?
07:43:03 <Odds> In which case the question is what the specific degenerate case is
07:47:37 <Odds> Looks like this can only happen when _item_safe_square returns false for all squares near the item? Which I'm struggling to see why it would
07:47:43 <dracoomega> Honestly, another confusing part of this is the apply_daction_to_transit up the stack, which may be at least part of the issue.
07:47:54 <dracoomega> It feels like there's undefined behavior here
07:48:35 <Odds> Do you know approximately what's happening here? Like, what event is being resolved?
07:49:13 <dracoomega> In that it is killing an old bound soul still on the transit list (possibly because it fell down a shaft to a level you haven't entered yet). But I'm pretty sure that means its position is arbitrary - it's not even on this floor, for instance
07:49:35 <Odds> It looks like we've ended up trying to shift an item from an illegal position, when all nearby positions are also illegal
07:49:35 <dracoomega> And then is trying to drop its things while it is 'inside a wall'
07:49:42 <Odds> Ahhhh right
07:49:53 Looking at the screenshot, where is the player? It says "Position: (40, 33)" but ... ?
07:50:09 <Odds> Sounds like the screenshot is a red herring 🙂
07:50:15 <dracoomega> This seems like a super unsafe thing to be doing
07:50:16 * Dossy nods
07:50:33 Is this also a red herring: Invalid monster index -1581886235 currently acting:
07:51:21 <Odds> So yeah, the reason this hid is that we only use this very broken BFS to look for a place to put an item, and that tends not to be too hard. But now we are putting an item somewhere really stupid, like in many layers of rock.
07:51:22 <dracoomega> No, actually. A bit unfortunate-looking, but not actually indicative of an issue
07:51:28 So, corpse in illegal position trying to drop its loot?
07:51:46 <dracoomega> (It's because it's currently killing a monster not in env.monsters())
07:51:55 Ah.
07:51:59 <dracoomega> You could get similar things when querying that during level excursions
07:52:13 <Odds> And being in a tiny zig maybe helps us have a super illegal position?
07:52:25 <Odds> Though you said it might not even be searching on this level.
07:52:39 <dracoomega> No, I think it is searching on the current level, but the monster isn't on the level
07:53:04 <dracoomega> It's basically in a list "To be placed when the player enters X floor" but currently exists in limbo outside of anywhere
07:53:17 <Odds> Ah cool, so makes sense that this comes up in a tiny zig level
07:54:03 <dracoomega> This is apparently why we do this to transiting monsters (and an example of how they can end up this way)
07:54:06 <dracoomega> %git 3eaf7e5
07:54:07 <Cerebot> DracoOmega * 0.12-a0-2624-g3eaf7e59e0: Apply dactions to transiting monsters, refactor related code (13 years ago, 4 files, 126+ 56-) https://github.com/crawl/crawl/commit/3eaf7e59e043
07:54:09 <dracoomega> Thanks for the documentation, past me
07:54:22 :)
07:55:03 <dracoomega> Changing a transiting monster's own state this way is safe. And killing things that won't drop stuff is probably often safe (though I'd really want to check some things...)
07:55:15 <dracoomega> But trying to do anything with the monster's position is extremely undefined
07:56:24 <dracoomega> It's worth noting that this 'try to push items out of walls if they drop inside them' code is itself only 4 months old. Added in response to rockfish draugr
07:56:47 <dracoomega> So previously it probably just left the item in some wall somewhere out of sight and nobody noticed
07:57:09 <Odds> (Just noting, to reinforce my sense of superiority over the machines, that while Claude got the BFS bug right this surrounding explanation is mostly wrong - it's nothing to do with seeing valid squares)
07:57:16 <dracoomega> Actually, wait. A transiting monster isn't even holding its own items properly.
07:57:58 <dracoomega> (Because monster inventory is part of the floor it's on. So transiting monsters stash their items differently and whatever a monster's inventory appears to be is probably also wrong if you treat it normally at this point.)
07:58:26 <dracoomega> So it seems likely that whatever it's trying to drop isn't even a thing that it has
07:58:45 <dracoomega> Multiple layers of undefined behavior here
07:59:18 <Odds> Good thing we had a BFS that can't cope with a full level 🙂
07:59:25 <dracoomega> Heh >.>
07:59:51 <dracoomega> (Making monsters own their own items is part of a bunch of refactorings I'd like to do in future somehow, but the whole thing is a doozy >.>)
08:00:15 <dracoomega> (Because I need to do that in order to refactor all the needless duplication with monster_info away if we're going to make remembered copies of monsters)
08:00:17 <Odds> How is monster inventory tracked?
08:00:38 <dracoomega> A monster's inventory is an array of indices into the floor item list
08:01:32 <dracoomega> This is convenient for memory management in some ways (and is very, very old), but has a fairly obvious host of problems when looking on different levels
08:02:13 <dracoomega> For instance:
08:02:15 <dracoomega> %git b00f08b
08:02:16 <Cerebot> DracoOmega * 0.34-a0-738-gb00f08be49: Remove undefined equipment behavior when transiting monsters (Xolotl) (5 months ago, 1 file, 5+ 1-) https://github.com/crawl/crawl/commit/b00f08be4958
08:04:04 :o
08:04:15 <dracoomega> So much like the bug mentioned there, in this current crash it is probably trying to 'drop' some other item on the floor that happens to be in the same array position as whatever it was holding on the other floor
08:04:28 <dracoomega> And trying to do so at some arbitrary incorrect position
08:06:02 Just so I know, is this project generally anti-Claude/AI tooling?
08:06:29 Not regarding AI slop contributions, but for troubleshooting purposes?
08:06:39 <dracoomega> Fixing this one seems a lot more tricky, though. That commit above dealt with a monster that was in the process of being placed on the floor, so I could slightly reorder the process of placing it. But we do need to wipe out the old bound soul even when it is not in a state where it should be placed anywhere, and its items shouldn't be lost either.
08:07:27 If you move the items before the monster, does that help?
08:07:32 <dracoomega> But we can't even use a level excursion to kill it, since currently it isn't anywhere
08:09:34 <dracoomega> (In fact, monsters can be shafted to floors the player has never been on yet)
08:09:47 <dracoomega> Can you make an excursion to a level that hasn't generated yet, I wonder?
08:11:24 <Odds> Not that I've heard, I imagine we're in the usual place of wanting any quality contributions but being pretty wary of AI slop? If I sounded a little sharp in response to your Claude diagnosis, it was probably because I didn't feel it was adding on top of the observation I'd already made, except more probably-wrong detail (and because of that AI slop wariness).
08:12:16 Okay, I wanted to share this observation from Claude, because it sounds plausible - and, I didn't even let Claude know this was a zig level, and it came up with this:
08:12:18 > The stair marker at (34, 35) is also interesting: if this is Descent mode, descent_crumble_stairs() could be what triggered the terrain change that caused dgn_check_terrain_items → _dgn_shift_item in the first place. A stair outside the room gets destroyed, but if that propagates to items somehow, or if there's a separate terrain event inside the vault, that would kick off the chain.
08:12:22 <Odds> (I've actually also asked Claude to debug a problem or two, and haven't found it useful for anything substantial in this codebase, but I've not tried the latest version and wouldn't claim to be an expert)
08:13:12 <dracoomega> I mean, as best I've been able to tell, while AI code-related tools look very impressive (and frankly have higher accuracy than I'd have expected in some ways), they're still incorrect sufficiently often (while being extremely confident nonetheless) that it seems bad policy to me to lean on them at all.
08:14:00 That's fair. That's why I treat them like any other reasonably competent but fallible pair programmer I might work with.
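As a side note on the inventory point from 08:00:38 ("an array of indices into the floor item list"), the following toy sketch shows why such indices go stale once a monster is no longer on the floor they were recorded on. It is only an illustration under stated assumptions: `Floor`, `Monster`, `held_items`, and every item name are hypothetical and do not correspond to Crawl's actual data structures.

```python
from dataclasses import dataclass, field


@dataclass
class Floor:
    """Hypothetical floor that owns the real item objects."""
    name: str
    items: list


@dataclass
class Monster:
    """Hypothetical monster whose inventory is only indices, not items."""
    name: str
    inv: list = field(default_factory=list)


def held_items(mon, floor):
    """Resolve the monster's inventory indices against a given floor."""
    return [floor.items[i] for i in mon.inv if i < len(floor.items)]


origin = Floor("floor the soul was bound on",
               ["orc corpse", "long sword", "scale mail"])
current = Floor("floor the player is on now",
                ["potion", "scroll of fog", "ring"])

# The indices were recorded while the monster stood on `origin`.
soul = Monster("bound soul", inv=[1, 2])

right = held_items(soul, origin)   # resolved against the correct floor
wrong = held_items(soul, current)  # same indices, unrelated items
```

Resolving the same indices against the wrong floor silently yields unrelated items, which matches the description of the crash trying to "drop" some other item that happens to sit in the same array position.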
08:14:02 <dracoomega> (I mean, I am also incorrect plenty of times, but either way I still need to actually understand the problem myself - and in some cases, AI is just one more thing to have to verify)
08:14:05 <Odds> In my professional coding life, they've got to the point where them writing code and me reviewing it is probably better than me without them (the domain could hardly be more different to DCSS code though)
08:15:07 <dracoomega> I've seen studies, actually, showing that while people often think they increase productivity, objective measures of the same population show the opposite. (Though I imagine this might vary by sub-discipline somewhat)
08:15:57 <Odds> Yeah, there was a study like that... the domain was kinda similar to "you working on crawl code"
08:18:04 There are absolutely tasks that, at least in my experience and specifically with Claude, are absolutely super-human in terms of capability and speed. I was able to give Claude some legacy JavaScript that was minified and obfuscated, where the original source had been lost to time, and it was able to do a very high quality job of deobfuscating and unminifying it. Before these tools, when I'd do that kind of task by hand, with that specific code, probably would
08:18:14 <dracoomega> Hmm... I wonder actually if a quick-and-simple solution here is to give them a 1 aut ENCH_SLOWLY_DYING and not kill them at all. This should kill them instantly after they get placed normally, ensuring general correctness of death processing.
08:18:14 Claude literally spat its version out in maybe 3 minutes?
08:18:45 damn halloy irc client :-/
08:18:54 What I tried to say:
08:18:56 There are absolutely tasks that, at least in my experience and specifically with Claude, are absolutely super-human in terms of capability and speed. I was able to give Claude some legacy JavaScript that was minified and obfuscated, where the original source had been lost to time, and it was able to do a very high quality job of deobfuscating and unminifying it.
08:19:14 Before these tools, when I'd do that kind of task by hand, with that specific code, probably would have taken me 3-4 days. Claude literally spat its version out in maybe 3 minutes?
08:19:20 <Odds> (This fact makes me quite sad, because I like my craft. So even if claude becomes the right way to work on crawl, I'll continue doing it by hand for the joy of it 🙂 )
08:19:42 Absolutely. Just because we have power tools doesn't mean you can't enjoy handcrafting.
08:20:01 <dracoomega> I've not yet seen evidence of it replacing us
08:20:41 <dracoomega> (Even for technical work like this, but even moreso for anything design-adjacent)
08:20:51 That's the scary thing. There won't be evidence, until after it's happened. Or, what I mean is: when it is about to happen, it's going to happen _very_ fast.
08:21:07 And that is terrifying.
08:21:11 <Odds> Oh yeah for design I don't think it's remotely close
08:23:44 Having been doing this for 37 years now, I'll say this: if I can let AI work on the code that I must in order to pay the bills, and that lets me spend more time working on the code I want to because I enjoy it, that's a win for me
08:25:31 Of course, the reality is the code that will still require humans will probably be the code that people are willing to pay humans to work on, and will consume all of my time, leaving me to use AI tools to autonomously work on the code that I wanted to work on. :-/
08:40:58 <dracoomega> Well, I've at least verified the incorrect behavior (as a prelude to verifying the fix). I forgot that allies can't get shafted anymore, but banishment does work here. I got my bound soul to get banished, bound another, and it moved a different item on the floor to where the monster was banished from
10:20:39 DracoOmega * 0.35-a0-48-ga6cf1f4fed: Give freezing cloud a warning prompt again (3 hours ago, 5 files, 25+ 36-) https://github.com/crawl/crawl/commit/a6cf1f4fed3f
10:20:39 DracoOmega * 0.35-a0-49-g1b2965c119: Remove an assert that is reachable through a valid event sequence (3 hours ago, 1 file, 0+ 1-) https://github.com/crawl/crawl/commit/1b2965c1192a
10:20:40 DracoOmega * 0.35-a0-50-gca4ab9d3b7: Fix undefined behavior when killing a transiting bound soul via daction (57 minutes ago, 2 files, 10+ 3-) https://github.com/crawl/crawl/commit/ca4ab9d3b7e1
10:20:40 DracoOmega * 0.35-a0-51-g0642ae1248: Fix a mesmerism-related crash (Lici) (17 minutes ago, 1 file, 3+ 1-) https://github.com/crawl/crawl/commit/0642ae124892
10:20:40 DracoOmega * 0.35-a0-52-gbbfe210076: Prompt before transforming in ways that would lower max HP a lot (various) (2 minutes ago, 2 files, 13+ 1-) https://github.com/crawl/crawl/commit/bbfe2100761b
10:20:43 DracoOmega [stone_soup-0.34] * 0.34.0-1-g28811ce238: Give freezing cloud a warning prompt again (3 hours ago, 5 files, 25+ 36-) https://github.com/crawl/crawl/commit/28811ce238d8
10:20:43 DracoOmega [stone_soup-0.34] * 0.34.0-2-g4544f204d9: Remove an assert that is reachable through a valid event sequence (3 hours ago, 1 file, 0+ 1-) https://github.com/crawl/crawl/commit/4544f204d9ba
10:20:43 DracoOmega [stone_soup-0.34] * 0.34.0-3-gd3414bd8d5: Fix undefined behavior when killing a transiting bound soul via daction (57 minutes ago, 2 files, 10+ 3-) https://github.com/crawl/crawl/commit/d3414bd8d5bb
10:20:43 DracoOmega [stone_soup-0.34] * 0.34.0-4-g4b2728444d: Fix a mesmerism-related crash (Lici) (17 minutes ago, 1 file, 3+ 1-) https://github.com/crawl/crawl/commit/4b2728444de3
10:20:43 DracoOmega [stone_soup-0.34] * 0.34.0-5-g3e7bbf7a23: Prompt before transforming in ways that would lower max HP a lot (various) (2 minutes ago, 2 files, 13+ 1-) https://github.com/crawl/crawl/commit/3e7bbf7a2388
10:39:00 Build failed for stone_soup-0.34 @ 3e7bbf7a https://github.com/crawl/crawl/actions/runs/21802173402
10:45:52 Build failed for master @ bbfe2100 https://github.com/crawl/crawl/actions/runs/21802173990
10:46:55 <dracoomega> Not sure what that error is about
16:44:32 Unstable branch on underhound.eu updated to: 0.35-a0-52-gbbfe210076 (34)
17:00:21 Hey, can someone crank the handle and get the CAO streaks page to update? It would be nice to see Sergey at the top.
17:04:10 <o____0> Pinkbeast The CAO score pages have stopped updating during tournaments since I started playing. https://dcss-stats.com/streaks works of course!
17:17:42 iirc cao's too overloaded to update them during t, and has been for several years
23:35:47 Unstable branch on crawl.develz.org updated to: 0.35-a0-52-gbbfe210076 (34)
23:59:17 Windows builds of master branch on crawl.develz.org updated to: 0.35-a0-52-gbbfe210076