00:08:54 <04C​gettys> Yeah, the question is, where's the rest of the latency
00:09:05 <04C​gettys> yes I need more realistic stress testing to repro it
00:09:21 <04C​gettys> of course, Python async is not exactly easy to profile
00:10:20 <04C​gettys> And tornado is explicitly single threaded from my understanding; all interactions with it to e.g. send messages have to come from the main thread (except the background scheduling bit, but that won't help with most of the time being spent inside tornado calls)
00:10:36 <04C​gettys> Long story short, the current architecture has some hard scaling limits
00:11:33 <04C​gettys> question is just, how much can we stretch it, or should we rewrite some significant chunks
00:18:39 <04C​gettys> To be clear, I'm not saying tornado may not use more threads under the hood
00:19:00 <04C​gettys> just that enqueueing messages itself is going to become enough of a bottleneck
00:19:45 <06m​umra> It's very easy to understand how busy servers start showing more severe lag problems anyway. Everything is in a queue waiting for everything else. If I press a key, the server is waiting till it's sent any currently queued messages out to clients of all running games (possibly including your own game, which it could still be churning through messages for) before it can even hand the keypress to the game process. Then the response from the game is in a queue waiting for anything else that got queued since then
00:20:07 <04C​gettys> Right
00:20:42 <04C​gettys> And ballpark (I had the more precise number, but didn't necessarily save it), you can maybe process sending 10k websocket messages through tornado per second
00:20:45 <04C​gettys> maybe
00:20:53 <04C​gettys> if there was no other processing anywhere else
00:20:55 <06m​umra> I'm pretty sure there is only ever one thread. Tornado is just a Python library, there's no magic there
00:21:05 <04C​gettys> It looked like we might be able to squeeze out 2x or 3x maybe
00:21:09 <04C​gettys> vs current
00:21:22 <04C​gettys> if the code I was staring at was really the bottleneck, which who knows
00:23:22 <04C​gettys> and even if there are more threads, not more truly concurrently executing threads
00:23:27 <04C​gettys> it's pure Python as you said
00:23:39 <04C​gettys> https://github.com/tornadoweb/tornado
00:23:42 <04C​gettys> So GIL for sure
00:31:12 <04C​gettys> But also the question is, how much scalability do we need
00:31:18 <04C​gettys> many of the servers have like, 8 threads
00:31:38 <04C​gettys> if most of the time is spent in the games, it may not matter
00:41:21 <06m​umra> They're just not threads. Coroutines are a single thread pretending to be concurrent by yielding to the event loop (it's exactly how javascript async/await works)
00:41:44 <04C​gettys> Yeah, I know, coroutines exist, green threads exist, heck, fibers exist
00:42:18 <06m​umra> (Javascript can do proper multithreading using webworkers, which is actually used in the emscripten build, but it's not common to use this in servers, as if you do non-blocking IO properly you can still be massively scalable without it. And horizontal scaling is easier)
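A minimal sketch of the constraint described above, assuming nothing about the actual webtiles code: tornado's IOLoop runs on one thread, and IOLoop.add_callback is its only documented thread-safe method, so every send initiated off the main thread still funnels back through that single thread. Handler and worker names here are illustrative.

```python
import threading
import time

import tornado.ioloop
import tornado.web
import tornado.websocket


class EchoSocket(tornado.websocket.WebSocketHandler):
    """Illustrative handler, not the webtiles CrawlWebSocket."""

    def open(self):
        loop = tornado.ioloop.IOLoop.current()

        def worker():
            # write_message() is not thread-safe, so a background thread
            # must hand each send back to the event loop. The send itself
            # still executes on the main thread, which is why enqueueing
            # becomes a bottleneck no matter how many threads feed it.
            for i in range(5):
                time.sleep(1.0)
                loop.add_callback(self.write_message, "tick %d" % i)

        threading.Thread(target=worker, daemon=True).start()

    def on_message(self, message):
        self.write_message(message)  # fine: already on the IOLoop thread


if __name__ == "__main__":
    tornado.web.Application([(r"/ws", EchoSocket)]).listen(8888)
    tornado.ioloop.IOLoop.current().start()
```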
00:42:28 <04C​gettys> If you haven't been introduced to fibers, let me recount the horrors of fibers 😄
00:42:37 <04C​gettys> Right, I know you can do a tremendous amount with a single thread
00:42:52 <04C​gettys> but as soon as you have enough CPU-bound work, you need to use more than one core at the end of the day
00:43:02 <04C​gettys> whether that's 1 core per process or not, is a design choice
00:43:13 <06m​umra> The server should not be having to do so much CPU work that it's a problem. It's just acting as a message router
00:43:42 <06m​umra> The game processes can genuinely run on other cores so they should be carrying all the CPU burden
00:43:49 <04C​gettys> Sure, but "just" is glossing over what the current architecture actually does
00:44:18 <04C​gettys> it's putting together individual messages (which is not the bottleneck, at least in my trivial tests) into an array of them, compressing, sending over a socket, etc
00:45:18 <04C​gettys> And yeah hopefully the game is more expensive and therefore it scales fine
00:45:32 <04C​gettys> but that's not necessarily true, especially with some old versions of Python being used for it 😄
00:47:14 <04C​gettys> What I'm trying to say is it's not a terrible choice
00:47:18 <04C​gettys> and I've seen far worse
00:47:37 <04C​gettys> but it's doing more byte manipulation and the like than I'd want to see
00:50:44 <06m​umra> See, it's quite likely batching the messages is completely unnecessary since websockets are streaming everything
00:51:11 <06m​umra> And certainly allocating string memory is going to be expensive and also creates more garbage collection
00:51:15 <04C​gettys> Dunno, sending them in bigger packets could make sense
00:51:19 <04C​gettys> better compression
00:51:43 <04C​gettys> it needs to flush the stream every message anyway in other words
00:51:48 <04C​gettys> or every batch
00:51:53 <04C​gettys> batching if the data is ready to send is perfectly reasonable
00:52:08 <04C​gettys> though maybe it should chunk it if it's too large
00:52:19 <04C​gettys> e.g. take up to 10 messages per queue and round robin
00:52:24 <04C​gettys> instead of "take everything"
00:52:52 <04C​gettys> But I still don't think we've found the bottleneck well enough, idk
00:53:00 <06m​umra> https://stackoverflow.com/a/10929855
00:53:47 <04C​gettys> Sure, it's not the websocket layer I was talking about
00:53:51 <04C​gettys> I'm talking about e.g. zstd
00:53:51 <06m​umra> I'm trying to say that I think that all of this is probably premature optimisation (maybe it was necessary 12 years ago when this stuff was written, when full browser support for TCP websockets was not readily available)
00:54:07 <04C​gettys> Fair, that I could believe
00:54:27 <04C​gettys> for example it looks like tornado can now do the compression for us if we ask it to
00:54:35 <04C​gettys> yes, using exactly the same code we have, I think
00:55:28 Monster database of master branch on crawl.develz.org updated to: 0.33-a0-1120-gbd99899e7f
00:57:02 <04C​gettys> I suspect there is a lot of premature optimization here, but also my gut is the architecture isn't quite right
00:57:53 <04C​gettys> Also some accidental pessimization
00:58:04 <04C​gettys> getting the current stack trace is really quite expensive
00:58:54 <06m​umra> Crawl is not a 60FPS real-time network game; the number of messages and the size of them are not actually huge. Websockets should just handle it well enough. Even if tornado was doing the compression, it is still probably more CPU overhead than actually helps. The important thing is to start sending the message as quickly as possible. 1-2kb is not a scary amount of data on modern network speeds (and I think that would be a normal packet size without any compression)
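One cheap way to sanity-check the CPU-cost question here: time zlib at a few levels over a representative batch. A standalone sketch with a made-up payload (webtiles' real messages are JSON batches, but the shape below is invented for illustration):

```python
import json
import time
import zlib

# Invented stand-in for a batch of webtiles JSON messages.
batch = json.dumps([{"msg": "map", "cells": list(range(200))}] * 50).encode()

for level in (1, 3, 6, 9):  # zlib's default level is 6
    start = time.perf_counter()
    for _ in range(1000):
        out = zlib.compress(batch, level)
    per_call_us = (time.perf_counter() - start) / 1000 * 1e6
    print(f"level {level}: {len(out):6d} bytes, {per_call_us:7.1f} us per batch")
```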
00:59:25 <04C​gettys> I'm not saying you're necessarily wrong
00:59:51 <04C​gettys> but it didn't look like it got as much faster as I'd have expected when I turned it off
01:00:10 <04C​gettys> Should we try turning it off on one of the servers or all of them? sure, go for it, it's a 1 line change to hack it off
01:02:19 <06m​umra> I'd be really interested to see the effect anyway (but you're very right that by itself it might not make much difference)
01:02:39 <04C​gettys> I definitely absolutely think that we should turn down the compression level if we do keep it on
01:02:58 <04C​gettys> 1 or 2 or 3 is a much better setting for us than the default (which on modern versions is 6, but may have been different in the past)
01:03:10 <04C​gettys> if it is the bottleneck, you might even see close to 2x better that way
01:04:10 <04C​gettys> But you're also right, the amount of bandwidth is pretty minuscule these days
01:04:26 <04C​gettys> even if you assume it's 10000x 10kb messages per second
01:04:32 <04C​gettys> which there's no way we're hitting
01:05:25 <04C​gettys> That'd be 100 megabytes per second
01:05:36 <04C​gettys> or 800 megabits
01:05:39 <04C​gettys> so if it's that, it's actually kinda sizable
01:05:43 <04C​gettys> but it's not
01:05:57 <04C​gettys> Probably anyway 😄
01:06:02 <04C​gettys> I'd be shocked if it were 1000x 1kb typical
01:06:40 <04C​gettys> which would be 1 MBps / 8 megabits per second
01:06:48 <04C​gettys> But really, better to get real numbers off of servers
01:25:20 <06m​umra> Yeah servers should easily handle that bandwidth outgoing
04:33:28 Experimental (bcrawl) branch on underhound.eu updated to: 0.23-a0-5261-gd9800d219b
05:07:11 Unstable branch on crawl.akrasiac.org updated to: 0.33-a0-1120-gbd99899 (34)
10:19:48 <02M​onkooky> Been perusing issues, compiled a list of probably closable: https://github.com/crawl/crawl/issues/4283 https://github.com/crawl/crawl/issues/4250 https://github.com/crawl/crawl/issues/4230 https://github.com/crawl/crawl/issues/3761 (monsters = dirty cheaters) https://github.com/crawl/crawl/issues/4092 (as above) https://github.com/crawl/crawl/issues/3956 (issue cleared by deleting saves/des) https://github.com/crawl/crawl/issues/3953 (believe this is a server feature, not actually crawlcode) https://github.com/crawl/crawl/issues/3692 (might be fixed already) https://github.com/crawl/crawl/issues/3373
10:19:52 <02M​onkooky> oh god embeds
10:59:43 <06m​umra> @Monkooky #4230 was a very easy close 😂
11:12:13 <06m​umra> Re #3761 .. is maurice supposed to be able to wield a giant club..?
11:14:10 <06m​umra> Same for #4092, doesn't sound right for an orc either ... Nice apostle tho
11:24:59 <02M​onkooky> I don't think there's weapon type restrictions on what monsters can grab off the floor
11:25:28 <02M​onkooky> the apostle getting it is a little weirder but I feel like it still falls under 'monsters are dirty cheaters'
11:34:15 <06m​umra> Sure, but they do still follow their own rules, and this seems to be breaking an otherwise consistent convention that only giant monsters can wield giant clubs
11:36:00 <04d​racoomega> Other monsters presumably picked it up off the floor, but the apostle case is somewhat weirder, since they shouldn't be able to do this and I certainly didn't hand out giant clubs to them. Could some other type of weapon get upgraded into skullcrusher when randartified??
11:36:26 <04d​racoomega> (Contrariwise: maybe that orc used to be an oni)
11:36:48 <04d​racoomega> (Okay, okay, I know they're also size: medium at the moment >.>)
11:37:21 <04d​racoomega> But it is 100% not on the list of stuff they're supposed to be given
11:48:15 <06m​umra> I closed a couple more tickets that seemed obvious, but a lot of these still look like actual bugs to me or reasonable requests
12:07:24 Hello all! I was looking to try to make a contribution to dcss; are there any lower priority issues on your radar that would be good as a first task (and is this still the best place to be interacting around crawl dev?)
12:17:47 <12e​bering> This is still the best place
12:19:02 <12e​bering> I’m somewhat retired so I don’t have a good sense of things, but our github issues are a good place to start. If you see one you can ask here and someone with a seasoned view of the codebase will gladly tell you “seems easy/seems nightmare”
12:23:29 Thanks! I've been poking around the issues and using those as an excuse to explore the code base. For instance I saw https://github.com/crawl/crawl/issues/3835 and spent a moment looking at mud related code; it seemed achievable technically to make mud impact invisibility, but also I wasn't sure how decisions about gameplay changes like that would be made.
12:24:57 <06m​umra> Honestly I was just looking at this one Monkooky linked to above and it seems like a reasonable thing to have and fairly entry-level: https://github.com/crawl/crawl/issues/3953 Basically you can't view Sprint high scores from the main menu, and yet they are tracked and are displayed after a sprint game. The complication here is there's a separate highscore table for each sprint. So you need a key to switch to sprint scores, and then pop up the menu to choose which sprint, and finally display the score table (As I write this I realise it's possibly not quite so entry level as I first thought, but the relevant UIs at least all exist already, they just need to be accessed in a new way)
12:27:23 @mumra -- nice! I can give it a whirl (unless someone else was about to jump on it) -- and it will force me to learn about the UX.
12:29:14 <06m​umra> Maybe scores should be accessible within the Sprint map selection instead, rather than buried in the main high scores screen (it'd be really nice if it summarised your best score next to each map so you can see at a glance which ones you've completed, but for the relatively small number of players that would strongly care about this feature, that's a lot of extra effort at that point)
12:32:37 <06m​umra> The ticket has sat there for the best part of a year with no comments, I doubt you would be treading on anyone's toes 🙂 UI/UX is an area of the game that tends to get the least attention unfortunately, so it's a great area to get stuck into if you want to contribute
13:36:00 <04C​gettys> @mumra - I did some more thinking about the emscripten thing. I still think it's worth doing, but I also suspect an emscripten frontend won't make webtiles feel like local, at least not so long as the game logic runs server side
13:36:27 <04C​gettys> In other words: even if the browser side was instant, you still have a round-trip before you can see what your keystroke did.
13:38:45 <06m​umra> Sure, and that's not the total aim
13:39:16 <06m​umra> Although, in my profiling, there is a tangible 30ms improvement in the rendering, which is huge alone in terms of responsiveness
13:39:35 <06m​umra> 100ms down to 70ms feels massive to a user
13:39:47 <06m​umra> Assuming that the server roundtrip stays constant
13:41:11 <04C​gettys> Fair, if the latency numbers are there, then it will
13:41:40 <04C​gettys> I was just noting, "hey, it might or might not be massive depending on the server latency in question"
13:42:00 <04C​gettys> If you're in say Arizona where one of the servers is, or quite close, absolutely it'll make a big difference to see 30ms go away
13:43:59 <06m​umra> I am theorising that there can be other benefits in terms of sending a different model across: we have to transmit map knowledge and monster info, instead of transmitting the final state of what tiles are displayed. So a lot of the tile logic will happen on the client instead. Whether this can translate to any latency improvement is certainly a question mark but there's potential there to be more efficient.
13:44:21 <04C​gettys> Yeah, there's potential in theory to run some logic on both sides
13:44:28 <04C​gettys> though it gets tricky cause we don't want to expose the seed to the client
13:44:50 <04C​gettys> But yes you can probably with some work allow moving one step while waiting for messages at a minimum
13:44:50 <06m​umra> But all this is ignoring the HUGE potential benefit for crawl development in general: UI development can focus on the single C++ codebase rather than having this awkward situation where there's a whole lot of stuff replicated in a fairly ancient js codebase that employs none of the modern js development standards and is frankly awful to work with
13:45:00 <04C​gettys> Yeah, I agree there are still major benefits
13:45:09 <04C​gettys> Just trying to temper expectations, that's all
13:51:14 <04C​gettys> I'm gonna do some more perf digging
13:51:23 <04C​gettys> I may not have looked high up enough in the call stack
13:51:33 <06m​umra> That's ok 🙂 I'm not under any illusions. But 40ms is high - I work with servers all day and see latency of <5ms to Europe
13:51:59 <06m​umra> So something can be optimised there, I have no doubt
13:52:13 <04C​gettys> Sure, if the server is close enough you can do far better
13:52:21 <04C​gettys> it's a question of physics at the end of the day
13:52:40 <04C​gettys> I mean literally, speed of light delay
13:52:55 <06m​umra> No, I mean the underhound server is in Europe, I'm sure a lower latency is possible
13:53:19 <04C​gettys> Sure, right, not saying it isn't
13:53:39 <04C​gettys> I'm just saying that some users may still have say, 50-70ms
13:53:45 <04C​gettys> even if the server is on the same continent
13:54:11 <04C​gettys> See e.g. https://learn.microsoft.com/en-us/azure/networking/azure-network-latency?tabs=Americas%2CWestUS (there are other charts but this is the one I knew where to find 😄 )
13:55:16 <04C​gettys> CAO is ~30ms typical for me
13:55:25 <04C​gettys> yes it'd make a huge difference if latency were about 30ms
13:55:31 <04C​gettys> (including rendering I mean)
13:55:49 <04C​gettys> I'm fully in agreement the emscripten frontend idea is great
13:56:45 <04C​gettys> I'm just being clear that it won't magically get rid of that latency, and that local or fully emscripten would be lower latency, and therefore that we should expect performance to be somewhat less good in that model
13:56:58 <04C​gettys> but better than the current, agreed
13:57:37 <06m​umra> I didn't ever claim that anything would get rid of the physical time it takes a message to travel to the server and back again 🙂
13:57:47 <04C​gettys> I never said you did
13:58:12 <04C​gettys> I just realized nobody had said it and that we probably should be talking about it 😄
13:58:22 <04C​gettys> You already went and got the number it was making me think we should go get
13:58:35 <04C​gettys> (i.e. how much time was rendering vs how much time was network latency)
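For a rough floor on those numbers (a back-of-envelope with assumed figures, not from the chat): light in optical fiber travels at about two thirds of c, roughly 200,000 km/s, so a 3,000 km path costs ~15 ms each way, ~30 ms round trip, and a ~7,000 km transatlantic route roughly 70 ms round trip, before any routing, queueing, or server processing is added.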
13:58:49 <04C​gettys> Agreed, even if it were no faster, one client would still be a huge win
13:59:10 <06m​umra> But we know there is a performance issue inside the server, we just don't know exactly how significant it is (but it absolutely is observed to get worse with more connections and when a server has been up for longer)
13:59:20 <04C​gettys> Hmmm
13:59:31 <04C​gettys> re: when a server has been up longer, that's interesting for sure
13:59:44 <04C​gettys> in the crawl code itself, I think the UX has some memory leaks
13:59:51 <04C​gettys> but that'd be a matter of crawl process lifetime
14:00:11 <04C​gettys> But that does somewhat correlate with server uptime, since a crawl process can't have been running longer than the server
14:01:47 <06m​umra> I think it was @asciiphilia doing some digging into this previously
14:02:01 <06m​umra> And it appeared to be related to the tornado process rather than game processes
14:02:52 <06m​umra> But yeah it does not surprise me if crawl itself is leaky
14:05:12 <04C​gettys> Ah, ok, I have found something interesting
14:05:18 <04C​gettys> This is very interesting
14:05:24 <04C​gettys> # TODO: if multiple spectators join at the same time, it's probably # possible for this heuristic to fail and send a full map to everyone
14:05:47 <04C​gettys> It does not appear to be limited to just if multiple spectators join
14:05:50 <04C​gettys> it's a straight-up race
14:08:25 <06m​umra> Yeah I actually posted that comment in the earlier discussion I believe
14:08:56 <04C​gettys> Yeah, you did
14:09:04 <04C​gettys> I can't entirely consistently reproduce it
14:09:10 <04C​gettys> but that's not a shocker 😄
14:09:30 <06m​umra> So it's actually sending a full map to everyone sometimes?
14:09:36 <04C​gettys> Sometimes
14:09:57 <04C​gettys> which multiplies any problems with significant CPU work etc
14:10:44 <04C​gettys> Ah, ok, maybe not, per se
14:10:56 <04C​gettys> You also would "lose" the race if you were starting the game
14:11:10 <04C​gettys> because in that case, you have to receive a full map
14:12:51 <04C​gettys> Ok, so this is another interesting point
14:13:00 <04C​gettys> all the receivers are in a set()
14:13:06 <04C​gettys> _receivers in process_handler
14:13:25 <04C​gettys> and... Python sets don't preserve insertion order
14:13:37 <04C​gettys> Meaning, we don't prioritize sending messages to players over spectators
14:13:38 <04C​gettys> That seems bad
14:16:03 <06m​umra> Yeah the player should be the first receiver certainly (but we shouldn't have to have receivers wait in a queue, the message should be prepared once and then broadcast to all receivers concurrently of course ... the player should still be first, but it shouldn't matter that much if concurrency is working properly)
14:16:14 <04C​gettys> Sure, sure
14:16:36 <04C​gettys> But if you assume any overhead at all for actually sending
14:16:39 <04C​gettys> even if it's .1ms or .01ms
14:16:55 <04C​gettys> You want the player to go first
14:18:08 <06m​umra> Sure, it stands to reason, there is only one network interface, the player's packets should be first. But currently we have a whole load of extra CPU delay between receivers
14:18:27 <04C​gettys> Sure, the question is just, how much is "extra"
14:18:31 <04C​gettys> vs how much can we not get rid of
14:20:30 <06m​umra> Literally the extra should be as near to negligible as makes no difference. Just whatever overhead the websocket library has to send a message, which should be extremely minimal. All the waiting time in our code should be removable, because every client should be getting the same data normally
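A minimal sketch of the ordering fix being floated here, with invented names (the real _receivers lives in webtiles' process handler): since Python 3.7 a plain dict preserves insertion order, so it can act as an ordered set that guarantees the player is serviced before spectators.

```python
class GameProcess:
    """Toy stand-in for a process handler; illustrative only."""

    def __init__(self):
        # A dict used as an ordered set (values unused). Unlike set(),
        # iteration follows insertion order, so the player can be kept
        # at the front deterministically.
        self._receivers = {}

    def add_receiver(self, conn, is_player=False):
        if is_player:
            # Rebuild with the player first; spectators keep their order.
            self._receivers = {conn: None, **self._receivers}
        else:
            self._receivers[conn] = None

    def remove_receiver(self, conn):
        self._receivers.pop(conn, None)

    def broadcast(self, message):
        # The player (first key) gets the message enqueued before any
        # spectator, so whatever per-send overhead exists delays
        # spectators rather than the player.
        for conn in self._receivers:
            conn.write_message(message)
```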
14:20:51 <04C​gettys> Well that's what I'm telling you, a fair bit of time is spent just inside tornado
14:20:56 <06m​umra> The only exception is a new player joining and they get a complete map, but that's a totally separate update
14:21:35 <06m​umra> Hmm, wonder what it's doing
14:22:14 <06m​umra> (Probably copying memory around. This is why I'd write it with nodejs streams instead, since I know how to completely eliminate copying)
14:22:41 <04C​gettys> Copying memory around at this network bandwidth is probably not the bottleneck if done well
14:22:45 <04C​gettys> the issue is the "if done well"
14:23:14 <06m​umra> Yep, and we're dealing with fairly antique lib versions of most parts of the stack 🙂
14:24:02 <04C​gettys> Yuuup
14:24:46 <04C​gettys> keeping in mind adding a print statement is enough to skew the results... as that appears to be ~30000ns alone... meaning you can't really just scattershot this well 😄
14:25:09 <04C​gettys> but e.g. write_message takes ~40k ns, 100k ns, that sort of range
14:33:35 <04C​gettys> But then again, CrawlWebSocket init alone takes 57000ns
14:46:20 <06m​umra> If I ping underhound.eu, my average round trip is 23ms
14:47:11 <06m​umra> If I'm playing webtiles on the same server, watching the message times - it's regularly in excess of 100ms, just for normal walking around, not even full map updates
14:47:26 <04C​gettys> So we should be able to make it 4x faster-ish
14:48:11 <06m​umra> It's quite busy right now, there are like 23 active games (hardly any spectators tho)
14:48:36 <06m​umra> This is noticeably worse than when I was investigating yesterday
14:48:55 <04C​gettys> I'm wondering if we're actually wasting the time in crawl, I'm gonna take a look
14:49:43 <06m​umra> I can't believe that crawl is spending 100ms thinking about a move, when it doesn't even have to render anything
14:50:02 I can, sadly
14:50:03 <04C​gettys> Was wondering for the spectator case, to be clear
14:50:08 happens on local too
14:51:24 also, highly noticeable on local with clang + LTO: the last action in a sequence takes a ridiculously long time
14:51:45 <04C​gettys> Got a repro for that last bit?
14:51:51 <04C​gettys> e.g. what do you mean by a sequence?
14:52:23 actions by monsters after the player acts, before the next time the player can act
14:53:05 the actions themselves all seem to take around the same amount of time, but before it outputs the result of the last one (with the '_' in front of it) there's a long pause
14:53:24 <04C​gettys> interesting...
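One way to collect numbers like these without the ~30000ns print() skew: record time.perf_counter_ns() samples in memory and dump them once at the end. A generic sketch, not the instrumentation actually used:

```python
import time
from collections import defaultdict

_samples = defaultdict(list)


class timed:
    """Context manager that records elapsed ns without printing inline."""

    def __init__(self, label):
        self.label = label

    def __enter__(self):
        self.start = time.perf_counter_ns()

    def __exit__(self, *exc):
        _samples[self.label].append(time.perf_counter_ns() - self.start)


def dump_samples():
    # Print once, after the fact, so I/O cost can't pollute the numbers.
    for label, ns in sorted(_samples.items()):
        ns.sort()
        print(f"{label}: n={len(ns)} median={ns[len(ns) // 2]}ns max={ns[-1]}ns")


# Usage (handler/payload are placeholders):
#   with timed("write_message"):
#       handler.write_message(payload)
#   ...later, at shutdown or on a timer:
#   dump_samples()
```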
14:53:24 I haven't profiled to try to figure out what's going on there
14:53:32 <04C​gettys> I hadn't noticed
14:54:11 I didn't either when built with gcc, so conceivably it's a clang thing, but it's pretty weird
14:54:14 <06m​umra> Local tiles is super responsive tho compared to webtiles, we can't be talking about the same kind of numbers
14:54:36 <04C​gettys> To be clear, what I was wondering about is how much of the time is being spent in _send_everything
14:54:49 (actually the complete sequence seems to take the same amount of time, it just seems to be spread over the individual actions with gcc)
14:54:58 also this is console, not any form of tiles
14:55:11 <04C​gettys> because that will interrupt other processing
14:56:27 <06m​umra> Yeah I don't see how _send_everything can be in any way slow, unless there is a significant wait in actually posting the final message over the socket
14:56:45 <06m​umra> If there is time spent in crawl, it's on the actual world update, monster pathfinding etc
14:57:05 hm, actually clang is faster overall, it's just that weird final delay that doesn't show with a gcc-built crawl (LTO or not)
14:57:28 <04C​gettys> Sure, but that wouldn't explain the lag spike on spectator join
14:58:06 <06m​umra> I'm not even looking at spectator join here. I'm looking at me playing alone and seeing 100ms+ spent somewhere
14:58:14 <04C​gettys> RE: weird final delay geekosaur, will see if I can repro, but if you can record a profile I'd be happy to stare at it
14:59:04 <04C​gettys> Ok, yeah, this is interesting
14:59:10 <04C​gettys> spectator_joined takes 3-4ms in crawl
14:59:32 <04C​gettys> and that's with a brand new game
14:59:47 <04C​gettys> well, not quite, but with empty inventory and no enemies on screen
15:00:08 <06m​umra> The fact that the delay I'm seeing is so variable as well leads me to the same conclusion, that this is to do with main thread blocking with many players on the server. Nothing in the crawl process, it's the CPU spent in tornado / string copying / compression
15:00:43 <04C​gettys> I'm also suspicious that it's main thread blocking related, but I think we have more than 1 problem here 😄
15:01:14 <06m​umra> Oh definitely, well it's always the case that there is more than 1 problem in my experience 😂
15:02:08 <04C​gettys> I think the first order of business is we need a load testing setup
15:02:11 <06m​umra> I do also wonder what is happening when one crawl process is doing something heavy like levelgen, is that going to block other crawl processes that happen to be running on the same CPU?
15:03:01 <04C​gettys> It'll get preempted periodically
15:03:09 <04C​gettys> depending on kernel scheduling choices
15:05:13 <04C​gettys> which is yet another good question
15:05:32 <04C​gettys> presumably we should be giving the crawl processes higher nice values (i.e. lower priority) than the server
15:11:34 <06m​umra> well hopefully the server is running on a dedicated CPU
15:12:15 <04C​gettys> You'd hope that
15:13:17 <04C​gettys> But I don't think that's necessarily being done
15:13:32 <04C​gettys> Or if it is, it may be on some servers but not others
15:13:46 <04C​gettys> don't see anything about e.g. taskset in the crawl repo
15:20:18 03Cgettys02 07https://github.com/crawl/crawl/pull/4396 * 0.33-a0-1123-gfb8aa86fa4: Timings 10(46 minutes ago, 2 files, 20+ 5-) 13https://github.com/crawl/crawl/commit/fb8aa86fa49c
15:20:18 03Cgettys02 07https://github.com/crawl/crawl/pull/4396 * 0.33-a0-1124-g4e81f8302c: More timing info 10(2 minutes ago, 1 file, 8+ 5-) 13https://github.com/crawl/crawl/commit/4e81f8302c54
15:20:18 03Cgettys02 07https://github.com/crawl/crawl/pull/4396 * 0.33-a0-1125-g1e704dc1b5: Nice the crawl processes? 10(48 seconds ago, 1 file, 3+ 0-) 13https://github.com/crawl/crawl/commit/1e704dc1b5ae
15:46:50 New branch created: pull/4398 (1 commit) 13https://github.com/crawl/crawl/pull/4398
15:46:50 03Cgettys02 07https://github.com/crawl/crawl/pull/4398 * 0.33-a0-1120-g5f6cd9e263: Webtiles Perf Improvements, part 1 10(79 seconds ago, 2 files, 40+ 34-) 13https://github.com/crawl/crawl/commit/5f6cd9e26368
15:46:58 <04C​gettys> I split out the pieces I felt confident enough in
15:48:48 <04C​gettys> Will likely need admins to take a look before and after and see if it improves things / test it
16:39:45 Unstable branch on underhound.eu updated to: 0.33-a0-1120-gbd99899e7f (34)
17:00:07 New branch created: pull/4399 (1 commit) 13https://github.com/crawl/crawl/pull/4399
17:00:07 03Cgettys02 07https://github.com/crawl/crawl/pull/4399 * 0.33-a0-1120-g4aa8bb76f6: bugfix: AUTOFIGHT while immotile should not wait a turn (CrawlOdds) 10(3 minutes ago, 1 file, 13+ 3-) 13https://github.com/crawl/crawl/commit/4aa8bb76f62b
17:11:51 03Cgettys02 07https://github.com/crawl/crawl/pull/4399 * 0.33-a0-1120-gca452c13f2: bugfix: AUTOFIGHT while immotile should not wait a turn (CrawlOdds) 10(15 minutes ago, 1 file, 13+ 3-) 13https://github.com/crawl/crawl/commit/ca452c13f249
17:12:55 03Cgettys02 07https://github.com/crawl/crawl/pull/4399 * 0.33-a0-1120-g87188b8adb: bugfix: AUTOFIGHT while immotile should not wait a turn (CrawlOdds) 10(16 minutes ago, 1 file, 5+ 3-) 13https://github.com/crawl/crawl/commit/87188b8adb81
17:22:44 <04C​gettys> @gammafunk - not urgent, but can I pick your brain about the lag spikes etc discussion above sometime? I think https://github.com/crawl/crawl/pull/4398 will make things better, and most of the changes should be quite safe, but I can't tell whether the webserver is struggling to get time or not, and I'm not sure whether nicing the crawl processes will make things better or worse as a result (the ideal might be to make the webserver less nice in addition to the above, for example - but without that, crawl processes may be lower priority than other random processes running on the server, which would be net-negative in isolation)
17:32:40 <09g​ammafunk> Well, if you want me to comment on a technical level about scheduling, I'm afraid I can't. I'm not a systems guy. I do agree with the general idea that one needs to actually profile and test things in some kind of reasonably controlled way before rolling out changes. In this case, we'd potentially be pushing out tweaks to all servers without having tested it on any server (either a test server or a real server) ahead of time. That's not ideal, certainly when you're not willing to claim confidence in the improvements based on having a lot of experience. But I don't have easy answers as to how to test things well in a controlled way, because again I am not a systems programmer.
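For the nicing question, a sketch of the mechanism only (POSIX-specific, and assuming the server spawned games via subprocess, which simplifies what webtiles actually does): preexec_fn runs in the child between fork and exec, so os.nice() raises only the game process's niceness, lowering its priority relative to the tornado process.

```python
import os
import subprocess


def spawn_crawl(argv, niceness=5):
    """Launch a game process at lower CPU priority than the webserver.

    niceness=5 is an arbitrary example value; whether deprioritizing
    games helps or hurts depends on what else shares the host, which
    is exactly the open question above.
    """
    return subprocess.Popen(argv, preexec_fn=lambda: os.nice(niceness))


# e.g. spawn_crawl(["./crawl", "-name", "tester"])  # illustrative args
```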
17:32:54 <09g​ammafunk> The server environment in terms of other processes isn't terribly fixed across servers
17:33:56 <09g​ammafunk> some servers like CAO have other things running on them, and obviously they can be very divergent in terms of OS from server to server. CDI and CBR2 are pretty recent distributions (not sure if CBR2 is also ubuntu) but a server like CUE might not be so current
17:35:21 <09g​ammafunk> it might be possible to put together a test setup with dgamelaunch and a bunch of concurrent instances of qw
17:35:58 <09g​ammafunk> like I have a special script I use to run multiple qw in parallel (and track their stats in a local copy of sequell)
17:36:41 <09g​ammafunk> and if you use the internal qw delay, you can get some approximation of a human player in terms of average number of in-game actions
17:37:42 <09g​ammafunk> obv qw is going to lean on clua way more than a human player would, and you're not going to have human-like pauses in between actions. And qw does get stuck sometimes still and occasionally even crashes (although rarely, and I prefer to receive reports about that).
17:38:06 <09g​ammafunk> but yeah it's probably good to have some kind of test setup where you can at least check before and after
17:55:33 <04C​gettys> I am a systems programmer, but even I don't feel confident in speaking to scheduler details - especially not in the large, across a large number of servers running potentially different scheduling algorithms, vastly different software versions, vastly different hardware, etc 😄
17:56:22 <04C​gettys> I can add more options to the config files and then ask/beg/plead for admins to try it and see, I guess 😄
18:02:39 <04C​gettys> Basically, the problem is, I know how I'd attack this as a systems programmer in a non-hobby setting
18:03:09 <04C​gettys> Continuous collection of CPU/memory/network stats at the machine level
18:03:22 and when I had access to the servers (sysadmin/system programmer background)
18:04:04 <04C​gettys> Background / periodic profiling of CPU usage, possibly also high-CPU-triggered profiling (though this can be a catch-22 - it adds overhead, so now you're making the problem worse)
18:04:05 <04C​gettys> etc
18:04:19 <04C​gettys> But all of this requires access, and much of it requires infrastructure, too 😄
18:04:47 <09g​ammafunk> well I don't think there's any fundamental reason why you couldn't make a container with a dgamelaunch-config webtiles server and whatever scripts you needed to test concurrent crawl processes (like if you wanted to go the qw route)
18:05:06 <09g​ammafunk> From that point, you could test on something like amazon ec2 or a DigitalOcean droplet
18:05:11 <04C​gettys> Oh, sure, that I can do and will get around to sometime soonish 😄
18:05:34 <09g​ammafunk> if you were concerned that testing on one of your personal machines wasn't realistic
18:05:34 <04C​gettys> Also my local machine is powerful enough that I don't need to rent HW to be comparable to what some of the servers are (or more)
18:06:11 <04C​gettys> Testing is never realistic enough. It's sometimes good enough
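For the load-testing setup, a rough driver sketch using tornado's own client API (tornado.websocket.websocket_connect): open N connections, fire messages, and report round-trip percentiles. The URL and payload are placeholders; a real harness would have to speak the webtiles login protocol first.

```python
import asyncio
import time

import tornado.websocket

URL = "ws://localhost:8080/socket"  # placeholder address


async def one_client(rtts, n_messages=100):
    conn = await tornado.websocket.websocket_connect(URL)
    for _ in range(n_messages):
        start = time.perf_counter()
        conn.write_message('{"msg": "ping"}')  # placeholder payload
        await conn.read_message()              # wait for any reply
        rtts.append(time.perf_counter() - start)
    conn.close()


async def main(n_clients=50):
    rtts = []
    await asyncio.gather(*(one_client(rtts) for _ in range(n_clients)))
    rtts.sort()
    print(f"n={len(rtts)} p50={rtts[len(rtts) // 2] * 1e3:.1f}ms "
          f"p99={rtts[int(len(rtts) * 0.99)] * 1e3:.1f}ms")


if __name__ == "__main__":
    asyncio.run(main())
```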
18:06:18 <04C​gettys> But no amount of testing will tell me for sure
18:06:44 <09g​ammafunk> Well, for reference, all CDI is is the $48 tier droplet
18:06:45 <09g​ammafunk> https://cdn.discordapp.com/attachments/747522859361894521/1353535470449197066/image.png?ex=67e201a3&is=67e0b023&hm=21bfeba815bbd9800773263fd77c76a28deee90a08b837b0c44056737a55938f&
18:06:45 <04C​gettys> That's the bit that I wanted to talk about, how to roll out these things once confident enough, and not cause you and other admin folks pain
18:07:04 <09g​ammafunk> if it gives improvement on such a 4 "vcpu" system then it's going to give significant improvement to at least a few servers
18:07:09 <04C​gettys> Ah, that is a useful point of realism, yeah
18:07:15 <09g​ammafunk> obviously vcpus are their own weird thing
18:07:42 <09g​ammafunk> that is probably very difficult to recreate with a local test machine to some extent, but again absolutely not my expertise
18:08:57 <09g​ammafunk> well, rolling it out is in fact trivial! we have the dgamelaunch scripts do their trunk update, which updates all portions of the webtiles server that are not local configuration (so config.py and the server template directory)
18:09:10 <09g​ammafunk> so in a sense, rolling it out is just pushing to the repository
18:09:20 <09g​ammafunk> unless configuration changes are required (i.e. config.py)
18:10:06 <04C​gettys> Right, that's the catch-22
18:10:14 <04C​gettys> the normal way I'd de-risk this is to make it configurable 😄
18:10:18 <09g​ammafunk> although it's also fair to point out that a webtiles restart would be required
18:10:30 <04C​gettys> but then either someone has to turn it off if I make it on by default and things go wrong
18:10:39 <04C​gettys> or someone has to turn it on if I make it off by default, or it's useless 😄
18:10:52 <09g​ammafunk> I don't know what you mean by "go wrong" specifically. Local tests will tell you if the code functions at all
18:10:55 <09g​ammafunk> and yeah those should be done first
18:11:01 <04C​gettys> Well I did that 😄
18:11:14 <04C​gettys> What I mean is, what if it makes it eeeeeeeven laggier
18:11:37 <04C​gettys> though I guess it will only take effect when someone restarts the webserver
18:11:39 <09g​ammafunk> what if someone makes a change to the crawl game that makes things laggier? pretty much the same deal
18:12:20 <09g​ammafunk> and yeah, webtiles restart is actually required if changes are at the python level
18:12:26 <04C​gettys> Eh, at the end of the day it's basically just a virtual machine. Now, the exact configuration options, what version of what hypervisor, what underlying hardware, what host OS, et cetera... yeah, not public knowledge
18:12:45 <04C​gettys> But you can definitely get close
18:15:19 <04C​gettys> True...
18:16:11 <04C​gettys> Let me restate that: it's literally just a virtual machine
18:18:01 <04C​gettys> DigitalOcean, at least based on some web sleuthing, is using KVM as the hypervisor
18:19:49 <04C​gettys> AWS moved to KVM too in like 2018, probably, mostly
18:20:29 <04C​gettys> Microsoft Azure uses Hyper-V (documented publicly here: https://learn.microsoft.com/en-us/azure/security/fundamentals/hypervisor)
18:20:49 <04C​gettys> Yes there is a lot of secret sauce and magic involved in the cloud. But generally speaking, it's VMs at the end of the day 😄
18:25:02 <04C​gettys> Any chance I can borrow that script sometime?
18:28:45 <09g​ammafunk> it's very simple, so more of an inspiration for your own script: https://github.com/crawl/qw/blob/master/util/batch-qw.sh is the master script and https://github.com/crawl/qw/blob/master/util/run-qw.sh is the launcher. You'd want a checkout of that repo and to run the little make script to assemble a final qw.lua and/or rc file. See the project readme for details.
18:29:10 <04C​gettys> Ah, I looked thru your repos, should have looked thru the crawl repos 😄
18:29:25 <09g​ammafunk> also it runs qw games in tmux, so I can look through all the running games and potentially interact with them, which maybe you don't want
18:29:39 <09g​ammafunk> but like I said, it might be useful as a starting point
18:29:42 <04C​gettys> Basically, my test setup should be pretty simple: 1) A docker container or VM with some set amount of memory and CPU (probably 8GB/4 logical processors to match the DigitalOcean spec, for example) 2) Outside the VM, a bunch of something, presumably qw instances, loading up the server until it screams a bit. And collecting latency and throughput stats
18:29:49 <04C​gettys> Very much so, thanks 🙂
18:30:31 <09g​ammafunk> yeah, you can really throttle your server by running qw with no delay (configured via your RC as a lua variable)
18:31:01 <09g​ammafunk> also you do need the correct type of build so you can pass an arg to disable the lua throttle
18:31:10 <09g​ammafunk> and set max memory usage
18:31:25 <09g​ammafunk> (side note: I really need both a cpu and memory profiler I can use in clua for qw development)
18:31:38 <04C​gettys> I really need a profiler that understands everything 😄
18:31:50 <04C​gettys> e.g. what one really wants is one profiler that understands CPython, lua, C++
18:32:05 <04C​gettys> yes a C/C++ profiler should get you some useful data
18:32:21 <04C​gettys> but for an interpreter, where the interpreter is spending its time is not always all that interesting
18:32:47 <04C​gettys> you want to know which functions' interpretation is taking the time, and the stack traces of the interpreter don't help you much with that 😄
18:34:15 <09g​ammafunk> yeah, I want summaries of what variables are taking up how much memory in lua, since I frequently run into memory limits (note the qw launch setting max clua memory of 128MB), probably due to some leaks, at least one of which I know about already
18:34:40 <09g​ammafunk> but what lua is using up the most cpu in qw is something I really need as well
18:35:02 <09g​ammafunk> it's grown in complexity from elliptic's ~7k lines of lua to my ~18k lines of lua
18:35:11 <04C​gettys> https://github.com/pmusa/luamemprofiler for mem maybe? never used it, old, google search
18:35:26 <09g​ammafunk> ah, requires lua 5.2
18:35:32 <09g​ammafunk> which reminds me of another lua project...
18:35:57 <04C​gettys> https://stackoverflow.com/questions/15725744/easy-lua-profiling hm, fun
18:36:15 <04C​gettys> http://lua-users.org/wiki/ProfilingLuaCode also "fun"
18:37:20 <04C​gettys> This looks somewhat promising, callgrind/kcachegrind/valgrind is great stuff: https://jan.kneschke.de/projects/misc/profiling-lua-with-kcachegrind/
18:38:33 <09g​ammafunk> yeah, when I have some time after finishing my next qw combo wins, I'm going to tackle some profiling issues for sure
18:39:10 <04C​gettys> https://github.com/LewisJEllis/awesome-lua?tab=readme-ov-file#debugging-and-profiling mostly the same list but some other mentions
18:42:04 <04C​gettys> BTW, I think I worked out why the autofight behavior was what it was
18:42:11 <04C​gettys> commit from Elliptic
18:42:20 <04C​gettys> No mention of qw in the message
18:42:30 <04C​gettys> but I suspect it was a fix for qw getting stuck if -Move
18:45:59 <09g​ammafunk> I'm not sure if -Move even existed until recently
18:46:25 <04C​gettys> Was like a year ago the commit was added, I linked it in the PR, one sec
18:46:34 <04C​gettys> Also: any chance we can merge https://github.com/crawl/crawl/pull/3757/files?
18:46:53 <04C​gettys> https://github.com/crawl/crawl/commit/52037ea7a925a6ae6b1b67939267cd8f1f4837a4
18:47:29 <09g​ammafunk> advil has a docker for testing dgamelaunch setup already (in that repo)
18:47:32 <04C​gettys> Asking cause that way, I could build a script on top of this to make what I'm doing reproducible
18:47:36 <04C​gettys> Ah, that'd also work 😄
18:48:00 <09g​ammafunk> I'm not sure how necessary it is for you to use the dgamelaunch-config scripts for testing per se
18:48:12 <09g​ammafunk> but if you want a setup that's close to a live server, those all use dgamelaunch-config
18:48:28 <09g​ammafunk> although CNC uses a separate docker made by asciiphillia
18:48:33 <09g​ammafunk> you might prefer his
18:48:39 <09g​ammafunk> he has it in his own repo; it doesn't have a chroot
18:48:39 <04C​gettys> https://github.com/crawl/dgamelaunch-config/blob/d41e90831d96d0008eb41f04cc5824cb8ab6fa75/utils/testing-container/Dockerfile#L5 This one?
18:49:12 <09g​ammafunk> yes
18:49:56 <04C​gettys> So we have at least 3 floating around 😄
18:50:32 <09g​ammafunk> mumra's is more for a local webtiles dev scenario, advil's is to facilitate testing of changes to dgamelaunch-config webtiles
18:50:54 <04C​gettys> And asciiphillia's is used in production, so to speak 😄
18:51:24 <04C​gettys> If I understand right 😄
18:51:56 <09g​ammafunk> Here is ascii's: https://github.com/refracta/dcss-server
18:52:13 <09g​ammafunk> it also has a bunch of forks so you probably want to tweak it to not install those
18:54:46 <04C​gettys> Boy oh boy, so much fun stuff 😄
18:55:03 <04C​gettys> Pity we have so many separate versions tho
18:55:17 <04C​gettys> I know, some of it is necessary customization, but presumably some of it is reusable improvements
18:58:06 <04C​gettys> So GitHub's runners are 4 vcores, 16 GB
18:58:16 <04C​gettys> bit more ram than e.g. a DigitalOcean droplet
18:58:28 <04C​gettys> Maybe I'll see if I can get this to run in CI 😄
19:07:15 <04C​gettys> I may have to try ascii's
19:07:33 <04C​gettys> the "official" one has been running for almost 10 minutes now 😄
19:07:45 <04C​gettys> copying and updating stable and trunk
19:17:13 New branch created: pull/4400 (1 commit) 13https://github.com/crawl/crawl/pull/4400
19:17:14 03Mike Hegarty02 07https://github.com/crawl/crawl/pull/4400 * 0.33-a0-1121-g544f73fc97: Fixed description of Zot trap to be more accurate 10(4 minutes ago, 1 file, 4+ 4-) 13https://github.com/crawl/crawl/commit/544f73fc970a
19:35:02 <09h​ellmonk> hmm, how do coglin weapon slots work these days
19:35:23 <09h​ellmonk> do I need to check for the possibility of no weapon in mainhand but weapon in offhand, or did the slot rework obsolete that
19:53:21 03hellmonk02 07[theyrebombs] * 0.33-a0-1091-gc4b0b63fa1: adjustments and fixes 10(4 minutes ago, 2 files, 24+ 9-) 13https://github.com/crawl/crawl/commit/c4b0b63fa11a
19:53:21 03hellmonk02 07[theyrebombs] * 0.33-a0-1092-gc8846ca573: add missing random 10(30 seconds ago, 1 file, 1+ 1-) 13https://github.com/crawl/crawl/commit/c8846ca57390
19:53:21 Branch pull/4388 updated to be equal with theyrebombs: 13https://github.com/crawl/crawl/pull/4388
19:53:29 <09h​ellmonk> would like some playtest feedback before jamming this in trunk if anyone has a few minutes
20:10:55 <04d​racoomega> you.offhand_weapon() currently returns the second weapon they have on. So it can't return a weapon if you.weapon() returns null
20:11:05 <09h​ellmonk> ok, cool
20:11:36 <04d​racoomega> (I can take a look at the PR tomorrow, if you'd like, but I'm shortly heading to sleep myself)
20:11:42 <09h​ellmonk> good night
20:18:54 03WizardIke02 {GitHub} 07* 0.33-a0-1121-ge3852c236f: Allow manifold assault with the Sword of Power (#4393) 10(47 seconds ago, 8 files, 47+ 71-) 13https://github.com/crawl/crawl/commit/e3852c236f0e
21:09:02 03hellmonk02 07* 0.33-a0-1122-gd0d60a0ba6: Update background descs (NotAJumbleOfNumbers) 10(7 minutes ago, 1 file, 15+ 14-) 13https://github.com/crawl/crawl/commit/d0d60a0ba61a
21:13:19 03hellmonk02 07* 0.33-a0-1123-g8bf82eaf03: update credits 10(52 seconds ago, 1 file, 1+ 0-) 13https://github.com/crawl/crawl/commit/8bf82eaf034f
21:20:45 03dolorous02 07* 0.33-a0-1124-gc4b2e2bd67: Fix spelling. 10(89 seconds ago, 1 file, 1+ 1-) 13https://github.com/crawl/crawl/commit/c4b2e2bd6762
21:22:53 03WizardIke02 07https://github.com/crawl/crawl/pull/4383 * 0.33-a0-1088-gf5de2ad6de: Call some tiles rendering code less often 10(6 days ago, 7 files, 309+ 123-) 13https://github.com/crawl/crawl/commit/f5de2ad6dee2
21:24:15 <09h​ellmonk> foiled by the Australians once again...
21:37:12 <09h​ellmonk> really hope this doesn't break anything
21:37:45 03WizardIke02 {GitHub} 07* 0.33-a0-1125-g9c36f3742a: Make fair_adjacent_iterator fair (#3914) 10(57 seconds ago, 6 files, 100+ 15-) 13https://github.com/crawl/crawl/commit/9c36f3742aaf
21:40:15 <04C​gettys> Ooh, oooh, pick me 😛 https://github.com/crawl/crawl/pull/4312
21:41:29 <09h​ellmonk> ok.
21:42:02 03Charlie Gettys02 {GitHub} 07* 0.33-a0-1126-g9300408954: ci: fix checkconventionalcommit.py (#4312) 10(26 seconds ago, 2 files, 6+ 4-) 13https://github.com/crawl/crawl/commit/9300408954f7
21:43:07 <04C​gettys> Will try to give this a shot in a few minutes
21:43:54 <04C​gettys> I don't know how informed my opinion will be, but nonetheless
22:48:42 <04C​gettys> Some interesting findings... refreshing the page may load ~17MB? looks like tilesets may not be cached nicely?
22:51:37 <04C​gettys> Might only be firefox that isn't caching nicely
22:52:05 <04C​gettys> W/O cache on edge, it's ~5.3MB transferred
23:11:18 <06m​umra> Cache seems fine for me (in chrome)
23:11:47 <06m​umra> Check you have not at some point checked "disable cache" in devtools (have made this mistake a number of times!)
23:21:51 <04C​gettys> I had
23:21:59 <04C​gettys> had checked I mean
23:22:12 <04C​gettys> Anyway, even so, that would not necessarily be the problem
23:22:20 <04C​gettys> I can't repro the problem so far
23:22:42 <04C​gettys> Chrome and Firefox both support simulating request latencies, but the same doesn't seem to be true for websocket traffic
23:22:49 <04C​gettys> which is annoying
23:23:00 <04C​gettys> and so far I cannot reproduce the issue on my test setup
23:23:10 <04C​gettys> not even if I load up the heck out of the other cores
23:23:42 <04C​gettys> So network latency and some blocking somewhere seem fairly likely to be involved
23:26:51 Unstable branch on cbro.berotato.org updated to: 0.33-a0-1126-g9300408954 (34)
23:30:15 <04C​gettys> @mumra - I'm beginning to think we may be compressing twice, at least on up-to-date servers... it may be that the standardized websocket compression you spoke about is getting auto-negotiated between tornado and modern browsers
23:30:24 <04C​gettys> and if so we should definitely drop the compression 😄
23:35:31 Unstable branch on crawl.develz.org updated to: 0.33-a0-1126-g9300408954 (34)
23:38:22 <04C​gettys> I'm probably wrong... let me dig more
23:58:39 Windows builds of master branch on crawl.develz.org updated to: 0.33-a0-1126-g9300408954
23:59:12 <04C​gettys> Ok, not quite as bad as I thought, but bad
23:59:27 <04C​gettys> tornado can handle the compression exactly like we can, but at the protocol level
23:59:34 <04C​gettys> which will save some time in JS on the other side
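If the protocol-level route wins out, the mechanism is tornado's built-in permessage-deflate support: overriding get_compression_options() on the handler (a documented tornado API; the compression_level and mem_level keys are accepted since tornado 4.5) enables RFC 7692 negotiation with the browser, which then receives already-decompressed frames for free on the JS side. A sketch, with the class name borrowed from the chat rather than the actual webtiles source:

```python
import tornado.websocket


class CrawlWebSocket(tornado.websocket.WebSocketHandler):
    def get_compression_options(self):
        # Returning a non-None dict enables permessage-deflate; the
        # browser negotiates it automatically. Level 1 matches the
        # "turn the compression level down" suggestion above.
        return {"compression_level": 1, "mem_level": 5}
```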