00:31:13 Fork (bcrawl) on crawl.kelbi.org updated to: 0.23-a0-4778-g1c48639af6 03:22:49 Experimental (bcrawl) branch on underhound.eu updated to: 0.23-a0-4778-g1c48639af6 05:06:42 Unstable branch on crawl.akrasiac.org updated to: 0.30-a0-50-gb2d611f (34) 05:46:58 <06a​dvil> lol, odds of me getting tourney code to run on an arm mac as-is, very very low 08:00:22 <06a​dvil> hm, not sure if my sql is entirely up to optimizing this code 08:00:42 <06a​dvil> (switching to this channel) 08:01:07 <06a​dvil> the bad query is essentially this (dropping some json stuff): SELECT g.player, g.score FROM games AS g LEFT OUTER JOIN games AS g2 ON g.player = g2.player AND g.score < g2.score WHERE g2.score IS NULL AND g.score > 0; 08:01:20 <06a​dvil> seems like it is probably comparing every game with every other game or something similarly silly 08:02:24 <06a​dvil> now my simpleminded understanding has produced the following much simpler, easy to read, and (crucially) faster query: select g.player, max(g.score) from games as g group by g.player; 08:03:55 <06a​dvil> or select g.player, max(g.score) from games as g where g.score > 0 group by g.player; with the positive constraint (not entirely sure what the NULL thing is doing in the original) 08:04:13 <06a​dvil> (presumably something to do with it being a left outer join) 08:04:47 NULL is also difficult as it can't be indexed 08:05:13 <06a​dvil> explain does claim that g2 is using an index, but it might be the player 08:05:34 <06a​dvil> I don't think score actually can be null? 08:05:41 right, I'd expect the join on player to use an index 08:06:13 <06a​dvil> I have the current full tourney data locally and those two queries produce the same result afaict 08:06:41 <06a​dvil> hm two rows different 08:21:56 <06a​dvil> @gammafunk do you know if the clan pages are slow also, or is it just player pages? 
(I haven't pulled clan data) 08:22:10 <06a​dvil> there's an even more elaborate query there 08:26:08 <06a​dvil> hm, the extras vs the simpler group by query are just that the left outer join for some reason lets in a couple of spare duplicates of one player 08:27:38 <06a​dvil> oh, it's just three games with the same score 08:27:51 <06a​dvil> which should be doable in the group by version 08:28:30 <06a​dvil> anyways, if anyone has a clue as to what is going on here or a reason why I shouldn't change the query, lmk 08:42:31 <06a​dvil> well, I can definitely find things on the internet advocating for the left outer join approach and saying it is "performance friendly" 08:43:13 <06a​dvil> but it just really isn't empirically in this case 08:43:30 <06a​dvil> maybe if there were some index there currently isn't 08:54:39 marcb3 (L14 MiFi) ASSERT(you.running.turns_passed < buggy_threshold) in 'delay.cc' at line 473 failed. (Excessive delay, 700 turns passed, delay type 9999299) (D:12) 08:54:43 <09g​ammafunk> yeah, there are some indexes defined in one of the two sql files 08:55:20 <06a​dvil> I think it might need an index for score, which is maybe a bad idea? 08:55:46 <06a​dvil> anyways, I've rewritten the query to something that seems to be identical but is much faster 08:56:49 <09g​ammafunk> excellent, you can certainly pull the changes yourself on cdo and restart yourself if you like (i.e. kill and restart taildb.py after the pull), or I can do that after you've pushed 08:56:56 <06a​dvil> do you know what the answer is re clan page speed? 
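The two-row discrepancy advil found can be reproduced in miniature. Below is a sketch using Python's sqlite3 (purely illustrative; the real `games` table lives in MySQL and has many more columns): the anti-join formulation keeps every game that no other game by the same player beats, so a tied top score yields one row per tied game, while the GROUP BY formulation collapses ties to a single row per player.

```python
import sqlite3

# Illustrative stand-in schema -- the real tournament `games` table is in
# MySQL and has many more columns.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE games (player TEXT, score INTEGER)")
con.executemany("INSERT INTO games VALUES (?, ?)",
                [("a", 10), ("a", 30), ("b", 5), ("b", 5)])

# Anti-join formulation: keep rows for which no higher-scoring row by the
# same player exists (g2 comes back all-NULL from the outer join).
anti = con.execute("""
    SELECT g.player, g.score FROM games AS g
    LEFT OUTER JOIN games AS g2
      ON g.player = g2.player AND g.score < g2.score
    WHERE g2.score IS NULL AND g.score > 0
""").fetchall()

# GROUP BY formulation: exactly one row per player, ties or not.
grouped = con.execute("""
    SELECT g.player, MAX(g.score) FROM games AS g
    WHERE g.score > 0 GROUP BY g.player
""").fetchall()

print(sorted(anti))     # ties duplicate rows: [('a', 30), ('b', 5), ('b', 5)]
print(sorted(grouped))  # [('a', 30), ('b', 5)]
```

The `g.score < g2.score` condition is strict, so two games tied at a player's maximum each survive the anti-join, which matches the "three games with the same score" observation above.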
08:57:21 <09g​ammafunk> I don't, I seem to recall they take at least a few seconds each too though 08:57:30 <09g​ammafunk> there are far fewer of them of course 08:57:49 <09g​ammafunk> I forget if those come before or after the player pages in the update cycle 08:58:20 <06a​dvil> yeah, not sure, I will have to figure out how to get clan data locally 08:58:22 <06a​dvil> when I push this I'm also going to push py3 fixes that are hopefully backwards compatible 08:58:59 <09g​ammafunk> for clan data, it would just be getting the rcfiles into a local dir, the cdo crontab has all the relevant wget statements 08:59:18 <09g​ammafunk> or I can maybe give an answer about whether slow after you push and we actually get through a player update cycle in a reasonable time! 08:59:40 <09g​ammafunk> !nchoice 08:59:48 <04C​erebot> MfEE: 0 wins || AmberJosephineLiu: CXC, L20 Impaler of Gozag || Prakerore: CKO, L19 Alchemist of Uskayaw || particleface: CKO, L18 Stormcaller of Zin || Armakuni: CXC, L13 Spear-Bearer of Elyvilon || dilly: CWZ, L12 Nimble of Okawaru || Stipulation: CKO, L12 Fencer of Okawaru || Subbak: CXC, L11 Transmogrifier of Sif Muna || Implojin: CKO, L11 Spry of Hepliaklqana || Rubinko: CAO, L10 Martial Art... 09:12:53 <06a​dvil> I just ended up pulling a tarball of the rc files from cdo 09:13:21 <06a​dvil> definitely some slow clan stuff, not yet to individual clan pages though 09:16:16 <06a​dvil> lol ok, these are way worse than player pages, I'm seeing times in the minutes per clan page 09:30:00 <12e​bering> @advil iirc I used the left outer join because that gives you the whole row where the max is realized 09:30:03 <06a​dvil> mysql> select * from clan_highest_scores; [...] 101 rows in set (58.93 sec) 🙃 09:30:13 <12e​bering> which is needed to get the game info in the json... 
09:30:23 <06a​dvil> yeah I have done it with an inner join on the query I pasted above 09:30:41 <06a​dvil> SELECT a.*, JSON_OBJECT('source_file', a.source_file, 'player', a.player, 'end_time', a.end_time, 'charabbrev', a.charabbrev) AS morgue_json FROM games AS a INNER JOIN ( SELECT g.player, MAX(g.score) AS score FROM games AS g WHERE g.score > 0 GROUP BY g.player) AS b ON a.player=b.player AND a.score=b.score; 09:30:53 <12e​bering> ah cool 09:31:38 <06a​dvil> I have definitely come to the conclusion that what you did is recommended by various sources, I was just not familiar with it, and I cannot figure out why it is not going well 09:31:53 <06a​dvil> there is an index on (player, score) which it claims to be using 09:32:53 <06a​dvil> but rather than try to spend more time figuring that out I might just convert the slowest queries to an inner join using group_by 09:33:43 <12e​bering> sorry I don't have more time to help out! in oh shit mode your re-write seems reasonable 09:43:28 <09g​ammafunk> advil not sure if your work is related to this, and it's definitely a secondary concern relative to getting player and clan pages generated in a timely manner, but the script is also fairly slow in processing lines of logfile/milestones input. It's not much of a problem in practice unless we want to rebuild the DB from scratch though. 
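The join-back-to-a-grouped-subquery pattern advil pasted above can be sketched in miniature with Python's sqlite3 (an illustrative stand-in schema, not the real MySQL tournament tables, and with the JSON_OBJECT() column omitted): the derived table finds each player's best score, and the inner join recovers the full row so the other columns are available for the JSON output.

```python
import sqlite3

# Stand-in schema: the real `games` table is in MySQL with many more columns.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE games (player TEXT, score INTEGER, charabbrev TEXT)")
con.executemany("INSERT INTO games VALUES (?, ?, ?)",
                [("a", 10, "MiFi"), ("a", 30, "DEGl"), ("b", 5, "GrEE")])

# Derived table b holds each player's max score; the inner join pulls back
# the whole matching row from games, not just (player, score).
rows = con.execute("""
    SELECT a.player, a.score, a.charabbrev
    FROM games AS a
    INNER JOIN (SELECT g.player, MAX(g.score) AS score
                FROM games AS g WHERE g.score > 0
                GROUP BY g.player) AS b
      ON a.player = b.player AND a.score = b.score
""").fetchall()

print(sorted(rows))  # [('a', 30, 'DEGl'), ('b', 5, 'GrEE')]
```

Note this still emits one row per game when a player's top score is tied, same as the anti-join version, since every tied row matches the derived table.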
We'd probably only want to do that to fix the player case issue (which itself could probably be solved other ways) or to recover from some major DB fiasco 09:44:30 <09g​ammafunk> The number of lines that come in between a fetch are small enough that processing time isn't prohibitive for the normal update cycle 09:45:19 <09g​ammafunk> one thing I do also want to do is to try to get the scripts to do a major page update whenever the current unwon nchoice is won, instead of just waiting until the next cycle 09:45:43 <09g​ammafunk> but that's another nice-to-have and I can probably figure that out when I have some more time 10:02:34 <06a​dvil> I'm sure logfile/milestone importing can be sped up but my experience based on cao scoring (this codebase is shared enough that it inherits some of the same problems) is that speedups past the current rate involve a lot of nontrivial refactoring in how data is loaded 10:40:50 <06a​dvil> "remembered their ancestor 🦀 as a hexer." 10:40:53 <06a​dvil> til you can do that 10:42:17 <09g​ammafunk> hah 10:42:49 <09g​ammafunk> as I recall, at one point the scripts broke with an exception in the milestone handling 10:43:50 <09g​ammafunk> the culprit turned out to be a new (at the time) pleasingfungus commit that allowed spaces in randomly generated names, hence a pan lord kill with a space in the name brought the tournament down a day or so after it started 10:43:54 <06a​dvil> I can't seem to find the milestone in sequell 10:44:23 <09g​ammafunk> interesting, I assume checking by e.g. GID? 10:45:02 <06a​dvil> I was trying other things since gid is a derived field 10:45:33 <09g​ammafunk> might help to manually determine it and look at all milestones for that game in sequell 10:45:36 !lm implojin t0.29 degl 5 10:45:37 5/56. [2022-08-26 21:06:39] Implojin the Nimble (L9 DEGl of Hepliaklqana) remembered their ancestor 🦀 as a hexer on turn 10156. 
(D:7) 10:45:44 <09g​ammafunk> nice 10:45:49 there it is, not sure why other things I tried didn't work 10:46:30 <06a​dvil> anyways, py3 as-is is not extremely happy about the crab 10:48:49 <06a​dvil> or maybe I need to change some mysql setting 10:57:19 <06a​dvil> hm, everything seems to be utf8mb4 already 11:00:31 <06a​dvil> now a bit surprised that the py2 version doesn't crash on implojin's crab, I don't see any reason to expect it to default to 4 byte unicode on that mysql version 11:06:05 <06a​dvil> ok, looks like it is 11:15:42 <06a​dvil> locally a full db reload takes about 3 minutes to read all lines, which is really not bad at all -- I guess this isn't what you were seeing on cdo though? 11:26:05 <09g​ammafunk> that's right, it takes minutes to process 3k lines iirc 11:26:58 <06a​dvil> is cdo an ssd do you know? 11:27:12 <09g​ammafunk> not sure, but Napkin could give you more details 11:27:14 <09g​ammafunk> by the way advil 11:27:48 <09g​ammafunk> on cdo I've made a tournament_test database and a ~/tourney_dev checkout of the dcss_tourney repo 11:28:06 <09g​ammafunk> if you want to test things to see how cdo acts, you can use that db and that checkout 11:28:14 <09g​ammafunk> I did this when testing my bot-related changes 11:28:44 <09g​ammafunk> I have a local commit in that repo which has the script code use the tournament_test database (instead of the live tournament one) 11:29:08 <09g​ammafunk> because I couldn't install the mysql module locally for python 2 etc 11:29:15 <09g​ammafunk> so the only way I could really test my changes was on cdo 11:29:42 <09g​ammafunk> I also made it use a ~/website/tournament/0.29-test/ folder (which I've since deleted, but it can just be recreated) 11:29:52 <09g​ammafunk> for its output files, I mean 11:30:42 <09g​ammafunk> the local mysql client reads the ~/.my.cnf file to get the authorization if you want to use the client for anything 11:30:58 <06a​dvil> ok, sounds good, esp since I committed directly to the 0.29 branch 
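The crab trouble above comes down to byte width: the ancestor name contains U+1F980 (the crab emoji), which encodes to four bytes in UTF-8. MySQL's legacy "utf8" charset (utf8mb3) only stores up to three bytes per character, which is why everything has to be utf8mb4 end to end. A quick Python illustration:

```python
# U+1F980 CRAB sits outside the Basic Multilingual Plane, so UTF-8 needs
# four bytes for it -- one more than MySQL's legacy utf8 (utf8mb3) allows.
crab = "\N{CRAB}"
encoded = crab.encode("utf-8")

print(hex(ord(crab)))         # 0x1f980
print(len(encoded), encoded)  # 4 b'\xf0\x9f\xa6\x80'

# Anything above U+FFFF takes four UTF-8 bytes and thus requires utf8mb4.
assert ord(crab) > 0xFFFF
```

This is also why the connection charset matters, not just the table definitions: both the client connection and the columns must be utf8mb4 for the milestone to round-trip intact.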
11:31:38 <09g​ammafunk> yep, that's what I've likewise been doing, I actually only merge to master after the tournament is finished 11:31:40 <06a​dvil> hm, well, one thing I do with cao whenever I want to do a full reload is do it locally, export the database with mysqldump, and import it on cao from the resulting file -- it is substantially faster than just running the import on cao 11:31:53 <06a​dvil> because of ssd vs hdd mainly 11:31:57 <09g​ammafunk> I see 11:32:05 <06a​dvil> it's more of a pain ofc 11:32:10 <06a​dvil> but time is saved 11:32:19 <09g​ammafunk> certainly could do something like that, maybe I could do it on cdi 11:32:46 <09g​ammafunk> my old dev laptop here is not quite as powerful as your desktop :gammafHeh: 11:33:32 <06a​dvil> yeah, it's really the ssd that makes the difference more than cpu 11:33:44 <06a​dvil> I first worked this out on something that is a much older computer 11:34:01 <09g​ammafunk> sounds reasonable to pursue as a plan for what to do if we are just forced to do a db rebuild 11:34:20 <09g​ammafunk> in that I can do it on an aws instance probably (which would probably just be cdi for convenience) 11:34:43 <09g​ammafunk> oh, good, and now I'm reminded of how I still haven't built the debs 11:34:46 <06a​dvil> apparently I can do the whole thing in ~6 minutes so if I'm around probably I should just do it 😄 11:35:01 <09g​ammafunk> heh, sounds good to me, but didn't want to volunteer you for more work 11:35:17 <06a​dvil> one thought I had re that is if we can ride out the complaints it might be good to immediately release 0.29.1 after t with bugfixes, and just generate them then? 
11:35:34 <09g​ammafunk> yeah that's probably reasonable 11:35:38 <06a​dvil> also it kind of looked to me like robertxgray had a ci thing that was making debs, not sure exactly what I was seeing 11:36:04 <09g​ammafunk> this way I can leave cdi for just running qw for now without needing to interrupt anything 11:36:37 <06a​dvil> https://github.com/robertxgray/crawl/releases/tag/0.29.0 11:37:53 <06a​dvil> not sure what's in those exactly, but if they work and we merge that PR soon...slacking on deb making really pays off 11:39:34 <09g​ammafunk> yeah looking, but don't actually see anything in his github workflow that creates the debs 11:40:15 <06a​dvil> indeed 11:40:18 <06a​dvil> quite mysterious 11:40:37 <06a​dvil> maybe it's only in his clone, he has still been iterating on this I believe 11:51:31 <06a​dvil> possible he just built them manually to make sure nothing broke; I asked in the discussion thread 11:56:44 <06a​dvil> tourney_dev looks like it has a bunch of changes that I might not mess with (would need to switch branches) 11:57:45 <09g​ammafunk> hrm 11:57:58 <09g​ammafunk> oh sorry 12:00:18 <06a​dvil> I didn't restart anything but I have updated the views 12:00:36 <06a​dvil> which should have a pretty immediate effect on that part of the taildb loop 12:01:05 <06a​dvil> wow, I see what you mean about the line processing 12:01:29 <06a​dvil> that can't be just disk speed 12:03:18 <09g​ammafunk> @advil I have reset tourney_dev to the 0.29 branch and updated the local commit on top to use tournament_test and the corresponding test directories/urls 12:03:36 <09g​ammafunk> may not be necessary to use now but in case you'd like to in the future 12:04:40 <06a​dvil> thanks 12:04:51 <09g​ammafunk> yeah, re the line processing, not sure if cdo mysqld is throttled in some way or what, if you haven't looked at it yet, it's running from ~/mysql with 12:04:57 <06a​dvil> I can still use it to test restarting 12:05:30 <09g​ammafunk> you may need to manually create that 
0.29-test dir in ~/website/tournament 12:07:12 <06a​dvil> currently just waiting to see if the production instance ever decides to rebuild some player pages 12:07:38 <09g​ammafunk> it does that on like an 8 minute cycle 12:07:56 <06a​dvil> ah right 12:08:07 <09g​ammafunk> would recommend maybe killing/restarting taildb.py process, or maybe you're trying to see something more specific behaviour wise 12:08:25 <09g​ammafunk> and python loaddb.py if taildb.py not running does a single full update 12:08:33 <06a​dvil> just wanted to first see if the view changes made a difference 12:09:03 <09g​ammafunk> yeah, although is updating the view without first restarting taildb somewhat dangerous? I guess it only affects the page output 12:09:21 <09g​ammafunk> *views 12:09:37 <06a​dvil> should be safe, mysql can handle it and the contents of the views that changed shouldn't even be different 12:10:39 <09g​ammafunk> when I first added the cdi source, the first qw game in the logfile got duplicated in the tournament db 12:10:57 <09g​ammafunk> possible I did something silly like doing loaddb.py while taildb.py was running 12:11:52 <09g​ammafunk> that game is no longer visible from qw player page but probably still duplicated in the db 12:12:05 <09g​ammafunk> the very first game in the cdi logfile, a fewr 12:12:15 <09g​ammafunk> seems to have caused no ill effects though 12:13:15 <06a​dvil> aha here we go, it's going super fast now 12:13:52 <06a​dvil> 50-100ms per player page 12:15:47 <06a​dvil> that full update cycle (which seemed to update a lot of player pages) ran from 21:12:31,790 to 21:14:44,174, which seems fairly reasonable 12:21:40 <09g​ammafunk> yeah, sounds great 12:26:49 <06a​dvil> https://github.com/crawl/dcss_tourney/commit/e8eb9c6e6065123a52b904cad7cbde19b2beb9f7 was the fixing commit 12:33:36 <09g​ammafunk> thanks for that, makes the tournament update actually sane 13:02:20 <06a​dvil> haha oops, I think I'm loading the full db in this test instance 13:05:26 
<06a​dvil> actually running this on py3 will need some help from napkin, needs a new mysqldb implementation that I don't think I can install; but it seems to be working fine on py2 15:41:09 <09g​ammafunk> @advil you're probably done messing with this stuff, but napkin did set up a py3 installation using something called anaconda a couple years ago when we mentioned to him that we were trying to migrate the scripts to py3 (then of course we never did it). It's at ~/conda and probably could be updated to install a more recent mysqldb module 15:41:31 <09g​ammafunk> apparently it's only python3.7 though 15:42:20 <09g​ammafunk> now that you've done the initial work, at the very least we can consider updating that and using it for the next tournament 17:45:15 <06a​dvil> ah, if there's a user anaconda install that should make things a lot simpler 22:15:18 -!- Discord|2 is now known as Discord| 23:47:24 Monster database of master branch on crawl.develz.org updated to: 0.30-a0-50-gb2d611f6b1