I got dragged in a contest. Now that hadn’t happened in a long time! 🙂
I ended up 136th (so upper Gold), much better than I expected considering the experience lag I’m starting to accumulate in multis.
How did that happen? The memory’s fuzzy. Let me try and recollect what I can from git, chat and web logs.
For starters, I usually don’t take (seriously) part in the contests. I did in the past, then hit a pretty long streak of not being able to attend the event for real-life reasons, and lost the rhythm and drifted away for some time before coming back to the community.
I consider myself a decent programmer, but as I may have mentioned before, the competitive aspect of it is really quite alien to me. Statement quality varies wildly from an event to the next, but I did notice there’s always a very pleasant atmosphere on the various chat channels. Old timers chime back in, newbies aplenty with their distinctive set of first questions most of us can remember asking when it was our own time. Much sharing. So social. Wow.
It started kind of how most last contests did for me: an opportunity to share a bit of real-world, live Haskell coding.1 So I streamed2 my unboxingdiscovery of the statement, and all I could achieve in an evening’s work.
The opening introduction by Errichto was a nice surprise. Not that it was particularly informative, but I’m a fan, so that’s that.
Thus opened Wood-2. The wood leagues, in the challenges that have them, feature a restricted set of the rules, aiming at gradual onboarding of those who aren’t too familiar with the platform. Their usefulness is still pretty debated, and many would like an option to skip them. But as far as educative streaming goes, they’re acceptable. My point being: I don’t really remember much of what the statement’s actual restrictions were. What I do remember is that the simplest solution I could come up with was the mathematically-perfect one, implemented as a maximum over nested loops.
let (rs,(a1,a2)) =
maximum [ (bYield (aBrew a1) + bYield (aBrew a2),(a1,a2))
| a1 <- actions, a2 <- actions, aId a1 /= aId a2
+ bDelta (aBrew a2)
, vPos (bDelta (aBrew a1) + pInv self) ]
let pick | bYield (aBrew a1) >= bYield (aBrew a2) = a1
| otherwise = a2
Mmm… I read that as picking two actions a1
and a2
among a set, likely provided in input. Ensuring they don’t share an id. Also checking “something” (vPos
) about the sum of their “delta” with “inv”. “Inv” likely being an inventory.
And then “pick”ing the maximal one along a measure called bYield ∘ aBrew
, which can probably be summarized as “the better one”.
That wasn’t too hard. What’s lurking in the next league?
Wood-1 got me stumped. Oh, it didn’t stump me in an “oh my god I don’t know how to handle this” kind of way. No, it was more of a “this is a wood league, we’re kind of expected to be able to get through this with a one-liner, yet I can’t think of anything simple that would fit the bill.”
So… My hands, as if on autopilot, typed the following:
bfs :: St -> [(St,[Action])]
= go Set.empty . pure . (,[]) where
bfs = []
go _ [] @St{..},as):q)
go cl ((st| stBrewed = [(st,as)]
| st `Set.member` cl = go cl q
| otherwise = (st,as) : go (Set.insert st cl) (q ++ q')
where q' = map (second (:as)) $
filter ((`Set.notMember` cl) . fst) $
children st
Really, I think it’s a first. A search algorithm to get out of the woods.
I learned later on it was possible to pass with a much more ridiculous “play a random action among those provided” algorithm. To date, I’m still not sure whether I should feel stupid or offended.
Anyway.
My fancy search algorithm passed the boss and reached Bronze.
Of course, it crashed3 in Bronze, since the inputs had changed yet again. I updated what I could understand I ought to from a cursory glance at the statement’s differences. And submitted.
Top tercile in the league! Not too shabby for an evening’s stream. And an algorithm I only wrote to get myself out of wood.
Real-life called, so I stopped the stream there. But not without commenting that I was worried the contest would decay into some variation of compute-heavy linear optimization. That would be close to the opposite of fun.
On Sunday I edited the video and played some git golf to reproduce a decent history.
Monday evening, about the time Silver league opened, I got back to it. The upper leagues were bound to be un-fun, but I hadn’t finished what I came to stream in the first place: an actual simple implementation of the full Bronze rules.
Except I wouldn’t stream anymore: both @egaetan and @R4N4R4M4 had joined the club, in French as well, and they’re a helluva lot more interesting than I.
What’s missing to claim a bronze-class bot? The learning curve, mostly. With the BFS approach, there’s actually not much to do: simply integrate the learning action to the state generation, and you’re golden: making use of them will be a natural part of the search. I also gleaned from chat eavesdropping that the inventory was bounded, so I pruned those newly invalid states.
I submitted the improvements and watched my bot rise to the upper league.
LOLNO. In my dreams. I submitted the “changes” and sank my rank to the lower (bronze) quartile. I know, right?
It was late, I’d had a long day, I called it quits.
Tuesday noon I fixed some of my timeouts by fudging my time checking routine. I wish I could, like most, use clock_gettime(CLOCK_MONOTONIC)
. But that’s not how we roll around here. Not only do we have an obsolete compiler that’s been out of support for some time, we’re also lacking on the libraries front. So for better or worse, I’m using getSystemTime
. It returns the useful part of its result, the nanoseconds, as a Word32
value.
While a Word32
is certainly enough to hold a nanoseconds value, there isn’t a lot of leeway up there. Combined with the fact it’s returned as an unsigned integer, we’ve got a latent bug nest right there. Nothing impossible, but cautious coding required.
In the evening I had the genius idea of updating the “known spells” part of my state record when I learned a spell. The subsequent ranking confirmed the hunch: it’s a better way of doing it than the previous haul-a-constant-set-of-four-starting-spells-around-the-entire-BFS one. A much better way. Ranked 66 in the league. I don’t remember the league size at that point, but it’s definitely “upper” Bronze.
Closer to midnight, I messed around with some compiler options, looking for quick and easy performance wins. I happened to submit the exact same code, with optimisations on and a heap size limited to 128 MiB.
Surprise, low Silver!
Wednesday noon—six hours to the Gold league opening—I upgraded my heuristic for ingredients from “sum” to “weighted sum” and migrated my quadratic queue to a more appropriate logarithmic Seq. Logarithmic isn’t exactly perfection for this use case, but this is a quick win. Win enough to reach 225th place. Something like halfway to Gold, if you’re into that sort of terminology.
In the evening, after the league opened and my local rank shifted upwards accordingly, I completed a few easy elementary upgrades to my codebase:
- I restricted the int width of everything to the appropriate limit.
- I renamed just about every identifier to some cohesive ensemble.
- I removed the crash when the best found state is the tree’s root.
- I moved input “castable” from a spell attribute to a player’s.
More chat eavesdropping told me the order bonuses were now a part of the input. I eliminated my own tracking code and made use of the input instead. This cured my crashes in the case I failed to unify a tracked potion with its counterpart in the inputs. Which would happen whenever the opponent would naughtily brew the same bonus-laden potion as I.
I also wrote a quick filter/code-generator around Deck.java
. It was intended to perform some static analysis of brews and spells, to try and gain some insight about their interaction. It did bring a few things up at the time, but nothing that survived in the end. Especially compared to the formulae you can find in others’ postmortems.4
I don’t have a recording of my rank at the end of this. I’m not even sure I submitted it at that time. So let’s keep with a conservative “still upper Silver, but not close to top” state of mind.
Thursday I didn’t have much time to afford to the contest. All I recall improving was:
- migrating spells to bitsets,
- fixing some more unification between state and inputs around order bonuses, and
- rewriting all I/O to use ByteStrings.
Considering the size of the inputs, that last one’s probably a negative net gain⁄time spent ratio.5 But I’m starting to run out of fun stuff to implement, and I must track that measure very closely too if I don’t want to get bored out of the game too soon.
Friday noon I completed my last battles monitor, to get easier and more convenient access to those battles that resulted in a timeout for my bot. Because I hate my bot timing out for unexplained causes. They cause the final ranking a lot of harm when I submit. I must be getting rusty, my bot times out a lot more than what I used to code on previous contests.
I switched my heuristic from 2tier ingredient weighting to the more commonplace potion cost formula. My ranking, that had meanwhile descended to the 400 range, started its run mid-Silver, and gradually fought its way to the top. The very top. As in, ranked 1st in Silver, just beneath the boss.
As I hadn’t yet implemented any of the instagold heuristics that were discussed on the chat, I felt pretty confident they’d be all I’d need, and concentrated on other changes. Changes with a vision, long-term changes.
The bot stayed around rank 1 all afternoon. It was still there when I got back to coding in the evening, around the time the Legend league opened. I implemented a few remaining low-hanging fruits:
- migrating the potions to bitsets as well
- paranoid assertions around anything that has to do with bonuses
- fixing that @*&#% timer at long last
So now the bot comfortably, reliably and stably uses all of its allowance minus a few milliseconds’ margin. And I’m still getting random unexplained timeouts. Oh well, we’ll see about those tomorrow.
Saturday I updated the players’ state to remember who brewed what and reduce data volumes a bit more. I continued to investigate the timeouts, and still didn’t find anything actionable. Distressing.
And I implemented the instagold brew decay formula at last.
Except it didn’t instagold me at all. Really, it barely made any difference to my ranking. My ranking that, by the way, wasn’t so top-silver anymore. I don’t have records for that, but I remember spending a very tense Saturday, attempting numerous variations on my heuristic and on my runtime settings to try and contain my two most pressing problems: not being top silver anymore, and the timeouts. Not conflating the two, but not totally separating them either: timeouts have the worst influence on the best rankings.
You can tell it was a bad day by glancing at my git log: there’s barely anything in there for Saturday. The heuristic change, “more and better GC stats”, and “endgame acknowledgement”.
That last one is exactly what it says on the tin: trawling through my oh-too-numerous defeats looking for obvious quirks to fix, I did notice I lost a few (which is to say, too many) games by a brew action of my own. Stupid, huh? So I integrated the opponent’s maximal T+1 “endgame” score in my heuristic so as to detect and prune self-defeating moves.
The git-invisible aspect of that day is a lot of head scratching, thinking through many Many heuristics and trying out a few of those in the arena. For submits were taking longer and longer.
I went to bed with a lame-ass silver ranking feeling pretty depressed about the whole thing.
Sunday morning, I gave myself a deadline. Stable-enough submits took about an hour, so I’d have time for three in the morning. I was to immediately fallback my heuristic to what had been Silver#1 state. Then I’d attempt three distinct-yet-promising enough heuristics variations I’d gravitated around the day before. I’d keep the best one and use it in a lower-level language to try and get more depth out of my BFS. Hopefully that would make it to Gold at long last. Also, who knows, maybe I’d intuit a valid assessment of what makes it actually good, so I could use it in a beam search like half of Legend is openly doing.
So, in order:6
- the most elaborate one I could come up with, backed with spreadsheet and a day’s battles’ worth of statistics, aiming to optimize score, mana, and “rebound” ability.
- just discounted rupees, strict inventory score and spell castability bonus
- discounted rupees and lax inventory score
As I watched the first run start,7 I noticed there was no need to wait until the third run was complete to actually start working on the code.
So I wrote 347 lines of C.
How do I remember the exact count?
Glad you asked! Writing them took me all morning. Hey, I’m rusty, ok! I started with the input handling, because a lot of the data structures emanate from there. By lunchtime, I had a working input parser, with well-typed8 support code for the various stems of ingredients, brews, spells, action parsing. And a port of my Haskell “state” type.
At such short notice, I can’t spare much debug time, if any. So it’s all extremely explicit C, steering clear from anything remotely dangerous. Struct9 typing, no pointers. Pass-by-value everywhere, there’s always time to improve it once it’s correct. That GADT in my I/O handling will have to be rewritten into something tractable to the C type system.
And then something weird happened.
I came back from lunch greeted with chat highlight notifications. Scrolling up, it was all congratulations from my fake school mates. But what for? It turns out that final heuristic, for reasons unbeknownst to us mere mortals, had made it to lower tercile Gold.
My reaction, as you can guess, was one of pure unadulterated joy:
I took a moment to regroup, right there. What was I supposed to do at this point?
Complete and debug the C, reaching feature parity with the Haskell, and move on from there? Uncertain payoff: there’s no saying I’d be able to complete it in time. Let alone debug it. C has top-notch debugging facilities. Haskell not so much. Then again, Haskell doesn’t write nearly as many bugs as C. Anything too funky would be very unfun and I’d regret wasting a fine day.
Or get back to Haskell, expanding from a base that provedly works? Past the previous days’ low-hanging fruits, there were other not-too-high avenues for improvement. Amortized constant-time queueing. More unboxing and unlifting. A yet smaller state representation. Searching for more than one turn of opponent moves. Beam search. A more elaborate learning scheme than { LEARN 0 } × N
.
The sunk cost fallacy got the better of me. I haven’t coded in C in a long time, so that does carry a form of intrinsic “fun”. And my Haskell list was in good part performance stuff, which wouldn’t need to come up on the C side until much later. If at all.
I didn’t have the entire afternoon to dedicate to it, but the conversion happened at a steady enough pace. Porting to a lower-level language is relatively mindless, if verbose. By 16:20 the engine—state management and transitions—was complete, and I turned to the hotline for the C minutiae.
JBM> What’s the C incantation for popcount?
anonymous> What’s that?
pb4> JBM: 64-bit, I assume?
egaetan> I use __builtin_popcount()
JBM> pb4: Right you are!
pb4> JBM: __builtin_popcountll()
JBM> egaetan: Seems to me you’ve got a latent bug :-)
NARRATOR> egaetan enjoyed a nice ranking bump shortly after.
By 17:24 I was writing the time management. Interestingly, where Haskell used Word32
, clock_gettime()
returns the nanoseconds as a C long
. Was I granted a free pass with regard to the unsigned integer wrapping bugs I had to work around earlier this week?
JBM> Are C longs on the platform 32 or 64 bits?
anonymous> 64
JBM> Thanks!
NARRATOR> They weren’t.
By 17:30 I was fixing the same off-by-3 bonus reconciliation mistakes I had in the beginning of the week.
Real life called and I was AFK until 21.
By 22 I was mostly able to run a game, but it kept timing out around turn 15. The monitor also reported many invalid action attempts. I mentally translated both as bugs in my engine, and went on debugging.
By 22:16 I realized both my C and Haskell had been reliant on the fact that all spells were distinct by their delta value. A fact I’d never bothered to verify.
JBM> Are all spells distinct in their delta value?
anonymous>10 Yes, I think they are.
JBM> Thanks!
NARRATOR> That much hasn’t been disproved yet.
By 23:24 I eliminated the source of the turn 15 crash: it happened when the best found state is the root of the tree. Sound familiar? It was orders of magnitude harder to pin down: where the Haskell crashed with an exploitable message “Prelude.last: empty list”, the C just sought (and failed to output) an invalid random action. Obviously a bad memory state, but caused by what part of the code?
Narrator: It’s caused by an uninitialized variable when you trace the actions from the best found state back to the root of the tree.
Hey! Get out of my postmortem, you, it’s painful enough as is!
He’s right, though. In a nutshell, my best move selection looked something like this:
action_t choice;for ( int i = best;
queue[i].prevAction.type != ROOT;
i = queue[i].prev ) choice = queue[i].prevAction;
Variable choice
wasn’t reflex-initialized by being a part of the for
initializer, since it needed to scope out of it. The second clause gave a fake sense of having handled the root case, and… there you have it! An uninitialized variable when the root happens to be the best.
By 23:53 I spotted this gem:11
typedef struct spell {
int8_t id;
/* and other fields */
} spell_t;
typedef int64_t spells_t;
void spDelete(const spell_t s, spells_t *ss)
{
*ss &= ~s.id; }
Spot the error? Hover for a hint.
My sleep is lacking, my C quality is degrading and it’s showing. At least solving that let my bot play complete matches, with no more invalid actions. Ironically, I only found non-null winrates against the community manager’s bot. Still pretty lame.
Some time later—memory’s fuzzy for obvious reasons—I replace all of my (1 << x)
with (1LL << x)
. Yet fewer invalid actions, but nothing decent yet.
At 1:48 I take a break from the debugging and come to the channel to request the magical pragmas to make C fast. Not that I really need them right now, but I will eventually, and there’s no saying I’ll find someone online at that time. Two contestants12 give me their line. I try both, and notify the one who gave me an invalid one he’d better settle for the other.
Around 2:15 I’m having a hard time keeping my eyes open. My bot still won’t win much: all too often it learns its starting batch, brews a potion… and then proceeds to learn the entire tome on an empty inventory instead of casting or brewing anything. Not too winning of an attitude. The sunk cost fairy is nagging me hard.
I get started on duplicate state avoidance. That one was for free in Haskell, but in C I’m on my own. A hash table is the obvious way to go, but at this stage in the night I have no chance of getting a bug-free one up and running. So I cop out and just use the one from glib
instead, high performance be damned.
Sometime around 2:45 I get it to work, and I get the expected ÷10 in rate of nodes expanded.
By 3 I’m so tired I’m starting to see pink elephants on parade. I’m still debugging my sub-par strategy making, and stumble upon this block I wrote a few hours earlier as a part of my node expansion:
for (int spell_index = 0; i < N_SPELLS; spell_index++) {
const spell_t spell = spellList[spell_index];
if (!spMember(spell, st.self.castable)) continue;
for ( int mult = 0;
10 : 1);
mult < (spell.repeatable ?
mult++ ) {
inventory_t inv = pay_muliplied(st.self.inv, mult,
spell.delta);if (!validInv(inv)) break;
/* create and enqueue new node */
} }
Found the issue?
At 3:06 I submitted my code again, now devoid of the nastiest off-by-one of my platform career. And it rose, rose! It rose to top Gold!!!
It still had the same cases of finishing the game to an opponent’s win, so I adjusted it the exact same way I did with the Haskell earlier in the week: by simulating one turn of the opponent’s move and maximizing endgame value. Way dirtier, though. We’re talking multiple times past my bedtime C, copy-pasted from the BFS expansion with mass search-replace of “self” to “opp”. Something that would have been clean from the start, you nagging pixie. Though I have to admit it could be done cleanly in C too, just… with time to spare.
Submit code… 3:47… bid channel g’night… I forgot where I left my bed… Just lemme sleep, mkay?
The week concludes on 719 lines of Haskell. 1 177 lines of C. 66 lines of Perl. 58 lines of Cabal.
I took Monday (kinda) off.
When I emerged, the invalid C pragmas guy had thanked me for my contribution to his ranking. And my own bot had stabilized at rank 136. Still top Gold. I can only assume better Gold than the submission without the opponent, as I haven’t found any bug in that part of the code yet. But I don’t think I’ll ever be able to prove it.
What a ride.
Do I have any take-aways? Lessons learned, that sort of shit? I’m not even sure.
The XMPP chat is definitely a more interesting place in times of contest. Helpful people, helpless people, clueless people, heuristic leeches, the good old boys and girls, but always something common to talk about.
C’s still a badlow-level language. Not much wrong with that per se, but it comes as no surprise we can’t expect the same level of support from the compiler as we get in a high-level language.
I was impressed at its code simplifications abilities, though. At the very beginning of my BFS port, I statically allocated a 1 Mi-node array, to avoid spending any more time on memory management. The word on the street at the time was most people in low-level languages reached 100k to 150k nodes. So my buffer ought to have been plenty, especially before I started optimizing for node size.
Yet my program was very happy to report it as exhausted in less than 50 ms. And I don’t think it was a bug of any kind: simply it was before I implemented any kind of action output. So any result from the search wouldn’t be observable, and GCC was able to correctly infer it was all the same as simply pretending it incremented my allocation count above the array size and reporting the overflow error. The kind of simplification I only expected from pure languages.
It probably helped that I’m able and used to expressing pure computations as such. FP is the future of low-level languages too, whether their practitioners realize it or not.
Amidst the lockdown more than ever, it appears I have to remain highly self-conscious around what makes the contest fun, to be able to remain and not get kicked out by boredom. Writing this PM, I realize how much the little steps were useful. Not so much to make the program more efficient at whatever it was doing, but simply to give me the will to keep investing a little more time in the endeavour.
I’ve been disappointed with the general13 level of both the contestants and the chat members. I don’t mean this in a particularly pejorative way. What I’m saying is, while I was detached from contests and just browsing the forum threads, it appeared to me as though the top achieved wonders. But now I’m 1st or 2nd percentile with a BFS and a basic heuristic, so I don’t exactly feel like an AI hero. My journey only appears incredible14 because I switched languages at the last minute.
I’ve now also, with an insider’s view, witnessed the amount of cargo cult that went in quite a few of the top’s AIs. Boy was that a depressing experience. I might have to publish a worst-of digest, if I can think of a proper way to anonymize things.
Disregarding the failed yet honest attempt at providing me the ABI word sizes, I have read truly horrifying things. I saw a high-ranking player claim he had scientifically tried BFS both with and without duplicate node avoidance, and went without because it performed 10× better on the “nodes evaluated” scale. What do you want to retort to that? They’re not kidding when they say the common folk don’t understand exponential growth.
I’m as ever-so-grateful as always to the staff and company for granting us a new contest with great wearable prizes. Really, this contest had stunning visuals, don’t you think? I’ve checked the source, and they represent most of the lines of code, surely that’s saying something. I wish I could bring myself to give a damn about them, but I’m such an unredeemable audio thinker.
As for the game, I guess it was ok. Nothing ridiculous like shoehorning rock-paper-scissors into a random classic or anything. The short, bounded matches probably helped a lot at giving reasonable rating time, at least in the beginning.
The I/O protocol was really sub-par. Frankly, it’s insulting they’d ask us to give proper identifiers to variables in the contribution rules, and then proceed to provide us with a potion’s bonus value in a variable named tomeIndex
. And I say that as someone who, for better or worse, knows and suffers the SDK stub generator inside-out.
Do I have to mention the timeouts again?
If I’m ever to play contests with a Legend state of mind, I’d better brush up on my Rust. Kudos to @emil, it’s now a better integrated language on the platform. I wish I could come up with such a scheme to get Haskell to a decent level of support. Or merely a compiler upgrade. I’m not holding my breath. But hey, I am aware Rust exists, and of its promises both on the type/memory safety front and on the performance one.
I’d like to congratulate @Sakisan, who remained pure to the end and ranked his final score with a Haskell bot.
I wasn’t lazy15 enough for that. But I do have the dilemma now: should I get back to my old faithful Haskell code base and improve that to Legend, or just claim my 500 XP in C and be done with it? Semi-related, should I distill my bronze bot to a Haskell starter kit for the game, one that actually produces something usable out of the inputs?
Mmm, this was longer than expected. Was it of any use? Only you can answer that!16
Many thanks to all of you who either supported me throughout the week, or just gave me something funny to write about!
This article hasn’t been proofread yet. Be the first to make fun of me by reporting one of my mistakes!
And there’s demand. Niche demand, but demand nonetheless. BTW, if you’re that user who came to ask me for my code the other day: I was on jabber.el, I don’t have your CG nickname. But you can “watch” my stream thread on the forum and you’ll be notified every time I think of updating it.↩︎
I tried it in French, for a change. Functional programming on that platform is as niche as it gets, and it’s better for my streamer’s morale if it actually feels more like streaming than recording a performance of an educational video. Having the #Fr regulars listen to me in the background, ready to pounce on my n0oBz mistakes, is a great support.↩︎
This, for once, is not a cue to poke fun at me. Having the code crash in the face of unexpected inputs is a wholesome part of ensuring they are actually dealt with properly.↩︎
So go read those instead. What are you waiting for?↩︎
Though you got the general idea in that formula, it’s probably very wrong too. You’re welcome to help me correct it.↩︎
In order and from memory, as there’s no time left for git commits when you’re still silver less than a day from the end. I’ll try and dig it out of the submit history, as I’ve been told it’s now available. Looks like we lost the end-of-contest email report on the way, though.↩︎
And probably reflex-restarted it thirty seconds in because six is too many losses for a first ten.↩︎
Let’s say as well-typed as can be in C.↩︎
Not a typo.↩︎
I’m sorry, I forgot who you were. Remind me and I’ll update.↩︎
Contracted for your convenience. They weren’t more than a page apart, but they weren’t all next to each other either.↩︎
Forgot names here too. Please get in touch.↩︎
This word means: not you, the others.↩︎
In my very own opinion, guaranteed 100% pure with no pieces of bias inside.↩︎
For the outsiders: SLPJ famously recounted: “laziness kept us pure.”↩︎
And please do. You know how to reach me.↩︎