Wednesday, March 14, 2018

How I Approached the Design Test

Mark Rosewater recently shared what the final Design Test was to help determine the finalists for GDS3, and then went over how he might've attacked the problem in Reading the Designs. I found that insightful and interesting; some of his methods were similar to mine, but some were very different. Today, I'm going to share how I approached the test, and suss out some conclusions from his approach and the differences. I'll also share a few methods of playtesting small numbers of Magic cards.

The images below are renders of designs from my design test; they are not Magic cards or previews.

The first thing I did upon learning that we'd be designing cards for each two-color pair was to go back and read up on all ten of them, as well as the two articles about ally and enemy color pairs. There are a handful of people not employed by Wizards that are as well-versed in the color pie as I am, and just a few more so, but leaning on that confidence would be foolish, and a refresher can only help. Here are those articles:









{BG}—LIFE AND DEATH (Because green's goal has since been clarified from Growth to Acceptance, some of these articles are out-dated.)




After reading each article, I brainstormed a bunch of designs to express each pair's combined philosophy, almost exclusively from a top-down perspective. This approach would give me the best chance of finding novel mechanical space appropriate to a pairing, and it guarantees (in theory) all these ideas are color-appropriate, which is obviously a huge component of the skills being tested.

Fiona represents the white goal of peace achieved
through the blue method of acquiring knowledge.
I asked myself questions like: How could I use red methods to achieve blue goals? If black and green value this, how far will they go to secure it? If white and red disagree about this, what does that look like on a card? I also asked some more mechanical ones: How can I intimately combine bounce and life gain? Only black gets proper discard, but blue and red cause incidental discard; how can I bridge that gap? Etc.

I printed up the usable share of these ideas (any design with significant potential), cut them out, and made a grid with each color pair on one axis, and each card type on the other, helping me to visualize my options. IIRC, I had relatively few {RW} and {UB} designs, so I selected those first. From there, I started selecting the designs I liked the most (prioritizing elegance, novelty, and flavor in about that order). Next I selected what fit. And finally, I found the holes—I think there were one or two—and then designed a couple options each to fill them.

The next day, I playtested those 10 cards. The fastest way to playtest a dozen or fewer random cards is to separate them into two decks so as to minimize the number of colors in each deck (trying to keep synergies together, and putting more colors/splashes in the deck with more fixing [usually the base green one]) and then fill each deck out with the most generic staple cards you can find. Keeping a box of commons (and maybe uncommons) from the latest core set is great for this (I'm using Magic Origins).
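For anyone who would rather let a script suggest the split, here is a minimal sketch of that idea. It is my own illustration, not a tool I used for the test: the function name `split_into_two_decks` and the scoring rule (total distinct colors across both decks) are assumptions, and it ignores synergies and fixing, which you'd still weigh by hand.

```python
from itertools import combinations


def split_into_two_decks(cards):
    """Brute-force the even split with the fewest total colors.

    cards maps a card name to its color letters, e.g. {"Fiona": "WU"}.
    Hypothetical helper: it only counts colors and knows nothing about
    synergy or mana fixing.
    """
    names = sorted(cards)
    half = len(names) // 2
    best = None
    for deck_a in combinations(names, half):
        deck_b = [n for n in names if n not in deck_a]
        colors_a = set().union(*(cards[n] for n in deck_a))
        colors_b = set().union(*(cards[n] for n in deck_b))
        score = len(colors_a) + len(colors_b)
        # Keep the first split found with the lowest color count.
        if best is None or score < best[0]:
            best = (score, list(deck_a), deck_b)
    return best[1], best[2]


# Four placeholder cards: two white-blue, two black-green.
example = {"a": "WU", "b": "WU", "c": "BG", "d": "BG"}
deck_a, deck_b = split_into_two_decks(example)
```

With ten cards that is only 252 splits to check, so brute force is plenty. Treat the result as a starting point and then shuffle cards between decks to keep synergies together.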

If all the cards you're testing have some synergy (like being all the commons of one color in a set), you might just print multiple copies and make a deck out of just those cards. If that synergy spreads across, say, three colors, you might build both decks to test it, trying CD versus CE and then CC versus DE, to see how different combinations work.

My final grid needed a green-blue instant.
The card draw is there so you never get
confused how large your target will
be after this card leaves your hand.
You're looking for a number of things during these games. Are any of the cards confusing, or do they do something in a way your testers didn't expect? Are they way too strong, way too weak, way too narrow, or way too swingy? (You won't be able to tell if they're a little bit any of these things without tests in the double-digits.) Is this effect happening too soon in the game because its mana cost is low? Is this one happening too late because its threshold is too high? Would this effect be better on a creature? On an instant? Should we make this repeatable creature ability into a one-time trigger? Is this card helping the strategy or theme it was built to support, or distracting from it? I could go on.

Don't be afraid to change cards in the middle of a game. When you discover something's obviously problematic, playtesting has served its purpose and you should act on that; don't finish a broken game out just because that's what you'd do in a tournament. Either start a new game or finish with the fixed game. Fix the game state too, if you need to. The goal is to see as many card interactions and game states as possible in as little time as possible.

Similarly, you don't always have to stop playing when someone wins. Rewind a turn or two and make different choices. Cycle out the card that won you the game. Because the late game takes longer to get to by its nature, and is terribly important in determining how players remember experiencing the game, dipping into alternate universes to see more late games is of immense value. This technique is particularly useful when the game is close, or when there are cards still in your hand you want to test; less so when it's a landslide.

Check Your Work
There's a twist to this story: I completely missed one of the requirements of the test until halfway through. Here they are again:
  1. All the cards will be two-color and each of the ten two-color combinations (listed below) need to be represented.
  2. Each of the following five card types (creature, enchantment, instant, planeswalker and sorcery) needs to be represented twice, and never on the same color.
  3. Each rarity (common, uncommon, rare, and mythic rare) must be represented on at least two cards.
  4. Submit your cards in order of quality of design from what you consider your best design (first) to what you consider your worst design (last).
Requirement 2 is actually two requirements, and one of them is shaped like a little clarifying phrase after a clarifying parenthetical. I had missed that each card type could only use each color once, and that if I made a CD sorcery, the other sorcery would have to be EF, EG or FG. I quickly checked to see if this revelation had invalidated my selection. Oh yeah. Ohhh yeah.
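In hindsight, a mechanical check would have caught this on day one. The sketch below is a hypothetical checker, not anything Wizards provided: `check_submission`, the tuple layout, and the placeholder grid are all my own assumptions. It covers requirements 1 through 3, including the easy-to-miss "never on the same color" clause; requirement 4 is a judgment call no script can make.

```python
from collections import Counter, defaultdict

ALL_PAIRS = {frozenset(p) for p in
             ("WU", "UB", "BR", "RG", "GW", "WB", "UR", "BG", "RW", "GU")}
CARD_TYPES = ("creature", "enchantment", "instant", "planeswalker", "sorcery")
RARITIES = ("common", "uncommon", "rare", "mythic rare")


def check_submission(cards):
    """Return a list of problems; an empty list means requirements 1-3 hold.

    cards is a list of (colors, card_type, rarity) tuples,
    e.g. ("WU", "creature", "common"). Hypothetical format.
    """
    problems = []
    pairs = [frozenset(colors) for colors, _, _ in cards]

    # Requirement 1: ten two-color cards, one per color pair.
    if len(cards) != 10 or set(pairs) != ALL_PAIRS:
        problems.append("color pairs: need each of the ten pairs exactly once")

    # Requirement 2: each type twice, and the two cards share no color.
    by_type = defaultdict(list)
    for colors, card_type, _ in cards:
        by_type[card_type].append(frozenset(colors))
    for card_type in CARD_TYPES:
        found = by_type[card_type]
        if len(found) != 2:
            problems.append(f"{card_type}: expected 2 cards, found {len(found)}")
        elif found[0] & found[1]:
            problems.append(f"{card_type}: both cards share a color")

    # Requirement 3: every rarity on at least two cards.
    rarity_counts = Counter(rarity for _, _, rarity in cards)
    for rarity in RARITIES:
        if rarity_counts[rarity] < 2:
            problems.append(f"{rarity}: needs at least 2 cards")
    return problems


# A placeholder grid that satisfies everything (not my actual submission).
valid = [
    ("WU", "creature", "common"), ("BR", "creature", "common"),
    ("UB", "enchantment", "uncommon"), ("RG", "enchantment", "uncommon"),
    ("BG", "instant", "rare"), ("RW", "instant", "rare"),
    ("GW", "planeswalker", "mythic rare"), ("UR", "planeswalker", "mythic rare"),
    ("WB", "sorcery", "uncommon"), ("GU", "sorcery", "rare"),
]

# The mistake I made: same pairs and rarities, but two types now repeat a color.
broken = list(valid)
broken[1] = ("UB", "creature", "common")
broken[2] = ("BR", "enchantment", "uncommon")

valid_problems = check_submission(valid)
broken_problems = check_submission(broken)
```

Note how constraining the hidden half of requirement 2 is: the ten pairs have to split into five partnerships of color-disjoint pairs, and each pair has only three possible partners.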

This threw off my plans to go to a game design retreat that weekend, but at least I had plenty of time left to re-work things.

Selection, Again
I looked at which of the ten cards I had settled on initially were my favorites and I looked at which cards worked together under the newly realized fifth requirement. I decided to keep these four:

And I ditched the other six. Going back to my original pile of brainstormed ideas, I selected a few of the promising ones that fit. And then I had a couple holes, so I designed several options to fill them. I actually had enough promising cards at this point that I had two complete sets of 10 (one with a different blue enchantment than Bucket List and a different black planeswalker than Jori).

More Playtesting
I playtested both sets. Since I had 20 two-color cards, I made an RWub deck and a GUBrw deck. That's super-awkward, so I stuck some cardbacks in my sleeves and called them five-color lands (since that's faster than hunting down a bunch of dual or tri-lands). The cards most affected by playtesting were the planeswalkers. It's harder to anticipate how 'walkers will play than all the other card types, because everyone has a lot less experience with them, they're inherently more complicated, and they just work differently.

Several of my cards were under-costed. Oops.
Playtesting also helps you find the corner cases you'll often miss in theorycrafting. For example, Recall with Purpose originally ended with "Attach that aura to that creature" but I had never restricted the player from getting Bought the Farm. That had to be solved, either by specifying "search for an aura that could enchant that creature" or by opening it up and letting you get any aura to stick anywhere. Opening it up lost a little bit of flavor, but this isn't a story card, so I opted to let players be clever. Remembering that auras entering the battlefield automatically attach to a legal target helped. (I wasn't aware that putting both return actions in the same sentence meant you wouldn't be able to attach the aura to that creature, but that's a trivial templating solve.)

Make Deliberate Choices
In game design, as in improv, the only choice worse than the wrong choice is no choice. If two (or N) options seem similarly strong, don't fret over the choice, just make it. Commitment is really important in any space where you're expressing yourself. It matters what you say, but only if you say it with conviction. So I had two sets of 10 and I was really happy with both. I chose the ten you saw. I wish you could've seen my Tibalt or my Bioneer, or Day at the Market. Hopefully, one day you will.

I also changed a card at the last moment, which is dangerous, but I decided I'd rather have a wicked simple common than a meandering one.

I still quite like the elegance of pairing the very red "~ deals 2 damage to a creature and its controller" with the very black "~ deals 2 damage to a target and you gain 2 life," but it's clear in retrospect that my submission had enough straightforward A+B combos, and I could've ventured more risk.

The MaRo Method
Before I go, I want to briefly examine how Mark Rosewater designed his sample {WU} card. I'm sure he doesn't literally list all the abilities primary in a card's colors, and instead shortcuts to a mental list of them. Still, the idea of carefully and deliberately culling those long lists for cardtype- and rarity-appropriateness, and culling again for interest (which clearly depends on both the author and the day of the week), is fascinating to me. Again, I don't imagine he literally does that when designing, but if that's what he shares when walking through his process slowly enough for us to follow, that's still a much more deliberate procedure than the one I follow.

It's subtly reassuring that his example goes down a dead-end path. That's likely deliberate, to help demonstrate how frequent those are in this line of work, but given the designing-as-I-write-this-article vibe he gives, I'm inclined to think that really is the first place Mark's mind went to, and he saw value in keeping that in there for us to see. He's such a good teacher.

I'd never have guessed when I started that
my {UB} common would be a creature, but
I was pleased to find this it-looks-like-a-
creature-but-it's-really-a-spell illusion idea.
How do I attack a blank slate design like this? I mentioned a bunch of top-down approaches above. Another great source of inspiration for me is potential solutions to little frustrations I have during actual games. "Isn't it disappointing when you bounce their best creature when it attacks and they just slap it back down in their second main phase?" Reflector Mage. Or looking at perennial problems (like blue-black's lack of a creature keyword) and trying to solve them by applying the color pie through different lenses. Sometimes it works to just stare at an existing card and ask "what's the smallest change I could make to this card to completely change how it plays?"

There are, of course, many, many ways to approach card design and game design (and I promise you no one at Wizards uses only one).


  1. Fascinating breakdown. At some point I'm very interested in attempting the challenge myself. I'm surprised that you didn't include a step of researching existing cards. Especially once you had your designs, I would think some Scryfall searches would have helped with costing or just to make sure none of the designs were too similar to existing cards.

    1. Also, I think it's interesting that your original design for third degree had three explicit "modes". By moving it to the "always on" model it kinda loses that.

      I'm also interested in how you decided your rarity distribution. Did you start with the required 8 then fill in the last two slots? The rarity questions are always hard for me on the exam.

    2. Oh, I live on Gatherer and Scryfall. Apparently so much so I didn't even think to mention it.

      Yeah. The Third Degree I submitted is half the text, but the one I didn't sells its identity, both in terms of origin and the name, so much better. Elegance isn't always correct.

      I knew my 'walkers would be my mythics. From there, I just keep an eye out toward having at least 2 of the other rarities. That my extras fell between rare and uncommon was neither surprising nor planned.

  2. Thanks for sharing. One thing I was interested in, and perhaps you shouldn't comment on it yet, was how there seemed to be fewer out-there ideas in the top 8 compared to, say, GDS1. Not to say there wasn't a ton of innovation, but it seemed like logical next steps were valued higher than "look at this brand new concept", which I found interesting.

    Then again, perhaps I over value seeing those crazy designs. Maybe I'm alone in this thinking?

    Thanks for sharing!

    1. I imagine it was scarier than ever to submit 'out there' ideas for the GDS3 because you were given no room at all to explain your ideas. If you throw out an 'out there' idea and it's interpreted in a slightly wrong way it could be disastrous.

      It's interesting that there was no room for explanations on the submissions given how much MaRo loves talking about pitching ideas, but I guess it's just that many more words to read.

    2. I would also posit that there have been a lot of changes since GDS1, New World Order being fully implemented not least among them. I think that Wizards’/MaRo’s take on good design has changed over the intervening time period, as you’d expect it to.

    3. I will say that in general, the bar for out-there design rises with every release and with every GDS. It's much higher now than it was for GDS1.

      I will also say that I was so focused on making the best and most fun cards possible, and demonstrating my understanding of the color pie, card types, and rarities, that I failed to give a lot of attention to pushing boundaries in the design test.

    4. I think that one of the things that also happens over time is the more that Magic innovates and iterates, the more we all learn what does and does not actually work, and the more we realize when something "out there" doesn't actually cause the interesting game play we think it will.

  3. Oh, for reference, I designed over 100 cards for this test. Probably close to 150, not counting versions of the same concept.

    1. Wow! According to my google doc, my number was about 22.

    2. About 35-40 for me, but I often overwrite cards repeatedly as I go as part of my process (sometimes this backfires!) so I probably was closer to 60-70 if including slightly different designs.

  4. This is awesome! Thinking about the color pie is a very cool way to generate card ideas, and it looks like all the effort you put into it paid off. It's ironic that a lot of the results from this approach ended up looking like bottom-up mechanical combinations.

    By the way, Third Degree (the changed version) is my favorite design from your submission.

    1. It is ironic.

      Thanks. Glad to know someone likes Third Degree.