Contextual Evaluation of Limited Formats (CELF)

For many, the best (and most fun) way to test a limited format is to get together with friends and jam a bunch of drafts or sealed games. I’ll never discourage this if you have the opportunity, but it does ask a lot of you: a lot of free time, seven friends who also have the time and desire, the ability to coordinate a convenient time and place for everyone, and access to product from the new set.

Not everyone has the time or resources of a Pro Tour testing team, which gets to play a near-infinite number of limited games and discuss the format with the best players in the world. Many of us only get to play a draft or two a week, so how can we learn about the format as quickly as possible? Even pro teams need product before they can start testing.

As a player, I understand the appeal of the brute force approach since playing games of Magic is a lot of fun. However, I also really enjoy data analysis and I think there is a lot we can learn about a limited format before we even sit down with our first draft set in front of us.

We get two weeks between a set being fully spoiled and the official release date. We shouldn’t waste that time waiting for product to hit the streets when the set itself is rich with information if we’re willing to spend some time analyzing the limited format on paper. When we’re done we’ll be much more prepared heading into that first draft or sealed event.

Current Methods

When a new set is spoiled, how many set review articles simply list every card in the set, each with a letter or number grade? The answer is most, if not all, of them.

Constructed players will recognize that the metagame plays a large role in defining which individual cards or archetypes are playable in a format. In constructed it is common for a seemingly powerful mythic rare to see very little play because it just doesn’t line up well with the format.

The key here is context. A card’s value in a format is made up of individual power level AND contextual value. When set reviews focus on evaluating individual cards they risk missing a huge portion of a card’s value by ignoring context. [Note: The relative importance of power level vs. context varies from format to format and that will be part of our evaluation.]

My guess is that set reviews are approached this way for simplicity. When a set is spoiled there is a lot of new information to digest. I understand the instinct to look at each new card and ask ‘How good is it?’ I believe this is the Level 0 of set reviews, and as a community we can do better.

We need to start evaluating limited formats instead of limited cards. This may sound like a daunting task but in limited we have the advantage of a smaller card pool. In fact, I propose that many cards in the set can be ignored. A complete set review may actually take more effort because it reviews every single card which is a distraction from the big picture.

So how should we start thinking about new limited formats?

Respect Your CELF (Contextual Evaluation of Limited Formats)

At its core, CELF is a top-down analysis of limited formats rather than the bottom-up approach of reviewing individual cards. I’m sure that most set reviewers tweak card grades based on some understanding of what the set is about, but this is usually an afterthought.

We are going to turn this on its head and spend most of our time discussing the set at a top level. Before we even consider the power level of a specific card we will attempt to establish an expectation of what the baseline limited metagame will look like. This will be a moving target but there are quite a few clues that can give us a lot of insight.

Once we have some idea which direction the format is pushing us in we will be better equipped to evaluate each individual card within that context.

I should also note that most of our analysis will focus on commons and uncommons only. Limited is generally defined by these rarities since these are the cards that we will see over and over again during a draft or in game play. There are some formats which are so bomb driven that rares and mythic rares can’t be ignored but we’ll only touch on this towards the end.

Limited Magic is complex and there are many factors to consider when approaching a new format. Some of these factors are dependent on one another so they can’t be analyzed in isolation. The following is something of a suggested order of operations for evaluating limited formats of new sets. Let’s get started.

1) Mechanics/Themes

We’re going to break the fourth wall right away here. Put yourself in the role of a game designer or developer at Wizards of the Coast. They spend a lot of time carefully curating a new set and testing the limited format. They’ve chosen a theme and a handful of mechanics to bring a certain feel and balance to the game play of the format. We should spend a little time trying to figure out what those design choices are encouraging us to do with our decks and how that will affect the limited environment.

Look at each keyword in the set and ask what the keyword wants us to do. Is the mechanic very linear? Think of Heroic from Theros. The more heroes you had in your deck the more you wanted ways to target them. The more spells you had that target your heroes the more heroes you wanted. The strategy snowballs on itself until you realize you really don’t want to play any cards that aren’t directly contributing to your Heroic synergies.

Does the mechanic indicate something important about the pace of the format? Take the Renown mechanic from Magic Origins. Letting your opponent trigger Renown puts you pretty far behind in tempo. This might make having 2-drops to block with even more important than in a format without Renown. The same goes for something like Bloodthirst which gave a bonus for having damaged the opponent this turn.

Besides the keyword mechanics, are there other themes at play in the set? In Battle for Zendikar, the Eldrazi and colorless spells were pushed themes. We should start by assuming that these primary themes are going to be important to the draft format. It is unlikely that the set designers/developers would give us a bunch of sweet Eldrazi only to produce a format that is too fast for us to live long enough to cast them.

Another common example is a set with a tribal theme where creature types matter. A creature that is usually a mediocre or 23rd card might become a high pick when it turns on a powerful strategy. Kalastria Healer ended up being a high pick in Battle for Zendikar because it was an important piece of a puzzle that tied together all the synergistic pieces of the BW Ally Lifegain deck.

Themes within a set are some of the hardest to evaluate because you have to be aware of red herrings. It’s impossible to know without playing any games if certain themes or sub-themes in a set will be viable strategies. What we can do is look at how many playable cards in the set support the theme and use this to inform our estimate of its viability. Is the theme supported as a standalone strategy or only as incidental value within a ‘normal’ deck? Do the cards have good enough stats to be playable on their own but also support the theme? Or are the support cards completely unplayable if the theme of the deck doesn’t come together? Is there a big enough payoff for the theme to be worth taking a risk on it? Are the payoffs in the common or uncommon slot (or maybe the deck really only works with rares)?

2) Synergy vs. Raw Power

A lot of our analysis of the set mechanics is used to inform this first important hypothesis about the format. Is the strength of a deck going to be defined by its synergy or by the power level of its individual cards? This is a sliding scale. Some formats are heavily biased towards the synergy end of the spectrum (Modern Masters 2015, Battle for Zendikar). Other formats are very much about individual cards (Magic Origins, Khans of Tarkir). Some formats are a mix with archetypes on each side.

The answer to this question can drastically affect our card evaluation when we start reviewing each card in the set. If the mechanics are pushing us in a very linear direction, we may have to discount cards that look otherwise playable but don’t contribute to the strategy. Bombs are still bombs and will make the cut no matter what. But cards at the margin will definitely see their value bumped up or down based on the role they play.

3) Mana Fixing

Mana is a huge factor in shaping the direction of a format. If mana fixing is plentiful we can make greedier picks, speculate on colors and splash more bombs. When mana fixing is in short supply we have to be more disciplined with our picks and expect to play a more focused 2-color strategy.

Make a list of the good mana fixing. How much fixing is there? What rarity is the fixing (do we get Evolving Wilds at common or are we stuck with an uncommon cycle of dual lands)? Is the fixing equally available to all colors (dual lands, artifact mana like Azorius Signet, land search like Pilgrim’s Eye or Evolving Wilds) or is it focused in green only? Is each color combination equally viable or does the fixing suggest particular color pairs, wedges or shards?

Based on the amount and distribution of mana fixing, does this look like a 2-color format, a 3-color format, a 4 or 5-color format?
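If you like keeping this inventory in a more structured form, here is a minimal sketch of how the questions above might be tallied. Every entry in the list is a made-up placeholder rather than a card from a real set, and the names `FIXING` and `summarize_fixing` are my own invention for illustration.

```python
from collections import Counter

# Hypothetical fixing inventory: (description, rarity, access).
# access is "any" when every color can use it, or a single color letter.
# All entries are placeholders; fill in the real cards from the spoiler.
FIXING = [
    ("Common dual land cycle", "common", "any"),
    ("Land that searches for a basic", "common", "any"),
    ("Green land-search spell", "common", "G"),
    ("Artifact mana", "uncommon", "any"),
]

def summarize_fixing(fixing):
    """Count fixers by rarity and by which colors have access to them."""
    by_rarity = Counter(rarity for _, rarity, _ in fixing)
    by_access = Counter(access for _, _, access in fixing)
    return by_rarity, by_access

by_rarity, by_access = summarize_fixing(FIXING)
# Share of the fixing that only green decks can use.
green_only_share = by_access["G"] / len(FIXING)
print(dict(by_rarity), dict(by_access), green_only_share)
```

The tally doesn’t answer the 2-color vs. 3-color question for us, but it puts the amount, rarity and color distribution of the fixing side by side so the judgment call is easier to make.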

The mana gives us an idea of how far we can stretch our decks and still be able to cast our spells. Next we’ll start to look at which spells we actually want to cast.

4) Removal

The importance of removal in limited is generally well understood. It makes sense then that the first individual cards we should look at are the removal spells.

First we should make a list of all the playable common/uncommon removal spells in each color. This is going to give us our first insight into the strength of each color in the format. Some colors naturally get better removal than others (red and black are usually very good). But if we see that green has zero or one playable removal spell we start to gain some insight. Green may need to be paired with another color that has very good removal to make up for green’s lack of it. When drafting green, you may have to take the one playable removal spell much higher than normal because it’s so hard to find. Conversely, if white has lots of good removal then we might start to think that white is a very strong color capable of supporting multiple drafters. Maybe white is a good support color because any other color would be happy to splash some of white’s top tier removal.
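As a rough sketch of that first list, we could count the playable removal in each color and flag the colors that come up short. The card entries are placeholders, and the cutoff of one spell for a removal-light color is just my reading of the ‘zero or one’ example above, not a hard rule.

```python
# Hypothetical list of playable common/uncommon removal: (card, color).
# Replace these placeholders with the real spells from the set.
REMOVAL = [
    ("Removal spell 1", "R"), ("Removal spell 2", "R"), ("Removal spell 3", "R"),
    ("Removal spell 4", "B"), ("Removal spell 5", "B"),
    ("Removal spell 6", "W"), ("Removal spell 7", "W"),
    ("Removal spell 8", "U"),
    ("Removal spell 9", "G"),
]

def removal_by_color(removal, colors="WUBRG"):
    """Count playable removal spells per color, including colors with none."""
    counts = {color: 0 for color in colors}
    for _, color in removal:
        counts[color] += 1
    return counts

def short_on_removal(counts, threshold=1):
    """Colors with `threshold` or fewer removal spells (assumed cutoff)."""
    return sorted(color for color, n in counts.items() if n <= threshold)

counts = removal_by_color(REMOVAL)
print(counts)
print(short_on_removal(counts))
```

Colors that show up in the second list are the ones where the lone removal spell probably needs to be picked higher, or where we should plan on pairing with a removal-rich color.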

Most removal printed these days is conditional; gone are the days of Doom Blade. Since the removal is conditional, we should look at the common (and uncommon) removal spells in the set to understand what the conditions are. We will be able to more accurately evaluate creatures when we are aware of their potential vulnerabilities.

Here are some typical conditions for removal:

Dealing X damage (ex. Lightning Strike)
Giving a creature -X/-X (ex. Throttle)
Restrictions based on Power or Toughness (ex. Smite the Monstrous)
Restrictions based on combat (attacking and/or blocking creatures) (ex. Divine Verdict)
Color identity (ex. Self-Inflicted Wound)
Converted mana cost (ex. Silkwrap)
Edict effects (ex. Devour Flesh)
Ability keywords (ex. Plummet)
Creature type (ex. Eyeblight Massacre)

Most removal spells fall into the first four categories which means that the most important property when evaluating creature vulnerability is power and/or toughness. This gives rise to the concept of the ‘magic number’. This is the toughness that a creature needs to have to dodge the most common removal spells. If the format has Lightning Strike and Bile Blight then you really want your best creatures to have a toughness of 4 or greater. If the format has Shock and Dead Weight then maybe your magic number is only 3. The creatures of the format will also help define the magic number but it will be mostly driven by the removal spells. Based on the removal available, consider what the magic number might be for this format and keep it in mind when performing card evaluation later on.
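The magic number lends itself to a quick calculation. The sketch below is one possible way to do it, assuming we only look at damage and -X/-X spells and treat both as killing any creature with toughness X or less. The function name and the `share` parameter are my own additions; `share` lets you ask for the toughness that dodges only most of the removal instead of all of it.

```python
# Each entry: (card, X), where the spell deals X damage or gives -X/-X.
# Either way it kills a creature with toughness X or less.
FORMAT_A = [("Lightning Strike", 3), ("Bile Blight", 3)]
FORMAT_B = [("Shock", 2), ("Dead Weight", 2)]

def magic_number(removal, share=1.0):
    """Smallest toughness that survives at least `share` of the listed spells."""
    sizes = [size for _, size in removal]
    if not sizes:
        return 1  # no size-based removal, so any toughness survives
    toughness = 1
    while sum(size < toughness for size in sizes) / len(sizes) < share:
        toughness += 1
    return toughness

print(magic_number(FORMAT_A))  # 4
print(magic_number(FORMAT_B))  # 3
```

Both results match the examples above. This only covers the removal side of the estimate, so the creatures in the format still need to be factored in by hand.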

Some other removal considerations:

Destroy vs. Exile (is it likely to matter in the context of this set?)
Instant vs. Sorcery speed effects
Bounce spells
Ping effects (if there are multiple common ways to incidentally deal 1 damage or give -1/-1 then 1 toughness creatures get a lot worse – ex. Magic 2015 had Crippling Blight, Festergloom, and Forge Devil at common)

Format Inventory

We are nearly ready to start evaluating individual cards, but before we do let’s review what we’ve accomplished. So far we have:

Studied the set’s key mechanics and what they suggest strategically
Looked at the set for other non-keyword themes or sub-themes
Considered whether decks will be driven by synergy or raw card power
Decided how many colors the average deck will be able to play
Estimated the ‘magic number’ for creature power/toughness

5) Card Evaluation

Now we are well armed to start evaluating each card in the set. Instead of grading each card in a list we are going to focus on the top commons and uncommons in each color. You’ll have to look at each card to accomplish this, but we will dismiss any filler cards or unplayables (cards typically graded C and below). This is just a first pass at card evaluation – we will refine these initial lists later.

Keep a list of every common or uncommon that you would consider an incentive to play that color. These are cards that you might see picks 3-6 in a draft and consider them a signal that the color is open (cards typically graded B- and higher). Remember to apply all that we’ve discovered from our format inventory so far.

Do not place any restrictions on the number of cards you list from each color. Many set reviews list the Top 5 commons/uncommons in each color. In the past I have separately listed the Top 5 commons and Top 3 uncommons for each color. Using a quota makes it difficult to compare the relative strength of cards across the color pie. In some sets I have struggled to fill out the top 5 common slot of a weak color and was forced to include mediocre cards just to complete the list. Using card quality as our only filter we are able to produce lists that more accurately reflect the balance of the colors.

Once each list is complete, let’s review them. Which color is the deepest with high quality commons/uncommons? Are the colors relatively balanced? Which colors look like primary colors (lots of depth, good combination of creatures/removal/tricks) vs. support colors (powerful removal or efficient creatures but not both)?
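Here is a small sketch of the quota-free approach: filter on grade alone and let the list lengths fall where they may. The grade scale and the cards are placeholders I made up for illustration, and the B- cutoff is the one suggested above.

```python
# Assumed grade scale, best to worst. Adjust to whatever scale you use.
GRADE_ORDER = ["A+", "A", "A-", "B+", "B", "B-", "C+", "C", "C-", "D", "F"]

# Hypothetical first-pass grades: (card, color, grade).
CARDS = [
    ("White card 1", "W", "A-"), ("White card 2", "W", "B"),
    ("White card 3", "W", "B-"), ("White card 4", "W", "C+"),
    ("Blue card 1", "U", "B+"), ("Blue card 2", "U", "C"),
    ("Green card 1", "G", "B"), ("Green card 2", "G", "B-"),
]

def incentives(cards, cutoff="B-"):
    """Group every card graded `cutoff` or better by color, with no quota."""
    limit = GRADE_ORDER.index(cutoff)
    by_color = {}
    for name, color, grade in cards:
        if GRADE_ORDER.index(grade) <= limit:
            by_color.setdefault(color, []).append(name)
    return by_color

top = incentives(CARDS)
depth = {color: len(names) for color, names in top.items()}
print(depth)
```

Because nothing forces each color to five entries, a shallow color shows up as a short list instead of being padded with mediocre cards.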

6) Speed of the Format

At this point we can start putting some of the pieces together to arrive at derived information – information that we wouldn’t have been able to observe directly but that we can infer from some of our previous conclusions.

One piece of derived information is the potential speed of the format. We have looked at the set mechanics and decided whether they are pushing us towards more aggressive or defensive strategies. We looked at the mana fixing. Three or four color decks usually require more time in the early game to set up and would suggest a slower format. We’ve also looked at the key pieces of removal. Are there a lot of cheap interactive spells or is the removal expensive? Cheap removal will keep aggressive strategies in check.

Finally we’ve made a list of the top commons and uncommons. What’s the converted mana cost of the creatures in our list? Perform the vanilla test (look at the power and toughness, ignoring rules text) on the creatures that have made our lists. Does it favor power (aggro) or toughness (defense)? Is this list filled with cheap aggressively costed creatures or larger late game threats?

I’ll adopt an interesting shortcut suggested by Luis Scott-Vargas. Based on the information available, decide whether you think it will be more important to have 2-drops or 7-drops in your decks. A faster format is going to require you to play 2-drops to keep up on board, while a slower format is going to encourage you to have more late game power to go over the top of your opponent. Having at least an idea of which environment to expect can give you a big edge at the start of a format.
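The vanilla test on our top creatures can be summarized with a few numbers. This is only a sketch with placeholder creatures; the function name is hypothetical, and how you weigh the results against the mechanics, fixing and removal is still a judgment call.

```python
from statistics import mean

# Hypothetical creatures from our top lists: (name, mana cost, power, toughness).
CREATURES = [
    ("Creature A", 2, 2, 1),
    ("Creature B", 2, 3, 1),
    ("Creature C", 3, 3, 2),
    ("Creature D", 4, 2, 5),
    ("Creature E", 5, 4, 4),
]

def vanilla_summary(creatures):
    """Average cost, plus how many bodies lean toward power vs. toughness."""
    avg_cost = mean(cost for _, cost, _, _ in creatures)
    power_heavy = sum(p > t for _, _, p, t in creatures)
    toughness_heavy = sum(t > p for _, _, p, t in creatures)
    return avg_cost, power_heavy, toughness_heavy

avg_cost, power_heavy, toughness_heavy = vanilla_summary(CREATURES)
print(avg_cost, power_heavy, toughness_heavy)
```

A low average cost with mostly power-heavy bodies points toward the 2-drop end of the shortcut, while a higher curve full of toughness-heavy bodies points toward the 7-drop end.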

Refine and Iterate

It’s possible to stop here with our first pass of card evaluation. If you have the time and really want to dive deeper into the format I recommend continuing on to the next phase of refining our evaluation by defining format archetypes. This is where I believe a lot of your edge can be gained.

7) Color Pairs / Archetypes

When we play limited we don’t play cards, we play decks. Every good deck should have a plan. We can learn a lot by trying to predict which archetypes will be common in the limited format and what their gameplans are.

In most formats our baseline will be to look at each color pair for a total of 10 different archetypes. In unusual formats like Return to Ravnica or Khans of Tarkir we may need to focus only on certain Guilds or Wedges/Shards. If our review of the mana fixing has identified the potential for a 5-color deck we may want to include that as an additional archetype.

For each archetype we want to identify what the most common build and game plan will look like. Of course not every RB deck is going to look the same but we can get an idea for what the typical build is trying to do.

What are the overlapping set mechanics in each archetype? Are the mechanics synergistic or do they suggest conflicting strategies?

Combine the lists of top commons and uncommons for each color in the archetype. Do these cards work well together to form a cohesive game plan? This may give us insight on which color pairs are more likely to be viable in the format and which ones require more specific combinations of cards.
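Combining the lists is mechanical enough to sketch in a few lines. The top-card lists here are placeholders, and `archetype_pools` is a name I made up; the point is simply that five colors give us ten two-color pools to review.

```python
from itertools import combinations

COLORS = "WUBRG"

# Hypothetical top commons/uncommons per color from the first pass.
TOP_CARDS = {
    "W": ["White card 1", "White card 2"],
    "U": ["Blue card 1"],
    "B": ["Black card 1", "Black card 2"],
    "R": ["Red card 1", "Red card 2", "Red card 3"],
    "G": ["Green card 1"],
}

def archetype_pools(top_cards):
    """Merge the top-card lists for every two-color pair."""
    return {
        a + b: top_cards[a] + top_cards[b]
        for a, b in combinations(COLORS, 2)
    }

pools = archetype_pools(TOP_CARDS)
print(len(pools))
print(pools["BR"])
```

Reading each merged pool as if it were the core of a deck makes it easier to spot pairs whose best cards pull in the same direction and pairs that look like they want different things.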

Describe the strategy for each archetype. Note whether the mechanics and top commons/uncommons indicate that the archetype is internally aligned toward a common strategic goal or if the colors appear to have an identity crisis with conflicting strategies.

I have ideas on how to evaluate limited archetypes in more detail but that will have to wait for another article.

8) Card Reevaluation

Now that we have taken a closer look at each expected archetype we should reevaluate our top commons/uncommons list through this lens.

What is each color’s identity? Are the top cards in the color good in each archetype or do they fit in specific archetypes? Remember blue in Battle for Zendikar limited. Clutch of Currents was a great card for any blue deck while Mist Intruder and Cloud Manta fit into two very different blue strategies without a lot of overlap. Cards that are in demand from multiple archetypes increase in value and our evaluation of them should reflect that.
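One simple way to track demand is to count how many archetypes want each card. The sketch below uses the blue cards from the example above; the labels ‘blue strategy A’ and ‘blue strategy B’ are placeholders standing in for the two blue strategies, not official archetype names.

```python
from collections import Counter

# Which cards each archetype wants. Archetype labels are placeholders.
ARCHETYPE_WANTS = {
    "blue strategy A": ["Clutch of Currents", "Mist Intruder"],
    "blue strategy B": ["Clutch of Currents", "Cloud Manta"],
}

def demand(archetype_wants):
    """Count how many archetypes want each card."""
    return Counter(
        card for cards in archetype_wants.values() for card in cards
    )

counts = demand(ARCHETYPE_WANTS)
print(counts.most_common())
```

Cards with the highest counts are the ones multiple drafters will be fighting over, so they deserve a bump in our rankings.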

Did we rank a common/uncommon highly but see that it might not have a home now that we’ve laid out the archetypes? Similarly we should go back to the pool that we dismissed as ‘filler’ cards. Did we overlook anything that now appears like it fills an important role for the popular archetypes?

Our first pass at the top commons/uncommons list was based mostly on raw card power with some preliminary assumptions about the format. This second pass should vet those initial evaluations in the context of the expected archetypes.

Advanced Considerations

We’ve gone through a pretty thorough evaluation of the format without even touching the cards. Yet there is always more that can be done. I won’t explore these topics now, but will list a couple of ideas for further consideration.

Bombs. We’ve only focused on commons and uncommons. Some formats are more defined by rares than others (Prince formats vs. Pauper formats). Looking at the rares and deciding how many of them are ‘bombs’ (definition left to the reader) can give us an idea of what to expect.

Build Around Uncommons. This is the most fun part of a set for many players. We’ve focused on the more expected archetypes in our analysis. But there are often niche strategies that are discovered as the draft format unfolds. We could probably do some work up front to try to identify these earlier but that might also ruin part of the fun!

Conclusion

First, I’d like to give a shout out to the Limited Resources podcast. I’ve been working on this approach to set evaluation in my head for a while. Limited Resources Episode 312 gave me the motivation to put ‘pen to paper’ so to speak. The topic was ‘How To Master Any Format’ and I recommend giving it a listen. The episode made me realize that what I was attempting to do was create a framework that would allow us as players to accelerate up the learning curve from new format to format mastery as quickly as possible.

None of the concepts I’ve discussed are by any means revolutionary. But I think we stand to benefit from reallocating how much time we spend examining each aspect of a new format. Most people instinctively spend lots of time on card evaluation, but that represents only a small percentage of what can be learned from studying a new set. By expanding our scope and evaluating a set from the top down we can get a much deeper understanding of the format.

If you are a player who enjoys the discovery process and just wants to have fun learning a new limited format I recommend ignoring this entire article (I probably should have given you that warning up front!). However, if you are a player who wants to master a format in the shortest amount of time I recommend trying this method out. It will be a lot of work and at some point I’m sure you’ll find yourself thinking, ‘I’d rather just be playing Magic’. Like with anything, preparation and doing your homework can give you a huge edge but it does take some work.

The next time you are presented with a brand new limited format, put it through a CELF-test!

And if you are in need of any CELF help, don’t hesitate to ask! If you have comments or questions, feel free to leave a note in the comments below. I would love to hear any and all feedback.

Follow me on Twitter @thenatewalker.
Follow my Twitch channel: www.twitch.tv/n_walker. I stream a lot of limited, some standard/modern.
Or reach me in the comments below!

New Compass Games

Navigating the World of Competitive Magic: the Gathering
