Categories
Uncategorized

Pathfinder Monster Database: AC vs Touch AC

Ok, one more visit to this pathfinder monster database before I’m on to a new dataset. This time, I wanted to take a look at the relationship between Armor Class (i.e., how hard a monster is to hit) and Challenge Rating (i.e., how tough the monster is, according to the game developers). There should be a pretty strong linear relationship between AC and CR.

However, the thing I’m really interested in is the relationship between Touch AC and CR. Because pathfinder is needlessly complicated, monsters have a separate AC for just touching them (or getting hit with certain kinds of attacks like rays or bullets). It’s been my experience that touch AC is always low, and doesn’t seem to change at all as the game progresses. Pragmatically speaking, this means that at a certain point touch attacks basically always hit. Let’s see if that’s the case. I think some run-of-the mill scatterplots will do here, but might as well make them fancy. Three packages needed: (a) ggplot2 for the basic plotting; (b) ggExtra to add density distributions to the plot; (c) cowplot, to stitch the two plots together at the end and (d) ggthemes, in case I want some custom color palettes.

library(ggplot2)
library(ggExtra)
library(cowplot)
library(ggthemes)

Then, I can make the plots. Made the dots transparent with “alpha = 0.3” to deal with overplotting. Personally, I find it looks cleaner than geom_jitter(); it’s always low-key bothered me that the randomness added by geom_jitter() actually changes the positions of the raw data a little. Feels a little like lying with data to me, sometimes. With the ggmarginal command, I added in the density plots so we can see the overall distributions. Then at the end, I used cowplot to stitch them together side-by-side. Knowing this, I made sure both graphs had the same y-axis lengths to facilitate easier comparison, a key component of the small multiples approach.

p1 <- ggplot(mydata, aes(x = CR, y = AC)) +
  geom_point(size = 2, alpha = 0.3, color = "steelblue") +
  theme_classic() +
  geom_smooth(col = "royalblue", se = FALSE) +
  scale_y_continuous(breaks = seq(0, 70, 5)) +
  scale_x_continuous(breaks = seq(0, 40, 5)) 

p1m <- ggMarginal(p1, type = "density", fill = "steelblue")

p2 <- ggplot(mydata, aes(x = CR, y = AC_Touch)) +
  geom_point(size = 2, alpha = 0.3, color = "steelblue") +
  theme_classic() +
  geom_smooth(col = "royalblue", se = FALSE) +
  scale_y_continuous(limits = c(0, 70), breaks = seq(0, 70, 5)) +
  scale_x_continuous(breaks = seq(0, 40, 5)) +
  labs(y = "Touch AC")

p2m <- ggMarginal(p2, type = "density", fill = "steelblue")

totalplot2 <- cowplot::plot_grid(p1m, p2m, labels = "auto", ncol = 2)
  
cowplot::save_plot("scatter.pathfinder.png", 
totalplot2, 
base_aspect_ratio = 2)
Scatterplots of AC and CR

Ok, well there’s a clear (very strong) linear relationship between CR and AC (r = .91). That’s probably not surprising, given that the game designers probably wanted that to happen by design, to keep up with increases to attack bonuses that players get each level. However, touch AC is very weakly related to touch AC (r = .12). In fact, it’s really only mythic level monsters beyond CR 20 that see any increase in Touch AC at all! So for the vast majority of play, there’s no relationship between CR and Touch AC. I got to wondering then, what is the variance in touch AC actually attributed to? A good first guess would be a creature’s size:

ggplot(mydata, aes(x = CR, y = AC_Touch, col = Size)) +
  geom_point(size = 2, alpha = 0.3) +
  theme_classic() +
  geom_smooth(method = "lm", se = FALSE) +
  scale_y_continuous(limits = c(0, 70), breaks = seq(0, 70, 5)) +
  scale_x_continuous(breaks = seq(0, 40, 5)) +
  labs(y = "Touch AC") +
  scale_color_ptol()

Yup, looks like the majority of variation is due to that. So in summary, touch AC doesn’t really change as monster CR increases, and most of the variation is due to size. It’s a weird mechanic to add to the game, and easily abusable by additions like the gunslinger that always hit touch AC. Far as I can tell though, hasn’t been removed from the second version under playtest. Oh well, add it to the list of weird broken mechanics, I suppose.

Scatterplots of touch AC and CR, split by size

Syntax and data available on the blog’s OSF page.

Categories
Uncategorized

Pathfinder Monster Database Plots

I want to incorporate more R into my classes at Dalhousie. Problem is, I am a pretty bad R coder– I spent much of the past decade or so with SPSS and Mplus. But there’s lots of evidence that R is the future of science. I find that the best way to learn is project-based, so I’m going to start blogging on R code. I’m going to focus on topics that are inherently interesting to me, with a focus on data visualization. If I keep it fun, I’m more likely to stick with it.

So, to start I’m going to analyze data from the Pathfinder Monster Database, a comprehensive database of all 2812 monsters from Paizo’s tabletop roleplaying game, Pathfinder. I’ve played Pathfinder for years now and there are a lot of crunchy numbers in there. Probably why I like it so much! I’m going to look at the relationship between creature type two outcome variables (a) Armor Class (i.e., how hard the creature is to hit) and (b) Challenge rating (i.e., how tough the monster is overall). The goal is to see what creature type is “toughest” overall.

The data needed a little bit of cleaning (e.g., changing “Dragon” to “dragon” for some entries), but it was in good shape overall. I decided to try out ridge plots as the way to visualize the data, since I’ve never used them before. First thing to do is load the necessary libraries into R.


library(ggplot2)
library(ggridges)
library(dplyr)
library(ggExtra)

Next, since I want the two plots to be in order from highest to lowest values of AC/CR, I need to use the next bit of code which requires dplyr. This creates two new variables I can use to re-order the y-axis with later. I also created a color palette of 13 random colors, since there are 13 creature types and I didn’t like the default ggplot2 colors here.

<h1>Order variables by AC</h1>
avg <- mydata %>%
group_by(Type) %&gt;%
summarise(mean = mean(AC))

ACorder &lt;- avg$Type[order(avg$mean)]
<h1>Order variables by CR</h1>
avg2 <- mydata %>%
group_by(Type) %&gt;%
summarise(mean2 = mean(CR))

CRorder &lt;- avg2$Type[order(avg2$mean2)]
<h1>Create color palette</h1>
pal &lt;- rainbow(13)

Ok, now I can create the two plots using the geom_density_ridges() function. This needs the ggridges package to function, as base ggplot2 can’t do this.


ggplot(mydata, aes(x = CR, y = Type, fill = Type)) +
geom_density_ridges() +
theme_ridges() +
theme(legend.position = "none") +
scale_y_discrete(limits = CRorder) +
scale_x_continuous(limits = c(0,30), breaks = seq(0, 30, 5)) +
scale_fill_manual(values = pal) +
labs (y = "", x = "Challenge Rating")

ggplot(mydata, aes(x = AC, y = Type, fill = Type)) +
geom_density_ridges() +
theme_ridges() +
theme(legend.position = "none") +
scale_y_discrete(limits = CRorder) +
scale_x_continuous(limits = c(0,50), breaks = seq(0, 50, 5)) +
scale_fill_manual(values = pal) +
labs (y = "", x = "Armor Class")

So, the toughest monster types in Pathfinder are dragons, followed by outsiders. The weakest monster types are vermin and animals. The ranking of toughness by CR and AC are exactly the same, as it turns out. However, the distribution for oozes are way different than everything else: These creature types tend to be really easy to hit, but are still tough because of lots of other abilities and immunities. The positive skew in the distributions for CR are interesting, since it shows that there are generally a LOT more monsters under CR 10, which makes sense given that very few games get to such high levels.

I like ridge plots. They work a lot better than overlapping histograms when there are lots of groups and lots of cases. There was a bit of difficulty with numbers less than 1 for the CR plot (e.g., some CRs are 1/3). Without the “scale_x_continuous(limits = c(0,50)” function, the graph displayed values less than 0, which is outside the range of actual data. I believe that the graph is now bunching all the CRs that are less than 1 (~217 data points) as “0” on the graph above. Overall, a fun first attempt, and neat data to work with.

Datafile and syntax available on the blog’s OSF page.