Lollipop Plot: Meta-analysis of Gender Differences in Sexual Behaviour

When I went to the APS conference in San Francisco last year, I got to hear Janet Hyde talk about the gender similarities hypothesis. Broadly, she argues that most gender differences (i.e., men vs. women) in psychological variables tend to be small in size. She used meta-analysis — statistically summarizing the results of lots of published research — as a method of testing her hypothesis. I thought it was fascinating stuff and a great talk, so I wanted to incorporate some of her research into intro psychology. Since I’ve been sprucing up the intro psych section on sex and gender, I thought it would be interesting to include Petersen and Hyde’s (2010) meta-analysis on gender differences in sexual attitudes and behaviour. Broadly, it found that there are a few large sex differences (e.g., men watch more pornography and masturbate more than women), but consistent with her general argument, most differences were very small in size (i.e., smaller than d = .20) and men and women are more similar than different. However, the paper itself doesn’t present the results in a very PowerPoint friendly way, and certainly not intelligible to freshmen students:

Portion of Table 1 from Petersen & Hyde (2010)

So, with that in mind I decided to make a plot up to better visualize the data. The degree of uncertainty displayed by the 95% CI is important … but most in the class don’t actually have any statistics background. So I decided to teach them what “cohen’s d” was first, then create a graph that showed off those numbers. I also taught them that values less than .20 are essentially negligible, so I wanted to highlight that on the plot. I decided on a lollipop plot, which is a bit more appealing than a bar plot.


#Order the variables in Rank Order

mydata$name <- factor(mydata$name, levels = mydata$name[order(mydata$d)])

#Create the Plot

  ggplot(mydata, aes(x=name, y=d, label=d)) + 
    geom_point(stat='identity', fill="black", size=6)  +
    geom_segment(aes(y = 0, 
                     x = name, 
                     yend = d, 
                     xend = name), 
                 color = "black") +
    geom_text(color="white", size=2) +
    labs(title="Gender Differences in Sexual Behaviour", 
         subtitle = "Negative numbers = more common in women; positive numbers = more common in men",
         x = "", y = "cohen's d (effect size)") + 
    ylim(-1, 1) +
    theme_bw() +
    geom_hline(yintercept = 0, linetype = "solid", color = "grey") +
    geom_hline(yintercept = -.20, linetype = "dashed", color = "grey") +
    geom_hline(yintercept = .20, linetype = "dashed", color = "grey") +
Lollipop plot showing meta-analysis results

Overall, I think the plot turned out pretty clear and was a lot more appealing than just putting a table of numbers for a more dynamic presentation. The ordering from largest to smallest helps a lot to guide the eye, and the dotted grey lines show the range of “trivial” effect sizes. So there are some substantial gender differences, but the effect sizes are actually a LOT smaller than you’d expect! For context, a cohen’s d of ~.60 would look like this (using this site to generate this quickly):

So phrased another way, if you selected a random man and a random woman, there is ~67% chance that the man will watch more pornography than the woman (the “common language effect size“). That’s not insubstantial, but also probably a lot smaller than a lot of people would think! Many of the effect sizes are vanishingly small. For a Cohen’s d of 0.20, there’s only a 55% chance that a randomly selected man will be more likely to engage in the behavior than a randomly selected woman. For that matter, men and women are actually more similar than they are different on all of these variables, if you consider the percent overlap among the distributions. Which broadly, is what Dr. Hyde’s point has been all along.

Code and data available on my OSF page.


Open Sex Role Inventory

Did you know about the Open Source Psychometrics project? It has more than two dozen personality tests that are all free to use with a creative commons license and posts large, open access datasets for their validation? Wow. What’s even stranger is that this site has no university affiliation, so far as I can tell and I can’t find any info on the site’s administrator. They’ve collected data from ~300,000 people here, these datasets are massive and an incredible resource, but are also virtually untouched by the academy. It’s downright criminal.

Case in point, the Open Sex Roles Inventory (OSRI). I started looking up data on Bem’s Sex Role inventory to help enrich the gender section of my intro psychology class, when I came across this measure. The open dataset had ~20,000 data points in it and I thought it’d be fun to play around with. I sent an email to the site administrator asking about how to score it and validation info. No email response, but the website was updated four days later with a TON of info and modifications to the scale to make a 20-item, more well-validated short form from the original 44 items. Seriously, who is this wonderful person? These data could easily be published in any personality journal they wanted, but they’re just free for all to use with no glory or publishing prestige. Searching online, I found a single publication using the OSRI, and it was a Ph.D. dissertation that used the data to test out a new method of measurement invariance, for which this particular data was mostly incidental (finding that the 44-item version of the OSRI didn’t have measurement invariance across heterosexual vs. non-heterosexual groups.

Anyway, back to the point. I wanted to have a visualization for my intro psychology class to clarify the distinction between sex (i.e., biological/physiological) and gender (i.e., socio-cultural). Specifically, I wanted to talk about Sandra Bem’s theory published way back in 1974 that gender roles are spectrum with at least two dimensions (masculinity vs. femininity). I’m aware of challenges to this distinction (e.g., that sex/gender are inseparable, and that sex is also a spectrum), but broadly won’t get into that here. In any event, the dataset I have only has sex measured as “Male, Female, Other,” so a categorical model will need to do for now. My goal was to show that there is a sizable minority of people beyond cis-gendered men and women. The dataset is pretty large (~16,000 after missing data was removed), so I decided to use geom_hex() with the viridis color scheme. This crushes my computer, and takes a long time to run but in the end, after a lot of obsessing over details, I am happy with the result. Special thanks to this post on Stack Overflow that helped me solve an graphical glitch that stumped me for a long time.

ggplot(mydata2, aes(x = FemSF, y = MascSF)) +
  geom_hex(aes(colour = ..count..), bins = 22) + 
  theme_classic() +
  facet_grid(. ~ gender) +
  scale_fill_viridis() +
  geom_hline(yintercept = mean(mydata$MascSF)) +
  geom_vline(xintercept = mean(mydata$FemSF)) +
  labs(y = "Masculinity", x = "Femininity" ) +
  geom_text(x=4.25, y=4.25, label="Androgynous", col = "darkslategray2") + 
  geom_text(x=2.0, y=1.75, label="Undifferentiated", col = "darkslategray2") +
  geom_text(x=2.0, y=4.25, label="Masculine", col = "darkslategray2") +
  geom_text(x=4.25, y=1.75, label="Feminine", col = "darkslategray2")
Hexagonal Scatterplot for the OSRI

I like the way this graph came out. It shows that indeed, you can predict self-identified male or female status with the items on this questionnaire. But, it also shows that there’s a lot of variation, with folks that are androgynous (i.e., high in both) or undifferentiated (i.e., low in both) in for males and females. The “Other” gender category looks more similar to the female category than the male category, but a little more in the center overall. When I did some t-tests comparing men vs. women, femininity had a cohen’s d of 1.14 and masculinity had a cohen’s d of 0.82. Those are huge effect sizes! Much larger than the majority of sex differences in the literature.

Out of general interest, I decided to do a few more quick plots to break separate out the heterosexual vs. the non-heterosexual participants (I binned homosexual, bixsexual, asexual and other into one category for comparison in this analysis). Nothing fancy this time around, just some simple density plots:

ggplot(mydata2, aes(x = Fem, fill = gender)) +
  geom_density(alpha = .4) +
  facet_grid(. ~ orientation.d) +
  scale_fill_manual(values = pal) +
  labs(x = "Femininity", fill ="")

ggplot(mydata2, aes(x = MascSF, fill = gender)) +
  geom_density(alpha = .4) +
  facet_grid(. ~ orientation.d) +
  scale_fill_manual(values = pal) +
  labs(x ="Masculinity", fill ="")
Density plots for femininity and masculinity, split by gender and sexual orientation

One thing that is pretty striking about these plots is that the femininity and masculinity scales predict identifying as a heterosexual man or woman REALLY well, with massive effect sizes. Folks identifying as “other” fell somewhere in between, though a little closer to the female profile. However, for the non-heterosexual folks it’s a different story. The sex difference is still there for femininity, but less pronounced, mostly accounted for by men being more feminine. For the masculinity scale though, there’s basically no discrimination at all! Men, women and non-binary folks can’t really be distinguished based on their masculinity scores. I suppose that’s consistent with the measurement invariance issue the this Ph.D. thesis picked up on. Not sure what it means in a theoretical sense, but it’s definitely interesting.

I could stop here, but I wanted to also do a quick factor analysis on the short form data, just to see if the factor structure looked good. Nothing fancy or in-depth, just an exploratory factor analysis extracting two factors, with principal axis factoring, and an oblique rotation. I also fussed around until I could make a nice plot of the factor loadings, using the ggrepel package.

#Factor Analysis

mydata4 <- select(mydata2, 
       Q4, Q26, Q28, Q30, Q34, Q36, Q38, Q40, Q42, Q44,
Q1, Q3, Q5, Q7, Q9, Q17, Q29, Q35, Q39, Q43)

corMatsf <- cor(mydata4)

solutionsf <- fa(r = corMatsf, nfactors = 2, rotate = "oblimin", fm = "pa")

#Made a .csv file manually by copying into excel (can't figure out better solution)

print(solution$loadings, digits=2, cutoff=0)
loadings.sf <- read.csv("efa.loadingsSF2.csv")

#The plot

ggplot(loadings.sf, aes(x = PA1, y = PA2, label = item, col = FemMasc)) +
  geom_label_repel() +
  geom_point(size = 3) +
  scale_y_continuous(limits = c(-.80, .80), breaks = seq(-.80, .80, .10)) +
  scale_x_continuous(limits = c(-.80, .80), breaks = seq(-.80, .80, .10)) +
  geom_hline(yintercept = 0) +
  geom_vline(xintercept = 0) +
  scale_color_manual(values = c("hotpink", "royalblue")) +
  labs(x = "Factor 1 (Masculinity)", y = "Factor 2 (Femininity)", col = "") +

Plot of standardized factor loadings for OSRI

That looks pretty excellent, items are clearly distinguished by their factors and the factors are mostly uncorrelated. So the measurement seems solid. No wonder, given the amount of data that went into validating it! Overall, very cool dataset and measure. Might even use this myself some day in a published study…

If you want to take the test yourself, take a look here:

Data and syntax available on my OSF site. Data were used as part of a CC BY-NC-SA 4.0 license, with attribution the mysterious owner of