Computing chord statistics for Hooktheory Trends

Continuing the discussion from Minor Keys, Roman numerals:

@HertzDevil, thank you for pointing this out. I think this is an important enough subject to deserve its own thread.

The question is: what is the most useful way for us to store chord information for Hooktheory Trends?

As you have noted, Hooktheory currently stores data in the relative major mode. We chose this because we felt that the majority of progressions in popular music are either in the major mode or ambiguous enough that analyzing them in the major mode is satisfactory. For songs that are properly in a different mode, the relationships between chords are still preserved (e.g., vi → IV of the ionian mode and i → ♭VI of the aeolian mode are counted as the same trend).
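To make the relative-major idea concrete, here is a minimal sketch of how two progressions in different modes could collapse onto the same stored trend. It is not Hooktheory's actual code: the chord representation (semitones above the mode's tonic plus a quality string) and the names `MODE_OFFSET`, `to_relative_major`, and `progression_key` are assumptions made for illustration.

```python
# Semitones from the relative ionian tonic up to each mode's tonic.
MODE_OFFSET = {
    "ionian": 0, "dorian": 2, "phrygian": 4, "lydian": 5,
    "mixolydian": 7, "aeolian": 9, "locrian": 11,
}

def to_relative_major(mode, chord):
    """Re-express a chord relative to the relative ionian tonic.

    chord is (root, quality), where root is semitones above the mode's
    own tonic (hypothetical representation, not Hooktheory's schema).
    """
    root, quality = chord
    return ((root + MODE_OFFSET[mode]) % 12, quality)

def progression_key(mode, chords):
    """Mode-independent key used to count a progression as one trend."""
    return tuple(to_relative_major(mode, c) for c in chords)

# vi -> IV in ionian: roots 9 and 5 semitones above the ionian tonic.
ionian_key = progression_key("ionian", [(9, "min"), (5, "maj")])
# i -> bVI in aeolian: roots 0 and 8 semitones above the aeolian tonic.
aeolian_key = progression_key("aeolian", [(0, "min"), (8, "maj")])

same_trend = ionian_key == aeolian_key  # both map to ((9, "min"), (5, "maj"))
```

The design choice here is that the mode only shifts the roots; the qualities and the intervals between chords are untouched, which is why the relationships between chords survive.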

However, there are also disadvantages to this approach. Borrowed chords are hard to organize: we’ve done our best to group them together enharmonically, but we are still missing many cases. Furthermore, some people are interested in “major key” or “minor key” perspectives: for example, comparing the mixolydian progression I → ♭VII to the ionian progression I → ♭VII rather than to the ionian progression V → IV.

This method also miscategorizes songs whose TheoryTabs are written in a parallel mode using borrowed chords (although perhaps this is just a symptom of a different issue).

As a specific example, to find the relative frequency of the ♭VII, I assumed that an adequate representation would be a combination of: borrowed chords that are enharmonic to ♭VII, applied IV/IV chords, and IV chords coming from TheoryTabs in the mixolydian mode. I didn’t count V chords in the aeolian mode (or any “minor i” modes) since @trevordeclercq was referring to a “major key” usage of ♭VII. However, I also didn’t include lydian mode progressions that would be technically equivalent, simply because I forgot to consider this possibility. Ultimately this was a little cumbersome.

Off the top of my head, I see a few ways of proceeding:

  1. Keep chord data the same: Here the goal would be to fix TheoryTabs that are improperly analyzed in parallel modes (perhaps by adding a transpose-to-parallel-mode feature), and to do a better job of grouping enharmonic chords together. This is in effect an ionian mode-centric approach.

  2. Separate trends by modes: Here we could simply not compare chord progressions across modes. This would give better data on how chords are used in a specific mode, but the downside is that there would be far less data for every mode other than ionian (over 90% of TheoryTabs are in the ionian mode).

  3. Organize chord data by parallel major mode: Here, progressions like I → ♭VII would be counted the same in both the ionian and mixolydian modes. This would also in effect separate modes into two groups: major-I modes and minor-i modes. One clear advantage is that there would be no difference between a chord borrowed from a mode and the same chord native to that mode. I suppose this was in effect what I had to do to find statistics on ♭VII. Here, rather than storing the borrowed mode, we would instead store the label in the “popular” style notation, which is consistent across modes.
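As a rough illustration of the third option, the sketch below converts a chord stored relative to the relative ionian tonic into a “popular” style label measured from the song's own tonic, and groups modes by the quality of their tonic triad. All names here (`pop_label`, `tonic_group`, the lookup tables) and the storage format are hypothetical, and putting locrian in its own "dim" group is my own assumption, since the list above only mentions major-I and minor-i modes.

```python
# Semitones from the relative ionian tonic up to each mode's tonic.
MODE_OFFSET = {
    "ionian": 0, "dorian": 2, "phrygian": 4, "lydian": 5,
    "mixolydian": 7, "aeolian": 9, "locrian": 11,
}

# Quality of the tonic triad in each mode (locrian handling is an assumption).
TONIC_QUALITY = {
    "ionian": "maj", "lydian": "maj", "mixolydian": "maj",
    "dorian": "min", "phrygian": "min", "aeolian": "min",
    "locrian": "dim",
}

# One spelling per chromatic degree; enharmonic choices are simplified.
NUMERALS = ["I", "♭II", "II", "♭III", "III", "IV",
            "♯IV", "V", "♭VI", "VI", "♭VII", "VII"]

def pop_label(mode, stored_root, quality):
    """Label a chord relative to the mode's tonic.

    stored_root is semitones above the relative ionian tonic, i.e. the
    relative-major storage described earlier in this thread.
    """
    degree = (stored_root - MODE_OFFSET[mode]) % 12
    numeral = NUMERALS[degree]
    if quality == "min":
        return numeral.lower()
    if quality == "dim":
        return numeral.lower() + "°"
    return numeral

def tonic_group(mode):
    """Group modes by tonic triad quality: 'maj', 'min', or 'dim'."""
    return TONIC_QUALITY[mode]

# A mixolydian I -> bVII is stored as V -> IV of the relative major.
mixolydian = [pop_label("mixolydian", 7, "maj"), pop_label("mixolydian", 5, "maj")]
# An ionian I followed by a borrowed bVII (root 10 semitones above the tonic).
ionian = [pop_label("ionian", 0, "maj"), pop_label("ionian", 10, "maj")]
```

With labels like these, the borrowed ♭VII in ionian and the native ♭VII in mixolydian end up with the same label, so no separate "borrowed from" field is needed to count them together.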

Just to clarify my position, I said that, given a major key, bVII is way more common than vii°. And then I said that bVII is the most common chord in rock songs after V and IV. I did not say (or at least mean to imply) that bVII was the most common chord in rock songs after V and IV given a major key, although re-reading my post I can see how a reader may have inferred that. (The same holds for my point about melodic scale degrees.) I don’t want to misrepresent my own published research.

That all being said, I would be somewhat surprised if iii in a major key (not v in the relative minor) is more common than bVII in a major key. But it would not be the first time that statistics proved my own intuitions incorrect.

As far as answering your question about how to calculate trends, I think the trends should be calculated based on the tonality of the song (at least as a baseline). E.g., for songs in a major key, here are the trends; for songs in a minor key, here are the trends. You can then also search for relative cross-trends, i.e., maybe bVI - i and IV - vi are not the most common progressions in minor or major keys respectively, but this chord progression holds more statistical weight when tracked via the La-based minor method (i.e., your color scheme if you go with the relative color thing). In other words, here are the trending color patterns. And heck, why not also search for the most common parallel trends, too (although if 90% of the songs are major, it probably would not make a big difference). If you have the key encoded as major or minor and the Roman numerals in pop notation, I would think it would be pretty easy to see all of this data, none of which we can presume is the “right” way to approach trends.
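A baseline along these lines could be sketched as follows, assuming each song is encoded as a tonality ("major" or "minor") plus a list of Roman numerals in pop notation. The function name `count_transitions` and the three toy songs are made up for illustration; they are not real Hooktheory data.

```python
from collections import Counter, defaultdict

def count_transitions(songs):
    """Count chord-to-chord transitions separately for each tonality.

    songs is an iterable of (tonality, chords) pairs, where chords is a
    list of pop-notation Roman numerals (hypothetical encoding).
    """
    trends = defaultdict(Counter)
    for tonality, chords in songs:
        # Pair each chord with the one that follows it.
        for current, following in zip(chords, chords[1:]):
            trends[tonality][(current, following)] += 1
    return trends

# Toy input, invented purely to exercise the function.
songs = [
    ("major", ["I", "V", "vi", "IV"]),
    ("major", ["I", "♭VII", "IV", "I"]),
    ("minor", ["i", "♭VI", "♭III", "♭VII"]),
]

trends = count_transitions(songs)
major_trends = trends["major"].most_common()
minor_trends = trends["minor"].most_common()
```

Relative or parallel cross-trends could then be computed by relabeling the chords before counting, without changing how the songs are stored.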

P.S. As is summarized in the Wikipedia description of modes, I would suggest you avoid referring to it as the “major mode.” Ionian, Lydian, and Mixolydian are all major modes. And the Ionian mode is not the same thing as a major key. Sorry to be picky here and keep harping on the same thing, but keys and scalar entities (like modes) should not be conflated. Your third point starts to get at this, that modes are traditionally separated by the quality of the tonic triad.

@trevordeclercq, thank you for the clarification. The more I think about it, the more this third method sounds like a pretty good option.