I’m Alexandre Passant, and I’m co-founder of Music & Data Geeks. We’re a music-tech company based in Dogpatch Labs Dublin, Ireland. In particular, we’re building seevl, a music meta-data API to help music services make sense of their data, provide recommendations to their users, and more. I’ve been working in data and Semantic Web technologies for about 10 years, first through a Ph.D., then as a Research Fellow at DERI, the world’s largest “Web 3.0” R&D lab, and now through the start-up.
My goal is to make the Web more open and interconnected, bridging the gap between raw data (webpages) and knowledge (meta-data and structured connections), and then making sense of this data through recommendations, analytics, etc. Combining this with my passion for listening to, playing and recording music is what led me to start MDG. I regularly build hacks and run small data experiments, and I recently decided to go through the top 500 songs as ranked by Rolling Stone magazine. My goal was to identify common patterns and differences between songs, and to see if and how some of them compare. I first worked on analysing the lyrics, figuring out that some patterns, such as love, regularly come through the songs, then on analysing their tempo and loudness, in order to identify which songs were the most dynamic or monotonic. Surprisingly, a few chart hits, like Pretty Woman, were in that category! I have a few other experiments in my pipeline, and I regularly blog about them on my website, while we release new products and hacks at MDG.
Here’s the second post of my data analysis series on the Rolling Stone top 500 greatest songs of all time. While the first one focused on lyrics, this one is all about the acoustic properties of the data-set – especially their volume and tempo.
To do so, I used the EchoNest, which delivers a good understanding of each track at the section level (e.g. verse, chorus, etc.) but also at a deeper “segment” level, providing loudness details about very short intervals (down to less than a second). This is not perfect, due to some issues discussed below, but it gives a few interesting insights.
Black leather, knee-hole pants, can’t play no highschool dance
As my goal was to identify relevant tracks from the dataset, in addition to the absolute loudness and tempo values of each track, I also looked at their standard deviation. If you’re not familiar with it, it helps to identify which songs and artists tend to stay close to their average tempo or loudness, versus the ones that are more dynamic.
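To make the idea concrete, here is a minimal NumPy sketch with two made-up tempo series (the numbers are illustrative, not from the dataset): both have exactly the same average, but very different standard deviations.

```python
import numpy as np

# Two hypothetical per-track tempo series (BpM) with the same mean
steady = np.array([118, 120, 120, 122, 120])   # stays close to its average
dynamic = np.array([80, 160, 100, 140, 120])   # swings around the same average

print(np.mean(steady), np.mean(dynamic))  # both 120.0
print(np.std(steady))                     # small deviation -> monotonic
print(np.std(dynamic))                    # large deviation -> dynamic
```

Same average, but the second series has a standard deviation more than 20 times larger: that gap is exactly what separates a “monotonic” track from a “dynamic” one below.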
Before going through individual songs from the top-500, let’s take an example with the top-10 Spotify tracks of a few artists to check their loudness:
| Artist | Average Loudness | Standard Deviation |
|--------|------------------|--------------------|
And the tempo:
| Artist | Average Tempo | Standard Deviation |
|--------|---------------|--------------------|
You can see that some bands really deserve their reputation. For instance, while Pink Floyd have a high standard deviation in both volume and tempo (not surprising), Motörhead are not only the loudest (on average) of the list, but also the band with the smallest standard deviation, meaning most of their tracks stay around that average loudness. In other words, they play everything loud. The Ramones are just fast; everything fast. And when they’re on stage together, the result is not surprising.
But you don’t really care for music, do you?
Coming back to the top 500, I ran the EchoNest analysis on 474 tracks of the list. The 26 missing ones are due to various errors at different stages of the pipeline.
On the one hand, I used raw results from the song API to get the average values. I had to consolidate the data by aggregating multiple API results together: for a single song, the API returns multiple tracks (as expected), but there can be large inconsistencies between them. For instance, if you search for American Idiot, one track (ID=SOHDHEA1391229C0EF) is identified as having a tempo of 93, another one (SOCVQDB129F08211FC) of 186. Some also show slighter variations (in volume, for instance, between a live and the original version). To simplify things – and I agree this introduces a bias in the results – I averaged the first 3 results from the API.
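A minimal sketch of that consolidation step, assuming the tempo values have already been fetched from the API. The helper name and the third value are illustrative; only the 93/186 pair comes from the American Idiot example above (a classic half/double-tempo ambiguity):

```python
import numpy as np

def consolidated_tempo(api_tempos, n=3):
    """Average the first n tempo values returned for a song.

    Hypothetical helper: `api_tempos` stands in for the tempo field of
    each track result (several tracks per song, sometimes inconsistent).
    """
    return float(np.mean(api_tempos[:n]))

# Two results disagree by a factor of two, a third agrees with the second
print(consolidated_tempo([93, 186, 186]))  # -> 155.0
```

Averaging is crude – a more careful approach would detect the half/double ratio and fold the outlier onto the majority – but it keeps the pipeline simple, at the cost of the bias acknowledged above.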
On the other hand, I relied on NumPy to compute the standard deviation from the first API result, first removing the fade-in and fade-out of each track. Here, I also skipped every segment or section where the API confidence was too low (< 0.4).
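Here is a hedged sketch of that computation. The segment tuples and their field layout are assumptions standing in for the EchoNest analysis output, not its actual schema:

```python
import numpy as np

# Hypothetical segment records mimicking the analysis output:
# (start_time, loudness, confidence). Field layout is an assumption.
segments = [
    (0.0, -20.0, 0.9),   # inside the fade-in -> dropped
    (1.2, -8.5, 0.8),
    (2.4, -9.1, 0.3),    # confidence below threshold -> skipped
    (3.6, -7.9, 0.7),
    (4.8, -18.0, 0.9),   # inside the fade-out -> dropped
]

def loudness_std(segments, fade_in_end, fade_out_start, min_conf=0.4):
    """Std dev of segment loudness, excluding fades and low-confidence data."""
    values = [loud for start, loud, conf in segments
              if fade_in_end <= start < fade_out_start and conf >= min_conf]
    return float(np.std(values))

print(loudness_std(segments, fade_in_end=1.0, fade_out_start=4.5))
```

With these toy numbers only two segments survive the filters, so the deviation is tiny; on a real track, hundreds of segments remain after trimming.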
The average loudness for the dataset is -10.38 dB. Paul Lamere ran an analysis of 1,500 tracks a few years ago, with an average of -9.5 dB, so we can see that this dataset is not too far from a “random” sample – check the conclusion of this post to understand why the EchoNest’s loudness values are negative.
Going through individual tracks, here are the loudest tracks from the list:
- The Twist by Chubby Checker (-2.97)
- Take Me Out by Franz Ferdinand (-3.61)
- She Loves You by The Beatles (-3.73)
- Time to Pretend by MGMT (-3.92)
- Call Me by Blondie (-3.99)
- Brown Sugar by The Rolling Stones (-4.28)
- Highway to Hell by AC/DC (-4.82)
- Who’ll Stop the Rain by Creedence Clearwater Revival (-4.92)
And the quietest ones:
- Hallelujah by Jeff Buckley (-18.16)
- Fast Car by Tracy Chapman (-18.40)
- Respect by Aretha Franklin (-18.62)
- Rollin’ Stone by Muddy Waters (-19.03)
- Desolation Row by Bob Dylan (-19.29)
- Let’s Stay Together by Al Green (-19.38)
- Let It Be by The Beatles (-20.50)
- Earth Angel by The Penguins (-34.94)
You can clearly see the dB difference between a loud (CCR) and a quiet (Jeff Buckley) track in the following plots.
Looking at the standard deviations, here are the most dynamic tracks, volume-wise.
- Sexual Healing by Marvin Gaye (16.11)
- Love and Happiness by Al Green (10.29)
- Heart of Gold by Neil Young (10.17)
- Roadrunner by The Modern Lovers (10.11)
- I’ve Been Loving You too Long (to Stop Now) by Otis Redding (9.98)
This last one is a beautiful example of a soul song with a dynamic volume range – here’s a live version below. At the other end of the spectrum, the most static tracks, volume-wise:
- Highway to Hell by AC/DC (1.33)
- She Loves You by The Beatles (1.32)
- I’ll Feel a Whole Lot Better by The Byrds (1.31)
- I Wanna Be Sedated by Ramones (1.19)
- 1999 by Prince (1.02)
The Ramones strike again – but I’m not sure that Highway to Hell is actually that linear, even though the second part definitely is!
Please could you stop the noise, I’m trying to get some rest
Moving away from loudness and focusing on tempo, here are the fastest tracks (in average BpM) of the list (some values seem a bit off here):
- Wild Thing by The Troggs (205.08)
- I Got a Woman by Ray Charles (174.76)
- Subterranean Homesick Blues by Bob Dylan (174.59)
- Moment of Surrender by U2 (173.99)
- Sympathy for the Devil by The Rolling Stones (172.32)
And the slowest ones, also including the Stones:
- Fake Plastic Trees by Radiohead (73.36)
- Let It Be by The Beatles (72.26)
- Wild Horses by The Rolling Stones (72.24)
- He Stopped Loving Her Today by George Jones (69.53)
- Remember (Walkin’ in the Sand) by The Shangri-Las (66.68)
But I believe that, once again, it’s more interesting to look at how dynamic the tracks can be. Here are the most dynamic ones, tempo-wise:
- Ain’t It a Shame by Fats Domino (34.27)
- I Never Loved a Man (the Way I Love You) by Aretha Franklin (20.82)
- Heart of Gold by Neil Young (20.23)
- Get Ur Freak On by Missy Elliott (19.17)
- Paranoid Android by Radiohead (17.92)
And the most static ones, i.e. the ones with the least tempo variation:
- Bizarre Love Triangle by New Order (0.048)
- I Want You Back by The Jackson 5 (0.035)
- Stand By Me by Ben E. King (0.027)
- I Heard It Through the Grapevine by Marvin Gaye (0.0075)
- Walk This Way by Run-DMC (0.0035)
If you’ve ever looked at the Man vs Machine app, you might find it fun that even though the least dynamic track (or the most consistent, depending on how you look at it) is the one built on samples (Run-DMC), all the others involved drummers. Don’t forget to thank the best backing band ever for the perfect tempo on Marvin Gaye’s track (and I couldn’t resist sharing their own cover of the song).
I’m waiting for that final moment you say the words that I can’t say
Last but not least, I normalized and combined both the loudness deviation and the tempo one to assign a [0:1] score to each track, in order to find the most and least dynamic tracks overall. Here’s the top 5 of the most dynamic ones:
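That combination step can be sketched as a min-max normalization of each deviation series, averaged with equal weights. The input numbers below are toy values for three hypothetical tracks, not the real dataset, and the equal weighting is my assumption about the simplest way to combine the two:

```python
import numpy as np

def minmax(values):
    """Scale a 1-D sequence linearly onto [0, 1]."""
    v = np.asarray(values, dtype=float)
    return (v - v.min()) / (v.max() - v.min())

# Toy deviations for three hypothetical tracks
loudness_dev = [16.1, 1.3, 10.2]
tempo_dev = [20.8, 0.05, 34.3]

# Equal-weight combination of the two normalized deviation series
score = (minmax(loudness_dev) + minmax(tempo_dev)) / 2
print(score.round(2))
```

A track that is flat on both axes lands near 0; one that swings on both lands near 1, which is exactly the [0:1] score used for the rankings below.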
- I Never Loved a Man (the Way I Love You) by Aretha Franklin (0.72)
- Heart of Gold by Neil Young (0.67)
- My Generation by The Who (0.65)
- Get Ur Freak On by Missy Elliott (0.60)
- Paranoid Android by Radiohead (0.42)
If you listen to My Generation, you can clearly hear the dynamics, both in tempo and volume, through the song’s different bursts. The Radiohead one plays out over the long run, with clearly distinct phases, as shown below for the volume part.
Finally, here are the least dynamic ones. Several songs on that list made it through the charts, showing that even though a song can be pretty flat in both volume and tempo, it can still be a hit – or at least an earworm:
- Brown Eyed Handsome Man by Chuck Berry (0.080)
- Oh, Pretty Woman by Roy Orbison (0.079)
- Maggie May by Rod Stewart (0.077)
- Stayin’ Alive by Bee Gees (0.076)
- Bizarre Love Triangle by New Order (0.073)
Alexandre Passant is the co-founder of Music and Data Geeks, a music-tech company based in Dublin. Music and Data Geeks’ chief product is Seevl.fm, a music meta-data API to help music services make sense of their data, provide recommendations to their users, and more. He has 10 years’ experience in data and Semantic Web technologies.
His personal blog can be found here.
(Featured image credit: Didier DDD)