Genius is a great resource. At a high level, Genius has song lyrics and allows users to comment on what the artist meant. It started as Rap Genius, where users annotated rap lyrics, and later rebranded as “Genius”, opening the site up to all songs. According to their website, “Genius is the world’s biggest collection of song lyrics and crowdsourced musical knowledge.” More recently, they’ve even moved to allowing annotations of pretty much anything posted online.
I’ve used it a bunch recently while trying to figure out what the hell Frank Ocean was trying to say in his new album Blond. Users of the site explained tons of Frank’s references that went whoosh right over my head when I listened the first time and all the times after.
And recently, when I had some ideas for mini projects using song lyrics, I was pretty happy to find that Genius had an API for getting the data on their site. Whenever I’m trying to get data from somewhere, I’m much happier with an API, or at least being able to get it from JSON responses rather than parsing HTML. It’s just cleaner to look at, and with an API, I can expect good documentation and responses that aren’t going to change with CSS updates.
Their API docs looked pretty good at first glance, with endpoints for artists, songs, albums, and annotations. One thing I did notice was that they don’t have a way to look up an artist by name. A lot of what I want to do is artist based, meaning I need to know the artist id for everyone. And in order for me to get that, I have to search for the artist, grab a song from the results, hit the song endpoint for that song’s information, and then grab the artist id from there. It’d be nice if I could specify what I’m searching for when I hit the search endpoint so I don’t have to go through that whole charade just to get the artist. But that’s a blog post for another time. Overall, they give out tons of information pretty easily.
But why, Genius, why don’t you have an endpoint for getting the raw lyrics of a song?! You have a songs endpoint on the API, and you give me a ton of information from there: the song title, album name, featured artists on the song, number of annotations, images associated with the song, album information, page views for that song, and a whole host of other data. But the one thing you don’t give me, and the one thing that people using the API probably want the most, is plain text lyrics!
Pre-Genius, I was stuck with these jankily laid out sites with super old looking CSS that would have the lyrics, though not necessarily correct ones, and definitely no annotations. Those sites are probably easy to scrape considering their simplicity, but searching for the right song would be more difficult, and the lyrics might not be correct. Genius has solved all of this for a web user, but dammit, I want the lyrics in the API!
Now you might be able to get the entire set of lyrics by using the annotations endpoint, which has information about all the annotations for a certain song or article, but that would require a song to have annotations for every word in the song. For someone like Chance the Rapper, who like Frank Ocean (and most other hip hop artists) uses tons of references in his lyrics, having complete annotations might not be an issue. But for Jake Owen, whose new single “American Country Love Song” has probably the most self explanatory lyrics ever (sorry for throwing you under the bus here, Jake. Still a fan), there’s no need to annotate anything, and getting the lyrics in this manner wouldn’t work.
The lyrics are there on the internet, however, and I can get at them by hitting the song endpoint and using the web url that it returns. The rest of this article will show you how to do that using Python and its requests and BeautifulSoup libraries. But I don’t want to have to resort to HTML parsing, and I don’t think Genius wants users doing that either.
I’m left here wondering why they don’t want to give up the lyrics so easily, and I really don’t have much to go on. Genius’s goal seems to be to annotate the internet. They have already moved on from their initial site, Rap Genius, into all music, and now into speech transcripts, as well as pretty much any other content on the web. Their value comes from the annotations themselves, not the information they’re annotating. They give away the annotations freely, but not the underlying information (the lyrics, in this case).
Enough speculation on why Genius doesn’t spit out the lyrics to a song when you get the other information. And as I’m writing this, I realize I easily could have overlooked something in their API, and Genius might return the full lyrics after all. In that case, half of this article will be pointless and I’ll hold my head in shame for yelling at them like I did.
For our purposes here, I’m going to show you how to get the song lyrics from Genius if you have the song title, and also talk through my process of getting there.
A note of clarification: just to make sure I’m not violating their terms of service, this post is for informational purposes only. Hopefully this can help programmers out there learn. Don’t do something bad with this knowledge. Code time!
The first thing you’re going to need is an account set up with Genius. You can sign up from the upper right hand corner of the genius.com homepage. After that, navigate to the API docs, where you’ll see the Bearer token that you’ll need for all API requests.
I’m using the requests library here, and once you have the bearer token, here’s what all the API requests to Genius should look like if, for example, you’re searching for a song title.
import requests

# TOKEN below should be the string that the API docs give you.
# Clearly I'm not giving mine out here on the internet. That'd be dumb.
base_url = "https://api.genius.com"

# Key line: this header is how you authorize your request when using the API.
headers = {'Authorization': 'Bearer TOKEN'}

search_url = base_url + "/search"
song_title = "In the Midst of It All"
params = {'q': song_title}
response = requests.get(search_url, params=params, headers=headers)
According to the Genius API docs, the response is a list of songs that match the string passed in, with the first result being the Tom Misch song that I was going for. By changing the url that is passed into requests.get, you can access all the information that Genius supplies from the API (pretty much everything but the lyrics).
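To make that a bit more concrete, here’s a minimal sketch of picking the right song out of a search response. It assumes the parsed JSON nests results under `response`, then `hits`, then `result`, and that each result carries a `title`, an `api_path`, and a `primary_artist` with a `name`. Those field names are my assumption about the response shape, not something shown above, so check them against what the API actually returns. `find_song` is a hypothetical helper name, and the sample data is hand-written so the snippet runs without a token.

```python
def find_song(search_json, title, artist_name=None):
    """Return the first search hit matching title (and artist, if given), else None.

    The keys used here (response, hits, result, title, primary_artist, name)
    are assumptions about the shape of the search response.
    """
    hits = search_json.get("response", {}).get("hits", [])
    for hit in hits:
        result = hit.get("result", {})
        # Compare case-insensitively so "in the midst of it all" still matches.
        if result.get("title", "").lower() != title.lower():
            continue
        artist = result.get("primary_artist", {}).get("name", "")
        if artist_name is None or artist.lower() == artist_name.lower():
            return result
    return None


# Hand-written stand-in for response.json(), trimmed to the assumed fields.
# The paths and the first entry are made up for illustration.
sample = {
    "response": {
        "hits": [
            {"result": {"title": "Some Other Song",
                        "api_path": "/songs/1",
                        "primary_artist": {"name": "Someone Else"}}},
            {"result": {"title": "In the Midst of It All",
                        "api_path": "/songs/2",
                        "primary_artist": {"name": "Tom Misch"}}},
        ]
    }
}

song = find_song(sample, "In the Midst of It All", "Tom Misch")
print(song["api_path"])  # /songs/2
```

With a real request, you’d pass `response.json()` in place of `sample`. If the result does include an `api_path` like the one assumed here, appending it to `base_url` and sending the same headers should get you that song’s full record.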