June 13, 2012

The Science of Music

An analysis published in Scientific Reports by Joan Serrà of the Artificial Intelligence Research Institute in Barcelona and his colleagues has found that music has indeed become both more homogeneous and louder over the decades.

Dr Serrà began with the basic premise that music, like language, can evolve over time, often pulled in different directions by opposing forces. Popular music especially has always prized a degree of conformity—witness the enduring popularity of cover songs and remixes—while at the same time being obsessed with the new. To untangle these factors, Dr Serrà’s team sifted through the Million Song Dataset, run jointly by Columbia University, in New York, and the Echo Nest, an American company, which contains beat-by-beat data on a million Western songs from a variety of popular genres. The researchers focussed on the primary musical qualities of pitch, timbre and loudness, which were available for nearly 0.5m songs released from 1955 to 2010.

They found that music today relies on the same chords as music from the 1950s. Nearly all melodies are composed of the ten most popular chords. Chord frequencies follow a pattern similar to that of written texts, where the most common word occurs roughly twice as often as the second most common, three times as often as the third most common, and so on, a linguistic regularity known as Zipf’s law. What has changed is how the chords are spliced into melodies. In the 1950s many of the less common chords would chime close to one another in the melodic progression. More recently, they have tended to be separated by the more pedestrian chords, leading to a loss of some of the more unusual transitions. Timbre, lent by instrument types and recording techniques, similarly shows signs of narrowing after a peak in the mid-60s, a peak Dr Serrà attributes to experimentation with electric-guitar sounds by Jimi Hendrix and the like.
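As a rough illustration of the Zipf's law pattern mentioned above, here is a minimal Python sketch that compares observed rank counts with the idealised prediction f(r) = f(1) / r. The chord sequence is made up for the example and is not taken from the Million Song Dataset; the function names are hypothetical.

```python
from collections import Counter


def rank_frequencies(tokens):
    """Return occurrence counts sorted from most to least common."""
    return [count for _, count in Counter(tokens).most_common()]


def zipf_expected(top_count, n_ranks):
    """Counts predicted by an idealised Zipf's law: f(r) = f(1) / r."""
    return [top_count / rank for rank in range(1, n_ranks + 1)]


# Made-up chord sequence, for illustration only (not real song data).
chords = ["C"] * 12 + ["G"] * 6 + ["Am"] * 4 + ["F"] * 3

observed = rank_frequencies(chords)
expected = zipf_expected(observed[0], len(observed))

# Print rank, observed count and Zipf prediction side by side.
for rank, (obs, exp) in enumerate(zip(observed, expected), start=1):
    print(rank, obs, round(exp, 2))
```

Real chord or word counts will only approximately follow the prediction; the toy data here was chosen to match it exactly so the pattern is easy to see.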

Read more via The Economist



Streaming Goes Global: Analysing Global Streaming Music With EMI Insight Data

[Report by Mark Mulligan from The Music Industry Blog] This July EMI’s Insight division launched an unprecedented initiative to share data from its 850,000-interview Global Consumer Insight dataset. The dataset covers 25 countries and over 7,400 artists, with twelve people being interviewed at any given moment, 24 hours a day, 7 days a week.

The data is being shared with the data science community in a range of initiatives including forthcoming Music Data Science Hackcamps.
As hard data continues to be something of a scarce commodity in the streaming music debate, I decided to mine EMI’s dataset to create a snapshot of global streaming music adoption and its influence on the broader music market. I have written up a report which you can download for free here. Additionally, EMI have given me permission to post the data here so that you can play around with it yourselves. In fact, I invite you to do so and see if you can find any trends that I missed in my analysis.

Here are some of the key findings from the report (the opinions and interpretations are, of course, my own and not necessarily EMI’s):

Streaming has a firm foothold. 32% of consumers across the globe are now using streaming services (see figure 1). However, adoption is far from uniform.
Nordics lead the way. Norway and Sweden (the home of Spotify) are respectively the 1st and 3rd most active streaming markets globally. Key to this trend is the relative sophistication of Internet users in these markets. 48% of Norwegians are now streaming music users, as are 43% of Swedes.
Streaming is a good fit for piracy-riddled Spain. Spain is the 2nd most active market with 44% streaming penetration. But whereas consumer sophistication was key to Nordic adoption, in Spain piracy and the legacy of free were the most important drivers.
Free is a good fit for France too. Piracy and free have also played an important role in France. French authorities have pushed through the controversial Hadopi legislation, but the carrot of Spotify and local streaming success Deezer has delivered immediate results. Translating streaming usage into purchases, though, is less successful: just 13%.
Purchase conversion rates are higher in lower penetration markets. The US, Canada, UK, Germany and Denmark have lower streaming penetration but these markets have much higher streaming-to-paid downloads conversion rates, averaging 23% of streaming users.
Streaming drives music discovery and consumption. Although it is still too early to draw definitive conclusions about exactly how much streaming impacts piracy and sales, the case for driving discovery and consumption is much clearer. 55% of global streaming music users state that they now discover new artists and new music as a result of streaming.
Usage is steady among existing users. Usage among existing streaming users is broadly steady, with 19% using streaming less than 12 months previously and 20% more.
Download the complete report here.

Original post by Mark Mulligan – The Music Industry Blog


The State of Online Music Discovery (Read Write Web)

Choosing music that someone else would like is more complex than suggesting toaster ovens or even movies. The reasons we like a song are highly subjective and can hinge on very specific, sometimes subtle characteristics. Thus music recommendation is a hard problem whose solution would simplify and brighten the lives of a huge audience – and that’s a tidy definition of a worthwhile business venture. Some companies have tried to solve it by programming computers to identify songs a given listener will like. Others use human judgement to match new music to a listener’s preferences. In this article, we evaluate some of the key players.

Automated Recommendation: Last.fm, Pandora and The Echo Nest

The dominant approach is to use the power of data and algorithms to understand the relationships between songs and listeners. Last.fm, Pandora and The Echo Nest are the most established players in this field.

It has been a while since Last.fm grabbed headlines the way Spotify and newer upstarts do today, but the service still offers one of the best music discovery tools out there. It monitors an individual’s digital listening habits – from mobile devices to desktop players, even tracks streamed from the browser – and compares them to the demonstrated preferences of other listeners. The CBS-owned service provides an open API that developers have used to build all kinds of apps and mashups. It has remained relevant by weaving in as many other products and services as possible. Last year’s Spotify integration, which brought its user interface and functionality directly into Spotify’s desktop client, is a perfect example.
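To make the idea of comparing one listener's habits with those of other listeners more concrete, here is a toy Python sketch of user-based collaborative filtering with Jaccard similarity. This is an assumed, simplified technique chosen for illustration; the article does not say which algorithm the service actually uses, and the listener names, track names and functions are all hypothetical.

```python
def jaccard(a, b):
    """Overlap between two sets of tracks, from 0.0 (none) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target, histories, top_n=3):
    """Suggest tracks the target has not played, weighted by listener similarity."""
    own = histories[target]
    scores = {}
    for user, tracks in histories.items():
        if user == target:
            continue
        similarity = jaccard(own, tracks)
        # Every unheard track inherits the similarity of the listener who played it.
        for track in tracks - own:
            scores[track] = scores.get(track, 0.0) + similarity
    # Highest score first; ties broken alphabetically for a stable result.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]


# Hypothetical listening histories.
histories = {
    "ana": {"track_a", "track_b", "track_c"},
    "ben": {"track_a", "track_b", "track_d"},
    "cy": {"track_c", "track_e"},
}

print(recommend("ana", histories))
```

Because "ben" shares two tracks with "ana" and "cy" shares only one, the track only "ben" has played ranks ahead of the one only "cy" has played.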

Read more >



‘Game-powered machine learning’ opens door to ‘Google for music’ (via Kurzweil Intelligent News)

Facebook game Herd It. The game acts as an incentive for online music fans to classify music, providing sets of examples that are used to train computers to automatically label more songs. (Credit: UC San Diego Jacobs School of Engineering)

Can a computer be taught to automatically label every song on the Internet using sets of examples provided by unpaid music fans? University of California, San Diego engineers have found that the answer is yes, and the results are as accurate as using paid music experts to provide the examples, saving considerable time and money.

Their solution, called “game-powered machine learning,” would enable music lovers to search every song on the Web well beyond popular hits, with a simple text search using key words like “funky” or “spooky electronica.”

Searching for specific multimedia content, including music, is a challenge because of the need to use text to search images, video and audio. The researchers, led by Gert Lanckriet, a professor of electrical engineering at the UC San Diego Jacobs School of Engineering, hope to create a text-based multimedia search engine that will make it far easier to access the explosion of multimedia content online.

That’s because humans working round the clock labeling songs with descriptive text could never keep up with the volume of content being uploaded to the Internet. For example, YouTube users upload 60 hours of video content per minute, according to the company.

In Lanckriet’s solution, computers study the examples of music that have been provided by the music fans and labeled in categories such as “romantic,” “jazz,” “saxophone,” or “happy.” The computer then analyzes waveforms of recorded songs in these categories looking for acoustic patterns common to each. It can then automatically label millions of songs by recognizing these patterns.
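The train-on-labelled-examples, then label-new-songs loop described above can be sketched with a deliberately simple stand-in. The Python snippet below uses a nearest-centroid classifier over made-up two-number "acoustic features"; it is not the researchers' model, and the tags, feature values and function names are assumptions for illustration only.

```python
import math


def centroid(vectors):
    """Average a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def train(labelled_examples):
    """Build one centroid per tag from fan-labelled example songs."""
    return {tag: centroid(vectors) for tag, vectors in labelled_examples.items()}


def predict(model, features):
    """Label a new song with the tag whose centroid is closest."""
    return min(model, key=lambda tag: math.dist(model[tag], features))


# Hypothetical features (say, brightness and tempo scaled to 0..1).
examples = {
    "happy": [[0.9, 0.8], [0.8, 0.9]],
    "romantic": [[0.2, 0.3], [0.3, 0.1]],
}

model = train(examples)
print(predict(model, [0.85, 0.7]))
print(predict(model, [0.1, 0.2]))
```

A real system would extract many more features from the waveform and would let a song carry several tags at once, but the division of labour is the same: people supply the examples, and the computer generalises from them.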

Read more>



Why Netflix Never Implemented The Algorithm That Won The Netflix $1 Million Challenge (via Techdirt)

You probably recall all the excitement that went around when a group finally won the big Netflix $1 million prize in 2009, improving Netflix’s recommendation algorithm by 10%. But what you might not know is that Netflix never implemented that solution itself. Netflix recently put up a blog post discussing some of the details of its recommendation system, which (as an aside) explains why the winning entry was never used. First, they note that they did make use of an earlier bit of code that came out of the contest:

A year into the competition, the Korbell team won the first Progress Prize with an 8.43% improvement. They reported more than 2000 hours of work in order to come up with the final combination of 107 algorithms that gave them this prize. And, they gave us the source code. We looked at the two underlying algorithms with the best performance in the ensemble: Matrix Factorization (which the community generally called SVD, Singular Value Decomposition) and Restricted Boltzmann Machines (RBM). SVD by itself provided a 0.8914 RMSE (root mean squared error), while RBM alone provided a competitive but slightly worse 0.8990 RMSE. A linear blend of these two reduced the error to 0.88. To put these algorithms to use, we had to work to overcome some limitations, for instance that they were built to handle 100 million ratings, instead of the more than 5 billion that we have, and that they were not built to adapt as members added more ratings. But once we overcame those challenges, we put the two algorithms into production, where they are still used as part of our recommendation engine.

Neat. But the winning prize? Eh… just not worth it:
We evaluated some of the new methods offline but the additional accuracy gains that we measured did not seem to justify the engineering effort needed to bring them into a production environment.
It wasn’t just that the improvement was marginal, but that Netflix’s business had shifted, and with it the way customers used its product and the kinds of recommendations the company made. Suddenly, the prize-winning solution just wasn’t that useful, in part because many people were streaming videos rather than renting DVDs, and it turns out that recommending a video to stream right away is different from recommending a rental to be viewed a few days later.
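For readers unfamiliar with the terms in the Netflix quote above, here is a small Python sketch of RMSE and of a linear blend of two models' predictions. The ratings and predictions are invented for the example; they are not Netflix data, and the two "models" are plain lists of numbers standing in for SVD and RBM outputs.

```python
import math


def rmse(predictions, ratings):
    """Root mean squared error between predicted and actual ratings."""
    squared = [(p - r) ** 2 for p, r in zip(predictions, ratings)]
    return math.sqrt(sum(squared) / len(ratings))


def blend(preds_a, preds_b, weight):
    """Linear blend: weight * A + (1 - weight) * B, element by element."""
    return [weight * a + (1 - weight) * b for a, b in zip(preds_a, preds_b)]


# Invented ratings and predictions, for illustration only.
ratings = [4, 3, 5, 2]
model_a = [4.6, 3.2, 4.5, 2.4]
model_b = [3.6, 3.1, 5.3, 1.5]

blended = blend(model_a, model_b, 0.5)

print("model A RMSE:", round(rmse(model_a, ratings), 3))
print("model B RMSE:", round(rmse(model_b, ratings), 3))
print("blend RMSE:  ", round(rmse(blended, ratings), 3))
```

In this toy case the two models err in opposite directions, so averaging them cancels much of the error. That is the general reason a blend can beat both of its components, although the size of the gain depends entirely on the data.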

Read more>

