Predicting not only the effects of music promotion, but also a cult following.

April 4, 2018 MKTG/STAT 476/776 Prof. Peter Fader

Abstract

This paper analyzes the purchase processes of two obscure albums by fitting 28 timing models. Drawing on music history and music industry practice, the author determines that a “Weibull 2-Segment Finite Mixture Model with covariates associated with music promotion (airplay, the holiday season, tours and EPs)” can be an actionable tool for music promoters at record labels, both to assess the efficacy of different promotional tactics and to gain insights about consumer behavior. Finally, the positive duration dependence of one segment of the final model appears to have been able to predict that Sparklehorse would develop a cult following.

1 Contextualizing the Data: Alternative Rock and the Music Industry

With the breakthrough phenomenon of Nirvana’s Nevermind album in 1991, alternative rock entered the mainstream. Previously, major record labels such as Capitol, EMI and Warner Bros had deemed “alternative” rock unprofitable and left it to indie labels. However, in response to the changing landscape of popular music, the major record labels greedily signed talented alt-rock bands. Popular radio and MTV (Music Television) aggressively increased the genre’s airplay in their weekly rotations, in audio and music video formats respectively. Ironically, even though airplay and other mainstream album and tour promotional tactics ran against the do-it-yourself ethos of alternative rock, bands such as Pearl Jam and Soundgarden nevertheless achieved notoriety.

Yet, many other alternative rock bands have since faded into obscurity. Two such bands, both signed by Capitol, are Dink and Sparklehorse. Dink released one self-titled album, Dink (1994), with noticeable promotion upon debut, before Capitol decided that the band no longer had commercial viability and dropped it. Sparklehorse’s debut album, Vivadixiesubmarinetransmissionplot (1995), was well-received by critics and eventually gained some attention from college radio stations. Ultimately, Sparklehorse developed a cult following that was remarkable enough for the band to be the subject of a documentary, The Sad & Beautiful World of Sparklehorse, in 2016.

This history is reflected in the sales and airplay data of the first 27 weeks of the bands’ respective debut albums. It is evident that Capitol believed in Dink but not Sparklehorse. Dink was met with immediate radio attention, likely due to Capitol’s promotional efforts, but rapidly lost the attention of radio stations after approximately three months. Conversely, Sparklehorse’s album did not receive any airplay until its nineteenth week, after which its airplay increased drastically for nearly two months before plateauing. Notably, even at its maximal weekly airplay in its first 27 weeks, Sparklehorse’s album did not receive as much airplay in one week as Dink did in its first week. This is unsurprising; college radio simply does not have the same reach as mainstream radio.

These observations suggest that using probability models to understand the underlying purchase processes of their albums may both confirm intuitions about the music industry and yield fascinating new insights into the fundamental drivers of album promotion.

2 A Baseline for Comparison: Basic Probability Models

The album purchase data matches the criteria of timing models. The number of purchases per week has an integer value. Furthermore, the assumption that each consumer only purchases one album simplifies modeling. Fitting basic exponential models provides a baseline for comparison with future models:

Dink

Model         λ         Total LL      BIC         MdAPE
Exponential   0.0295    -255510.63    511032.77   24.90%

Sparklehorse

Model         λ         Total LL      BIC         MdAPE
Exponential   0.00156   -31036.88     62085.27    50.60%

[Figure: Exponential model, actual vs. expected by week, with one panel each for Dink and Sparklehorse.]
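As a rough illustration of how such a baseline can be scored, the sketch below assumes the usual grouped-data likelihood for a timing model (weekly counts are interval-censored, and everyone who has not bought by the last week is right-censored) and the common BIC form -2·LL + k·ln(N). Neither formula is spelled out in this paper, so both are assumptions, as is reading MdAPE as the median absolute percentage error. All function names are illustrative.

```python
import math
from statistics import median


def exponential_cdf(t, lam):
    """P(purchase by the end of week t) under a constant weekly rate lam."""
    return 1.0 - math.exp(-lam * t)


def grouped_log_likelihood(weekly_sales, population, cdf):
    """Log-likelihood of weekly sales counts (assumed grouped-data form).

    Buyers in week t contribute log[F(t) - F(t-1)]; the people who have
    not bought by the final week contribute log[1 - F(T)].
    """
    ll = 0.0
    for week, sold in enumerate(weekly_sales, start=1):
        ll += sold * math.log(cdf(week) - cdf(week - 1))
    survivors = population - sum(weekly_sales)
    ll += survivors * math.log(1.0 - cdf(len(weekly_sales)))
    return ll


def bic(log_likelihood, n_params, population):
    """Assumed BIC form: -2 * LL + k * ln(N)."""
    return -2.0 * log_likelihood + n_params * math.log(population)


def mdape(actual, expected):
    """Median absolute percentage error, skipping weeks with zero sales."""
    return median(abs(a - e) / a for a, e in zip(actual, expected) if a > 0)
```

With the reported log-likelihoods, a population of 100,000 and one parameter, `bic(-255510.63, 1, 100_000)` gives about 511032.77 for Dink and `bic(-31036.88, 1, 100_000)` gives about 62085.27 for Sparklehorse, which agrees with the BIC values in the tables.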

3 Understanding Heterogeneity: Gamma Mixture and Zero-Inflation

No justification was provided for the assumption that the population size (N) is 100,000. It is unclear whether N includes all music buyers, popular music listeners, casual alt-rock fans, or only hardcore alt-rock diehards. Given that Capitol is a major record label in this era of music history in which “alternative” is also mainstream, this assumption is particularly frustrating and questionable for understanding what significance the population’s heterogeneity actually holds.

Fitting Exponential-Gamma models, both with and without “non-buyers,” helps to elucidate the nature of the data’s heterogeneity without breaking the N = 100,000 assumption:

Dink

Model                         p(buyer)   λ         Total LL      BIC         MdAPE
Exponential (Zero-Inflated)   0.999      0.0296    -255514.10    511051.22   24.911%

Model                    p(buyer)   r          α          Total LL      BIC         MdAPE
Exponential-Gamma        --         23814.46   8751199    -255692.00    511407.12   28.40%
Exponential-Gamma (ZI)   0.999      19051229   64486053   -255514.10    511062.73   24.91%

Sparklehorse

Model                         p(buyer)   λ         Total LL     BIC        MdAPE
Exponential (Zero-Inflated)   0.999      0.00157   -31036.90    62096.83   50.60%

Model                    p(buyer)   r         α           Total LL     BIC        MdAPE
Exponential-Gamma        --         1361635   870854408   -31036.90    62096.78   50.60%
Exponential-Gamma (ZI)   0.999      1.837     1169.78     -31051.5     62137.62   49.29%

Two worthwhile observations can be made regarding these models. First, “non-buyers” do not exist under the population-size assumption, because the estimated probability of being a buyer is essentially 100%. Second, the Exponential-Gamma returns nonsensical parameters (in most fits, r and α run from the thousands to the hundreds of millions); this implies that the population is rather homogeneous. Both observations suggest that N merely represents the (fairly homogeneous) population of alternative rock fans who are passionate enough to buy albums by bands like Dink and Sparklehorse. This result would be consistent with the actual history of the albums: Dink failed to attract a mainstream audience, and Vivadixiesubmarinetransmissionplot only caught on with the college radio niche.
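To see why very large r and α point toward homogeneity, the sketch below assumes the standard gamma mixing distribution with shape r and rate α, so that the purchase rate has mean r/α and coefficient of variation 1/sqrt(r), and the Exponential-Gamma CDF is 1 - (α/(α+t))^r. That parameterization is an assumption, since the paper does not state it. The inputs are Sparklehorse's reported Exponential-Gamma estimates.

```python
import math


def gamma_mean_and_cv(r, alpha):
    """Mean and coefficient of variation of a gamma(shape=r, rate=alpha)."""
    return r / alpha, 1.0 / math.sqrt(r)


def exp_gamma_cdf(t, r, alpha):
    """F(t) = 1 - (alpha / (alpha + t))**r, written with log1p for stability."""
    return 1.0 - math.exp(-r * math.log1p(t / alpha))


def exponential_cdf(t, lam):
    return 1.0 - math.exp(-lam * t)


# Reported Sparklehorse Exponential-Gamma estimates (no zero-inflation).
r, alpha = 1361635.0, 870854408.0
mean_rate, cv = gamma_mean_and_cv(r, alpha)

# Gap between the mixture and a plain exponential at the mean rate, week 27.
gap = abs(exp_gamma_cdf(27, r, alpha) - exponential_cdf(27, mean_rate))
print(f"mean rate = {mean_rate:.5f}, CV = {cv:.5f}, CDF gap = {gap:.2e}")
```

Under these assumptions the mean rate comes out near the plain exponential's λ of 0.00156 and the spread around it is negligible, so the mixture behaves almost exactly like a single exponential.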

4 Factoring in Duration Dependence: Weibull & Weibull-Gamma

Intuitively, album sales should be duration dependent. A highly competitive music scene with innumerable musicians releasing new albums year-round makes it difficult for an individual album to maintain consumer attention. So, it is important to observe whether the findings regarding heterogeneity from the exponential models hold true when duration dependence is factored in:

Dink

Model                     p(buyer)   λ         c       Total LL      BIC         MdAPE
Weibull                   --         0.0105    1.335   -253012.50    506048.02   24.98%
Weibull (Zero-Inflated)   0.645      0.00793   1.692   -251989.70    504014.02   14.72%

Model                p(buyer)   r          α         c       Tot. LL    BIC        MdAPE
Weibull-Gamma        --         0.9227     154.923   1.650   -252485    505004.8   19.37%
Weibull-Gamma (ZI)   0.645      11260.09   1420563   1.692   -251990    504025.7   14.72%

Sparklehorse

Model                     p(buyer)   λ           c       Total LL     BIC        MdAPE
Weibull                   --         0.0002108   1.609   -30637.65    61298.33   31.99%
Weibull (Zero-Inflated)   0.999      0.000211    1.609   -30637.66    61309.87   31.99%

Model                p(buyer)   r         α         c       Tot. LL    BIC        MdAPE
Weibull-Gamma        --         1576.58   7478873   1.609   -30637.7   61309.87   31.98%
Weibull-Gamma (ZI)   0.999      1782.42   8448099   1.609   -30637.7   61321.39   31.99%

The results of the Weibull and Weibull-Gamma models suggest that zero-inflated and Weibull-Gamma models are unreliable. Dink’s zero-inflated Weibull and Weibull-Gamma models contradict all other models by estimating p(buyer) at 0.645 rather than 0.999. Its zero-inflated Weibull-Gamma model fails to improve on its zero-inflated Weibull model. Most of the Weibull-Gamma models exhibit absurd parameters. Although the Weibull-Gamma models appear to improve BIC and MdAPE for Dink, they have virtually no effect on Sparklehorse’s models.

Logically, the same model ought to fit both albums because they are both alternative-rock albums released by Capitol at around the same time. Even though the Weibull-Gamma appears to improve BIC and MdAPE values remarkably for Dink, it would be short-sighted and against intuition to base future Dink models on the Weibull-Gamma and future Sparklehorse models on the Weibull, given all of these inconsistencies and the concerns with interpreting N, zero-inflation and the Weibull-Gamma.

Overall, it is best to be conservative and use the regular Weibull model as the basis of future models.
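For readers who want to connect the shape parameter c to duration dependence, the sketch below assumes the parameterization F(t) = 1 - exp(-λ t^c), whose hazard is λ c t^(c-1). The paper does not state its parameterization, so this is an assumption, though it is consistent with the λ and c columns reported above. Under it, c > 1 gives a hazard that rises with time, c < 1 one that falls, and c = 1 recovers the exponential.

```python
import math


def weibull_cdf(t, lam, c):
    """Assumed form: F(t) = 1 - exp(-lam * t**c)."""
    return 1.0 - math.exp(-lam * t ** c)


def weibull_hazard(t, lam, c):
    """Hazard implied by the assumed form: lam * c * t**(c - 1)."""
    return lam * c * t ** (c - 1.0)


def duration_dependence(c, tol=1e-9):
    """Classify the hazard's slope over time from the shape parameter."""
    if c > 1.0 + tol:
        return "positive"   # hazard increases with time
    if c < 1.0 - tol:
        return "negative"   # hazard decreases with time
    return "none"           # constant hazard (exponential)


# Dink's reported plain Weibull estimates.
lam, c = 0.0105, 1.335
print(duration_dependence(c), round(weibull_cdf(27, lam, c), 3))
```

With Dink's plain Weibull estimates, this form implies that roughly 57% of the assumed population has purchased by week 27.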

5 Segmenting Consumers: Finite Mixture Models

Though heterogeneity is difficult to capture with a Gamma mixture, it still may be possible to understand the make-up of the consumer population with finite mixture models featuring two to four segments:

Dink

Model          Seg 1 %   Seg 2 %   Seg 3 %   Seg 4 %   Tot. LL     BIC        MdAPE
Weibull FM 2   50.67     59.33     --        --        -251075.1   502219.3   10.64%
Weibull FM 3   50.67     49.33     0.00      --        -251075.1   502253.8   10.64%
Weibull FM 4   33.89     60.43     0.06      0.00      -251276.0   502690.2   11.27%

Sparklehorse

Model          Seg 1 %   Seg 2 %   Seg 3 %   Seg 4 %   Tot. LL    BIC       MdAPE
Weibull FM 2   30.00     70.00     --        --        -30449.5   60968.1   19.99%
Weibull FM 3   0.007     74.70     24.53     --        -30431.7   60966.9   12.30%
Weibull FM 4   32.04     35.03     32.10     0.00      -30435.0   60988.1   12.20%

Excitingly, the Weibull 2-Segment and 3-Segment Finite Mixture models are virtually the same, and their behaviors are consistent across both albums. Though Sparklehorse’s MdAPE is higher under the 2-segment model than under the 3-segment model (19.99% versus 12.30%), the gap may be misleading: the BIC values are nearly identical (60968.1 versus 60966.9), and the 2-segment MdAPE is still considerably lower than that of the previous models. Overall, the simplest and most consistent model moving forward is the Weibull 2-Segment Finite Mixture.
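A finite mixture of this kind can be sketched as a weighted sum of segment-level CDFs. The example below assumes the Weibull form F(t) = 1 - exp(-λ t^c) for each segment. The segment weights and parameters are hypothetical values chosen only for illustration, not estimates from this paper.

```python
import math


def weibull_cdf(t, lam, c):
    """Assumed segment-level form: F(t) = 1 - exp(-lam * t**c)."""
    return 1.0 - math.exp(-lam * t ** c)


def mixture_cdf(t, segments):
    """Weighted sum of segment CDFs; segments is a list of (weight, lam, c)."""
    if abs(sum(w for w, _, _ in segments) - 1.0) > 1e-9:
        raise ValueError("segment weights must sum to 1")
    return sum(w * weibull_cdf(t, lam, c) for w, lam, c in segments)


def expected_weekly_sales(n_weeks, population, segments):
    """Expected purchases in each week: N * [F(t) - F(t-1)]."""
    return [
        population * (mixture_cdf(t, segments) - mixture_cdf(t - 1, segments))
        for t in range(1, n_weeks + 1)
    ]


# Hypothetical two-segment example (not fitted values).
segments = [(0.3, 0.02, 1.2), (0.7, 0.0005, 1.8)]
expected = expected_weekly_sales(27, 100_000, segments)
print([round(e) for e in expected[:5]])
```

Comparing such expected weekly sales with the actual series is what the actual-versus-expected plots and the MdAPE column summarize.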

6 Speculating about Covariates: Album Promotional Tactics

For the purpose of fitting probability models, covariates related to album promotion are likely to be the most predictive and actionable. This is because album promotion strategy is a deliberate and replicable action that directly affects album sales. Music marketers can study the effects of such covariates to inform their decisions about when to employ certain strategies.

The pre-Napster pop marketing toolkit of the 1990s consisted of several major tactics:

1. Distribution: Record labels pressured various distribution channels (e.g. record shops, big-box retailers) to stock physical records. CD was the dominant music format at the time, though singles were often also sold on vinyl.
2. Singles: Throughout the album’s promotion lifespan, four or fewer songs from an album would occasionally be bundled and released as a CD or vinyl single.
3. Radio Airplay: The record label’s promoters push the most popular songs from a single onto mainstream radio. Promising albums receive more attention from promoters.
4. MTV Airplay: The record label may also fund the creation of accompanying music videos for popular singles and push them on MTV.
5. Extended Plays: The artist may release EPs consisting of approximately five to nine new songs to sustain or boost appetite for the preceding album many months after its release, or to engage fans in between major album releases.
6. Tours: The artist may tour as either a supporting act or headliner.
7. “Holiday Wave” Promotions: The record label may promote albums during time periods in which consumer demand is higher. In particular, major labels enjoy riding a “holiday wave” in the weeks preceding New Year’s Eve.

The aforementioned promotional tactics are expected to have varying efficacy as covariates:

1. Distribution: Effective distribution is extremely unlikely to be an issue because physical records are typically sold with a “100% return privilege.” In other words, if supply exceeds demand, the retailer can return the records at no cost. Unless a band is exceedingly popular, there would not be a mismatch due to excess demand either.
2. Singles: Given that typically only one single from an album is played on the radio at any moment in time (rather than the entire album, as the dataset’s airplay label may seem to imply), adding covariates related to single releases may be redundant. Even though the release of singles may be the fundamental driver of airplay, the historical documentation of release dates for unpopular artists is incredibly obscure. Additionally, release dates do not capture the enduring effects of a single’s success.
3. Radio Airplay: Generally, radio airplay reflects both popularity and how much effort the record label is putting into promoting an album, which makes it a particularly desirable covariate. Perhaps more importantly, the dataset already includes radio airplay.
4. MTV Airplay: Not only is acquiring MTV airplay data infeasible, but it would also likely exhibit multicollinearity with radio airplay.
5. Extended Plays: EPs are likely to be useful covariates because their release dates are well-documented, they feature new songwriting material, and they serve a promotional purpose that is different from that of singles.
6. Tours: Touring is an explicit act of promotion that lasts several weeks or months and is entirely distinct from airplay or releasing music. In other words, a covariate capturing the effects of touring would not be redundant with other covariates.
7. “Holiday Wave” Promotions: The effects of holidays on consumer demand are well-known and can be easily fitted to the model.

Overall, it appears that the most useful prospective covariates are most likely to be radio airplay, EPs, tours and the holiday wave.

Granted, it may be possible to fit covariates related to the overall economy, music industry growth and specific significant events in music history (e.g. Nirvana’s lead singer, Kurt Cobain, committing suicide on April 5, 1994). However, such covariates are not only difficult to discern, but also not actionable from the perspective of a music marketer or promoter. These covariates would also presumably affect (or fail to affect) the entire industry rather than individual albums; in other words, such economic factors would naturally already be factored into promotion strategy indirectly. Additionally, the research paper “Experimental Music Markets,” published in January 2017, suggested that consumers treat pop / rock music as a necessity rather than a luxury, which, if true, would bound the economic effects on album sales. Finally, though Cobain’s suicide may (sadly) have boosted Nirvana sales, it is hard to imagine that such an event would boost the sales of unrelated bands such as Dink or Sparklehorse.

7 Researching Covariates in Relation to Dink and Sparklehorse

Now that two decades have passed since Dink and Vivadixiesubmarinetransmissionplot, the information about specific release dates is both sparse and imprecise. However, after aggregating scattered information from sources such as allmusic.com, rateyourmusic.com and Amazon, it appears that Dink featured several singles that were released in physical formats, including the following: “Green Mind / Reason / Angels” (2/14/1995; CD and 12 inch vinyl) and “Get On It” (5/2/1995; 12 inch vinyl). “Green Mind,” peaking at #35 on the Billboard Alternative Charts in January 1995 and boasting a music video played on MTV, appears to have been a modest success. Sparklehorse’s debut album also featured several singles, two of which were released in 7 inch vinyl and / or CD formats in February 1996. However, none of them charted.

Fortunately, the probability model was not expected to depend on single release dates. It is notable, though, that the sharp rise in sales of Vivadixiesubmarinetransmissionplot also began in February 1996, the same month as several single releases. However, it would be indefensible to cherry-pick those singles for the Dink model, especially when the release dates are so imprecise.

As for EPs and tours, Dink released Blame it on Tito on October 1, 1996; this does not occur within the first 27 weeks, so it cannot be used as a covariate. The band toured between November 17 and December 17, 1994 with Pop Will Eat Itself and Compulsion; this occurs between weeks 1 and 4.

Sparklehorse released Chords I’ve Known on April 2, 1996; this happens in the 27th week. The band was also the supporting act on the European segment of Radiohead’s Ok Computer-era tour with dates ranging between April 16 and October 10, 1996; the tour occurs outside of the data’s time range.
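The in-range events described above can be encoded as simple 0/1 weekly indicators. The sketch below is one hypothetical encoding (the helper name and the choice of a plain indicator are assumptions, not the paper's stated method): Dink's tour is switched on for weeks 1 through 4 and Sparklehorse's EP for week 27, following the week numbers given in the text.

```python
def week_indicator(n_weeks, active_weeks):
    """Return a 0/1 list of length n_weeks; entry t-1 is 1 if week t is active."""
    active = set(active_weeks)
    return [1 if week in active else 0 for week in range(1, n_weeks + 1)]


N_WEEKS = 27

# Dink's tour falls between weeks 1 and 4 of the data.
dink_tour = week_indicator(N_WEEKS, range(1, 5))

# Sparklehorse's EP release falls in week 27.
sparklehorse_ep = week_indicator(N_WEEKS, [27])

print(dink_tour)
print(sparklehorse_ep)
```

Events outside the 27-week window, such as Dink's EP and Sparklehorse's tour, would produce an all-zero column and therefore carry no information for the model.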

8 Implementing Mutual Covariates: Airplay and Holiday Season

Given that only Dink has a tour and only Sparklehorse has an EP within the first 27 weeks, airplay and holiday season covariates are fitted first to maintain consistency:

Dink

Model                                Seg 1 %   Seg 2 %   Tot. LL   BIC        MdAPE
Weibull FM 2 + ln_airplay            45.89     54.11     -250944   501980.2   9.44%
Weibull FM 2 + ln_airplay + season   65.35     34.65     -250733   501581.2   7.31%

[Figure: Dink, actual vs. expected by week (weeks 1 to 27). Left panel: Weibull 2-Segment Finite Mixture + ln(Airplay). Right panel: Weibull 2-Segment Finite Mixture + ln(Airplay) + Holiday Season.]

Sparklehorse

Model                             Seg 1 %   Seg 2 %   Tot. LL   BIC       MdAPE
Weibull FM 2 + airplay            64.76     35.2      -30295    60682.2   16.16%
Weibull FM 2 + airplay + season   68.86     31.1      -30272    60659.2   12.53%

[Figure: Sparklehorse, actual vs. expected by week (weeks 1 to 27). Left panel: Weibull 2-Segment Finite Mixture + Airplay. Right panel: Weibull 2-Segment Finite Mixture + Airplay + Holiday Season.]

Because there are zeros in Sparklehorse’s airplay data, the log of airplay is undefined in those weeks, so the two albums’ models are forced to diverge trivially in how airplay enters: Dink fits the log of airplay, whereas Sparklehorse fits raw airplay. Regarding improvements, MdAPE decreases with each additional covariate. The most noticeable remaining misfit occurs in week 27 for Sparklehorse.
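One common way to let weekly covariates enter a Weibull timing model is a proportional-hazards adjustment in which each week's increment t^c - (t-1)^c is scaled by exp(β·x_t). The paper does not state its exact functional form, so the sketch below is an assumption, and the function names are illustrative. It also shows why the log transform cannot be applied to weeks with zero airplay.

```python
import math


def weibull_cov_cdf(n_weeks, lam, c, betas, covariates):
    """CDF at weeks 1..n_weeks with time-varying covariates (assumed form).

    F(t) = 1 - exp(-lam * B(t)), where
    B(t) = sum over i <= t of (i**c - (i-1)**c) * exp(beta . x_i).
    covariates[i-1] holds the covariate values x_i for week i.
    """
    cdf, cumulative = [], 0.0
    for week in range(1, n_weeks + 1):
        x = covariates[week - 1]
        lift = math.exp(sum(b * v for b, v in zip(betas, x)))
        cumulative += (week ** c - (week - 1) ** c) * lift
        cdf.append(1.0 - math.exp(-lam * cumulative))
    return cdf


def log_airplay(spins):
    """Log-transform airplay; undefined for weeks with zero spins."""
    if spins <= 0:
        raise ValueError("log of airplay is undefined when airplay is zero")
    return math.log(spins)


# With all coefficients at zero the model reduces to the plain Weibull.
no_effect = weibull_cov_cdf(27, 0.0105, 1.335, [0.0], [[1.0]] * 27)
print(round(no_effect[-1], 3))
```

A positive coefficient on an indicator that is switched on in a given week raises that week's expected purchases, and a negative one lowers them.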

9 Final Touches & New Insights: Dink’s Tour and Sparklehorse’s EP

Dink Final Model: Weibull 2-Segment Finite Mixture with ln(Airplay), Holiday Season and Tour Covariates

Parameter          Segment 1   Segment 2
λ                  0.00119     0.00114
c                  1.399       3.748
β ln(Airplay)      0.241       -0.363
β Holiday Season   0.456       -17.092
β Tour             0.858       -1.157
Segment %          84.41%      15.59%

Total LL      BIC          MdAPE
-250730.493   501599.142   6.809%

[Figure: Dink Final Model, actual vs. expected by week (weeks 1 to 27).]

[Figure: Dink Final Model (Cumulative), actual vs. expected by week (weeks 1 to 27).]

Sparklehorse Final Model: Weibull 2-Segment Finite Mixture with Airplay, Holiday Season and EP Covariates

Parameter          Segment 1   Segment 2
λ                  0.000588    0.001144
c                  1.295       0.00293
β Airplay          0.000351    0.0162
β Holiday Season   -0.0692     9.0165
β EP               -18.515     1.414
Segment %          72.65%      27.35%

Total LL     BIC         MdAPE
-30231.360   60600.874   12.052%

[Figure: Sparklehorse Final Model, actual vs. expected by week (weeks 1 to 27).]

[Figure: Sparklehorse Final Model (Cumulative), actual vs. expected by week (weeks 1 to 27).]

The probability models not only fit exceedingly well, but also offer new insights into consumer behavior:

Unsurprisingly, the EP was able to explain a significant mismatch in Sparklehorse’s 27th week. This suggests that a well-timed release of new material can effectively improve the previous album’s sales.

The difference in covariates between segment 1 and segment 2 for both albums indicates a shared phenomenon: a proportion of a population will respond positively to promotional efforts, and the remainder will either react neutrally or negatively. This is not an unreasonable result; for example, consumers who may have purchased an album out of curiosity or passion for the overall genre may be turned off by promotional efforts that do not resonate with them (e.g. hearing and disliking one of the album’s singles being played on the radio just one-too-many times.)
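One way to read the segment-level coefficients is as multipliers on the purchase rate. The sketch below assumes the covariates enter through exp(β·x), as in a proportional-hazards setup; the paper does not state the functional form, so this reading is an assumption. The inputs are Dink's reported final-model coefficients.

```python
import math

# Dink's reported final-model coefficients as (segment 1, segment 2).
dink_betas = {
    "ln(Airplay)": (0.241, -0.363),
    "Holiday Season": (0.456, -17.092),
    "Tour": (0.858, -1.157),
}


def hazard_multiplier(beta, x=1.0):
    """Multiplier on the purchase rate when a covariate takes value x."""
    return math.exp(beta * x)


for name, (seg1, seg2) in dink_betas.items():
    print(f"{name}: segment 1 x{hazard_multiplier(seg1):.3g}, "
          f"segment 2 x{hazard_multiplier(seg2):.3g}")

# A coefficient on ln(Airplay) acts like an elasticity under this reading:
# doubling airplay multiplies the rate by 2**beta.
doubling_effect_seg1 = 2.0 ** dink_betas["ln(Airplay)"][0]
```

Under this reading, the tour multiplies segment 1's purchase rate by roughly 2.4 and cuts segment 2's to about a third, while segment 2's holiday coefficient drives its rate to essentially zero during the holiday weeks.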

Finally, although three of the models exhibit negative duration dependence (as expected), the smaller of Sparklehorse’s two segments appears to have extremely positive duration dependence. Fascinatingly, the model seems to confirm that even though Sparklehorse never achieved mainstream success, the band did effectively develop a cult following that enabled it to produce new music up until the lead singer’s suicide in 2010.

10 Final Thoughts

Throughout the entire model-building process, the model was designed with parsimony and consistency with intuitions about the music industry in mind. Most importantly, each of the covariates represents an actionable promotional tactic with effects that can be observed both visually and statistically throughout the model-building process.

Ultimately, a Weibull 2-Segment Finite Mixture Model with covariates for Airplay, the Holiday Season, Tours and EPs appears to be the best model for several reasons. First, the fit is stellar. Second, the models suggest that music marketers and promoters should recognize that not all music consumers will react favorably to promotional efforts. They should be mindful of segment sizes, purchase rates, and whether specific promotional efforts do in fact increase sales. Finally, the model appears to have been able to predict, using just 27 weeks of data, that Sparklehorse would develop a small but exceedingly loyal fan base.
