Paper to be presented at DRUID21 Copenhagen Business School, Copenhagen, Denmark October 18-20, 2021

Internet Meme Production and Competition

Michael R. Ward University of Texas at Arlington (UTA) Economics [email protected]

Abstract The sharing of Internet memes is an increasingly popular form of expressing opinions and complex sentiments in an easily understood image. Marketers are exploiting the attention memes generate by enlisting meme influencers to create memes to promote products. I develop a machine learning algorithm that classifies meme images scraped from meme aggregation sub-forums to generate a panel of meme posts. The data reveal time-series patterns in meme proliferation and quality, competition for attention across memes, learning-by-doing in meme creation, and potential mechanisms of the learning-by-doing. These findings suggest that the market for meme influencers will tend to be concentrated and may tend to become a superstar market.

I. Introduction

The popularity of sharing through social media has increased dramatically in just a few years. Memes are defined as an element of culture that can be passed on to another individual by nongenetic means, usually imitation. The Internet meme has become standardized as a combination of a picture and a tacit concept linked to the picture. In social media, the image, usually drawn from popular culture, connotes a specific sentiment, and the creator adds original text indicating that the sentiment applies in another context. Memes also usually attempt humor. Different Internet meme generator sites (e.g. Imgflip.com, kapwing.com, makeameme.org) have inventories of the more popular meme images that any user can customize with their own text overlay. However, most of the meme expressions Internet users observe are copies of memes that friends have copied onto online social media. In this way, the most popular expressions of memes can diffuse through a social network quite quickly and broadly. Because Internet memes can command broad user attention, marketers have exploited them as a novel means to promote products (Roache, 2019). Meme influencers have emerged who create, curate, and disseminate new memes and new versions of existing memes (Kar, 2020).

Mediakix reports that the top 25 meme influencers in 2020 have a combined total of nearly 200 million followers.1 Influencers such as @epicfunnypage, @funnymemes, and @ladbible have millions of followers. Marketers contract with meme influencers to create promotional memes or product placements in memes.

Internet memes have been studied in various contexts. Amalia et al. (2018) use image processing and OCR to develop a method of classifying the sentiment in a meme as pro- or anti-government. Shabunina and Pasi (2018) and Weng, et al. (2012) study the identification and propagation of textual memes. Weng, et al. (2012) find that social network structure and competition for attention explain much of the heterogeneity of meme popularity and persistence. Xie, et al. (2011) study the virality and influence of memes embedded in videos about newsworthy events. In a series of papers, Coscia (2013, 2014, and 2018) examines how meme image similarity, faddishness, and competition affect the success of a meme. Since these studies also examine stock images with idiosyncratic text, they are most closely related to this research.

1 https://mediakix.com/blog/instagram-meme-accounts-best-funniest/ (accessed 13 Nov 2020)

Research on Internet memes has largely focused on the determinants of meme success in becoming viral in social networks. Spitzberg (2014) synthesizes multiple theories into a model of meme diffusion, and Gleeson, et al. (2014) show how competition between memes for attention generates popularity measures that can be described by a critical branching process. This predicts a power-law distribution of popularity with heavy tails. He et al. (2016) overcome impractical assumptions and the difficulty of characterizing dynamic information to develop a model-free scheme to rank meme popularity in online social media and the social network. The model of He, et al. (2019) uses meme attributes to evaluate user behavior and dynamic network structure. Xie, et al. (2011) construct a graph model connecting people and content that allowed the detection of certain regularities. For example, content that will eventually have a long lifespan can often be predicted within a day. Weng, et al. (2012) model competition for attention and claim that the combination of social network structure and competition is sufficient to predict broad diversity in meme popularity, lifetime, and user activity without reference to exogenous factors such as intrinsic meme appeal, user influence, or external events. Bonchi, et al. (2013) develop heuristics for a platform selecting among memes to promote maximum virality. Guadagno, et al. (2013) show that a stronger emotional response to a meme increases its propensity to go viral.


These analyses typically compare one meme against another. The focus of this study is instead on within-meme fluctuations in a meme’s usage. I collected a sample of meme-related posts from Reddit sub-forums dedicated to meme culture. A machine learning algorithm was developed to match the image of a meme post to popular meme images. Matches are never perfect because the text always differs across posts. However, the algorithm correctly classified nearly 70,000 posts into 533 existing meme templates. The result is a panel with a separate time series for each meme that includes information on the number of posts, their authorship, their popularity score, and the number of comments on the posts. In keeping with Coscia (2013, 2014, and 2018), the common underlying image will be referred to as an Internet meme while the various implementations of the meme, in this case a particular post to Reddit, will be referred to as a meme expression.

I develop panel estimators to study the evolution of memes over time. The estimator identifies causal effects in the manner of a difference-in-differences design by including two-way fixed effects (by meme and time period). The fixed effects for individual memes capture the time-invariant differences in popularity across memes that had been the focus of much of the prior research. The fixed effects for time periods capture the overall growth in the phenomenon of meme sharing. Thus, identification is driven by time variation within memes.

The empirical analysis generates multiple insights. First, the proliferation of a meme is self-reinforcing, with more and higher quality posts generating more future posts. Second, this proliferation tends to temporarily reduce the perceived quality of each future post using the meme. Third, there is dynamic competition across memes for user attention. Fourth, there is learning-by-doing in meme creation. Fifth, this learning-by-doing by meme creators may be meme-specific and does not result from learning to exploit timing patterns that could be used to predict when a post is most likely to succeed.

These findings have implications for the meme influencer market. While there are low entry barriers, and meme influencers provide differentiated content to different market segments, learning-by-doing would tend to concentrate the market structure. Perhaps a better analogy would be a superstar market with a few providers out of many dominating the page views. Since meme copying enhances the value of its creator, the market should not be affected by intellectual property concerns. These features suggest that meme usage will thrive even as it evolves.

II. Testable Implications

The appeal of different meme images differs across individuals. Some are shared more than others and are “up voted” more often. This cross-sectional variation represents differences in demand. However, most of the within-meme variation in the number of meme expressions and how well they are received will be determined by time-varying supply side factors. A meme poster with a topical message will select the meme image best suited to represent that message. The poster will tend toward meme images that have recently been especially well received. In this “market,” supply and demand are equilibrated at a price of zero; there are no pecuniary costs.

While the pecuniary price of a meme is zero, there are time costs in both the consumption and creation of a meme. For consumption, the time may only be a few seconds to read the post, and perhaps to rate it, comment on it, or copy and paste it into one’s social media. Higher quality memes will induce more audience members to bear these costs. Differences in the score that a post achieves through “up votes” represent differences in the number of meme consumers who value the post enough to bear that time cost. The time costs to create a post from an existing meme range from a few minutes to an hour. The steps may include finding the meme image online, adapting the appropriate text for the current application, and posting it on the Internet to a site such as Reddit. Supply might shift due to the difficulty of any of these tasks. But any variation in difficulty will affect all meme creation and will not be meme specific as studied here.

Meme Popularity

A post of a meme that is particularly compelling could affect the future use of the meme. Such a post may have touched on a nuance of the meme that had not been appreciated previously. If so, the meme will be rated highly by a larger audience. Moreover, that nuance may be expanded upon in future incarnations of the meme. This effect may be observed in two ways. Meme creators will be enticed by an anticipated increase in demand to create more posts using a meme whose recent posts are perceived to be better. Additionally, more meme creators will adopt the now more popular meme as part of their repertoire. These considerations suggest that a popular post will likely be imitated and expanded upon. However, there is no reason to believe that there will be a permanent shift in the demand for the meme or the quality of the new posts that result. If there is diminishing marginal value of posts using a meme, demand will eventually subside to the steady-state.

H1a: Meme popularity will tend to be self-reinforcing. More meme posts in the recent past will encourage the generation of more posts using the meme. Consequently, a recent increase in the use of a meme over a steady-state will persist and dissipate gradually.


H1b: Better past meme posts will encourage the generation of more posts using the meme. Consequently, higher average scores of recent posts using a meme will temporarily increase use of the meme.

Meme Expression Success

A typical meme creator wants her creation to be viewed and appreciated as widely as possible. A meme expression is more successful when it is shared or “up voted” more often. The experience with recent expressions of a meme may affect the success of subsequent expressions in two ways. First, if meme consumers experience diminishing marginal value of a meme, an increased volume of posts using a meme may decrease the rating achieved by subsequent posts. Second, recent expressions of a meme may have been well received because they exploited a new association with popular culture. If so, audience members who appreciated a particular expression of a meme will be more disposed to appreciate similar subsequent expressions. This would imply that the scores that a meme receives will be higher when the recent past scores are higher, at least temporarily.

H2a: Memes will tend to get stale with overuse. More past meme posts will lead to lower ratings for subsequent meme expressions.

H2b: Better meme posts will generate an afterglow that leads to temporarily higher ratings for subsequent meme expressions.

Competition for Attention across Memes

The chief cost to meme consumption is time. If a meme enjoys increased popularity, a larger audience will spend more time consuming it. The time this audience spends on the meme will crowd-out time spent on other memes. This will lead to lower ratings and fewer posts for other memes.

H3a: A meme will become less popular with increased popularity of other memes or better ratings of other meme expressions.

H3b: A meme will garner lower ratings with increased popularity of other memes or better ratings of other meme expressions.

Learning-by-doing in meme production

Learning-by-doing in meme production would result in higher ratings for posts from creators with more posting experience. As one creates more posts, one may gain a deeper understanding of how the meme relates to different potential applications. Moreover, one learns how to reduce the time spent in their production. The creator can spend more time on the content of her post and less on the mechanics of producing it. Meme creators typically wish for their posts to be widely disseminated as a validation for their creativity. As they produce more posts, they may learn how to craft content so that it is better appreciated.

Meme creators may learn multiple mechanisms to increase ratings. Two such mechanisms are specializing in a specific set of memes and timing posts to appear when demand is anticipated to be higher. Specialization would be evident if a creator repeatedly uses the same meme rather than adopting different memes expressing different sentiments. Strategic timing would be evident if more prolific creators posted when demand for a meme is expected to be particularly high.


H4a: Meme posts by creators with more past posting experience will tend to be more successful.

H4b: Creators with more past posting experience will tend to reuse the same meme.

H4c: Some of the success of creators with more meme experience is due to strategically posting when a meme is more likely to be popular.

III. Data and Meme Classification

A consistent set of data were collected for static images rather than videos. Coscia (2014) refers to a meme implementation as the expression that puts together the meme template and some additional information relating to the meme concept. I will refer to a ‘meme’ as the underlying image upon which different text is overlaid to customize the message. Each textual customization extends the image to a new context. For example, the “Distracted Boyfriend” meme provides a reference to someone getting caught displaying a surreptitious desire. By supplying novel text, the creator extends the context of the meme to a new application. For example, figure 1 depicts the idea of student procrastination with the girlfriend labeled “studying for exams” while the woman receiving the attention is labeled “browsing memes.” Each iteration of the meme with different text will be referred to as a meme expression. A particularly clever, poignant, or humorous meme expression will tend to be copied and shared by more users.

The sample of meme expressions was taken from the meme forums on Reddit.com. Reddit.com is a social news aggregation, web content rating, and discussion website. Users can post content to the site, which is then voted up or down by other users. Posts are organized by subject into user-created boards called “subreddits” on thousands of different topics. Despite strict rules prohibiting harassment, Reddit's administrators spend considerable resources on moderating the site. Pertinent to this study, meme enthusiasts post, vote on, and comment on memes in five subreddits: memes, wholesomememes, dankmemes, 2meirl4meirl, and MemeEconomy. Between 2012 and mid-2019, over three million posts were uploaded to these subreddits. While Reddit facilitates the scraping of only the most recent 6,000 posts, another site, pushshift.io, records each Reddit post’s unique identifier, which allows one to make a call to that post’s information on the Reddit site. In that way, nearly all meme posts to Reddit can be scraped. Many posts’ content consists of the sharing of a mildly interesting but idiosyncratic smartphone screenshot. Other posts contain new potential memes drawn from recently transpiring events in popular culture. Some of these will become popular memes but most will not. The remaining posts are new meme expressions of already established memes.

Reddit posts were classified into memes based on a machine learning algorithm that matches posted images to a set of existing images. The Convolutional Neural Network algorithm was trained with the set of popular meme images that existed on imgflip.com in June 2019.2 Imgflip.com is a popular website that allows users to easily create meme expressions. Users can upload their own images, but most select from the site’s catalog of thousands of meme templates and add their own text.3 The catalog is fluid but represents most of the popular images used in memes at the time. This catalog was the reference set against which Reddit posts were compared. Reddit posted images that had a 98% or higher match with one of the meme templates were classified as being an expression of that template’s meme. Perfect matches are precluded by text overlays that differ across meme expressions. In total, 68,325 of the posts in the meme-related subreddit forums were classified into 842 distinct meme templates.
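The thresholding step can be illustrated with a minimal sketch. It assumes that an upstream image-matching model has already produced a similarity score between 0 and 1 for each candidate template. The function name `classify_post`, the dictionary layout, the template name "Other Template", and the example scores are hypothetical and are not taken from the classifier cited in footnote 2.

```python
from typing import Dict, Optional

MATCH_THRESHOLD = 0.98  # the 98% cutoff described in the text


def classify_post(similarities: Dict[str, float],
                  threshold: float = MATCH_THRESHOLD) -> Optional[str]:
    """Return the best-matching template name, or None if no template
    reaches the threshold.

    `similarities` maps template name -> similarity score in [0, 1]
    (hypothetical output of an image-matching model).
    """
    if not similarities:
        return None
    best_template = max(similarities, key=similarities.get)
    if similarities[best_template] >= threshold:
        return best_template
    return None


# Hypothetical scores for two posts: the first clears the cutoff, the second does not.
matched = classify_post({"Distracted Boyfriend": 0.991, "Other Template": 0.412})
unmatched = classify_post({"Distracted Boyfriend": 0.730, "Other Template": 0.655})
```

Under this rule a post is assigned to at most one template, and posts whose best score falls below the cutoff remain unmatched.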

2 The algorithm is available at https://github.com/paullewislobo/meme_classifier.

3 https://imgflip.com/memetemplates

Matching meme-related posts to templates was not perfect. First, only about two percent of all posts were matched. While this is low, many of the Reddit posts are either idiosyncratic user posts, use new popular culture images, or use less popular and less well-established memes. The match rate was higher prior to 2016, probably because more of these memes became popular then and were added to the set of templates. Still, the low match rate suggests the possibility of an unrepresentative sample. Figure 2 shows that the monthly volume of both matched and unmatched posts increased sharply over the past four years. This indicates a steadily growing popularity of online social interactions through memes. Second, the matching algorithm resulted in some false positives and more false negatives. Spot checking revealed that about 0.18% of matched posts were classified incorrectly and about 1.4% of unmatched posts should have been classified. This implies a slight undercount of meme expressions classified to memes.

Various data elements were retrieved for each matched post. These include the date of the post, the meme to which it was matched, the title, the creator identifier, the post’s Reddit score, and the number of comments. There are large differences in the popularity of memes. For example, 500 meme templates garnered 10 or fewer posts while 50 memes were depicted in 200 or more posts. Table 1 lists the most popular memes and suggests that meme popularity is distributed exponentially. While the creator identifier is unique, there is no demographic information available about each creator. About 57% of the identified creators post just once, but some post regularly, making the average number of posts per creator 2.9. Table 2 reports the number of creators by the number of posts they create. The post’s title may provide some context for the post, but creators tend to use subtle references.

The Reddit score is a compilation of up votes and down votes that the post receives. Generally, a post will receive more up votes when it is more timely or humorous. Figure 3 depicts the average score over time for both posts that were matched with templates and those that were not. The score will be a combination of meme “quality” and the number of people voting on its quality. The rise in average score for non-matched posts between 2016 and 2018 is probably indicative of the growing popularity of meme sharing in general rather than “higher quality” memes. Note that the average score of posts matched to memes began to fall in 2018. This could be the result of the set of templates becoming “stale” in the minds of meme enthusiasts. The histogram in figure 4 indicates that there is wide variation in Reddit scores for posts. The median score is 17 while the mean is 243 and the highest is over 100,000. More controversial posts tend to receive more user comments. Figure 5 depicts the average number of comments a post receives over time and displays an increase and plateauing similar to figure 3 that is likely caused by an overall increase in the population viewing memes.

IV. Results

Meme Popularity

A popular meme is one whose format is used often. In the data collected, popularity is identified by the duration in time before another post to Reddit uses the same meme. An increase in popularity will correspond to a shorter duration. For example, a duration for a specific post of, say, 7.37 hours implies a current rate of 3.26 posts per day for the time that the specific meme post was the most current. Using durations this way, rather than, say, the number of posts per day, allows each meme post to represent a separate observation of the current meme popularity. I analyze these durations using survival analysis methods developed specifically to measure the duration of events.
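The conversion from a duration to an implied posting rate is simple arithmetic, sketched below. The function name `posts_per_day` is illustrative only.

```python
HOURS_PER_DAY = 24.0


def posts_per_day(duration_hours: float) -> float:
    """Convert the time until the next same-meme post into an implied daily posting rate."""
    if duration_hours <= 0:
        raise ValueError("duration must be positive")
    return HOURS_PER_DAY / duration_hours


# The example duration from the text: 24 / 7.37 is about 3.26 posts per day.
rate = posts_per_day(7.37)
```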


I model a meme post’s duration with a Cox proportional hazards model. This model measures how covariates affect the “failure” rate of the observation. In this case, failure occurs when another post uses the same meme. The variables of interest relate a meme post’s duration as the most recent expression of a meme to the recent experience with that meme. Specifically, over the period just prior to the focal post, I measure the number of posts to Reddit using that meme, their average Reddit score, and the average number of comments they generate:

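$$\mathit{time}_{mi} \sim \exp\left(\alpha^{t}\,\mathit{posts}_{m} + \beta^{t}\,\mathit{score}_{m} + \gamma^{t}\,\mathit{comment}_{m} + \theta^{t} X_{mi}\right). \tag{1}$$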

Here, $\mathit{time}_{mi}$ refers to the time to failure of the $i$th post using the $m$th meme. The variables of interest, $\mathit{posts}_{m}$, $\mathit{score}_{m}$, and $\mathit{comment}_{m}$, measure the recent past experience with the meme. This information is collected for multiple 24-hour periods prior to the time stamp of the focal post. The test of hypothesis 1a is whether an increase in the number of past posts of a meme leads to a decrease in the duration (an increase in the failure rate) that a post of the meme is the most current expression of that meme. The test of hypothesis 1b is whether an increase in the average score of recent posts of a meme tends to decrease the duration (increase the failure rate) that a post of the meme is the most current expression of that meme.
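The construction of the lagged meme experience variables can be sketched as follows. This is a minimal illustration rather than the code used for the estimates: the `Post` record, the function name `lagged_experience`, and the sample data are hypothetical, and the window for lag k is assumed to run from 24k hours before the focal time stamp up to, but not including, 24(k-1) hours before it.

```python
from datetime import datetime, timedelta
from statistics import mean
from typing import List, NamedTuple, Optional, Tuple


class Post(NamedTuple):
    meme: str
    timestamp: datetime
    score: int
    comments: int


def lagged_experience(posts: List[Post], focal: Post, lag: int = 1
                      ) -> Tuple[int, Optional[float], Optional[float]]:
    """Summarize same-meme posts in the lag-th 24-hour window before `focal`.

    Returns (number of posts, average score, average comments); the averages
    are None when no same-meme post falls in the window.
    """
    window_end = focal.timestamp - timedelta(hours=24 * (lag - 1))
    window_start = focal.timestamp - timedelta(hours=24 * lag)
    in_window = [p for p in posts
                 if p.meme == focal.meme
                 and window_start <= p.timestamp < window_end]
    if not in_window:
        return 0, None, None
    return (len(in_window),
            mean(p.score for p in in_window),
            mean(p.comments for p in in_window))


# Hypothetical posts: three use meme "A", one uses meme "B".
posts = [
    Post("A", datetime(2019, 1, 1, 8), 10, 2),
    Post("A", datetime(2019, 1, 1, 20), 30, 4),
    Post("B", datetime(2019, 1, 1, 21), 99, 9),
    Post("A", datetime(2019, 1, 2, 6), 5, 0),
]
focal = posts[-1]
first_lag = lagged_experience(posts, focal, lag=1)   # two earlier "A" posts
second_lag = lagged_experience(posts, focal, lag=2)  # empty window
```

Because each window is anchored at the focal post's own time stamp, two posts of the same meme will generally have different lagged windows.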

In addition to the meme-specific information, the model features several controls that could be correlated with the duration and the meme-specific variables. Specifically, these are 24 fixed effects for the hour-of-the-day of the post to account for differences in how active Redditors are throughout the daily cycle. Likewise, seven fixed effects for the day-of-week of the post account for variation in activity levels throughout the week. The secular trends in overall meme activity are accounted for with 96 monthly fixed effects that span the length of the data set. In addition, the model is stratified by each of the 533 memes. Stratification introduces a proportional offset for each meme that will tend to absorb the time-invariant cross-sectional variation in popularity across memes due to, say, unmeasured aesthetic appeal. The stratification by meme and monthly fixed effects represent two-way fixed effects. With these controls, the coefficients on past meme experience are identified only from within-meme variation that is not due to either individual meme popularity or the general growing popularity in meme sharing. Finally, since some posts do not have any prior posts over the recent past, the meme experience variables would have missing values. I included a dummy variable for these cases and replaced the missing values with an imputed value of zero so that the observation could be included in the analysis.
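The missing-value treatment can be illustrated with a short sketch. The function name `log_with_missing_flag` is hypothetical, and treating a non-positive average as missing is an assumption made here only so that the logarithm is defined; the text does not say how such cases were handled.

```python
import math
from typing import Optional, Tuple


def log_with_missing_flag(value: Optional[float]) -> Tuple[float, int]:
    """Return (regressor, missing_dummy) for one meme experience variable.

    When the lag window held no posts the value is undefined, so the
    regressor is set to zero and the dummy to one. Otherwise the regressor
    is the natural logarithm of the value and the dummy is zero.
    Assumption: non-positive values are also flagged as missing.
    """
    if value is None or value <= 0:
        return 0.0, 1
    return math.log(value), 0


no_history = log_with_missing_flag(None)   # imputed zero, dummy switched on
with_history = log_with_missing_flag(20.0)  # ln(20), dummy switched off
```

Including the dummy alongside the imputed zero lets the regression absorb the average difference for posts without recent history instead of dropping them.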

Table 3 reports the results of the Cox proportional hazards model for up to four lagged days of meme experience.4 Each column represents a separate specification. The control variables mentioned above are also included in all specifications but are not reported. Typically, posts during the middle of the East Coast US day fail sooner, posts on Sunday fail later, and posts later in the sample fail much sooner. The table reports hazard ratios for the natural logarithm of the meme experience variables and tests for a difference from one. A value greater than one indicates an increase in failure rates which, in this case, means that the next post to use the meme will tend to come sooner. The double-logarithm specification allows for an elasticity interpretation of results. The value of 1.744 for the logarithm of the number of posts in the immediately preceding 24-hour period in column one can be interpreted as (nearly) a 74.4% increase in failure rates from a 100% increase in recent posts with the same meme, or that the current rate of posting for this meme is 74.4% higher than the baseline. The subsequent columns add additional 24-hour periods. In general, periods that are lagged more have a smaller effect, with the coefficients after four lags becoming small or insignificant. Across all columns, the sum of the effects is between 70% and 80%, which indicates that meme popularity tends to persist for several days but eventually reverts to the previous steady state. This result supports hypothesis 1a.

4 A lagged day spans the 24 hours prior to the timestamp of the focal meme. Thus, meme expressions rarely share identical lagged days.

There is also evidence for hypothesis 1b because better received posts of a meme also tend to generate more posts using the same meme. A higher score on past posts indicates that more of the Reddit population expressed their approval or admiration of recent posts of the meme. The effect of the logarithm of the average score for previous posts is positive and significant for one or two days prior to the focal post but not longer. The elasticity magnitude of 0.02 to 0.05 for two lags is much smaller than the 0.7 to 0.8 estimate for the number of posts. Even though there is more variation in scores than in the number of posts, its contribution to the variation in failure rates is still much smaller. I also find a positive effect of past posts’ comments on failure rates, but these effects are small and often only marginally significant, and their interpretation is less clear.

Meme Expression Success

A successful expression of a meme is one that more readers will enjoy and, perhaps, will share more broadly in their social media, potentially “going viral.” While I do not observe Reddit meme posts being copied into other social media, I observe the post’s Reddit score. A higher score indicates that more viewers expressed their pleasure with the post. It is conjectured that a post that is appreciated by more Reddit readers will tend to be shared more broadly. Therefore, I identify the determinants of a post’s Reddit score as determinants of the meme expression’s success.


Reddit scores are modeled analogously to the failure rates. However, now the dependent variable is simply the logarithm of the post’s score. Again, the variables of interest are the recent experience with the meme: number of posts, average score, and average number of comments,

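$$\ln(\mathit{score}_{mi}) = \alpha^{s}\,\mathit{posts}_{m} + \beta^{s}\,\mathit{score}_{m} + \gamma^{s}\,\mathit{comment}_{m} + \theta^{s} X_{mi} + \varepsilon^{s}_{mi}. \tag{2}$$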

As before, the control variables include fixed effects for the hour-of-the-day, the day-of-week, monthly fixed effects, and fixed effects for each meme. Again, for cases of missing values due to no recent past posts of a meme, I include a dummy variable and replace the missing values with an imputed value so that the observation can be included in the analysis.

Table 4 reports the results of this estimation and is analogous to table 3. The various columns include increasingly more lagged values. Lags up to four days for the variables of interest affect the current post’s score. Consistent with hypothesis 2a, an increase in past use of a meme tends to decrease the score of any new post of that meme. A 100% increase in recent past use is associated with about an 8% decrease in the score. This suggests that the meme audience is becoming weary of yet another usage of the same meme. Consistent with hypothesis 2b, an increase in the quality of past meme expressions, as measured by their average score, tends to increase the score of new meme posts. This could be due to a particularly clever direction for a meme being particularly well appreciated and then being exploited by subsequent meme creators. Past meme comments do not appear to affect meme scores.

Figure 6 provides graphical representations of these effects as impulse-response functions. It simulates the effect of a one standard deviation increase in either the number of posts of a meme or in their average score on the future number of posts and their average score over the next two weeks. These use the coefficient values from column (3) of tables 3 and 4. The shock to posts on future posts persists longest but reverts to within 10% of the steady state within two weeks. All other effects from shocks die out in three to four days.
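The mechanics of an impulse-response calculation can be sketched for a single variable. This is a simplified illustration, not the simulation behind figure 6: it traces one shock through a one-variable distributed-lag recursion, and the lag coefficients used below are placeholders chosen for the example rather than estimates from tables 3 and 4.

```python
from typing import List, Sequence


def impulse_response(lag_coefs: Sequence[float], horizon: int,
                     shock: float = 1.0) -> List[float]:
    """Trace a one-time shock through x_t = sum_k lag_coefs[k-1] * x_{t-k}.

    Returns the path x_0, x_1, ..., x_horizon, where x_0 is the shock itself.
    """
    path = [shock]
    for t in range(1, horizon + 1):
        value = sum(coef * path[t - k]
                    for k, coef in enumerate(lag_coefs, start=1)
                    if t - k >= 0)
        path.append(value)
    return path


# Placeholder coefficients (hypothetical): 0.5 on the first lag, 0.2 on the second.
response = impulse_response([0.5, 0.2], horizon=14)
```

When the lag coefficients sum to less than one in absolute value, as in this placeholder case, the response decays toward zero, which corresponds to a return to the steady state.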

Competition for Attention across Memes

It is possible that increased popularity of other memes will divert attention away from the focal meme. To test this, I augment equation (1) with information about all other posts matched to memes. That is, I also include the number, the average score, and the average number of comments of past posts of all other memes besides the focal meme. As shown above, more past posts of other memes should increase the flow of the creation of new posts using these other memes. The increased attention on these other memes is hypothesized to compete for attention with the focal meme. By including them in the model for the focal meme, we can determine their effects on posts of the focal meme. Equation (1) becomes:

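$$\mathit{time}_{mi} \sim \exp\left(\alpha^{t}\,\mathit{posts}_{m} + \beta^{t}\,\mathit{score}_{m} + \gamma^{t}\,\mathit{comment}_{m} + \delta^{t}\,\mathit{posts}_{\neg m} + \zeta^{t}\,\mathit{score}_{\neg m} + \eta^{t}\,\mathit{comment}_{\neg m} + \theta^{t} X_{mi}\right) \tag{3}$$

where $\neg m$ denotes all posts that do not use meme $m$. Similarly, the success of a post, as measured by the score it achieves, is also related to the past experience with the focal meme as well as the aggregation of all other memes,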

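$$\ln(\mathit{score}_{mi}) = \alpha^{s}\,\mathit{posts}_{m} + \beta^{s}\,\mathit{score}_{m} + \gamma^{s}\,\mathit{comment}_{m} + \delta^{s}\,\mathit{posts}_{\neg m} + \zeta^{s}\,\mathit{score}_{\neg m} + \eta^{s}\,\mathit{comment}_{\neg m} + \theta^{s} X_{mi} + \varepsilon^{s}_{mi}. \tag{4}$$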

Note that $\neg m$ covers all other memes among the sample that were matched with the most popular memes. There are many more posts that use new memes or were not matched for other reasons.
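The other-meme aggregates can be sketched as below. The function name `other_meme_experience`, the dictionary layout, and the sample scores are hypothetical. The sketch also assumes that the average is taken over all other-meme posts pooled together rather than as an average of per-meme averages, a detail the text leaves open.

```python
from typing import Dict, List, Optional, Tuple


def other_meme_experience(window_scores: Dict[str, List[float]],
                          focal_meme: str) -> Tuple[int, Optional[float]]:
    """Pool every matched post in the lag window that does NOT use `focal_meme`.

    `window_scores` maps meme name -> scores of that meme's posts in the window.
    Returns (number of other-meme posts, their average score or None if empty).
    """
    pooled = [score
              for meme, scores in window_scores.items()
              if meme != focal_meme
              for score in scores]
    if not pooled:
        return 0, None
    return len(pooled), sum(pooled) / len(pooled)


# Hypothetical window: two posts of meme "A", one of "B", two of "C".
window = {"A": [10.0, 30.0], "B": [99.0], "C": [1.0, 2.0]}
others_for_a = other_meme_experience(window, "A")
```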

Table 5 reports the results of estimations of both equations (3) and (4) for up to three lagged days. The results for past experience with the focal meme for both the post’s duration and score are qualitatively unchanged from tables 3 and 4. Specifically, in the left three columns, more posts using the meme and higher scoring posts tend to shorten the duration of the focal post. However, more posts of other memes and higher average scores for other memes tend to decrease the failure rate and prolong the length of time that the focal post remains the most current expression of the meme. This result is consistent with hypothesis 3, that memes compete for user attention.

The right three columns report results for a post’s score. As before, more past posts of a meme decrease a post’s score, but so do more past posts of other memes. It was argued above that the own meme effect reflected weariness with the meme. If so, it would follow that users who are weary of other memes would tend to score this meme higher. The result instead suggests that this interpretation may not be accurate. It is more consistent with weariness with memes in general.

The result for other memes’ scores is consistent with competition across memes. Higher own past scores increase the focal post’s score, but higher past scores for other memes decrease the focal post’s score. This suggests that the “afterglow” from a compelling post is limited to the post’s meme and diverts scores from other memes. In sum, the results for post scores are not entirely consistent with meme competition.

Learning-by-Doing in Meme Production

Learning-by-doing can take a few different forms in this context. First, I consider how meme scores relate to the number of meme posts a creator produces. The first column of table 6 reports regression results for the average score a meme creator’s posts will achieve on the number of posts she will create over the sample. The positive coefficient indicates that more prolific meme creators’ posts tend to be better received. This could be due to either learning how to create better posts with additional contributions or due to better creators posting more often.

The second column relates a creator’s score on her first-ever post to the number of posts she will ever create. Since there is no past production before the first post, this relationship is free of learning-by-doing. The positive and significant coefficient indicates that creators who will produce more tend to be better even without experience. That this coefficient is smaller than the one in column one suggests that they are not yet as good as they will become. Finally, column three relates a post’s score to the chronological number of the creator’s posts. Since this specification includes creator fixed effects, time-invariant creator quality does not contribute to the coefficient on the order of the post. The positive coefficient here indicates that a creator’s ability to achieve a higher score increases as she creates more. In sum, while there is evidence of selection, with better creators producing more memes, there is also evidence that creators become better as they produce more.
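The distinction between selection and learning can be illustrated with a toy within-creator (fixed effects) calculation. The data below are entirely hypothetical: three creators with different fixed quality, a common learning effect of 0.1 per post, and better creators posting more. The pooled slope mixes the two effects, while demeaning within each creator recovers only the learning effect. This is a sketch of the general technique, not the specification behind table 6.

```python
def ols_slope(x, y):
    """Slope from a one-regressor least-squares fit with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def within_slope(groups):
    """Fixed-effects slope: demean order and score within each creator,
    then fit the pooled slope on the demeaned data."""
    xd, yd = [], []
    for rows in groups.values():
        mx = sum(order for order, _ in rows) / len(rows)
        my = sum(score for _, score in rows) / len(rows)
        xd += [order - mx for order, _ in rows]
        yd += [score - my for _, score in rows]
    return ols_slope(xd, yd)


# Hypothetical creators: fixed quality and number of posts (assumed values).
LEARNING = 0.1
quality = {"A": 1.0, "B": 2.0, "C": 3.0}
n_posts = {"A": 1, "B": 3, "C": 5}

# Each observation is (order of the post, ln score of the post).
groups = {
    c: [(k, quality[c] + LEARNING * k) for k in range(1, n_posts[c] + 1)]
    for c in quality
}

all_obs = [obs for rows in groups.values() for obs in rows]
pooled = ols_slope([o for o, _ in all_obs], [s for _, s in all_obs])
within = within_slope(groups)
```

In this toy sample the pooled slope is well above 0.1 because only the better creators reach high post orders, whereas the within slope equals the assumed learning effect.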

Next, I consider whether a multi-post creator is prone to reuse the same meme. Having posted using a particular meme once, that meme could be more at the forefront of her thoughts. She may be more aware when a future circumstance would work especially well with the meme. If so, she may achieve a higher score because she knows how to better use the meme. A simple test is to see how many times a creator repeats her usage of a specific meme. Figure 7 provides histograms of the fraction of times a meme is reused for creators posting two, three, four, and five times. If meme choice were purely random, repeating the meme would occur with the probability that the meme was used in the first post. As shown in table 1, the most popular meme is used about 13% of the time. Random meme use would generate meme reuse as high as 13% only if the creator’s first post used the most popular meme.5 Instead, creators who posted twice repeat their first meme over 60% of the time. Likewise, creators who post three times repeat the same meme two more times about 40% of the time, more often than repeating it only once or not repeating at all. The probability of randomly using even the most popular meme three times out of three is about 13% × 13% × 13% ≈ 0.22%. The histograms in figure 7 for four- and five-post creators also display much more mass at the tail than would have occurred randomly.

5 The 13% is based on the most popular meme. If the meme choice was also random, the probability of repeating would be less than 1%.
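The random-reuse benchmark for the most popular meme can be checked with a few lines of arithmetic. The sketch below uses the post counts for Drake Hotline Bling from table 1 and the observed reuse rates quoted in the text ("over 60%" and "about 40%", taken here as 0.60 and 0.40); the variable names are illustrative only.

```python
# Counts from table 1: the most popular meme and the full sample.
posts_top_meme = 9_121
posts_total = 68_325

p = posts_top_meme / posts_total  # share of the most popular meme

# Benchmarks if every post were an independent random draw.
p_repeat_once = p        # second post matches, given the first used this meme
p_repeat_twice = p ** 2  # two further posts match, given the first
p_three_of_three = p ** 3  # all three posts use this meme, unconditionally

# Observed reuse rates quoted in the text, relative to the benchmarks.
ratio_two_posts = 0.60 / p_repeat_once
ratio_three_posts = 0.40 / p_repeat_twice

print(f"share of top meme: {p:.1%}")
print(f"random repeat twice: {p_repeat_twice:.2%}")
print(f"random three of three: {p_three_of_three:.2%}")
```

Even under the most favorable benchmark, observed reuse exceeds random reuse by a factor of more than four for two-post creators and by more than twenty for three-post creators.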

Finally, I consider whether a multi-post creator is prone to create posts of a meme when the meme is more likely to produce a higher score. One aspect of learning-by-doing may simply be having a feel for when the audience will be receptive to a particular post. In addition to creating better posts, a more prolific creator may know when to create a post. To test for learned strategic timing, for each post, I relate the amount of the post creator’s production to the expected score based on equation (2) above. That is, I use a meme’s past experience variables as instrumental variables to form an expected score for a post. The number of posts made by a creator is then related to this expected score. If more experienced creators are better able to exploit this expectation, we would expect a positive relationship between the post creator’s experience and the expected post score. Table 7 reports the coefficient of the expected score for both the post creator’s current level of production and her overall level of production. In both cases, more experienced creators appear to select times in which the post is expected to score lower, not higher. This suggests that meme creators are not learning how to exploit timing strategically.
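The two-step logic of this test can be sketched on toy data. Everything below is hypothetical: a single instrument (the log of a meme's past posts), made-up values, and a hand-rolled one-regressor least-squares helper. The fixed effects and the full instrument set are omitted, and a manual two-step like this does not produce correct second-stage standard errors, so it only illustrates how the sign of the coefficient arises.

```python
def ols(x, y):
    """Return (intercept, slope) of a one-regressor least-squares fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical toy data, one entry per post (assumed values).
ln_past_posts = [0.0, 0.7, 1.1, 1.4, 1.6, 1.8]          # instrument
ln_score = [3.0, 2.9, 2.7, 2.8, 2.5, 2.4]               # realized score
ln_creator_experience = [0.0, 0.0, 0.7, 0.0, 1.1, 1.4]  # creator's production

# Step 1: form the expected score from the meme's past experience.
a, b = ols(ln_past_posts, ln_score)
expected_score = [a + b * z for z in ln_past_posts]

# Step 2: relate the creator's experience to the expected score.
c, d = ols(expected_score, ln_creator_experience)
```

In this toy sample, experienced creators post when the meme is already crowded and the expected score is low, so the second-step slope `d` is negative, the same sign pattern that table 7 reports.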

V. Conclusion

The use of Internet memes continues to evolve, with short video clips currently gaining in popularity. But no matter the form they take, if they command user attention, marketers will find a way to exploit this focused attention. This study attempts to understand certain aspects of the dynamics of meme creation. First, panel estimators were developed to parameterize the evolution of meme popularity and perceived quality. These estimators include a two-way fixed effects implementation of a difference-in-difference estimator to better identify the source of variation. The dynamics appear to conform to implications from a simple supply and demand model where the price cannot differ from zero. Shifts in demand are met by a relatively elastic supply of meme expressions. Moreover, creators can anticipate day-to-day variation in demand due to some demand persistence. Memes compete with each other for audience attention.

Second, some aspects of the different meme expressions produced by a meme creator were examined. Multi-post creators’ posts tend to be better received. This is true for their initial post, but the Reddit score also tends to rise as a creator posts more, even controlling for a creator’s fixed effect. This suggests both that inherently better creators will post more and that there is learning-by-doing in meme production. The finding that a creator tends to reuse memes suggests that a source of the learning-by-doing could be improved familiarity with perceptions of a meme. There is no evidence for another possible source of learning-by-doing, namely strategically timing one’s posts for the days in which the meme is in greater demand.

Extrapolating from these results to meme influencer production may be problematic. It is not known if any Reddit posters are professional influencers. Moreover, influencers seem to tend toward the novelty that a new meme creation generates. These might be among the unmatched memes not analyzed here. With these caveats, these results suggest that established meme creators will be better at meme creation than new entrant meme creators. This learning-by-doing could generate a first-mover advantage that represents a substantial entry barrier. If so, a few meme influencers would earn substantial rents while fringe influencers earn only small amounts.

References

Amalia, A., Sharif, A., Haisar, F., Gunawan, D. and Nasution, B. B., “Meme Opinion Categorization by Using Optical Character Recognition (OCR) and Naïve Bayes Algorithm,” 2018 Third International Conference on Informatics and Computing (ICIC), Palembang, Indonesia, 2018, pp. 1-5.

Coscia, Michele, “Competition and Success in the Meme Pool: a Case Study on Quickmeme.com,” International Conference of Weblogs and Social Media, 2013. arXiv:1304.1712

Coscia, Michele, “Average is Boring: How Similarity Kills a Meme’s Success,” Scientific Reports, 4, 6477, 2014. DOI:10.1038/srep06477

Coscia, Michele, “Popularity Spikes Hurt Future Chances for Viral Propagation of Protomemes,” Commun. ACM, 61(1), January 2018, 70-77. http://doi.acm.org/10.1145/3158227

Gleeson, James P., Ward, Jonathan A., O'Sullivan, Kevin P. and Lee, William T., “Competition-Induced Criticality in a Model of Meme Popularity,” Physical Review Letters, 112(4), 048701-1-5, January 2014. https://doi.org/10.1103/PhysRevLett.112.048701

Guadagno, Rosanna E., Rempala, Daniel M., Murphy, Shannon, Okdie, Bradley M., “What makes a video go viral? An analysis of emotional contagion and Internet memes,” Computers in Human Behavior, 29(6), 2013, 2312-19. https://doi.org/10.1016/j.chb.2013.04.016

He, Saike, Zheng, Xiaolong, and Zeng, Daniel, “A model-free scheme for meme ranking in social media,” Decision Support Systems, 81, January 2016, 1-11. https://doi.org/10.1016/j.dss.2015.10.002

He, Saike, Zheng, Xiaolong, and Zeng, Daniel Dajun, “Modeling online user behaviors with competitive interactions,” Information & Management, 56(4), June 2019, 463-475. https://doi.org/10.1016/j.im.2018.09.007

Ienco, Dino, Bonchi, Francesco, and Castillo, Carlos, “Meme Ranking to Maximize Posts Virality in Microblogging Platforms,” Journal of Intelligent Information Systems, April 2013.

Kar, Sanghamitra, “Meme makers laughing all the way to the bank,” The Economic Times, Jan. 31, 2020.

Luo, C., Zheng, X., and Zeng, D., “Inferring social influence and meme interaction with Hawkes processes,” 2015 IEEE International Conference on Intelligence and Security Informatics: Securing the World through an Alignment of Technology, Intelligence, Humans and Organizations (ISI 2015), Institute of Electrical and Electronics Engineers Inc., 2015, pp. 135-137. https://doi.org/10.1109/ISI.2015.7165953

Roache, Kiley, “Brands are Bypassing Influencers and Targeting Teens with Memes,” Bloomberg, Nov. 27, 2019.

Shabunina, Ekaterina and Pasi, Gabriella, “A graph-based approach to ememes identification and tracking in Social Media streams,” Knowledge-Based Systems, 139(1), January 2018, 108-118.

Spitzberg, Brian H., “Toward A Model of Meme Diffusion (M3D),” Communication Theory, 24(3), August 2014, 311-339. https://doi.org/10.1111/comt.12042

Weng, L., Flammini, A., Vespignani, A. and Menczer, F., “Competition among memes in a world with limited attention,” Scientific Reports, 2, 335, 2012. https://doi.org/10.1038/srep00335

Xie, Lexing, Natsev, Apostol, Kender, John R., Hill, Matthew, and Smith, John R., “Visual memes in social media: tracking real-world news in YouTube videos,” MM '11: Proceedings of the 19th ACM International Conference on Multimedia, November 2011, 53-62. https://doi.org/10.1145/2072298.2072307

Figure 1 Static Meme Example

Figure 2 Weekly Volume of Posts on Meme-Related Subreddits

Figure 3 Average Score of a Meme Post on Reddit

Figure 4 Histogram of Logarithm of Posts’ Scores

Figure 5 Average Number of Comments of a Meme Post on Reddit

Figure 6 Impulse-Response from One Standard Deviation Shock to Posts and Score

Figure 7 Fraction of Memes Repeated in Author Posts

Table 1 Most Popular Memes

                                         Number     Percent    Ln(Reddit Score)
Rank     Meme Name                       of Posts   of Posts   Mean     s.d.
1        Drake Hotline Bling             9,121      13.3%      2.58     2.03
2        Expanding Brain                 7,569      11.1%      2.64     2.30
3        Tuxedo Winnie The Pooh          3,786      5.5%       2.64     1.99
4        Surprised Pikachu               3,632      5.3%       2.48     2.19
5        Unsettled Tom                   2,820      4.1%       2.35     1.87
6        Two Buttons                     2,371      3.5%       2.50     1.96
7        Is This A Pigeon                2,300      3.4%       2.74     2.29
8        Distracted Boyfriend            2,140      3.1%       2.45     2.16
9        The Scroll Of Truth             1,873      2.7%       2.81     2.21
10       Left Exit 12 Off Ramp           1,520      2.2%       2.69     2.21
11       Change My Mind                  1,404      2.1%       2.53     2.01
12       Me And The Boys                 1,124      1.6%       2.53     1.59
13       Well Yes, But Actually No       977        1.4%       2.85     2.16
14       Blank Nut Button                964        1.4%       2.23     2.08
15       Boardroom Meeting Suggestion    901        1.3%       2.77     2.25
16       Futurama Fry                    876        1.3%       2.08     1.92
17       American Chopper Argument       844        1.2%       3.21     2.40
18       Success Kid                     835        1.2%       2.19     1.96
19       Hard To Swallow Pills           804        1.2%       2.99     2.38
20       First World Problems            800        1.2%       1.94     1.58
21       Bad Luck Brian                  786        1.2%       2.05     1.79
22       Philosoraptor                   573        0.8%       1.98     2.01
23       Who Killed Hannibal             571        0.8%       3.14     2.32
24       Car Salesman Slaps Hood         457        0.7%       2.81     2.42
25       Confession Bear                 453        0.7%       2.26     1.96
26       Batman Slapping Robin           451        0.7%       2.03     1.99
27       Who Would Win?                  433        0.6%       3.30     2.44
28       Running Away Balloon            419        0.6%       2.84     1.84
29       Roll Safe Think About It        394        0.6%       2.10     2.07
30       Sleeping Shaq                   383        0.6%       3.04     2.25
31       Scumbag Steve                   378        0.6%       2.02     1.84
32       10 Guy                          372        0.5%       1.76     1.63
33       One Does Not Simply             353        0.5%       1.78     1.80
34       Be Like Bill                    316        0.5%       3.30     2.75
35       Good Guy Greg                   311        0.5%       2.47     1.92
36-511   All Others                      15,014     22.0%      2.11     1.83

Based on 68,325 posts to Reddit between 7 August 2011 and 18 August 2019.

Table 2 The Number of Times Someone Posts a Meme to Reddit

Number of Meme Posts   Number of   Total Number
per Creator            Creators    of Posts
1                      34,629      34,629
2                      6,552       13,104
3                      1,845       5,535
4                      755         3,020
5                      362         1,810
6                      192         1,152
7                      119         833
8                      75          600
9                      56          504
10                     27          270
11                     26          286
12                     17          204
13-95                  65          1,400

Based on 63,347 posts to Reddit between 7 August 2011 and 18 August 2019 with identifiable authorship.

Table 3 Hazard Rates from Cox Regressions of Failure Rate until Next Post of a Meme

                      (1)        (2)        (3)        (4)
Ln Number of Past Posts
  0-24 hours prior    1.744***   1.591***   1.572***   1.567***
                      (0.011)    (0.013)    (0.013)    (0.013)
  24-48 hours prior              1.127***   1.085***   1.080***
                                 (0.009)    (0.009)    (0.010)
  48-72 hours prior                         1.053***   1.041***
                                            (0.008)    (0.009)
  72-96 hours prior                                    1.010
                                                       (0.008)
Ln Average Score of Past Posts
  0-24 hours prior    1.021***   1.025***   1.025***   1.025***
                      (0.005)    (0.005)    (0.005)    (0.005)
  24-48 hours prior              1.023***   1.025***   1.026***
                                 (0.005)    (0.005)    (0.005)
  48-72 hours prior                         1.007      1.007
                                            (0.005)    (0.005)
  72-96 hours prior                                    1.007
                                                       (0.005)
Ln Average Number of Comments on Past Posts
  0-24 hours prior    1.023**    1.021**    1.017*     1.018**
                      (0.009)    (0.009)    (0.009)    (0.009)
  24-48 hours prior              1.005      1.004      1.002
                                 (0.009)    (0.009)    (0.009)
  48-72 hours prior                         1.025***   1.024***
                                            (0.009)    (0.009)
  72-96 hours prior                                    1.024***
                                                       (0.009)

Specifications include fixed effects for the year & month, day of week, the hour of the day, and an indicator for zero posts over the prior period. The model represents stratified estimation where the baseline hazard differs across memes.

Table 4 Regression of Logarithm of Meme Post Score

                      (1)        (2)        (3)        (4)
Ln Number of Past Posts
  0-24 hours prior    -0.083***  -0.078***  -0.072***  -0.071***
                      (0.012)    (0.015)    (0.015)    (0.015)
  24-48 hours prior              -0.025*    0.011      0.026
                                 (0.015)    (0.018)    (0.018)
  48-72 hours prior                         -0.069***  -0.037**
                                            (0.016)    (0.018)
  72-96 hours prior                                    -0.062***
                                                       (0.016)
Ln Average Score of Past Posts
  0-24 hours prior    0.056***   0.054***   0.051***   0.051***
                      (0.010)    (0.010)    (0.010)    (0.010)
  24-48 hours prior              0.020**    0.015      0.013
                                 (0.010)    (0.010)    (0.010)
  48-72 hours prior                         0.039***   0.037***
                                            (0.010)    (0.010)
  72-96 hours prior                                    -0.001
                                                       (0.010)
Ln Average Number of Comments on Past Posts
  0-24 hours prior    0.033*     0.033*     0.034*     0.033*
                      (0.018)    (0.018)    (0.018)    (0.018)
  24-48 hours prior              0.016      0.017      0.017
                                 (0.019)    (0.019)    (0.019)
  48-72 hours prior                         -0.033*    -0.035*
                                            (0.019)    (0.019)
  72-96 hours prior                                    0.026
                                                       (0.019)

Specifications include fixed effects for the year & month, day of week, the hour of the day, and an indicator for zero posts over the prior period. The model includes fixed effects for each meme.

Table 5 Competition for Attention across Memes

                      Failure Rate                     Score
Same Meme – Past Posts
  1 day prior         1.643***   1.517***   1.491***   -0.122***  -0.112***  -0.101***
                      (0.012)    (0.013)    (0.013)    (0.013)    (0.016)    (0.016)
  2 days prior                   1.120***   1.065***              -0.027*    0.003
                                 (0.009)    (0.009)               (0.015)    (0.018)
  3 days prior                              1.077***                         -0.059***
                                            (0.009)                          (0.016)
Same Meme – Past Score
  1 day prior         1.010**    1.013***   1.015***   0.050***   0.048***   0.046***
                      (0.005)    (0.005)    (0.005)    (0.010)    (0.010)    (0.010)
  2 days prior                   1.017***   1.020***              0.015      0.011
                                 (0.005)    (0.005)               (0.010)    (0.010)
  3 days prior                              1.002                            0.035***
                                            (0.005)                          (0.010)
Same Meme – Past Comments
  1 day prior         1.023**    1.021**    1.017*     0.031*     0.030*     0.032*
                      (0.009)    (0.009)    (0.009)    (0.018)    (0.018)    (0.018)
  2 days prior                   1.008      1.007                 0.019      0.020
                                 (0.009)    (0.009)               (0.019)    (0.019)
  3 days prior                              1.024***                         -0.031
                                            (0.009)                          (0.019)
Other Memes – Past Posts
  1 day prior         0.923***   0.936***   0.929***   -0.097***  -0.095***  -0.083***
                      (0.008)    (0.008)    (0.008)    (0.017)    (0.018)    (0.018)
  2 days prior                   1.001      1.017                 0.003      -0.000
                                 (0.013)    (0.014)               (0.028)    (0.030)
  3 days prior                              0.984                            -0.009
                                            (0.013)                          (0.028)
Other Memes – Past Score
  1 day prior         0.979***   0.979***   0.979***   -0.010**   -0.009*    -0.009*
                      (0.002)    (0.002)    (0.002)    (0.005)    (0.005)    (0.005)
  2 days prior                   0.982***   0.980***              -0.015     -0.014
                                 (0.006)    (0.006)               (0.014)    (0.014)
  3 days prior                              1.009                            -0.004
                                            (0.007)                          (0.014)
Other Memes – Past Comments
  1 day prior         1.062***   1.059***   1.056***   0.022      0.021      0.021
                      (0.008)    (0.008)    (0.008)    (0.016)    (0.016)    (0.017)
  2 days prior                   1.039***   1.043***              0.027      0.025
                                 (0.013)    (0.013)               (0.027)    (0.027)
  3 days prior                              0.999                            -0.019
                                            (0.013)                          (0.027)

Specifications include fixed effects for the year & month, day of week, the hour of the day, and an indicator for zero posts over the prior period. The model includes fixed effects for each meme.

Table 6 Meme Score as a Function of Meme Creator Production

                         Ln Score        Ln Score      Ln Score
                         Creator’s       Creator’s     Each Post by
VARIABLES                Average Post    First Post    Creator

Ln Number of Posts       0.627***        0.068***
by Creator               (0.024)         (0.023)
Order of Creator’s                                     0.166***
Post                                                   (0.022)
Creator FE                                             X
Year/Month FE            X               X             X

Observations             43,543          43,321        61,480
R-squared                0.012           0.027         0.016

Standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1

Table 7 The Effect of a Post’s Expected Score on Post Creator’s Experience

                   Post Creator’s        Post Creator’s
                   Current Experience    Overall Experience

Expected Score     -0.060***             -0.073***
                   (0.019)               (0.024)

Specifications include fixed effects for each meme, the year & month, day of week, and the hour of the day. Instruments include the number of meme posts and their average scores for the previous three days. Standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1