James P Houghton

# Modeling Social Trends Online 1

12 Nov 2012

In the previous post on this subject, we constructed a Susceptible-Infectious-Recovered (SIR) model and used it to predict the behavior of viral videos online. To translate the model to the social media sphere, we need to adjust the way the virus spreads. In a social-media-dominated epidemic, the contagion resides primarily in posts on Twitter and Facebook. I've modified the original model, which computed the flow of new viewers as:
watching = "Interested, Excited"*Contact Rate*(Have not seen/Total Population)*Share Rate*Watch Fraction

I've taken out the stock of interested, excited folks, and replaced it with a separate presence of links to the viral video on social media feeds:
watching = Presence on Feeds*(Have not seen/(Have not seen+Have Seen))*SM Watch Fraction
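
As a rough illustration, the two flow equations can be written as plain Python functions. The function names and all numbers are hypothetical. Setting presence on feeds to interested people times contact rate follows the text; treating SM Watch Fraction as share rate times watch fraction is an assumption made only so that the two flows can be compared directly.

```python
def watching_sir(interested, contact_rate, have_not_seen, total_population,
                 share_rate, watch_fraction):
    """New viewers per unit time in the original SIR-style model."""
    susceptible_fraction = have_not_seen / total_population
    return (interested * contact_rate * susceptible_fraction
            * share_rate * watch_fraction)


def watching_social_media(presence_on_feeds, have_not_seen, have_seen,
                          sm_watch_fraction):
    """New viewers per unit time in the social media model."""
    susceptible_fraction = have_not_seen / (have_not_seen + have_seen)
    return presence_on_feeds * susceptible_fraction * sm_watch_fraction


# Illustrative values only (not taken from the post).
flow_sir = watching_sir(interested=100, contact_rate=20,
                        have_not_seen=9000, total_population=10000,
                        share_rate=0.1, watch_fraction=0.5)

# Assumption: presence on feeds = interested * contact rate, and the
# social media watch fraction absorbs share rate * watch fraction.
flow_sm = watching_social_media(presence_on_feeds=100 * 20,
                                have_not_seen=9000, have_seen=1000,
                                sm_watch_fraction=0.1 * 0.5)

print(flow_sir, flow_sm)
```

Under those assumptions the two calls describe the same situation, so the two flows should agree up to floating-point rounding.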

In the modified model, 'presence on feeds' is equivalent to the population of "Interested, Excited" individuals times the contact rate. The two models are functionally identical. There are three ways to see this. The first is a thought experiment: take the original SIR model, and instead of applying it to people, apply it to profiles, so that "have not seen" becomes "profiles whose owners have not seen", and so on. The second way is to apply some calculus to express the driving flow of the model (watching) in terms that are present in both models. Looking at the SIR model:
$\\ watching(t) = encounters(t) \cdot P_w \\ watching(t) = contagious(t) \cdot F_{sus} R_c R_{sh} P_w \\ watching(t) = F_{sus} R_c R_{sh} P_w \int^t \left(watching(w) - recovering(w) \right) \delta w\\ watching(t) = F_{sus} R_c R_{sh} P_w\int^t \left( watching(w) - \frac{contagious(w)}{delay} \right) \delta w\\ watching(t) = F_{sus} R_c R_{sh} P_w\int^t \left( watching(w) - \frac{watching(w)}{F_{sus} R_c R_{sh} P_w \cdot delay} \right) \delta w\\ watching(t) = \left( F_{sus} R_c R_{sh} P_w -\frac{1}{delay}\right)\int^t watching(w) \delta w\\$
and now the same for the Social Media model:
$\\ watching(t) = encounters(t) \cdot P_w \\ watching(t) = feed(t) \cdot F_{sus} \cdot P_w \\ watching(t) = F_{sus} \cdot P_w\int^t \left(sharing(w) - decaying(w) \right) \delta w\\ watching(t) = F_{sus} \cdot P_w\int^t \left(watching(w) \cdot R_c \cdot R_{sh} - \frac{feed(w)}{delay} \right) \delta w\\ watching(t) = F_{sus} \cdot P_w\int^t \left(watching(w) \cdot R_c \cdot R_{sh} - \frac{watching(w)}{delay \cdot F_{sus} \cdot P_w} \right) \delta w\\ watching(t) = \left( F_{sus} R_c R_{sh} P_w -\frac{1}{delay}\right)\int^t watching(w) \delta w\\$
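
As a short aside on what that shared final line implies: if the susceptible fraction $F_{sus}$ is treated as roughly constant (an assumption that is only reasonable early on, while almost nobody has seen the video), differentiating both sides with respect to $t$ gives a simple growth equation:

$\\ \frac{d}{dt} watching(t) = \left( F_{sus} R_c R_{sh} P_w - \frac{1}{delay} \right) watching(t) \\ watching(t) \propto e^{\left( F_{sus} R_c R_{sh} P_w - \frac{1}{delay} \right) t} \\$

Under that assumption, viewing grows when $F_{sus} R_c R_{sh} P_w > \frac{1}{delay}$ and dies away otherwise, and the condition is the same for both models.
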
The third way to show that they are equivalent is to look at the results of simulation. The fraction of the population which has not seen the video is consistent between models to within round-off error.
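
The comparison can be sketched with a simple Euler integration of both structures. This is a minimal sketch and not the original simulation: the parameter values, the time step, and the choice to seed the feed with `seed * contact_rate * share_rate` posts are all assumptions made for illustration.

```python
def simulate_sir(days, dt, population, seed, contact_rate, share_rate,
                 watch_fraction, delay):
    """Euler-integrate the SIR form; return the have-not-seen fraction per step."""
    have_not_seen = population - seed
    contagious = seed
    fractions = []
    for _ in range(int(round(days / dt))):
        f_sus = have_not_seen / population
        watching = contagious * f_sus * contact_rate * share_rate * watch_fraction
        recovering = contagious / delay
        have_not_seen -= watching * dt
        contagious += (watching - recovering) * dt
        fractions.append(have_not_seen / population)
    return fractions


def simulate_social_media(days, dt, population, seed, contact_rate, share_rate,
                          watch_fraction, delay):
    """Euler-integrate the social media form; same output as simulate_sir."""
    have_not_seen = population - seed
    # Assumption: the seed viewers have already shared, so the feed starts
    # with the posts they would have generated.
    feed = seed * contact_rate * share_rate
    fractions = []
    for _ in range(int(round(days / dt))):
        f_sus = have_not_seen / population
        watching = feed * f_sus * watch_fraction
        sharing = watching * contact_rate * share_rate
        decaying = feed / delay
        have_not_seen -= watching * dt
        feed += (sharing - decaying) * dt
        fractions.append(have_not_seen / population)
    return fractions


# Illustrative parameters only (not taken from the post).
params = dict(days=60, dt=0.05, population=1_000_000.0, seed=10.0,
              contact_rate=20.0, share_rate=0.1, watch_fraction=0.5, delay=2.0)
sir_fraction = simulate_sir(**params)
sm_fraction = simulate_social_media(**params)
max_gap = max(abs(a - b) for a, b in zip(sir_fraction, sm_fraction))
print("largest difference in have-not-seen fraction:", max_gap)
```

With matching initial conditions, the feed stock is just the contagious stock scaled by `contact_rate * share_rate`, so any gap between the two runs should come only from floating-point rounding.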

Now, showing that these two are equivalent isn't by itself that interesting. What is interesting is that this model structure gives us the ability to include additional behaviors which may account for discrepancies between the SIR model and some of the fat tail behaviors we saw in previous posts. There are two particular additions that I would like to explore. The first is word of mouth sharing behavior, and the second is when viewers re-watch the video on their own.

We can add the word of mouth propagation behavior according to the original SIR structure, which (as we now know) is identical to the social media propagation structure. In this case, however, we'll assume that the amount of time that individuals stay interested is significantly longer than the amount of time that posts spend on a Facebook page. This probably reflects actual behavior more accurately:
Assuming for now that the average length of interest is 60 days, we see a behavior in which the reinforcing feedbacks build each other, and then a slightly fatter tail due to new views as
The total number of people watching the video, however, is limited by the balancing effect of a finite population. The tail exists, but it isn't really that fat. If we include views by people who have already seen the video:
Then we have a more defined 'fat tail' which corresponds more realistically to what we observed before:
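
To make the two additions concrete, here is a minimal sketch of a combined model with a short-lived feed stock, a long-lived interested stock, and repeat views. Only the 60-day interest length comes from the text. Everything else is an assumption for illustration: the parameter names and values, the single `wom_rate` standing in for word of mouth contact, share, and watch rates, and the choice to make repeat views proportional to the interested stock without generating new posts.

```python
def simulate_views(days, dt, rewatch_rate, population=1_000_000.0, seed=10.0,
                   posts_per_view=2.0, sm_watch_fraction=0.5, feed_delay=2.0,
                   wom_rate=0.01, interest_delay=60.0):
    """Euler-integrate the combined model; return total views per day at each step."""
    have_not_seen = population - seed
    feed = seed * posts_per_view      # links currently visible on feeds
    interested = seed                 # people still interested in the video
    daily_views = []
    for _ in range(int(round(days / dt))):
        f_sus = have_not_seen / population
        # First-time views driven by the feed and by word of mouth.
        first_views = (feed * sm_watch_fraction + interested * wom_rate) * f_sus
        # Assumption: repeat views come from people who are still interested.
        repeat_views = interested * rewatch_rate
        have_not_seen -= first_views * dt
        feed += (first_views * posts_per_view - feed / feed_delay) * dt
        interested += (first_views - interested / interest_delay) * dt
        daily_views.append(first_views + repeat_views)
    return daily_views


# Hypothetical comparison: with and without repeat views.
views_with_rewatch = simulate_views(days=120, dt=0.05, rewatch_rate=0.02)
views_without_rewatch = simulate_views(days=120, dt=0.05, rewatch_rate=0.0)
print(views_with_rewatch[-1], views_without_rewatch[-1])
```

In this sketch the first-time views are capped by the finite population, while the repeat views fade only as interest decays, which is one way a fatter tail could arise. Plotting the two series on a log scale would be a natural way to compare the tails.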

The ability of this model to fit the actual observed behavior depends entirely upon the parameters that define sharing through social media and by word of mouth, the decay times of each of the 'contagious' stocks, and the re-view rate. In a future post, I'd like to explore how best to discover those parameters, both for a given video and by comparing trends across videos.