Take our example from before, in which three random integers between 0 and 2 inclusive are summed to find an output. We found the probability of the answer being any of 7 possible values:

Now, let's assume that we take a measurement and determine with 100% confidence that the output of the system is '3'. What is the probability that each of the inputs takes the value 0, 1, or 2? To determine this, we can go back to our exhaustively enumerated chart and look only at the rows in which the output equals three. As before, we count the number of times each value appears, this time in each input column. For the case in which output = 3, we would look only at the rows highlighted in yellow. The problem is symmetrical, so we only need to do this for one of the three inputs.
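The row-filtering and counting described above can be sketched in a few lines of Python. This is only an illustration, not the spreadsheet used for the original chart; the function name `conditional_input_distribution` and its parameters are made up for this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def conditional_input_distribution(observed_sum, values=(0, 1, 2), n_inputs=3):
    """Probability of each value for the first input, given the observed sum."""
    # Exhaustively enumerate every combination of inputs (27 rows here),
    # then keep only the rows whose output matches the measurement.
    rows = [r for r in product(values, repeat=n_inputs) if sum(r) == observed_sum]
    if not rows:
        raise ValueError("observed_sum is not a possible output")
    # The problem is symmetrical, so counting the first column is enough.
    counts = Counter(r[0] for r in rows)
    return {v: Fraction(counts[v], len(rows)) for v in values}


dist = conditional_input_distribution(3)
for value, prob in dist.items():
    print(value, prob)
```

For an output of 3, seven rows survive the filter: the six orderings of (0, 1, 2) plus (1, 1, 1). That gives the first input a probability of 2/7 of being 0, 3/7 of being 1, and 2/7 of being 2.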

These then are the probabilities for the input taking a certain value, under various conditions on the output:

The conditional probability distribution for each output value looks like this:

Let's try mapping this idea onto the continuous-domain example that we considered in the previous post. In that case, you'll remember, we specified that each input followed a uniform distribution between 0 and 2, with the same system. Let's assume that we've measured the output, via some terribly clever means, to be between 2.5 and 3. Bucketing the results into intervals of tenths, we see:
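A rough sketch of how such a conditioned Monte Carlo run could be set up is shown here. It assumes three independent inputs drawn uniformly from 0 to 2 and summed, and it buckets by rounding to the nearest tenth. The function name, the run count, and the seed are illustrative choices, not the settings behind the charts in this post.

```python
import random
from collections import Counter


def conditioned_input_histogram(n_runs=100_000, lo=2.5, hi=3.0, seed=0):
    """Distribution of the first input over runs whose sum lies in [lo, hi]."""
    rng = random.Random(seed)
    counts = Counter()
    kept = 0
    for _ in range(n_runs):
        inputs = [rng.uniform(0, 2) for _ in range(3)]
        # Discard every run that disagrees with the measured output.
        if lo <= sum(inputs) <= hi:
            kept += 1
            # Bucket the first input by rounding to the nearest tenth.
            counts[round(inputs[0], 1)] += 1
    if kept == 0:
        raise ValueError("no runs matched the measured output range")
    # Normalise counts into the fraction of kept runs per bucket.
    return {bucket: counts[bucket] / kept for bucket in sorted(counts)}, kept


hist, kept = conditioned_input_histogram()
print(f"kept {kept} runs")
for bucket, share in hist.items():
    print(f"{bucket:.1f}  {share:.4f}")
```

Only the runs that pass the filter contribute to the histogram, so the effective sample size is `kept`, not `n_runs`.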

In a plot, we can see this is significantly different from the original, assumed uniform distribution for all possible cases:

We're getting a fair amount of noise in this chart because we selected only a subset of our Monte Carlo runs, so the sample behind the chart is rather small. When we increase the number of runs to 1 million, the result is significantly cleaner:

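The drop-offs at either end are an artifact of rounding into buckets instead of assigning bucket edges explicitly. Because the input is never negative, the bucket from -.05 to .05 only collects values between 0 and .05, so it naturally holds half as many samples as the bucket from .05 to .15, even in what would otherwise have been an even distribution. The same thing happens at the top end, around 2.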
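To see the edge effect in isolation, here is a small sketch that buckets plain uniform samples on 0 to 2 in two ways: by rounding to the nearest tenth, and with explicit edges at 0, 0.1, ..., 2.0. The helper `bucket_counts` and the sample size are assumptions made for this illustration.

```python
import random
from collections import Counter


def bucket_counts(samples, explicit_edges):
    """Count samples per bucket, keyed by integer bucket index."""
    counts = Counter()
    for x in samples:
        if explicit_edges:
            # 20 equal-width buckets: [0, 0.1), [0.1, 0.2), ..., [1.9, 2.0]
            index = min(int(x * 10), 19)
        else:
            # 21 buckets centred on 0.0, 0.1, ..., 2.0; the two end
            # buckets are only half covered by the range of the data.
            index = round(x * 10)
        counts[index] += 1
    return counts


rng = random.Random(1)
samples = [rng.uniform(0, 2) for _ in range(200_000)]

rounded = bucket_counts(samples, explicit_edges=False)
explicit = bucket_counts(samples, explicit_edges=True)

# Compare the first bucket with one from the middle of the range.
end_ratio_rounded = rounded[0] / rounded[10]
end_ratio_explicit = explicit[0] / explicit[10]
print(f"rounded buckets:  end/middle = {end_ratio_rounded:.2f}")
print(f"explicit buckets: end/middle = {end_ratio_explicit:.2f}")
```

With rounding, the end bucket should come out at roughly half the size of a middle bucket. With explicit edges, the two should be about equal.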

In a future post, I'd like to apply these analyses to the model we created about views on YouTube, and see if we can get a statistical baseline for how people share videos online.

Also, I'd like to see if we can recover an unknown input distribution by iterating our model with 'observed' values. I'd then like to see if we can make sense of what this process would look like if there were a distribution of confidence in our measured output.