### Meaningful criteria for the approximate probability distributions

- The Expected Value (approximated by the Mean or Average) of the given distribution should reflect the aggregate result of all the official polls conducted to date. Note: this criterion is only justifiable if the polls can be considered truly representative of the full voting population. This is, of course, a highly debatable assertion, one extreme view being that "opinion polls are meaningless". For the sake of the current argument, however, the assertion that the polls are indeed representative will be taken as valid.
- The integral of the distribution (the area under the probability density function, or equivalently one minus the cumulative distribution function (cdf) at 0.5) evaluated over the range from 0.5 to 1 (50% to 100%) should correspond to the implied probability of winning ("chance of YES winning" for the YES distribution, "chance of NO winning" for the NO distribution) from the betting exchange data. This is arguably a considerably less justifiable criterion than the previous one, but on the basis that "the bookies are seldom wrong", it seems a reasonable assertion. After all, the betting exchange data is "crowd-sourced" in that it reflects the combined "beliefs" of many thousands of punters, i.e., potential voters. Also, there is no other readily available source of such information, with the exception of the polling data itself (utilised in the previous criterion). However, modelling probability distributions on the polling data alone would lead to much narrower distributions (i.e., suggesting considerably less uncertainty) than is implied by the betting exchange data. Put another way, the betting exchange data in effect incorporates broader (and potentially real) uncertainties than are captured in the polling data alone.
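
The two criteria can be stated compactly. The symbols here are introduced purely for illustration and do not appear in the original text: $X$ is the vote share for one side, $f(x)$ its probability density, $m$ the aggregate poll result for that side, and $p_{\text{win}}$ the implied probability of that side winning taken from the betting exchange.

$$
\mathbb{E}[X] = \int_{0}^{1} x\, f(x)\, dx = m,
\qquad
P(X > 0.5) = \int_{0.5}^{1} f(x)\, dx = p_{\text{win}}
$$

The first condition fixes where the distribution is centred, and the second fixes how much of its mass lies beyond the 50% winning threshold, which in practice controls its width.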

### The *Beta* Distribution

The distribution chosen here is the *Beta Distribution*, which is often used for the purpose at hand, i.e., to devise a meaningful probability distribution for election outcomes involving two choices. One of the benefits of using the *Beta Distribution* is that it is formally (structurally) similar to the *Binomial Distribution*, which is widely used for modelling polling results. Taken together, these two distributions can be combined via *Bayesian Inference* to provide an updated distribution after the most recent poll, which is itself a *Beta Distribution*. This useful and interesting aspect will not be pursued at present.

The *Beta Distribution* has two parameters, denoted A ("alpha") and B ("beta"). Their numerical size determines the sharpness/certainty (or broadness/uncertainty) of the distribution: large values give narrow distributions (i.e., more certainty), and smaller values give broad distributions (i.e., more uncertainty). If A and B are numerically equal, the distribution is symmetric about a Mean value of 0.5 (50%). Unequal values for A and B allow for skewed (non-symmetric) distributions with Mean values different from 0.5.
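
The effect of A and B described above can be checked numerically. The following is a minimal sketch using the standard closed-form mean and standard deviation of a Beta distribution; the function names and the example parameter values are illustrative choices, not taken from the article's data.

```python
import math


def beta_mean(a: float, b: float) -> float:
    """Mean of a Beta(a, b) distribution: a / (a + b)."""
    return a / (a + b)


def beta_std(a: float, b: float) -> float:
    """Standard deviation of Beta(a, b): sqrt(ab / ((a + b)^2 (a + b + 1)))."""
    total = a + b
    return math.sqrt(a * b / (total ** 2 * (total + 1)))


# Illustrative (hypothetical) parameter pairs:
# equal A and B at three sizes, then one unequal pair.
for a, b in [(2, 2), (20, 20), (200, 200), (30, 20)]:
    print(f"A={a:>3}, B={b:>3}: mean={beta_mean(a, b):.3f}, std={beta_std(a, b):.4f}")
```

With equal A and B the mean stays at 0.5 while the spread shrinks as the values grow; the unequal pair shifts the mean away from 0.5.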

The parameters A and B were chosen to give *Beta Distributions* which satisfy the criteria described earlier for both YES and NO. The results of these parameterisations are presented in the distributions below.
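
One way such a parameterisation could be obtained is sketched here; it is not necessarily the procedure used to produce the results in this article. The sketch fixes the Mean by writing A = mean × k and B = (1 − mean) × k, then bisects on k = A + B until the area above 0.5 matches the implied probability. The function names, the search bounds and the input values (a 45% poll aggregate and a 20% implied chance of winning) are all hypothetical.

```python
import math


def beta_pdf(x: float, a: float, b: float) -> float:
    """Beta(a, b) density for 0 < x < 1, evaluated in log space for stability."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def prob_above_half(a: float, b: float, steps: int = 4000) -> float:
    """Approximate P(X > 0.5) with the midpoint rule (never evaluates x = 1)."""
    width = 0.5 / steps
    return width * sum(beta_pdf(0.5 + (i + 0.5) * width, a, b) for i in range(steps))


def fit_beta(mean: float, win_prob: float, k_hi: float = 1e4, iterations: int = 50):
    """Find (A, B) with A / (A + B) = mean and P(X > 0.5) close to win_prob."""
    k_lo = 1.0 / min(mean, 1.0 - mean)  # smallest k searched; A and B are about 1 or more

    def gap(k: float) -> float:
        return prob_above_half(mean * k, (1.0 - mean) * k) - win_prob

    gap_lo = gap(k_lo)
    if gap_lo * gap(k_hi) > 0:
        raise ValueError("the two targets cannot both be met in the searched range")
    for _ in range(iterations):
        k_mid = 0.5 * (k_lo + k_hi)
        if gap(k_mid) * gap_lo > 0:
            k_lo = k_mid  # root lies in the upper half
        else:
            k_hi = k_mid  # root lies in the lower half
    k = 0.5 * (k_lo + k_hi)
    return mean * k, (1.0 - mean) * k


# Hypothetical inputs, not the article's data.
a_yes, b_yes = fit_beta(mean=0.45, win_prob=0.20)
print(f"A={a_yes:.2f}, B={b_yes:.2f}, P(X > 0.5)={prob_above_half(a_yes, b_yes):.4f}")
```

Not every pair of targets is compatible: a side polling below 50% cannot be given a winning chance above 50% by a Beta distribution with that Mean in the searched range, in which case the sketch raises an error rather than returning a poor fit. The complementary distribution for the other side follows by swapping A and B.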