The Importance of Bayesian Analysis in Understanding Polling Data
Now that you have constructed a prior model of your support in the upcoming election, let's turn to the next important piece of a Bayesian analysis: the data. In your quest for election to public office, recall that the parameter P denotes the underlying proportion of voters that support you. To gain insight into P, your campaign conducted a small poll and found that X = 6 of n = 10 polled voters, or 60%, support you.
These data provide some evidence about P. For example, you're more likely to observe such poll results if your underlying support were also around 0.6 than if it were, say, below the 0.5 winning threshold. To rigorously quantify the likelihood of the poll results under different election scenarios, we must understand how the polling data X depend on your underlying support P.
To this end, you can make two reasonable assumptions about the polling data. First, voters respond independently of one another. Second, the probability that any given voter supports you is P, your underlying support in the population. In turn, you can view X, the number of the n polled voters that support you, as a count of successes in n independent trials, each having probability of success P.
This might sound familiar. Under these settings, the conditional dependence of X on P is modeled by the binomial distribution with parameters n and P, written X ~ Bin(n, P). The binomial model provides the tools needed to quantify the probability of observing your poll result under different election scenarios. Your poll result, in which X = 6 of n = 10 polled voters (60%) support you, is represented by the red dot. If your underlying support P were only 50%, there would be a roughly 20% chance that a poll of 10 voters would produce X = 6.
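As a quick check on the roughly 20% figure, the binomial probability can be computed directly. This is a minimal sketch rather than the course's own code; the function name binomial_pmf is an assumption introduced here for illustration.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of x successes in n independent trials,
    each with success probability p (hypothetical helper)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Poll result from the text: 6 of 10 polled voters support you.
# Scenario: underlying support P is only 0.5.
prob_at_half = binomial_pmf(6, 10, 0.5)
print(round(prob_at_half, 3))  # 0.205
```

The exact value is 210/1024, which matches the "roughly 20%" quoted above.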
You are less likely to observe such a relatively low poll result if your underlying support P were as high as 80%. Further, it's possible, though unlikely, that you would observe such a relatively high poll result if your underlying support were only 30%. By calculating the likelihood of these poll results for any level of underlying support P between 0 and 1, we can connect the dots and obtain the resulting curve.
This curve represents the likelihood function, which summarises the likelihood of observing polling data X under different values of the underlying support parameter P. The likelihood is thus a function of P that depends upon the observed data X. In turn, it provides insight into which parameter values are most compatible with the poll. Here, we see that the likelihood function is highest for values of support P between 0.4 and 0.8, so these values are the most compatible with the poll. In contrast, small values of support below 0.4 and large values of support above 0.8 have low likelihoods and are less compatible with the poll.
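The curve described above can be traced numerically by evaluating the binomial probability of X = 6 in n = 10 over a grid of candidate values of P. The sketch below assumes an evenly spaced grid of 101 points; the names likelihood, grid, and curve are illustrative and not taken from the course.

```python
from math import comb


def likelihood(p: float, x: int = 6, n: int = 10) -> float:
    """Binomial likelihood of support level p given x supporters among n polled."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Evaluate the likelihood on a grid of support levels from 0 to 1.
grid = [i / 100 for i in range(101)]
curve = [likelihood(p) for p in grid]

# The grid value with the largest likelihood.
best_p = grid[curve.index(max(curve))]
print(best_p)  # 0.6

# Likelihood under the election scenarios discussed in the text.
for p in (0.3, 0.5, 0.6, 0.8):
    print(p, round(likelihood(p), 3))
# 0.3 0.037
# 0.5 0.205
# 0.6 0.251
# 0.8 0.088
```

Note that these values are not probabilities over P and need not sum to 1 across the grid; they only rank how compatible each support level is with the observed poll.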
To conclude, the likelihood function plays an important role in quantifying the insights from our data. Though it's possible to calculate the exact binomial likelihood function, you'll use simulation techniques to approximate and build intuition for the likelihood in the following exercises.
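One way such a simulation might look is sketched below. It is an assumption about the general approach, not the exercises' actual code: for a given support level P, simulate many polls of 10 independent voters and record the fraction of polls in which exactly 6 voters support you. The function names, the number of simulated polls, and the seed are all illustrative choices.

```python
import random


def simulate_poll(p: float, n: int, rng: random.Random) -> int:
    """Simulate one poll: count supporters among n independent voters,
    each supporting you with probability p."""
    return sum(rng.random() < p for _ in range(n))


def approx_likelihood(p: float, x: int = 6, n: int = 10,
                      n_sims: int = 20000, seed: int = 1) -> float:
    """Approximate the likelihood of p as the share of simulated polls
    that produce exactly x supporters."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(simulate_poll(p, n, rng) == x for _ in range(n_sims))
    return hits / n_sims


# Compare a few election scenarios; estimates vary slightly with the seed.
for p in (0.3, 0.5, 0.6, 0.8):
    print(p, approx_likelihood(p))
```

With enough simulated polls, the estimates should settle near the exact binomial values, for example close to 0.2 when P is 0.5.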
"WEBVTTKind: captionsLanguage: ennow that you've constructed a prior model of your support in the upcoming election let's turn to the next important piece of a Bayesian analysis the data in your quest for election to public office recall that parameter P denotes the underlying proportion of voters that support you to gain insight into P your campaign conducted a small pool and found that X equals 6 of N equals 10 or 60% of voters support you these data provide some evidence about P for example you're more likely to observe such poll results if your underlying support were also around 0.6 then say if it were below the point five winning thresholds of course to rigorously quantify the likelihood of the poll results under different election scenarios we must understand how polling data X depend on your underlying support P to this end you can make two reasonable assumptions about the polling data first voters respond independently of one another second the probability that any given voter supports you is P your underlying support in the population in turn you can view X the number of N polled voters that support you as a count of successes in n independent trials each having probability of success P this might sound familiar under these settings the conditional dependence of X on P is modeled by the binomial distribution with parameters and NP communicated by the mathematical notation here the binomial model provides the tools needed to quantify the probability of observing your poll result under different election scenarios this result is represented by the red dot x equals 6 of N equals 10 or 60% of voters support you if your underlying support P we're only 50% there's a roughly 20% chance that a pool of 10 voters would produce X equals 6 you are less likely to observe such a relatively low poll result if your underlying support P where as high as 80% further it's possible though unlikely that you would observe such a relatively high poll result if your underlying 
support were only 30% similarly we can calculate the likelihood of these poll results for any level of underlying support P between 0 & 1 connecting the dots the resulting curve represents the likelihood function the likelihood function summarises the likelihood of observing polling data X under different values of the underlying support parameter P thus the likelihood is a function of P that depends upon the observed data X in turn it provides insight into which parameter values are most compatible with the pool here we see that the likelihood function is highest for values of support P between 0.4 and 0.8 thus these values are the most compatible with the pool in contrast with low likelihoods small values of support below 0.4 and large values of support above 0.8 are not compatible with the pool to conclude the likelihood function plays an important role in quantifying the insights from our data though it's possible to calculate the exact binomial likelihood function you'll use simulation techniques to approximate and build intuition for the likelihood in the following exercisesnow that you've constructed a prior model of your support in the upcoming election let's turn to the next important piece of a Bayesian analysis the data in your quest for election to public office recall that parameter P denotes the underlying proportion of voters that support you to gain insight into P your campaign conducted a small pool and found that X equals 6 of N equals 10 or 60% of voters support you these data provide some evidence about P for example you're more likely to observe such poll results if your underlying support were also around 0.6 then say if it were below the point five winning thresholds of course to rigorously quantify the likelihood of the poll results under different election scenarios we must understand how polling data X depend on your underlying support P to this end you can make two reasonable assumptions about the polling data first voters respond 
independently of one another second the probability that any given voter supports you is P your underlying support in the population in turn you can view X the number of N polled voters that support you as a count of successes in n independent trials each having probability of success P this might sound familiar under these settings the conditional dependence of X on P is modeled by the binomial distribution with parameters and NP communicated by the mathematical notation here the binomial model provides the tools needed to quantify the probability of observing your poll result under different election scenarios this result is represented by the red dot x equals 6 of N equals 10 or 60% of voters support you if your underlying support P we're only 50% there's a roughly 20% chance that a pool of 10 voters would produce X equals 6 you are less likely to observe such a relatively low poll result if your underlying support P where as high as 80% further it's possible though unlikely that you would observe such a relatively high poll result if your underlying support were only 30% similarly we can calculate the likelihood of these poll results for any level of underlying support P between 0 & 1 connecting the dots the resulting curve represents the likelihood function the likelihood function summarises the likelihood of observing polling data X under different values of the underlying support parameter P thus the likelihood is a function of P that depends upon the observed data X in turn it provides insight into which parameter values are most compatible with the pool here we see that the likelihood function is highest for values of support P between 0.4 and 0.8 thus these values are the most compatible with the pool in contrast with low likelihoods small values of support below 0.4 and large values of support above 0.8 are not compatible with the pool to conclude the likelihood function plays an important role in quantifying the insights from our data though it's 
possible to calculate the exact binomial likelihood function you'll use simulation techniques to approximate and build intuition for the likelihood in the following exercises\n"