Statistical Data Analysis

Data expert sees narrow 'Remain' vote in UK's Brexit poll

How much importance should we attach to the opinion polls in the lead-up to the EU referendum on June 23, when the UK will vote to stay in or leave the union? The outcome of the 2015 UK General Election, which most polls failed to predict, suggests that polls can tell us very little about the so-called ‘Brexit’ decision. Pollsters’ inexperience with referendums, differences in turnout propensity between ‘stayers’ and ‘leavers’, and undecided voters who may sway the result at the last minute by voting for the status quo are all factors that could contribute to inaccurate predictions.

However, Partha Sen, CEO and co-founder of data analytics firm Fuzzy Logix, believes he can blend his company’s technology with information available from various leading polls to predict the outcome. The following is an edited version of a blog post Sen has written on the topic.


There is no doubt that the Brexit polls could be wrong. The problem with opinion polls for this type of referendum is that the underlying statistical methods assume that the split between those who want to stay and those who want to leave is relatively even across all parts of the nation. Unfortunately for the pollsters, that is absolutely not the case. Scotland, for example, favours staying in the EU by a significant margin, but in the north of England there are regions where there is overwhelming support for a Brexit. So if a pollster takes random samples from different regions across the country, but the sample from any one region is not representative of that region’s underlying population, the pollster’s conclusions will likely be wrong.

Every poll has a margin of error and some pollsters publish it. The calculation for the margin of error is based on the assumption of geographically uniform distribution. When this assumption is violated, the margin of error gets magnified quite significantly.
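To make the point concrete, here is a minimal sketch of the textbook margin-of-error formula for a proportion under simple random sampling. It is not taken from any pollster’s published method. The `design_effect` parameter is a standard survey-statistics adjustment used here only to illustrate how the margin grows when the sample is less uniform than assumed, and the value 2.0 is a hypothetical chosen for illustration.

```python
import math


def margin_of_error(p, n, z=1.96, design_effect=1.0):
    """Approximate half-width of a 95% confidence interval for a proportion.

    p: estimated share (e.g. 0.5 for an evenly split vote)
    n: number of respondents
    z: normal quantile (1.96 corresponds to roughly 95% confidence)
    design_effect: variance inflation relative to simple random sampling;
        1.0 means the simple-random-sampling assumption holds exactly.
    """
    return z * math.sqrt(design_effect * p * (1.0 - p) / n)


# Same poll of 1,000 people, with and without a hypothetical design effect.
srs = margin_of_error(0.5, 1000)
clustered = margin_of_error(0.5, 1000, design_effect=2.0)
print(f"simple random sample: +/-{srs:.1%}")
print(f"with design effect 2: +/-{clustered:.1%}")
```

The margin scales with the square root of the design effect, so even a modest departure from the sampling assumption widens the interval noticeably when the race is close.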

I have been looking for data on the Referendum vote to substantiate the idea that the vote may be tight. Recently, market research firm YouGov published an interactive map of the UK with each county colour-coded for Euroscepticism.

YouGov analysed the behaviour of more than 80,000 people in the UK and created this map. In this case, the sample size is obviously pretty robust. I have merged the data from YouGov with the registered voters in the UK to produce the following chart.


Chart A: Number of voters by Eurosceptic/Europhile counties


As you can see, of the 46 million voters in the UK, about 20 million reside in counties that are sceptical towards the EU. That is more than the 13 million who live in counties that favour being part of the EU (the Europhiles). A further 12 million or so voters live in counties where the stay/leave viewpoint is evenly matched. If YouGov’s survey is accurate, the proponents of ‘stay in the EU’ could have a relatively tough battle ahead.

I thought of doing something innovative. In the interactive map, YouGov has ranked the counties in increasing order of Euroscepticism. I made an assumption that, in the most Eurosceptic counties, only 44% would want to stay in the EU whereas, in the most Europhile counties, 60% would prefer to stay in the EU. These assumptions are based on some of the polls that state the weakest and strongest support for Brexit in various regions, age groups, political affiliations, etc. I then assigned each county a level of support for staying, within my assumed range of 44% to 60%, based on its Euroscepticism ranking from YouGov. Then, based on a 65% turnout, I calculated the total votes for staying in, versus leaving, the EU. These calculations show that the referendum is tight but that the ‘remain’ vote will prevail by about 327,000 votes (see Chart B).
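The calculation described above can be sketched roughly as follows. This is an illustration, not the model actually used: it assumes the stay share falls linearly with rank between 60% and 44%, which the post does not state explicitly, and the three electorates are placeholder numbers rather than real county data. The function names are hypothetical.

```python
def stay_share(rank, n_counties, low=0.44, high=0.60):
    """Assumed stay share for a county, by Euroscepticism rank.

    rank 0 is the least Eurosceptic county (share = high) and
    rank n_counties - 1 is the most Eurosceptic (share = low).
    Linear interpolation in between is an assumption of this sketch.
    """
    if n_counties == 1:
        return (low + high) / 2.0
    fraction = rank / (n_counties - 1)
    return high - fraction * (high - low)


def project(electorates, turnout=0.65):
    """Return (stay_votes, leave_votes) for counties ordered by rank."""
    n = len(electorates)
    stay = leave = 0.0
    for rank, registered in enumerate(electorates):
        votes_cast = registered * turnout
        share = stay_share(rank, n)
        stay += votes_cast * share
        leave += votes_cast * (1.0 - share)
    return stay, leave


# Placeholder electorates, ordered from least to most Eurosceptic.
electorates = [1_000_000, 1_000_000, 1_000_000]
stay, leave = project(electorates)
print(f"stay: {stay:,.0f}  leave: {leave:,.0f}  margin: {stay - leave:,.0f}")
```

Because the projected margin depends on how many voters sit at each end of the ranking, the result is sensitive to both the assumed range and the turnout figure. For scale, with roughly 46 million voters and 65% turnout, about 29.9 million votes would be cast, so a margin of 327,000 is a little over one percentage point.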


So, which counties could contribute most to Brexit? These are the counties that will be targeted for sustained campaigning in favour of staying in the EU (see Chart C).


I am pretty sure that the larger counties on this list, including Essex, Kent and Hampshire, will be targeted even more by the proponents of the EU in the final days of campaigning.

Until recently, I believed that the margin of victory for a ‘stay’ vote could be between about 1.2 and 1.5 million votes. The current analysis, however, shows the margin to be much tighter than that.

But, again, this analysis is based on data from only one source: YouGov. I wanted to project the UK electoral data onto the Eurosceptic map and see if the ‘Remain’ camp still wins. Well, it does, but I do believe that the projected winning margin is affected by the way the counties have been classified, especially the ones that are classified as mixed.

Next week, I will write a follow-up blog post which will attempt to give an even more accurate prediction of the outcome of the vote.


