Lies, damn lies and statistics: Revision of South Africa’s GDP figures a reason to start effectively harnessing big data

Prof Tshilidzi Marwala is the Vice-Chancellor and Principal of the University of Johannesburg, a member of the Namibia 4IR Task Force and the author of ‘Leadership Lessons From the Books I Have Read’. He is on Twitter at @txm1971. He recently penned an opinion piece published by Daily Maverick on 19 September 2021.

Benjamin Disraeli, who was prime minister of the United Kingdom first in 1868 and later between 1874 and 1880, was considered a gifted man and a good judge of character. It is often speculated that the decisive military defeat that the British suffered at the hands of the Zulu Kingdom in 1879, in the Battle of Isandlwana, hastened his death two years later. King Cetshwayo was the leader of the Zulu Kingdom at the time, and the duel between Cetshwayo and Disraeli effectively ended Disraeli’s political career.

Disraeli said in his frank characterisation of the Zulu people, “The Zulus are remarkable people, they defeat our generals, they convert our bishops.” Here, Disraeli was referring to John Colenso, a bishop of Natal who was stationed there to use religion to colonise the Zulu people but ended up being sympathetic to their culture and beliefs. Disraeli’s frankness did not end with his assessment of the Zulu people but extended to the field of statistics. He once remarked that there are three types of lies, in order of severity: “lies, damn lies and statistics”. In this regard, Disraeli considered statistics the highest form of lies.

A few weeks back, the statistics on the Gross Domestic Product (GDP) of South Africa were revised, indicating that South Africa’s economy is actually 11% bigger than previously thought. Accordingly, the GDP of South Africa is now estimated at R5.521-trillion, up from the previous R4.973-trillion. The 2020 GDP contraction of 7% has now been revised to 6.4%. The problem with relying on these statistics is that many people planned and executed their plans based on the wrong GDP numbers. The number of bad decisions in our society, politics and economy based on the wrong numbers is incalculable. This revision reinforces Disraeli’s mistrust of statistics. Statistics are sometimes associated with a bad omen. At the height of his reign of terror, Joseph Stalin once remarked, “The death of one man is a tragedy. The death of millions is a statistic.” Recently, in the US, Chris Murphy, the Democratic senator from Connecticut, wrongly claimed that eight out of 10 US drones miss their targets. Whether this was a deliberate fabrication or just incorrectly calculated statistics, we will never know.

But statistics should not be viewed in a negative light. In fact, by definition, statistics are usually an estimation and are never exact. For example, if one needs to know the average height of people in Johannesburg, there are two ways of doing this. The first is to measure the height of everyone in Johannesburg and calculate the average, which is practically impossible. The other is to measure a random selection of 1,000 people and use their average height as an estimate for the whole city. This is called statistical estimation.
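The idea of estimating an average from a random selection can be sketched in a few lines of Python. This is only an illustration: the population below is simulated, and the mean of 168 cm and spread of 9 cm are invented assumptions, not real measurements of Johannesburg residents.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: 200,000 simulated heights in cm (invented numbers).
population = [random.gauss(168.0, 9.0) for _ in range(200_000)]

# Instead of measuring everyone, measure a random selection of 1,000 people.
sample = random.sample(population, 1_000)

population_mean = statistics.fmean(population)  # the "true" average, normally unknown
sample_mean = statistics.fmean(sample)          # the statistical estimate
estimation_error = abs(sample_mean - population_mean)

print(f"population mean: {population_mean:.2f} cm")
print(f"sample estimate: {sample_mean:.2f} cm")
print(f"estimation error: {estimation_error:.2f} cm")
```

The estimate lands close to the true average but not exactly on it, which is the sense in which statistics are an estimation and never exact.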

While the need for statistics is apparent, there are several reasons statistics can be wrong. The first is that the data used may not be comprehensive enough, which is called the small sample problem. The truth is that choosing a sample size is not an exact science, and even though best-practice guidelines exist for statisticians, the choice is still partly subjective. To illustrate the sample size problem using the estimation of an average height in Johannesburg, do we sample 1,000 or 50,000 or 200,000 people?
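One way to reason about the choice between 1,000, 50,000 and 200,000 people is the standard error of the mean, which shrinks with the square root of the sample size. The sketch below assumes a purely random sample and a hypothetical spread of 9 cm in heights; both are assumptions made for illustration.

```python
import math

ASSUMED_STD_CM = 9.0  # hypothetical standard deviation of heights, in cm


def standard_error(std_dev: float, n: int) -> float:
    """Standard error of a sample mean for a simple random sample of size n."""
    return std_dev / math.sqrt(n)


def margin_of_error_95(std_dev: float, n: int) -> float:
    """Approximate 95% margin of error using the normal approximation."""
    return 1.96 * standard_error(std_dev, n)


for n in (1_000, 50_000, 200_000):
    print(f"n = {n:>7,}: +/- {margin_of_error_95(ASSUMED_STD_CM, n):.3f} cm")
```

Under these assumptions the margin of error falls from roughly 0.56 cm to roughly 0.08 cm and 0.04 cm. Quadrupling the sample only halves the error, so the gain from a much larger sample has to be weighed against its cost, which is where the judgment comes in.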

The second is the problem of bias. In this regard, if we select all 1,000 people in our sample from Soweto, then the average height will represent Soweto, not the whole of Johannesburg. To deal with this issue, we ought to select the sample randomly from across the entire city. Randomly selecting places to sample is a complex problem that is better handled by a machine than by a human being.
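The effect of sampling from only one part of a city can be sketched with a simulation. The two areas and their average heights below are invented purely to make the bias visible; they do not describe Soweto or any real place.

```python
import random
import statistics

random.seed(7)  # fixed seed so the sketch is reproducible

# Hypothetical city made of two areas with different average heights (invented numbers).
area_a = [random.gauss(165.0, 8.0) for _ in range(50_000)]
area_b = [random.gauss(171.0, 8.0) for _ in range(150_000)]
city = area_a + area_b

city_mean = statistics.fmean(city)

# Biased design: all 1,000 people come from a single area.
biased_sample = random.sample(area_a, 1_000)
# Random design: 1,000 people drawn from across the entire city.
random_sample = random.sample(city, 1_000)

biased_error = abs(statistics.fmean(biased_sample) - city_mean)
random_error = abs(statistics.fmean(random_sample) - city_mean)

print(f"error when sampling one area only: {biased_error:.2f} cm")
print(f"error when sampling the whole city: {random_error:.2f} cm")
```

Both samples have the same size, so the gap between the two errors comes from where the people were selected, not from how many were measured. A larger sample from the same single area would not remove it.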

Coming back to the problem of South Africa’s GDP data, it is crucial to understand how GDP is measured. In economics, there are two commonly described approaches to estimating GDP: the expenditure and the income approaches. These approaches are intended to measure the amount of goods and services produced in the economy. The expenditure approach calculates the GDP by adding all consumer spending, investment spending and government spending, plus the difference between exports and imports.

The income approach estimates the GDP by adding all national income plus sales taxes plus depreciation plus net foreign factor income, that is, the total income generated by the country’s citizens overseas minus the income generated by foreigners in the country. Theoretically, both these approaches are supposed to yield the same result. However, it must be noted that it is difficult to measure all these factors, and for South Africa, the relatively large underground and informal economies exacerbate this issue.
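The two approaches can be written down as simple sums. The sketch below follows the components named above, with government spending included in the expenditure approach as in the standard national-accounts identity GDP = C + I + G + (X - M). The function names and all the figures are hypothetical, chosen only to show how a gap between the two estimates can arise.

```python
def gdp_expenditure(consumption, investment, government, exports, imports):
    """Expenditure approach: C + I + G + (X - M)."""
    return consumption + investment + government + (exports - imports)


def gdp_income(national_income, sales_taxes, depreciation, net_foreign_factor_income):
    """Income approach, using the components named in the text above."""
    return national_income + sales_taxes + depreciation + net_foreign_factor_income


# Invented figures in R billion, for illustration only (not Stats SA data).
by_expenditure = gdp_expenditure(
    consumption=3_300, investment=800, government=1_100, exports=1_500, imports=1_400
)
by_income = gdp_income(
    national_income=4_200, sales_taxes=600, depreciation=550, net_foreign_factor_income=-70
)

# In theory the two totals match; in practice measurement gaps leave a discrepancy.
discrepancy = by_expenditure - by_income

print(by_expenditure, by_income, discrepancy)
```

Every input to these sums is itself an estimate, so an activity that goes unrecorded, such as much of the informal economy, is missing from both totals no matter which approach is used.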

Given this context, why was the GDP of South Africa revised?

Firstly, Statistics South Africa (Stats SA) included new sources of information to estimate these numbers. Secondly, Stats SA added new compilation methods, which may have included factors such as sample sizes. Thirdly, it refined the classification of economic activities and revised the reference year from 2010 to 2015. These changes resulted in the size of our economy being about R550-billion bigger.
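The size of the revision can be checked with simple arithmetic, using only the two GDP estimates quoted earlier in this article (R4.973-trillion before and R5.521-trillion after). The variable names are mine.

```python
old_gdp_trillion = 4.973  # previous estimate, in R trillion (from the article)
new_gdp_trillion = 5.521  # revised estimate, in R trillion (from the article)

increase_trillion = new_gdp_trillion - old_gdp_trillion
increase_billion = increase_trillion * 1_000
pct_increase = increase_trillion / old_gdp_trillion * 100

print(f"increase: R{increase_billion:.0f}-billion")
print(f"relative increase: {pct_increase:.1f}%")
```

The difference is R548-billion, which rounds to the R550-billion figure, and it is about 11% of the previous estimate.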

The new numbers now indicate a significant growth in finance, business services and property, and a decline in mining, energy and transport. Despite all these changes, the GDP of South Africa is still grossly undervalued, with the informal economy being underestimated.

There is an argument here for us to augment our approach to collecting statistics even further. Injecting technology into the process ensures more accuracy. After all, machines do not suffer from bias and human error.

Despite their flaws, statistics are clearly still an essential instrument for evidence-based decision-making. I would argue that Disraeli was wrong: statistics are not the worst form of lying but just an imperfect instrument for a data-driven economy.

We now live in the era of big data, where the amount of data available is enormous and the technology to process it, such as artificial intelligence and quantum computing, is increasingly powerful. Let us use these tools to improve statistics and make them a more perfect tool for rational decision-making.

The views expressed in this article are those of the author/s and do not necessarily reflect those of the University of Johannesburg.

