About 16,000 COVID-19 case records temporarily vanished after a failed Excel import



On October 4, 2020, Public Health England (PHE) announced that 15,841 positive test results for the new coronavirus infection (COVID-19) had been temporarily lost. Major British media outlets have pointed to Excel as the cause of this mysterious data loss.

PHE statement on delayed reporting of COVID-19 cases - GOV.UK
https://www.gov.uk/government/news/phe-statement-on-delayed-reporting-of-covid-19-cases

How Excel may have caused loss of 16,000 Covid tests in England | Health policy | The Guardian
https://www.theguardian.com/politics/2020/oct/05/how-excel-may-have-caused-loss-of-16000-covid-tests-in-england

How does the PHE statistics error change the UK Covid picture? | Society | The Guardian
https://www.theguardian.com/society/2020/oct/05/how-does-the-phe-statistics-error-change-the-uk-covid-picture

Botched Excel import may have caused loss of 15,841 UK COVID-19 cases | Ars Technica
https://arstechnica.com/tech-policy/2020/10/excel-glitch-may-have-caused-uk-to-underreport-covid-19-cases-by-15841/

On October 4, 2020, PHE revealed that positive COVID-19 test results for about 16,000 people had been temporarily lost. Three-quarters of the lost records were collected between September 25 and October 2, 2020. The accompanying graph shows the lost data in red and the normally saved data in gray.



PHE explained the cause of the data loss only briefly, saying that 'there was a problem with the process of automatically transferring data to the reporting dashboard.' According to the British press, however, the problem with this automated transfer process was actually an 'Excel problem.'

According to UK media outlets, test data collected from local medical institutions is stored in CSV format and sent to PHE. PHE manages the received CSV files in Excel, but because the version of Excel in use was old, the files were saved in the legacy 'XLS' format.

XLS is the file format that was standard in Excel up to and including Excel 2003, and a single worksheet in this format cannot hold more than 65,536 rows. When PHE merged the CSV data collected from medical institutions across the country, the spreadsheet is believed to have exceeded 65,536 rows, and the rows beyond that limit were lost.
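The row limit described above can be illustrated with a short Python sketch. This is not PHE's actual pipeline, which has not been published in detail; the function names (`merge_csv_rows`, `rows_lost_in_xls`, `check_fits_xls`) and the assumption of one header row per CSV file are hypothetical and chosen only for illustration. The idea is to count the merged rows before saving and to fail loudly rather than drop the overflow silently.

```python
import csv
import io

# Row limit of a single worksheet in the legacy .xls format.
XLS_MAX_ROWS = 65_536


def merge_csv_rows(csv_texts):
    """Concatenate the data rows of several CSV documents.

    Assumption for this sketch: every document starts with the same
    single header row, which is kept once and skipped afterwards.
    """
    header = None
    rows = []
    for text in csv_texts:
        reader = csv.reader(io.StringIO(text))
        file_header = next(reader, None)
        if file_header is None:
            continue  # empty document, nothing to merge
        if header is None:
            header = file_header
        rows.extend(reader)
    return header, rows


def rows_lost_in_xls(data_row_count, header_rows=1):
    """Return how many data rows would not fit on one .xls worksheet."""
    capacity = XLS_MAX_ROWS - header_rows
    return max(0, data_row_count - capacity)


def check_fits_xls(rows):
    """Raise instead of letting rows past the limit disappear silently."""
    lost = rows_lost_in_xls(len(rows))
    if lost:
        raise ValueError(
            f"{lost} rows exceed the .xls limit of {XLS_MAX_ROWS} rows"
        )
```

For example, merging two files of 40,000 records each gives 80,000 data rows. With one header row, a single XLS worksheet has room for 65,535 of them, so `rows_lost_in_xls(80000)` reports 14,465 rows that would not be saved.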



In the United Kingdom, the number of new COVID-19 cases per day exceeded 10,000 for the first time on October 3, 2020, and a 'second wave of infection' is said to be spreading. Because of the data loss, the affected records were not passed on to the UK's COVID-19 contact tracing app while they were missing. As a result, media outlets have accused PHE of 'increasing the number of people going out without noticing the infection.'

PHE said that all temporarily lost data had been recovered by 1:00 am on October 3, 2020, and commented, 'We will take precautions to prevent such errors from occurring in the future.'

in Software, Posted by log1k_iy