Researchers develop an algorithm to predict the second wave of the new coronavirus from real-time Google and Twitter data



The new coronavirus disease (COVID-19) continues to rage around the world: at the time of writing, the total number of infected people stood at 11.5 million and the number of deaths had exceeded 530,000. In Tokyo, too, the number of new infections per day has exceeded 100 for five consecutive days following the lifting of the state of emergency, raising concern that a second wave is arriving. Against this backdrop, a research team including members from Harvard University and Northeastern University has developed an algorithm that predicts a second wave of the new coronavirus from real-time Google and Twitter data.

An Early Warning Approach to Monitor COVID-19 Activity with Multiple Digital Traces in Near Real-Time
https://arxiv.org/abs/2007.00756



Can an Algorithm Predict the Pandemic's Next Moves? - The New York Times
https://www.nytimes.com/2020/07/02/health/santillana-coronavirus-model-forecast.html

Social distancing policies are said to be effective in suppressing the spread of COVID-19, but with local economies and society to manage, governments must at some point decide to loosen restrictions. Policymakers in each country are making these decisions carefully, relying on figures such as the number of new cases, the number of deaths, and hospital bed occupancy, in order to prevent a second wave of COVID-19 as far as possible.

However, published figures such as the number of infected people are said to reflect the situation roughly two weeks earlier, because of the incubation period between infection and the onset of symptoms and the time lag between onset and going to a hospital for testing. In other words, if countermeasures are triggered by an alert system based on figures such as case counts and deaths, they may come too late to stop the spread of COVID-19.

Meanwhile, researchers at Harvard University and Northeastern University have announced an algorithm that predicts a COVID-19 outbreak more than two weeks in advance. According to the paper posted to arXiv, the algorithm analyzes multiple real-time data streams, such as Google searches, smartphone location information, and social media posts, to predict COVID-19 trends. In 2008, Google engineers developed a model for predicting influenza epidemics by tracking search trends for terms such as 'fatigue,' 'joint pain,' and 'Tamiflu dose.' That model itself was not very accurate, but many researchers have since worked on predicting infectious disease outbreaks using real-time data.

In addition to Google searches, the new algorithm analyzes COVID-19-related Twitter posts with location information, data from 'UpToDate' (a clinical decision support tool for doctors), anonymized location data collected from smartphones, and body temperature data uploaded from the 'Kinsa Smart Thermometer.' These data are combined with a prediction model developed by researchers at Northeastern University to predict COVID-19 outbreaks.
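As a rough illustration of how several data streams might be merged into a single indicator, the sketch below rescales each stream to the 0-1 range and takes a weighted average. This is not the research team's actual model; the function names, sample values, and weights are all hypothetical and chosen only to show the idea.

```python
def min_max(series):
    """Rescale a series to the 0-1 range so different sources are comparable."""
    lo, hi = min(series), max(series)
    if hi == lo:
        # A flat series carries no trend information.
        return [0.0 for _ in series]
    return [(v - lo) / (hi - lo) for v in series]


def composite(signals, weights):
    """Weighted average of rescaled signals, one value per day.

    signals: dict mapping a source name to a list of daily values
    weights: dict mapping the same source names to non-negative weights
    """
    total = sum(weights[name] for name in signals)
    scaled = {name: min_max(values) for name, values in signals.items()}
    n_days = len(next(iter(scaled.values())))
    return [
        sum(weights[name] * scaled[name][t] for name in scaled) / total
        for t in range(n_days)
    ]


# Hypothetical daily values for two sources (not real data).
signals = {
    "search": [0, 5, 10],
    "tweets": [100, 100, 300],
}
# Hypothetical weights; the paper tunes its weighting, these are made up.
weights = {"search": 1.0, "tweets": 3.0}

indicator = composite(signals, weights)
print(indicator)  # [0.0, 0.125, 1.0]
```

Rescaling first keeps a source with large raw numbers, such as tweet counts, from dominating a source with small ones, so the weights alone decide how much each stream contributes.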

When the research team applied the algorithm to US data from March to April 2020 and optimized the weighting of the data sources, it was reportedly able to predict COVID-19 outbreaks an average of 21 days in advance.
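One simple way to picture what "days in advance" means is to shift a real-time signal against the confirmed case curve and find the shift at which the two line up best. The sketch below does this with a lagged Pearson correlation on synthetic data. It is an assumption-laden illustration, not the method used in the paper; the helper names (`pearson`, `best_lead`, `bump`) and the 14-day offset built into the synthetic data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def best_lead(signal, cases, max_lead):
    """Find the lead (in days) at which signal[t] best matches cases[t + lead].

    Both series must have the same length and cover the same dates.
    """
    best_days, best_r = 0, float("-inf")
    for lead in range(max_lead + 1):
        x = signal[: len(signal) - lead]  # early part of the signal
        y = cases[lead:]                  # cases `lead` days later
        r = pearson(x, y)
        if r > best_r:
            best_days, best_r = lead, r
    return best_days, best_r


def bump(t, center, width=7.0):
    """Bell-shaped synthetic curve peaking on day `center`."""
    return math.exp(-((t - center) ** 2) / (2 * width ** 2))


# Synthetic data: the search signal peaks 14 days before confirmed cases.
cases = [bump(t, 60) for t in range(90)]
searches = [bump(t, 46) for t in range(90)]

lead_days, r = best_lead(searches, cases, max_lead=28)
print(lead_days)  # 14 on this synthetic data
```

Real signals are noisy and do not share the exact shape of the case curve, so in practice the best lead would be an estimate with considerable uncertainty rather than a clean number like this one.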



'Most infectious disease modeling projects different scenarios based on assumptions made in advance, but our algorithm observes without making such assumptions,' said Mauricio Santillana, Associate Professor of Pediatrics and Epidemiology at Harvard University. 'Our method can respond to immediate changes in behavior and incorporate them into its predictions.'

Madhav Marathe, a computer scientist at the University of Virginia, said, 'We know that no single data stream is useful on its own. The contribution of this new paper is that it draws on a very wide range of data streams.'



Santillana points out that the newly developed algorithm does not replace traditional epidemic surveillance systems but increases confidence in their results. With this algorithm, he says, policymakers can decide to act right away rather than waiting and watching for another week.

Note that the algorithm cannot predict events that could lead to the spread of infection, such as the protests that took place across the United States following the death of George Floyd. The research team also acknowledges that the predictive accuracy obtained from social media posts and search terms may diminish as people get used to the disease. Public health agencies such as the Centers for Disease Control and Prevention (CDC) also refer to data from social media and other sources, but such data are not central to their epidemic forecasting.

Shweta Bansal, a biologist at Georgetown University, acknowledged that the new COVID-19 prediction algorithm using real-time data is important, but pointed out that the damage caused when the algorithm is wrong would be very large. She therefore argued that the algorithm needs to be validated over time.



in Software, Science, Posted by log1h_ik