Generally speaking, contact tracing is the procedure of identifying individuals who may have come into contact with an infected individual. Then, by tracing back the contacts of these infected people, public health professionals aim to reduce infections in the population, applying measures such as selective isolation. Traditional manual contact tracing is a laborious and lengthy task involving personal interviews that, in most cases, provide only vague information about the previous contacts. Consequently, digital contact tracing aims to automate this task, mainly by using users’ smartphones to detect contacts between infected and susceptible people, and trace back contacts.
We start this section by detailing how a risky contact can be determined and the architecture of digital contact tracing, focusing on those aspects that will impact its efficiency. Then we characterise this efficiency which will be used in the proposed model for assessing digital contact tracing. We differentiate between the terms efficiency and effectiveness. Effectiveness is used more as a medical term, that is, how well a treatment works when people are using it, and it can be measured as the number of infected or dead people averted with digital contact tracing. Efficiency is used more as a technological term and allows scientists to evaluate how well digital contact tracing is working, for example, by reducing the number of false alerts or by increasing the tracing speed.
Risk contacts estimation
One of the critical aspects of digital contact tracing is how to estimate risk contacts using the underlying technology of current users’ smartphones. From the beginning of the COVID-19 pandemic, health authorities have considered a risk contact to be someone who has been in close contact (less than two meters away for at least 15 min) with a person who tested positive42,43,44.
Current smartphones can provide several ways to determine these close contacts using localisation and communication technologies, such as GPS, Wi-Fi, Bluetooth, beacons, or even QR codes. The final goal is to provide a method to detect risky contacts with enough precision for contact tracing.
As most COVID-19 applications ultimately adopted Bluetooth for its greater precision and privacy, we briefly summarise this detection technology. When two Bluetooth devices communicate, the sender emits its signal at a certain power level, while the receiver observes this signal at an attenuated power level known as the received signal strength indicator (RSSI). Since attenuation increases with the square of the distance, the distance between two Bluetooth devices can be inferred using a Path Loss Model. The duration of a contact, in turn, is estimated by periodically sending Bluetooth messages and estimating the distance from each one. However, RSSI values typically fluctuate over time and are influenced by other factors such as obstacles and reflections45. Several studies have shown that the RSSI depends on factors such as the relative orientation of the handsets, absorption by the human body, the type of devices used, and whether the contact is indoors or outdoors17,46. For example, the evaluation in ref. 46, which tested several distances and scenarios using the Android Beacon Library, showed an accuracy of close to 50% indoors and of about 70% outdoors.
Nevertheless, this refers only to the accuracy of estimating whether a contact is within the 2-meter range, not to the real exposure to the virus. The risk also depends on many other factors, such as the exposure intensity to the virus, the quality of the medium, and the susceptibility of the non-infected person.
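As a rough illustration of the distance inference described above, the following Python sketch inverts a log-distance path-loss model. The reference power at 1 m (`tx_power_dbm = -59`) and the path-loss exponent (`path_loss_exponent = 2`, the free-space value that matches square-law attenuation) are assumed values, not taken from the cited studies; real deployments would calibrate them per device and environment.

```python
def estimate_distance_m(rssi_dbm: float, tx_power_dbm: float = -59.0,
                        path_loss_exponent: float = 2.0) -> float:
    """Invert the log-distance path-loss model
    RSSI = tx_power - 10 * n * log10(d), with d in metres."""
    return 10 ** ((tx_power_dbm - rssi_dbm) / (10.0 * path_loss_exponent))


# Assumed example readings; a weaker signal maps to a larger distance.
for rssi in (-59, -65, -79):
    print(f"RSSI {rssi} dBm -> ~{estimate_distance_m(rssi):.2f} m")
```

Because RSSI fluctuates, a single reading gives only a noisy estimate; averaging several readings before inverting the model is a common mitigation.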
A way to improve this precision is to use other smartphone sensors to detect the kind of location (indoor/outdoor) and the quality of the medium (temperature, sunny/cloudy). Considering this new information, and with the combination of machine learning techniques, it is possible to improve the accuracy to 83% indoors and 91% outdoors45.
Digital contact tracing architecture
Digital Contact Tracing works in a similar way to traditional contact tracing but uses smartphones to detect and record the possible risky contacts (see Fig. 1). The first step is to install on the user’s smartphone an app that will be active to monitor these contacts. When the users’ phones are in contact for at least 15 min and at a distance of less than two meters, the app understands that there has been a risky contact. To preserve privacy, the smartphones exchange anonymous key codes, which can be used to determine the identity of the people contacted. If a user is diagnosed as positive after performing a test, the app should be notified in order to start the process of tracking the user’s previous contacts, which will use the generated keys of their previous contacts to identify the users at risk. Then, users who have had a contact with the positive user will receive an alert.
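The close-contact rule described above can be sketched as follows. This is a minimal illustration that assumes distance estimates arrive at a fixed sampling period and that exposure time is accumulated over all close samples (the text does not state whether the 15 min must be continuous); the function and constant names are hypothetical.

```python
from typing import Iterable

MAX_DISTANCE_M = 2.0        # close-contact distance from the text
MIN_DURATION_S = 15 * 60    # close-contact duration from the text


def is_risky_contact(distances_m: Iterable[float], sample_period_s: float) -> bool:
    """Return True when the accumulated time spent closer than 2 m
    reaches 15 minutes. Each sample is assumed to cover one period."""
    close_samples = sum(1 for d in distances_m if d < MAX_DISTANCE_M)
    return close_samples * sample_period_s >= MIN_DURATION_S


# One estimate per minute: 15 close samples reach the threshold, 14 do not.
print(is_risky_contact([1.5] * 15, sample_period_s=60))
print(is_risky_contact([1.5] * 14, sample_period_s=60))
```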
Nevertheless, the design of these contact tracing apps involves several considerations, such as where the keys are stored, how the matching is done, privacy issues, and the adoption requirements. Regarding where the keys are stored and managed, there are two different models. In the centralised approach, the generated keys are stored and managed in a central server. When users test positive, they notify the application of their new status, which is transmitted to the central server. The server then checks all their previous risk contacts, who are notified immediately by the application. In the decentralised approach, by contrast, the generated keys are stored on the user devices. Only when a user tests positive does the mobile application upload the recent locally stored keys to the server, which distributes them to all users so that each device can check locally whether it has been in contact with this individual. Although the decentralised approach seems to preserve the privacy of the users, it depends on their willingness to check and inform health authorities of a possible risky contact, which makes it less effective than the centralised approach.
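To make the decentralised matching step concrete, here is a minimal Python sketch. The `Phone` class, its methods, and the use of random hex strings as keys are hypothetical simplifications for illustration; they do not reproduce the Google/Apple Exposure Notification API or any deployed protocol.

```python
import secrets


class Phone:
    """Hypothetical handset that stores the keys it sent and received."""

    def __init__(self) -> None:
        self.sent_keys: set[str] = set()
        self.received_keys: set[str] = set()

    def meet(self, other: "Phone") -> None:
        """Exchange fresh anonymous keys during a risky contact."""
        mine, theirs = secrets.token_hex(16), secrets.token_hex(16)
        self.sent_keys.add(mine)
        other.received_keys.add(mine)
        other.sent_keys.add(theirs)
        self.received_keys.add(theirs)

    def check_exposure(self, published_keys: set[str]) -> bool:
        """Decentralised matching: compare published keys with local ones."""
        return bool(self.received_keys & published_keys)


alice, bob, carol = Phone(), Phone(), Phone()
alice.meet(bob)                     # Alice and Bob had a risky contact
published = set(alice.sent_keys)    # Alice tests positive and uploads her keys
print(bob.check_exposure(published), carol.check_exposure(published))
```

In this sketch the match only happens when `check_exposure` runs on each phone, which mirrors the dependence on user action noted above. In a centralised variant, the server would hold the contact records and perform the same intersection itself.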
In both models, we consider the matching of keys and notifications to users to be completely automatic and immediate in order to shorten the tracing process. Finally, note that the centralised approach allows several variations. If required, the health authorities can have direct oversight of user data, so they can check, notify and manage previous contacts. It can also be used in combination with manual contact tracing, so the positives detected are notified to the tracing teams (the user is not notified automatically by the app), and the notifications to users could be delayed.
Another implementation issue is the adoption requirements of the app, that is, how and when the contact tracing app is activated. As the effectiveness of digital contact tracing depends on the adoption rate, this is a key aspect. There are different strategies for activating the app: mandatory use, opt-in and opt-out. Mandatory adoption implies that the application is always active, for example, by compelling citizens to install and use it. This mandatory adoption may be viewed as a privacy violation in most countries. Therefore, most of the offered applications implemented the opt-in strategy: users must download the app and proactively opt in to use it, which penalises its utilisation. Finally, the opt-out strategy assumes that the application is installed and activated by default (for example, through an operating system update). Although the user still has the option to disable the application, most people would not opt out47, which increases its utilisation.
The previous adoption requirements consider the entire population of a country or state, which makes high adoption ratios very difficult. Nevertheless, we can improve the utilisation and effectiveness if we consider only the people at specific locations, such as factories, music festivals, university campuses, retirement homes, and conferences (these groups are medically referred to as cohorts). The app would only work in those locations (for example, by using the GPS to establish the tracing area where the smartphone can detect these contacts) and would be mandatory (no privacy issues can be raised as the app only traces the contacts in those locations). Additionally, for retirement homes, the elderly could use other more manageable devices, such as wristbands or necklaces, with detection capabilities similar to those of smartphones. Thus, considering the individuals in those cohorts, the adoption ratio could reach 100%.
Finally, and regarding the implementation decisions of digital contact tracing for COVID-19, the majority of countries chose the Google/Apple Exposure Notification API as the framework for implementing their apps39. This framework implemented a decentralised approach and Opt-in activation, limiting, as we will see, its efficiency. Other countries, such as China and South Korea, developed their own framework and application, a mandatory app with a centralised model.
Characterising digital contact tracing efficiency
As detailed in the previous subsections, several technical aspects of digital contact tracing can have a huge impact on its efficiency: the precision of detecting risky contacts, some implementation decisions such as the centralised vs. decentralised approach, and the adoption model. Therefore, in this subsection, we first evaluate and parametrise the impact of these technical aspects; then, we introduce some simple expressions to evaluate the efficiency of contact tracing.
Parametrising digital contact tracing
Accuracy is fundamental in detecting real risky physical contacts, that is, a true positive contact. Nevertheless, as detailed in section “Risk contacts estimation”, current smartphone risky contact detection is not precise enough. Smartphone-based detection can generate false negatives (a true positive contact is missed) and false positives (a false contact wrongly detected as positive).
To characterise false negatives and false positives, we use the following ratios: the True Positive Ratio (or sensitivity), \( TPR \), is the ratio between the number of detected positive contacts and the real number of positive contacts, and the False Positive Ratio, \( FPR \), is the ratio between the number of negative contacts wrongly categorised as positive and the total number of actual negative contacts. The impact of these ratios on the contact tracing process is quite different: a greater \( TPR \) implies that more infected individuals can be detected and isolated, whereas a greater \( FPR \) increases the number of people wrongly considered infected (i.e. false alarms), and thus the number of people isolated unnecessarily.
For example, the first evaluation of England’s contact tracing app performed in August 2020 (based on version 1.4 of the Google/Apple Exposure Notification) showed a \( TPR \) of 69% and a \( FPR \) of 45%48. These numbers were not good, especially considering the high rate of false alarms generated, which undermined people’s confidence in the app. A subsequent refinement of the classification algorithm reduced this \( FPR \) significantly.
The centralised and decentralised approaches have an impact on the contact tracing time and the tracing coverage. The contact tracing time TT is the time in days from when an individual tests positive until their traced contacts are notified. The tracing coverage TC is the proportion of previous contacts traced.
In the centralised approach, all the keys are uploaded to a centralised server, so when individuals test positive, health authorities can immediately start matching their previous contacts and obtain full tracing coverage (thus, \(TT \le 1\) day and \(TC=1\)). In a decentralised approach, by contrast, this process is not as fast and depends on the users’ willingness. Firstly, when individuals test positive, they must notify the application. Secondly, as the matching is done locally, the potential previous contacts of the new positive individual must check the App. This checking produces delays of several days in the notification (\(TT>1\) days) and means that some potential contacts are not notified, which reduces the tracing coverage (\(TC<1\)). This last value depends on the users’ willingness to check their App.
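The two ratios defined above can be computed from contact-classification counts, as in this short Python sketch. The counts are hypothetical and chosen only to illustrate the definitions.

```python
def detection_ratios(tp: int, fn: int, fp: int, tn: int) -> tuple[float, float]:
    """Return (TPR, FPR) from contact-classification counts."""
    tpr = tp / (tp + fn)   # detected risky contacts / all real risky contacts
    fpr = fp / (fp + tn)   # false alarms / all real non-risky contacts
    return tpr, fpr


# Hypothetical counts: 100 real risky contacts and 100 real non-risky ones.
tpr, fpr = detection_ratios(tp=80, fn=20, fp=10, tn=90)
print(f"TPR = {tpr:.2f}, FPR = {fpr:.2f}")
```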
The adoption ratio has been shown to be a critical issue in the efficiency of digital contact tracing. The key question here is how many people are going to use the application. As detailed in the previous subsection, this rate depends on the adoption model. It is clear that a mandatory model will imply a high utilisation rate, while an Opt-in model will reduce its use significantly. Unfortunately, for the Opt-in model used in most countries, the utilisation rates were in the range of [0.15, 0.35], far below the necessary utilisation rates recommended for the models to be effective.
This adoption ratio (AR) can be used to estimate the number of contacts that can be traced, and, in some way, determine roughly the efficiency of the process as the proportion of the contacts detected to all real contacts. Note that, for detecting a contact, it is required that both individuals use the App. Therefore, the likelihood of detecting a contact is \(AR \times AR\), which is the probability that in a real contact both individuals use the App. This probability means that the ratio of contacts detected is \(AR^2\), which implies that a high adoption ratio is required in order to capture a considerable number of contacts (for example, with an adoption ratio of 0.25, only 6.25% of the contacts can be captured).
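The quadratic dependence on the adoption ratio can be explored with a few lines of Python. The helper names are illustrative, and the 50% target in the last line is an arbitrary example rather than a threshold taken from the text.

```python
import math


def detected_contact_ratio(adoption_ratio: float) -> float:
    """Probability that both individuals in a contact run the app (AR^2)."""
    return adoption_ratio ** 2


def required_adoption(target_ratio: float) -> float:
    """Adoption ratio needed so that AR^2 reaches the target ratio."""
    return math.sqrt(target_ratio)


for ar in (0.15, 0.25, 0.35, 0.60):
    print(f"AR = {ar:.2f} -> {detected_contact_ratio(ar):.2%} of contacts")
print(f"Capturing half of the contacts needs AR ~ {required_adoption(0.5):.2f}")
```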
Measuring tracing efficiency
A simple way to measure the efficiency of contact tracing is to determine how many risky contacts can be detected. The first expression, the true traced contacts ratio \(c_T\), determines the ratio of true risky contacts detected to all real risky contacts. This ratio can be obtained by taking into consideration the true positive ratio \( TPR \), the adoption ratio AR, and the tracing coverage TC. All these parameters reduce the final number of detected contacts as follows:
$$\begin{aligned} c_T = TPR \cdot (AR)^2 \cdot TC \end{aligned}$$
A similar expression can be obtained to determine the ratio of false positives generated, or false alerts ratio, \(c_A\), using the false positive ratio:
$$\begin{aligned} c_A = FPR \cdot (AR)^2 \cdot TC \end{aligned}$$
As an example, we can estimate the efficiency of England’s digital contact tracing app. The parameters have been obtained from refs. 48 and 33: the true and false positive ratios were 69% and 45%, respectively; the adoption ratio was 28% (from a total population of 58.9 million); and the tracing coverage was 80%, estimated as the reported adherence to quarantine rules of the individuals who used the app. With these values, the true traced contacts ratio \(c_T\) was 0.0433, and the false alerts ratio \(c_A\) was 0.0282. Regarding \(c_T\), this means that only 4.3% of the real contacts were detected using the app, which can be considered a low efficiency. Nevertheless, in order to evaluate the real impact of these parameters when dealing with COVID-19, we need to use an epidemic model.
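The England estimate can be reproduced directly from the two expressions. The Python sketch below uses only the values quoted in the text; the function names are illustrative.

```python
def traced_contacts_ratio(tpr: float, ar: float, tc: float) -> float:
    """True traced contacts ratio, c_T = TPR * AR^2 * TC."""
    return tpr * ar ** 2 * tc


def false_alerts_ratio(fpr: float, ar: float, tc: float) -> float:
    """False alerts ratio, c_A = FPR * AR^2 * TC."""
    return fpr * ar ** 2 * tc


# Values quoted in the text for England's app.
c_t = traced_contacts_ratio(tpr=0.69, ar=0.28, tc=0.80)
c_a = false_alerts_ratio(fpr=0.45, ar=0.28, tc=0.80)
print(f"c_T = {c_t:.4f}, c_A = {c_a:.4f}")   # c_T = 0.0433, c_A = 0.0282
```

Since \(AR^2 = 0.0784\) is by far the smallest factor, the adoption ratio dominates the result in this example.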
A model for assessing digital contact tracing
The model presented here is a Susceptible, Infected, Recovered (SIR) deterministic epidemic model, which considers not only the impact of the digital contact tracing technology through the parameters described in the previous subsection but also the effect of the quarantine measures taken in case an individual is detected positive, and the immunity due to vaccines. The goal is to obtain a model that reproduces the spread dynamics of the COVID-19 disease that will be used to evaluate the effectiveness of digital contact tracing.
The model we introduce here is derived from the stochastic epidemic model presented in our previous work8, in which we considered the heterogeneity of the contacts. In our new model, we instead consider a population of N individuals and homogeneity of the contacts. This new model also considers the effect of the temporal measures taken (such as social distancing and masks), along with the vaccination rates.
Epidemic models are usually based on the transmission rate (or risk) \(\beta \), the rate at which an infection can be transmitted from an infected individual to a susceptible one. This rate can be obtained as the product of the average number of contacts with infected individuals during a day, k, and the transmission probability of the disease, b, where the time unit t is in days. Infected individuals recover after \(1/\gamma \) days, where \(\gamma \) is known as the recovery rate. These values are related to the basic reproductive ratio as \(R_0=kb/\gamma \). \(R_0\) represents the expected number of new cases directly generated by a single case. When \(R_0 > 1\), the infection will start spreading in a population, but not if \(R_0 < 1\). Generally speaking, the larger the value of \(R_0\), the harder it is to control the epidemic. When measures are taken, this reproductive ratio can be reduced, and it is then usually referred to as the effective reproductive number \(R_e\). Table 2 shows the estimated transmission parameters of COVID-19, which correspond to the case in which no health measures are taken. With COVID-19, we have seen that temporarily applying physical measures such as social distancing and wearing masks reduced both the probability of transmission and the number of contacts. Therefore, in this model, we consider the time dependency of these parameters to model the effect of these temporary measures: B(t) and K(t). Note that, for simplicity of notation, we will omit the time in the expressions that follow. The number of casualties can be obtained by multiplying the total number of infected individuals by the Infection Fatality Rate (IFR).
Vaccination drastically reduces the probability of infection and its transmission, and thus the mortality rates. Fortunately, for COVID-19, it has been the definitive solution. Nevertheless, vaccines are not 100% effective, so vaccinated people can get infected. This effectiveness depends on the type of vaccine (for example, the effectiveness of Pfizer’s vaccine is around 95% and that of AstraZeneca’s around 70%). In our model, we take into account the weighted average effectiveness of the vaccines used in a country (v). We also consider a time-dependent vaccination rate (per day), \(\Omega (t)\), to model when the vaccines were introduced and how fast they were administered.
As detailed in the previous subsection, digital contact tracing cannot trace all real contacts positively, and it can even generate false positives. We obtained two expressions for measuring this efficiency: the true traced contacts ratio \(c_T\) (Eq. 1), and the false alerts ratio \(c_A\) (Eq. 2). Nevertheless, if the tracing time is greater than one day, that is, for the decentralised schemes, the \(c_T\) and \(c_A\) ratios need to be normalised by the tracing time \(TT=1/\tau _T\), since it will take more time to trace the previous contacts. Thus, the contacts are distributed among the days that the tracing process lasts, in the following way: \(c_T^n=c_T/(1/\tau _T)=c_T\tau _T\) and \(c_A^n=c_A\tau _T\).
In our model, we assume that a newly detected infected individual is immediately isolated, and his/her previous contacts are evaluated using digital contact tracing. Then, these previous contacts are considered to be quarantined. Therefore, besides the common SIR classes (S, susceptible individuals; I, infected individuals; R, recovered individuals), we define three new classes for the individuals in quarantine. Namely, \(Q_I\) refers to an infected individual that has been detected (or traced) and therefore quarantined; \(Q_S\) to a susceptible individual that is quarantined after being traced; and \(Q_T\) to an infected individual that has been detected and whose previous contacts are being traced. There is also a class V for the vaccinated people. Finally, refer to Table 1 for the notation used in the model.
The transitions between classes and their rates are depicted in Fig. 2. The time unit is one day, as in most human epidemic models. The general transition rate from susceptible to infected is \(KB\frac{I}{N}\), which depends on the transmission probability of the disease, B, the average number of contacts, K, and the ratio of infected individuals to the population, \(\frac{I}{N}\). Nevertheless, in our model, we distinguish between the infected individuals that have been traced positive and the rest. The transition \(S \rightarrow I \) occurs when a susceptible individual that has not been traced positive gets infected. Thus, the previous general transition rate is multiplied by \((1-c_T)\), which is the ratio of non-traced contacts. Therefore, class I contains infected people who have not been detected positive and are not quarantined. The transition \(S \rightarrow Q_T\) is for the susceptible individuals that are infected and detected positive (a true positive), mainly using digital contact tracing (which is why this transition rate is multiplied by \(c_T\)). Note that infected individuals in class I can also be detected by tests (PCRs) with a rate \(\delta \), traced back, and quarantined (transition \(I\rightarrow Q_T\)).
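The normalisation by the tracing time can be sketched as follows. The two-day tracing time is a hypothetical value for a decentralised scheme; the ratios are the England estimates quoted earlier.

```python
def normalise_ratio(ratio: float, tracing_time_days: float) -> float:
    """Spread a contact ratio over the tracing period: c^n = c * tau_T,
    with tau_T = 1 / TT. Ratios are left unchanged when TT <= 1 day."""
    if tracing_time_days <= 1.0:
        return ratio            # e.g. centralised schemes
    return ratio / tracing_time_days


c_t, c_a = 0.0433, 0.0282       # England estimates quoted earlier
tt = 2.0                        # hypothetical tracing time in days
print(normalise_ratio(c_t, tt), normalise_ratio(c_a, tt))
```

With \(TT=2\) days, both ratios are halved, so a slower tracing process lowers the daily rate at which contacts are traced.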
Individuals stay in quarantine for a total of \(1/\tau _Q\) days. Nevertheless, we divide this quarantine into two phases. The first phase (transition \(Q_T \rightarrow Q_I\)) is the time needed to trace their previous contacts, which is the tracing time \(TT=1/\tau _T\). This phase is added to evaluate the impact of this time, for example, to consider the delay incurred in the decentralised approaches. After this tracing time, the infected individuals stay at class \(Q_I\) for the rest of the quarantine, \(1/\tau ^r_Q\) = \(1/\tau _Q- 1/\tau _T\), and finally recover (transition \(Q_I \rightarrow R\)). Finally, transition \(I \rightarrow R\) represents the individuals who remain undetected and recover from the disease, with a recovery rate \(\gamma \). Note that this case also includes asymptomatic individuals.
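As a small worked example of the two-phase quarantine, the following Python sketch computes the rate for the second phase from \(1/\tau ^r_Q = 1/\tau _Q - 1/\tau _T\). The 10-day quarantine and 2-day tracing time are hypothetical values.

```python
def remaining_quarantine_rate(tau_q: float, tau_t: float) -> float:
    """Return tau_Q^r from 1/tau_Q^r = 1/tau_Q - 1/tau_T."""
    remaining_days = 1.0 / tau_q - 1.0 / tau_t
    if remaining_days <= 0:
        raise ValueError("tracing time must be shorter than the quarantine")
    return 1.0 / remaining_days


# Hypothetical values: 10-day quarantine, 2-day tracing time.
tau_q_r = remaining_quarantine_rate(tau_q=1 / 10, tau_t=1 / 2)
print(f"Remaining quarantine: {1 / tau_q_r:.1f} days (rate {tau_q_r:.3f} per day)")
```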
Now we consider the effect of false alarms (\(c_A\)): some non-infected individuals will be considered infected and, therefore, wrongly quarantined. This corresponds to the transition \(S \rightarrow Q_S\), which considers the probability of not transmitting the disease, \((1-B)\), and the ratio of false alarms generated, \(c_A\). Class \(Q_S\) is introduced to evaluate the individuals that are unnecessarily quarantined. When the quarantine ends (after \(1/\tau _Q\) days), these individuals return to the susceptible class (transition \(Q_S \rightarrow S\)).
Finally, transition \(S \rightarrow V\) occurs when a susceptible individual gets vaccinated with rate \(\Omega \). Nonetheless, some of the vaccinated people could get infected. This is represented by transition \(V \rightarrow I\), with a rate of \((1-v)KB\frac{I}{N}\), which depends on the weighted effectiveness of the vaccines. In order to simplify the model, we do not consider that the vaccinated people are traced and quarantined. Note that we have simplified or omitted some transitions with the aim of keeping the model tractable while preserving the fundamental behaviour that will help us to evaluate digital contact tracing.
From these transitions and rates, the epidemic model is defined as follows:
$$\begin{aligned}
S' &= -(1-c_T)KB\frac{I}{N}S - c_T KB\frac{I}{N}S - c_A K(1-B)\frac{Q_I}{N}S - \Omega S + \tau _Q Q_S \\
I' &= (1-c_T)KB\frac{I}{N}S + (1-v)KB\frac{I}{N}V - \delta I - \gamma I \\
R' &= \gamma I + \tau ^r_Q Q_I \\
Q_S' &= c_A K(1-B)\frac{Q_I}{N}S - \tau _Q Q_S \\
Q_I' &= \tau _T Q_T - \tau ^r_Q Q_I \\
Q_T' &= \delta I + c_T KB\frac{I}{N}S - \tau _T Q_T \\
V' &= \Omega S - (1-v)KB\frac{I}{N}V
\end{aligned}$$
Note that, for simplicity of notation, the time has been omitted in all the classes and in the K and B functions (for example, for class S, \(S'=dS(t)/dt\) and \(S=S(t)\)). This model is solved numerically, considering initial values for the I, R and S classes such that \(S(0)=N-R(0)-I(0)\), with the other classes set to zero. The model can be solved for a given time (for example, one year), or until the infection is over.
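One possible way to solve the model numerically is sketched below in plain Python with a fixed-step fourth-order Runge-Kutta integrator. This is an illustration under stated assumptions, not the solver used by the authors: all parameter values are hypothetical (they are not those of Table 2), the names `Params`, `derivatives` and `solve` are illustrative, and the \(Q_I \rightarrow R\) flow uses the rate \(\tau ^r_Q\) in both the \(Q_I\) and \(R\) equations so that the total population is conserved.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

State = Tuple[float, ...]  # (S, I, R, Q_S, Q_I, Q_T, V)


@dataclass
class Params:
    n: float                          # population size N
    c_t: float                        # true traced contacts ratio c_T
    c_a: float                        # false alerts ratio c_A
    v: float                          # weighted vaccine effectiveness
    delta: float                      # detection rate by tests (per day)
    gamma: float                      # recovery rate (per day)
    tau_q: float                      # 1 / quarantine duration
    tau_t: float                      # 1 / tracing time (must exceed tau_q)
    k: Callable[[float], float]       # contacts per day, K(t)
    b: Callable[[float], float]       # transmission probability, B(t)
    omega: Callable[[float], float]   # vaccination rate, Omega(t)


def derivatives(t: float, y: State, p: Params) -> State:
    s, i, r, qs, qi, qt, vac = y
    k, b = p.k(t), p.b(t)
    tau_q_r = 1.0 / (1.0 / p.tau_q - 1.0 / p.tau_t)      # rest-of-quarantine rate
    infect = k * b * i / p.n * s                         # total outflow S -> I, Q_T
    alarm = p.c_a * k * (1.0 - b) * qi / p.n * s         # S -> Q_S (false alerts)
    breakthrough = (1.0 - p.v) * k * b * i / p.n * vac   # V -> I
    vaccinated = p.omega(t) * s                          # S -> V
    return (
        -infect - alarm - vaccinated + p.tau_q * qs,
        (1.0 - p.c_t) * infect + breakthrough - (p.delta + p.gamma) * i,
        p.gamma * i + tau_q_r * qi,    # assumption: tau_Q^r, so N is conserved
        alarm - p.tau_q * qs,
        p.tau_t * qt - tau_q_r * qi,
        p.delta * i + p.c_t * infect - p.tau_t * qt,
        vaccinated - breakthrough,
    )


def solve(y0: State, p: Params, days: int, dt: float = 0.1) -> List[State]:
    """Integrate with the classical fourth-order Runge-Kutta method."""
    y, out = y0, [y0]
    for step in range(round(days / dt)):
        t = step * dt
        k1 = derivatives(t, y, p)
        k2 = derivatives(t + dt / 2, tuple(a + dt / 2 * d for a, d in zip(y, k1)), p)
        k3 = derivatives(t + dt / 2, tuple(a + dt / 2 * d for a, d in zip(y, k2)), p)
        k4 = derivatives(t + dt, tuple(a + dt * d for a, d in zip(y, k3)), p)
        y = tuple(a + dt / 6 * (d1 + 2 * d2 + 2 * d3 + d4)
                  for a, d1, d2, d3, d4 in zip(y, k1, k2, k3, k4))
        out.append(y)
    return out


# Hypothetical parameters for illustration only (not the paper's Table 2).
params = Params(n=1e6, c_t=0.0433, c_a=0.0282, v=0.9, delta=0.05, gamma=0.1,
                tau_q=1 / 10, tau_t=1.0, k=lambda t: 10.0, b=lambda t: 0.03,
                omega=lambda t: 0.0)
y0 = (params.n - 100.0, 100.0, 0.0, 0.0, 0.0, 0.0, 0.0)
trajectory = solve(y0, params, days=365)
print("Peak undetected infected:", round(max(y[1] for y in trajectory)))
```

Passing K, B and \(\Omega \) as functions of time keeps the sketch compatible with temporary measures and staged vaccination; constant lambdas are used here only for brevity.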
Assessing the effectiveness of digital contact tracing
The effectiveness of digital contact tracing can be assessed in several ways. The highest level of effectiveness is achieved when it can control an outbreak, that is, when the number of infected individuals decreases. Considering Eq. (3) of the epidemic model, this condition holds when \(I'\) is negative, that is:
$$\begin{aligned} \left( \frac{(1-c_T)S}{N} + \frac{(1-v)V}{N}\right) R_e\gamma < \delta + \gamma \end{aligned}$$
considering that \(KB=R_e\gamma \). We prefer to use the \(R_e\) number as it is a simpler (and better known) figure to express the intensity of an epidemic. If we analyse this expression, we can determine the main components that can lead to the control of an outbreak. The term \(\frac{(1-c_T)S}{N}\) is the proportion of susceptible people that can be infected without being detected and quarantined. Similarly, the term \(\frac{(1-v)V}{N}\) is the ratio of vaccinated people that can be infected. If we substitute these terms in Eq. (4) by \(S_{RN}\) and \(V_{RN}\) we have:
$$\begin{aligned} (S_{RN} + V_{RN})R_e\gamma < \delta + \gamma \end{aligned}$$
In particular, we can see that, in order to control an outbreak, we should reduce component \(S_{RN}\), either by improving the efficiency of contact tracing (\(c_T\)) or by reducing the number of susceptible individuals; reduce component \(V_{RN}\) by improving the efficacy of the vaccines (v); reduce the transmission rate (\(R_e\)); or, alternatively, increase the detection rate (\(\delta \)).
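The control condition can be checked numerically, and it can also be rearranged to give the threshold value of \(c_T\) above which the condition holds. The Python sketch below does both; the parameter values in the example are hypothetical and the function names are illustrative.

```python
def outbreak_controlled(s: float, v_pop: float, n: float, c_t: float, v: float,
                        r_e: float, gamma: float, delta: float) -> bool:
    """Check ((1 - c_T) S/N + (1 - v) V/N) * R_e * gamma < delta + gamma."""
    s_rn = (1.0 - c_t) * s / n
    v_rn = (1.0 - v) * v_pop / n
    return (s_rn + v_rn) * r_e * gamma < delta + gamma


def traced_ratio_threshold(s: float, v_pop: float, n: float, v: float,
                           r_e: float, gamma: float, delta: float) -> float:
    """Value of c_T above which the condition holds, clipped to [0, 1].
    A result of 1.0 may still be insufficient if the vaccinated term
    alone already exceeds the bound."""
    bound = (delta + gamma) / (r_e * gamma) - (1.0 - v) * v_pop / n
    return min(1.0, max(0.0, 1.0 - bound * n / s))


# Hypothetical scenario: everyone susceptible, no vaccination, R_e = 3.
args = dict(s=1e6, v_pop=0.0, n=1e6, v=0.9, r_e=3.0, gamma=0.1, delta=0.05)
print(traced_ratio_threshold(**args))            # about 0.5
print(outbreak_controlled(c_t=0.6, **args))      # above the threshold
print(outbreak_controlled(c_t=0.0433, **args))   # England-like c_T
```

In this assumed scenario, about half of the real contacts would need to be traced, which is far above the 4.3% estimated earlier for England’s app.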
Other important figures to assess the effectiveness are the total number of infected individuals and deaths. These values can be obtained by solving the model for a given time or until the infection is over (\(I<1\)). Then, we numerically obtain the accumulated number of individuals infected over the evaluated period, considering the individuals who move from class S to class I. Note that we can also obtain the number of deaths by multiplying the total number of infected people by the Infection Fatality Rate (IFR). Nevertheless, reducing the number of infected individuals can require the application of severe measures such as quarantines. Thus, we also obtain the accumulated number of people quarantined, \(Q_a\), as the number of individuals that transition to classes \(Q_S\), \(Q_I\) and \(Q_T\). A highly effective contact-tracing-based quarantine will minimise the number of people quarantined while controlling the spread of the disease.
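Accumulated figures such as the total number of infected individuals can be obtained by integrating the corresponding inflow over time. The Python sketch below uses the trapezoidal rule on a sampled flow; the daily inflow values and the IFR of 1% are hypothetical, and the function names are illustrative.

```python
from typing import Sequence


def accumulate(flow_per_day: Sequence[float], dt: float) -> float:
    """Integrate a sampled flow (individuals/day) with the trapezoidal rule."""
    return sum((a + b) / 2.0 * dt for a, b in zip(flow_per_day, flow_per_day[1:]))


def deaths(total_infected: float, ifr: float) -> float:
    """Deaths estimated as accumulated infections times the IFR."""
    return total_infected * ifr


# Hypothetical inflow into the infected classes, sampled once per day.
new_infections = [0.0, 10.0, 20.0, 10.0, 0.0]
total = accumulate(new_infections, dt=1.0)
print(total, deaths(total, ifr=0.01))
```

The same `accumulate` helper could be applied to the inflow into the quarantine classes to approximate \(Q_a\).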
Finally, we can also evaluate the impact of false alerts (\(c_A\)). As described in the model, less precise contact tracing increases the number of susceptible quarantined individuals \(Q_S\), that is, the individuals that are wrongly detected and quarantined. So, we can count the individuals that transition into class \(Q_S\) as the number of generated false alerts.