COVID-19 Numeracy: Charts

New York, New York

Life has dropped into a bit of a routine. Get up, start the coffee pot and watch the briefing by Andrew Cuomo, Governor of New York. The Guv can be a little tedious and pedantic (well-known characteristics to New Yorkers), but he is a wonderful communicator. First, he speaks slowly and in simple, clear sentences, which is good for people for whom English is not their first language.

He also has been presenting clear statistics on testing and the number of new COVID-19 cases. His interpretation and remarks are correct. The more you test, the more you will find the disease. One of his charts today had three columns: locality (county and entire state), cumulative COVID-19 cases to date and the number of new cases identified for the preceding 24-hour period. At the very least, the Governor needs to add a fourth column, the number of tests performed during the preceding 24-hour period. The fourth column would show the concurrent growth of new cases and tests.

I also heard the Governor struggling, just a bit, to put the number of new cases in context with the growth in testing. [New York has done an excellent job of ramping up its testing, BTW.] That’s where the ratio of new cases to tests comes into play:

Percentage of new positive cases = ((New cases) / (New tests)) * 100 

I get that most people do not think in terms of ratios between 0.0 and 1.0, so this measure can and should be expressed as a percentage.
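As a minimal sketch of the formula above, the following Python function computes the percentage of new positive cases. The function name and the example counts are hypothetical, chosen only to illustrate the arithmetic; they are not New York figures.

```python
def percent_positive(new_cases: int, new_tests: int) -> float:
    """Return new positive cases as a percentage of tests in the same period."""
    if new_tests <= 0:
        raise ValueError("new_tests must be positive")
    if not 0 <= new_cases <= new_tests:
        raise ValueError("new_cases must be between 0 and new_tests")
    return 100.0 * new_cases / new_tests


# Hypothetical counts: new cases double from one day to the next,
# but so does the number of tests performed.
day1 = percent_positive(new_cases=100, new_tests=1_000)
day2 = percent_positive(new_cases=200, new_tests=2_000)
print(f"Day 1: {day1:.1f}%  Day 2: {day2:.1f}%")
```

In this made-up example the raw number of new cases doubles, yet the percentage is unchanged, which is the point of normalizing by the number of tests.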

The percentage (ratio) accounts for the growth in testing. I believe that this percentage (ratio) is a better tracking metric than the raw number of new cases alone. It is certainly better than the cumulative number of COVID-19 cases since the beginning of the epidemic. The cumulative number expresses the depth of suffering, but it does not adequately convey the growth or decline of the disease. Given that we will experience recurring waves of COVID-19 (if we suppress this disease successfully), the cumulative number of cases alone will obscure the rise and ebb of COVID-19.

I was encouraged to see New York using a metric similar to my COVID-19 Key Performance Indicator (KPI). New York officials are monitoring the number of all acute care patients. This is the most relevant number with respect to the capacity of the health care system. Please note that acute care includes all of the “every day” acute care patients (heart attacks, strokes, etc.) and COVID-19 patients. All of these patients need to be hospitalized and all of them put a load on the health care system.

If I have to add a public service announcement (PSA) of my own, please don’t drive under the influence and please don’t drive distracted (i.e., texting). Hospital emergency rooms don’t need additional patients. And be careful with that axe, Eugene.

The Seattle Times

I’d like to give a shout-out to my (new) hometown newspaper, The Seattle Times. They publish a concise daily summary of new and cumulative test results (March 19, 2020 below; click to enlarge).

The summary has the daily number of tests performed, positive cases identified and negative cases. Charts break out positive cases and deaths by age. Tests per day has its own chart and shows the growing number of tests. There is a chart showing the cumulative number of positive cases and deaths over time since February 28.

All good work. The raw figures come from the Washington State Department of Health, who should be commended for their transparency.

My one suggestion is to add a chart showing the test results over time, like the graph published by the University of Washington Virology Laboratory (below).

UW Virology

The UW Virology Laboratory upped its game and now publishes daily results in the UW Virology COVID-19 Dashboard.

University of Washington Virology Laboratory COVID-19 Dashboard

This is a screenshot of the dashboard for March 19, 2020. It’s easy to see the number of tests (“Sample Count”) performed each day, the number of positive and negative cases identified, and the number of inconclusive results. You can hover the cursor over a bar segment and get the raw count for the chosen segment.

Very nice work. The trend of positive cases is easy to see. The chart shows the raw number of positive cases, so interpretation is still biased by the number of tests performed. A separate chart is needed which plots the percentage of new positive cases over time.

One additional suggestion for UW Medicine — Please explain “Inconclusive”.

As I mentioned yesterday, each day’s results are a glimpse into the past. COVID-19 has a 2 to 14 day incubation period according to the CDC. Other medical experts estimate the infection to test delay at 10 to 14 days (infection, incubation, symptoms, clinical presentation, test). Even though the trend of new COVID-19 cases is relatively flat, we won’t really know for another 14 to 21 days if we successfully flattened the curve and delayed onset of the disease in the community.

Keep the faith and keep healthy — P.J. Drongowski

COVID-19 Numeracy: Time

On March 16, President Trump warned that the COVID-19 outbreak could last until July or August 2020. This was the first official recognition that the outbreak could take months to quell, not a few weeks. On March 18, The New York Times and other media outlets published the U.S. Government COVID-19 Response Plan. The plan states that a moderately severe outbreak could last 12 to 18 months with multiple “waves.”

The duration and potential number of waves may shock some. However, the timing and waves of resurgence/subsidence are fairly typical for epidemics and other phenomena in nature. Consider, for example, predator-prey population models. Predators thrive and reproduce when prey are plentiful. When the food supply dwindles, predators die off and the predator population dwindles. With fewer predators, prey survive and multiply. The cycle of growth and decline repeats.
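For readers who want to see the cycle, here is a rough sketch of the classic Lotka-Volterra predator-prey equations, stepped forward with simple Euler integration. The function name, the rate constants and the starting populations are all assumptions picked to produce a visible oscillation; they do not describe any real ecosystem or COVID-19 itself.

```python
def lotka_volterra(prey, predators, steps, dt=0.001,
                   birth=1.0, predation=0.1, efficiency=0.075, death=1.5):
    """Step the Lotka-Volterra equations forward with Euler integration.

    All rate constants are illustrative assumptions:
      birth      - prey growth rate when no predators are present
      predation  - rate at which predators remove prey
      efficiency - rate at which eaten prey become new predators
      death      - predator death rate when no prey are present
    """
    history = [(prey, predators)]
    for _ in range(steps):
        d_prey = (birth * prey - predation * prey * predators) * dt
        d_pred = (efficiency * prey * predators - death * predators) * dt
        prey, predators = prey + d_prey, predators + d_pred
        history.append((prey, predators))
    return history


# Simulate 30 time units and print a coarse sample of the two populations.
history = lotka_volterra(prey=10.0, predators=5.0, steps=30_000)
for step in range(0, len(history), 3_000):
    prey_now, predators_now = history[step]
    print(f"t={step * 0.001:5.1f}  prey={prey_now:6.1f}  predators={predators_now:6.1f}")
```

The printed sample should show both populations rising and falling out of step with each other rather than settling down, which is the repeating cycle described above.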

Viruses need susceptible, infectable hosts in order to reproduce. Thanks to the body's immunological response, human hosts may develop an immunity as a result of infection and recovery. Immunity may be limited to the specific infecting strain of the virus. Unfortunately, viruses are adaptable and a strain may mutate into a new, different strain which is not suppressed by immunity to the original strain.

The specific behavior of SARS-CoV-2, the virus which causes the disease COVID-19, is not fully known, including human immunological response to SARS-CoV-2. Citing Dr. Anthony Fauci, no scientific study to date has demonstrated acquired immunity to SARS-CoV-2. Dr. Fauci and other experts believe that acquired immunity is likely.

Timeline

The U.S. Government COVID-19 Response Plan states, “Late December 2019, authorities from the People’s Republic of China (PRC) announced a possible epidemic of pneumonia of unknown etiology centralized on a local large seafood and live animal market in Wuhan, China. Estimated case onset was early December.”

Given the lack of Chinese transparency, slow governmental response and the speed/extent of modern jet travel, the SARS-CoV-2 virus has probably been circulating in Snohomish and King County Washington for quite some time. The report states, “The first U.S. case of COVID-19 was confirmed in Washington State on January 20 and was travel-related.” This patient landed from Wuhan, China on January 15 without symptoms and eventually presented on January 19 with symptoms of pneumonia.

The first case of community spread in the U.S. (California) was confirmed by the CDC on February 26, 2020. The first case of community spread in Snohomish County, WA was confirmed on February 28. Thus, SARS-CoV-2 circulated in the Snohomish and King County area for nearly six weeks. Detection of community spread is a red flag for a developing epidemic.

On March 9, the Washington Department of Health confirmed 162 total cases including 22 deaths. Most of these deaths occurred at Life Care Center of Kirkland, a nursing home linked to at least 54 confirmed cases as of March 9. King and Snohomish Counties had 116 and 37 total cases, respectively, as of March 9. If you are interested in the details, the March 18, 2020 CDC MMWR is short and to the point.

Washington State immediately announced and slowly ramped up community mitigation. Governor Inslee issued the most stringent order on March 16. As of today, March 20, the state government has not issued a “stay/shelter at home” order like California.

State of Washington Social Distancing Summary

Washington’s new cases

The timeline is important in understanding current new case data and in knowing what to look for in the coming two to three weeks.

The University of Washington Virology Laboratory published a graph of its testing results as of Friday, March 20, 2020 (below).

UW Virology COVID-19 test results (March 20, 2020)

On a positive note (literally), the number of new COVID-19 cases detected by UW Virology is relatively flat. We are now 5 days into the most stringent social distancing rules and practices (March 20).

Although the results are promising, this is not the time to let up.

This graph is a snapshot of the past. According to the CDC, “The following symptoms may appear 2 to 14 days after exposure: fever, cough, shortness of breath.” COVID-19 tests are (effectively) rationed. A patient must meet testing criteria such as COVID-19 symptoms, contact with a known COVID-19 case, or travel from a COVID-19 hot-spot. Thus, the graph shows cases that are 2 to 14 days after infection.
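To make the "snapshot of the past" idea concrete, here is a small sketch that turns a reporting date into the range of likely infection dates, using only the CDC's 2 to 14 day symptom window quoted above. The function name is hypothetical, and the sketch deliberately ignores any further delay between symptoms, clinical presentation and the test result, so the true window is probably earlier still.

```python
from datetime import date, timedelta


def infection_window(report_date, min_days=2, max_days=14):
    """Return (earliest, latest) likely infection dates for cases reported on report_date.

    Assumes only the 2 to 14 day exposure-to-symptoms range; delays between
    symptoms and testing are not modeled.
    """
    if min_days > max_days:
        raise ValueError("min_days must not exceed max_days")
    earliest = report_date - timedelta(days=max_days)
    latest = report_date - timedelta(days=min_days)
    return earliest, latest


earliest, latest = infection_window(date(2020, 3, 20))
print(f"Cases reported March 20 were likely infected between {earliest} and {latest}")
```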

Social distancing in Washington State did not “get serious” until March 16. The next three weeks (21 days) will tell us if social distancing in Washington State has really been effective in slowing the spread of COVID-19.

The curves (again)

Quoting the San Francisco Chronicle, “More than half of Californians could be infected with the new coronavirus over an eight-week period if nothing is done to stop it, according to a projection from Gov. Gavin Newsom’s office.”

Let’s hope this statement motivates Californians to adopt social distancing, hygiene and other measures suggested by health authorities.

Gov. Newsom is referring, of course, to the orange curve in the now well-known CDC “flattening the curve” chart. If Californians (or the citizens of any other region or city) do not practice social distancing, then new cases will grow exponentially and hospitals, doctors and nurses will be overwhelmed.


Source: CDC, Drew Harris (Connie Hanzhang Jin/NPR)

Why “eight weeks” and why “more than half?”

Epidemiologists have studied disease for a very long time. And, like any other science, they have built mathematical models to predict the growth and decline of epidemics. The SIR model (Susceptible, Infectious, Recovered model) is one of the simplest — and classic — models. Typically, these models are driven by parameters such as the reproduction number, R0, pronounced “R naught”. R0 is the average number of people who catch the disease from one contagious person.

  • If R0 is less than one, then the disease will die out.
  • If R0 is one, the disease remains alive, but does not break out into an epidemic.
  • If R0 is greater than one, then the disease will spread, possibly growing into an epidemic.

The epidemiologists advising Gov. Newsom have plugged different values of parameters like R0 into their models, run the simulation and assessed the resulting statistics.
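To give a feel for what "plugging in R0" means, here is a bare-bones sketch of the SIR model stepped forward in small time increments. It is not the model used by California's advisors. The function name, the population size, the 10-day infectious period and the R0 values tried are all assumptions for illustration.

```python
def simulate_sir(r0, population=1_000_000, initial_infectious=10,
                 infectious_days=10.0, days=365, steps_per_day=10):
    """Run a simple SIR model and return (peak infectious count, day of the peak).

    Assumed relationships: recovery rate gamma = 1 / infectious_days and
    transmission rate beta = r0 * gamma.
    """
    gamma = 1.0 / infectious_days
    beta = r0 * gamma
    dt = 1.0 / steps_per_day
    susceptible = float(population - initial_infectious)
    infectious = float(initial_infectious)
    peak, peak_day = infectious, 0.0
    for step in range(1, days * steps_per_day + 1):
        new_infections = beta * susceptible * infectious / population * dt
        new_recoveries = gamma * infectious * dt
        susceptible -= new_infections
        infectious += new_infections - new_recoveries
        if infectious > peak:
            peak, peak_day = infectious, step * dt
    return peak, peak_day


# Try three hypothetical values of R0.
for r0 in (2.5, 1.5, 0.9):
    peak, peak_day = simulate_sir(r0)
    print(f"R0={r0}: peak of about {peak:,.0f} infectious people around day {peak_day:.0f}")
```

In this toy model a smaller R0 produces a lower, later peak, and an R0 below one never rises above the handful of initial cases.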

“Flattening the curve” essentially means reducing R0 as much as possible.

Here’s the rub which may have escaped some folks. The same number of people will catch the disease (COVID-19) in both the orange unmitigated scenario and the blue mitigated (social distancing) scenario. For the mathematically inclined, the area under the orange curve is equal to the area under the blue curve. The onset of the disease is delayed for some people in order to reduce the strain on the health care system.

The long haul

I started this blog post with statements that more realistically indicate the duration of our battle with COVID-19.

Ideally, we would like to control the growth of COVID-19 (the ramp up) and then stomp it out of existence (the ramp down). Unfortunately, real-world systems tend to behave more like the predator/prey model.

Viruses are opportunistic and adaptable. As long as SARS-CoV-2 circulates in the community, it will find susceptible individuals and infect them. Therefore, as soon as we reduce or abandon social distancing, COVID-19 will have a resurgence. The community will always be in danger of a resurgence as long as SARS-CoV-2 can find enough susceptible people to infect and thrive.

In the opinion of many health experts, our bodies likely acquire an immunity to SARS-CoV-2 as we fight and recover from the disease. At some point, when the so-called “herd immunity” is high enough, the disease will have trouble sustaining itself. We also anticipate a vaccination program against SARS-CoV-2 which will add to the herd immunity. [Yeah, no one likes to be thought of as a “herd”.]

Viruses are composed of proteins and nucleic acid. Without deep-diving molecular biology, viruses contain a form of genetic material: RNA, DNA or possibly both. Viruses usually depend upon a host cell for replication. Short story: viruses can steal genetic material anywhere in this process and mutate. Some mutations are bad and the mutants die out. Some mutations are good and the mutants thrive. That’s adaptation, that’s evolution.

SARS-CoV-2 could mutate. The mutated strain could thrive and grow because acquired immunity and/or the vaccine are specific to the original SARS-CoV-2 strain. Mutation could bring about a resurgence of COVID-19 although the underlying pathogen would be genetically different.

Realistically, it will be difficult to maintain stringent social distancing. We want to be together socially and we will definitely want to get our economy back to full strength, too. In the long haul, we should expect to see cycles of COVID-19 resurgence and decline. We may need to re-impose social distancing to keep COVID-19 in check.

Stay healthy and stay distant — P.J. Drongowski

COVID-19 Numeracy: Testing

Current situation: 18 March 2020

The United States continues to struggle with COVID-19 testing. Communities do not have sufficient COVID-19 test kits and they are, effectively, rationing tests. Generally, a patient must satisfy several criteria before a COVID-19 test is authorized and performed. Example criteria include cough, fever (high temperature), shortness of breath, exposure to an infected individual, or travel to a known COVID-19 hot-spot. A patient may be tested first for influenza, receiving a COVID-19 test only after influenza is ruled out, i.e., the flu test comes back negative.

Even in the face of shortage, the number of COVID-19 tests performed each day is steadily increasing albeit slowly.

With the current situation in mind, let’s look at a few metrics.

Media reports

The most frequently reported statistics are the cumulative number of confirmed COVID-19 cases and the number of deaths due to coronavirus. A few news organizations report the number of new confirmed cases for the preceding 24-hour period (AKA “per day”).

Most numerate people realize that “If you test, you will find the disease.” If you test more, then the raw number of confirmed cases will also rise.

Very few news organizations report the total number of tests performed per day as well as the number of new confirmed cases per day. Lacking the number of tests performed for each day, one cannot put the number of new confirmed cases in context. Nor can an investigator determine a trend over time, i.e., is community mitigation (e.g., social distancing) controlling the spread of COVID-19?

We should insist on receiving three per-day metrics: the number of tests performed that day, the number of new confirmed cases and the number of negative cases. The number of tests performed will tell us if COVID-19 test capacity has improved or not. Then, I recommend computing, tracking and comparing the ratio of new per-day COVID-19 cases (confirmed positives) divided by the number of tests performed that day. The ratio better indicates the growth (or decline) of the disease in the community.

The Washington State Department of Health does report a daily breakdown. Their page breaks out confirmed cases by county, by age and by sex. The Department of Health also provides testing information. Unfortunately, the numbers of individuals tested, positive and negative, are cumulative. As of March 17, the test figures are:

  • Positive: 1,012
  • Negative: 13,117
  • Individuals tested: 14,129 (Positive+Negative)
  • Percent positive: 1,012 / 14,129 = 7.2%

Please give us nonaggregated results for each day!
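Until per-day figures are published, one can recover them from consecutive cumulative reports by simple differencing. The sketch below shows the idea. The function name and every number in the example are hypothetical; they are not Washington State Department of Health data.

```python
def daily_changes(cumulative):
    """Convert a list of cumulative totals into day-over-day changes."""
    changes = []
    for previous, current in zip(cumulative, cumulative[1:]):
        if current < previous:
            raise ValueError("cumulative totals must not decrease")
        changes.append(current - previous)
    return changes


# Hypothetical cumulative totals from four consecutive daily reports.
cumulative_positive = [100, 130, 170, 220]
cumulative_tested = [1_000, 1_500, 2_100, 2_800]

new_positive = daily_changes(cumulative_positive)
new_tested = daily_changes(cumulative_tested)

for day, (cases, tests) in enumerate(zip(new_positive, new_tested), start=2):
    print(f"Day {day}: {cases} new cases / {tests} new tests = {100 * cases / tests:.1f}%")
```

Differencing only works if the reports are truly one day apart and the agency never revises earlier totals downward, which is one more reason to ask for the per-day numbers directly.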

Beware sampling bias

One might be inclined to interpret the daily percentage of new confirmed cases as the rate of incidence of the disease in the community (the population at large). Unfortunately, the pre-testing criteria guarantee a biased sample. We are testing patients for COVID-19 only if they have symptoms, have a certain risk factor and/or do not have the flu. The ratio may overstate the incidence of disease or, worse, understate the incidence by missing asymptomatic, infected individuals.

The lack of COVID-19 test capacity leaves public health officials in a terrible bind. One would like to randomly sample (test) the population at large in order to estimate the number of infected people (symptomatic and asymptomatic) in the community. This is akin to taking a political poll. We need to select randomly (say, 1,600 people) from the community and test the selected individuals for COVID-19. We estimate the total number of infected people by multiplying the proportion of infected people as measured by sampling (“the poll”) times the overall population size.

One needs the right kind of test, of course. We want to measure current, active infections, not past infections that have resolved (recovered). This leads us (almost) to the classic SIR model (susceptible, infected, recovered) for epidemics — yet another topic! Also, time to ask the Google about “real-time reverse transcription – polymerase chain reaction (RT-PCR)” and COVID-19 testing. Who said self-isolation was uninteresting? 🙂

From statistics theory, a random sample of 1,600 tests has a margin of error of plus/minus 2.45% at the usual 95% confidence level. 1,600 tests is a modest number and, unfortunately, we can barely perform medical diagnostic testing let alone conduct a necessary statistical study at this time. Ideally, we would have sufficient capacity to conduct “tracking polls” to determine the overall trend, i.e., are there more (or fewer) infected people today than last week? Our officials are truly flying blind at the worst possible time.
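The 2.45% figure comes from the standard margin-of-error formula for a sampled proportion, z * sqrt(p * (1 - p) / n), evaluated at the worst case p = 0.5 with z = 1.96 for 95% confidence. Here is a short sketch of that calculation and of the "poll times population" estimate described earlier. The function names, the 80 positive results and the community of 800,000 people are made-up inputs for illustration only.

```python
import math

Z_95 = 1.96  # z-score for a 95% confidence level


def margin_of_error(sample_size, proportion=0.5, z=Z_95):
    """Margin of error for a proportion estimated from a simple random sample."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return z * math.sqrt(proportion * (1.0 - proportion) / sample_size)


def estimate_infected(positives, sample_size, population, z=Z_95):
    """Return (low, estimate, high) for the number of infected people in the population."""
    proportion = positives / sample_size
    margin = margin_of_error(sample_size, proportion, z)
    low = max(0.0, proportion - margin) * population
    high = min(1.0, proportion + margin) * population
    return low, proportion * population, high


print(f"Worst-case margin for 1,600 tests: {100 * margin_of_error(1_600):.2f}%")

# Hypothetical poll: 80 positives out of 1,600 random tests in a community of 800,000.
low, estimate, high = estimate_infected(80, 1_600, 800_000)
print(f"Estimated infected: {estimate:,.0f} (roughly {low:,.0f} to {high:,.0f})")
```

The sketch assumes a truly random sample and a perfectly accurate test; the rationed, symptom-driven testing described above satisfies neither assumption.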

Recent studies suggest that asymptomatic people are a significant source of new infections. That’s why public health officials are saying, “Act and behave as though you are infected.” We need tests. We need tools.

Stay apart and stay healthy — P.J. Drongowski

COVID-19 Numeracy: NYT curve flattening by country

The New York Times published a collection of graphs comparing new COVID-19 cases over time by country. The article asks the question “Which Country Has Flattened the Curve for the Coronavirus?”

If you are interested in COVID-19 numeracy, I suggest reading the comments section. There are good insights there.

Here is my own response.

Thank you for posting your study. I started blogging about “COVID-19 numeracy” in response to the rather poor way media outlets have portrayed the disease numerically. We are in this fight for the long haul (12 to 18 months) and need metrics that guide our actions.

Please listen to the recommendations in this comment section. Many of us have spent years in measurement and statistics.

Raw numbers, e.g., the number of (new) confirmed cases, are not always meaningful or useful. We know that the number of new confirmed cases per-day will rise dramatically as testing increases. The number of new confirmed cases needs to be “normalized” against the number of tests performed. I suggest tracking the ratio of new confirmed cases divided by the number of tests.

I strongly agree that we need to track the progress of COVID-19 on a daily basis. (A seven day moving average is a good idea.) The total (cumulative) number of cases/deaths — as tracked and reported by most media outlets — will not be useful 2, 3, 4 months into the crisis, especially when there will be “waves” of resurgence and subsidence. We need to understand the dynamics of the pandemic.

Health authorities need to report the number of tests per day, the number of positive cases for that day and the number of negative cases. The raw number of tests per day will tell us if authorities are meeting their commitment to increase the number of tests and allow us to compute ratios, etc.

Should we ever get to the point of spare testing capacity beyond diagnosis, we need to conduct periodic community studies, something akin to a political poll. Take a random sample of the community and determine the number of symptomatic and asymptomatic cases (by age, by sex, etc.) Such polling will allow us to track the actual infection rate in the population at large.
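The seven-day moving average mentioned in the comment above can be sketched in a few lines. The function name and the daily percentages are hypothetical, invented only to show the smoothing.

```python
def moving_average(values, window=7):
    """Return the trailing moving average of values over the given window.

    The result has len(values) - window + 1 entries, or none if there are
    fewer values than the window size.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    averages = []
    for end in range(window, len(values) + 1):
        chunk = values[end - window:end]
        averages.append(sum(chunk) / window)
    return averages


# Hypothetical daily percentages of positive tests over ten days.
daily_percent_positive = [6.0, 9.5, 7.0, 5.5, 8.0, 7.5, 6.5, 9.0, 8.5, 7.0]
for average in moving_average(daily_percent_positive):
    print(f"{average:.2f}%")
```

One design choice to be aware of: averaging daily percentages gives every day equal weight. An alternative is to add up the cases and the tests over the window and divide the sums, which gives more weight to days with more tests.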

COVID-19: KPI

First off, I would like to say how sad I am for those who have lost their lives to COVID-19, offering my support and empathy to their loved ones. We live in Snohomish County, Washington, not very far from King County and Kirkland. We’ve already seen the devastation which COVID-19 can wreak on care facilities for elderly people. This disease is all too real.

By now, you’ve heard the phrase “flattening the curve” and have probably seen the graph from the CDC (below). Our global goal is to slow the spread of COVID-19 through the population such that severe cases do not overwhelm the health care system. In a nutshell, the health care system has a fixed number of beds, doctors, nurses, caretakers, respirators, ventilators, etc. When capacity is exhausted, the system cannot treat all incoming patients (COVID-19 plus the regular, on-going stream of emergency situations like heart attacks, strokes, etc.) and, quite frankly, people will die.


Source: CDC, Drew Harris (Connie Hanzhang Jin/NPR)

The graph has done a good job of educating us as to the need for social distancing, hygiene, and other measures which slow the spread of disease.

It also suggests a metric — a key performance indicator (KPI) — which can tell us how well we are doing. A KPI is a measurable value that demonstrates how effectively an organization is achieving its objective(s). It’s important to note that a KPI can and should be applied at different levels in the organization. We also need to know how the KPI changes over time.

Let’s consider a KPI which measures the number of all critical care patients above or below the capacity of the health care system (below). This delta tells us if we are successfully suppressing the spread of COVID-19 or not. We must measure all critical care patients because they all are vying for the same medical personnel, beds and equipment. If the KPI is positive, i.e., the number of critical care cases exceeds capacity, then we are failing. If the KPI is negative, then we are succeeding.

COVID-19 Key Performance Indicator (KPI)
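In code, the KPI is a single subtraction. The sketch below uses a hypothetical function name and invented patient and capacity counts; real values would have to come from hospital reporting.

```python
def capacity_kpi(critical_care_patients, capacity):
    """KPI = all critical care patients minus critical care capacity.

    Positive means demand exceeds capacity; negative means there is headroom.
    """
    if critical_care_patients < 0 or capacity < 0:
        raise ValueError("counts must not be negative")
    return critical_care_patients - capacity


# Hypothetical region with 1,000 critical care beds.
for patients in (850, 1_000, 1_200):
    kpi = capacity_kpi(patients, capacity=1_000)
    status = "over capacity" if kpi > 0 else "within capacity"
    print(f"{patients} patients: KPI = {kpi:+d} ({status})")
```

Note that the patient count must include everyone in critical care, not just COVID-19 patients, for the reason given above.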

A national KPI value is only somewhat useful. The KPI needs to be measured at the state level and regional (metro) level. State-level values are useful only for small states such as Rhode Island with one major population center. Regional-level values are more useful in large states like California or Washington with two or more major population (and health care) centers. The U.S. is a very large country and health care capacity in West Virginia, for example, is not available to residents in San Francisco. Thus, regional numerical break out is required.

We need to track the KPI over time. The trend will tell us if we are successfully suppressing the spread of COVID-19 or not. Tracking by region over time will tell us if hot spots are cooling off or if new hot spots are developing.
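Here is one way the regional tracking might look, as a sketch only. The region names, capacities and daily patient counts are invented, and the trend measure (average day-over-day change in the KPI) is my own simple assumption; a real analysis would likely use something more robust.

```python
def kpi_series(daily_patients, capacity):
    """KPI for each day: critical care patients minus capacity."""
    return [patients - capacity for patients in daily_patients]


def trend(series):
    """Average day-over-day change; positive means the KPI is rising."""
    if len(series) < 2:
        raise ValueError("need at least two days to compute a trend")
    return (series[-1] - series[0]) / (len(series) - 1)


# Hypothetical regions with five days of critical care patient counts.
regions = {
    "Region A": {"capacity": 500, "patients": [420, 450, 480, 510, 540]},
    "Region B": {"capacity": 300, "patients": [290, 280, 265, 250, 240]},
}

for name, data in regions.items():
    series = kpi_series(data["patients"], data["capacity"])
    direction = "heating up" if trend(series) > 0 else "cooling off or flat"
    print(f"{name}: KPI {series} -> {direction}")
```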

Why am I proposing this KPI, especially now? As a people, we need to walk a fine line between actionable concern and fearful panic.

I see tables and graphics in the media which tally the total number of confirmed cases and the total number of deaths. Yes, we mourn the loss of our neighbors. If we have even an ounce of humanity, how can we not? I agree that such figures convey the sense of urgency needed to motivate new behavior that slows the spread of disease.

However, the total number of confirmed cases and fatalities alone are flawed measures for decision making. With respect to confirmed cases, the number can only go up — drastically — as we ramp up testing. Media need to at least report the number of people tested each day along with the number of negative results as well as the number of new confirmed cases. Even then, perhaps only the ratio of new confirmed cases to the number of new tests is truly meaningful.

Just to be clear, “each day” means the number of people tested, confirmed positive and negative in a specific 24 hour period, not an accumulated tally to date.

Aggregated and accumulated measurements are not practical and may be unnecessarily misleading. News media please take notice.

We also need to break out and track new critical COVID-19 cases which require hospitalization. These are the people who need the health care system. By tracking this number, we will see if the load on the health care system is trending upward toward catastrophe or trending downward successfully. As much as I love my younger and/or healthier brothers and sisters, if they are successfully recovering in self-isolation, then they are not loading the health care system with negative implications, and possibly death, for critical care patients.

I apologize to anyone who may feel offended by my frank discussion. I’m trying to come to grips with all of the information thrown at me by media outlets. Hopefully, you will find this approach to be useful. Even if you do not adopt my proposal, I hope to have started a practical discussion.

Wishing you safe passage — P.J. Drongowski