The US stock markets and COVID-19
By Gund Jungsanguanpornsuk
Disclaimer: The below references an opinion and is for information purposes only. It is not intended to be and is not investment advice. Seek a duly licensed professional for investment advice.
Motivations

On March 9th, the NYSE halted trading for the first time since 1997 due to the COVID-19 induced selloff. That was only a preview of what was to come; on March 20th the S&P500 hit a YTD low, down approximately 33% from its peak this year. Having lived in Thailand until coming to the US for college 2 years ago, I had never witnessed such a selloff. Even though we have rebounded dramatically from those levels, I thought it was an interesting opportunity to explore.

What?
Originally, I wanted to use the COVID-19 numbers to predict when the US stock markets will bottom. Because it is too late for that (or not?), in this Medium post I will instead try to use other data related to the COVID-19 pandemic to predict, approximately, when stocks will rebound to their YTD peak, as well as spot relationships between such data and prices.
In doing so, I will separate my analysis into 3 parts:
- (1) Using US COVID-19 current numbers and projections to predict, approximately, when stocks will rebound to their YTD peak.
- (2) Analyzing the relationship between expected vs actual COVID-19 numbers and stock prices.
- (3) Analyzing US Twitter sentiment and stress with stock prices.
A few notes before we begin:
- Because this project is quite exploratory and open-ended (I wasn’t sure if it would work), I will not only present my findings but also briefly walk you guys through my methodology, logic, and thoughts.
- The S&P500, DJIA, and NASDAQ composite are the indexes most widely followed by investors. As seen above, because they have all been following the same trend, I will be using only the S&P500 for my analysis. I chose the S&P500 because it represents approximately 80% of the value of the US stock market (Investopedia).
- I know, a lot of you are probably thinking you can’t use only COVID-19 trends to predict the market. It’s true, and predicting the stock market continues to be a problem with no solution. However, for the purposes of this project, I wanted to isolate the effect of the COVID-19 pandemic. The stock market is currently in a rally, and my project assumes that we have hit the bottom and that this is the rally that will take us back to the YTD peak.
Part 1: Can we use current COVID-19 numbers and projections to predict stock trends?
I decided to use current COVID-19 death count and projections, as opposed to confirmed case count. This is because using confirmed case count hides one important unknown variable: testing rate. The death count mitigates this.

First, I plotted the number of deaths in the US to date against the S&P500 (Figure 1).
To make it easier to visualize and spot trends, I inverted the graph of numDeaths. The change is shown in figure 2. Instead of numDeaths, I used max(numDeaths) - numDeaths. I will be using this inverted graph for the rest of the project for visualization purposes.
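The inversion described above can be sketched in a few lines of Python. This is only an illustration: the function name and the death counts are made up, not taken from the project's data.

```python
def invert_series(num_deaths):
    """Return max(numDeaths) - numDeaths for every day in the series."""
    peak = max(num_deaths)
    return [peak - d for d in num_deaths]

# Made-up cumulative death counts, for illustration only.
num_deaths = [0, 10, 50, 200, 1000]
inverted = invert_series(num_deaths)
print(inverted)  # [1000, 990, 950, 800, 0]
```

A rising death count becomes a falling line, so it moves in the same direction as a falling stock price.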
We can clearly see that the deaths graph lags the S&P500, and that makes sense. We can now shift the dates left for the deaths graph for 2 reasons:
- (1) When looking forward, people mostly look at case count. However, we are using death count for the reasons stated above. An infection can take up to 14 days to show up as a positive case, and it takes at least a couple more days, if not weeks, to lead to death. Therefore, I believe it is fair to shift the deaths graph left by 14 days to emulate case count.
- (2) Then, because markets are a function of expectation, we have to account for when markets expect a particular case count. Let’s assume that because this pandemic is very unpredictable and was unexpected, the market expects only 10 days ahead.
- Therefore, I will shift the deaths graph left by 24 days. This produces the graph below.
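The two shifts above add up to 24 days. A minimal sketch of that date shift follows, assuming the series is stored as a date-to-value mapping; the constant names, the helper, and the values are hypothetical.

```python
from datetime import date, timedelta

LAG_TO_CASES_DAYS = 14  # assumption (1): deaths shifted left to emulate case count
EXPECTATION_DAYS = 10   # assumption (2): how far ahead the market is assumed to look

def shift_left(series, days):
    """Move every observation `days` earlier; `series` maps date -> value."""
    return {d - timedelta(days=days): v for d, v in series.items()}

# Made-up inverted death values for the last three days of data.
deaths = {date(2020, 4, 25): 100, date(2020, 4, 26): 90, date(2020, 4, 27): 80}
shifted = shift_left(deaths, LAG_TO_CASES_DAYS + EXPECTATION_DAYS)
print(max(shifted))  # 2020-04-03
```

Data that ends on April 27th ends on April 3rd once it is shifted, which is why the actual data runs out early and projections are needed beyond that date.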

As we can observe, the market and COVID-19 deaths (adjusted for expectation) follow similar trends. I then proceeded to my prediction.
However, the deaths graph (actual data) ended on April 3rd (due to shifting the dates left). To make predictions, I had to use projections beyond that date. I chose the Institute for Health Metrics and Evaluation (IHME) projections for US COVID-19 death numbers (it was a suggested projection on the CDC website). However, because the projections were too high, I had to scale them first:

I fit a linear regression model (a supervised machine learning algorithm) of price on max(numDeaths) - numDeaths, in order to predict when markets will return to the YTD peak. The model was trained on actual data up until April 3rd.
Using the linear regression model, I proceeded to test the model on IHME projected data from April 3rd to April 27th (present). With a very slight manual scaling adjustment at the end, it produced the following result:
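The fit-then-predict step described above can be sketched as follows. This is a minimal ordinary least squares fit written by hand, not the code used for the post; the numbers are made up, and the manual scaling adjustment is left out because its details are not given.

```python
def fit_line(x, y):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def predict(model, x):
    slope, intercept = model
    return [intercept + slope * a for a in x]

# Made-up training data: inverted deaths (feature) and index level (target).
inverted_deaths = [0, 100, 200, 300]
price = [2300, 2500, 2700, 2900]
model = fit_line(inverted_deaths, price)

# Made-up "projected" inverted deaths beyond the training window.
print(predict(model, [400, 500]))
```

Training on actual data and predicting from projected deaths keeps the two data sources separate, which is what makes the April 3rd to April 27th window usable as a test.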

In terms of trend and date-to-date prediction, I thought the predicted prices were reasonably accurate, given the limited 1 month training data. So, I used the model to predict the prices from April 3rd onwards based on IHME projected data with the same manual scaling adjustment at the end.
In using the IHME projections for the prediction, I also had to “stretch” the projection. This is because in IHME’s projection, the time from peak deaths to 0 deaths was almost the same as the time from 0 deaths to peak deaths in the beginning. Even in China, the former took twice as long as the latter, which sounds a lot more realistic. Therefore, I stretched the post-peak portion of the projections to take twice as long. I believe a factor of two was a fair, best-case assumption, because China handled their situation really well.
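One way to do this kind of stretch is to keep the rise to the peak unchanged and resample the decline onto a time axis that is twice as long, using linear interpolation. The sketch below assumes a list of daily projected deaths; the function name, the integer `factor`, and the numbers are illustrative, and the post does not say exactly how the stretch was implemented.

```python
def stretch_after_peak(daily, factor=2):
    """Keep the rise to the peak as-is; slow the decline by an integer `factor`."""
    peak = daily.index(max(daily))
    tail = daily[peak:]
    stretched = []
    for t in range(factor * (len(tail) - 1) + 1):
        pos = t / factor                  # position on the original time axis
        lo = int(pos)
        hi = min(lo + 1, len(tail) - 1)
        frac = pos - lo
        # Linear interpolation between the two neighbouring original days.
        stretched.append(tail[lo] + (tail[hi] - tail[lo]) * frac)
    return daily[:peak] + stretched

# Made-up symmetric projection: 2 days up to the peak, 2 days back down.
projection = [0, 10, 20, 10, 0]
print(stretch_after_peak(projection))  # [0, 10, 20.0, 15.0, 10.0, 5.0, 0.0]
```

In the example the decline now takes 4 days instead of 2, while the path up to the peak is untouched.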

Based on this result, I predict that, approximately, the S&P500 will return to its YTD peak of $3,386, at the earliest, during the first half of December. Now, this last bit required a bit of personal judgement. My rationale was that the last portion of the predicted graph is unrealistically flat (this was due to the fading out of COVID-19 deaths). Thus, if we were to continue that line beyond where it ends with a slightly steeper slope, I predict that December is when we would hit $3,386.
Part 2: Analyzing the relationship between expected vs actual COVID-19 deaths and stock prices

One other relationship I hypothesized was a positive correlation between the Expected(numDeaths) - Actual(numDeaths) of a particular day and price the day after. Because the projections were only available from March 25th onwards, I was only able to analyze this relationship after the March 20th low of the S&P500.
As seen above, there was actually a surprisingly high positive correlation of 0.83 between the 2 variables, which makes sense. If the actual number of deaths is lower than expected, markets should react positively. However, I believe the correlation is surprisingly high due to the over-projections (as shown in figure 4). Nevertheless, we can observe that there is a somewhat positive relationship.
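The relationship above pairs each day's Expected(numDeaths) - Actual(numDeaths) with the price on the following day. A rough sketch of that calculation, using a hand-written Pearson correlation and made-up numbers (the function names are hypothetical, and this is not the data behind the 0.83 figure):

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def next_day_correlation(expected, actual, price):
    """Correlate day t's (expected - actual) deaths with day t+1's price."""
    surprise = [e - a for e, a in zip(expected, actual)]
    return pearson(surprise[:-1], price[1:])

# Made-up values, one entry per trading day.
expected = [100, 120, 150, 160]
actual = [90, 100, 120, 150]
price = [2500, 2600, 2700, 2800]
print(round(next_day_correlation(expected, actual, price), 2))
```

Dropping the last surprise and the first price is what lines up "a particular day" with "the day after".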
Part 3: Analyzing US Twitter sentiment + stress during the COVID-19 pandemic with stock prices
I saw that the Penn Medicine Center for Digital Health had created an interesting analysis on US Twitter Digital Health during COVID-19 (https://tinyurl.com/Penn-covid). Unfortunately, after asking, they were only able to provide data on change in sentiment and stress, and not topic prevalence on the economy and panic buying (which would have been more directly relevant). Nonetheless, I thought the data provided would make for an interesting analysis with stock prices. The data was represented as a z_score relative to January levels, which I thought was a good representation given January was still quite normal in the US.
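For readers unfamiliar with the idea, a z-score relative to a baseline period expresses each value as the number of baseline standard deviations it sits away from the baseline mean. The sketch below is a generic illustration with made-up numbers; how the Penn team actually computed their scores (for example, which standard deviation they used) is not stated here, so treat every detail as an assumption.

```python
from statistics import mean, pstdev

def z_scores(values, baseline):
    """Score each value against the mean and spread of a baseline period."""
    mu = mean(baseline)
    sigma = pstdev(baseline)  # population std dev; an assumption for this sketch
    return [(v - mu) / sigma for v in values]

# Made-up daily stress levels: a "January" baseline, then three later days.
january = [2, 4, 4, 4, 5, 5, 7, 9]
later_days = [5, 7, 9]
print(z_scores(later_days, january))  # [0.0, 1.0, 2.0]
```

A score of 0 means "same as a typical January day", and positive scores mean above the January norm.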
Sentiment

There was a slightly positive correlation between US Twitter sentiment and price. This makes sense, because investors should be more willing to buy with positive sentiment. However, the low positive correlation does not lead us to any conclusions.
Stress

Matching graphs; looks encouraging? Actually, it does not make sense. This graph is showing that as stock prices plummet, US Twitter stress is decreasing. From this graph, we can almost conclude that US Twitter stress is not a good representation of the markets.
Maybe investors are stressed with the opportunity of buying in at low prices?..haha
Final thoughts..
My findings:
- I predict that, approximately, US stock prices will return to YTD peak prices at the earliest in December. Remember, this is just based on the COVID-19 situation, and not the hundreds/thousands of other things markets factor in. Hence the "at the earliest".
- There is a positive correlation between the Expected(numDeaths) - Actual(numDeaths) of a particular day and the stock price the day after. Thus, if you have a differentiated view on the US COVID-19 situation (as opposed to projections released by the government etc.), that may present a good opportunity.
- US Twitter sentiment and stress are not quite representative of the markets.
I found this to be an interesting dive into an open ended problem related to an industry I am interested in. I hope you guys enjoyed the read the same way I enjoyed creating the post! Stay safe.
Data
This project would not have been possible without the following data sources:
- Yahoo Finance
- Daily COVID-19 data: Center for Systems Science and Engineering (CSSE), Johns Hopkins University
- Institute for Health Metrics and Evaluation (IHME). COVID-19 Hospital Needs and Death Projections. Seattle, United States of America: Institute for Health Metrics and Evaluation (IHME), University of Washington, 2020
- Guntuku, S. C., Sherman, G., Stokes, D., Agarwal, A., Seltzer, E., Merchant, R.M., Ungar, L.H., Tracking mental health and symptom mentions on Twitter during COVID-19. 2020