**Backtest:**

We assume that we are able to buy at market open and liquidate at market close. Our backtest does not incorporate:

  1. Use of options/derivatives

  2. Self-financing portfolios

  3. Leverage

  4. Optimal sizing of trades: all positions are the same size

  5. Transaction costs

  6. Slippage
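
The open-to-close assumption above can be sketched as follows. This is a minimal illustration, not the project's actual backtest code: the `Bar` container, the function names, and the treatment of a short position as the negated price change are all assumptions made for the example. Positions are equal-weighted, and no costs or slippage are deducted, matching the exclusions listed above.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Bar:
    """One day's prices for one stock (hypothetical container)."""
    open: float
    close: float


def open_to_close_return(bar: Bar, side: int) -> float:
    """Return of entering at the open and exiting at the close.

    side is +1 for a long position and -1 for a short position
    (assumption: a short earns the negated price change).
    No transaction costs or slippage are deducted.
    """
    return side * (bar.close - bar.open) / bar.open


def daily_return(positions: List[Tuple[Bar, int]]) -> float:
    """Equal-weighted average return across the day's positions."""
    if not positions:
        return 0.0  # no trades that day
    total = sum(open_to_close_return(bar, side) for bar, side in positions)
    return total / len(positions)


positions = [
    (Bar(open=100.0, close=102.0), +1),  # long: price rises 2%
    (Bar(open=50.0, close=49.0), -1),    # short: price falls 2%
]
print(daily_return(positions))
```

Averaging the per-position returns is one simple way to express "all positions are the same size"; it says nothing about compounding across days, which the sketch leaves out.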

**Top 5 position selection**

For each day, we select the top 5 long positions and the top 5 short positions based on the features and discard the remaining positions.
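
A minimal sketch of this daily selection, assuming each stock has already been reduced to a single numeric score where higher means more bullish. The score itself, the function name `select_top_positions`, and the ticker labels are hypothetical; the text does not specify how the features are combined into a ranking.

```python
from typing import Dict, List, Tuple


def select_top_positions(scores: Dict[str, float], k: int = 5) -> Tuple[List[str], List[str]]:
    """Pick the k highest-scoring tickers as longs and the k lowest as shorts.

    scores maps ticker -> score for a single day (assumed input).
    A ticker is never placed in both lists, even if fewer than 2*k
    tickers are available.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)  # best first
    longs = ranked[:k]
    # Walk the remaining tickers from the worst score upward.
    shorts = list(reversed(ranked[k:]))[:k]
    return longs, shorts


# Example with 12 made-up tickers scored 0..11.
day_scores = {f"T{i}": float(i) for i in range(12)}
longs, shorts = select_top_positions(day_scores)
print(longs)   # five highest-scoring tickers
print(shorts)  # five lowest-scoring tickers
```

Tickers that fall in neither list (here `T5` and `T6`) are simply dropped for that day.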

**Proof-of-Concept 1: Reuters dataset 1 2017-2020**

The data is split chronologically into training data (2017-2018) and test data (2019-2020). The text is preprocessed with text normalization, stemming, lemmatization, and stop-word removal.
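
The split by year and the simpler preprocessing steps might look like the sketch below. It is illustrative only: the article record layout (`date`, `text`), the function names, and the tiny stop-word list are assumptions. Stemming and lemmatization are left out because they normally rely on a library such as NLTK.

```python
import re
from datetime import date

# Tiny illustrative stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "on"}


def normalize(text):
    """Lowercase, keep alphanumeric tokens only, and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def split_by_year(articles, train_years=(2017, 2018), test_years=(2019, 2020)):
    """Split article records into train and test sets by publication year."""
    train = [a for a in articles if a["date"].year in train_years]
    test = [a for a in articles if a["date"].year in test_years]
    return train, test


# Made-up records, used only to exercise the functions.
articles = [
    {"date": date(2017, 3, 1), "text": "Shares of ACME rose in London"},
    {"date": date(2018, 7, 9), "text": "The bank cut its forecast"},
    {"date": date(2019, 1, 15), "text": "Oil prices fell on supply news"},
    {"date": date(2020, 5, 4), "text": "A merger lifted the stock"},
]
train, test = split_by_year(articles)
print(len(train), len(test))
print(normalize(articles[0]["text"]))
```

Splitting on the calendar year keeps every test article strictly later than every training article, which avoids training on news published after the period being predicted.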

*Model accuracy on the validation dataset:*

NLTK VADER Sentiment Analyzer - N/A

Linear Classifier - 53%

Sentimetre Model 1 - 53%

Sentimetre Model 2 - 57%

*Prediction accuracy on the test dataset:*

NLTK VADER Sentiment Analyzer - 50%

Linear Classifier - 52%

Sentimetre Model 1 - 51%

Sentimetre Model 2 - 55%