Stock selection using machine learning & Sentifi's ESG score
Project by: Eduardo Aguilar Moreno, Anselme Borgeaud, and Rubén Coll Menéndez
To conclude their Bootcamp, students work in groups to complete a capstone project. This is an excellent opportunity for students to solve real data science problems provided by companies and research institutions. In this case, our students worked on a project provided by Sentifi.
Who is Sentifi?
Sentifi is an established fintech company and alternative data provider. It transforms raw data into investment analytics to support institutional investors across multiple stages of their decision-making process. Leading financial services organizations use these analytics to gain unique insights on over 50,000 companies, currencies, commodities, and the events that impact their valuation.
Capstone project scope
Our students were tasked with developing an AI model that uses Sentifi’s ESG score and related features (such as ESG events, sentiment, attention, etc.) to select stocks in such a way that the respective portfolio outperforms the market.
An ESG (Environmental, Social, and Corporate Governance) score is an evaluation of a firm’s collective conscientiousness for environmental, social, and governance factors. Investors are increasingly applying these non-financial factors as part of their investment process to identify material risks and growth opportunities. To help investors and the government, data providers publish ESG ratings for companies and their stocks. The problem with these ratings is that they are created manually by analysts and are only updated yearly.
Sentifi has developed an ESG score that can be calculated in real time by an AI engine. Sentifi’s AI engine scans five hundred million news articles, blogs, forums, and tweets every day. It detects ESG events reported in these sources and updates the score based on the intensity and sentiment of the discussion around each ESG event.
The goal of the project
The goal of this project was to develop a machine learning model that uses Sentifi’s ESG score and related features (such as ESG events, sentiment, attention, etc.) to select stocks in such a way that the respective portfolio outperforms the market.
Using Sentifi's ESG scores and attention data as features, Eduardo, Anselme, and Rubén trained XGBoost models to predict the expected performance of stocks. By using their models to pick the most promising stocks out of the S&P 500, they outperformed the base index by 20% over a period of 6 years. They also outperformed a random selection strategy by more than 45%. As the image below shows, the portfolio outperformed the market while remaining diversified across sectors. Furthermore, using machine learning explainability tools, they could demonstrate the relevance of Sentifi's ESG-related features. Specifically, they could show that higher scores for environmental and social awareness correlate with better overall performance.
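The selection step described above can be illustrated with a minimal sketch. It assumes that a trained model has already produced one score per stock and that the portfolio is equally weighted; the tickers, scores, and returns are hypothetical placeholders, and the equally weighted benchmark is a simplification rather than the actual S&P 500 weighting or the group's evaluation code.

```python
def select_top_k(predicted_scores, k):
    """Return the k tickers with the highest predicted score."""
    ranked = sorted(predicted_scores, key=predicted_scores.get, reverse=True)
    return ranked[:k]


def equal_weight_return(tickers, realized_returns):
    """Average realized return of an equally weighted basket of tickers."""
    return sum(realized_returns[t] for t in tickers) / len(tickers)


# Hypothetical model scores and realized returns for one rebalancing period.
predicted_scores = {"AAA": 0.9, "BBB": 0.2, "CCC": 0.7, "DDD": 0.4}
realized_returns = {"AAA": 0.05, "BBB": -0.01, "CCC": 0.03, "DDD": 0.01}

portfolio = select_top_k(predicted_scores, k=2)
portfolio_return = equal_weight_return(portfolio, realized_returns)

# Simplified benchmark: hold every stock in the universe with equal weight.
benchmark_return = equal_weight_return(list(realized_returns), realized_returns)
excess_return = portfolio_return - benchmark_return
```

Repeating this comparison for every rebalancing period and compounding the results is one way to arrive at a multi-year outperformance figure.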
To ensure the success of an ESG-based approach, the focus remains on long-term investment strategies. The group therefore had to find the top-ranked companies that delivered high returns. Stocks with a high performance over the last 7 days and a low performance over the last 30 days led to high returns.
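As a rough illustration of the two performance features mentioned above, the sketch below computes trailing returns from a list of daily closing prices. The function names, the feature names `perf_7d` and `perf_30d`, and the choice to count observations rather than calendar days are assumptions made for this example; the source does not specify how the group computed them.

```python
def trailing_return(prices, window):
    """Simple return over the last `window` observations of a price series."""
    if len(prices) <= window:
        raise ValueError("need more than `window` prices")
    return prices[-1] / prices[-1 - window] - 1.0


def performance_features(prices):
    """Hypothetical feature pair: last-7-day and last-30-day performance."""
    return {
        "perf_7d": trailing_return(prices, 7),
        "perf_30d": trailing_return(prices, 30),
    }


# Made-up price path: a dip followed by a recent recovery.
prices = [100.0] * 23 + [90.0] * 7 + [99.0]
features = performance_features(prices)
```

In this made-up series the stock is up over the last 7 observations but slightly down over the last 30, which is the pattern the paragraph above associates with high returns.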
A model that maximizes returns
The group created a machine learning model that tracked a short-term average against a long-term average and sent out a buy signal when the two averages intersected. This signal helps investors decide which companies to select when building a portfolio in order to maximize the customer's return.
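The crossover idea can be sketched with simple moving averages. This is an assumption-laden illustration, not the group's model: the source does not say which series was averaged, which window lengths were used, or in which direction the averages had to cross, so the example below uses a generic daily series and treats an upward crossing of the short average over the long average as the buy signal.

```python
def moving_average(values, window):
    """Simple moving averages; entry i covers values[i : i + window]."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def buy_signals(values, short_window, long_window):
    """Indices into `values` where the short average crosses above the long one."""
    short = moving_average(values, short_window)
    long_ = moving_average(values, long_window)
    # Align both series on the same end dates by dropping early short averages.
    short = short[long_window - short_window:]
    signals = []
    for i in range(1, len(long_)):
        was_below = short[i - 1] <= long_[i - 1]
        is_above = short[i] > long_[i]
        if was_below and is_above:
            signals.append(i + long_window - 1)
    return signals


# Hypothetical daily series: a decline followed by a recovery.
series = [10, 9, 8, 7, 6, 7, 8, 9, 10]
signals = buy_signals(series, short_window=2, long_window=4)
```

With these made-up values, the only signal appears shortly after the series turns upward, once the 2-day average rises above the 4-day average.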
They were able to confirm the positive impact of the ESG score on performance: higher ESG scores resulted in higher returns. It is therefore meaningful to invest in companies that are environmentally and socially conscious.
In addition to their findings, the group made some suggestions for improvements to their model. These recommendations included:
- Add information about each individual ESG event
- Use the performance instead of a rank as a target variable
- Try other model architectures, such as CNN or LSTM, which may provide new insights and serve as cross-reference models
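The second suggestion, using the performance instead of a rank as the target variable, can be made concrete with a small sketch. The function names and the forward returns are hypothetical; the example only shows what each kind of target looks like for the same set of stocks.

```python
def rank_target(forward_returns):
    """Cross-sectional rank target: 1 for the best-performing ticker."""
    ordered = sorted(forward_returns, key=forward_returns.get, reverse=True)
    return {ticker: position for position, ticker in enumerate(ordered, start=1)}


def performance_target(forward_returns):
    """Raw forward return as the target, unchanged."""
    return dict(forward_returns)


# Hypothetical forward returns for three stocks over the same period.
forward_returns = {"AAA": 0.30, "BBB": 0.02, "CCC": 0.01}

ranks = rank_target(forward_returns)
performances = performance_target(forward_returns)
```

A rank target treats the gap between AAA and BBB the same as the gap between BBB and CCC, while a performance target keeps the information that AAA did far better than the other two.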
The project aimed to develop a machine learning model that uses Sentifi's ESG score and related features to select stocks in a manner that outperforms the market. Over a period of 6 years, the model outperformed the random selection strategy by more than 45% and the base index by 20%. By tracking the intersection of short-term and long-term averages, the model sent a “buy signal” to investors, notifying them of the time to buy. This signal helps investors decide which companies to select when building a portfolio to maximize the customer's return. In the future, other models such as CNN or LSTM could be used as cross-reference models to get a broader perspective on the effectiveness of the ESG score.
On behalf of Constructor Learning, we would like to thank Sentifi for collaborating with us on this project!