AI and Alternative Data for Quantamental Investing

Raghav Bharadwaj

Raghav serves as Content Lead at Emerj, covering our major industry areas and conducting research. Raghav has a personal interest in robotics, and previously worked for research firms like Frost & Sullivan and Infiniti Research.

AI and machine learning had successful applications in the financial sector even before the arrival of the mobile banking ecosystem. AI is being used to draw insights from data for financial investing and trading, wealth management, asset management, and risk management.

Investors and financial advisors have traditionally relied on the information available about stocks and company performance in SEC filings. The investment divisions of large banks and independent hedge funds have also had access to information on macroeconomic factors, such as economic indicators and market conditions, and on microeconomic factors, such as a company’s financial condition and the performance of its management team.

As smartphones and devices multiply, cameras and other sensors boom, and organizations increasingly ground their business processes in data, new kinds of analysis are opening up for traders and investors to make more informed decisions about the world – beyond traditional data sources like stock price activity or earnings reports.

Defining Quantamental Investing

Quantamental investing is a term now used to describe combining quantitative and fundamental methods across different data sets to improve the accuracy of predicted outcomes, such as stock prices or the financial performance of a company (or country).

We spoke to Anwar Ghauche and Carlos Pazos from SparkCognition about how AI can help with quantamental investing and what finance business leaders should know about improving ROIs using AI. According to Anwar:

[Quantamental investing] means different things to different people… Morgan Stanley believes that it is about using quant factors for screening, with fundamental analysis for stock picking. If you ask some of the more quant-heavy folks, they would tell you that it’s about using fundamental factors in a quantitative and systematic training environment.

The way I see it, it’s a bit more of a spectrum. The bulk of the quantamental analysts are using quantitative methods to analyze alternative data and the output is a piece to the puzzle in their fundamental analysis.

Examples of using “alternative” data include:

  • Using natural language processing for sentiment analysis of a financial institution’s Twitter channel
  • Using image processing on satellite imagery of parking lots and correlating the results with sales for a certain retail firm

These kinds of analysis would be entirely impossible without the proliferation of new data sources, and the development of new AI methods.
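As a rough illustration of the parking-lot example, the sketch below computes a Pearson correlation between weekly car counts and weekly sales. It assumes the image-processing step has already produced the counts. The numbers and the names `pearson`, `weekly_car_counts`, and `weekly_sales` are made up for illustration and do not come from SparkCognition or any real retailer.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: cars counted per week, and weekly sales in $ millions.
weekly_car_counts = [120, 135, 150, 110, 160, 170]
weekly_sales = [1.20, 1.31, 1.52, 1.08, 1.63, 1.70]

r = pearson(weekly_car_counts, weekly_sales)
print(f"correlation between car counts and sales: {r:.3f}")
```

A high correlation on historical data only suggests that car counts may be a useful proxy. It says nothing about whether the relationship will hold in future periods.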

The big difference from traditional quantitative investing lies in the type of data being analyzed. In quantamental investing, AI systems are applied to alternative data. This usually refers to information such as social media data, geolocational data, satellite imagery, email receipt data, credit card purchases, website traffic or app purchases: any information that is not taken from SEC or equivalent filings.

There seems to be some evidence that AI and machine learning hedge funds are performing better than traditional funds. For instance, the image below, from SparkCognition’s presentation at Columbia University, shows the performance index for AI funds compared to traditional human-managed funds:

Many of the funds in this index are using alternative data to augment their investment strategies. According to Anwar, until now the issue with alternative forms of data has been that they are largely unstructured (social media posts don’t come with convenient labels about what they mean, and images don’t come with labels on how many cars or people are in the picture).

AI might be a good fit for this task since it can sift through mountains of such data at a speed and scale that human teams simply cannot match. While this can help to unearth new insights, quantamental investing requires a strategic approach and overcoming a number of barriers to accuracy and relevance.

Challenges in Combining Alternative Data

Although the financial sector has historically recorded plenty of trading and investment-related data, that data was not collected with data science projects in mind. Not all of it is easily accessible, harmonized, or easy to use for training algorithms. Transforming old or disjointed financial data into fuel for AI-based predictions is hard, and doing so for unstructured alternative data can be even more challenging.

There is no set way to leverage quantamental approaches. Any new use of alternative data must be thought through, built out, and tested in order to determine its potential value.

Let’s say that we believe that the conversations on online blogs and web forums can help determine how well a certain pharmaceutical drug is selling. Questions arise:

  • What websites or forums should we analyze? Which do we believe will be indicative of real customers of the drug?
  • What data should we extract from these sites? Sentiment analysis? The description of symptoms and side-effects? The mention of the drug along with related entities (such as other drugs or drug companies)?
  • Are we able to crawl these websites in real time in order to get the text data in the first place?
  • Do we want to factor in additional information about the users who post the content (giving some more value, and others less), or should we treat them all equally?
  • How long would it take for us to train this model?
  • How much money would it cost to get access to this sentiment information, and do we believe that it would be the best way to spend our time and money?

Absolutely none of these questions have easy answers, especially because quantamental analysis with AI is novel and experimental. Clearly, however, it is still powerful – and for many firms that means that it will be worth experimenting with in order to gain an early advantage.
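To make the drug-forum questions concrete, here is a minimal sketch of one possible extraction step: checking whether a post mentions a drug and scoring its tone with a tiny word list. This is a deliberately naive lexicon count, not the NLP models a production system would use. The word lists, the drug name "examplodrug", and the function names are all hypothetical.

```python
import re

# Hypothetical, deliberately tiny word lists; a real lexicon would be far larger.
POSITIVE = {"better", "improved", "relief", "effective", "recommend"}
NEGATIVE = {"worse", "nausea", "headache", "ineffective", "stopped"}


def score_post(text, drug_name):
    """Return (mentions_drug, sentiment) for a single forum post.

    sentiment is (positive hits - negative hits) / total hits, in [-1, 1],
    or 0.0 when no lexicon word appears in the post.
    """
    words = re.findall(r"[a-z']+", text.lower())
    mentions_drug = drug_name.lower() in words
    positive_hits = sum(word in POSITIVE for word in words)
    negative_hits = sum(word in NEGATIVE for word in words)
    total_hits = positive_hits + negative_hits
    sentiment = (positive_hits - negative_hits) / total_hits if total_hits else 0.0
    return mentions_drug, sentiment


def average_sentiment(posts, drug_name):
    """Average sentiment over posts that mention the drug; None if none do."""
    scores = []
    for post in posts:
        mentions_drug, sentiment = score_post(post, drug_name)
        if mentions_drug:
            scores.append(sentiment)
    return sum(scores) / len(scores) if scores else None


posts = [
    "Examplodrug gave me relief and I feel better",
    "Stopped taking examplodrug because of nausea",
    "Nothing about the drug here",
]
print(average_sentiment(posts, "examplodrug"))
```

Even this toy version surfaces the design questions listed above: which posts count, whether all users are weighted equally, and what exactly is being extracted.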

Expanding the capabilities of existing investment managers with AI requires data science resources. Although the financial services industry was a relatively early mover in acquiring teams of data scientists, the range of applications and the amount of data that must be handled and combined effectively are vast. Hedge funds and banks often have the funds to bring on experienced talent.

The data science challenge here for investment firms is that it’s not just about building one model, but building several models for each type of data and combining the insights in a way that makes sense.
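One simple way to picture "combining the insights" is to standardize each model’s output and take a weighted average. The sketch below assumes each model has already produced one score per company. The signal names, scores, and weights are hypothetical, and a weighted z-score blend is only one of many possible combination schemes.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize a list of scores to mean 0 and unit standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # a constant signal carries no ranking information
    return [(v - mu) / sigma for v in values]


def combine_signals(signals, weights):
    """Weighted average of standardized signals.

    signals: dict of signal name -> list of scores, one per company, same order.
    weights: dict of signal name -> non-negative weight.
    """
    names = list(signals)
    lengths = {len(signals[name]) for name in names}
    if len(lengths) != 1:
        raise ValueError("need at least one signal, all with the same length")
    total_weight = sum(weights[name] for name in names)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    standardized = {name: zscores(signals[name]) for name in names}
    n_items = lengths.pop()
    return [
        sum(weights[name] * standardized[name][i] for name in names) / total_weight
        for i in range(n_items)
    ]


# Hypothetical per-company outputs from three separate models.
signals = {
    "social_sentiment": [0.2, -0.1, 0.5],
    "parking_lot_traffic": [140.0, 90.0, 160.0],
    "web_visits": [1.1e6, 0.8e6, 1.5e6],
}
weights = {"social_sentiment": 1.0, "parking_lot_traffic": 2.0, "web_visits": 1.0}
print(combine_signals(signals, weights))
```

Standardizing first keeps a signal measured in millions of web visits from swamping one measured on a -1 to 1 scale.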

In addition, alternative data can become obsolete. Models have to be constantly updated with new data to keep their predictions accurate, which puts their lasting value into question. For example:

  • Will a sentiment analysis model still be relevant when new jargon, emojis, or even entirely new social media platforms become available?
  • How do our sales projections adjust when a retailer changes its product lines? What if the same number of cars in the parking lot means vastly more revenue than before (higher margins), or vastly less (heavy couponing campaigns that drive low-margin purchases)?

The questions above aren’t necessarily unique to the finance sector, and experienced data scientists will be familiar with these challenges. Time is often the limiting factor here, as hands-on tweaking and data-wrangling can keep data scientists from higher-value activities, such as developing stronger hypotheses or finding new ways that AI can unlock value from data.
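A basic guard against a model quietly going stale is to compare its recent prediction errors with the errors it showed when it was first validated. The sketch below is one such check. The function name `needs_retraining` and the 1.25 tolerance are assumptions chosen for illustration, not a recommended setting.

```python
from statistics import mean


def needs_retraining(baseline_errors, recent_errors, tolerance=1.25):
    """Flag a model whose recent mean absolute error has drifted upward.

    baseline_errors: prediction errors measured when the model was validated.
    recent_errors: prediction errors on the most recent data.
    tolerance: how many times the baseline error is acceptable (hypothetical default).
    """
    if not baseline_errors or not recent_errors:
        raise ValueError("both error lists must be non-empty")
    baseline_mae = mean([abs(e) for e in baseline_errors])
    recent_mae = mean([abs(e) for e in recent_errors])
    return recent_mae > tolerance * baseline_mae


# Hypothetical errors in predicted vs. actual weekly sales ($ thousands).
validation_errors = [12.0, -8.0, 10.0, -9.0]
errors_after_product_line_change = [30.0, -25.0, 28.0, 35.0]
print(needs_retraining(validation_errors, errors_after_product_line_change))
```

A check like this does not explain why a model has degraded, but it can tell a team when the hands-on work described above is actually needed.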

Businesses might first need to take stock of what data they have, what data they can actually use, and what data really holds valuable insights. Combining this with an AI system that can draw on multiple forms of such data can help investment firms make better predictions and gain new trading ideas.

Given below is an example from SparkCognition, which claims its DeepNLP product can help automate data collection from unstructured sources to augment the capabilities of investors or traders. According to a whitepaper from the company, the system allows clients to customize the ontologies of data collection by picking the datasets they think are most impactful for the prediction, and to generate summary reports.

AI for Automation Of Investment Operations

Anwar and Carlos think that, among AI applications today, many supervised machine learning models are being used to integrate data and predict the sales or stock price of a given company.
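In its simplest form, a supervised model of this kind learns a mapping from an alternative-data feature to a known outcome. The sketch below fits a one-feature least-squares line from monthly web visits to monthly sales. The data and the names `fit_line` and `predict` are hypothetical, and real systems would use many features and far more robust models.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("cannot fit a line when all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new feature value."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data: monthly web visits (thousands) and sales ($ millions).
web_visits = [200.0, 250.0, 300.0, 350.0, 400.0]
sales = [2.1, 2.6, 3.0, 3.6, 4.1]

model = fit_line(web_visits, sales)
print(f"predicted sales at 450k visits: {predict(model, 450.0):.2f}")
```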

Financial firms that gain the data competency and have a model that combines all of the different types of datasets by linking to company dataflows can automate investing processes that use alternative data. Building such a model, which will be unique to each investment firm, is complex, and it will take time to reach a level where the system outperforms humans. Yet, with more incoming data, investment firms can refine their models to make them more accurate.

With their data structured, these firms can take advantage of automated machine learning systems to speed up training and updating models for new datasets. Carlos states that automated ML systems can help with retraining models in a rapid and scalable way. He adds that this is especially important since self-adaptability is a core tenet of quantamental investing.

SparkCognition also offers an automated machine learning platform called Darwin that they claim can help speed up the building and deployment of models. For example, Carlos explains in this 2-minute video how the Darwin platform can be used to predict customer churn in financial or retail applications.

What Can We Expect in the Future

At present, the use of alternative data in quantamental investing might be concentrated in the hands of a few large asset managers, whether independent hedge funds or units housed within a banking conglomerate. Carlos says that the reason for this might be that, although many investment firms have data records, the investment required to acquire this data, make it machine readable, combine insights, deploy auto ML tools, and run the IT infrastructure to support all this data is substantial.

This means only the large investment banks and asset management firms have the resources to deploy these systems in a scalable way. As more fruitful use-cases are developed, we can expect to see more and more investment in quantamental investing. For example:

  • Methods for proxying retail store parking lot images to sales data may become more sophisticated, with relatively repeatable methodologies and well-known best-practices.
  • Methods for determining the sales of a pharmaceutical drug based on social media and news articles may be developed to a degree to which they are well known and used by most investment firms.

As soon as one method is developed and established, new methods will emerge. It is often said that in the stock market “money is made in the dark” (i.e. by having an advantage or insight that others don’t have), and quantamental investing opens up almost all of the world’s data to become that “dark” insight.

Successfully combining SEC and alternative data, and using AI to extract information from both, can help investment firms test scenarios, ask questions of their data, and receive answers to questions with different hypothetical starting conditions.

From hedge fund managers to mutual funds and even private equity managers, alternative data might help improve the valuation of securities and boost the clarity of the investment process. Alternative data enriches the structured data sets already acquired by investment management firms, fueling the potential for information advantage and providing a distinct differentiator in terms of speed and knowledge.

Quantamental investing strategies are no replacement for fundamental analysis, or even for some kinds of present-day technical analysis, but they give investors the ability to draw data from the real world (not just abstractions from the stock market) to inform their investments, and that’s a trend we can expect to continue.

This article was sponsored by SparkCognition, and was written, edited and published in alignment with our transparent Emerj sponsored content guidelines. Learn more about reaching our AI-focused executive audience on our Emerj advertising page.


Header Image Credit: PYMNTS
