Photo by Markus Spiske on Unsplash

4 Ways Data Scientists in Finance Can Improve Efficiency and Accuracy

It’s notoriously difficult to extract any predictive power, or “alpha”, from your data in finance. When you do, it often decays quickly because other data scientists and quants are mining the same signal.

Nevertheless, 80% of the daily moves in US stock markets are algorithmic trades, and quant hedge funds currently manage over $1 trillion, so they must be doing something right.

This short article will share insights collected from some leading data scientists in finance to help you improve your productivity and the accuracy of your machine learning modelling.

1. Asking the right questions of data in finance

For example, it is much more useful to forecast whether or not a stock will rise significantly over the following period, since this type of forecast can easily be acted on. The forecast will also be more reliable, since the task is easier. This example highlights how important it is to ask the right questions.
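As a rough illustration of framing the question this way, the sketch below turns a price series into a yes/no target: did the next period's return exceed a threshold? The function name `label_significant_rises` and the 2% threshold are assumptions made for this example, not values from the article.

```python
from typing import List


def label_significant_rises(prices: List[float], threshold: float = 0.02) -> List[int]:
    """Return 1 where the next-period return exceeds `threshold`, else 0.

    The 2% default is an arbitrary, illustrative choice of "significant".
    """
    labels = []
    # Pair each price with the one that follows it.
    for current, following in zip(prices, prices[1:]):
        next_return = following / current - 1.0
        labels.append(1 if next_return > threshold else 0)
    return labels


prices = [100.0, 103.0, 103.5, 101.0, 104.5]
labels = label_significant_rises(prices)
print(labels)  # one label per period, except the last, which has no "next" price
```

A model trained on labels like these answers a question that maps directly onto a decision, and the threshold can be tuned to whatever counts as significant for your strategy.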

If you would like to find out which methods can help you find the right questions to ask, you can check out our guide. In short, though, when it comes to data science in finance, any step that benefits from accumulated experience can also benefit from machine learning.

2. Data cleaning and ingestion

The best approach to address these issues is to create a data model of your various inputs and to centralise the data cleaning part of your workflow as much as possible, in order to avoid future incompatibilities or information leakage. Communication with the data engineers is key to making sure that the right inputs and features are being created for your model.
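One way to picture a centralised cleaning step that avoids leakage is a single object that learns its parameters from training data only and is then reused everywhere. The class below is a hypothetical, minimal sketch: `CentralCleaner` and its mean-based gap filling are assumptions for illustration, not a prescribed design.

```python
from statistics import mean
from typing import List, Optional


class CentralCleaner:
    """Hypothetical shared cleaning step: fill gaps with the training mean."""

    def __init__(self) -> None:
        self.fill_value: Optional[float] = None

    def fit(self, train_values: List[Optional[float]]) -> "CentralCleaner":
        # Learn the fill value from training data only, so that nothing
        # from the test period leaks into the cleaning parameters.
        observed = [v for v in train_values if v is not None]
        self.fill_value = mean(observed)
        return self

    def transform(self, values: List[Optional[float]]) -> List[float]:
        if self.fill_value is None:
            raise RuntimeError("call fit() on training data first")
        return [self.fill_value if v is None else v for v in values]


train = [1.0, None, 3.0]
test = [None, 10.0]

cleaner = CentralCleaner().fit(train)
clean_train = cleaner.transform(train)
clean_test = cleaner.transform(test)  # the gap is filled with the training mean
```

Because every model goes through the same fitted cleaner, a change to the cleaning logic is made once, and test data never influences how gaps are filled.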

3. Model selection and optimisation

Moreover, the best type of model and hyper-parameter range for your problem will evolve over time. This is why it’s recommended to build a flexible framework that lets you iteratively train and test various combinations of models and hyper-parameters, so that you always have an up-to-date understanding of what is driving performance.
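A very small version of such a framework might look like the sketch below. It loops over a search space of models and hyper-parameters and scores each combination with a walk-forward test, where every forecast only sees data before the point being predicted. The two toy forecasters, the `SEARCH_SPACE` layout and the mean-absolute-error metric are all assumptions chosen to keep the example self-contained.

```python
from typing import Callable, Dict, List, Tuple


def moving_average(history: List[float], window: int) -> float:
    """Forecast the next value as the mean of the last `window` points."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def exp_smoothing(history: List[float], alpha: float) -> float:
    """Forecast the next value with simple exponential smoothing."""
    level = history[0]
    for value in history[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


def walk_forward_mae(model: Callable[..., float], params: Dict,
                     series: List[float], start: int) -> float:
    """Mean absolute error when each forecast only sees earlier data."""
    errors = [abs(model(series[:t], **params) - series[t])
              for t in range(start, len(series))]
    return sum(errors) / len(errors)


# Hypothetical search space: (name, model function, list of parameter sets).
SEARCH_SPACE = [
    ("moving_average", moving_average, [{"window": 1}, {"window": 3}]),
    ("exp_smoothing", exp_smoothing, [{"alpha": 0.5}, {"alpha": 0.9}]),
]


def run_search(series: List[float], start: int = 3) -> List[Tuple[float, str, Dict]]:
    """Score every model and parameter combination, best (lowest MAE) first."""
    results = []
    for name, model, grid in SEARCH_SPACE:
        for params in grid:
            score = walk_forward_mae(model, params, series, start)
            results.append((score, name, params))
    return sorted(results, key=lambda result: result[0])


series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
results = run_search(series)
for score, name, params in results:
    print(f"{name} {params}: MAE={score:.3f}")
```

Re-running the search as new data arrives shows when the ranking of models and parameters shifts. Adding a new model is a one-line change to the search space.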

When your models are not performing very well, building and optimising an ensemble model can help improve the forecasting accuracy.
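The simplest form of ensemble is a weighted average of the individual forecasts. The sketch below uses made-up numbers in which one model is biased high and the other low, so their errors partly cancel. Real models will not always behave this conveniently, and the helper names here are assumptions for illustration.

```python
from typing import List, Optional


def ensemble_forecast(forecasts: List[List[float]],
                      weights: Optional[List[float]] = None) -> List[float]:
    """Combine several forecast series into one by weighted averaging."""
    n_models = len(forecasts)
    if weights is None:
        # Default to equal weights when none are supplied.
        weights = [1.0 / n_models] * n_models
    horizon = len(forecasts[0])
    return [sum(w * f[i] for w, f in zip(weights, forecasts))
            for i in range(horizon)]


def mae(predicted: List[float], actual: List[float]) -> float:
    """Mean absolute error between two equally long series."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Illustrative numbers only: model_a is biased high, model_b biased low.
actual = [1.0, 2.0, 3.0]
model_a = [1.5, 2.5, 3.5]
model_b = [0.7, 1.7, 2.7]

combined = ensemble_forecast([model_a, model_b])
print(mae(model_a, actual), mae(model_b, actual), mae(combined, actual))
```

Optimising the ensemble then amounts to choosing the weights, ideally on a validation period that is separate from the data used to judge the final result.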

4. Enriching data in finance

For example, will a geolocation data set of petrol stations allow me to better understand the behaviour of my consumers if I join it with my own data?

On its own, each data set may have little or no predictive power, but joined together they can be worth more than the sum of their parts. In one study, we joined IBES and PermId data sets to forecast earnings surprises.
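Mechanically, enriching a data set usually comes down to a join on a shared identifier. The sketch below shows an inner join over plain dictionaries. The field names and records are entirely hypothetical and do not reflect the actual IBES or PermId schemas.

```python
from typing import Dict, List


def inner_join(left: List[Dict], right: List[Dict], key: str) -> List[Dict]:
    """Merge rows from `left` and `right` that share the same `key` value.

    Assumes `key` is unique within `right`. Rows without a match are dropped.
    """
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


# Hypothetical records, invented for this example.
estimates = [
    {"company_id": "A", "eps_estimate": 1.2},
    {"company_id": "B", "eps_estimate": 0.8},
]
reference = [
    {"company_id": "A", "sector": "Energy"},
    {"company_id": "C", "sector": "Retail"},
]

joined = inner_join(estimates, reference, "company_id")
print(joined)  # only company "A" appears in both data sets
```

The rows that fail to match are worth inspecting too, since poor identifier coverage is often the main obstacle to combining data sets.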

Mind Foundry is an Oxford University company. Operating at the intersection of innovation, research, and usability, we empower teams with AI built for the real world.
