Introduction
Machine learning (ML) has emerged as a powerful tool in finance for extracting insights, making predictions, and optimizing decision-making processes. Every ML model depends on the quality of the financial data it is trained on. This is where EODHD (EOD Historical Data) steps in, providing a comprehensive suite of financial data APIs that serve as the foundation for sophisticated ML applications in finance.
EODHD offers a rich array of financial data, including historical stock prices, real-time market data, fundamental data, technical indicators, and sentiment data. This diverse dataset spans over 150,000 tickers across 70+ global exchanges, providing developers and data scientists with the necessary data for training robust ML models.
The purpose of this article is to explore how EODHD’s financial data can be leveraged to train ML models and to examine its real-world applications. Whether you’re a fintech startup looking to develop a new trading algorithm, or a researcher exploring new frontiers in quantitative finance, understanding how to effectively utilize EODHD’s data in ML models can significantly elevate the quality of your projects.
A Python notebook with all the example code is available via the link.
Quick jump:
- 1 Introduction
- 2 The Role of Financial Data in Machine Learning
- 3 Types of Machine Learning Models for Financial Data
- 4 Financial Forecasting with EODHD Sentiment Data
- 5 Example of Successful Projects Built with EODHD Data
- 6 Benefits of Using EODHD Data for ML Projects
- 7 Getting Started with EODHD for Machine Learning
- 8 Conclusion
The Role of Financial Data in Machine Learning
Financial data serves as the lifeblood of ML models in the finance sector. These models are only as good as the data they’re trained on, making the quality, accuracy, and comprehensiveness of financial data paramount. EODHD’s offerings play a crucial role in this ecosystem by providing clean, validated, and easily accessible financial data.
Let’s explore the types of financial data provided by EODHD and their significance in ML applications:
- Historical Stock Prices: This forms the backbone of many financial ML models. EODHD provides extensive historical data, allowing models to learn from past market behaviors and identify patterns.
- Real-time Market Data: For models that need to make split-second decisions, such as in high-frequency trading, EODHD’s real-time data feed is invaluable.
- Fundamental Data: Balance sheets, income statements, and cash flow data are crucial for models that aim to assess a company’s intrinsic value or predict long-term performance.
- Technical Indicators: Pre-calculated technical indicators save development time and ensure consistency in model inputs across different applications.
- Sentiment Data: This alternative data source can provide models with insights into market sentiment, potentially predicting short-term price movements.
The importance of data quality and granularity cannot be overstated when it comes to training accurate ML models. EODHD ensures data quality through rigorous validation processes and provides granular data (up to tick-level for some markets) that allows for precise model training.
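As a rough illustration of what quality checks can look like on the consumer side, the sketch below runs a few sanity tests on a daily price frame before it is used for training. The function name `basic_quality_report`, the chosen checks, and the column names `open`, `high`, `low`, `close` are assumptions made for this example; it does not describe EODHD’s own validation process.

```python
import pandas as pd

def basic_quality_report(df: pd.DataFrame) -> dict:
    """Return counts of common data problems in a daily OHLC frame.

    Assumes a DatetimeIndex and columns named open, high, low, close.
    """
    price_cols = ["open", "high", "low", "close"]
    return {
        "duplicate_dates": int(df.index.duplicated().sum()),
        "missing_values": int(df[price_cols].isna().sum().sum()),
        "non_positive_prices": int((df[price_cols] <= 0).any(axis=1).sum()),
        "high_below_low": int((df["high"] < df["low"]).sum()),
        "dates_out_of_order": int(not df.index.is_monotonic_increasing),
    }

# Tiny hand-made frame with two deliberate problems:
# a duplicated date and a row where high < low.
sample = pd.DataFrame(
    {
        "open": [10.0, 10.5, 10.5, 11.0],
        "high": [10.8, 11.0, 11.0, 10.0],
        "low": [9.9, 10.2, 10.2, 10.7],
        "close": [10.5, 10.9, 10.9, 10.9],
    },
    index=pd.to_datetime(["2024-01-02", "2024-01-03", "2024-01-03", "2024-01-04"]),
)

report = basic_quality_report(sample)
print(report)
```

Running a report like this on every download is a cheap way to catch problems before they silently distort a model.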
Example of EODHD API Usage
The following example fetches historical stock data using EODHD’s End-of-Day API:
import requests
import pandas as pd
def get_stock_data(symbol, start_date, end_date, api_key):
    """Fetch end-of-day prices for a symbol as a date-indexed DataFrame."""
    url = f"https://eodhistoricaldata.com/api/eod/{symbol}"
    params = {
        "from": start_date,
        "to": end_date,
        "api_token": api_key,
        "fmt": "json"
    }
    response = requests.get(url, params=params)
    if response.status_code == 200:
        data = pd.DataFrame(response.json())
        data['date'] = pd.to_datetime(data['date'])
        return data.set_index('date')
    else:
        raise Exception(f"API request failed with status code {response.status_code}")
# Usage
api_key = "demo"
apple_data = get_stock_data("AAPL", "2024-01-01", "2024-09-30", api_key)
print(apple_data.head())
This code snippet demonstrates how easily EODHD’s data integrates into a Python environment, setting the stage for further data processing and model training. You can also use EODHD’s libraries for faster code development and integration.
Note: Please replace ‘demo’ with your actual EODHD API key from your dashboard. The ‘demo’ key provides data only for AAPL, TSLA, AMZN, and MSFT tickers.
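Because the `from` and `to` parameters are plain strings, a malformed or impossible date is easy to pass by accident. The following sketch validates the two dates locally before any request is made. `validate_date_range` is a hypothetical helper introduced here for illustration; it is not part of EODHD’s libraries.

```python
from datetime import date, datetime
from typing import Tuple

def validate_date_range(start_date: str, end_date: str) -> Tuple[date, date]:
    """Parse two YYYY-MM-DD strings and check that they form a valid range.

    Raises ValueError for impossible dates (for example the 30th of
    February) or when the start date is after the end date.
    """
    start = datetime.strptime(start_date, "%Y-%m-%d").date()
    end = datetime.strptime(end_date, "%Y-%m-%d").date()
    if start > end:
        raise ValueError(f"start_date {start} is after end_date {end}")
    return start, end

# A valid range parses cleanly.
start, end = validate_date_range("2024-01-01", "2024-06-30")
print(start, end)

# An impossible calendar date is rejected before it reaches the API.
try:
    validate_date_range("2024-01-01", "2024-02-30")
    impossible_date_rejected = False
except ValueError:
    impossible_date_rejected = True
print(impossible_date_rejected)
```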
Types of Machine Learning Models for Financial Data
The finance industry employs a wide range of machine learning models, each suited to different tasks and types of financial data. Let’s explore some of the most common types of ML models used with EODHD’s financial data:
Supervised Learning Models
Supervised learning is commonly used for predicting stock prices, forecasting market trends, and credit scoring. These models learn from labeled historical data to make predictions or classifications.
Examples:
- Linear Regression: Used for simple trend predictions.
- Random Forest: Effective for complex, non-linear relationships in financial data.
- Support Vector Machines (SVM): Useful for binary classification tasks like predicting market direction.
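To make the simplest of these concrete, the sketch below fits a straight-line trend to a short series of closing prices with ordinary least squares and extrapolates it one day ahead. The prices are made-up numbers rather than EODHD data, and `np.polyfit` stands in for a full regression library.

```python
import numpy as np

# Hypothetical closing prices that rise by exactly 0.5 per day.
closes = np.array([100.0, 100.5, 101.0, 101.5, 102.0, 102.5])
days = np.arange(len(closes))

# Degree-1 least-squares fit: close ~ slope * day + intercept.
slope, intercept = np.polyfit(days, closes, deg=1)

# Extrapolate the fitted line to the next (unseen) day.
next_day = len(closes)
forecast = slope * next_day + intercept

print(f"slope={slope:.3f}, intercept={intercept:.3f}, forecast={forecast:.3f}")
```

A straight line rarely describes real prices well; it is mainly useful as a baseline against which more flexible models can be compared.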
Example: A Random Forest Model for Stock Price Prediction
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
import numpy as np
import requests
import pandas as pd
def get_stock_data(symbol, start_date, end_date, api_key):
    """Fetch end-of-day prices for a symbol as a date-indexed DataFrame."""
    url = f"https://eodhistoricaldata.com/api/eod/{symbol}"
    params = {
        "from": start_date,
        "to": end_date,
        "api_token": api_key,
        "fmt": "json"
    }
    response = requests.get(url, params=params)
    if response.status_code == 200:
        data = pd.DataFrame(response.json())
        data['date'] = pd.to_datetime(data['date'])
        return data.set_index('date')
    else:
        raise Exception(f"API request failed with status code {response.status_code}")
def prepare_data(df, look_back=30):
    """Build sliding windows: the previous look_back closes predict the next close."""
    X, y = [], []
    for i in range(len(df) - look_back):
        X.append(df.iloc[i:i+look_back]['close'].values)
        y.append(df.iloc[i+look_back]['close'])
    return np.array(X), np.array(y)
# Load data
api_key = "demo"
apple_data = get_stock_data("AAPL", "2024-01-01", "2024-09-30", api_key)
print(apple_data.head())
X, y = prepare_data(apple_data)
# Keep chronological order: shuffling overlapping windows would leak future prices into training
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)
model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
# score() returns the R^2 of the predictions on the held-out, most recent 20% of windows
print(f"Model Score: {model.score(X_test, y_test)}")
The first plot shows how the model’s predicted closing prices for AAPL align with the actual values.
The residual plot shows the magnitude and distribution of the errors, giving a more detailed view of prediction reliability.
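The quantities behind plots of this kind can be computed directly. The sketch below uses a synthetic random-walk series in place of AAPL prices and a naive last-value forecast in place of the Random Forest, so the numbers are purely illustrative. The function name `residual_summary` is introduced here for the example.

```python
import numpy as np

def residual_summary(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    """Summarise prediction errors, with residual = actual - predicted."""
    residuals = y_true - y_pred
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    return {
        "mean_residual": float(residuals.mean()),  # bias of the forecasts
        "rmse": float(np.sqrt(np.mean(residuals ** 2))),
        "mae": float(np.mean(np.abs(residuals))),
        "r2": 1.0 - ss_res / ss_tot,  # same R^2 definition as a regressor's score()
    }

# Synthetic random-walk "prices" (assumption: stands in for real closes).
rng = np.random.default_rng(0)
prices = 100.0 + np.cumsum(rng.normal(0.0, 1.0, size=250))

# Naive baseline: tomorrow's close is predicted to equal today's close.
y_true = prices[1:]
y_pred = prices[:-1]

# Evaluate only on the last 20% of the series, in time order.
split = int(len(y_true) * 0.8)
summary = residual_summary(y_true[split:], y_pred[split:])
print(summary)
```

A model is only adding value if it beats a naive baseline like this one on data it has not seen.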
Unsupervised Learning Models
These models find patterns in unlabeled data, which makes them useful for discovering hidden structures in financial markets. Typical applications include market segmentation, fraud detection, and risk assessment. Examples:
- K-Means Clustering: Used for market segmentation.
- Anomaly Detection: Identifies unusual patterns that could indicate fraud or market manipulation.
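As one possible illustration of anomaly detection, the sketch below flags unusual daily returns with a robust z-score based on the median and the median absolute deviation. The returns are synthetic, the injected shock is artificial, and the threshold of 5 is an arbitrary choice for the example; production fraud or manipulation detection is considerably more involved.

```python
import numpy as np

def flag_anomalies(returns: np.ndarray, threshold: float = 5.0) -> np.ndarray:
    """Return indices whose robust z-score exceeds the threshold.

    Uses the median and the median absolute deviation (MAD), which are
    less distorted by the outliers themselves than mean and std.
    """
    median = np.median(returns)
    mad = np.median(np.abs(returns - median))
    # 1.4826 rescales MAD to match the standard deviation for normal data.
    robust_z = (returns - median) / (1.4826 * mad)
    return np.flatnonzero(np.abs(robust_z) > threshold)

# Synthetic daily returns with roughly 1% volatility (assumption, not real data).
rng = np.random.default_rng(42)
returns = rng.normal(0.0, 0.01, size=500)

# Inject one artificial shock so there is something to find.
returns[123] = 0.15

anomalies = flag_anomalies(returns)
print(anomalies)
```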
Reinforcement Learning Models
These models learn by interacting with an environment, making them suitable for dynamic financial tasks. They are usually applied to algorithmic trading, portfolio optimization, and dynamic asset allocation. Examples:
- Deep Q-Networks (DQN): Used in algorithmic trading.
- Policy Gradient Methods: Applied in portfolio optimization.
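The learning loop behind these methods can be shown with tabular Q-learning, a much simpler relative of DQN (a Deep Q-Network replaces the lookup table used here with a neural network). Everything in this sketch is an assumption made for illustration: the toy return series with built-in momentum, the two-state and two-action setup, and the hyperparameters. It ignores transaction costs and is not a trading strategy.

```python
import random

UP, DOWN = 0, 1      # state: direction of the most recent price move
FLAT, LONG = 0, 1    # action: stay in cash or hold one unit

# Toy return series with momentum: five up days, then five down days, repeated.
pattern = [1.0] * 5 + [-1.0] * 5
returns = pattern * 400

alpha, gamma, epsilon = 0.02, 0.9, 0.2
q = [[0.0, 0.0], [0.0, 0.0]]  # q[state][action]
rng = random.Random(0)

for t in range(1, len(returns) - 1):
    state = UP if returns[t - 1] > 0 else DOWN
    # Epsilon-greedy action selection: explore occasionally, otherwise exploit.
    if rng.random() < epsilon:
        action = rng.choice([FLAT, LONG])
    else:
        action = LONG if q[state][LONG] > q[state][FLAT] else FLAT
    # Reward: the next return if long, nothing if flat.
    reward = returns[t] if action == LONG else 0.0
    next_state = UP if returns[t] > 0 else DOWN
    # Q-learning update toward reward + discounted best next value.
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])

# Greedy policy read off the learned table.
policy = {s: ("LONG" if q[s][LONG] > q[s][FLAT] else "FLAT") for s in (UP, DOWN)}
print(policy)
```

On this artificial series the agent learns to hold after an up move and to stay out after a down move, because that is the pattern deliberately built into the data.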
Deep Learning Models
These complex neural networks are particularly effective for handling large volumes of financial data. They are usually used for predicting time series, analyzing sentiment for trading signals, and high-frequency trading. Examples:
- Recurrent Neural Networks (RNNs): Ideal for time series prediction.
- Long Short-Term Memory (LSTM): Effective for capturing long-term dependencies in financial data.
A Basic LSTM Model for Stock Price Prediction Using TensorFlow
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from sklearn.preprocessing import MinMaxScaler
import numpy as np
import requests
import pandas as pd
def get_stock_data(symbol, start_date, end_date, api_key):
    """Fetch end-of-day prices for a symbol as a date-indexed DataFrame."""
    url = f"https://eodhistoricaldata.com/api/eod/{symbol}"
    params = {
        "from": start_date,
        "to": end_date,
        "api_token": api_key,
        "fmt": "json"
    }
    response = requests.get(url, params=params)
    if response.status_code == 200:
        data = pd.DataFrame(response.json())
        data['date'] = pd.to_datetime(data['date'])
        return data.set_index('date')
    else:
        raise Exception(f"API request failed with status code {response.status_code}")
def prepare_data(df, look_back=30):
    """Build sliding windows: the previous look_back closes predict the next close."""
    X, y = [], []
    for i in range(len(df) - look_back):
        X.append(df.iloc[i:i+look_back]['close'].values)
        y.append(df.iloc[i+look_back]['close'])
    return np.array(X), np.array(y)
# Load data
api_key = "demo"
apple_data = get_stock_data("AAPL", "2024-01-01", "2024-09-30", api_key)
print(apple_data.head())
# Prepare data: scale closes to [0, 1], then shape as (samples, timesteps, features)
scaler = MinMaxScaler()
scaled_data = scaler.fit_transform(apple_data[['close']])
X, y = prepare_data(pd.DataFrame(scaled_data, columns=['close']))
X = X.reshape((X.shape[0], X.shape[1], 1))
# Build model: two stacked LSTM layers followed by dense layers and a single output
model = Sequential([
    LSTM(50, return_sequences=True, input_shape=(30, 1)),
    LSTM(50, return_sequences=False),
    Dense(25),
    Dense(1)
])
model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(X, y, batch_size=32, epochs=100)
print("Model trained successfully")
A visualization of the trained model’s performance
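When a model is trained on min-max scaled prices, its outputs are also in scaled units, and the scaler should ideally learn its range from the training period only so that future prices do not leak into the inputs. The sketch below shows both points with plain NumPy on made-up prices. The helper names are hypothetical; scikit-learn’s `MinMaxScaler` offers the same operations through `fit`, `transform`, and `inverse_transform`.

```python
import numpy as np

def fit_min_max(train: np.ndarray) -> tuple:
    """Learn the scaling range from the training slice only."""
    return float(train.min()), float(train.max())

def scale(values: np.ndarray, lo: float, hi: float) -> np.ndarray:
    """Map values so that the training range becomes [0, 1]."""
    return (values - lo) / (hi - lo)

def unscale(scaled: np.ndarray, lo: float, hi: float) -> np.ndarray:
    """Invert scale(): turn model outputs back into price units."""
    return scaled * (hi - lo) + lo

# Hypothetical closing prices in time order.
closes = np.array([100.0, 102.0, 101.0, 105.0, 110.0, 108.0, 115.0, 120.0])

split = int(len(closes) * 0.75)        # first 75% for training
lo, hi = fit_min_max(closes[:split])   # statistics from the past only

train_scaled = scale(closes[:split], lo, hi)
test_scaled = scale(closes[split:], lo, hi)  # may fall outside [0, 1]

print(train_scaled.min(), train_scaled.max())
print(test_scaled)
print(unscale(test_scaled, lo, hi))
```

Test values outside the [0, 1] range are expected whenever later prices exceed anything seen during training, which is exactly the situation a deployed model would face.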
Financial Forecasting with EODHD Sentiment Data
ML models can be used to predict various financial and economic indicators, from stock prices to interest rates. Here’s a simple example of sentiment-based stock price forecasting using EODHD’s sentiment data:
import requests
import pandas as pd
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from datetime import datetime, timedelta
def get_stock_data(symbol, start_date, end_date, api_key):
    """Fetch stock price data"""
    url = f"https://eodhistoricaldata.com/api/eod/{symbol}"
    params = {
        "from": start_date,
        "to": end_date,
        "api_token": api_key,
        "fmt": "json"
    }
    response = requests.get(url, params=params)
    if response.status_code == 200:
        data = pd.DataFrame(response.json())
        data['date'] = pd.to_datetime(data['date'])
        return data.set_index('date')
    else:
        raise Exception(f"API request failed with status code {response.status_code}")
def get_sentiment_data(symbol, start_date, end_date, api_key):
    """Fetch sentiment data"""
    url = "https://eodhistoricaldata.com/api/sentiments"
    params = {
        "s": symbol,
        "from": start_date,
        "to": end_date,
        "api_token": api_key,
        "fmt": "json"
    }
    response = requests.get(url, params=params)
    if response.status_code == 200:
        data = response.json()
        symbol_key = f"{symbol}.US"
        if symbol_key not in data:
            raise KeyError(f"No sentiment data found for symbol {symbol}")
        sentiment_df = pd.DataFrame(data[symbol_key])
        sentiment_df['date'] = pd.to_datetime(sentiment_df['date'])
        sentiment_df = sentiment_df.set_index('date')
        # Calculate additional sentiment metrics
        sentiment_df['sentiment_strength'] = abs(sentiment_df['normalized'] - 0.5) * 200
        return sentiment_df
    else:
        raise Exception(f"API request failed with status code {response.status_code}")
def prepare_data_with_sentiment(price_df, sentiment_df, look_back=30):
    """Prepare data with both price and sentiment features"""
    # Ensure sentiment_df has same dates as price_df
    combined_df = price_df.join(sentiment_df, how='left')
    # Forward fill sentiment values for missing dates
    combined_df['normalized'] = combined_df['normalized'].ffill()
    combined_df['count'] = combined_df['count'].ffill()
    combined_df['sentiment_strength'] = combined_df['sentiment_strength'].ffill()
    # Fill any remaining NaN values with the median of each numeric column
    combined_df = combined_df.fillna(combined_df.median(numeric_only=True))
    X, y = [], []
    feature_names = ['close', 'normalized', 'count', 'sentiment_strength']
    for i in range(len(combined_df) - look_back):
        # Concatenate the look_back window of every feature into one flat vector
        features = []
        for feature in feature_names:
            features.extend(combined_df.iloc[i:i+look_back][feature].values)
        X.append(features)
        y.append(combined_df.iloc[i+look_back]['close'])
    return np.array(X), np.array(y)
# Modified training and prediction code to include dates
def train_model_with_dates(X, y, dates, test_size=0.2):
    """Train model and return results with corresponding dates"""
    # Split the data chronologically: the most recent samples form the test set
    split_idx = int(len(X) * (1 - test_size))
    X_train, X_test = X[:split_idx], X[split_idx:]
    y_train, y_test = y[:split_idx], y[split_idx:]
    test_dates = dates[split_idx:]
    # Train model
    model = RandomForestRegressor(n_estimators=100, random_state=42)
    model.fit(X_train, y_train)
    # Make predictions
    y_pred = model.predict(X_test)
    score = model.score(X_test, y_test)
    return model, X_test, y_test, y_pred, test_dates, score
# Example usage
api_key = "demo"
symbol = "AAPL"
end_date = datetime.now()
start_date = end_date - timedelta(days=365)
start_str = start_date.strftime('%Y-%m-%d')
end_str = end_date.strftime('%Y-%m-%d')
# Get data
stock_data = get_stock_data(symbol, start_str, end_str, api_key)
sentiment_data = get_sentiment_data(symbol, start_str, end_str, api_key)
# Prepare data
X, y = prepare_data_with_sentiment(stock_data, sentiment_data)
dates = stock_data.index[30:]  # Adjust for lookback period
# Train model and get predictions with dates
model, X_test, y_test, y_pred, test_dates, score = train_model_with_dates(X, y, dates)
print(f"Model Score with Sentiment Features: {score:.4f}")
# Calculate error metrics
mse = np.mean((y_test - y_pred) ** 2)
rmse = np.sqrt(mse)
mae = np.mean(np.abs(y_test - y_pred))
print(f"Root Mean Square Error: ${rmse:.2f}")
print(f"Mean Absolute Error: ${mae:.2f}")
Results of the prediction model using EODHD’s sentiment data.
These applications demonstrate the versatility and power of ML models when combined with high-quality financial data from EODHD.
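The alignment step in the sentiment example, a left join followed by a forward fill, is easiest to see on a tiny hand-made dataset. The values below are invented for illustration; only the column names `close`, `normalized`, and `count` are taken from the example above.

```python
import pandas as pd

# Price rows exist for every trading day (hypothetical values).
prices = pd.DataFrame(
    {"close": [190.0, 191.5, 189.0, 192.0]},
    index=pd.to_datetime(["2024-03-04", "2024-03-05", "2024-03-06", "2024-03-07"]),
)

# Sentiment rows exist only on days with news coverage (hypothetical values).
sentiment = pd.DataFrame(
    {"normalized": [0.60, 0.20], "count": [12, 5]},
    index=pd.to_datetime(["2024-03-04", "2024-03-06"]),
)

# Left join keeps every price date; days without sentiment become NaN.
combined = prices.join(sentiment, how="left")

# Forward fill carries the last known sentiment into the gaps, which only
# uses information that was already available on each date.
combined[["normalized", "count"]] = combined[["normalized", "count"]].ffill()

print(combined)
```

Forward filling is a deliberate choice here: filling backward instead would copy later sentiment onto earlier dates and leak future information into the features.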
Example of Successful Projects Built with EODHD Data
To illustrate the practical applications of EODHD data in machine learning projects, let’s explore how some companies are leveraging this data to power their financial technology solutions.
Eccuity
Eccuity is a financial technology company that provides advanced analytics and risk management solutions. They utilize EODHD’s comprehensive financial data to train their machine learning models for various purposes:
- Market Risk Assessment: Eccuity uses historical price data and volatility indicators from EODHD to train models that assess market risk across different asset classes.
- Portfolio Optimization: By incorporating EODHD’s fundamental data and technical indicators, Eccuity’s ML models suggest optimal asset allocations for their clients’ portfolios.
Korzo
Korzo is a platform that provides algorithmic trading solutions. They integrate EODHD data into their ML models for:
- Signal Generation: Using EODHD’s real-time and historical data, Korzo’s models identify potential trading signals.
- Backtesting: EODHD’s extensive historical data allows Korzo to rigorously backtest their trading strategies.
Quinetics
Quinetics specializes in quantitative trading strategies. They use EODHD data extensively in their ML models for:
- Technical Analysis: Quinetics’ models process EODHD’s technical indicators to identify potential trading opportunities.
- Fundamental Analysis: By incorporating EODHD’s fundamental data, Quinetics’ models assess the intrinsic value of stocks.
- Sentiment Analysis: Quinetics uses EODHD’s sentiment data to gauge market mood and adjust their trading strategies accordingly.
- Economic Forecasting: EODHD’s macroeconomic data feeds into Quinetics’ models for predicting broader market trends.
Benefits of Using EODHD Data for ML Projects
EODHD transforms financial machine learning development by providing a comprehensive data solution that combines precision with practicality. At its core, the platform delivers four-decimal-place accuracy across its data sets, serving over 70 global exchanges through intuitive APIs that support industry-standard formats. This foundation of accessibility and accuracy makes EODHD an invaluable resource for organizations looking to develop sophisticated financial ML models without the overhead of complex data management systems.
The platform’s approach to data management addresses key challenges in financial ML development. By providing pre-cleaned, validated data sets alongside extensive historical databases, EODHD significantly reduces the resource investment typically required for data preparation and validation. Organizations can leverage both deep historical data for pattern recognition and real-time feeds for current market analysis, enabling the development of more comprehensive and responsive ML models. This dual capability supports both strategic long-term analysis and tactical short-term trading strategies.
EODHD’s value proposition extends beyond basic market data through its integration of fundamental analysis tools and alternative data sources. The platform provides rich company-level data and market sentiment analysis, offering developers unique perspectives for model development. This comprehensive approach, backed by reliable technical infrastructure, enables organizations to accelerate their ML development cycles while maintaining high data quality standards. By reducing the complexity of data management and increasing the availability of diverse data sources, EODHD helps organizations focus on their core objective: developing effective and innovative financial ML solutions that drive competitive advantage in the market.
Getting Started with EODHD for Machine Learning
EODHD offers a comprehensive suite of financial APIs that can be leveraged for various machine learning projects. Here’s a brief overview of the key APIs and their potential use cases in ML:
The End-of-Day (EOD) Historical Data API provides comprehensive historical pricing data that’s essential for training models focused on long-term trend analysis and backtesting trading strategies. For optimal results, you can combine this data with fundamental metrics to build more robust predictive models.
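As a minimal sketch of what backtesting on daily closes involves, the example below evaluates a long-only moving-average crossover on a synthetic price path. The strategy, the window lengths, and the function names are assumptions chosen for illustration, and the test ignores transaction costs, slippage, and dividends.

```python
import numpy as np

def sma(values: np.ndarray, window: int) -> np.ndarray:
    """Simple moving average; the first window-1 entries are NaN."""
    out = np.full(len(values), np.nan)
    cumsum = np.cumsum(np.insert(values, 0, 0.0))
    out[window - 1:] = (cumsum[window:] - cumsum[:-window]) / window
    return out

def backtest_sma_crossover(closes: np.ndarray, fast: int = 5, slow: int = 20) -> np.ndarray:
    """Daily strategy returns for a long-only fast/slow SMA crossover."""
    fast_ma, slow_ma = sma(closes, fast), sma(closes, slow)
    # Signal known at the close of day t (comparisons with NaN are False).
    signal = (fast_ma > slow_ma).astype(float)
    daily_returns = closes[1:] / closes[:-1] - 1.0
    # Yesterday's signal earns today's return, which avoids look-ahead bias.
    return signal[:-1] * daily_returns

# Synthetic price path (assumption: stands in for EOD closes).
rng = np.random.default_rng(7)
closes = 100.0 * np.cumprod(1.0 + rng.normal(0.0005, 0.01, size=500))

strategy_returns = backtest_sma_crossover(closes)
total_return = float(np.prod(1.0 + strategy_returns) - 1.0)
buy_and_hold = float(closes[-1] / closes[0] - 1.0)
print(f"strategy: {total_return:.2%}, buy and hold: {buy_and_hold:.2%}")
```

The one-day lag between signal and return is the important design choice: trading on a signal computed from the same close that generates the return would overstate performance.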
The Intraday Historical Data API delivers granular, time-series data perfect for developing high-frequency trading models and analyzing short-term price movements. This API becomes particularly powerful when used alongside technical indicators to generate more accurate trading signals.
The Live (Delayed) Stock Prices API supports near-real-time model inference and dynamic portfolio rebalancing.
The Fundamental Data API supplies comprehensive company financial metrics, ideal for training models focused on value investing and predicting company performance. To achieve a more holistic view of a company’s prospects, it’s beneficial to combine this data with sentiment analysis.
The Technical Indicators API offers pre-calculated technical indicators that serve as valuable features for various trading models. Users are encouraged to experiment with different combinations of indicators as input features to optimize model performance.
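To show how indicator values become model features, the sketch below computes two indicators locally with pandas and pairs each day’s target with the indicator values from the previous day. The local calculation is only a stand-in for pre-calculated values from the API, the prices are invented, and `simple_rsi` is a hypothetical helper that uses plain rolling means, so its output can differ from RSI implementations based on Wilder’s smoothing.

```python
import pandas as pd

def simple_rsi(close: pd.Series, window: int = 14) -> pd.Series:
    """RSI using plain rolling means of gains and losses.

    Note: this is the simple-average variant; Wilder's original RSI uses
    exponential smoothing, so values can differ from other sources.
    """
    delta = close.diff()
    avg_gain = delta.clip(lower=0).rolling(window).mean()
    avg_loss = delta.clip(upper=0).abs().rolling(window).mean()
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)

# Hypothetical closes: strictly rising, then strictly falling.
close = pd.Series([float(p) for p in list(range(100, 120)) + list(range(119, 99, -1))])

features = pd.DataFrame({
    "sma_5": close.rolling(5).mean(),
    "rsi_14": simple_rsi(close),
})

# Shift by one row so each target is paired with indicators from the day before.
dataset = features.shift(1).assign(target=close).dropna()
print(dataset.head())
```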
The Alternative Data APIs, including sentiment analysis, provide non-traditional data signals that can enhance conventional trading models. It’s recommended to utilize Natural Language Processing techniques to extract additional features from the textual data provided.
Remember to refer to EODHD’s official documentation for the most up-to-date information on API usage and best practices.
Conclusion
Throughout this comprehensive guide, we’ve explored the intricate world of machine learning in finance, demonstrating how EODHD’s robust APIs can serve as the backbone for sophisticated fintech applications. From basic stock price prediction models to complex algorithmic trading systems, we’ve seen how high-quality financial data is crucial for developing accurate and reliable ML models.
As the field of fintech continues to evolve, the combination of machine learning techniques and comprehensive financial data will play an increasingly crucial role.
Whether you’re building a personal trading algorithm, developing a comprehensive financial analysis platform, or creating the next groundbreaking fintech application, EODHD is committed to providing the high-quality data and support you need to succeed. We invite you to explore our full range of financial data services and join our community of innovative fintech developers.