Have you ever thought about how you could make smarter decisions when it comes to investing in the stock market? I’ve been on that very journey, and I’ve come across a remarkable tool called Live (Delayed) Stock Prices API that provides almost real-time information about any stock, delayed by 15 minutes. This discovery got me thinking, and I decided to create my own personal stock market assistant. With live stock data at your fingertips, you can make more informed choices without relying on expensive platforms. What’s more, you can develop your own investment strategy based on historical data to help you make the best decisions.
I explored various ways to quickly access up-to-date stock prices and trends, and I found that most solutions fell short of my needs, except for this API that offers live stock data.
Let’s take a closer look at what this API can do. This is the output when the API is requested for TSLA stock:
This output provides all the information you need to create a successful investment strategy. Now, let’s delve into how you can build a model to simplify your financial decision-making.
- 1 Python Implementation
- 2 Where to Go from Here
1. Importing the Necessary Packages
Begin by importing some essential Python packages to support your project. These packages will assist you in handling data, training models, and more.
import pandas as pd
import numpy as np
from eodhd import APIClient
from xgboost import XGBClassifier
from sklearn.ensemble import IsolationForest
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
Here’s what each package does:
Pandas: Helps with various data operations.
Numpy: Used for mathematical operations in Python.
train_test_split: A scikit-learn helper that splits your data into training and test sets.
XGBoost, Random Forest, and Isolation Forest: The models used for our task. XGBoost and Random Forest are classifiers, while Isolation Forest is an anomaly detector.
eodhd: The official EODHD library for accessing their APIs.
2. API Key Activation
To use the package’s functions, you first need to register your EODHD API key with it. If you don’t have a key yet, head over to the EODHD website, complete the registration to create an account, and open the ‘Settings’ page, where you will find your secret API key. Do not share this key with anyone. You can activate the API key with the following code:
api_key = '<YOUR API KEY>'
client = APIClient(api_key)
The code is pretty simple. In the first line, we store the secret EODHD API key in the api_key variable. In the second line, we use the APIClient class provided by the eodhd package to activate the API key and store the resulting object in the client variable.
Note that you need to replace <YOUR API KEY> with your secret EODHD API key. Rather than hard-coding the key as text, you can improve security by, for example, reading it from an environment variable.
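As a minimal sketch of the environment-variable approach mentioned above: the variable name EODHD_API_KEY and the helper load_api_key are my own choices for illustration, not something the eodhd package defines.

```python
import os

def load_api_key(var_name: str = "EODHD_API_KEY") -> str:
    """Return the API key stored in an environment variable, or raise if it is unset."""
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return key

# Demo only: set a placeholder so the sketch runs end to end.
# In real use, export the variable in your shell instead of setting it in code.
os.environ.setdefault("EODHD_API_KEY", "<YOUR API KEY>")
api_key = load_api_key()
```

The returned api_key can then be passed to APIClient exactly as before, and the key never has to appear in your source files.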
3. Loading Historical Data
Retrieve historical stock data for the period you’re interested in. This data will be the foundation for training your model. We can easily extract the historical data of stocks using EODHD’s historical market data API endpoint via the eodhd package:
def get_historical_data(ticker, start_date, end_date):
    json_resp = client.get_historical_data(symbol = ticker, period = '5m', from_date = start_date, to_date = end_date, order = 'a')
    df = pd.DataFrame(json_resp)
    df = df.set_index('date')
    df.index = pd.to_datetime(df.index)
    return df

TSLA = get_historical_data('TSLA', '2021-08-02', '2021-09-02')
In the above code, we use the get_historical_data function provided by the eodhd package to extract the split-adjusted historical stock data of Tesla. The function takes the following parameters:
- the symbol parameter (filled from our wrapper’s ticker argument), where the symbol of the stock we want data for is given
- the period parameter, which is the time interval between data points (5 minutes in our case)
- the from_date and to_date parameters, which indicate the starting and ending dates of the data. The input format is “YYYY-MM-DD”.
- the order parameter, an optional parameter that sorts the dataframe by date in either ascending (a) or descending (d) order
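To see what the dataframe steps in that function do without calling the API, here is a small sketch that runs them on two made-up records. The field names and values in sample_resp are placeholders I invented, and the final sort_index call is my own addition to guard against an unexpected ordering.

```python
import pandas as pd

# Made-up records standing in for the JSON list the API returns.
sample_resp = [
    {"date": "2021-08-02 13:35:00", "open": 700.0, "close": 701.5},
    {"date": "2021-08-02 13:30:00", "open": 699.0, "close": 700.0},
]

def to_frame(json_resp):
    """Turn a list of records into a dataframe indexed by timestamp."""
    df = pd.DataFrame(json_resp)
    df = df.set_index("date")
    df.index = pd.to_datetime(df.index)  # strings -> DatetimeIndex
    return df.sort_index()               # oldest row first

df = to_frame(sample_resp)
```

With a DatetimeIndex in place, time-based slicing and resampling become available, which is handy once you start engineering features.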
4. Obtaining Live Data for Prediction
Use EODHD’s Live (Delayed) Stock Prices API via the eodhd package to access live stock data, which is crucial for your decision-making process.
def extract_intraday(symbol):
    raw_df = client.get_live_stock_prices(ticker = symbol)
    df = pd.DataFrame([raw_df])
    return df

tsla_intraday = extract_intraday('TSLA')
This function takes a stock code as input, fetches live stock information from the API, converts the response into a Pandas dataframe, and returns it.
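If you are wondering why the response is wrapped in a list before it goes into pd.DataFrame, the sketch below shows the reason using a placeholder quote. The field names follow the columns this tutorial drops later plus the usual price fields, and every value is invented for illustration.

```python
import pandas as pd

# Placeholder quote with invented values; a single quote is a flat dict of scalars.
sample_quote = {
    "code": "TSLA.US", "timestamp": 1630612800, "gmtoffset": 0,
    "open": 734.0, "high": 740.0, "low": 731.0, "close": 733.0,
    "volume": 1000000, "previousClose": 734.1, "change": -1.1, "change_p": -0.15,
}

# Wrapping the dict in a list tells pandas "this is one row".
live_df = pd.DataFrame([sample_quote])

# Passing the bare dict of scalars fails, because pandas cannot infer an index.
try:
    pd.DataFrame(sample_quote)
    bare_dict_works = True
except ValueError:
    bare_dict_works = False
```

The result is a one-row dataframe with one column per field, which is the shape a trained model expects for a single prediction.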
5. Preprocessing the Data
Before you can train a model to predict stock prices, you need to prepare and clean the data. This step helps your model learn from consistent, relevant inputs, which generally improves its predictions. Here’s what you should do:
Check for Class Imbalances: Sometimes, you might have more data for one class (e.g., “buy”) than another (e.g., “sell”). This can skew your model’s predictions. To fix this, you can use two techniques:
Oversampling: Creating more instances of the underrepresented class.
Undersampling: Reducing the number of instances in the overrepresented class.
Normalize the Data: Data normalization ensures that all your features have the same scale. This is important because some algorithms are sensitive to the scale of the input features. You can do this using techniques like:
Standard Scaler: It scales your data to have a mean of 0 and a standard deviation of 1.
MinMax Scaler: This scales your data to a specific range, usually between 0 and 1.
Drop Unnecessary Columns: Sometimes, you have columns in your data that aren’t relevant to your prediction task. It’s a good idea to remove them to simplify your model and improve its performance.
Here’s an example:
dataF = TSLA.copy()  # assumption: dataF is the historical data loaded in step 3
dataF = dataF.drop(['timestamp', 'gmtoffset', 'datetime'], axis=1)
tsla_intraday = tsla_intraday.drop(['code', 'timestamp', 'gmtoffset', 'previousClose', 'change', 'change_p'], axis=1)
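The scaling and resampling ideas above can be sketched on a toy table. Everything here is an assumption for illustration: the values are invented, and I use scikit-learn’s StandardScaler, MinMaxScaler, and resample utility as one possible way to do it, not necessarily the setup used for the real data.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler, StandardScaler
from sklearn.utils import resample

# Toy feature table with invented values and an imbalanced label column.
toy = pd.DataFrame({
    "open":   [10.0, 11.0, 12.0, 13.0, 14.0, 15.0],
    "volume": [100.0, 300.0, 200.0, 500.0, 400.0, 600.0],
    "signal": [0, 0, 0, 0, 1, 1],
})
features = ["open", "volume"]

# Standard scaling: each column ends up with mean 0 and standard deviation 1.
standardized = StandardScaler().fit_transform(toy[features])

# Min-max scaling: each column is squeezed into the range [0, 1].
minmaxed = MinMaxScaler().fit_transform(toy[features])

# Oversampling: draw the minority class with replacement until it matches the majority.
counts = toy["signal"].value_counts()
majority = toy[toy["signal"] == counts.idxmax()]
minority = toy[toy["signal"] == counts.idxmin()]
minority_upsampled = resample(
    minority, replace=True, n_samples=len(majority), random_state=42
)
balanced = pd.concat([majority, minority_upsampled])
```

In a real project, fit the scaler on the training split only and reuse it on the test and live data, so no information from unseen rows leaks into training.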
6. Forming a Strategy
Next, you’ll want to classify your training data based on your unique strategy. This helps your model understand how to make predictions. In my case, I used a simple strategy with three classes: “waiting” (0), “buying” (1), and “selling” (2).
Here’s how I did it:
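def get_signal(df):  # function name assumed; the original name was lost
    # Compare the latest candle with the one before it
    # (rename Open/Close if your dataframe uses lowercase column names)
    open_price = df.Open.iloc[-1]
    close_price = df.Close.iloc[-1]
    previous_open = df.Open.iloc[-2]
    previous_close = df.Close.iloc[-2]
    if (open_price > close_price and previous_open < previous_close and close_price < previous_open and open_price >= previous_close):
        return 1 # Buying
    elif (open_price < close_price and previous_open > previous_close and close_price > previous_open and open_price <= previous_close):
        return 2 # Selling
    return 0 # Waiting

signal = [0]  # Initialize with "waiting" for the first row, which has no previous candle
for i in range(1, len(dataF)):
    df = dataF[i - 1:i + 1]  # two-row window: previous candle and current candle
    signal.append(get_signal(df))
dataF["signal"] = signal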
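To check that the rules behave as intended, here is a self-contained sketch that applies the same conditions to four invented candles. The helper name label_candle and the toy prices are mine, chosen so that each class shows up at least once.

```python
import pandas as pd

def label_candle(prev_open, prev_close, cur_open, cur_close):
    """Return 1 (buying), 2 (selling), or 0 (waiting) for two consecutive candles."""
    if (cur_open > cur_close and prev_open < prev_close
            and cur_close < prev_open and cur_open >= prev_close):
        return 1
    if (cur_open < cur_close and prev_open > prev_close
            and cur_close > prev_open and cur_open <= prev_close):
        return 2
    return 0

# Invented prices: row 1 triggers "buying", row 2 "selling", row 3 "waiting".
candles = pd.DataFrame({
    "Open":  [10.0, 12.5, 9.0, 10.2],
    "Close": [12.0, 9.5, 13.0, 10.4],
})

signals = [0]  # the first row has no previous candle, so it starts as "waiting"
for i in range(1, len(candles)):
    prev, cur = candles.iloc[i - 1], candles.iloc[i]
    signals.append(label_candle(prev["Open"], prev["Close"], cur["Open"], cur["Close"]))
candles["signal"] = signals
```

For these four candles the labels come out as 0, 1, 2, 0. On real data, expect “waiting” to dominate, which is exactly the class imbalance discussed in the preprocessing step.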
7. Loading and Training the Models
Now comes the exciting part: selecting and training your model. In this tutorial, I used three different models: XGBoost, Isolation Forest, and Random Forest. Each has its unique strengths.
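XGBoost: a powerful boosting algorithm that’s great at handling structured data like stock prices.
from xgboost import XGBClassifier
model1 = XGBClassifier()
model1.fit(X_train, y_train)  # X_train and y_train are assumed to come from train_test_split
Isolation Forest: an unsupervised model that detects anomalies by isolating points with random binary trees.
from sklearn.ensemble import IsolationForest
random_state = np.random.RandomState(42)
model2 = IsolationForest(random_state=random_state)  # assumed minimal setup; the original settings are not shown
model2.fit(X_train)  # unsupervised, so no labels are passed
Random Forest: a machine-learning algorithm that uses multiple decision trees to make predictions or classifications.
from sklearn.ensemble import RandomForestClassifier
model1 = RandomForestClassifier()  # reuses the name model1, replacing the XGBoost model
model1.fit(X_train, y_train)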
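Here is a minimal end-to-end sketch of splitting, training, and turning a prediction into a suggestion. It runs on synthetic random data, so the numbers mean nothing. The feature matrix, the labels dictionary, and the choice of shuffle=False are my assumptions rather than the exact setup behind this tutorial.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 4))      # synthetic stand-in for price/volume features
y = rng.integers(0, 3, size=200)   # synthetic stand-in for the 0/1/2 signal column

# shuffle=False keeps rows in time order, so the test set is strictly later than training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False
)

model = RandomForestClassifier(n_estimators=50, random_state=42)
model.fit(X_train, y_train)

# Map the predicted class back to a human-readable suggestion.
labels = {0: "waiting", 1: "buying", 2: "selling"}
latest_row = X_test[-1:]           # stands in for one preprocessed live row
suggestion = labels[int(model.predict(latest_row)[0])]
```

Keeping the split chronological matters for price data: a shuffled split lets the model train on candles that come after the ones it is tested on, which tends to inflate the scores.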
8. Prediction on Live Data for Suggestions
With your trained model, you can now make predictions on both test data and live data. Evaluate your predictions using metrics like precision, recall, accuracy, and F1 score. The F1 score is particularly useful when dealing with imbalanced data.
Here’s how you can do it:
# Make predictions for test data
y_pred1 = model1.predict(X_test)
predictions1 = [round(value) for value in y_pred1]
# Evaluate the predictions
from sklearn.metrics import confusion_matrix, recall_score, precision_score, f1_score, accuracy_score
cm = confusion_matrix(y_test, predictions1)
rf_Recall = recall_score(y_test, predictions1, average='macro')
rf_Precision = precision_score(y_test, predictions1, average='macro')
rf_f1 = f1_score(y_test, predictions1, average='macro')
rf_accuracy = accuracy_score(y_test, predictions1)
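To see why macro-averaged scores are worth reporting next to accuracy, here is a tiny sketch on hand-made labels. The eight labels are invented so that the model looks fine on the common class and weak on the rare ones.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)

# Invented labels: class 0 ("waiting") is common, classes 1 and 2 are rare.
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_hat = [0, 0, 0, 0, 1, 0, 2, 0]

cm = confusion_matrix(y_true, y_hat)  # rows are true classes, columns are predictions
accuracy = accuracy_score(y_true, y_hat)
macro_precision = precision_score(y_true, y_hat, average="macro")
macro_recall = recall_score(y_true, y_hat, average="macro")
macro_f1 = f1_score(y_true, y_hat, average="macro")
```

Here accuracy is 0.75, but macro recall is only about 0.67 and macro F1 about 0.71, because half of the rare “buying” and “selling” cases were missed. Macro averaging weights every class equally, so it surfaces that weakness.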
Where to Go from Here
Now that you’ve embarked on this journey, the possibilities are endless. You can fine-tune your model, explore different machine-learning algorithms, and analyze various stocks. With access to live stock data using Live (Delayed) Stock Prices API and the knowledge from this tutorial, you’re well-equipped to make more informed decisions in the ever-changing world of stock trading.
Stay tuned for more insights and tips on improving your stock market strategies with data-driven solutions!