Stock Price Prediction using Machine Learning

Predicting the stock market is one of the most important applications of Machine Learning in finance. In this article, I will take you through a simple Data Science project on Stock Price Prediction using Machine Learning with Python.

By the end of this article, you will know how to predict stock prices with a Linear Regression model implemented in Python.

Stock Price Prediction

Predicting the stock market has been the bane and goal of investors since its inception. Every day billions of dollars are traded on the stock exchange, and behind every dollar is an investor hoping to make a profit in one way or another.

Entire companies rise and fall daily depending on market behaviour. For an investor, the ability to accurately predict market movements offers a tantalizing promise of wealth and influence.

Today, many people make money by trading in the stock market from home. Combining your experience of the stock market with your machine learning skills gives you an advantage in the task of stock price prediction.

Let’s see how to predict stock prices using Machine Learning and the Python programming language. I will start by importing the Python libraries that we need for this task:
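The code later in the article calls `pd.read_csv` and `LinearRegression`, so it needs at least pandas and scikit-learn. A minimal set of imports might look like this. NumPy and `train_test_split` are assumptions here, included because a data preparation helper of this kind typically relies on them:

```python
import numpy as np                                    # array handling (assumed)
import pandas as pd                                   # used by pd.read_csv
from sklearn.linear_model import LinearRegression     # the model fitted later
from sklearn.model_selection import train_test_split  # train/test split (assumed)
```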

Data Preparation

In the section above, I started the task of stock price prediction by importing the Python libraries. Now I will write a function that prepares the dataset so that we can easily fit it to the Linear Regression model:

Each line of the function above is commented to explain what it does. The next step is to read the data:
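Here is a minimal sketch of such a function. The signature `prepare_data(df, forecast_col, forecast_out, test_size)` and the five return values match how the function is called later in the article, but the body is an assumption rather than a definitive implementation: it uses the chosen column as the only feature, takes the price `forecast_out` rows later as the label, and splits the rows at random.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split


def prepare_data(df, forecast_col, forecast_out, test_size):
    # Keep only the target column and drop missing prices first, so the
    # feature and label arrays cannot end up with different lengths.
    prices = df[[forecast_col]].dropna()

    # Label for each row: the price forecast_out rows later.
    # The last forecast_out rows have no future price, so their label is NaN.
    label = prices[forecast_col].shift(-forecast_out)

    X = prices.to_numpy(dtype=float)   # feature matrix, shape (n_rows, 1)
    X_lately = X[-forecast_out:]       # most recent rows: no label, used for the forecast
    X = X[:-forecast_out]              # rows that do have a label
    y = label.dropna().to_numpy()      # labels aligned row by row with X

    # Random train/test split; random_state fixes the shuffle for repeatability.
    X_train, X_test, Y_train, Y_test = train_test_split(
        X, y, test_size=test_size, random_state=0
    )
    return X_train, X_test, Y_train, Y_test, X_lately


# Usage on a tiny made-up frame (not real prices).
demo = pd.DataFrame({"close": np.arange(100, dtype=float)})
X_train, X_test, Y_train, Y_test, X_lately = prepare_data(demo, "close", 5, 0.2)
print(X_train.shape, X_test.shape, X_lately.shape)
```

Dropping missing values before shifting keeps the features and labels the same length. If they differ, scikit-learn raises a `ValueError` about inconsistent numbers of samples.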

df = pd.read_csv("prices.csv")  # load the price history
df = df[df["symbol"] == "GOOG"]  # keep only the rows for the GOOG ticker
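The filter on `symbol` only works when the CSV holds several tickers in one file and has a column with that name. Here is a small sketch that applies the filter only when the column exists. The helper `load_prices`, the column names and the in-memory CSV text are all made up for illustration:

```python
import io

import pandas as pd

# Made-up CSV text standing in for prices.csv; the values are not real prices.
csv_text = """date,symbol,close
2016-01-04,GOOG,100.0
2016-01-04,AAPL,200.0
2016-01-05,GOOG,101.0
2016-01-05,AAPL,201.0
"""


def load_prices(source, symbol=None):
    """Read a price CSV and, if it has a 'symbol' column, keep one ticker."""
    df = pd.read_csv(source)
    if symbol is not None and "symbol" in df.columns:
        df = df[df["symbol"] == symbol]
    return df.reset_index(drop=True)


goog = load_prices(io.StringIO(csv_text), symbol="GOOG")
print(goog)

# A single-ticker file without a 'symbol' column is returned unfiltered.
single = load_prices(io.StringIO("Date,Close\n2021-01-01,73.0\n"), symbol="GOOG")
print(single)
```

With a file that contains only one ticker, the filtering line can simply be left out.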

Now we need to declare the three input variables that the function created in the section above expects. The first one names the column that we want to predict. The second one sets how far ahead we want to predict.

The last one sets the size of the test set as a fraction of the data. Now let’s declare all the variables:

forecast_col = 'close'  # column to predict
forecast_out = 5  # predict 5 rows ahead
test_size = 0.2  # hold out 20% of the rows for testing
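To see what `forecast_out` means in practice, here is a small sketch. It assumes the preparation function builds its target by shifting the chosen column upwards with pandas `shift`, and it uses a made-up series with `forecast_out = 2` so the effect is easy to read:

```python
import pandas as pd

# Made-up closing prices, one per row.
close = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0], name="close")
forecast_out = 2  # small value so the effect is easy to see

# Each row's label is the price forecast_out rows later.
label = close.shift(-forecast_out)
frame = pd.DataFrame({"close": close, "label": label})
print(frame)

# The last forecast_out rows have no future price, so they cannot be used for training.
n_labelled = int(label.notna().sum())
print(n_labelled)
```

The rows without a label are the ones the model forecasts for, which is why the forecast contains `forecast_out` values.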

Applying Machine Learning for Stock Price Prediction

Now I will split the data and fit the linear regression model:

# call the function that prepares the data and splits it into training and test sets
X_train, X_test, Y_train, Y_test, X_lately = prepare_data(df, forecast_col, forecast_out, test_size)
learner = LinearRegression()  # initialize the linear regression model

learner.fit(X_train, Y_train)  # train the linear regression model

Now let’s predict the output and have a look at the predicted stock prices:
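Here is a minimal, self-contained sketch of the scoring and forecasting step. The random-walk series and the inline data preparation are stand-ins I have assumed for illustration. Only the `test_score` and `forecast_set` keys are taken from the article's output:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a closing-price series (not real market data).
rng = np.random.default_rng(0)
close = 100 + np.cumsum(rng.normal(0, 1, size=300))

forecast_out = 5
X = close[:-forecast_out].reshape(-1, 1)         # price on a given row
y = close[forecast_out:]                         # price forecast_out rows later
X_lately = close[-forecast_out:].reshape(-1, 1)  # most recent rows, no label yet

X_train, X_test, Y_train, Y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

learner = LinearRegression()
learner.fit(X_train, Y_train)

response = {
    "test_score": learner.score(X_test, Y_test),  # R^2 on the held-out rows
    "forecast_set": learner.predict(X_lately),    # one prediction per recent row
}
print(response)
```

The `test_score` is the R² of the model on the test set, and `forecast_set` holds one predicted price for each of the last `forecast_out` rows. The synthetic series here is only a stand-in, so its score and forecast will not match the figures reported for GOOG.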

{'test_score': 0.9481024935723803, 'forecast_set': array([786.54352516, 788.13020371, 781.84159626, 779.65508615, 769.04187979])}

So this is how we can predict stock prices with Machine Learning. I hope you liked this article on Stock Price Prediction using Python and the Linear Regression model. Feel free to ask your questions in the comments section below.

Aman Kharwal

Data Strategist at Statso. My aim is to decode data science for the real world in the most simple words.

27 Comments

  1. Hi Aman, I got the output, but with a very different test score:

    {'test_score': 0.639145178346672, 'forecast_set': array([73.37040254, 73.12634778, 73.16456803, 73.20017668, 73.1776807 ])}

    Also, can you tell me why we use the line below? It gives me an error:

    df = df[df.symbol == "GOOG"]

  2. #calling the method where the cross validation and data preparation is in
    X_train, X_test, Y_train, Y_test, X_lately = prepare_data(df, forecast_col, forecast_out, test_size)
    learner = LinearRegression() #initializing linear regression model

    learner.fit(X_train, Y_train) #training the linear regression model

    ValueError: Found input variables with inconsistent numbers of samples: [246, 244]

    I get this error when I run the code above. Could you please help me solve it?

  3. These are my results:
    {'test_score': 0.9132328868016113, 'forecast_set': array([14733.28834587, 14678.01387179, 14455.85132032, 14320.59050617, 14044.5766888 ])}

    I think the test score is okay, but I don't understand the forecast set.

    • For any task where most of your work is analysis and visualization, you can use Jupyter Notebook or Google Colab. For other tasks, such as GUI work and logical problem solving, you can use VS Code or any other IDE.

  4. Hey, please tell me why we use the line below. It gives me an error:

    df = df[df.symbol == "GOOG"]

  5. THE LINK THAT YOU SHARED https://query1.finance.yahoo.com/v7/finance/download/INR=X?period1=1580035828&period2=1611658228&interval=1d&events=history&includeAdjustedClose=true DOES NOT CONTAIN ATTRIBUTE 'SYMBOL'
    AttributeError: 'DataFrame' object has no attribute 'symbol'
    and without df = df[df.symbol == "GOOG"] it gives the following result:
    {'test_score': 0.6391451783466715, 'forecast_set': array([73.37040254, 73.12634778, 73.16456803, 73.20017668, 73.1776807 ])}
