# Stock Price Prediction using Machine Learning

Predicting the stock market is one of the most important applications of Machine Learning in finance. In this article, I will take you through a simple Data Science project on Stock Price Prediction using Machine Learning with Python.

By the end of this article, you will know how to predict stock prices with a Linear Regression model in Python.

## Stock Price Prediction

Predicting the stock market has been the bane and goal of investors since its inception. Every day billions of dollars are traded on the stock exchange, and behind every dollar is an investor hoping to make a profit in one way or another.

Entire companies rise and fall daily depending on market behaviour. The ability to accurately predict market movements offers an investor a tantalizing promise of wealth and influence.

Today, many people make money trading in the stock market from home. It is an advantage if you can combine your experience of the stock market with your machine learning skills for the task of stock price prediction.

Let’s see how to predict stock prices using Machine Learning and the Python programming language. I will start by importing the Python libraries that we need for this task.
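The import listing itself does not appear here. A plausible set, assuming the rest of the walkthrough relies on pandas for loading the data, NumPy for arrays, and scikit-learn for scaling, splitting and the model, might be:

```python
import numpy as np                                     # array handling
import pandas as pd                                    # loading and filtering the CSV
from sklearn import preprocessing                      # feature scaling (assumed)
from sklearn.model_selection import train_test_split   # train/test split (assumed)
from sklearn.linear_model import LinearRegression      # the model fitted later on
```

Only `pd` and `LinearRegression` are named explicitly in the code that follows; the other imports are assumptions about what the data preparation step needs.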

## Data Preparation

With the libraries imported, I will now write a function, `prepare_data`, that prepares the dataset so that it can be fitted easily with the Linear Regression model. It takes the DataFrame, the column to predict, how far ahead to predict and the size of the test set, and returns the training and test sets along with an extra array, `X_lately`.

The function should be easy to follow, as every line is explained step by step. The next step is to read the data:
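The body of the function is not reproduced here. The following is a minimal sketch of what a function with this signature could look like, not necessarily the original implementation. It assumes the target is the chosen column shifted `forecast_out` rows into the future, that the only feature is the standardized column itself, and that `X_lately` holds the most recent rows, which have no known future value:

```python
import numpy as np
import pandas as pd
from sklearn import preprocessing
from sklearn.model_selection import train_test_split


def prepare_data(df, forecast_col, forecast_out, test_size):
    # Target: the value of forecast_col `forecast_out` rows ahead.
    # The last `forecast_out` rows have no future value and become NaN.
    label = df[forecast_col].shift(-forecast_out)

    # Feature matrix: the forecast column itself, standardized.
    X = preprocessing.scale(np.array(df[[forecast_col]], dtype=float))

    # Most recent rows, held back to forecast from later.
    X_lately = X[-forecast_out:]
    # Remaining rows, which do have a known future value.
    X = X[:-forecast_out]

    # Drop the NaN targets so that X and y line up.
    y = np.array(label.dropna())

    # Random split into training and test sets (fixed seed for repeatability).
    X_train, X_test, Y_train, Y_test = train_test_split(
        X, y, test_size=test_size, random_state=0
    )
    return X_train, X_test, Y_train, Y_test, X_lately
```

Note that in this sketch, missing values inside the price column would make `y` shorter than `X`, so the data should be free of nulls before it is passed in.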

```python
df = pd.read_csv("prices.csv")
df = df[df.symbol == "GOOG"]
```
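The filter above assumes that `prices.csv` holds rows for several companies and has a `symbol` column. A file downloaded for a single company typically has no such column, in which case the filter raises an error and can simply be skipped. A small sketch with a made-up table (the values are illustrative only):

```python
import pandas as pd

# Hypothetical multi-company price table.
df = pd.DataFrame({
    "symbol": ["GOOG", "AAPL", "GOOG"],
    "close": [700.0, 100.0, 705.0],
})

# Keep only Google's rows, but only if the column is actually present.
if "symbol" in df.columns:
    df = df[df["symbol"] == "GOOG"]

print(df)
```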

Now we need to set the three inputs that the function from the section above expects. The first is the column we want to predict. The second is how far ahead we want to predict.

The last one is the size of the test set. Now let’s declare all three variables:

```python
forecast_col = 'close'
forecast_out = 5
test_size = 0.2
```
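To make the role of `forecast_out` concrete, here is a small sketch that assumes the target is built by shifting the price column upwards by `forecast_out` rows (an assumption about how the data preparation works). The series and the horizon of 2 are made up for illustration:

```python
import pandas as pd

close = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0], name="close")
forecast_out = 2  # predict two rows ahead in this toy example

# Each row's label is the close price `forecast_out` rows later.
label = close.shift(-forecast_out)

print(pd.DataFrame({"close": close, "label": label}))
```

The last `forecast_out` rows end up with no label, because their future price is not yet known. Those are the rows a forecast is made for.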

## Applying Machine Learning for Stock Price Prediction

Now I will split the data and fit the linear regression model:

```python
# Call the function that prepares the data and performs the train/test split
X_train, X_test, Y_train, Y_test, X_lately = prepare_data(df, forecast_col, forecast_out, test_size)
learner = LinearRegression()  # initialize the linear regression model

learner.fit(X_train, Y_train)  # train the linear regression model
```

Now let’s predict the output and have a look at the forecast stock prices:

{'test_score': 0.9481024935723803, 'forecast_set': array([786.54352516, 788.13020371, 781.84159626, 779.65508615, 769.04187979])}
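A dictionary like the one above can be produced along the following lines. This is a minimal sketch that uses synthetic stand-in arrays instead of the real prices, so its numbers will differ from the output shown. The keys `test_score` and `forecast_set` are taken from that output, and everything else is an assumption:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-ins for the arrays returned by prepare_data
# (assumption: one feature column and a numeric target).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 1))
y = 3.0 * X[:, 0] + 50.0 + rng.normal(scale=0.1, size=100)
X_train, X_test, Y_train, Y_test = X[:80], X[80:], y[:80], y[80:]
X_lately = X[-5:]

learner = LinearRegression().fit(X_train, Y_train)

score = learner.score(X_test, Y_test)   # R^2 on the held-out test set
forecast = learner.predict(X_lately)    # one prediction per row of X_lately

response = {"test_score": score, "forecast_set": forecast}
print(response)
```

Here `test_score` is the coefficient of determination (R²) on the test set, and `forecast_set` holds one predicted price for each row of `X_lately`, which is why the output above contains five values for `forecast_out = 5`.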

So this is how we can predict stock prices with Machine Learning. I hope you liked this article on Stock Price Prediction with Python using the Linear Regression model. Feel free to ask your questions in the comments section below.

##### Aman Kharwal

I'm a writer and data scientist on a mission to educate others about the incredible power of data📈.


1. #### azizulhtuhin

Can you tell me where the prices.csv file is?

2. #### Vicky

What should we search for on Yahoo Finance to get the prices.csv dataset?

• #### Aman Kharwal

Go to Yahoo Finance and search for the company, then click on Historical Data and then click on Download.

3. #### abdulwassey

Hi Aman, I have got the output but with a very different test score.

{'test_score': 0.639145178346672, 'forecast_set': array([73.37040254, 73.12634778, 73.16456803, 73.20017668, 73.1776807 ])}

Also, can you tell me why we need the line below? It is giving me an error:

df = df[df.symbol == "GOOG"]

• #### Aman Kharwal

maybe you are using a new dataset

4. #### RAMANDEEP SINGH BEDI

can you please tell me what is your input and output data column

• #### Aman Kharwal

Close column is the input variable, which indicates close prices

5. #### vbasheer

#calling the method where the cross validation and data preparation is in
X_train, X_test, Y_train, Y_test , X_lately =prepare_data(df,forecast_col,forecast_out,test_size)
learner = LinearRegression() #initializing linear regression model

learner.fit(X_train,Y_train) #training the linear regression model

ValueError: Found input variables with inconsistent numbers of samples: [246, 244]

I am getting this error when I run the above code. Could you please solve it for me?

• #### Aman Kharwal

Check the dataset you are working with

Hey, thanks, it worked. There were some null values, and it worked after deleting them.

• #### Rajeev

@vbasheer how did you resolve the error? Could you please let me know?
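For anyone hitting the same `ValueError`: one way such a mismatch can arise is missing values in the price column, as reported above. A minimal sketch of dropping them before preparing the data, using a synthetic DataFrame as a stand-in for the real file:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a price table with two missing closing prices.
df = pd.DataFrame({"close": [10.0, np.nan, 12.0, 13.0, np.nan, 15.0]})

# Count the missing closing prices.
missing = int(df["close"].isna().sum())

# Drop rows with a missing closing price and renumber the index.
clean = df.dropna(subset=["close"]).reset_index(drop=True)

print(missing, len(df), len(clean))
```

The cleaned DataFrame can then be passed on to the data preparation step in place of the original one.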

6. #### vbasheer

these are my results
{'test_score': 0.9132328868016113, 'forecast_set': array([14733.28834587, 14678.01387179, 14455.85132032, 14320.59050617, 14044.5766888 ])}

I think the test score is okay, but I don't understand the forecast set.

Great

7. #### Deepti

Hi, I am a beginner and want to know which tool to use: Spyder, Jupyter or PyCharm?

• #### Aman Kharwal

For any task where most of your work is related to analysis and visualization, you can use Jupyter Notebook or Google Colab. For other tasks, like GUI work and logical problem solving, you can use VS Code or any other IDE.

Hey, please tell me why we need the line below, as it is giving me an error:

df = df[df.symbol == "GOOG"]

9. #### surbhi

What is the meaning of this line?
df = df[df.symbol == "GOOG"]

• #### Aman Kharwal

GOOG is the ticker symbol of Google's stock. The line keeps only the rows of the dataset that belong to that symbol.