Pandas Datareader is a Python package that lets us create a pandas DataFrame from various data sources on the internet. It is popularly used for working with stock price datasets. In this article, I will take you through a tutorial on Pandas Datareader using Python.
What is Pandas Datareader in Python?
Pandas Datareader is a Python package that allows us to create a pandas DataFrame by using some popular data sources available on the internet including:
- Yahoo Finance
- Google Finance
- World Bank
- OECD and many more.
All of the data sources mentioned above provide data in a different format, so collecting data from each source follows a different method. In the section below, I will take you through a tutorial on pandas datareader to collect stock price data from Yahoo Finance.
Working with Pandas Datareader using Python
Now that you know what pandas_datareader is, let's see how to use this package to read stock price data from Yahoo Finance using Python. If you have never used it before, you can install it with the pip command: pip install pandas_datareader. Now let's import the Python libraries that we need for this task:
import pandas as pd
import pandas_datareader.data as web
import matplotlib.pyplot as plt
I will set a start date and an end date that can be easily customized in the same format as in the code below:
start_date = "2020-01-01"
end_date = "2020-12-31"
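The dates here are plain strings, so a typo or a reversed range is easy to miss. As a minimal sketch, assuming you want to check the range before requesting any data, the helper below parses both strings with pd.Timestamp. The name parse_date_range is hypothetical and is not part of pandas_datareader.

```python
import pandas as pd


def parse_date_range(start, end):
    """Parse two date strings and check that start is not after end.

    Hypothetical helper for illustration; not part of pandas_datareader.
    """
    start_ts = pd.Timestamp(start)  # raises on an unparseable string
    end_ts = pd.Timestamp(end)
    if start_ts > end_ts:
        raise ValueError(f"start {start_ts.date()} is after end {end_ts.date()}")
    return start_ts, end_ts


start_ts, end_ts = parse_date_range("2020-01-01", "2020-12-31")
print(start_ts.date(), end_ts.date())
```

The returned Timestamp objects can be used anywhere a date string is accepted in pandas.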
Now let’s use the DataReader function to store the stock price data of Tesla in a DataFrame:
data = web.DataReader(name="TSLA", data_source='yahoo', start=start_date, end=end_date)
print(data)
                  High         Low        Open       Close      Volume   Adj Close
Date
2019-12-31   84.258003   80.416000   81.000000   83.666000  51428500.0   83.666000
2020-01-02   86.139999   84.342003   84.900002   86.052002  47660500.0   86.052002
2020-01-03   90.800003   87.384003   88.099998   88.601997  88892500.0   88.601997
2020-01-06   90.311996   88.000000   88.094002   90.307999  50665000.0   90.307999
2020-01-07   94.325996   90.671997   92.279999   93.811996  89410500.0   93.811996
...                ...         ...         ...         ...         ...         ...
2020-12-24  666.090027  641.000000  642.989990  661.770020  22865600.0  661.770020
2020-12-28  681.400024  660.799988  674.510010  663.690002  32278600.0  663.690002
2020-12-29  669.900024  655.000000  661.000000  665.989990  22910800.0  665.989990
2020-12-30  696.599976  668.359985  672.000000  694.780029  42846000.0  694.780029
2020-12-31  718.719971  691.119995  699.989990  705.669983  49649900.0  705.669983

[254 rows x 6 columns]
The DataFrame returned by DataReader looks just like one read from a CSV file. Now let’s visualize this data by using the matplotlib library in Python:
close = data['Close']
ax = close.plot(title='Tesla')
ax.set_xlabel('Date')
ax.set_ylabel('Close')
ax.grid()
plt.show()
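The Close column is an ordinary pandas Series indexed by date, so it can be summarized as well as plotted. The sketch below rebuilds the first five closing prices from the printed output shown earlier (so it runs without a download) and computes the daily percentage change. The variable names sample_close and daily_return are only illustrative.

```python
import pandas as pd

# First five closing prices copied from the printed DataFrame shown earlier
sample_close = pd.Series(
    [83.666000, 86.052002, 88.601997, 90.307999, 93.811996],
    index=pd.to_datetime(
        ["2019-12-31", "2020-01-02", "2020-01-03", "2020-01-06", "2020-01-07"]
    ),
    name="Close",
)

# Fractional change from the previous row; the first row has no predecessor
daily_return = sample_close.pct_change()

print(sample_close.idxmax().date(), sample_close.max())
print(daily_return.round(4))
```

With the full DataFrame, the same calls would apply to data['Close'] directly.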
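One practical caveat: the DataReader call used earlier depends on a remote service, so it can fail when the source changes or is unavailable. The following is a minimal defensive sketch, not a fix for any particular error. safe_read is a hypothetical helper (not part of pandas_datareader) that accepts any reader callable, and failing_reader is a stand-in that simulates a broken source so the example runs offline.

```python
def safe_read(reader, **kwargs):
    """Call a reader function and return (data, error) instead of raising.

    Hypothetical helper for illustration; `reader` is any callable, for
    example web.DataReader from pandas_datareader.data.
    """
    try:
        return reader(**kwargs), None
    except Exception as exc:  # broad on purpose: failures vary by source
        return None, exc


def failing_reader(**kwargs):
    # Stand-in that simulates a broken remote source
    raise TypeError("string indices must be integers")


result, error = safe_read(failing_reader, name="TSLA", data_source="yahoo")
if error is not None:
    print(f"Download failed: {type(error).__name__}: {error}")
```

In real use, the call would look like safe_read(web.DataReader, name="TSLA", data_source='yahoo', start=start_date, end=end_date), and the rest of the script would run only when error is None.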
So this is how easy it is to read stock price data and store it in a pandas DataFrame. In this article, I collected the stock price data of Tesla from Yahoo Finance. I hope you liked this tutorial on pandas_datareader using Python. Feel free to ask your valuable questions in the comments section below.
Hi Aman, thank you for this detailed introduction. It is very easy to follow.
When I print your code in my console I get an error saying “TypeError: string indices must be integers”. It seems to work for you so I wondered what I am doing wrong?
Best Regards and Thank you!
Currently there are some problems in the pandas datareader module. You can follow the link mentioned below to collect stock price data: https://bit.ly/stockpricedata