In this short article, I’ll show you how to use Python to extract text from images. This technique has many applications, for example:
- Data mining for machine learning tasks
- Taking photos of receipts and reading their content for processing
How to Extract Text From Images using Python?
To extract text from images with Python, I’ll be using a library called Python Tesseract (pytesseract). It is a Python wrapper for Tesseract, an open-source optical character recognition (OCR) engine. In simple words, this package lets us recognize and “read” the text embedded in images.
Setting Up Tesseract
Setting up a Python library is usually a one-step process. With PyTesseract, however, we’ll need to do two things:
- Install the Python library
- Install the Tesseract app
You can install the library with pip: pip install pytesseract. Then download the Tesseract executable from here and install it. Remember where you install it, because we need to give the path to this executable in our Python code.
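If you would rather not hard-code the install path, the lookup can be sketched roughly as follows. This is only a minimal sketch: the helper names find_tesseract and configure_pytesseract are my own, not part of pytesseract, and the candidate paths you pass in are assumptions about where Tesseract might live on your machine.

```python
import shutil
from pathlib import Path


def find_tesseract(candidates=()):
    """Return a path to the Tesseract executable, or None if it is not found.

    `candidates` is a hypothetical list of extra install locations to try.
    """
    # First check whether "tesseract" is already on the system PATH.
    found = shutil.which("tesseract")
    if found:
        return found
    # Otherwise fall back to the caller-supplied locations.
    for candidate in candidates:
        if Path(candidate).is_file():
            return str(candidate)
    return None


def configure_pytesseract(candidates=()):
    """Point pytesseract at the executable; raise if it cannot be located."""
    # Imported lazily so find_tesseract works even without pytesseract installed.
    import pytesseract

    path = find_tesseract(candidates)
    if path is None:
        raise FileNotFoundError("Tesseract executable not found")
    pytesseract.pytesseract.tesseract_cmd = path
    return path
```

Calling configure_pytesseract() once at the start of a script, passing the folder you noted during installation as a candidate, should then be enough before running any OCR calls.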
Now, let’s head over to the Python code to extract text from images. I am using the image below:
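import pytesseract
# Replace this with the full path to the Tesseract executable you installed
pytesseract.pytesseract.tesseract_cmd = "Path to tesseract executable"
# image_to_string takes the image file and returns the recognized text
print(pytesseract.image_to_string('image.png'))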
Output: Python is Amazing
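Raw OCR output often comes with stray whitespace or blank lines around the text, so for further processing it can help to tidy it up first. Here is a minimal sketch of that idea; the helper names clean_ocr_text and read_image_text are hypothetical, and lang="eng" assumes the English language data is installed with Tesseract.

```python
def clean_ocr_text(raw):
    """Strip surrounding whitespace from each line and drop blank lines."""
    lines = (line.strip() for line in raw.splitlines())
    return "\n".join(line for line in lines if line)


def read_image_text(image_path, lang="eng"):
    """Run OCR on an image file and return the tidied text."""
    # Imported lazily so clean_ocr_text can be used without pytesseract.
    import pytesseract

    raw = pytesseract.image_to_string(image_path, lang=lang)
    return clean_ocr_text(raw)
```

With the setup above in place, print(read_image_text('image.png')) would be used in the same way as the direct image_to_string call.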
I hope you liked this article on extracting text from images with Python. Feel free to ask your valuable questions in the comments section below. You can also follow me on Medium to learn every topic of Machine Learning and Python.