If you are learning data science and find it hard to build a practice dataset from scratch, you can either download one from Kaggle or generate fake data. In this article, I will take you through how to create a dummy dataset using Python in just a few lines of code.
Create Dummy Data using Python
To create dummy data using Python, we can use the faker library, which generates random fake data such as names, addresses, and text. If you have never used this library before, you can install it by running the pip command below in your command prompt or terminal:
pip install faker
Now let’s look at some examples of this library before creating a dummy dataset. The code below will return a fake name, address, and text randomly:
from faker import Faker
fake = Faker()
print(fake.name())
print(fake.address())
print(fake.text())
Sean Obrien
2606 Mackenzie Tunnel Apt. 215
East Ericfurt, CO 88091
Building job station sometimes what language money. Able air really it study suffer health. Body why approach difference case notice choose.
Every time you run this code, you will get different results. Now let’s see how to use fake data to build a dummy dataset using Python.
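The profile() method of a Faker object returns a dictionary describing a fake person, with 13 fields such as job and birthdate. Calling it repeatedly and passing the results to pandas gives a dummy dataset with 13 columns. So below is how you can create a dummy dataset using Python: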
from faker import Faker
import pandas as pd
fake = Faker()
data = [fake.profile() for _ in range(50)]
data = pd.DataFrame(data)
print(data.head())
                                     job  ...   birthdate
0  Engineer, control and instrumentation  ...  1949-06-13
1                     Editor, film/video  ...  1959-07-23
2                           Chiropractor  ...  1927-12-12
3                           Nurse, adult  ...  1996-11-02
4                      Personnel officer  ...  1953-08-19

[5 rows x 13 columns]
You can learn more about creating fake data from the faker library's documentation.
Summary
So this is how you can create a fake or dummy dataset using the Python programming language. If you want to work with real datasets, I recommend visiting Kaggle. I hope you liked this article on creating dummy data using Python. Feel free to ask your questions in the comments section below.