Password Strength Checker with Machine Learning

A password strength checker is an application that estimates how strong a password is. Some popular password strength meters use machine learning algorithms to predict the strength of a password. If you want to learn how to use machine learning to check a password’s strength, this article is for you. I will take you through how to create a password strength checker with machine learning using Python.

How to Create a Password Strength Checker?

A password strength checker works by looking at the combination of digits, letters, and special symbols used in a password. It is created by training a machine learning model on a labelled dataset of passwords, each tagged with its strength. From this data, the model learns which combinations of characters are typical of a strong or a weak password.
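To make "the combination of digits, letters, and special symbols" concrete, here is a minimal sketch that counts each character class in a password. The function name `character_classes` is hypothetical and is not part of the model trained later in this article; it only illustrates the kind of signal a model can pick up on.

```python
import string

def character_classes(password):
    """Count lowercase, uppercase, digit, and symbol characters (illustrative helper)."""
    counts = {"lower": 0, "upper": 0, "digit": 0, "symbol": 0}
    for ch in password:
        if ch in string.ascii_lowercase:
            counts["lower"] += 1
        elif ch in string.ascii_uppercase:
            counts["upper"] += 1
        elif ch in string.digits:
            counts["digit"] += 1
        else:
            # Anything else (punctuation, spaces, non-ASCII) is treated as a symbol here
            counts["symbol"] += 1
    return counts

print(character_classes("kzde5577"))
# → {'lower': 4, 'upper': 0, 'digit': 4, 'symbol': 0}
```

A password that uses only one or two of these classes is usually easier to guess than one that mixes all four, which is roughly the pattern a labelled dataset lets the model learn.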

So to create an application to check the strength of passwords, we need to have a labelled dataset about different combinations of letters and symbols. I found a dataset on Kaggle to train a machine learning model to predict the strength of a password. We can use that data for this task. You can download the dataset from here.

In the section below, I will take you through how to use Machine Learning to create a password strength checker using Python.

Password Strength Checker using Python

Let’s start by importing the necessary Python libraries and the dataset we need for creating a password strength checker:

import pandas as pd
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

# on_bad_lines="skip" replaces error_bad_lines=False, which was removed in pandas 2.0
data = pd.read_csv("data.csv", on_bad_lines="skip")
print(data.head())
      password  strength
0     kzde5577         1
1     kino3434         1
2    visi7k1yr         1
3     megzy123         1
4  lamborghin1         1

The dataset has two columns: password and strength. In the strength column:

  - 0 means the password is weak;
  - 1 means the password is medium;
  - 2 means the password is strong.

Before moving forward, I will drop the rows with missing values and convert the 0, 1, and 2 values in the strength column to Weak, Medium, and Strong:

data = data.dropna()
data["strength"] = data["strength"].map({0: "Weak", 
                                         1: "Medium",
                                         2: "Strong"})
print(data.sample(5))
            password strength
476676       xupet0n     Weak
112569   cdm06690669   Medium
267402  bluerose1291   Medium
237407    2298409uur   Medium
336018       jejien8     Weak
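Before training, it can help to check how the three labels are distributed, since a model trained on heavily imbalanced classes may favour the most common one. The sketch below uses a short hypothetical list of labels instead of the real column; on the actual DataFrame, `data["strength"].value_counts(normalize=True)` should give the same kind of summary.

```python
from collections import Counter

# Hypothetical labels standing in for data["strength"]; not taken from the real dataset
labels = ["Medium", "Medium", "Weak", "Medium", "Strong", "Medium", "Weak", "Medium"]

counts = Counter(labels)
total = len(labels)
# Share of each class as a fraction of all rows
shares = {label: count / total for label, count in counts.items()}

for label, share in sorted(shares.items()):
    print(f"{label}: {counts[label]} rows ({share:.1%})")
```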

Password Strength Prediction Model

Now let’s train a machine learning model to predict the strength of a password. Before preparing the model, we need to tokenize the passwords, because the model has to learn from the individual digits, letters, and symbols in each password. Here’s how we can split every password into characters, convert the characters into TF-IDF features, and split the data into training and test sets:

def word(password):
    # Split a password into a list of single-character tokens
    return list(password)
  
x = np.array(data["password"])
y = np.array(data["strength"])

tdif = TfidfVectorizer(tokenizer=word)
x = tdif.fit_transform(x)
xtrain, xtest, ytrain, ytest = train_test_split(x, y, 
                                                test_size=0.05, 
                                                random_state=42)
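To show what the vectorizer does with these character tokens, here is a rough pure-Python sketch of character-level TF-IDF. It follows the smoothed IDF formula and L2 normalisation that scikit-learn documents as the defaults for `TfidfVectorizer`, but treat it as an approximation for illustration, not the library's actual code. The names `char_tokens` and `tfidf_vectors` are hypothetical.

```python
import math
from collections import Counter

def char_tokens(password):
    # Same idea as word() above: one token per character
    return list(password)

def tfidf_vectors(passwords):
    """Character-level TF-IDF with smoothed IDF and L2 normalisation (illustrative)."""
    docs = [Counter(char_tokens(p)) for p in passwords]
    n_docs = len(docs)
    # Document frequency: in how many passwords each character appears
    df = Counter(ch for doc in docs for ch in doc)
    # Smoothed inverse document frequency: rarer characters get larger weights
    idf = {ch: math.log((1 + n_docs) / (1 + df[ch])) + 1 for ch in df}
    vectors = []
    for doc in docs:
        weights = {ch: tf * idf[ch] for ch, tf in doc.items()}
        # Scale each vector to unit length; "or 1.0" guards against empty passwords
        norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
        vectors.append({ch: w / norm for ch, w in weights.items()})
    return vectors

sample = ["kzde5577", "kino3434", "megzy123"]  # first rows of the dataset preview
vectors = tfidf_vectors(sample)
for password, vector in zip(sample, vectors):
    print(password, {ch: round(w, 3) for ch, w in vector.items()})
```

One detail worth knowing: `TfidfVectorizer` lowercases its input by default (`lowercase=True`), so with the settings used in this article the model sees "A" and "a" as the same token. Passing `lowercase=False` would keep the distinction, although the accuracy reported below was obtained with the defaults.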

Now here’s how to train a classification model to predict the strength of the password:

model = RandomForestClassifier()
model.fit(xtrain, ytrain)
print(model.score(xtest, ytest))
0.956991816498417
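A single accuracy score can hide poor performance on the less common classes, so it is worth looking at each class separately. The sketch below computes per-class recall from true and predicted labels. The labels are hypothetical and are not results from the model above; with the real model you would compare `model.predict(xtest)` against `ytest`, and `sklearn.metrics.classification_report` gives a similar breakdown.

```python
from collections import Counter

def per_class_recall(y_true, y_pred):
    """Fraction of each true class that was predicted correctly."""
    totals = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / totals[label] for label in totals}

# Hypothetical labels for illustration only; not output of the trained model
y_true = ["Weak", "Weak", "Medium", "Medium", "Medium", "Medium", "Strong", "Strong"]
y_pred = ["Medium", "Weak", "Medium", "Medium", "Medium", "Medium", "Strong", "Medium"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy)                          # → 0.75
print(per_class_recall(y_true, y_pred))  # → {'Weak': 0.5, 'Medium': 1.0, 'Strong': 0.5}
```

In this toy example the overall accuracy looks reasonable, yet half of the weak and strong passwords are misclassified as medium, which is exactly the kind of problem an overall score does not reveal.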

Now here’s how we can check the strength of a password using the trained model:

import getpass
user = getpass.getpass("Enter Password: ")
# Use a new name so the original DataFrame stored in `data` is not overwritten
features = tdif.transform([user]).toarray()
output = model.predict(features)
print(output)
Enter Password: ··········
['Strong']

Summary

So this is how you can use machine learning to create a password strength checker using the Python programming language. A password strength checker works by looking at the combination of digits, letters, and special symbols used in a password. I hope you liked this article on creating a password strength checker with machine learning using Python. Feel free to ask questions in the comments section below.

Aman Kharwal