0% found this document useful (0 votes)

160 views

Project-Password Strength Classifier

This document describes a project to classify password strengths using natural language processing and machine learning techniques. It loads a dataset of passwords and strength levels, cleans the data, and splits it into training and test sets. It uses TF-IDF to vectorize the password strings and trains a logistic regression classifier on the training set. The classifier achieves over 80% accuracy on the test set and is able to predict the strength of new passwords as weak, average, or strong.

Uploaded by

Olalekan Samuel

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

160 views

Project-Password Strength Classifier

Uploaded by

Olalekan Samuel

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 6

By Nitish Adhikari

Email id :nitishbuzzpro@gmail.com (mailto:nitishbuzzpro@gmail.com), +91-9650740295

Linkedin : https://www.linkedin.com/in/nitish-adhikari-6b2350248 (https://www.linkedin.com/in/nitish-adhikari-6b2350248)

A Project on Natural Language Processing - PASSWORD STRENGTH CLASSIFIER

In [1]:

import pandas as pd
import numpy as np
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')

data = pd.read_csv('data.csv',error_bad_lines=False)

In [3]:

data.head(5)
Out[3]:

password strength

0 kzde5577 1

1 kino3434 1

2 visi7k1yr 1

3 megzy123 1

4 lamborghin1 1

In [4]:

data['strength'].unique()
Out[4]:

array([1, 2, 0], dtype=int64)

In [5]:

data.isnull() #check null values

Out[5]:

password strength

0 False False

1 False False

2 False False

3 False False

4 False False

... ... ...

669635 False False

669636 False False

669637 False False

669638 False False

669639 False False

669640 rows × 2 columns

In [6]:

data.isnull().sum()
Out[6]:

password 1
strength 0
dtype: int64

In [7]:

data.dropna(inplace = True) #remove null values

In [8]:

data.isnull().sum()
Out[8]:

password 0
strength 0
dtype: int64

In [9]:

data[data['strength']==0].count()
Out[9]:

password 89701
strength 89701
dtype: int64

In [10]:

data[data['strength']==1].count()
Out[10]:

password 496801
strength 496801
dtype: int64

In [11]:

data[data['strength']==2].count()
Out[11]:

password 83137
strength 83137
dtype: int64

In [12]:

password_tuple=np.array(data) #creating array

password tuple
Out[12]:

array([['kzde5577', 1],
['kino3434', 1],
['visi7k1yr', 1],
...,
['184520socram', 1],
['marken22a', 1],
['fxx4pw4g', 1]], dtype=object)

In [13]:

password tuple.shape #shape of the array

Out[13]:

(669639, 2)

In [14]:

import random
random.shuffle(password_tuple) #shuffle the array

In [15]:

password tuple #shuffled array

Out[15]:

array([['kzde5577', 1],
['kino3434', 1],
['kzde5577', 1],
...,
['kobeji659', 1],
['kt5tu2o0', 1],
['killi48', 0]], dtype=object)

In [16]:

X = [labels[0] for labels in password_tuple] #list of independent variable

y = [labels[1] for labels in password_tuple] #list of dependent variable

In [18]:

len(X)
Out[18]:

669639
In [82]:

len(y)
Out[82]:

669639

In [21]:

def word_divide_char(inputs): #function to split the string to list

character=[]
for i in inputs:
character.append(i)
return character

In [22]:

word_divide_char('kzde5577') #check the fuction's working

Out[22]:

['k', 'z', 'd', 'e', '5', '5', '7', '7']

In [23]:

from sklearn.feature extraction.text import TfidfVectorizer

In [24]:

vectorizer=TfidfVectorizer(tokenizer=word_divide_char)

In [26]:

X = vectorizer.fit_transform(X)

In [27]:

X.shape #shape of sparse matrix

Out[27]:

(669639, 132)

In [28]:

print(X) #sparse matrix

(0, 34) 0.5917520524694371
(0, 32) 0.5665331455581984
(0, 53) 0.2214639539695442
(0, 52) 0.2855291890678396
(0, 74) 0.33602096776990453
(0, 59) 0.2922095342105659
(1, 31) 0.6175654131802808
(1, 30) 0.5601711835927342
(1, 63) 0.2565023277367334
(1, 62) 0.26785873390846976
(1, 57) 0.2521638567898762
(1, 59) 0.3220137409789036
(2, 34) 0.5917520524694371
(2, 32) 0.5665331455581984
(2, 53) 0.2214639539695442
(2, 52) 0.2855291890678396
(2, 74) 0.33602096776990453
(2, 59) 0.2922095342105659
(3, 34) 0.5917520524694371
(3 32) 0 5665331455581984
In [29]:

vectorizer.get_feature_names()
Out[29]:

['\x02',
'\x05',
'\x06',
'\x08',
'\x0f',
'\x10',
'\x11',
'\x16',
'\x17',
'\x19',
'\x1b',
'\x1c',
'\x1e',
' ',
'!',
'"',
'#',
'$',
In [30]:

X.shape
Out[30]:

(669639, 132)

In [31]:

first_document_vector= X[0]
first document vector
Out[31]:

<1x132 sparse matrix of type '<class 'numpy.float64'>'

with 6 stored elements in Compressed Sparse Row format>

In [32]:

print(first_document_vector) #Sparse matrix of first_document_vector

(0, 34) 0.5917520524694371
(0, 32) 0.5665331455581984
(0, 53) 0.2214639539695442
(0, 52) 0.2855291890678396
(0, 74) 0.33602096776990453
(0, 59) 0.2922095342105659

In [33]:

print(first document vector.T) #Transpose of first document vector

(34, 0) 0.5917520524694371
(32, 0) 0.5665331455581984
(53, 0) 0.2214639539695442
(52, 0) 0.2855291890678396
(74, 0) 0.33602096776990453
(59, 0) 0.2922095342105659

In [34]:

first_document_vector.T.todense()
[0. ], #Dense matrix representation of Transpose of first_document_vector
[0. ],
[0. ],
[0. ],
[0. ],
[0. ],
[0. ],
[0. ],
[0. ],
[0. ],
[0. ],
[0. ],
[0. ],
[0. ],
[0. ],
[0. ],
[0. ],
[0.56653315],
[0. ],
[0.59175205],
[0 ]
In [35]:

pd.DataFrame(first_document_vector.T.todense(),
index=vectorizer.get_feature_names(),
columns=['Tf-Idf'],
).sort values(by='Tf-Idf',ascending=False)
Out[35]:

Tf-Idf

7 0.591752

5 0.566533

z 0.336021

k 0.292210

d 0.285529

... ...

= 0.000000

< 0.000000

; 0.000000

9 0.000000

™ 0.000000

132 rows × 1 columns

In [36]:

from sklearn.model selection import train test split

In [37]:

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

In [38]:

type(X train)
Out[38]:

scipy.sparse._csr.csr_matrix

In [39]:

X_train.shape
Out[39]:

(535711, 132)

In [40]:

type(y_train)
Out[40]:

list

In [41]:

from sklearn.linear model import LogisticRegression

In [42]:

clf = LogisticRegression(random_state=0,multi_class='multinomial')

In [43]:

clf.fit(X train,y train)

Out[43]:

▾ LogisticRegression
LogisticRegression(multi_class='multinomial', random_state=0)

In [44]:

y_pred=clf.predict(X_test)
y_pred
Out[44]:

array([1, 1, 1, ..., 1, 1, 2])

In [45]:

from sklearn.metrics import confusion_matrix, accuracy_score

In [46]:

confusion_matrix(y_test,y_pred)
Out[46]:

array([[ 5381, 12513, 8],

[ 3864, 93046, 2685],
[ 37, 5033, 11361]], dtype=int64)

In [47]:

accuracy_score(y_test,y_pred)
Out[47]:

0.8197538976166299

In [68]:

dt = ['ru76799sdhoh%41'] #Predicting strength of password 'ru76799sdhoh%41'

dt = vectorizer.transform(dt)
clf.predict(dt)
Out[68]:

array([1])

Clasification is 1, means password is average

In [70]:

dt = ['a1'] #Predicting strength of password 'a1'

dt = vectorizer.transform(dt)
clf.predict(dt)
Out[70]:

array([0])

Clasification is 0, means password is weak

In [80]:

dt = ['AsD234Ads&^%SGSJ7736SK1'] #Predicting strength of password 'AsD234Ads&^%SGSJ7736SK1'

dt = vectorizer.transform(dt)
clf.predict(dt)
Out[80]:

array([2])

Clasification is 2, means password is Strong!!

Complete!!

Boom and Crash Spike Arrow Guide
No ratings yet
Boom and Crash Spike Arrow Guide
1 page
Algo - One Bot Trading Presentation V1.6
No ratings yet
Algo - One Bot Trading Presentation V1.6
15 pages
Differences Between Deriv and Forex Wich One Is The Best, What Is
No ratings yet
Differences Between Deriv and Forex Wich One Is The Best, What Is
28 pages
Trading Hub Official DERIVBOT
No ratings yet
Trading Hub Official DERIVBOT
10 pages
Metatrader 4: Expert Advisors
No ratings yet
Metatrader 4: Expert Advisors
13 pages
MQL4 Build 154 (Commands & Samples)
No ratings yet
MQL4 Build 154 (Commands & Samples)
69 pages
Machine Learning For Asset Management
No ratings yet
Machine Learning For Asset Management
2 pages
FxKeys $1000 Forex Plan
No ratings yet
FxKeys $1000 Forex Plan
2 pages
Statistical Arbitrage in High Frequency Trading Based On Limit Order Book Dynamics
No ratings yet
Statistical Arbitrage in High Frequency Trading Based On Limit Order Book Dynamics
26 pages
Sample Conference Bid Proposal
100% (2)
Sample Conference Bid Proposal
20 pages
Random Number Generator
No ratings yet
Random Number Generator
46 pages
How To Trade Synthetic Indices
No ratings yet
How To Trade Synthetic Indices
52 pages
Discrete Mathematics - Predicate Logic
No ratings yet
Discrete Mathematics - Predicate Logic
3 pages
Boom and Crash (Pip Lord)
No ratings yet
Boom and Crash (Pip Lord)
14 pages
Database Questions
No ratings yet
Database Questions
10 pages
Description Strong System M15 - H1 PDF
No ratings yet
Description Strong System M15 - H1 PDF
6 pages
MACDxxxxxx
100% (1)
MACDxxxxxx
8 pages
Boom and Crash Sofware Pro Version
No ratings yet
Boom and Crash Sofware Pro Version
6 pages
Pipcrest Crash Boom Updated 2023
No ratings yet
Pipcrest Crash Boom Updated 2023
12 pages
Under Market by Mkorean
No ratings yet
Under Market by Mkorean
3 pages
Support and Resistance Level
No ratings yet
Support and Resistance Level
2 pages
Volume Spike Indicator Code
No ratings yet
Volume Spike Indicator Code
1 page
Python For Finance - The Complete Beginner's Guide - by Behic Guven - Jul, 2020 - Towards Data Science PDF
100% (1)
Python For Finance - The Complete Beginner's Guide - by Behic Guven - Jul, 2020 - Towards Data Science PDF
12 pages
WinningStrategie toTradeSyntheticIndices
No ratings yet
WinningStrategie toTradeSyntheticIndices
6 pages
Digits Matches Trsdingsignalbinarycompdf
No ratings yet
Digits Matches Trsdingsignalbinarycompdf
14 pages
Odin Manual
No ratings yet
Odin Manual
21 pages
LSTM Stock Prediction
100% (1)
LSTM Stock Prediction
38 pages
MACD - Pine Script
No ratings yet
MACD - Pine Script
1 page
Algorithmic Trading Training
No ratings yet
Algorithmic Trading Training
2 pages
Ebook Accumulators en HQ
No ratings yet
Ebook Accumulators en HQ
17 pages
Derivative: Differentiation and The Derivative
No ratings yet
Derivative: Differentiation and The Derivative
16 pages
Algorithmic Trading Using Intelligent Agents
No ratings yet
Algorithmic Trading Using Intelligent Agents
8 pages
How To Easily Remove RAR Password
100% (1)
How To Easily Remove RAR Password
4 pages
READ ME - Gecko (Lite) Trader's Guide
No ratings yet
READ ME - Gecko (Lite) Trader's Guide
2 pages
NO Touch Strategy Description 1
100% (1)
NO Touch Strategy Description 1
4 pages
BINARY TRADING BY MIKE
No ratings yet
BINARY TRADING BY MIKE
4 pages
Grid, Martingale and Hedging
No ratings yet
Grid, Martingale and Hedging
2 pages
Session 4
No ratings yet
Session 4
11 pages
Forex Factory - Forex Markets For The Smart Money.
No ratings yet
Forex Factory - Forex Markets For The Smart Money.
4 pages
Good - Analyzer - 3 (Manual - En)
No ratings yet
Good - Analyzer - 3 (Manual - En)
6 pages
Algorithmic Trading Week 1
No ratings yet
Algorithmic Trading Week 1
9 pages
A Hybrid Deep Learning Approach by Integrating LSTM-ANN Networks
No ratings yet
A Hybrid Deep Learning Approach by Integrating LSTM-ANN Networks
33 pages
How-Trade-Analyze Synthetic Indexes On Deriv Platform
No ratings yet
How-Trade-Analyze Synthetic Indexes On Deriv Platform
53 pages
LAB REPORT Database
No ratings yet
LAB REPORT Database
5 pages
Review of Deep Learning Models For Crypto Prices Prediction
No ratings yet
Review of Deep Learning Models For Crypto Prices Prediction
29 pages
Forex Samurai Robot: User's Guide For
100% (1)
Forex Samurai Robot: User's Guide For
7 pages
Understanding The Deriv Killer 3 by Finestburu
No ratings yet
Understanding The Deriv Killer 3 by Finestburu
4 pages
Franken Strategy Multilpe Pairs-DCA
No ratings yet
Franken Strategy Multilpe Pairs-DCA
4 pages
Over Under Strategies
No ratings yet
Over Under Strategies
2 pages
Terms and Jargons For Trading by EmperorBTC
No ratings yet
Terms and Jargons For Trading by EmperorBTC
9 pages
Numpy Cheat Sheet & Quick Reference
No ratings yet
Numpy Cheat Sheet & Quick Reference
5 pages
Strategy - XXXZ - ZX
No ratings yet
Strategy - XXXZ - ZX
1 page
Forex Robots and How They Work
No ratings yet
Forex Robots and How They Work
6 pages
10indicators
No ratings yet
10indicators
8 pages
MacArthur, R - MAWG Thesis
No ratings yet
MacArthur, R - MAWG Thesis
78 pages
Oisches Capital Boom and Crash Strategy - PDF - Economies - Business 2
No ratings yet
Oisches Capital Boom and Crash Strategy - PDF - Economies - Business 2
16 pages
Optiver Interview
No ratings yet
Optiver Interview
15 pages
Brownian Motion - Is It Really Possible To Create A Robust Algorithmic Trading Strategy For Intraday Trading - Quantitative Finance Stack Exchange
No ratings yet
Brownian Motion - Is It Really Possible To Create A Robust Algorithmic Trading Strategy For Intraday Trading - Quantitative Finance Stack Exchange
8 pages
NLP PDF
No ratings yet
NLP PDF
17 pages
16 - Practical - 6-7.ipynb - Colab
No ratings yet
16 - Practical - 6-7.ipynb - Colab
3 pages
Exp 7 Text Sequence Generator LSTM
No ratings yet
Exp 7 Text Sequence Generator LSTM
12 pages
Walmart Sales Prediction
No ratings yet
Walmart Sales Prediction
21 pages
2014 Code of Conduct
No ratings yet
2014 Code of Conduct
31 pages
The Use of Information and Communication Technology (ICT) Policy
No ratings yet
The Use of Information and Communication Technology (ICT) Policy
5 pages
ICT Policy Development Process in Africa
No ratings yet
ICT Policy Development Process in Africa
31 pages
Argument ForAndAgainst CSR
No ratings yet
Argument ForAndAgainst CSR
15 pages
Trespass Act 1959
No ratings yet
Trespass Act 1959
2 pages
Cyber Security Governance
0% (2)
Cyber Security Governance
20 pages
Salary Delay Complaint Letter - 3
100% (1)
Salary Delay Complaint Letter - 3
1 page
Woolworth 2015 Integrated Report
No ratings yet
Woolworth 2015 Integrated Report
71 pages
Transcending E-Government: A Case of Mobile Government in Beijing
No ratings yet
Transcending E-Government: A Case of Mobile Government in Beijing
9 pages
IC4D 2012 Executive Summary
No ratings yet
IC4D 2012 Executive Summary
8 pages
I.R.R - Internal Rate of Return Explained: Finance & Property Research Pty LTD
No ratings yet
I.R.R - Internal Rate of Return Explained: Finance & Property Research Pty LTD
4 pages
Mindjet Project Management Whitepaper
No ratings yet
Mindjet Project Management Whitepaper
6 pages
Cash Conversion Inventory and Receivables Management
100% (3)
Cash Conversion Inventory and Receivables Management
7 pages
Succeeding at New Products The PG Way
No ratings yet
Succeeding at New Products The PG Way
12 pages