CAR CLASSIFICATION PROJECT

K-NEAREST NEIGHBOURS

A classification algorithm that labels a point by the majority vote of its nearest neighbours; it works well with irregular, non-linearly separable data.

Attributes

  1. Buying cost ("buying")
  2. Maintenance cost ("maint")
  3. Number of doors ("door")
  4. Number of persons ("persons")
  5. Boot size ("lug_boot")
  6. Safety degree ("safety")

Label / The prediction

  1. class = ["unacc", "acc", "good", "vgood"]
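
  • For reference, each row of car.data stores the six attribute values followed by the class label; the first row of the raw UCI car evaluation file looks like this:

    vhigh,vhigh,2,2,small,low,unacc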

Requirements

  1. pandas

    import pandas as pd

  2. numpy

    import numpy as np

  3. sklearn ('model_selection' is needed for the train/test split in STEP 4)

    import sklearn
    from sklearn import linear_model, preprocessing, model_selection
    from sklearn.utils import shuffle
    from sklearn.neighbors import KNeighborsClassifier

Steps of the project:

STEP 1:

  • First we read in our dataset, using pandas.

    data = pd.read_csv("car.data")
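  • Note: the code below assumes car.data has a header row with these column names. The raw UCI file has none, so if yours doesn't, pass the names yourself (a sketch, assuming the UCI column order):

    data = pd.read_csv("car.data",
                       names=["buying", "maint", "door", "persons",
                              "lug_boot", "safety", "class"])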

STEP 2:

  • Encode the non-numeric (string) values as integers, since KNN needs numbers to compute distances.

    encode = preprocessing.LabelEncoder()
    
    buying = encode.fit_transform(list(data["buying"]))
    maint = encode.fit_transform(list(data["maint"]))
    door = encode.fit_transform(list(data["door"]))
    persons = encode.fit_transform(list(data["persons"]))
    lug_boot = encode.fit_transform(list(data["lug_boot"]))
    safety = encode.fit_transform(list(data["safety"]))
    cls = encode.fit_transform(list(data["class"]))
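
  • A quick sketch of what LabelEncoder does (hypothetical values): each distinct string is mapped to an integer, with the strings numbered in alphabetical order.

    # encode.fit_transform(["low", "med", "high", "med"])
    # -> array([1, 2, 0, 2])    ("high"=0, "low"=1, "med"=2)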

STEP 3:

  • Here we define what we want to predict (the label) and the attributes used to predict it.

    predict = "class"  # the name of the label column
    x = list(zip(buying, maint, door, persons, lug_boot, safety))
  • This line combines the encoded attributes into one list of feature tuples for prediction.

    y = list(cls)
  • This line holds only the encoded 'class' values, i.e. the label.

STEP 4:

  • We split x and y into four sets ('x_train', 'x_test', 'y_train', 'y_test'). Using sklearn!

    x_train, x_test, y_train, y_test = sklearn.model_selection.train_test_split(x, y, test_size=0.1)
  • This line splits our data x and y into TRAINING and TESTING portions (the rows are shuffled by default).

  • 'test_size=0.1' means that 10% of our dataset will be held out for testing.

STEP 5:

  • Create and train the model.

    model = KNeighborsClassifier(n_neighbors=7)
    model.fit(x_train, y_train)
  • 'n_neighbors=7' means each point is classified by a vote among its 7 nearest neighbours.

  • You can use more or fewer, e.g. n_neighbors=5, n_neighbors=11, etc.

  • Odd values are preferred, since they reduce the chance of tied votes.

    accuracy = model.score(x_test, y_test)
    print(accuracy)
  • model.score() measures how accurate the model is on the test data (the fraction of correct predictions).
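
  • A minimal sketch (assuming 'x_train', 'y_train', 'x_test', 'y_test' from STEP 4) for comparing a few candidate values of k:

    for k in (3, 5, 7, 9, 11):
        m = KNeighborsClassifier(n_neighbors=k)
        m.fit(x_train, y_train)
        print(k, m.score(x_test, y_test))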

HOW K-NEAREST NEIGHBOURS WORKS

(Whiteboard sketches illustrating how KNN classifies a point by looking at its nearest neighbours.)
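
In short: to classify a new point, KNN measures the distance from that point to every training point, takes the k closest ones, and lets them vote on the class. A minimal from-scratch sketch of the idea (illustrative only, not sklearn's actual implementation; 'knn_predict' is a hypothetical helper):

    import numpy as np
    from collections import Counter

    def knn_predict(x_train, y_train, point, k=7):
        # Euclidean distance from the query point to every training point
        dists = np.linalg.norm(np.asarray(x_train) - np.asarray(point), axis=1)
        # indices of the k closest training points
        nearest = np.argsort(dists)[:k]
        # majority vote among their labels
        return Counter(np.asarray(y_train)[nearest]).most_common(1)[0][0]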

STEP 6:

```python
prediction = model.predict(x_test)
# LabelEncoder numbers the classes alphabetically, so index 0 is "acc":
names = ["acc", "good", "unacc", "vgood"]

for i in range(len(prediction)):
    print(names[prediction[i]], x_test[i], names[y_test[i]])
    # or, more readably:
    print("Predicted:", names[prediction[i]], "Actual:", names[y_test[i]])
```
  • On the first line our model makes a prediction for every sample in the test set.
  • The for loop iterates through each prediction.
  • The names list maps each encoded label back to its readable class name.

STEP 7:

  • Saving our model.

  • If we were to save our model it would consume a lot of space; this is one of the limitations of this algorithm.

  • That is because KNN keeps the entire training set: saving the model means saving every training point, since distances to all of them must be computed for each new prediction.
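
  • Still, if you do want to persist it, here is a minimal sketch using pickle (the file name "knn_model.pickle" is just an example):

    import pickle

    with open("knn_model.pickle", "wb") as f:
        pickle.dump(model, f)

    with open("knn_model.pickle", "rb") as f:
        model = pickle.load(f)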

The END