{
"cells": [
{
"cell_type": "markdown",
"id": "c093ea0c",
"metadata": {},
"source": [
"# Seminar 2: Predicting Biodegradability of Chemicals"
]
},
{
"cell_type": "markdown",
"id": "7aa30d7d",
"metadata": {},
"source": [
"## 1. Introduction\n",
"Chemicals are all around us. Studying their properties by means of machine learning is an active\n",
"research field; matching molecular patterns with their behavior can be a decisive factor in the creation of\n",
"new materials, drugs, and more.\n",
"In this seminar assignment, your task is to explore the data and build machine-learning models that\n",
"predict the biodegradability of chemicals."
]
},
{
"cell_type": "markdown",
"id": "aeab08c8",
"metadata": {},
"source": [
"## 2. Task\n",
"You will work with the data set compiled by Mansouri et al. ([data](https://www.openml.org/search?type=data&status=active&id=1494&sort=runs)). There are 41 features and one target variable (biodegradability).\n",
"The target variable is encoded as ready biodegradable (1) and not ready biodegradable (2). The data set\n",
"consists of 1055 instances. Features can be either symbolic or numeric.\n",
"IMPORTANT: Use the dataset provided on učilnica and NOT the one posted at the link above. It is\n",
"minimally modified and split into train and test sets.\n"
]
},
{
"cell_type": "markdown",
"id": "a4f197dd",
"metadata": {},
"source": [
"### 2.1 Exploration\n",
"Inspect the dataset. How balanced is the target variable? Are there any missing values present? If there\n",
"are, choose a strategy that takes this into account.\n",
"Most of your data is of the numeric type. Can you identify, by adopting exploratory analysis, whether\n",
"some features are directly related to the target? What about feature pairs? Produce at least three types of\n",
"visualizations of the feature space and be prepared to argue why these visualizations were useful for your\n",
"subsequent analysis."
]
},
{
"cell_type": "code",
"execution_count": 129,
"id": "5bcf6290",
"metadata": {},
"outputs": [],
"source": [
"# Needed imports\n",
"import numpy as np\n",
"import pandas as pd\n",
"import matplotlib.pyplot as plt\n",
"import sklearn\n",
"import seaborn as sns\n",
"import scikitplot as skplt\n",
"import warnings\n",
"warnings.filterwarnings('ignore')\n"
]
},
{
"cell_type": "code",
"execution_count": 130,
"id": "18ff4f76",
"metadata": {},
"outputs": [],
"source": [
"df_train = pd.read_csv('train.csv')\n",
"df_test = pd.read_csv('test.csv')"
]
},
{
"cell_type": "markdown",
"id": "ea26bfdf",
"metadata": {},
"source": [
"#### Let's inspect the training and test data"
]
},
{
"cell_type": "code",
"execution_count": 131,
"id": "5933f4d7",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>V1</th>\n",
" <th>V2</th>\n",
" <th>V3</th>\n",
" <th>V4</th>\n",
" <th>V5</th>\n",
" <th>V6</th>\n",
" <th>V7</th>\n",
" <th>V8</th>\n",
" <th>V9</th>\n",
" <th>V10</th>\n",
" <th>...</th>\n",
" <th>V33</th>\n",
" <th>V34</th>\n",
" <th>V35</th>\n",
" <th>V36</th>\n",
" <th>V37</th>\n",
" <th>V38</th>\n",
" <th>V39</th>\n",
" <th>V40</th>\n",
" <th>V41</th>\n",
" <th>Class</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>3.919</td>\n",
" <td>2.6909</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>31.4</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>2.949</td>\n",
" <td>1.591</td>\n",
" <td>0</td>\n",
" <td>7.253</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>4.170</td>\n",
" <td>2.1144</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>30.8</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>3.315</td>\n",
" <td>1.967</td>\n",
" <td>0</td>\n",
" <td>7.257</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>3.000</td>\n",
" <td>2.7098</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>20.0</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>3.046</td>\n",
" <td>5.000</td>\n",
" <td>0</td>\n",
" <td>6.690</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>13</th>\n",
" <td>4.214</td>\n",
" <td>2.6272</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>30.0</td>\n",
" <td>3</td>\n",
" <td>0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>2.998</td>\n",
" <td>1.722</td>\n",
" <td>0</td>\n",
" <td>6.770</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>16</th>\n",
" <td>3.942</td>\n",
" <td>2.7719</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>31.6</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>3.542</td>\n",
" <td>1.739</td>\n",
" <td>0</td>\n",
" <td>8.127</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>2</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>5 rows × 42 columns</p>\n",
"</div>"
],
"text/plain": [
" V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 ... V33 V34 V35 \\\n",
"1 3.919 2.6909 0 0 0 0 0 31.4 2 0 ... 0 0 0 \n",
"2 4.170 2.1144 0 0 0 0 0 30.8 1 1 ... 0 0 0 \n",
"4 3.000 2.7098 0 0 0 0 0 20.0 0 2 ... 0 0 1 \n",
"13 4.214 2.6272 0 0 0 0 0 30.0 3 0 ... 0 0 0 \n",
"16 3.942 2.7719 1 0 0 0 0 31.6 2 0 ... 0 0 0 \n",
"\n",
" V36 V37 V38 V39 V40 V41 Class \n",
"1 2.949 1.591 0 7.253 0 0 2 \n",
"2 3.315 1.967 0 7.257 0 0 2 \n",
"4 3.046 5.000 0 6.690 0 0 2 \n",
"13 2.998 1.722 0 6.770 0 0 2 \n",
"16 3.542 1.739 0 8.127 0 1 2 \n",
"\n",
"[5 rows x 42 columns]"
]
},
"execution_count": 131,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_test.head()"
]
},
{
"cell_type": "code",
"execution_count": 132,
"id": "1743d191",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>V1</th>\n",
" <th>V2</th>\n",
" <th>V3</th>\n",
" <th>V4</th>\n",
" <th>V5</th>\n",
" <th>V6</th>\n",
" <th>V7</th>\n",
" <th>V8</th>\n",
" <th>V9</th>\n",
" <th>V10</th>\n",
" <th>...</th>\n",
" <th>V33</th>\n",
" <th>V34</th>\n",
" <th>V35</th>\n",
" <th>V36</th>\n",
" <th>V37</th>\n",
" <th>V38</th>\n",
" <th>V39</th>\n",
" <th>V40</th>\n",
" <th>V41</th>\n",
" <th>Class</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>count</th>\n",
" <td>846.000000</td>\n",
" <td>846.000000</td>\n",
" <td>846.000000</td>\n",
" <td>821.000000</td>\n",
" <td>846.000000</td>\n",
" <td>846.000000</td>\n",
" <td>846.000000</td>\n",
" <td>846.000000</td>\n",
" <td>846.000000</td>\n",
" <td>846.000000</td>\n",
" <td>...</td>\n",
" <td>846.000000</td>\n",
" <td>846.000000</td>\n",
" <td>846.000000</td>\n",
" <td>846.000000</td>\n",
" <td>821.000000</td>\n",
" <td>846.000000</td>\n",
" <td>846.000000</td>\n",
" <td>846.000000</td>\n",
" <td>846.000000</td>\n",
" <td>846.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>mean</th>\n",
" <td>4.790476</td>\n",
" <td>3.054551</td>\n",
" <td>0.739953</td>\n",
" <td>0.030451</td>\n",
" <td>0.946809</td>\n",
" <td>0.277778</td>\n",
" <td>1.669031</td>\n",
" <td>37.422813</td>\n",
" <td>1.342790</td>\n",
" <td>1.784870</td>\n",
" <td>...</td>\n",
" <td>0.903073</td>\n",
" <td>1.241135</td>\n",
" <td>0.926714</td>\n",
" <td>3.922100</td>\n",
" <td>2.549406</td>\n",
" <td>0.671395</td>\n",
" <td>8.643191</td>\n",
" <td>0.059102</td>\n",
" <td>0.706856</td>\n",
" <td>1.333333</td>\n",
" </tr>\n",
" <tr>\n",
" <th>std</th>\n",
" <td>0.531991</td>\n",
" <td>0.813983</td>\n",
" <td>1.504545</td>\n",
" <td>0.198281</td>\n",
" <td>2.318081</td>\n",
" <td>1.045544</td>\n",
" <td>2.220221</td>\n",
" <td>9.030008</td>\n",
" <td>2.018433</td>\n",
" <td>1.773856</td>\n",
" <td>...</td>\n",
" <td>1.526124</td>\n",
" <td>2.248684</td>\n",
" <td>1.239133</td>\n",
" <td>0.992636</td>\n",
" <td>0.625021</td>\n",
" <td>1.093633</td>\n",
" <td>1.223700</td>\n",
" <td>0.342364</td>\n",
" <td>2.145396</td>\n",
" <td>0.471683</td>\n",
" </tr>\n",
" <tr>\n",
" <th>min</th>\n",
" <td>2.000000</td>\n",
" <td>0.803900</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>9.100000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>...</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>2.279000</td>\n",
" <td>1.467000</td>\n",
" <td>0.000000</td>\n",
" <td>4.948000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>1.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25%</th>\n",
" <td>4.499000</td>\n",
" <td>2.510175</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>30.800000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>...</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>3.497000</td>\n",
" <td>2.101000</td>\n",
" <td>0.000000</td>\n",
" <td>8.009500</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>1.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>50%</th>\n",
" <td>4.840000</td>\n",
" <td>3.052400</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>1.000000</td>\n",
" <td>37.850000</td>\n",
" <td>1.000000</td>\n",
" <td>1.500000</td>\n",
" <td>...</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>1.000000</td>\n",
" <td>3.732500</td>\n",
" <td>2.461000</td>\n",
" <td>0.000000</td>\n",
" <td>8.508000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>1.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>75%</th>\n",
" <td>5.119000</td>\n",
" <td>3.415725</td>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>3.000000</td>\n",
" <td>43.800000</td>\n",
" <td>2.000000</td>\n",
" <td>3.000000</td>\n",
" <td>...</td>\n",
" <td>1.000000</td>\n",
" <td>2.000000</td>\n",
" <td>1.000000</td>\n",
" <td>3.980000</td>\n",
" <td>2.861000</td>\n",
" <td>1.000000</td>\n",
" <td>9.019750</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>2.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>max</th>\n",
" <td>6.496000</td>\n",
" <td>7.918400</td>\n",
" <td>12.000000</td>\n",
" <td>2.000000</td>\n",
" <td>36.000000</td>\n",
" <td>13.000000</td>\n",
" <td>18.000000</td>\n",
" <td>60.700000</td>\n",
" <td>24.000000</td>\n",
" <td>12.000000</td>\n",
" <td>...</td>\n",
" <td>12.000000</td>\n",
" <td>18.000000</td>\n",
" <td>7.000000</td>\n",
" <td>10.695000</td>\n",
" <td>5.750000</td>\n",
" <td>8.000000</td>\n",
" <td>14.700000</td>\n",
" <td>4.000000</td>\n",
" <td>27.000000</td>\n",
" <td>2.000000</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>8 rows × 42 columns</p>\n",
"</div>"
],
"text/plain": [
" V1 V2 V3 V4 V5 V6 \\\n",
"count 846.000000 846.000000 846.000000 821.000000 846.000000 846.000000 \n",
"mean 4.790476 3.054551 0.739953 0.030451 0.946809 0.277778 \n",
"std 0.531991 0.813983 1.504545 0.198281 2.318081 1.045544 \n",
"min 2.000000 0.803900 0.000000 0.000000 0.000000 0.000000 \n",
"25% 4.499000 2.510175 0.000000 0.000000 0.000000 0.000000 \n",
"50% 4.840000 3.052400 0.000000 0.000000 0.000000 0.000000 \n",
"75% 5.119000 3.415725 1.000000 0.000000 1.000000 0.000000 \n",
"max 6.496000 7.918400 12.000000 2.000000 36.000000 13.000000 \n",
"\n",
" V7 V8 V9 V10 ... V33 \\\n",
"count 846.000000 846.000000 846.000000 846.000000 ... 846.000000 \n",
"mean 1.669031 37.422813 1.342790 1.784870 ... 0.903073 \n",
"std 2.220221 9.030008 2.018433 1.773856 ... 1.526124 \n",
"min 0.000000 9.100000 0.000000 0.000000 ... 0.000000 \n",
"25% 0.000000 30.800000 0.000000 0.000000 ... 0.000000 \n",
"50% 1.000000 37.850000 1.000000 1.500000 ... 0.000000 \n",
"75% 3.000000 43.800000 2.000000 3.000000 ... 1.000000 \n",
"max 18.000000 60.700000 24.000000 12.000000 ... 12.000000 \n",
"\n",
" V34 V35 V36 V37 V38 V39 \\\n",
"count 846.000000 846.000000 846.000000 821.000000 846.000000 846.000000 \n",
"mean 1.241135 0.926714 3.922100 2.549406 0.671395 8.643191 \n",
"std 2.248684 1.239133 0.992636 0.625021 1.093633 1.223700 \n",
"min 0.000000 0.000000 2.279000 1.467000 0.000000 4.948000 \n",
"25% 0.000000 0.000000 3.497000 2.101000 0.000000 8.009500 \n",
"50% 0.000000 1.000000 3.732500 2.461000 0.000000 8.508000 \n",
"75% 2.000000 1.000000 3.980000 2.861000 1.000000 9.019750 \n",
"max 18.000000 7.000000 10.695000 5.750000 8.000000 14.700000 \n",
"\n",
" V40 V41 Class \n",
"count 846.000000 846.000000 846.000000 \n",
"mean 0.059102 0.706856 1.333333 \n",
"std 0.342364 2.145396 0.471683 \n",
"min 0.000000 0.000000 1.000000 \n",
"25% 0.000000 0.000000 1.000000 \n",
"50% 0.000000 0.000000 1.000000 \n",
"75% 0.000000 0.000000 2.000000 \n",
"max 4.000000 27.000000 2.000000 \n",
"\n",
"[8 rows x 42 columns]"
]
},
"execution_count": 132,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_train.describe()"
]
},
{
"cell_type": "code",
"execution_count": 133,
"id": "b2689ec0",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"Int64Index: 846 entries, 3 to 1055\n",
"Data columns (total 42 columns):\n",
" # Column Non-Null Count Dtype \n",
"--- ------ -------------- ----- \n",
" 0 V1 846 non-null float64\n",
" 1 V2 846 non-null float64\n",
" 2 V3 846 non-null int64 \n",
" 3 V4 821 non-null float64\n",
" 4 V5 846 non-null int64 \n",
" 5 V6 846 non-null int64 \n",
" 6 V7 846 non-null int64 \n",
" 7 V8 846 non-null float64\n",
" 8 V9 846 non-null int64 \n",
" 9 V10 846 non-null int64 \n",
" 10 V11 846 non-null int64 \n",
" 11 V12 846 non-null float64\n",
" 12 V13 846 non-null float64\n",
" 13 V14 846 non-null float64\n",
" 14 V15 846 non-null float64\n",
" 15 V16 846 non-null int64 \n",
" 16 V17 846 non-null float64\n",
" 17 V18 846 non-null float64\n",
" 18 V19 846 non-null int64 \n",
" 19 V20 846 non-null int64 \n",
" 20 V21 846 non-null int64 \n",
" 21 V22 830 non-null float64\n",
" 22 V23 846 non-null int64 \n",
" 23 V24 846 non-null int64 \n",
" 24 V25 846 non-null int64 \n",
" 25 V26 846 non-null int64 \n",
" 26 V27 838 non-null float64\n",
" 27 V28 846 non-null float64\n",
" 28 V29 838 non-null float64\n",
" 29 V30 846 non-null float64\n",
" 30 V31 846 non-null float64\n",
" 31 V32 846 non-null int64 \n",
" 32 V33 846 non-null int64 \n",
" 33 V34 846 non-null int64 \n",
" 34 V35 846 non-null int64 \n",
" 35 V36 846 non-null float64\n",
" 36 V37 821 non-null float64\n",
" 37 V38 846 non-null int64 \n",
" 38 V39 846 non-null float64\n",
" 39 V40 846 non-null int64 \n",
" 40 V41 846 non-null int64 \n",
" 41 Class 846 non-null int64 \n",
"dtypes: float64(19), int64(23)\n",
"memory usage: 284.2 KB\n"
]
}
],
"source": [
"df_train.info()"
]
},
{
"cell_type": "code",
"execution_count": 134,
"id": "22003f33",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>V1</th>\n",
" <th>V2</th>\n",
" <th>V3</th>\n",
" <th>V4</th>\n",
" <th>V5</th>\n",
" <th>V6</th>\n",
" <th>V7</th>\n",
" <th>V8</th>\n",
" <th>V9</th>\n",
" <th>V10</th>\n",
" <th>...</th>\n",
" <th>V33</th>\n",
" <th>V34</th>\n",
" <th>V35</th>\n",
" <th>V36</th>\n",
" <th>V37</th>\n",
" <th>V38</th>\n",
" <th>V39</th>\n",
" <th>V40</th>\n",
" <th>V41</th>\n",
" <th>Class</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>3.919</td>\n",
" <td>2.6909</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>31.4</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>2.949</td>\n",
" <td>1.591</td>\n",
" <td>0</td>\n",
" <td>7.253</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>4.170</td>\n",
" <td>2.1144</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>30.8</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>3.315</td>\n",
" <td>1.967</td>\n",
" <td>0</td>\n",
" <td>7.257</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>3.000</td>\n",
" <td>2.7098</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>20.0</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>3.046</td>\n",
" <td>5.000</td>\n",
" <td>0</td>\n",
" <td>6.690</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>13</th>\n",
" <td>4.214</td>\n",
" <td>2.6272</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>30.0</td>\n",
" <td>3</td>\n",
" <td>0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>2.998</td>\n",
" <td>1.722</td>\n",
" <td>0</td>\n",
" <td>6.770</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>16</th>\n",
" <td>3.942</td>\n",
" <td>2.7719</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>31.6</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>3.542</td>\n",
" <td>1.739</td>\n",
" <td>0</td>\n",
" <td>8.127</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>2</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>5 rows × 42 columns</p>\n",
"</div>"
],
"text/plain": [
" V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 ... V33 V34 V35 \\\n",
"1 3.919 2.6909 0 0 0 0 0 31.4 2 0 ... 0 0 0 \n",
"2 4.170 2.1144 0 0 0 0 0 30.8 1 1 ... 0 0 0 \n",
"4 3.000 2.7098 0 0 0 0 0 20.0 0 2 ... 0 0 1 \n",
"13 4.214 2.6272 0 0 0 0 0 30.0 3 0 ... 0 0 0 \n",
"16 3.942 2.7719 1 0 0 0 0 31.6 2 0 ... 0 0 0 \n",
"\n",
" V36 V37 V38 V39 V40 V41 Class \n",
"1 2.949 1.591 0 7.253 0 0 2 \n",
"2 3.315 1.967 0 7.257 0 0 2 \n",
"4 3.046 5.000 0 6.690 0 0 2 \n",
"13 2.998 1.722 0 6.770 0 0 2 \n",
"16 3.542 1.739 0 8.127 0 1 2 \n",
"\n",
"[5 rows x 42 columns]"
]
},
"execution_count": 134,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_test.head()"
]
},
{
"cell_type": "code",
"execution_count": 135,
"id": "d7235214",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>V1</th>\n",
" <th>V2</th>\n",
" <th>V3</th>\n",
" <th>V4</th>\n",
" <th>V5</th>\n",
" <th>V6</th>\n",
" <th>V7</th>\n",
" <th>V8</th>\n",
" <th>V9</th>\n",
" <th>V10</th>\n",
" <th>...</th>\n",
" <th>V33</th>\n",
" <th>V34</th>\n",
" <th>V35</th>\n",
" <th>V36</th>\n",
" <th>V37</th>\n",
" <th>V38</th>\n",
" <th>V39</th>\n",
" <th>V40</th>\n",
" <th>V41</th>\n",
" <th>Class</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>count</th>\n",
" <td>209.000000</td>\n",
" <td>209.000000</td>\n",
" <td>209.00000</td>\n",
" <td>209.000000</td>\n",
" <td>209.000000</td>\n",
" <td>209.000000</td>\n",
" <td>209.000000</td>\n",
" <td>209.000000</td>\n",
" <td>209.000000</td>\n",
" <td>209.000000</td>\n",
" <td>...</td>\n",
" <td>209.000000</td>\n",
" <td>209.000000</td>\n",
" <td>209.000000</td>\n",
" <td>209.000000</td>\n",
" <td>209.000000</td>\n",
" <td>209.000000</td>\n",
" <td>209.000000</td>\n",
" <td>209.000000</td>\n",
" <td>209.000000</td>\n",
" <td>209.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>mean</th>\n",
" <td>4.750938</td>\n",
" <td>3.130050</td>\n",
" <td>0.62201</td>\n",
" <td>0.086124</td>\n",
" <td>1.114833</td>\n",
" <td>0.339713</td>\n",
" <td>1.555024</td>\n",
" <td>35.569378</td>\n",
" <td>1.511962</td>\n",
" <td>1.880383</td>\n",
" <td>...</td>\n",
" <td>0.803828</td>\n",
" <td>1.411483</td>\n",
" <td>1.100478</td>\n",
" <td>3.902612</td>\n",
" <td>2.629201</td>\n",
" <td>0.746411</td>\n",
" <td>8.574038</td>\n",
" <td>0.019139</td>\n",
" <td>0.789474</td>\n",
" <td>1.354067</td>\n",
" </tr>\n",
" <tr>\n",
" <th>std</th>\n",
" <td>0.603914</td>\n",
" <td>0.897556</td>\n",
" <td>1.27690</td>\n",
" <td>0.406969</td>\n",
" <td>2.393143</td>\n",
" <td>1.182566</td>\n",
" <td>2.246383</td>\n",
" <td>9.471334</td>\n",
" <td>1.721220</td>\n",
" <td>1.784023</td>\n",
" <td>...</td>\n",
" <td>1.498327</td>\n",
" <td>2.374355</td>\n",
" <td>1.320857</td>\n",
" <td>1.029605</td>\n",
" <td>0.714285</td>\n",
" <td>1.077657</td>\n",
" <td>1.315016</td>\n",
" <td>0.195176</td>\n",
" <td>2.589491</td>\n",
" <td>0.479378</td>\n",
" </tr>\n",
" <tr>\n",
" <th>min</th>\n",
" <td>2.000000</td>\n",
" <td>1.134900</td>\n",
" <td>0.00000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>...</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>2.267000</td>\n",
" <td>1.576000</td>\n",
" <td>0.000000</td>\n",
" <td>4.917000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>1.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25%</th>\n",
" <td>4.414000</td>\n",
" <td>2.494500</td>\n",
" <td>0.00000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>29.400000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>...</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>3.401000</td>\n",
" <td>2.146000</td>\n",
" <td>0.000000</td>\n",
" <td>7.872000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>1.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>50%</th>\n",
" <td>4.807000</td>\n",
" <td>3.039300</td>\n",
" <td>0.00000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>34.200000</td>\n",
" <td>1.000000</td>\n",
" <td>2.000000</td>\n",
" <td>...</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>1.000000</td>\n",
" <td>3.694000</td>\n",
" <td>2.469000</td>\n",
" <td>0.000000</td>\n",
" <td>8.464000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>1.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>75%</th>\n",
" <td>5.188000</td>\n",
" <td>3.555400</td>\n",
" <td>1.00000</td>\n",
" <td>0.000000</td>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>3.000000</td>\n",
" <td>41.200000</td>\n",
" <td>2.000000</td>\n",
" <td>3.000000</td>\n",
" <td>...</td>\n",
" <td>1.000000</td>\n",
" <td>2.000000</td>\n",
" <td>2.000000</td>\n",
" <td>3.991000</td>\n",
" <td>2.967000</td>\n",
" <td>1.000000</td>\n",
" <td>9.017000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>2.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>max</th>\n",
" <td>6.253000</td>\n",
" <td>9.177500</td>\n",
" <td>8.00000</td>\n",
" <td>3.000000</td>\n",
" <td>16.000000</td>\n",
" <td>12.000000</td>\n",
" <td>14.000000</td>\n",
" <td>60.000000</td>\n",
" <td>9.000000</td>\n",
" <td>11.000000</td>\n",
" <td>...</td>\n",
" <td>12.000000</td>\n",
" <td>18.000000</td>\n",
" <td>6.000000</td>\n",
" <td>10.355000</td>\n",
" <td>5.825000</td>\n",
" <td>6.000000</td>\n",
" <td>14.030000</td>\n",
" <td>2.000000</td>\n",
" <td>27.000000</td>\n",
" <td>2.000000</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>8 rows × 42 columns</p>\n",
"</div>"
],
"text/plain": [
" V1 V2 V3 V4 V5 V6 \\\n",
"count 209.000000 209.000000 209.00000 209.000000 209.000000 209.000000 \n",
"mean 4.750938 3.130050 0.62201 0.086124 1.114833 0.339713 \n",
"std 0.603914 0.897556 1.27690 0.406969 2.393143 1.182566 \n",
"min 2.000000 1.134900 0.00000 0.000000 0.000000 0.000000 \n",
"25% 4.414000 2.494500 0.00000 0.000000 0.000000 0.000000 \n",
"50% 4.807000 3.039300 0.00000 0.000000 0.000000 0.000000 \n",
"75% 5.188000 3.555400 1.00000 0.000000 1.000000 0.000000 \n",
"max 6.253000 9.177500 8.00000 3.000000 16.000000 12.000000 \n",
"\n",
" V7 V8 V9 V10 ... V33 \\\n",
"count 209.000000 209.000000 209.000000 209.000000 ... 209.000000 \n",
"mean 1.555024 35.569378 1.511962 1.880383 ... 0.803828 \n",
"std 2.246383 9.471334 1.721220 1.784023 ... 1.498327 \n",
"min 0.000000 0.000000 0.000000 0.000000 ... 0.000000 \n",
"25% 0.000000 29.400000 0.000000 0.000000 ... 0.000000 \n",
"50% 0.000000 34.200000 1.000000 2.000000 ... 0.000000 \n",
"75% 3.000000 41.200000 2.000000 3.000000 ... 1.000000 \n",
"max 14.000000 60.000000 9.000000 11.000000 ... 12.000000 \n",
"\n",
" V34 V35 V36 V37 V38 V39 \\\n",
"count 209.000000 209.000000 209.000000 209.000000 209.000000 209.000000 \n",
"mean 1.411483 1.100478 3.902612 2.629201 0.746411 8.574038 \n",
"std 2.374355 1.320857 1.029605 0.714285 1.077657 1.315016 \n",
"min 0.000000 0.000000 2.267000 1.576000 0.000000 4.917000 \n",
"25% 0.000000 0.000000 3.401000 2.146000 0.000000 7.872000 \n",
"50% 0.000000 1.000000 3.694000 2.469000 0.000000 8.464000 \n",
"75% 2.000000 2.000000 3.991000 2.967000 1.000000 9.017000 \n",
"max 18.000000 6.000000 10.355000 5.825000 6.000000 14.030000 \n",
"\n",
" V40 V41 Class \n",
"count 209.000000 209.000000 209.000000 \n",
"mean 0.019139 0.789474 1.354067 \n",
"std 0.195176 2.589491 0.479378 \n",
"min 0.000000 0.000000 1.000000 \n",
"25% 0.000000 0.000000 1.000000 \n",
"50% 0.000000 0.000000 1.000000 \n",
"75% 0.000000 0.000000 2.000000 \n",
"max 2.000000 27.000000 2.000000 \n",
"\n",
"[8 rows x 42 columns]"
]
},
"execution_count": 135,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_test.describe()"
]
},
{
"cell_type": "code",
"execution_count": 136,
"id": "9598495e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"Int64Index: 209 entries, 1 to 1051\n",
"Data columns (total 42 columns):\n",
" # Column Non-Null Count Dtype \n",
"--- ------ -------------- ----- \n",
" 0 V1 209 non-null float64\n",
" 1 V2 209 non-null float64\n",
" 2 V3 209 non-null int64 \n",
" 3 V4 209 non-null int64 \n",
" 4 V5 209 non-null int64 \n",
" 5 V6 209 non-null int64 \n",
" 6 V7 209 non-null int64 \n",
" 7 V8 209 non-null float64\n",
" 8 V9 209 non-null int64 \n",
" 9 V10 209 non-null int64 \n",
" 10 V11 209 non-null int64 \n",
" 11 V12 209 non-null float64\n",
" 12 V13 209 non-null float64\n",
" 13 V14 209 non-null float64\n",
" 14 V15 209 non-null float64\n",
" 15 V16 209 non-null int64 \n",
" 16 V17 209 non-null float64\n",
" 17 V18 209 non-null float64\n",
" 18 V19 209 non-null int64 \n",
" 19 V20 209 non-null int64 \n",
" 20 V21 209 non-null int64 \n",
" 21 V22 209 non-null float64\n",
" 22 V23 209 non-null int64 \n",
" 23 V24 209 non-null int64 \n",
" 24 V25 209 non-null int64 \n",
" 25 V26 209 non-null int64 \n",
" 26 V27 209 non-null float64\n",
" 27 V28 209 non-null float64\n",
" 28 V29 209 non-null int64 \n",
" 29 V30 209 non-null float64\n",
" 30 V31 209 non-null float64\n",
" 31 V32 209 non-null int64 \n",
" 32 V33 209 non-null int64 \n",
" 33 V34 209 non-null int64 \n",
" 34 V35 209 non-null int64 \n",
" 35 V36 209 non-null float64\n",
" 36 V37 209 non-null float64\n",
" 37 V38 209 non-null int64 \n",
" 38 V39 209 non-null float64\n",
" 39 V40 209 non-null int64 \n",
" 40 V41 209 non-null int64 \n",
" 41 Class 209 non-null int64 \n",
"dtypes: float64(17), int64(25)\n",
"memory usage: 70.2 KB\n"
]
}
],
"source": [
"df_test.info()"
]
},
{
"cell_type": "markdown",
"id": "84e0c414",
"metadata": {},
"source": [
"#### Display the distribution of the target variable **Class** in the training and validation sets."
]
},
{
"cell_type": "code",
"execution_count": 137,
"id": "5ca239ec",
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAjsAAAHHCAYAAABZbpmkAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8o6BhiAAAACXBIWXMAAA9hAAAPYQGoP6dpAABJz0lEQVR4nO3deVxUZf//8fcIgoiAKyAuuKe4JpaSmrkkKlmm3ZpZoqlZobmUFd8slxa7NZc00+67UlvMrTQzl9zSUmwxNbM0NdcUNE0QFRS4fn/0Y+5GQGEcGDi9no/HPB7Mda4553POmWHec805Z2zGGCMAAACLKubuAgAAAPITYQcAAFgaYQcAAFgaYQcAAFgaYQcAAFgaYQcAAFgaYQcAAFgaYQcAAFgaYQcAAFgaYaeIGzt2rGw2W4Es64477tAdd9xhv//ll1/KZrNpyZIlBbL8fv36qVq1agWyLGclJydr4MCBCg4Ols1m0/Dhw/M8j8x9+scff7i+QOSLq18buXX48GHZbDa99tpr1+3r6tf63LlzZbPZdPjwYZfNMyf9+vVTqVKl8n05+c1ms2ns2LFOPbZatWrq16+fS+tB7hF2CpHMfz6ZtxIlSigkJESRkZGaPn26zp8/75LlnDhxQmPHjtXOnTtdMj9XKsy15cYrr7yiuXPn6rHHHtP777+vhx566Jp9ly1bVnDFXWXr1q0aO3aszp0757Ya8qKo1ftPc/HiRY0dO1Zffvml22pYuXKl02EE11cY9rHTDAqNOXPmGElm/Pjx5v333zfvvvuueeWVV0zHjh2NzWYzoaGhZteuXQ6PuXLlirl06VKelvPdd98ZSWbOnDl5elxqaqpJTU2139+4caORZBYvXpyn+Thb2+XLl01KSorLlpUfmjdvblq2bJmrvr6+viY6OjpL+5gxY4wkc/r0aRdX52jSpElGkjl06FC+LsdVCnO9V782cuvQoUNGkpk0adJ1+2Y+L1wlLS3NXLp0yWRkZLhkfqdPnzaSzJgxY7JMi46ONr6+vi5ZzrXExMS4dBtd7dKlS+bKlStOPTYlJcVcvnzZxRUVrGvt48LO0z0RC9fSuXNnNWvWzH4/NjZWGzZs0F133aW7775bv/zyi3x8fCRJnp6e8vTM39148eJFlSxZUl5eXvm6nOspXry4W5efG6dOnVJYWJi7y3AbY4xSUlLsz0+rKyyvDWd4eHjIw8PD3WW4TVpamjIyMvK070qUKOH08ry9vZ1+LFzA3WkL/5M5svPdd99lO/2VV14xksx//vMfe1t2n/a++OIL07JlSxMQEGB8fX1NnTp1TGxsrDHmf6MxV98yR1LatGlj6tevb77//nvTunVr4+PjY4YNG2af1qZNG/tyMue1YMECExsba4KCgkzJkiVN165dzdGjRx1qCg0NzXYU4+/zvF5t0dHRJjQ01OHxycnJZuTIkaZy5crGy8vL1KlTx0yaNCnLp1VJJiYmxixdutTUr1/feHl5mbCwMLNq1apst/XVEhISzMMPP2wCAwONt7e3adSokZk7d26WbXH1LadRiOz6Zm6fzH26f/9+Ex0dbQICAoy/v7/p16+fuXDhQpZ5vf/++6Zp06amRIkSpkyZMqZXr15Ztv/VMpeRU73vvvuuadu2ralQoYLx8vIy9erVM2+++WaW+YSGhpqoqCizevVqEx4ebry9vc3UqVONMcYcPnzYdO3a1ZQsWdJUqFDBDB8+3KxevdpIMhs3bnSYz7Zt20xkZKTx9/c3Pj4+5vbbbzdff/11ruu9WkxMjPH19c12e91///0mKCjIpKWlGWOMWbZsmenSpYupWLGi8fLyMjVq1DDjx4+3T8+Ul9dGamqqef75503Tpk2Nv7+/KVmypGnVqpXZsGGDwzz/PrIzZcoUU7VqVVOiRAlz++23m927d2e7z67mzP435n//b/6+DTP351dffWVuueUW4+3tbapXr27mzZt3zXllrsfVt8wRgM
yRnePHj5t77rnH+Pr6mvLly5snn3wyy3ZOT083U6dONWFhYcbb29sEBgaaRx55xJw9e/aaNURHR2dbw9/rmzRpkpk6daqpUaOGKVasmNmxY0eu95UxJsuoRl5eq1f/D8zc/l9//bUZMWKEKV++vClZsqTp1q2bOXXqVJZtMmbMGFOxYkXj4+Nj7rjjDrNnz54c/69e7aOPPjJNmzY1pUqVMn5+fqZBgwZm2rRpDn3+/PNPM2zYMPv/0po1a5pXX33VpKenO2zDnPZxYcfIThHy0EMP6f/+7//0xRdfaNCgQdn22bNnj+666y41atRI48ePl7e3tw4cOKAtW7ZIkurVq6fx48frhRde0COPPKLWrVtLkm677Tb7PM6cOaPOnTvr/vvv14MPPqigoKBr1vXyyy/LZrPpmWee0alTpzRt2jR16NBBO3fuzNMn/NzU9nfGGN19993auHGjBgwYoCZNmmjNmjUaNWqUfv/9d02dOtWh/9dff61PPvlEjz/+uPz8/DR9+nT16NFDR48eVbly5XKs69KlS7rjjjt04MABDRkyRNWrV9fixYvVr18/nTt3TsOGDVO9evX0/vvva8SIEapcubKefPJJSVKFChWynef777+vgQMH6tZbb9UjjzwiSapZs6ZDn549e6p69eqaMGGCfvjhB7399tsKDAzUv//9b3ufl19+Wc8//7x69uypgQMH6vTp05oxY4Zuv/127dixQ6VLl852+d27d9evv/6qjz76SFOnTlX58uUd6p01a5bq16+vu+++W56envrss8/0+OOPKyMjQzExMQ7z2rdvn3r37q3Bgwdr0KBBuummm3ThwgW1a9dOJ0+e1LBhwxQcHKz58+dr48aNWWrZsGGDOnfurPDwcI0ZM0bFihXTnDlz1K5dO3311Ve69dZbr1vv1Xr16qWZM2fq888/17/+9S97+8WLF/XZZ5+pX79+9lGNuXPnqlSpUho5cqRKlSqlDRs26IUXXlBSUpImTZrkMN/cvjaSkpL09ttvq3fv3ho0aJDOnz+vd955R5GRkfr222/VpEkTh/7vvfeezp8/r5iYGKWkpOj1119Xu3bttHv37mu+/pzd/9dy4MAB3XfffRowYICio6P17rvvql+/fgoPD1f9+vWzfUyFChU0a9YsPfbYY7r33nvVvXt3SVKjRo3sfdLT0xUZGanmzZvrtdde07p16zR58mTVrFlTjz32mL3f4MGDNXfuXPXv319PPPGEDh06pDfeeEM7duzQli1bchzhHTx4sE6cOKG1a9fq/fffz7bPnDlzlJKSokceeUTe3t4qW7ZsnvdVdnLzWs3J0KFDVaZMGY0ZM0aHDx/WtGnTNGTIEC1cuNDeJzY2VhMnTlTXrl0VGRmpXbt2KTIyUikpKded/9q1a9W7d2+1b9/eXs8vv/yiLVu2aNiwYZL+el20adNGv//+uwYPHqyqVatq69atio2N1cmTJzVt2rRc7eNCzd1pC/9zvZEdY4wJCAgwN998s/3+1Z/2pk6det3jPa51XEybNm2MJDN79uxsp2U3slOpUiWTlJRkb1+0aJGRZF5//XV7W25Gdq5X29UjO8uWLTOSzEsvveTQ77777jM2m80cOHDA3ibJeHl5ObTt2rXLSDIzZszIsqy/mzZtmpFkPvjgA3vb5cuXTUREhClVqpTDumd+Ms6N6x2z8/DDDzu033vvvaZcuXL2+4cPHzYeHh7m5Zdfdui3e/du4+npmaX9atc6BubixYtZ2iIjI02NGjUc2kJDQ40ks3r1aof2yZMnG0lm2bJl9rZLly6ZunXrOozsZGRkmNq1a5vIyEiH0biLFy+a6tWrmzvvvDNX9V4tIyPDVKpUyfTo0cOhPfO5uXnz5muu6+DBg03JkiUdjhHLy2sjLS0tyzE8f/75pwkKCnLYr5mfln18fMzx48ft7d98842RZEaMGGFvu/q1fqP7P6eRnau3z6lTp4y3t7d58sknrzm/6x2zo/9/PO
Lf3XzzzSY8PNx+/6uvvjKSzIcffujQL3NE8Or2q+V0zE7mdvb3988yapLbfWVMziM713utGpPzyE6HDh0cnvsjRow
2022-12-29 10:21:35 +01:00
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"_, _, bars = plt.hist(df_train['Class'], bins=10)\n",
"plt.xlabel('Class')\n",
"plt.ylabel('Frequency')\n",
"plt.title('Distribution of the target variable in the training set')\n",
"plt.bar_label(bars, fmt='%1.0f')\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 138,
"id": "c74f9fb5",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"_, _, bars = plt.hist(df_test['Class'], bins=10)\n",
"plt.xlabel('Class')\n",
"plt.ylabel('Frequency')\n",
"plt.title('Distribution of the target variable in the test set')\n",
"plt.bar_label(bars, fmt='%1.0f')\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"id": "82afd315",
"metadata": {},
"source": [
"#### Display relationship between features in the training set using the correlation matrix"
]
},
{
"cell_type": "code",
"execution_count": 139,
"id": "e8cf8eb1",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(42.5, -0.5)"
]
},
"execution_count": 139,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"<Figure size 2500x2500 with 2 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"correlation_matrix = df_train.corr()\n",
"fig, ax = plt.subplots(figsize=(25, 25))\n",
"\n",
"ax = sns.heatmap(\n",
" correlation_matrix,\n",
" annot=True,\n",
" linewidths=0.5,\n",
" fmt=\".2f\",\n",
" cmap=\"YlGnBu\"\n",
")\n",
"\n",
"# Jupyter notebook specific\n",
"bottom_side, top_side = ax.get_ylim()\n",
"ax.set_ylim(bottom_side + 0.5, top_side - 0.5)"
]
},
{
"cell_type": "markdown",
"id": "c2b4a57c",
"metadata": {},
"source": [
"We can see that there is the highest positive correlation in **V14** atribute and the highest negative value in the attributes **V1, V27** So lets see the distribution of those values in comparrison to class."
]
},
{
"cell_type": "markdown",
"id": "f1918d5b",
"metadata": {},
"source": [
"**V1 vs V27**"
]
},
{
"cell_type": "code",
"execution_count": 140,
"id": "8d4ce9a6",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.legend.Legend at 0x7ff7d9f164a0>"
]
},
"execution_count": 140,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"<Figure size 1500x1000 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"\n",
"plt.figure(figsize=(15, 10))\n",
"\n",
"# Scatter with 1 values of target class\n",
"plt.scatter(\n",
" df_train['V1'][df_train['Class'] == 1],\n",
" df_train['V27'][df_train['Class'] == 1],\n",
")\n",
"\n",
"# Scatter with 2 values of target class\n",
"plt.scatter(\n",
" df_train['V1'][df_train['Class'] == 2],\n",
" df_train['V27'][df_train['Class'] == 2],\n",
")\n",
"\n",
"plt.title('Target value in function of V1 and V27')\n",
"\n",
"plt.xlabel('V1')\n",
"plt.ylabel('V27')\n",
"plt.legend(['Biodegradable', 'Non-biodegradable'])\n"
]
},
{
"cell_type": "markdown",
"id": "8791293c",
"metadata": {},
"source": [
"**V14 vs V1**"
]
},
{
"cell_type": "code",
"execution_count": 141,
"id": "88a4ed44",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Text(0.5, 1.0, 'Target value in function of V14 and V1')"
]
},
"execution_count": 141,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"<Figure size 1500x1000 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# V14 vs V1 \n",
"plt.figure(figsize=(15, 10))\n",
"\n",
"# Scatter with 1 values of target class\n",
"plt.scatter(\n",
" df_train['V14'][df_train['Class'] == 1],\n",
" df_train['V1'][df_train['Class'] == 1],\n",
")\n",
"\n",
"# Scatter with 2 values of target class\n",
"plt.scatter(\n",
" df_train['V14'][df_train['Class'] == 2],\n",
" df_train['V1'][df_train['Class'] == 2],\n",
")\n",
"\n",
"\n",
"plt.title('Target value in function of V14 and V1')\n"
]
},
{
"cell_type": "markdown",
"id": "50c8da26",
"metadata": {},
"source": [
"**V36 vs V1**"
]
},
{
"cell_type": "code",
"execution_count": 142,
"id": "ecc43aab",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Text(0.5, 1.0, 'Target value in function of V36 and V1')"
]
},
"execution_count": 142,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"<Figure size 1500x1000 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# V36 vs v1\n",
"plt.figure(figsize=(15, 10))\n",
"\n",
"# Scatter with 1 values of target class\n",
"plt.scatter(\n",
" df_train['V36'][df_train['Class'] == 1],\n",
" df_train['V1'][df_train['Class'] == 1],\n",
")\n",
"\n",
"# Scatter with 2 values of target class\n",
"plt.scatter(\n",
" df_train['V36'][df_train['Class'] == 2],\n",
" df_train['V1'][df_train['Class'] == 2],\n",
")\n",
"\n",
"plt.title('Target value in function of V36 and V1')\n"
]
},
{
"cell_type": "code",
"execution_count": 143,
"id": "d50d1f44",
"metadata": {},
"outputs": [],
"source": [
"# Spliting the data into features and labels\n",
"X_train = df_train.drop('Class', axis=1)\n",
"y_train = df_train['Class']\n",
"X_test = df_test.drop('Class', axis=1)\n",
"y_test = df_test['Class']"
]
},
{
"cell_type": "code",
"execution_count": 144,
"id": "f0aa7c9d",
"metadata": {},
"outputs": [],
"source": [
"from sklearn.linear_model import LogisticRegression\n",
"from sklearn.neighbors import KNeighborsClassifier\n",
"from sklearn.ensemble import RandomForestClassifier\n",
"\n",
"# Put models in a dictionary\n",
"models = {\n",
" \"Logistic Regression\": LogisticRegression(),\n",
" \"KNN\": KNeighborsClassifier(),\n",
" \"Random Forest\": RandomForestClassifier()\n",
"}\n",
"\n",
"# Create a function to fit and score models\n",
"def fit_and_score(models, X_train, X_test, y_train, y_test):\n",
" \"\"\"\n",
" Fits and evaluates given machine learning models.\n",
" models: dict of different Scikit-Learn machine learning models\n",
" X_train: training data (no labels)\n",
" x_test: testing data (no labels)\n",
" y_train: training labels\n",
" y_test: trest labels\n",
" \"\"\"\n",
"\n",
" # Set random seed\n",
" np.random.seed(42)\n",
"\n",
" # Make a dictioanry to keep model scores\n",
" model_scores = {}\n",
"\n",
" # Loop through models\n",
" for name, model in models.items():\n",
" # Fit the model to the data\n",
" model.fit(X_train, y_train)\n",
" # Evaluate the model and append its score to model_scores\n",
" model_scores[name] = model.score(X_test, y_test)\n",
"\n",
" return model_scores"
]
},
{
"cell_type": "markdown",
"id": "10387356",
"metadata": {},
"source": [
"#### Check if there are any missing values"
]
},
{
"cell_type": "code",
"execution_count": 145,
"id": "87e277e6",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"V4 25\n",
"V22 16\n",
"V27 8\n",
"V29 8\n",
"V37 25\n",
"dtype: int64"
]
},
"execution_count": 145,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"na_counts = df_train.isna().sum()\n",
"na_counts[na_counts > 0]\n"
]
},
{
"cell_type": "markdown",
"id": "cb57434a",
"metadata": {},
"source": [
"#### We can see that there are five atributes that have missing values. Lets inspect them."
]
},
{
"cell_type": "markdown",
"id": "9dbd2c02",
"metadata": {},
"source": [
"##### V4"
]
},
{
"cell_type": "code",
"execution_count": 146,
"id": "ca1e544a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"count 821.000000\n",
"mean 0.030451\n",
"std 0.198281\n",
"min 0.000000\n",
"25% 0.000000\n",
"50% 0.000000\n",
"75% 0.000000\n",
"max 2.000000\n",
"Name: V4, dtype: float64"
]
},
"execution_count": 146,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_train['V4'].describe()"
]
},
{
"cell_type": "code",
"execution_count": 147,
"id": "9e4d7d1d",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.0 800\n",
"1.0 17\n",
"2.0 4\n",
"Name: V4, dtype: int64"
]
},
"execution_count": 147,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_train['V4'].value_counts()"
]
},
{
"cell_type": "markdown",
"id": "3a3191c9",
"metadata": {},
"source": [
    "The vast majority of the observed entries in this attribute are zeros (800 of 821), so I set all the `NaN` values to zero."
]
},
{
"cell_type": "code",
   "execution_count": 148,
"id": "d8489bd4",
"metadata": {},
"outputs": [],
"source": [
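    "# Impute with 0, the value of most observed rows; assigning back avoids chained in-place modification\n",
    "df_train['V4'] = df_train['V4'].fillna(0)\n",
    "df_test['V4'] = df_test['V4'].fillna(0)"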
]
},
{
"cell_type": "markdown",
"id": "3e84e48b",
"metadata": {},
"source": [
"##### V22"
]
},
{
"cell_type": "code",
   "execution_count": 149,
"id": "a711431d",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"count 830.000000\n",
"mean 1.243898\n",
"std 0.094109\n",
"min 0.898000\n",
"25% 1.187500\n",
"50% 1.248500\n",
"75% 1.298750\n",
"max 1.641000\n",
"Name: V22, dtype: float64"
]
},
     "execution_count": 149,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_train['V22'].describe()"
]
},
{
"cell_type": "code",
   "execution_count": 150,
"id": "f0325325",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1.299 9\n",
"1.280 9\n",
"1.296 8\n",
"1.254 8\n",
"1.264 8\n",
" ..\n",
"1.449 1\n",
"1.159 1\n",
"1.363 1\n",
"1.331 1\n",
"1.410 1\n",
"Name: V22, Length: 321, dtype: int64"
]
},
     "execution_count": 150,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_train['V22'].value_counts()"
]
},
{
"cell_type": "code",
   "execution_count": 151,
"id": "25a74baf",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
    "# Histogram of the observed (non-missing) training values, matching the title\n",
    "_, _, bars = plt.hist(df_train['V22'].dropna(), bins=20)\n",
    "plt.xlabel('V22')\n",
    "plt.ylabel('Frequency')\n",
    "plt.title('Distribution of the V22 attribute in the train set')\n",
"plt.bar_label(bars, fmt='%1.0f')\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"id": "6d6b63fd",
"metadata": {},
"source": [
    "The distribution of the feature **V22** looks roughly normal, so I fill the missing values with the mean."
]
},
{
"cell_type": "code",
   "execution_count": 152,
"id": "2b2b6e2d",
"metadata": {},
"outputs": [],
"source": [
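    "# Impute with the training-set mean; reusing it for the test set keeps test statistics out of preprocessing\n",
    "v22_mean = df_train['V22'].mean()\n",
    "df_train['V22'] = df_train['V22'].fillna(v22_mean)\n",
    "df_test['V22'] = df_test['V22'].fillna(v22_mean)"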
]
},
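  {
   "cell_type": "markdown",
   "id": "5f0a9c11",
   "metadata": {},
   "source": [
    "A minimal sketch of the same mean imputation using scikit-learn's `SimpleImputer`, which learns the mean from the training rows only and reuses it for the test rows. The `toy_train` and `toy_test` frames and their values are made up for illustration; they are not the assignment data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5f0a9c12",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "from sklearn.impute import SimpleImputer\n",
    "\n",
    "# Hypothetical stand-ins for df_train / df_test\n",
    "toy_train = pd.DataFrame({'V22': [1.2, np.nan, 1.3, 1.1]})\n",
    "toy_test = pd.DataFrame({'V22': [np.nan, 1.5]})\n",
    "\n",
    "imputer = SimpleImputer(strategy='mean')\n",
    "imputer.fit(toy_train[['V22']])  # the mean is estimated from the training rows only\n",
    "\n",
    "# The same fitted statistic fills the gaps in both sets\n",
    "toy_train[['V22']] = imputer.transform(toy_train[['V22']])\n",
    "toy_test[['V22']] = imputer.transform(toy_test[['V22']])\n",
    "\n",
    "print(imputer.statistics_)\n",
    "print(toy_test)"
   ]
  },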
{
"cell_type": "markdown",
"id": "4164f62c",
"metadata": {},
"source": [
"##### V27"
]
},
{
"cell_type": "code",
   "execution_count": 153,
"id": "9a8b64ac",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"count 838.000000\n",
"mean 2.218153\n",
"std 0.221545\n",
"min 1.000000\n",
"25% 2.107000\n",
"50% 2.251000\n",
"75% 2.359750\n",
"max 2.859000\n",
"Name: V27, dtype: float64"
]
},
     "execution_count": 153,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_train['V27'].describe()"
]
},
{
"cell_type": "code",
   "execution_count": 154,
"id": "1bddfb76",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"2.000 36\n",
"2.236 31\n",
"2.194 24\n",
"1.848 22\n",
"2.175 21\n",
" ..\n",
"2.294 1\n",
"2.466 1\n",
"2.488 1\n",
"2.372 1\n",
"2.622 1\n",
"Name: V27, Length: 290, dtype: int64"
]
},
     "execution_count": 154,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_train['V27'].value_counts()"
]
},
{
"cell_type": "code",
   "execution_count": 155,
"id": "f1787f2e",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
    "# Histogram of the observed (non-missing) training values, matching the title\n",
    "_, _, bars = plt.hist(df_train['V27'].dropna(), bins=20)\n",
    "plt.xlabel('V27')\n",
    "plt.ylabel('Frequency')\n",
    "plt.title('Distribution of the V27 attribute in the train set')\n",
"plt.bar_label(bars, fmt='%1.0f')\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"id": "53b79865",
"metadata": {},
"source": [
    "The distribution of the feature **V27** looks roughly normal, so I fill the missing values with the mean."
]
},
{
"cell_type": "code",
   "execution_count": 156,
"id": "8974127e",
"metadata": {},
"outputs": [],
"source": [
    "# Set the NaN values to the training-set mean; the test set reuses it so no test statistics leak in\n",
    "v27_mean = df_train['V27'].mean()\n",
    "df_train['V27'] = df_train['V27'].fillna(v27_mean)\n",
    "df_test['V27'] = df_test['V27'].fillna(v27_mean)"
]
},
{
"cell_type": "markdown",
"id": "3afb5a2f",
"metadata": {},
"source": [
"##### V29"
]
},
{
"cell_type": "code",
   "execution_count": 157,
"id": "f410439d",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"count 838.00000\n",
"mean 0.02506\n",
"std 0.15640\n",
"min 0.00000\n",
"25% 0.00000\n",
"50% 0.00000\n",
"75% 0.00000\n",
"max 1.00000\n",
"Name: V29, dtype: float64"
]
},
     "execution_count": 157,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_train['V29'].describe()"
]
},
{
"cell_type": "code",
   "execution_count": 158,
"id": "2d33e7c4",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.0 817\n",
"1.0 21\n",
"Name: V29, dtype: int64"
]
},
     "execution_count": 158,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_train['V29'].value_counts()"
]
},
{
"cell_type": "markdown",
"id": "515e9e80",
"metadata": {},
"source": [
    "The vast majority of the observed entries in this attribute are zeros (817 of 838), so I set all the `NaN` values to zero."
]
},
{
"cell_type": "code",
   "execution_count": 159,
"id": "48e8ba49",
"metadata": {},
"outputs": [],
"source": [
    "# Set NaN values to 0, the most common value\n",
    "df_train['V29'] = df_train['V29'].fillna(0)\n",
    "df_test['V29'] = df_test['V29'].fillna(0)"
]
},
{
"cell_type": "markdown",
"id": "f659f8bc",
"metadata": {},
"source": [
"##### V37"
]
},
{
"cell_type": "code",
   "execution_count": 160,
"id": "8515f06b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"count 821.000000\n",
"mean 2.549406\n",
"std 0.625021\n",
"min 1.467000\n",
"25% 2.101000\n",
"50% 2.461000\n",
"75% 2.861000\n",
"max 5.750000\n",
"Name: V37, dtype: float64"
]
},
     "execution_count": 160,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_train['V37'].describe()"
]
},
{
"cell_type": "code",
   "execution_count": 161,
"id": "36bc89b5",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"2.167 9\n",
"2.500 9\n",
"2.833 8\n",
"2.667 8\n",
"1.833 7\n",
" ..\n",
"2.029 1\n",
"1.886 1\n",
"2.089 1\n",
"2.197 1\n",
"2.206 1\n",
"Name: V37, Length: 535, dtype: int64"
]
},
     "execution_count": 161,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_train['V37'].value_counts()"
]
},
{
"cell_type": "code",
   "execution_count": 162,
"id": "02c38a9f",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
    "# Histogram of the observed (non-missing) training values, matching the title\n",
    "_, _, bars = plt.hist(df_train['V37'].dropna(), bins=20)\n",
    "plt.xlabel('V37')\n",
    "plt.ylabel('Frequency')\n",
    "plt.title('Distribution of the V37 attribute in the train set')\n",
"plt.bar_label(bars, fmt='%1.0f')\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"id": "15f862dd",
"metadata": {},
"source": [
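    "The distribution of the feature **V37** looks roughly normal, although with a longer right tail (maximum 5.75 against a median of 2.46), so I fill the missing values with the mean; the median would be a more robust alternative."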
]
},
{
"cell_type": "code",
   "execution_count": 163,
"id": "e1058d9a",
"metadata": {},
"outputs": [],
"source": [
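    "# Impute with the training-set mean; reusing it for the test set keeps test statistics out of preprocessing\n",
    "v37_mean = df_train['V37'].mean()\n",
    "df_train['V37'] = df_train['V37'].fillna(v37_mean)\n",
    "df_test['V37'] = df_test['V37'].fillna(v37_mean)"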
]
},
{
"cell_type": "markdown",
"id": "44ca71d0",
"metadata": {},
"source": [
"### 2.2 Modeling\n",
"Besides the baselines (majority classifier, random classifier), use at least three machine learning algorithms\n",
    "to model the target class. Be ready to argue why you selected specific algorithms and how you found\n",
"the best hyperparameters for them. Consider the following points when creating your models:\n",
"- Create your models using all features and subsets of them using various feature selection techniques.\n",
"- Certain models assume that data follows a particular distribution or may work better with other\n",
"types of variables (e.g., categorical instead of numeric). Explore whether you can come up with feature\n",
"transformations that are more appropriate for your models. Try to construct new features from existing\n",
"ones. Try to explain the results and performance of different models."
]
},
{
"cell_type": "code",
   "execution_count": 164,
"id": "42e83cd5",
"metadata": {},
"outputs": [],
"source": [
    "# Splitting the data into features and labels\n",
"X_train = df_train.drop('Class', axis=1).reset_index(drop=True)\n",
"y_train = df_train['Class'].reset_index(drop=True)\n",
"X_test = df_test.drop('Class', axis=1).reset_index(drop=True)\n",
"y_test = df_test['Class'].reset_index(drop=True)"
]
},
{
"cell_type": "markdown",
"id": "9544c1ec",
"metadata": {},
"source": [
"#### Using majority classifier and random classifier"
]
},
{
"cell_type": "markdown",
"id": "a07a61d4",
"metadata": {},
"source": [
"##### Majority classifier"
]
},
{
"cell_type": "code",
   "execution_count": 165,
"id": "2f41cf22",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1 0.666667\n",
"2 0.333333\n",
"Name: Class, dtype: float64"
]
},
     "execution_count": 165,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
    "# Proportion of each class in the training labels\n",
"y_train.value_counts(normalize=True)\n"
]
},
{
"cell_type": "markdown",
"id": "c3ddae4d",
"metadata": {},
"source": [
    "The majority classifier always predicts the most frequent training class, which is class 1 (about 67% of the training set)."
]
},
{
"cell_type": "code",
   "execution_count": 166,
"id": "abef9e0c",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.645933014354067"
]
},
     "execution_count": 166,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"y_test[y_test == 1].shape[0] / y_test.shape[0]"
]
},
{
"cell_type": "markdown",
"id": "b136180f",
"metadata": {},
"source": [
    "Predicting class 1 for every test instance would give an accuracy of about 0.646."
]
},
{
"cell_type": "markdown",
"id": "bff2a7d3",
"metadata": {},
"source": [
    "##### Random classifier"
]
},
{
"cell_type": "markdown",
"id": "a9a5ac3b",
"metadata": {},
"source": [
    "There are two classes, so a classifier that guesses uniformly at random is expected to be correct 50% of the time, regardless of the class balance."
]
},
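  {
   "cell_type": "markdown",
   "id": "7c3d2e41",
   "metadata": {},
   "source": [
    "Both baselines can also be measured rather than reasoned about. The sketch below uses scikit-learn's `DummyClassifier` on made-up labels with roughly the 2:1 class ratio seen above; `X_toy` and `y_toy` are hypothetical stand-ins, not the assignment data. Note that `'uniform'` guesses each class with equal probability, while `'stratified'` guesses according to the training class frequencies."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7c3d2e42",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from sklearn.dummy import DummyClassifier\n",
    "\n",
    "# Hypothetical labels with about two thirds of class 1\n",
    "rng = np.random.default_rng(0)\n",
    "y_toy = rng.choice([1, 2], size=600, p=[2 / 3, 1 / 3])\n",
    "X_toy = np.zeros((y_toy.size, 1))  # dummy estimators ignore the features\n",
    "\n",
    "for strategy in ('most_frequent', 'uniform', 'stratified'):\n",
    "    baseline = DummyClassifier(strategy=strategy, random_state=0)\n",
    "    baseline.fit(X_toy, y_toy)\n",
    "    print(strategy, round(baseline.score(X_toy, y_toy), 3))"
   ]
  },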
{
"cell_type": "markdown",
"id": "5779375e",
"metadata": {},
"source": [
    "#### First, let's write a helper function that scores each model we build"
]
},
{
"cell_type": "code",
   "execution_count": 167,
"id": "3d716f7b",
"metadata": {},
"outputs": [],
"source": [
"from sklearn.metrics import precision_score\n",
"from sklearn.metrics import recall_score\n",
"from sklearn.metrics import f1_score\n",
"from sklearn.metrics import roc_auc_score\n",
"from sklearn.metrics import RocCurveDisplay\n",
"from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay\n",
"from sklearn.model_selection import KFold, RepeatedKFold\n",
"all_scores = []\n",
"\n",
"def score_the_model(model, model_name, random_seed, X_train, X_test, y_train, y_test, plot=False):\n",
" \"\"\"\n",
" Fits and evaluates given machine learning models.\n",
" models: dict of different Scikit-Learn machine learning models\n",
" X_train: training data (no labels)\n",
" x_test: testing data (no labels)\n",
" y_train: training labels\n",
" y_test: trest labels\n",
" \"\"\"\n",
"\n",
" # Set random seed\n",
" np.random.seed(random_seed)\n",
"\n",
" # Fit the model to the data\n",
" model.fit(X_train, y_train)\n",
"\n",
" model_score = model.score(X_test, y_test) # Mean accuracy of ``self.predict(X)`` wrt. `y`.\n",
" # Predict the labels\n",
" y_pred = model.predict(X_test)\n",
"\n",
" # Compute scores\n",
" f1 = f1_score(y_test, y_pred)\n",
" precision = precision_score(y_test, y_pred)\n",
" recall = recall_score(y_test, y_pred)\n",
" auc = roc_auc_score(y_test, y_pred)\n",
" # Plot scores\n",
" normal_scores = {\n",
" 'Accuracy': model_score,\n",
" 'F1': f1,\n",
" 'Precision': precision,\n",
" 'Recall': recall,\n",
" 'AUC': auc\n",
" }\n",
"\n",
" def normal_cv(model, X_train, y_train, random_seed):\n",
" # Perform normal cross-validation\n",
" X_train = X_train.copy()\n",
" y_train = y_train.copy()\n",
" kfold = KFold(n_splits=5, shuffle=True, random_state=random_seed)\n",
" scores = []\n",
"\n",
" for train_ix, test_ix in kfold.split(X_train):\n",
" # Split the data\n",
" X_train_cv, X_test_cv = X_train.iloc[train_ix], X_train.iloc[test_ix]\n",
" y_train_cv, y_test_cv = y_train.iloc[train_ix], y_train.iloc[test_ix]\n",
"\n",
" # Fit the model\n",
" model.fit(X_train_cv, y_train_cv)\n",
"\n",
" # Evaluate the model\n",
" y_pred = model.predict(X_test_cv)\n",
" scrs = {\n",
" 'Accuracy': model.score(X_test_cv, y_test_cv),\n",
" 'F1': f1_score(y_test_cv, y_pred),\n",
" 'Precision': precision_score(y_test_cv, y_pred),\n",
" 'Recall': recall_score(y_test_cv, y_pred),\n",
" 'AUC': roc_auc_score(y_test_cv, y_pred)\n",
" }\n",
" scores.append(scrs)\n",
" \n",
" # Plot all the scores\n",
" scores = pd.DataFrame(scores)\n",
" scores.plot(kind='bar', figsize=(10, 8))\n",
" # Plot also the values at the top of the bars\n",
" plt.title(f'Cross-validated scores for {model_name}')\n",
" plt.xlabel('Fold')\n",
" plt.ylabel('Score')\n",
" plt.legend(loc='lower right')\n",
" plt.show()\n",
" return scores\n",
" if type(X_train) == pd.core.frame.DataFrame:\n",
" scores_cv = normal_cv(model, X_train, y_train, random_seed)\n",
"\n",
" def repeated_cv(model, X_train, y_train, random_seed):\n",
" # Perform another cv with 10 folds\n",
" scores_k_fold = []\n",
" rkf = RepeatedKFold(n_splits=10, n_repeats=10, random_state=random_seed)\n",
" for train_index, test_index in rkf.split(X_train):\n",
" model.fit(X_train.iloc[train_index], y_train.iloc[train_index])\n",
" y_pred = model.predict(X_train.iloc[test_index])\n",
" scrs = {\n",
" 'Accuracy': model.score(X_train.iloc[test_index], y_train.iloc[test_index]),\n",
" 'F1': f1_score(y_train.iloc[test_index], y_pred),\n",
" 'Precision': precision_score(y_train.iloc[test_index], y_pred),\n",
" 'Recall': recall_score(y_train.iloc[test_index], y_pred),\n",
" 'AUC': roc_auc_score(y_train.iloc[test_index], y_pred)\n",
" }\n",
"\n",
" scores_k_fold.append(scrs)\n",
" return scores_k_fold\n",
"\n",
" k_fold_scores_mean = {}\n",
" k_fold_scores_std = {}\n",
"\n",
" if type(X_train) == pd.core.frame.DataFrame:\n",
" scores_k_fold = repeated_cv(model, X_train, y_train, random_seed)\n",
" k_fold_scores_mean['acccuracy_mean'] = np.mean([score['Accuracy'] for score in scores_k_fold])\n",
" k_fold_scores_std['accuracy_std'] = np.std([score['Accuracy'] for score in scores_k_fold]) \n",
" k_fold_scores_mean['f1_mean'] = np.mean([score['F1'] for score in scores_k_fold])\n",
" k_fold_scores_std['f1_std'] = np.std([score['F1'] for score in scores_k_fold])\n",
" k_fold_scores_mean['precision_mean'] = np.mean([score['Precision'] for score in scores_k_fold])\n",
" k_fold_scores_std['precision_std'] = np.std([score['Precision'] for score in scores_k_fold])\n",
" k_fold_scores_mean['recall_mean'] = np.mean([score['Recall'] for score in scores_k_fold])\n",
" k_fold_scores_std['recall_std'] = np.std([score['Recall'] for score in scores_k_fold])\n",
" k_fold_scores_mean['auc_mean'] = np.mean([score['AUC'] for score in scores_k_fold])\n",
" k_fold_scores_std['auc_std'] = np.std([score['AUC'] for score in scores_k_fold])\n",
"\n",
" if plot:\n",
" # Plot scores\n",
" fig, ax = plt.subplots(nrows=3, ncols=2, figsize=(15,15))\n",
"\n",
" # Plot the bar chart of Normal cv scores in the first subplot \n",
" ax[0, 0].bar(normal_scores.keys(), normal_scores.values())\n",
" # Display values of the bars\n",
" for i, v in enumerate(normal_scores.values()):\n",
" ax[0, 0].text(i-0.1, v+0.01, str(round(v, 2)))\n",
" ax[0, 0].set_title(f'Default scoring of {model_name}')\n",
" ax[0, 0].set_ylabel('Score')\n",
"\n",
" # Plot the k-fold cv scores in the third subplot\n",
" ax[0, 1].bar(k_fold_scores_mean.keys(), k_fold_scores_mean.values())\n",
" # Display values of the bars\n",
" for i, v in enumerate(k_fold_scores_mean.values()):\n",
" ax[0, 1].text(i-0.1, v+0.01, str(round(v, 2)))\n",
" ax[0, 1].set_title(f'10-fold cross-validated scoring of {model_name} (mean)')\n",
"\n",
" # Plot the k-fold cv scores in the third subplot\n",
" ax[1, 0].bar(k_fold_scores_std.keys(), k_fold_scores_std.values())\n",
" ax[1, 0].set_title(f'10-fold cross-validated scoring of {model_name} (std)')\n",
"\n",
" \n",
" # Plot the ROC curve in the second subplot\n",
" f = RocCurveDisplay.from_estimator(model, X_test, y_test).plot(ax=ax[1, 1])\n",
" \n",
" # Plot the confusion matrix in the third subplot\n",
" cm = confusion_matrix(y_test, y_pred, labels=model.classes_)\n",
" cm_plt = ConfusionMatrixDisplay(cm, display_labels=model.classes_).plot(ax=ax[2, 0])\n",
"\n",
2023-01-06 10:09:28 +01:00
" most_important_features = []\n",
2022-12-29 10:21:35 +01:00
" if hasattr(model, 'feature_importances_'):\n",
" # Plot feature importance in the fourth subplot\n",
" feature_dict = dict(zip(X_train.columns, model.feature_importances_))\n",
"\n",
" # Sort the features by their importance\n",
" feature_dict = {k: v for k, v in sorted(feature_dict.items(), key=lambda item: item[1], reverse=True)}\n",
"\n",
" # Plot the feature importance\n",
" ax[2, 1].bar(feature_dict.keys(), feature_dict.values())\n",
" ax[2, 1].set_title(f'Feature importance of {model_name}')\n",
" ax[2, 1].set_ylabel('Importance')\n",
" ax[2, 1].set_xticklabels(feature_dict.keys(), rotation=90)\n",
2023-01-06 10:09:28 +01:00
"\n",
" most_important_features = [k for k, v in feature_dict.items() if v > 0.01]\n",
" print(f'Most important features: {most_important_features}')\n",
2022-12-29 10:21:35 +01:00
" else:\n",
" ax[2, 1].set_visible(False)\n",
2023-01-06 10:09:28 +01:00
" \n",
2023-01-06 10:41:21 +01:00
" if not hasattr(model, 'feature_importances_'):\n",
2023-01-06 10:09:28 +01:00
" most_important_features = None\n",
2022-12-29 10:21:35 +01:00
" \n",
"\n",
" scores = []\n",
" scores.append(normal_scores)\n",
" normal_scores['model_name'] = model_name\n",
" all_scores.append(normal_scores)\n",
2023-01-06 10:09:28 +01:00
" return scores, model, most_important_features"
]
},
{
"cell_type": "markdown",
"id": "d144deb1",
"metadata": {},
"source": [
"### Decision tree model"
]
},
{
"cell_type": "code",
"execution_count": 168,
"id": "63fe4438",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 1000x800 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Most important features: ['V36', 'V1', 'V14', 'V27', 'V34', 'V12', 'V30', 'V38', 'V17', 'V31', 'V16', 'V8', 'V2', 'V37', 'V18', 'V39', 'V28', 'V3', 'V13', 'V15']\n",
"[{'Accuracy': 0.8038277511961722, 'F1': 0.844106463878327, 'Precision': 0.8671875, 'Recall': 0.8222222222222222, 'AUC': 0.7962462462462462, 'model_name': 'Decision Tree'}]\n"
]
},
{
"data": {
"text/plain": [
"<Figure size 1500x1500 with 7 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from sklearn.tree import DecisionTreeClassifier\n",
"\n",
"# Score the model with default parameters\n",
"score_dec_tree, model_dec_tree, most_importatn_features_dec_tree = score_the_model(\n",
" model=DecisionTreeClassifier(),\n",
" model_name='Decision Tree',\n",
" random_seed=42,\n",
" X_train=X_train,\n",
" X_test=X_test,\n",
" y_train=y_train,\n",
" y_test=y_test,\n",
" plot=True\n",
")\n",
"\n",
"print(score_dec_tree)\n"
]
},
{
"cell_type": "code",
"execution_count": 169,
"id": "60a23b91",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 1000x800 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Most important features: ['V36', 'V1', 'V34', 'V27', 'V12', 'V14', 'V30', 'V37', 'V18', 'V38', 'V31', 'V16', 'V8', 'V17', 'V15', 'V3', 'V2', 'V39']\n"
]
},
{
"data": {
"text/plain": [
"<Figure size 1500x1500 with 7 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# How does the model perform when trained only on the most important features?\n",
"score_dec_tree, model_dec_tree, most_importatn_features_dec_tree = score_the_model(\n",
" model=DecisionTreeClassifier(),\n",
" model_name='Decision Tree',\n",
" random_seed=42,\n",
" X_train=X_train[most_importatn_features_dec_tree],\n",
" X_test=X_test[most_importatn_features_dec_tree],\n",
" y_train=y_train,\n",
" y_test=y_test,\n",
" plot=True\n",
")\n",
" "
]
},
{
"cell_type": "markdown",
"id": "a72e54f6",
"metadata": {},
"source": [
"Now let's plot the decision tree."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "c4fe47bd",
"metadata": {},
"outputs": [
{
"ename": "NameError",
"evalue": "name 'plt' is not defined",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
"Cell \u001b[0;32mIn[1], line 3\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[38;5;28;01mfrom\u001b[39;00m \u001b[38;5;21;01msklearn\u001b[39;00m\u001b[38;5;21;01m.\u001b[39;00m\u001b[38;5;21;01mtree\u001b[39;00m \u001b[38;5;28;01mimport\u001b[39;00m plot_tree\n\u001b[0;32m----> 3\u001b[0m \u001b[43mplt\u001b[49m\u001b[38;5;241m.\u001b[39mfigure(figsize\u001b[38;5;241m=\u001b[39m(\u001b[38;5;241m40\u001b[39m, \u001b[38;5;241m40\u001b[39m), dpi\u001b[38;5;241m=\u001b[39m\u001b[38;5;241m200\u001b[39m)\n\u001b[1;32m 4\u001b[0m plot_tree(model_dec_tree, filled\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mTrue\u001b[39;00m, rounded\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mTrue\u001b[39;00m, class_names\u001b[38;5;241m=\u001b[39m[\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mReady biodegradable\u001b[39m\u001b[38;5;124m'\u001b[39m, \u001b[38;5;124m'\u001b[39m\u001b[38;5;124mReday non-biodegradable\u001b[39m\u001b[38;5;124m'\u001b[39m], feature_names\u001b[38;5;241m=\u001b[39mX_train\u001b[38;5;241m.\u001b[39mcolumns)\n",
"\u001b[0;31mNameError\u001b[0m: name 'plt' is not defined"
]
}
],
"source": [
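"import matplotlib.pyplot as plt\n",
"from sklearn.tree import plot_tree\n",
"\n",
"plt.figure(figsize=(40, 40), dpi=150)\n",
"# The tree was last fitted on a feature subset, so take the names from the model itself\n",
"# (assumes it was fitted on a DataFrame, which makes feature_names_in_ available)\n",
"plot_tree(model_dec_tree, filled=True, rounded=True, class_names=['Ready biodegradable', 'Not ready biodegradable'], feature_names=list(model_dec_tree.feature_names_in_))"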
]
},
{
"cell_type": "markdown",
"id": "b55c97cd",
"metadata": {},
"source": [
"### Random Forest Classifier"
]
},
{
"cell_type": "code",
"execution_count": 171,
"id": "c9d5676b",
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAA04AAAK4CAYAAABDHK0xAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8o6BhiAAAACXBIWXMAAA9hAAAPYQGoP6dpAABYeklEQVR4nO3dd3QV1d7G8ScJ6SEJkEKooSktBg29SAu9iKggForSBKQpCCogCmJFvNIEpKhwaQKiIEUERIiUIEjvHRJAJEAoIcm8f/jmXA8JbEhCDoTvZ62zrmfP3jO/ORnuypM9s4+TZVmWAAAAAAA35ezoAgAAAADgXkdwAgAAAAADghMAAAAAGBCcAAAAAMCA4AQAAAAABgQnAAAAADAgOAEAAACAAcEJAAAAAAwITgAAAABgQHACgCwydepUOTk56fDhw7a2WrVqqVatWsaxq1atkpOTk1atWnXX6kuP0NBQtW/f3tFl3FP27dun+vXry8/PT05OTlqwYIGjS8pyaV3rAHC/IzgBuCsOHDigLl26qGjRovLw8JCvr6+qVaumzz//XFeuXHF0eQ+UxYsX65133nF0GQ+Mdu3aadu2bRo+fLi++eYblS9f/q4d6/Dhw3JycrK9nJ2dlTt3bjVq1EhRUVF37bj3mxs/p3+/Kleu7Ojy0jRjxgyNGjXK0WUA+Jccji4AQPazaNEiPfPMM3J3d1fbtm1VtmxZJSQk6LffflO/fv20Y8cOTZgwwdFl3hOWLVt214+xePFijRkzhvCUBa5cuaKoqCi99dZb6tGjR5Ydt02bNmrcuLGSkpK0d+9ejR07VrVr19bGjRsVFhaWZXXc61I+p38LDAx0UDW3NmPGDG3fvl29e/d2dCkA/h/BCUCmOnTokJ599lkVLlxYv/zyi0JCQmzbunfvrv3792vRokU3HZ+cnKyEhAR5eHhkRbkO5+bm5ugSHgiJiYlKTk6+65/3mTNnJEn+/v6Zts/4+Hh5e3vfss9jjz2mF154wfa+Ro0aatSokcaNG6exY8dmWi33uxs/p8xy9epVubm5ydmZG3mA7Ix/4QAy1UcffaRLly7pq6++sgtNKYoXL65evXrZ3js5OalHjx6aPn26ypQpI3d3dy1ZskSS9Mcff6hRo0by9fWVj4+P6tatq99//91uf9evX9fQoUNVokQJeXh4KE+ePKpevbqWL19u6xMTE6MOHTqoQIECcnd3V0hIiJ544olbPn8xd+5cOTk5afXq1am2ffnll3JyctL27dslSX/++afat29vuy0xb968eumll/TXX38ZP6+0nnE6fvy4WrRoIW9vbwUFBalPnz66du1aqrFr1qzRM888o0KFCsnd3V0FCxZUnz597G6FbN++vcaMGSNJdrcnpUhOTtaoUaNUpkwZeXh4KDg4WF26dNHff/9tdyzLsjRs2DAVKFBAXl5eql27tnbs2GE8vxQzZ85URESEcubMKV9fX4WFhenzzz+363P+/Hn16dNHoaGhcnd3V4ECBdS2bVudPXvW1uf06dN6+eWXFRwcLA8PD4WHh2vatGl2+0m5LeuTTz7RqFGjVKxYMbm7u2vnzp2SpN27d+vpp59W7ty55eHhofLly2vhwoV2+7id6+pG77zzjgoXLixJ6tevn5ycnBQaGmrbfjvXc8qzQatXr1a3bt0UFBSkAgUK3PbnnKJGjRqS/rll9t+mTJmiOnXqKCgoSO7u7ipdurTGjRuXanxoaKiaNm2q3377TRUrVpSHh4eKFi2qr7/+OlXfHTt2qE6dOvL09FSBAgU0bNgwJScnp1nX2LFjbf/O8+XLp+7du+v8+fN2fWrVqqWyZcvqzz//VM2aNeXl5aXixYtr7ty5kqTVq1erUqVK8vT01MMPP6yff/75jj+fmzl48KCeeeYZ5c6dW15eXqpcuXKqP/SkPG84c+ZMvf3228qfP7+8vLx04cIFSdL69evVsGFD+fn5ycvLSzVr1tTatW
vt9nHx4kX17t3bdq0HBQWpXr162rx5s+0zWLRokY4cOWL7N/vvawmAYzDjBCBT/fDDDypatKiqVq1622N++eUXzZ49Wz169FBAQIBCQ0O1Y8cO1ahRQ76+vurfv79cXV315ZdfqlatWrZfnKR/flkdMWKEOnbsqIoVK+rChQvatGmTNm/erHr16kmSnnrqKe3YsUOvvvqqQkNDdfr0aS1fvlxHjx696S8jTZo0kY+Pj2bPnq2aNWvabZs1a5bKlCmjsmXLSpKWL1+ugwcPqkOHDsqbN6/tVsQdO3bo999/twsqJleuXFHdunV19OhR9ezZU/ny5dM333yjX375JVXfOXPm6PLly3rllVeUJ08ebdiwQV988YWOHz+uOXPmSJK6dOmikydPavny5frmm29S7aNLly6aOnWqOnTooJ49e+rQoUMaPXq0/vjjD61du1aurq6SpMGDB2vYsGFq3LixGjdurM2bN6t+/fpKSEgwntPy5cvVpk0b1a1bVx9++KEkadeuXVq7dq0tRF+6dEk1atTQrl279NJLL+mxxx7T2bNntXDhQh0/flwBAQG6cuWKatWqpf3796tHjx4qUqSI5syZo/bt2+v8+fN2gVz6JyRcvXpVnTt3lru7u3Lnzq0dO3aoWrVqyp8/vwYMGCBvb2/Nnj1bLVq00Hfffacnn3xS0u1dVzdq2bKl/P391adPH9stYT4+PpJ029dzim7duikwMFCDBw9WfHy88TO+UcofBXLlymXXPm7cOJUpU0bNmzdXjhw59MMPP6hbt25KTk5W9+7d7fru379fTz/9tF5++WW1a9dOkydPVvv27RUREaEyZcpI+uePErVr11ZiYqLt85wwYYI8PT1T1fTOO+9o6NChioyM1CuvvKI9e/Zo3Lhx2rhxo921Jkl///23mjZtqmeffVbPPPOMxo0bp2effVbTp09X79691bVrVz333HP6+OOP9fTTT+vYsWPKmTOn8XO5fPmyXRCXJD8/P7m6uio2NlZVq1bV5cuX1bNnT+XJk0fTpk1T8+bNNXfuXNu1keK9996Tm5ubXn/9dV27dk1ubm765Zdf1KhRI0VERGjIkCFydna2hdU1a9aoYsWKkqSuXbtq7ty56tGjh0qXLq2//vpLv/32m3bt2qXHHntMb731luLi4nT8+HF99tlnkmS7lgA4kAUAmSQuLs6SZD3xxBO3PUaS5ezsbO3YscOuvUWLFpabm5t14MABW9vJkyetnDlzWo8//ritLTw83GrSpMlN9//3339bkqyPP/749k/k/7Vp08YKCgqyEhMTbW2nTp2ynJ2drXfffdfWdvny5VRj//vf/1qSrF9//dXWNmXKFEuSdejQIVtbzZo1rZo1a9rejxo1ypJkzZ4929YWHx9vFS9e3JJkrVy58pbHHTFihOXk5GQdOXLE1ta9e3crrf+7X7NmjSXJmj59ul37kiVL7NpPnz5tubm5WU2aNLGSk5Nt/d58801LktWuXbtU+/63Xr16Wb6+vnaf440GDx5sSbLmzZuXalvKMVM+m2+//da2LSEhwapSpYrl4+NjXbhwwbIsyzp06JAlyfL19bVOnz5tt6+6detaYWFh1tWrV+32X7VqVatEiRK2NtN1dTMpx77xervd6znlGqlevfotP68bjzd06FDrzJkzVkxMjLVmzRqrQoUKliRrzpw5dv3TumYaNGhgFS1a1K6tcOHCqa7f06dPW+7u7tZrr71ma+vdu7clyVq/fr1dPz8/P7trPeUaql+/vpWUlGTrO3r0aEuSNXnyZFtbzZo1LUnWjBkzbG27d++2/X/F77//bmtfunSpJcmaMmXKbX1Oab1S/k2lnMuaNWts4y5evGgVKVLECg0NtdW9cuVKS5JVtGhRu88zOTnZKlGihNWgQQO7fyeXL1+2ihQpYtWrV8/W5ufnZ3Xv3v2WNTdp0sQqXLjwLfsAyFrcqgcg06TcqnI7f/n9t5o1a6p06dK290lJSVq2bJ
latGihokWL2tpDQkL03HPP6bfffrMdy9/fXzt27NC+ffvS3Lenp6fc3Ny0atWqVLefmbRu3VqnT5+2WwJ87ty5Sk5
2022-12-29 10:21:35 +01:00
"text/plain": [
"<Figure size 1000x800 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Most important features: ['V36', 'V39', 'V22', 'V27', 'V1', 'V12', 'V15', 'V13', 'V18', 'V34', 'V14', 'V37', 'V30', 'V2', 'V17', 'V31', 'V8', 'V16', 'V3', 'V38', 'V10', 'V28', 'V9', 'V11', 'V5', 'V7', 'V6', 'V41']\n"
]
},
{
"data": {
"image/png": "[base64 PNG data truncated in source]",
"text/plain": [
"<Figure size 1500x1500 with 7 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"image/png": "[base64 PNG data truncated in source]",
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from sklearn.ensemble import RandomForestClassifier\n",
"\n",
"# Score the model with default parameters\n",
"score_rf, model_rf, most_important_features_rf = score_the_model(\n",
" model=RandomForestClassifier(),\n",
" model_name='Random Forest',\n",
" random_seed=42,\n",
" X_train=X_train,\n",
" X_test=X_test,\n",
" y_train=y_train,\n",
" y_test=y_test,\n",
" plot=True\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 172,
"id": "f0c126a4",
"metadata": {},
"outputs": [
{
"data": {
"image/png": "[base64 PNG data truncated in source]",
"text/plain": [
"<Figure size 1000x800 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Most important features: ['V36', 'V1', 'V39', 'V27', 'V22', 'V12', 'V18', 'V13', 'V30', 'V15', 'V14', 'V37', 'V8', 'V17', 'V34', 'V2', 'V16', 'V38', 'V10', 'V31', 'V3', 'V28', 'V7', 'V11', 'V5', 'V9', 'V41']\n"
]
},
{
"data": {
"image/png": "[base64 PNG data truncated in source]",
"text/plain": [
"<Figure size 1500x1500 with 7 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"image/png": "[base64 PNG data truncated in source]",
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Select the best features\n",
"score_rf, model_rf, most_important_features_rf = score_the_model(\n",
" model=RandomForestClassifier(),\n",
" model_name='Random Forest',\n",
" random_seed=42,\n",
" X_train=X_train[most_important_features_rf],\n",
" X_test=X_test[most_important_features_rf],\n",
" y_train=y_train,\n",
" y_test=y_test,\n",
" plot=True\n",
")"
]
},
{
"cell_type": "markdown",
"id": "bf17a3b5",
"metadata": {},
"source": [
"### KNN"
]
},
{
"cell_type": "code",
"execution_count": 173,
"id": "75fea43a",
"metadata": {},
"outputs": [
{
"data": {
"image/png": "[base64 PNG data truncated in source]",
"text/plain": [
"<Figure size 1000x800 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
2023-01-06 10:09:28 +01:00
"data": {
2023-01-06 10:41:21 +01:00
"image/png": "iVBORw0KGgoAAAANSUhEUgAABNEAAATYCAYAAAAxo1G2AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8o6BhiAAAACXBIWXMAAA9hAAAPYQGoP6dpAAEAAElEQVR4nOzdeVgV5f//8RegbKK4IIuKYmruomEQmqmF4pJli5kbSC6lkiZtUikuJZVLmJmauWWZe+Y3zSXSyqXczX3fE9xyQwOF+f3hj/PhyHJAgYPyfFzXXBfnnntm7nvmnDM373PPfdsYhmEIAAAAAAAAQKZsrV0AAAAAAAAAoKAjiAYAAAAAAABYQBANAAAAAAAAsIAgGgAAAAAAAGABQTQAAAAAAADAAoJoAAAAAAAAgAUE0QAAAAAAAAALCKIBAAAAAAAAFhBEAwAAAAAAACwgiAYUYgcPHlTLli3l6uoqGxsbLV68OE+O06xZMzVr1ixP9p3fhg4dKhsbG2sXw6Lly5erfv36cnR0lI2NjS5dumTtIgEACrFNmzapUaNGKlasmGxsbLR9+/ZsbztjxgzZ2Njo2LFjFvP6+Pioe/fud13Owiqjc5zd9tuaNWtkY2OjNWvW5Fn57sb99l44duyYbGxsNGPGDGsXJUvx8fF68cUXVaZMGdnY2CgmJsbaRbpn165dk7u7u7777jtrFyVXTJo0SRUrVlRiYqK1i/JAIogGFGCpDZrUxdHRUeXKlVNwcLA+//xzXb169Z72Hxoaqp07d+qjjz7SrFmz1LBhw1wqedb++ecfDR06NEcNaGTfhQsX9NJLL8nJyUkTJkzQrFmzVKxYsQzzpr7HNm/ebJZ++fJl+fv7y9HRUcuXL5f0vwCih4eHrl+/nm5fPj4+evrpp83SUt+7Y8aMyfaxAQB579q1a4qKilKrVq1UunRpi/+87927V61atZKLi4tKly6tbt266dy5c9k61s2bN9WhQwddvHhRn332mWbNmqVKlSrlUk1wP1u2bJmGDh1q7WIgBwYOHKgVK1YoMjJSs2bNUqtWrTLNa2Njo/Dw8HTpI0eOlI2NjV555RWlpKSYAog2NjZauHBhuvypbdDz58+b0rp37y4bGxvVq1dPhmFk+9gZGTdunIoXL66XX345W/kLuu7duyspKUmTJ0+2dlEeSATRgPvA8OHDNWvWLE2cOFGvv/66JOmNN95Q3bp19ffff9/VPm/cuKENGzaoR48eCg8PV9euXVWhQoXcLHam/vnnHw0bNuy+DKJ98MEHunHjhrWLkaVNmzbp6tWrGjFihHr06KGuXbuqaNGi2d7+ypUratmypf7++2/98MMP6RpHZ8+e1cSJE3NUplGjRmUYeAMAWMf58+c1fPhw7d27V76+vlnmPXXqlJ544gkdOnRII0eO1FtvvaWlS5eqRYsWSkpKsnisw4cP6/jx43rrrbfUu3dvde3aVaVKlcqtqiCPrFy5UitXrszTYyxbtkzDhg3L02PcLypVqqQbN26oW7du1i5Kln799Vc9++yzeuutt9S1a1fVqFEjR9t//PHHev/99xUaGqqvv/5atrbmIYnhw4dnGBTLzM6dO7Vo0aIclSGtmzdvaty4cerZs6fs7Ozuej8FiaOjo0JDQzV27NgcnUtkD0E04D7QunVrde3aVWFhYYqMjNSKFSv0yy+/6OzZs3rmmWfuKqiT+utxyZIlc7m0D6aEhARJUpEiReTo6Gjl0mTt7Nmzku7u2l69elXBwcHavn27Fi5cqNatW6fLU79+fY0aNSrb77v69esrPj5ekyZNynF5AAB5w8vLS2fOnNHx48c1atSoLPOOHDlSCQkJ+vXXX9W/f3+99957mjdvnnbs2JGtR8/u5b5UkKSkpOi///6zdjHyjb29vezt7a1djAferVu3lJSUZHrqpKAHcs6ePXvXn+VRo0YpMjJSISEhmjZtWroAWv369U0/4maHk5OTHn744R
wH3tL66aefdO7cOb300kt3tX1B9dJLL+n48eNavXq1tYvywCGIBtynnnzySQ0ePFjHjx/Xt99+a7Zu3759evHFF1W6dGk5OjqqYcOGWrJkiWn90KFDTY9RvP3227KxsZGPj48k6fjx4+rbt6+qV68uJycnlSlTRh06dEg3DklmY4NZGrdkzZo1evTRRyVJYWFhpq7bWTXCr169qjfeeEM+Pj5ycHCQu7u7WrRooa1bt5rl++uvv9SmTRuVKlVKxYoVU7169TRu3DizPL/++quaNGmiYsWKqWTJknr22We1d+/eDOu2Z88ede7cWaVKldLjjz+eab1Tu4svXrxYderUkYODg2rXrm16DPLO+jds2FCOjo6qUqWKJk+enKNx1ubPny8/Pz85OTnJzc1NXbt21enTp03rmzVrptDQUEnSo48+Khsbm2yPB3Lt2jW1atVKW7du1cKFC9W2bdsM8w0ZMkTx8fHZ7o3WuHFjPfnkk/r0008LfC8+ACgsHBwc5Onpma28Cxcu1NNPP62KFSua0oKCgvTwww9r3rx5WW7bvXt3NW3aVJLUoUMH2djYmI2zlZ37ckYMw9CHH36oChUqyNnZWc2bN9fu3buzVR/pdkBs3Lhxqlu3rhwdHVW2bFm1atXKbIiB1Pv7d999p9q1a8vBwcF0b9+2bZtat26tEiVKyMXFRU899ZT+/PNPs2PcvHlTw4YNU7Vq1eTo6KgyZcro8ccf16pVq0x54uLiFBYWpgoVKsjBwUFeXl569tlnsxz/bcGCBbKxsdFvv/2Wbt3kyZNlY2OjXbt2SZL+/vtvde/eXQ899JAcHR3l6empV155RRcuXLB4jjIaE+3UqVNq3769ihUrJnd3dw0cODDDcZf++OMPdejQQRUrVpSDg4O8vb01cOBAs3ZA9+7dNWHCBEkyG74kVUpKimJiYlS7dm05OjrKw8NDr776qv7991+zY93re2HOnDny8/NT8eLFVaJECdWtWzdd+/HSpUsaOHCgqS1aoUIFhYSEmD1eePbsWfXo0UMeHh5ydHSUr6+vZs6cabaf1McWR48erZiYGFWpUkUODg7as2dPhmOide/eXS4uLjp9+rTat28vFxcXlS1bVm+99ZaSk5PN9n3hwgV169ZNJUqUUMmSJRUaGqodO3Zke5y1I0eOqEOHDipdurScnZ312GOPaenSpab1qW18wzA0YcKEdNfLkrFjx+qdd95R165dNX369HQBNEl6+eWXcxQUs7W11QcffJCjwNudFi9eLB8fH1WpUsUsPfXcnzhxQk8//bRcXFxUvnx503t2586devLJJ1WsWDFVqlRJs2fPTrfvS5cu6Y033pC3t7ccHBxUtWpVffLJJ0pJSTHLN3r0aDVq1EhlypSRk5OT/Pz8tGDBgnT7y8n/HH5+fipdurR+/PHHuzovyFwRaxcAwN3r1q2b3nvvPa1cuVK9evWSJO3evVuNGzdW+fLlNWjQIBUrVkzz5s1T+/bttXDhQj333HN6/vnnVbJkSQ0cOFCdOnVSmzZt5OLiIun2o4Dr16/Xyy+/rAoVKujYsWOaOHGimjVrpj179sjZ2fmeylyzZk0NHz5cQ4YMUe/evdWkSRNJUqNGjTLd5rXXXtOCBQsUHh6uWrVq6cKFC1q7dq327t2rRx55RJK0atUqPf300/Ly8tKAAQPk6empvXv36qefftKAAQMkSb/88otat26thx56SEOHDtWNGzc0fvx4NW7cWFu3bjUFElN16NBB1apV08iRIy3eyNeuXatFixapb9++Kl68uD7//HO98MILOnHihMqUKSPpdoO7VatW8vLy0rBhw5ScnKzhw4erbNmy2Tp3M2bMUFhYmB599FFFR0crPj5e48aN07p167Rt2zaVLFlS77//vqpXr66vvvpKw4cPV+XKldM1CjKSkJCg1q1ba9OmTVqwYEG6sc3SatKkiSko1qdPHzk5OVnc/9ChQ/XEE09o4sSJioiIyFZ9AQDWd/
r0aZ09ezbDcVP9/f21bNmyLLd/9dVXVb58eY0cOVL9+/fXo48+Kg8PD0k5vy+nNWTIEH344Ydq06aN2rRpo61bt6p
"text/plain": [
"<Figure size 1500x1500 with 7 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from sklearn.neighbors import KNeighborsClassifier\n",
"\n",
"# Score the model with default parameters\n",
"score_knn, model_knn, _ = score_the_model(\n",
" model=KNeighborsClassifier(),\n",
" model_name='KNN',\n",
" random_seed=42,\n",
" X_train=X_train,\n",
" X_test=X_test,\n",
" y_train=y_train,\n",
" y_test=y_test,\n",
" plot=True\n",
")"
]
},
{
"cell_type": "markdown",
"id": "0b925bbf",
"metadata": {},
"source": [
"#### Logistic Regression"
]
},
{
"cell_type": "code",
"execution_count": 174,
"id": "33d0774a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 1000x800 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"<Figure size 1500x1500 with 7 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from sklearn.linear_model import LogisticRegression\n",
"\n",
"# Score the model with default parameters\n",
"score_log_reg, model_log_reg, _ = score_the_model(\n",
" model=LogisticRegression(max_iter=100),\n",
" model_name='Logistic Regression',\n",
" random_seed=42,\n",
" X_train=X_train,\n",
" X_test=X_test,\n",
" y_train=y_train,\n",
" y_test=y_test,\n",
" plot=True\n",
")"
]
},
{
"cell_type": "markdown",
"id": "641e5a5a",
"metadata": {},
"source": [
"#### SVM"
]
},
{
"cell_type": "code",
"execution_count": 175,
"id": "96adfe07",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[{'Accuracy': 0.8660287081339713, 'F1': 0.9, 'Precision': 0.8689655172413793, 'Recall': 0.9333333333333333, 'AUC': 0.8382882882882883, 'model_name': 'SVM'}]\n"
]
}
],
"source": [
"from sklearn.svm import SVC\n",
"from sklearn.preprocessing import StandardScaler\n",
"\n",
"# Standardize the features: fit the scaler on the training set only, then apply it to the test set\n",
"scaler = StandardScaler()\n",
"X_train_scaled = scaler.fit_transform(X_train)\n",
"X_test_scaled = scaler.transform(X_test)\n",
"\n",
"# Score the model with default parameters on the standardized features\n",
"scores_svm, model_svm, most_important_features_svm = score_the_model(\n",
" model=SVC(),\n",
" model_name='SVM',\n",
" random_seed=42,\n",
" X_train=X_train_scaled,\n",
" X_test=X_test_scaled,\n",
" y_train=y_train,\n",
" y_test=y_test,\n",
" plot=False\n",
")\n",
"\n",
"print(scores_svm)\n"
]
},
{
"cell_type": "markdown",
"id": "51685da6",
"metadata": {},
"source": [
"#### Gradient Boosting Classifier"
]
},
{
"cell_type": "code",
"execution_count": 176,
"id": "0842608e",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 1000x800 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Most important features: ['V36', 'V1', 'V12', 'V39', 'V38', 'V34', 'V30', 'V22', 'V27', 'V13', 'V2', 'V14', 'V40', 'V8', 'V37', 'V18', 'V9', 'V28', 'V10', 'V31']\n"
]
},
{
"data": {
"text/plain": [
"<Figure size 1500x1500 with 7 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from sklearn.ensemble import GradientBoostingClassifier\n",
"\n",
"# Score the model with default parameters\n",
"score_gb, model_gb, most_important_features_gb = score_the_model(\n",
" model=GradientBoostingClassifier(),\n",
" model_name='Gradient Boosting',\n",
" random_seed=42,\n",
" X_train=X_train,\n",
" X_test=X_test,\n",
" y_train=y_train,\n",
" y_test=y_test,\n",
" plot=True\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 177,
"id": "f9fc8040",
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAA04AAAK4CAYAAABDHK0xAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8o6BhiAAAACXBIWXMAAA9hAAAPYQGoP6dpAABZv0lEQVR4nO3deZyN5f/H8fesZ2aMmcEsBsPYCpnQ2JeICYmSEloGhYRsRVS2Iq1SUUKWiq8t5BtZEiqmZETW7LsZJNtYxszcvz/6zfl2zIxrNnMYr+fjcR51rvu67vtzH/fUvF33fR0Xy7IsAQAAAAAy5OrsAgAAAADgZkdwAgAAAAADghMAAAAAGBCcAAAAAMCA4AQAAAAABgQnAAAAADAgOAEAAACAAcEJAAAAAAwITgAAAABgQHACgGyaNm2aXFxcdODAAXtbo0aN1KhRI+PY1atXy8XFRatXr75h9WVHeHi4OnXq5Owybiq7d+9W06ZN5e/vLxcXFy1cuNDZJeWZnFzjyBw+T+DWQXACkCl79+7Vc889pzJlysjLy0t+fn6qV6+ePvzwQ126dMnZ5d1WlixZouHDhzu7jNtGx44dtWXLFo0aNUpffvmlqlevfsOPee7cOY0aNUrVq1eXv7+/bDabSpUqpXbt2mnx4sU3/PjOtm7dOg0fPlxnzpzJVP9OnTrJxcXF/nJ3d1dYWJjat2+v7du339hiM2H79u0aPny4QwAFcOtxd3YBAG5+ixcvVtu2bWWz2RQdHa3KlSsrMTFRP//8swYMGKBt27Zp4sSJzi7zprB8+fIbfowlS5Zo/PjxhKc8cOnSJcXExOjVV19Vr1698uSYe/bsUbNmzXTw4EE98sgjio6Olq+vrw4fPqwlS5aoZcuW+uKLL/T000/nST3XyotrfN26dRoxYoQ6deqkgICATI2x2WyaPHmyJCkpKUl79+7VhAkTtHTpUm3fvl3FihW7gRVf3/bt2zVixAg1atRI4eHhDtvy4vMEkDsITgCua//+/Wrfvr1KlSqlH374QaGhofZtPXv21J49e677N+ApKSlKTEyUl5dXXpTrdJ6ens4u4baQlJSklJSUG/55nzx5UpIy/ct7ZiQkJKhAgQLpbktKStIjjzyi+Ph4rVmzRvXq1XPYPmzYMC1fvlzJycnZPkZO3azXuLu7u5566imHttq1a6tly5ZavHixunbt6qTKru9m/TwBpMWtegCu65133tGFCxf0+eefO4SmVOXKlVOfPn3s711cXNSrVy/NmDFDd911l2w2m5YuXSpJ+v333/XAAw/Iz89Pvr6+atKkiX755ReH/V29elUjRoxQ+fLl5eXlpSJFiqh+/fpasWKFvU9cXJw6d+6sEiVKyGazKTQ0VA8//PB1b4OZN2+eXFxctGbNmjTbPvvsM7m4uGjr1q2SpD/++EOdOnWy35ZYtGhRPfPMM/rrr7+Mn1d6zyscOXJErVu3VoECBRQcHKx+/frpypUracb+9NNPatu2rUqWLCmbzaawsDD169fP4VbITp06afz48ZLkcGtSqpSUFI0dO1Z33XWXvLy8FBISoueee05///23w7Esy9LIkSNVokQJ+fj46L777tO2bduM55dq1qxZioyMVMGCBeXn56eIiAh9+OGHDn3OnDmjfv36KTw8XDabTSVKlFB0dLROnTpl73PixAk9++yzCgkJkZeXl6pUqaLp06c77OfAgQNycXHRe++9p7Fjx6ps2bKy2Wz2W7B27typxx57TIULF5aXl5eqV6+uRYsWOewjM9fVtYYPH65SpUpJkgYMGCAXFxeH2YLMXM+pzwitWbNGPXr0UHBwsEqUKJHhMefOnautW7dqyJAhaUJTqqZNm+qBBx7I1DEOHjyoHj166M4775S3t7eKFCmitm3bpvuzsm3bNjVu3Fje3t4qUaKERo4cqZSUlDT90rvGr1y5omHDhqlcuXL2a3fgwIFprvPU/z4sXLhQlStXls
1m01133WX/b4T0z+c+YMAASVLp0qXt13h2bnMrWrSopH9C1b/t27dPbdu2VeHCheXj46PatWun+xdAmbk+pev/PEybNk1t27aVJN13333280l9vvHazzP1+cc5c+Zo1KhRKlGihLy8vNSkSRPt2bMnzbHHjx+vMmXKyNvbWzVr1tRPP/3Ec1PADcKME4Dr+u9//6syZcqobt26mR7zww8/aM6cOerVq5cCAwMVHh6ubdu2qUGDBvLz89PAgQPl4eGhzz77TI0aNdKaNWtUq1YtSf/80jR69Gh16dJFNWvW1Llz57RhwwZt3LhR999/vyTp0Ucf1bZt2/TCCy8oPDxcJ06c0IoVK3To0KE0t8GkevDBB+Xr66s5c+aoYcOGDttmz56tu+66S5UrV5YkrVixQvv27VPnzp1VtGhR+62I27Zt0y+//OIQVEwuXbqkJk2a6NChQ+rdu7eKFSumL7/8Uj/88EOavnPnztXFixf1/PPPq0iRIlq/fr0+/vhjHTlyRHPnzpUkPffcczp27JhWrFihL7/8Ms0+nnvuOU2bNk2dO3dW7969tX//fo0bN06///671q5dKw8PD0nS0KFDNXLkSLVo0UItWrTQxo0b1bRpUyUmJhrPacWKFerQoYOaNGmit99+W5K0Y8cOrV271h6iL1y4oAYNGmjHjh165plndM899+jUqVNatGiRjhw5osDAQF26dEmNGjXSnj171KtXL5UuXVpz585Vp06ddObMGYdALklTp07V5cuX1a1bN9lsNhUuXFjbtm1TvXr1VLx4cQ0aNEgFChTQnDlz1Lp1a3399dd65JFHJGXuurpWmzZtFBAQoH79+qlDhw5q0aKFfH19JSnT13OqHj16KCgoSEOHDlVCQkKGn+1///tfSUozc5IZ6R3jt99+07p169S+fXuVKFFCBw4c0KeffqpGjRpp+/bt8vHxkfTPX0bcd999SkpKsn+OEydOlLe3t/G4KSkpeuihh/Tzzz+rW7duqlixorZs2aIPPvhAu3btSrOYxs8//6z58+erR48eKliwoD766CM9+uijOnTokIoUKaI2bdpo165d+s9//qMPPvhAgYGBkqSgoCBjLamhPDk5Wfv27dPLL7+sIkWKqGXLlvY+8fHxqlu3ri5evKjevXurSJEimj59uh566CHNmzfPfs1k9vo0/Tzce++96t27tz766CO98sorqlixoiTZ/5mRt956S66urnrppZd09uxZvfPOO3ryySf166+/2vt8+umn6tWrlxo0aKB+/frpwIEDat26tQoVKnTdgA4gmywAyMDZs2ctSdbDDz+c6TGSLFdXV2vbtm0O7a1bt7Y8PT2tvXv32tuOHTtmFSxY0Lr33nvtbVWqVLEefPDBDPf/999/W5Ksd999N/Mn8v86dOhgBQcHW0lJSfa248ePW66urtbrr79ub7t48WKasf/5z38sSdaPP/5ob5s6daolydq/f7+9rWHDhlbDhg3t78eOHWtJsubMmWNvS0hIsMqVK2dJslatWnXd444ePdpycXGxDh48aG/r2bOnld5/vn/66SdLkjVjxgyH9qVLlzq0nzhxwvL09LQefPBBKyUlxd7vlVdesSRZHTt2TLPvf+vTp4/l5+fn8Dlea+jQoZYka/78+Wm2pR4z9bP56quv7NsSExOtOnXqWL6+vta5c+csy7Ks/fv3W5IsPz8/68SJEw77atKkiRUREWFdvnzZYf9169a1ypcvb28zXVcZST32tddbZq/n1Gukfv361/28UlWrVs0KCAhI037hwgXr5MmT9tfZs2czdYz0rqmYmBhLkvXFF1/Y2/r27WtJsn799Vd724kTJyx/f3/jNf7ll19arq6u1k8//eRwnAkTJliSrLVr19rbJFmenp7Wnj177G2bN2+2JFkff/yxve3dd99Nc9zr6dixoyUpzat48eJWbGysQ9/Uc/13vefPn7dKly5thYeHW8nJyZZlZf76zMzPw9y5c9P8vK
e69vNctWqVJcmqWLGideXKFXv7hx9+aEmytmzZYlmWZV25csUqUqSIVaNGDevq1av2ftOmTbMkOewTQO7gVj0AGTp
"text/plain": [
"<Figure size 1000x800 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Most important features: ['V36', 'V1', 'V39', 'V12', 'V38', 'V30', 'V34', 'V22', 'V27', 'V13', 'V2', 'V40', 'V37', 'V31', 'V8', 'V18', 'V14', 'V10', 'V9', 'V28']\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAABNEAAATYCAYAAAAxo1G2AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8o6BhiAAAACXBIWXMAAA9hAAAPYQGoP6dpAAEAAElEQVR4nOzdZ3gU1f/38U8SSCMhlBSqBAHpBKREQCkSCFVBUESEEGkCoaOCEKoSbBAEpCnlhyBVEQEpRooUpSO9N5GE3iI1mfsB9+4/SzbZBJJsgPfruvaCnDkzc6bszNnvnDnHwTAMQwAAAAAAAACS5GjvAgAAAAAAAACZHUE0AAAAAAAAwAaCaAAAAAAAAIANBNEAAAAAAAAAGwiiAQAAAAAAADYQRAMAAAAAAABsIIgGAAAAAAAA2EAQDQAAAAAAALCBIBoAAAAAAABgA0E0IAlHjhxRvXr15OXlJQcHBy1evDhd1lOrVi3VqlUrXZad0YYOHSoHBwd7F8OmFStWqHz58nJ1dZWDg4OuXr1q7yKlSLt27eTv72+R5uDgoKFDh9qlPE8j9icAe9i6dauqVaumbNmyycHBQbt27UrxvDNmzJCDg4NOnjxpM6+/v7/atWv3yOV8Vlnbxymtv61du1YODg5au3ZtupXvUTxp58LJkyfl4OCgGTNm2LsoyYqJiVGLFi2UO3duOTg4KDIy0t5FShFr56m1eicenb33Z3x8vMqUKaNPP/3UbmVISytWrJCHh4cuXLiQ4esmiIYnlqlCY/q4uroqX758Cg4O1tdff60bN2481vJDQkK0Z88effrpp5o1a5YqVaqURiVP3r///quhQ4emqgKNlLt06ZLeeustubm5acKECZo1a5ayZcuW7DwnTpxQWFiYXnjhBbm7u8vd3V2lSpVSt27d9Pfff2dQye1nzpw5qaoE+vv7J/puFitWTB988IEuX76cfgVNoeXLlxMoA5CsmzdvasiQIapfv75y5cpl88f7gQMHVL9+fXl4eChXrlxq06ZNiiv29+7d05tvvqnLly9rzJgxmjVrlgoVKpRGW4InGferJ0/v3r21cuVKDRgwQLNmzVL9+vWTzX/nzh2NGzdOL7/8snLmzClnZ2fly5dPr732mn744QfFxcVlUMntY//+/Ro6dGiKHgJI//fA3vRxdHRU3rx51bhxY/3555/pW9gUyMy/43744QedOXNGYWFh9i5Kmqhfv76KFi2qiIiIDF93lgxfI5DGhg8frsKFC+vevXuKjo7W2rVr1atXL40ePVpLlixRuXLlUr3MW7duafPmzRo4cGCGX2j+/fdfDRs2TP7+/ipfvnyGrvtxDRo0SP3797d3MZK1detW3bhxQyNGjFBQUJDN/EuXLlXLli2VJUsWtW7dWgEBAXJ0dNTBgwf1448/auLEiTpx4oTdfvDcunVLWbKk76V8zpw52rt3r3r16pXiecqXL6++fftKkm7fvq3t27crMjJS69at05YtW9KppCmzfPlyTZgwweoPk4zYnwAyv4sXL2r48OF67rnnFBAQkGwron/++Uc1atSQl5eXRo4cqZs3b+rLL7/Unj17tGXLFjk7Oye7rmPHjunUqVOaOnWqOnTokMZbgvSyatWqdF9HcverZ02hQoV069YtZc2a1d5FSdbvv/+u119/Xf369bOZ98KFC2rQoIG2b9+u4OBgDRo0SLly5VJ0dLR+++03vfPOOzp69KjCw8MzoOSJTZ06VfHx8em6jv3792vYsGGqVatWqlppTZw4UR4eHoqPj9eZM2c0depU1ahRQ1u2bLHr76fkfsdlxP5MzhdffKG3335bXl5editDWuvcubP69eunYcOGydPTM8PWyy8FPPEaNGhg0UpswIAB+v3339W4cWO99tprOnDggNzc3FK1TNPT4xw5cqRlUZ9asbGxypYtm7JkyZLpAxDnz5+XlLJje+zYMb399tsqVKiQoqKilDdvXovpn332mb
755hs5OibfqNe0f9KDq6truiz3ceXPn1/vvvuu+e8OHTrIw8NDX375pY4cOaJixYrZsXRJy6z7E0DGyps3r86dO6c8efJo27Ztqly5cpJ5R44cqdjYWG3fvl3PPfecJKlKlSqqW7euZsyYoU6dOiW7rtTclzKz+Ph43b1795m5jtoKjiJt3L9/X/Hx8XJ2dn4izq3z58+n+Lvcpk0b7dy5U4sWLdIbb7xhMW3AgAHatm2bDh06lOwybt++LWdnZ5t10UeRmQOWLVq0kLe3t/nvpk2bqkyZMlqwYEGmbYRgz/25c+dO7d69W1999ZXdypAemjdvru7du2vBggV67733Mmy9vM6Jp9Krr76q8PBwnTp1St9//73FtIMHD6pFixbKlSuXXF1dValSJS1ZssQ8fejQoeZWRR988IEcHBzMT0ZOnTqlrl27qnjx4nJzc1Pu3Ln15ptvJmqCnFTfYLb6LVm7dq25oh4aGmpuqpzcKyQ3btxQr1695O/vLxcXF/n6+qpu3brasWOHRb6//vpLDRs2VM6cOZUtWzaVK1dOY8eOtcjz+++/65VXXlG2bNmUI0cOvf766zpw4IDVbdu/f7/eeecd5cyZUy+//HKS2+3g4KCwsDAtXrxYZcqUkYuLi0qXLq0VK1ZY3f5KlSrJ1dVVRYoU0eTJk1PVz9qCBQtUsWJFubm5ydvbW++++67Onj1rnl6rVi2FhIRIkipXriwHB4dk+wP5/PPPFRsbq+nTpycKoElSlixZ1KNHDxUsWNCc1q5dO3l4eOjYsWNq2LChPD091bp1a0nSH3/8oTfffFPPPfecXFxcVLBgQfXu3Vu3bt1KtGzT/nJ1dVWZMmX0008/WS2jtT68zp49q/fee09+fn7m/T1t2jSLPKa+L+bPn69PP/1UBQoUkKurq+rUqaOjR49a7LNly5bp1KlT5vPxUftzyJMnjyQlCrSm5LyTHlQAGjRooOzZs8vDw0N16tRJ1HT/3r17GjZsmIoVKyZXV1flzp1bL7/8slavXi3pwfGZMGGCed+ZPiYP70/T+Xf06FG1a9dOOXLkkJeXl0JDQ/Xff/9ZrPvWrVvq0aOHvL295enpqddee01nz56lnzXgCeTi4mK+ZtmyaNEiNW7c2BxAk6SgoCC98MILmj9/frLztmvXTjVr1pQkvfnmm3JwcLDoZyul18eHGYahTz75RAUKFJC7u7tq166tffv2pWh7pAcBsbFjx6ps2bJydXWVj4+P6tevr23btpnzmO7vs2fPVunSpeXi4mK+t6fF9VqSoqOjFRoaqgIFCsjFxUV58+bV66+/nuyrXwsXLpSDg4PWrVuXaNrkyZPl4OCgvXv3SpL+/vtvtWvXTs8//7xcXV2VJ08evffee7p06ZLNfWStT7R//vlHTZs2VbZs2eTr66vevXvrzp07ieZNSX3A1v0qPj5ekZGRKl26tFxdXeXn56fOnTvrypUrFut63HNh7ty5qlixojw9PZU9e3aVLVs2Uf3x6tWr6t27t7kuWqBAAbVt21YXL1405zl//rzat28vPz8/ubq6KiAgQDNnzrRYjqnfsy+//FKRkZEqUqSIXFxctH//fqt9opnqXGfPnlXTpk3l4eEhHx8f9evXL9FrkJcuXVKbNm2UPXt25ciRQyEhIdq9e3eK+1k7fvy43nzzTeXKlUvu7u566aWXtGzZMvN0Ux3fMAxNmDAh0fF62ObNm7Vy5Up16tQpUQDNpFKlSuY6pPR/dbe5c+dq0KBByp8/v9zd3XX9+nVdvnxZ/fr1U9myZeXh4aHs2bOrQYMG2r17d6LlpvQ8tdaHV0rPO39/fzVu3FgbNmxQlSpV5Orqqueff17/+9//LPbZm2++KUmqXbu2eZ89Sv+BSdUxU3LeSQ8eePft21cFCxaUi4uLihcvri+//FKGYVjkW716tV5++WXlyJFDHh4eKl68uD7++GNJtn/HPbw/E57vU6ZMMZ/vlStX1tatWxOVccGCBS
pVqpTFb4OU9rO2ePFiOTs7q0aNGhbppnru4cOH9e6778rLy0s+Pj4KDw+XYRg6c+aMXn/9dWXPnl158uSxGoS7c+e
"text/plain": [
"<Figure size 1500x1500 with 7 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAjcAAAGwCAYAAABVdURTAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8o6BhiAAAACXBIWXMAAA9hAAAPYQGoP6dpAABcD0lEQVR4nO3deVxN+f8H8Nct3VvRgqQiska2EEZmrJF9G2QZYgxjHaOJsWdnxiAMGmv4mckythkmS2TfQgaRQWQpNI2S0nY/vz88ul9XxT25t1u31/PxuI9H93O29znd7nn3Oe/zOTIhhAARERGRgTDSdwBERERE2sTkhoiIiAwKkxsiIiIyKExuiIiIyKAwuSEiIiKDwuSGiIiIDAqTGyIiIjIoxfQdQH5TKpV48uQJLCwsIJPJ9B0OERERaUAIgZcvX8LBwQFGRu/vmylyyc2TJ0/g6Oio7zCIiIgoDx4+fIjy5cu/d54il9xYWFgAeHNwLC0t9RwNERERaSIxMRGOjo6q8/j7FLnkJutSlKWlJZMbIiKiQkaTkhIWFBMREZFBYXJDREREBoXJDRERERkUJjdERERkUJjcEBERkUFhckNEREQGhckNERERGRQmN0RERGRQmNwQERGRQWFyQ0RERAZFr8nNiRMn0KVLFzg4OEAmk2HPnj0fXCY0NBQNGjSAQqFA1apVERgYqPM4iYiIqPDQa3Lz6tUr1KtXDytXrtRo/qioKHTq1AmtWrVCeHg4vv32W3z11Vc4ePCgjiMlIiKiwkKvD87s0KEDOnTooPH8AQEBqFSpEhYvXgwAqFmzJk6dOoWlS5fC09NTV2ESEREVakIIpKRn5us2zUyMNXrIpS4UqqeCnz17Fh4eHmptnp6e+Pbbb3NdJjU1Fampqar3iYmJugqPiIiowBFCoFfAWVx68F++bjditifM5fpJMwpVQXFsbCzKli2r1la2bFkkJiYiJSUlx2UWLFgAKysr1cvR0TE/QiUiIioQUtIz8z2x0bdC1XOTF5MnT4aPj4/qfWJiIhMcIiIqksKmecBcbpwv2zIzyZ/t5KRQJTd2dnZ4+vSpWtvTp09haWkJMzOzHJdRKBRQKBT5ER4RkV7oo56CCo/ktP99Nszlxnq7VJSfCtUeNm3aFAcOHFBrO3z4MJo2baqniIiI9Etf9RREBZlea26SkpIQHh6O8PBwAG9u9Q4PD0d0dDSAN5eUBg0apJp/xIgRuHfvHiZOnIhbt25h1apV2L59O8aPH6+P8ImI9K4o1lNQ3rhVLKnXS0X5Sa89N2FhYWjVqpXqfVZtjLe3NwIDAxETE6NKdACgUqVK2L9/P8aPH49ly5ahfPnyWLduHW8DJyJC/tZTUOGjz1uz85tek5uWLVtCCJHr9JxGH27ZsiWuXLmiw6iIiAqnolJPQfQh/CsgIiqEsoqI3y4WJaI3mNwQERUyLCImer9CNYgfERHlXERclIpFiT6EPTdERIVYVhFxUSoWJfoQJjdERIUYi4iJsuNfBBGRluTXSMEsIiZ6PyY3RERawCJfooKDBcVERFqgj5GCWURMlDP23BARaVl+jRTMImKinDG5IaIiS5s1MkXxyctEBRX/+oioSGKNDJHhYs0NERVJuqqRYR0Mkf6x54aIijxt1siwDoZI/5jcEFGRxxoZIsPCv2Yi0pn8GtQuLzgQHpHhYnJDRDrBgl0i0hcWFBORTuhjULu8YAEwkeFhzw0R6Vx+DWqXFywAJjI8TG6IKEcfWy/DQe2ISF/4bUNE2bBehogKM9bcEFE22qyXYU0LEeU39twQ0Xt9bL0Ma1qIKL8xuSGi92K9DBEVNvzGIioCpBYHc4A7IirMmNwQGTgWBxNRUcOCYiID9zHFwSwGJqLCiD03REWI1OJgFgMTUWHE5IZIT/LroZIcTI+Iihp+yxHpAetgiIh0hzU3RHqgj4dKsn6GiIoK9twQ6Vl+PVSS9TNEVFQwuSHSM9bBEBFpFy
9LERERkUFhckNEREQGhckNERERGRQmN0RERGRQWMVIpAV8MCURUcHB5IboI3FAPiKigoWXpYg+Eh9MSURUsOSp5yY9PR2xsbFITk5GmTJlUKpUKW3HRVQo8cGURET6p3Fy8/LlS/zf//0fgoKCcOHCBaSlpUEIAZlMhvLly6Ndu3YYPnw4GjVqpMt4iQo0DshHRKR/Gn0LL1myBPPmzUOVKlXQpUsXTJkyBQ4ODjAzM0N8fDyuX7+OkydPol27dmjSpAlWrFiBatWq6Tp2ovfSx1O3iYhI/zRKbi5evIgTJ06gVq1aOU5v3LgxvvzySwQEBGDjxo04efIkkxvSKxb5EhEVXRolN7/99ptGK1MoFBgxYsRHBUSkDXzqNhFR0cXiADJ4fOo2EVHRIim5uXr1Kv744w+UKlUKffr0gY2NjWpaYmIivv32W2zYsEHrQRJ9zCB5LPIlIipaZEIIocmMhw4dQpcuXVCtWjW8fPkSr169wo4dO9CqVSsAwNOnT+Hg4IDMzIJdXJmYmAgrKyskJCTA0tJS3+GQBj62fiZitieTGyKiQk7K+VvjQfxmzpwJX19fXL9+Hffv38fEiRPRtWtXBAcHf3TARO/DQfKIiEgKjf+dvXHjBrZs2QIAkMlkmDhxIsqXL49evXohKCiI49tQvuAgeURE9CEaJzcKhQIvXrxQa+vfvz+MjIzg5eWFxYsXazs2omxYP0NERB+i8VnC1dUVx44dQ8OGDdXa+/btCyEEvL29tR4cFS25FQ1zkDwiIpJC4+Rm5MiROHHiRI7T+vXrByEE1q5dq7XAqGjhoHtERKQtGt8tZSh4t1TBlJyWAZcZB987j1vFktgxoilraIiIiiAp528WL1CBk1vRMIuDiYhIE0xuKF9pUlfDomEiIvoYPINQvmFdDRER5QeNB/Ej+liaDMbHQfeIiOhj6b3nZuXKlVi0aBFiY2NRr149rFixAo0bN851fn9/f6xevRrR0dGwsbFBr169sGDBApiamuZj1PSxWFdDRES6kqeemxMnTiAsLEytLSwsLNdbxXOzbds2+Pj4wM/PD5cvX0a9evXg6emJZ8+e5Tj/r7/+ikmTJsHPzw83b97E+vXrsW3bNkyZMiUvu0F6lFVX8+6LiQ0REX2sPCU3LVu2xKBBg9TaBg4cqHqIpqaWLFmCYcOGYciQIXBxcUFAQADMzc1zfbL4mTNn0KxZM/Tv3x9OTk5o164d+vXrhwsXLuS6jdTUVCQmJqq9iIiIyHDlKbmJiorCkSNH1NpCQkJw7949jdeRlpaGS5cuwcPD43/BGBnBw8MDZ8+ezXEZd3d3XLp0SZXM3Lt3DwcOHEDHjh1z3c6CBQtgZWWlejk6OmocIxERERU+eaq5qVixYrY2BwcHSeuIi4tDZmYmypYtq9ZetmxZ3Lp1K8dl+vfvj7i4OHz66acQQiAjIwMjRox472WpyZMnw8fHR/U+MTGRCQ4REZEBK1R3S4WGhmL+/PlYtWoVLl++jF27dmH//v2YM2dOrssoFApYWlqqvYiIiMhwadRzU7JkSY0LPePj4zWaz8bGBsbGxnj69Kla+9OnT2FnZ5fjMtOnT8fAgQPx1VdfAQDq1KmDV69eYfjw4Zg6dSqMjApVrkZEREQ6oFFy4+/vr/UNy+VyNGzYECEhIejevTsAQKlUIiQkBGPGjMlxmeTk5GwJjLHxm9uJi9gjsgqc3EYefhuf7k1ERPlBo+TG29tbJxv38fGBt7c33Nzc0LhxY/j7++PVq1cYMmQIAGDQoEEoV64cFixYAADo0qULlixZgvr166NJkya4c+cOpk+fji5duqiSHMp/HHmYiIgKkjwVFN+9excbN27E3bt3sWzZMtja2uKvv/5ChQoVUKtWLY3X4+XlhefPn2PGjBmIjY2Fq6srgoODVUXG0dHRaj0106ZNg0wmw7Rp0/D48WOUKVMGXbp0wbx58/KyG6Qlmow8/DaOQkxERLokExKv5xw/fhwdOn
RAs2bNcOLECdy8eROVK1fGwoULERYWhp07d+oqVq2Q8sh00kxyWgZcZhwEkPvIw2/jKMRERCSVlPO35ArcSZMmYe7
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Score the Gradient Boosting model using only the most important features\n",
"score_gb, model_gb, most_important_features_gb = score_the_model(\n",
" model=GradientBoostingClassifier(),\n",
" model_name='Gradient Boosting',\n",
" random_seed=42,\n",
" X_train=X_train[most_important_features_gb],\n",
" X_test=X_test[most_important_features_gb],\n",
" y_train=y_train,\n",
" y_test=y_test,\n",
" plot=True\n",
")"
]
},
{
"cell_type": "markdown",
"id": "46f70528",
"metadata": {},
"source": [
"### AdaBoost Classifier using RandomForestClassifier"
]
},
{
"cell_type": "code",
"execution_count": 178,
"id": "4c75c0cd",
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAA04AAAK4CAYAAABDHK0xAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8o6BhiAAAACXBIWXMAAA9hAAAPYQGoP6dpAABWdElEQVR4nO3dd3hU1b7G8XcSkklCCBFSCDU0BSQGDb1IMRQRFLEgFpqgCFEgCoIKiIKxIh5BEJGiwqFJU7qhqIAioUiV3kkAkRZK2r5/eJnjmMAihQwk38/zzHPPrL3W3r9J9rknL2vtNTbLsiwBAAAAAK7KzdUFAAAAAMDNjuAEAAAAAAYEJwAAAAAwIDgBAAAAgAHBCQAAAAAMCE4AAAAAYEBwAgAAAAADghMAAAAAGBCcAAAAAMCA4AQAecDEiRNls9m0f/9+R1ujRo3UqFEj49gVK1bIZrNpxYoVN6y+rAgNDVWnTp1cXcZNZdeuXWrWrJkKFy4sm82mOXPmuLqka9q/f79sNpsmTpzo6lIAINsITgBuOXv27NHzzz+vcuXKycvLS35+fqpXr54++eQTXbx40dXl5SsLFizQm2++6eoy8o2OHTtq8+bNGjZsmL7++mtVr149V667fft22Ww2eXl56fTp07lyzSv/GPDPV1BQkBo3bqyFCxfmSg3XcuHCBb355ps33T84ALhxCri6AADIjPnz5+uxxx6T3W5Xhw4dVLVqVSUlJennn39W3759tXXrVo0dO9bVZd4UlixZcsOvsWDBAo0aNYrwlAsuXryoNWvW6PXXX1dUVFSuXvubb75RsWLF9Ndff2nmzJnq2rVrrl37rbfeUtmyZWVZlhISEjRx4kS1bNlS3333nVq1apVrdfzbhQsXNGTIEEm6rpldALc+ghOAW8a+ffv0xBNPqEyZMlq2bJlCQkIcx3r27Kndu3dr/vz5Vx2flpampKQkeXl55Ua5Lufp6enqEvKFlJQUpaWl3fCf94kTJyRJ/v7+OXbOxMREFSxY8Jp9LMvSlClT9OSTT2rfvn2aPHlyrgan+++/32lm7dlnn1VwcLD++9//ujQ4Ach/WKoH4Jbx/vvv6/z58/ryyy+dQtMVFSpUUK9evRzvbTaboqKiNHnyZN15552y2+1atGiRJGnDhg26//775efnJ19fX91333365ZdfnM6XnJysIUOGqGLFivLy8lLRokVVv359LV261NEnPj5enTt3VsmSJWW32xUSEqKHHnrI6Vmjf5s5c6ZsNptWrlyZ7tjnn38um82mLVu2SJJ+//13derUybEssVixYurSpYv+/PNP488ro2ecDh8+rDZt2qhgwYIKCgpSnz59dPny5XRjf/rpJz322GMqXbq07Ha7SpUqpT59+jgthezUqZNGjRolSU7Lqa5IS0vTiBEjdOedd8rLy0vBwcF6/vnn9ddffzldy7IsDR06VCVLlpSPj48aN26srVu3Gj/fFVOnTlVERIQKFSokPz8/hYWF6ZNPPnHqc/r0afXp00ehoaGy2+0qWbKkOnTooJMnTzr6HD9+3PFHuZeXl8LDwzVp0iSn81x5ZufDDz/UiBEjVL58edntdm3btk2StGPHDj366KMqUqSIvLy8VL16dc2bN8/pHNdzX/3bm2++qTJlykiS+vbtK5vNptDQUMfx67mfryx9W7lypXr06KGgoCCVLFnS+PNdtWqV9u/fryeeeEJPPPGEfvzxRx0+fDhdv9OnT6tTp04qXLiw/P391bFjxwyX9WXnnpb+Do7e3t4qUMD5334TExP18ssvq1SpUrLb7brjjjv04YcfyrIsp34pKSl6++23Hb+70NBQvfbaa+n+e7Bu3To1b95cAQEB8vb2VtmyZdWlSxdJf98HgYGBkqQhQ4Y47n1mXoG8jRknALeM7777TuXKlVPdunWve8yyZcs0ffp0RUVFKSAgQKGhodq6dasaNGggPz8/9evXTx4eHvr888/VqF
EjrVy5UrVq1ZL09x+rMTEx6tq1q2rWrKmzZ89q3bp1Wr9+vZo2bSpJeuSRR7R161a9+OKLCg0N1fHjx7V06VIdPHjQ6Q/bf3rggQfk6+ur6dOnq2HDhk7Hpk2bpjvvvFNVq1aVJC1dulR79+5V586dVaxYMcdSxK1bt+qXX35xCiomFy9e1H333aeDBw/qpZdeUvHixfX1119r2bJl6frOmDFDFy5c0AsvvKCiRYtq7dq1+vTTT3X48GHNmDFDkvT888/r6NGjWrp0qb7++ut053j++ec1ceJEde7cWS+99JL27dunkSNHasOGDVq1apU8PDwkSYMGDdLQoUPVsmVLtWzZUuvXr1ezZs2UlJRk/ExLly5V+/btdd999+m9996T9PfzOKtWrXKE6PPnz6tBgwbavn27unTponvuuUcnT57UvHnzdPjwYQUEBOjixYtq1KiRdu/eraioKJUtW1YzZsxQp06ddPr0aadALkkTJkzQpUuX9Nxzz8lut6tIkSLaunWr6tWrpxIlSqh///4qWLCgpk+frjZt2ujbb7/Vww8/LOn67qt/a9u2rfz9/dWnTx+1b99eLVu2lK+vryRd9/18RY8ePRQYGKhBgwYpMTHR+DOePHmyypcvrxo1aqhq1ary8fHRf//7X/Xt29fRx7IsPfTQQ/r555/VvXt3Va5cWbNnz1bHjh0z/J1l5p4+c+aMTp48KcuydPz4cX366ac6f/68nn76aafrP/jgg1q+fLmeffZZVatWTYsXL1bfvn115MgRffzxx46+Xbt21aRJk/Too4/q5Zdf1q+//qqYmBht375ds2fPlvR3iG7WrJkCAwPVv39/+fv7a//+/Zo1a5YkKTAwUKNHj9YLL7yghx9+WG3btpUk3XXXXcafJ4BbmAUAt4AzZ85YkqyHHnrousdIstzc3KytW7c6tbdp08by9PS09uzZ42g7evSoVahQIevee+91tIWHh1sPPPDAVc//119/WZKsDz744Po/yP9r3769FRQUZKWkpDjajh07Zrm5uVlvvfWWo+3ChQvpxv73v/+1JFk//vijo23ChAmWJGvfvn2OtoYNG1oNGzZ0vB8xYoQlyZo+fbqjLTEx0apQoYIlyVq+fPk1rxsTE2PZbDbrwIEDjraePXtaGf1PyU8//WRJsiZPnuzUvmjRIqf248ePW56entYDDzxgpaWlOfq99tprliSrY8eO6c79T7169bL8/Pycfo7/NmjQIEuSNWvWrHTHrlzzys/mm2++cRxLSkqy6tSpY/n6+lpnz561LMuy9u3bZ0my/Pz8rOPHjzud67777rPCwsKsS5cuOZ2/bt26VsWKFR1tpvvqaq5c+9/32/Xez1fukfr161/z5/VPSUlJVtGiRa3XX3/d0fbkk09a4eHhTv3mzJljSbLef/99R1tKSorVoEEDS5I1YcIER3tm7+l/v+x2uzVx4sQMrz906FCn9kcffdSy2WzW7t27LcuyrI0bN1qSrK5duzr1e+WVVyxJ1rJlyyzLsqzZs2dbkqzffvvtqj+bEydOWJKswYMHX7UPgLyFpXoAbglnz56VJBUqVChT4xo2bKgqVao43qempmrJkiVq06aNypUr52gPCQnRk08+qZ9//tlxLX9/f23dulW7du3K8Nze3t7y9PTUihUr0i0/M2nXrp2OHz/utCPXzJkzlZaWpnbt2jld44pLly7p5MmTql27tiRp/fr1mbrmggULFBISokcffdTR5uPjo+eeey5d339eNzExUSdPnlTdunVlWZY2bNhgvNaMGTNUuHBhNW3aVCdPnnS8IiIi5Ovrq+XLl0uSfvjhByUlJenFF190mmno3bv3dX0mf39/JSYmXnOZ27fffqvw8HDHjM8/XbnmggULVKxYMbVv395xzMPDQy+99JLOnz+fblnlI4884liqJUmnTp3SsmXL9Pjjj+vcuXOOz/vnn3+qefPm2rVrl44cOeKo+Vr3VWZk5n6+olu3bnJ3d7
+u8y9cuFB//vmn08+lffv22rRpk9NyygULFqhAgQJ64YUXHG3u7u568cUX050zs/f0qFGjtHTpUi1dulTffPONGjd
"text/plain": [
"<Figure size 1000x800 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Most important features: ['V36', 'V1', 'V39', 'V27', 'V22', 'V12', 'V15', 'V2', 'V37', 'V13', 'V18', 'V30', 'V38', 'V14', 'V17', 'V34', 'V8', 'V10', 'V31', 'V16', 'V3', 'V28', 'V11', 'V7', 'V9', 'V5', 'V40']\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAABNEAAATYCAYAAAAxo1G2AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8o6BhiAAAACXBIWXMAAA9hAAAPYQGoP6dpAAEAAElEQVR4nOzdeXhM5///8VcSshCJJZutotpaWkKjUlTRhqDSalFbidTSIqWiLVoRSytdVKP2KuVjqb0+/ZRSolpb7VG171SJrbYgJDm/P/xyvkYmJiHJBM/Hdc1FzrnPnPeZMzPnnve5FwfDMAwBAAAAAAAAyJCjvQMAAAAAAAAA8jqSaAAAAAAAAIANJNEAAAAAAAAAG0iiAQAAAAAAADaQRAMAAAAAAABsIIkGAAAAAAAA2EASDQAAAAAAALCBJBoAAAAAAABgA0k0AAAAAAAAwAaSaMB9Yt++fWrYsKE8PT3l4OCghQsX5sh+6tWrp3r16uXIc+e2QYMGycHBwd5h2LRkyRJVrVpVrq6ucnBw0Pnz53NsX1OmTJGDg4MOHz6cY/sAANzfNm7cqFq1aqlgwYJycHBQfHx8prfNynXG399fHTt2vOs4H1bWXuPM1t9WrlwpBwcHrVy5Msfiuxv323vh8OHDcnBw0JQpU+wdyh0lJCSoRYsWKlasmBwcHBQbG5uj+3NwcNCgQYNydB/3kw0bNsjZ2VlHjhyxdyjZonXr1nr99dftHYbdkUQDsklahSbt4erqqhIlSigkJERff/21Ll26dE/PHxYWpu3bt+uTTz7RtGnTVL169WyK/M7++ecfDRo0KEsVaGTe2bNn9frrr8vNzU1jxozRtGnTVLBgQZvbjR07Vg4ODgoKCsqFKG/q2LGjxXs8X758Kl26tFq3bq2dO3fmWhwZ2blzpwYNGkSCEMAD5/Lly4qOjlajRo1UtGhRmz/ed+3apUaNGsnd3V1FixZV+/btdfr06Uzt68aNG2rZsqXOnTunr776StOmTVOZMmWy6UhwP1u8eDEJkvtM7969tXTpUvXv31/Tpk1To0aNbG5z/vx588burl27ciHK/0vu3vooWrSonn32Wc2YMSNXYrBl2LBhWW7E8NFHH6lNmzYPzHdo3759NX/+fG3bts3eodhVPnsHADxohgwZorJly+rGjRs6efKkVq5cqXfffVcjRozQjz/+qCpVqmT5Oa9evap169bpo48+UkRERA5EnbF//vlHgwcPlr+/v6pWrZqr+75XAwYMUL9+/ewdxh1t3LhRly5d0tChQxUcHJzp7WbMmCF/f39t2LBB+/fv12OPPZaDUf4fFxcXffvtt5Kk5ORkHThwQOPHj9eSJUu0c+dOlShRIlfisGbnzp0aPHiw6tWrJ39/f7vFAQDZ7cyZMxoyZIgeeeQRBQQE3LEV0d9//63nn39enp6eGjZsmC5fvqzhw4dr+/btZquIOzlw4ICOHDmiiRMnqnPnztl8JMgpv/zyS47vY/HixRozZgyJNEllypTR1atXlT9/fnuHckcrVqzQK6+8ovfeey/T28ydO1cODg7y8/PTjBkz9PHHH+dghJZ69uypZ555RtLNG82zZ8/WG2+8ofPnz6tHjx65Foc1w4YNU4sWLdSsWbNMlY+Pj9fy5cu1du3anA0sF1WrVk3Vq1fXl19+qf/85z/2DsduSKIB2axx48YWrcT69++vFStWqGnTpnr55Ze1a9cuubm5Zek50+4eFy5cODtDfWAlJiaqYMGCypcvn/Lly9tfc6dOnZKUtXN76NAhrV27VgsWLNBbb72lGTNmKDo6OocitJQvXz698cYbFsueffZZNW3aVIsWLVKXLl1yJQ4AeJgUL15cJ06ckJ+fnzZt2mT+yLRm2LBhSkxM1ObNm/XII49IkmrUqKEGDRpoypQp6tq16x33dTfXpbwoNTVV169fl6urq71DyRW2kqPIHsnJyUpNTZWzs/N98d46depUlj/L06
dPV5MmTVSmTBnNnDkzV5NoderUUYsWLcy/u3XrpkcffVQzZ860exItq7777js98sgjevbZZ+0dSrZ6/fXXFR0drbFjx8rd3d3e4dgF3TmBXPDCCy8oKipKR44c0fTp0y3W7d69Wy1atFDRokXl6uqq6tWr68cffzTXDxo0yGwC/P7778vBwcFsZXPkyBF1795d5cuXl5ubm4oVK6aWLVum686W0dhgtsYtWblypVlRDw8PN5tX36kLyaVLl/Tuu+/K399fLi4u8vHxUYMGDbRlyxaLcuvXr1eTJk1UpEgRFSxYUFWqVNHIkSMtyqxYsUJ16tRRwYIFVbhwYb3yyivpmpWnHdvOnTvVtm1bFSlSRM8991yGx+3g4KCIiAgtXLhQTz31lFxcXPTkk09qyZIlVo+/evXqcnV1Vbly5TRhwoQsjbM2d+5cBQYGys3NTV5eXnrjjTd0/Phxc329evUUFhYmSXrmmWfk4OCQqfFAZsyYoSJFiuill15SixYtMmzmvmPHDr3wwgtyc3NTqVKl9PHHHys1NTVduf/+97966aWXVKJECbm4uKhcuXIaOnSoUlJSMnWcfn5+kpQuYXnw4EG1bNlSRYsWVYECBfTss89q0aJF6bY/deqUOnXqJF9fX7m6uiogIEBTp05NV27WrFkKDAxUoUKF5OHhocqVK5vvmSlTpqhly5aSpPr165vv1bw25gsA3A0XFxfzu9aW+fPnq2nTpmYCTZKCg4P1xBNPaM6cOXfctmPHjqpbt64kqWXLlnJwcLAYZysz12VrDMPQxx9/rFKlSqlAgQKqX7++duzYkanjkW4mxEaOHKnKlSvL1dVV3t7eatSokTZt2mSWSbu+z5gxQ08++aRcXFzMa/vWrVvVuHFjeXh4yN3dXS+++KL++OMPi33cuHFDgwcP1uOPPy5XV1cVK1ZMzz33nJYtW2aWOXnypMLDw1WqVCm5uLioePHieuWVV+44jMC8efPk4OCg3377Ld26CRMmyMHBQX/99Zck6c8//1THjh316KOPytXVVX5+fnrzzTd19uxZm6+RtTHR/v77bzVr1kwFCxaUj4+PevfuraSkpHTbrlq1Si1bttQjjzwiFxcXlS5dWr1799bVq1fNMh07dtSYMWMkyaLLXZrU1FTFxsbqySeflKurq3x9ffXWW2/p33//tdjXvb4X7lQXSHP+/Hn17t3brIuWKlVKHTp00JkzZ8wymal7pI17Nnz4cMXGxqpcuXJycXHRzp07rY6J1rFjR7m7u+v48eNq1qyZ3N3d5e3trffeey9dners2bNq3769PDw8VLhwYYWFhWnbtm2ZHmfNVh0rrY5vGIbGjBmT7nxl5OjRo1q1apVat26t1q1bmzdub5eUlKTevXvL29tbhQoV0ssvv6y///47XbnM/lbJiLOzs4oUKZKujpmcnKyhQ4ea58Tf318ffvih1ff32LFjze+EEiVKqEePHunGH963b5+aN28uPz8/ubq6qlSpUmrdurUuXLgg6eZ7PjExUVOnTjVfS1t19oULF+qFF15I97r7+/uradOm5u8MNzc3Va5c2ayzLliwwPyuCwwM1NatW9M9t63fj5J07tw5vffee6pcubLc3d3l4eGhxo0bp+uKmdaVds6cOfrkk09UqlQpubq66sUXX9T+/fvT7btBgwZKTEy0+G582OTtJhrAA6R9+/b68MMP9csvv5itdXbs2KHatWurZMmS6tevnwoWLKg5c+aoWbNmmj9/vl599VW99tprKly4sHr37q02bdqoSZMmZtZ/48aNWrt2rVq3bq1SpUrp8OHDGjdunOrVq6edO3eqQIEC9xRzxYoVNWTIEA0cOFBdu3ZVnTp1JEm1atXKcJu3335b8+bNU0REhCpVqqSzZ89q9erV2rVrl55++mlJ0rJly9S0aVMVL15cvXr1kp+fn3bt2qWffvpJvXr1kiQtX75cjRs31qOPPqpBgwbp6tWrGjVqlGrXrq0tW7ak66
7XsmVLPf744xo2bJgMw7jjca1evVoLFixQ9+7dVahQIX399ddq3ry5jh49qmLFikm6WeFu1KiRihcvrsGDByslJUV
"text/plain": [
"<Figure size 1500x1500 with 7 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAjcAAAGwCAYAAABVdURTAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8o6BhiAAAACXBIWXMAAA9hAAAPYQGoP6dpAABd1UlEQVR4nO3deVhUZf8G8HtYZgBZXNgRRRQX3FAJEzPTUExzrcQ0RSsr9/RVc8ddy9wql1zRfhYuqWka7iuuKJgK4gbiAiih7Ps8vz98mbcR0Dk4w8h4f65rrpznPOece44jfHvOc86RCSEEiIiIiAyEkb4DEBEREWkTixsiIiIyKCxuiIiIyKCwuCEiIiKDwuKGiIiIDAqLGyIiIjIoLG6IiIjIoJjoO0B5UyqVePDgAaysrCCTyfQdh4iIiDQghEB6ejqcnZ1hZPT8sZnXrrh58OABXF1d9R2DiIiIyuDu3buoXr36c/u8dsWNlZUVgKcHx9raWs9piIiISBNpaWlwdXVV/R5/nteuuCk6FWVtbc3ihoiIqILRZEoJJxQTERGRQWFxQ0RERAaFxQ0REREZFBY3REREZFBY3BAREZFBYXFDREREBoXFDRERERkUFjdERERkUFjcEBERkUFhcUNEREQGRa/FzfHjx9G1a1c4OztDJpNh586dL1zn6NGjaN68ORQKBerUqYPg4GCd5yQiIqKKQ6/FTWZmJpo2bYply5Zp1D82NhZdunRBu3btEBkZia+//hqff/459u3bp+OkREREVFHo9cGZ7733Ht577z2N+69cuRK1atXCwoULAQANGjTAyZMnsXjxYvj7++sqJhERvULyCpR4mJ6j7xj0HHITI9hbmelt/xXqqeCnT5+Gn5+fWpu/vz++/vrrUtfJzc1Fbm6u6n1aWpqu4hERkQ4IIXDrUSZO3niEEzeScfr2P8jKK9R3LHqO5jUqY/vQ1nrbf4UqbhITE+Hg4KDW5uDggLS0NGRnZ8Pc3LzYOvPmzcOMGTPKKyIREWlBSmYewm4m48SNRzh5IxkPUtVHakyNZTCSyfSUjl7E1Fi/1ytVqOKmLCZOnIgxY8ao3qelpcHV1VWPiYiI6Fm5BYW4cOcxTt5IxokbybjyIBVC/G+53MQIPm5V8ZaHLdp42KKBozWMjFjcUMkqVHHj6OiIpKQktbakpCRYW1uXOGoDAAqFAgqFojziERGRhoQQuPkwA8dvPB2dOXs7Bdn56qea6jtaoY2HLd7ysIOPW1WYy431lJYqmgpV3LRq1Qp79+5Vaztw4ABatWqlp0RERIYhI7cAsY8ycTs5A7cfZeJxVp7O9pWeU4BTt5KRlJar1m5rqUCb/47MvFXHFvbW+puQShWbXoubjIwM3Lx5U/U+NjYWkZGRqFq1KmrUqIGJEyfi/v372LhxIwDgq6++wk8//YTx48fj008/xeHDh7Flyxbs2bNHXx+BiKjCKChU4t7jbFUBczs5E7cfPf3zw/TcF29AyxQmRvCpVRVve9jhLQ9b1He0gozzaEgL9FrchIeHo127dqr3RXNjAgMDERwcjISEBMTHx6uW16pVC3v27MHo0aOxdOlSVK9eHWvWrOFl4ERE/yWEwD+Zebj9KBOx/y1ibv33z/EpWcgvFKWua2sph7utJdztKsHeSgHoqNAwNZKhWY0q8HarAjNTnmoi7ZMJIUr/phugtLQ02NjYIDU1FdbW1vqOQ0RUJjn5hYhNzlQvYpIzEfsoA2k5BaWuZ2ZqBLdqlVDb7mkRU8u2EtztLFHLthJszE3L8RMQSSPl93eFmnNDRFRR5RcqUfCcUZOSCAik/HcU5vajjKfFzH8LmvtPsktdTyYDnG3M4W73tIh5WsA8LWKcrM14lREZPBY3REQ6dPNhOtaciMWOiPvILVBqddvWZiZw/+8IzL+LGLdqlXi6h15rLG6IiLRMCIGzsSlYffw2Dl17+FLbMjWWoUZVC1UR4/7f00jutpVQtZ
KcE3CJSsDihohISwoKlfjrSiJWn7iNv++lAnh6iqhDAwcMftsdnk7S5/kpTIxgoue7vRJVNCxuiIheUkZuAbacv4u1J2NVc2EUJkb4sEV1fPZWLbjbWeo5IdHrhcUNEVEZZeQWYPmRm/i/M3dUVyhVrSTHgFY10f/NmqhmybujE+kDixsiojI4dSsZ47b+rRqpqWVbCZ+3qYUPmlfnZF4iPWNxQ0QkQXZeIb4NvYbgU3EAgOpVzDGliyc6ejrwEmuiVwSLGyIiDV248xhjt15CbHImAKBvyxqY1LkBLBX8UUr0KuG/SCKiF8gtKMSSgzfw87FbUArA0doM8z9ojHfq2es7GhGVgMUNkQG4/yQbU3dewelb/+g7ikEqVArkFT69AV+vZi4I6toQNhZ8VAHRq4rFDVEFJoTA1gv3MGt3FNJzS3+eEL08W0s5ZvdojE6NHPUdhYhegMUNUQX1MC0HE7dfVt0Bt3mNypjRrREqc0RBJ+ytFVCY8CooooqAxQ1RBbT70gNM/eMKnmTlQ25shDEd62JwG3cY82odIiIWN0TlRQiBw9cePvdpzpo4c/sf7L2cCABo6GyNRb29UM/RShsRiYgMAosbonJy5X4aPtsQrpVtGRvJMLxdHQxvXwemfO4QEZEaFjdE5SQlKw8AYGVmgjYetmXejrmpCQJ9a6JJ9cpaSkZEZFhY3BCVM9cqFljer4W+YxARGSyOZxMREZFB4cgNkQYKlQLjt/2NM7fLfpO83IJCLSYiIqLSsLgh0sDKY7fw+8V7WtmWu10lrWyHiIhKxuKG6AXC41Kw6MB1AMDkzg3gU6tqmbdlJJOhgRMv2yYi0iUWN0TP8SQrD6NCIlGoFOjh5YzP29SCTMYb5RERvcpY3NBrIT0nH2E3k1GgFJLW+/3CPdx/kg23ahaY3bMxCxsiogqAxQ0ZvIJCJT4NPo/zcY/LtL6psQw/ftwclgr+cyEiqgj405oM3tJDN3A+7jEqyY3RuLqNpHWNZDIEvOEqeT0iItIfFjdk0MJuJuOnIzcBAN9+2ATvN3HWcyIiItI13sSPDFZyRi6+3hwJIYCPfWqwsCEiek1w5IZeCUIIzNgdhQt3yjYvpiTJGbl4lJ6Lug6WmPa+p9a2S0RErzYWN/RKiLj7BMGn4rS+XTNTI/zUtznM5cZa3zYREb2aWNzQK2H7f+/+276+Pfq/WVNr23W3q4Sa1XhHYCKi1wmLG9K73IJC7L6UAAD4tHUtvOVhq+dERERUkXFCMend4eiHSM3Oh5ONGVrVrqbvOEREVMGxuCG9+/3ifQBAj2YuMDbiHYCJiOjlsLghvfonIxdHYx4CAHo1c9FzGiIiMgQsbkivdl96gAKlQJPqNvBw4NOyiYjo5bG4Ib3aHvH0lBRHbYiISFtY3JDe3EhKx9/3UmFiJEPXprx7MBERaQeLG9KborsR+9SqimqWCj2nISIiQ1Gm+9zk5+cjMTERWVlZsLOzQ9WqVbWdi14DSvH0v5UUvN0SERFpj8YjN+np6VixYgXatm0La2truLm5oUGDBrCzs0PNmjUxePBgnD9/XpdZiYiIiF5Io+Jm0aJFcHNzw/r16+Hn54edO3ciMjIS169fx+nTpxEUFISCggJ07NgRnTp1wo0bN3Sdm4iIiKhEGp0POH/+PI4fP46GDRuWuNzHxweffvopVq5cifXr1+PEiRPw8PDQalAiIiIiTWhU3Pz2228abUyhUOCrr756qUBEREREL4NXSxEREZFBkVTcXLp0CbNnz8by5cuRnJystiwtLQ2ffvqpVsMRERERSaVxcbN//374+PggJCQE3377LerXr48jR46olmdnZ2PDhg06CUlERESkKY2Lm+nTp2Ps2LG4cuUK4uLiMH78eHTr1g2hoaG6zEdEREQkicZ3T7t69Sp++eUXAIBMJsP48eNRvXp1fPjhhwgJCcEbb7yhs5BEREREmtK4uFEoFHjy5IlaW9++fWFkZISAgAAsXLhQ29
mIiIiIJNO4uPHy8sKRI0fQokULtfY+ffpACIHAwECthyMiIiKSSuPiZsiQITh+/HiJyz7++GMIIbB69WqtBSMiIiI
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from sklearn.ensemble import AdaBoostClassifier\n",
"\n",
"# Score AdaBoost with a RandomForestClassifier base estimator (non-default n_estimators and learning_rate)\n",
"score_ada, model_ada, most_important_features_ada = score_the_model(\n",
" model=AdaBoostClassifier(\n",
" estimator=RandomForestClassifier(),\n",
" n_estimators=500000,\n",
" learning_rate=0.001,\n",
" ),\n",
" model_name='AdaBoost',\n",
" random_seed=42,\n",
" X_train=X_train,\n",
" X_test=X_test,\n",
" y_train=y_train,\n",
" y_test=y_test,\n",
" plot=True\n",
")\n"
]
},
{
"cell_type": "code",
"execution_count": 179,
"id": "005cb457",
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAA04AAAK4CAYAAABDHK0xAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8o6BhiAAAACXBIWXMAAA9hAAAPYQGoP6dpAABU60lEQVR4nO3dd3RU1d7G8WfSE0KIkEKAQKgCEsEbepEWigiKWBALTVCEKIKioAKiIFbEKwgCUlSQJk3phqICioQiVXo3CYi0REg77x++zHVIYJNCJiTfz1qz7p199j7nN+Goedj77LFZlmUJAAAAAHBNLs4uAAAAAADyOoITAAAAABgQnAAAAADAgOAEAAAAAAYEJwAAAAAwIDgBAAAAgAHBCQAAAAAMCE4AAAAAYEBwAgAAAAADghMA5ANTp06VzWbT4cOH7W1NmjRRkyZNjGPXrFkjm82mNWvW3LT6siIsLExdu3Z1dhl5yr59+9SyZUsVKVJENptNCxYscHZJ13X48GHZbDZNnTrV2aUAQLYRnADccg4cOKBnnnlG5cqVk5eXl/z8/NSgQQN9/PHH+vvvv51dXoGyZMkSvfHGG84uo8Do0qWLtm/frhEjRujLL79UzZo1c+W6u3fvls1mk5eXl86ePZsr17zylwH/fgUFBalp06ZaunRprtRwPYmJiXrjjTfy3F84ALh53JxdAABkxuLFi/Xwww/L09NTnTt3VrVq1ZSUlKSffvpJAwYM0M6dOzVhwgRnl5knrFix4qZfY8mSJRo7dizhKRf8/fff2rBhg1577TVFRUXl6rW/+uorFS9eXH/99Zfmzp2rHj165Nq133zzTZUtW1aWZSkuLk5Tp05VmzZt9O2336pt27a5VsfVEhMTNWzYMEm6oZldALc+ghOAW8ahQ4f06KOPqkyZMlq1apVCQkLsx/r06aP9+/dr8eLF1xyflpampKQkeXl55Ua5Tufh4eHsEgqElJQUpaWl3fSf96lTpyRJ/v7+OXbOhIQEFSpU6Lp9LMvSjBkz9Nhjj+nQoUOaPn16rgane+65x2Fm7amnnlJwcLC+/vprpwYnAAUPS/UA3DLee+89Xbx4UZ9//rlDaLqiQoUK6tu3r/29zWZTVFSUpk+frjvuuEOenp5atmyZJGnLli2655575OfnJ19fXzVv3lw///yzw/mSk5M1bNgwVaxYUV5eXipWrJgaNmyolStX2vvExsaqW7duKlWqlDw9PRUSEqL777/f4Vmjq82dO1c2m01r165Nd+yzzz6TzWbTjh07JEm//fabunbtal+WWLx4cXXv3l1//vmn8eeV0TNOx48fV/v27VWoUCEFBQWpX79+unz5crqxP/74ox5++GGVLl1anp6eCg0NVb9+/RyWQnbt2lVjx46VJIflVFekpaVp9OjRuuOOO+Tl5aXg4GA988wz+uuvvxyuZVmWhg8frlKlSsnHx0dNmzbVzp07jZ/vipkzZyoiIkKFCxeWn5+fwsPD9fHHHzv0OXv2rPr166ewsDB5enqqVKlS6ty5s06fPm3vEx8fb/+l3MvLS9WrV9e0adMcznPlmZ0PPvhAo0ePVvny5eXp6aldu3ZJkvbs2aOHHnpIRYsWlZeXl2rWrKlFixY5nONG7qurvfHGGypTpowkacCAAbLZbAoLC7Mfv5H7+crSt7Vr16p3794KCgpSqVKljD/fdevW6fDhw3r00Uf16KOP6ocfftDx48fT9Tt79qy6du2qIkWKyN/fX126dMlwWV927mnpn+Do7e0tNzfHv/tNSEjQiy++qNDQUHl6eur222/XBx98IMuyHPqlpKTorbfesv/ZhYWF6dVXX033z8GmTZvUqlUrBQQEyNvbW2XLllX37t0l/XMfBAYGSpKGDRtmv/eZeQXyN2acANwyvv32W5UrV07169e/4TGrVq3S7NmzFRUVpYCAAIWFhWnnzp1q1KiR/Pz89PLLL8vd3V2fffaZmjRpor
Vr16pOnTqS/vlldeTIkerRo4dq166t8+fPa9OmTdq8ebNatGghSXrwwQe1c+dOPffccwoLC1N8fLxWrlypo0ePOvxi+2/33nuvfH19NXv2bDVu3Njh2KxZs3THHXeoWrVqkqSVK1fq4MGD6tatm4oXL25firhz5079/PPPDkHF5O+//1bz5s119OhRPf/88ypRooS+/PJLrVq1Kl3fOXPmKDExUc8++6yKFSumjRs36pNPPtHx48c1Z84cSdIzzzyjkydPauXKlfryyy/TneOZZ57R1KlT1a1bNz3//PM6dOiQxowZoy1btmjdunVyd3eXJA0ZMkTDhw9XmzZt1KZNG23evFktW7ZUUlKS8TOtXLlSnTp1UvPmzfXuu+9K+ud5nHXr1tlD9MWLF9WoUSPt3r1b3bt313/+8x+dPn1aixYt0vHjxxUQEKC///5bTZo00f79+xUVFaWyZctqzpw56tq1q86ePesQyCVpypQpunTpkp5++ml5enqqaNGi2rlzpxo0aKCSJUtq4MCBKlSokGbPnq327dvrm2++0QMPPCDpxu6rq3Xo0EH+/v7q16+fOnXqpDZt2sjX11eSbvh+vqJ3794KDAzUkCFDlJCQYPwZT58+XeXLl1etWrVUrVo1+fj46Ouvv9aAAQPsfSzL0v3336+ffvpJvXr1UpUqVTR//nx16dIlwz+zzNzT586d0+nTp2VZluLj4/XJJ5/o4sWLeuKJJxyuf99992n16tV66qmnVKNGDS1fvlwDBgzQiRMn9NFHH9n79ujRQ9OmTdNDDz2kF198Ub/88otGjhyp3bt3a/78+ZL+CdEtW7ZUYGCgBg4cKH9/fx0+fFjz5s2TJAUGBmrcuHF69tln9cADD6hDhw6SpDvvvNP48wRwC7MA4BZw7tw5S5J1//333/AYSZaLi4u1c+dOh/b27dtbHh4e1oEDB+xtJ0+etAoXLmzdfffd9rbq1atb99577zXP/9dff1mSrPfff//GP8j/69SpkxUUFGSlpKTY2/744w/LxcXFevPNN+1tiYmJ6cZ+/fXXliTrhx9+sLdNmTLFkmQdOnTI3ta4cWOrcePG9vejR4+2JFmzZ8+2tyUkJFgVKlSwJFmrV6++7nVHjhxp2Ww268iRI/a2Pn36WBn9p+THH3+0JFnTp093aF+2bJlDe3x8vOXh4WHde++9Vlpamr3fq6++akmyunTpku7c/9a3b1/Lz8/P4ed4tSFDhliSrHnz5qU7duWaV342X331lf1YUlKSVa9ePcvX19c6f/68ZVmWdejQIUuS5efnZ8XHxzucq3nz5lZ4eLh16dIlh/PXr1/fqlixor3NdF9dy5VrX32/3ej9fOUeadiw4XV/Xv+WlJRkFStWzHrttdfsbY899phVvXp1h34LFiywJFnvvfeevS0lJcVq1KiRJcmaMmWKvT2z9/TVL09PT2vq1KkZXn/48OEO7Q899JBls9ms/fv3W5ZlWVu3brUkWT169HDo99JLL1mSrFWrVlmWZVnz58+3JFm//vrrNX82p06dsiRZQ4cOvWYfAPkLS/UA3BLOnz8vSSpcuHCmxjVu3FhVq1a1v09NTdWKFSvUvn17lStXzt4eEhKixx57TD/99JP9Wv7+/tq5c6f27duX4bm9vb3l4eGhNWvWpFt+ZtKxY0fFx8c77Mg1d+5cpaWlqWPHjg7XuOLSpUs6ffq06tatK0navHlzpq65ZMkShYSE6KGHHrK3+fj46Omnn07X99/XTUhI0OnTp1W/fn1ZlqUtW7YYrzVnzhwVKVJELVq00OnTp+2viIgI+fr6avXq1ZKk77//XklJSXruueccZhpeeOGFG/pM/v7+SkhIuO4yt2+++UbVq1e3z/j825VrLlmyRMWLF1enTp3sx9zd3fX888/r4sWL6ZZVPvjgg/alWpJ05swZrVq1So888oguXLhg/7x//vmnWrVqpX379unEiRP2mq93X2VGZu7nK3r27ClXV9cbOv
/SpUv1559/OvxcOnXqpG3btjksp1yyZInc3Nz07LPP2ttcXV313HPPpTtnZu/psWPHauXKlVq5cqW++uorNW3aVD1
"text/plain": [
"<Figure size 1000x800 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Most important features: ['V36', 'V39', 'V1', 'V22', 'V27', 'V15', 'V12', 'V13', 'V37', 'V18', 'V14', 'V2', 'V30', 'V10', 'V31', 'V8', 'V17', 'V38', 'V34', 'V16', 'V28', 'V3', 'V9', 'V5', 'V7', 'V11']\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAABNEAAATYCAYAAAAxo1G2AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8o6BhiAAAACXBIWXMAAA9hAAAPYQGoP6dpAAEAAElEQVR4nOzdeXwN1//H8XcSskiIJYlYUkHtYimVooo2BKXVlipKpJYWqSXdaEVslW40qvbavpba6+uLUqJaRe3Uvm9Via22ICSZ3x8emZ8rN7kJ2fB6Ph73Qc6cmTnnztw7537mnDN2hmEYAgAAAAAAAJAi++wuAAAAAAAAAJDTEUQDAAAAAAAAbCCIBgAAAAAAANhAEA0AAAAAAACwgSAaAAAAAAAAYANBNAAAAAAAAMAGgmgAAAAAAACADQTRAAAAAAAAABsIogEAAAAAAAA2EEQDHhGHDx9W48aN5e7uLjs7Oy1evDhT9tOgQQM1aNAgU7ad1QYNGiQ7O7vsLoZNK1asULVq1eTs7Cw7Oztdvnw50/Y1bdo02dnZ6cSJE5m2DwDAo23Lli2qU6eOXF1dZWdnp507d6Z53fRcZ3x9fdWpU6cHLueTytp7nNb229q1a2VnZ6e1a9dmWvkexKN2Lpw4cUJ2dnaaNm1adhclVTExMWrVqpUKFSokOzs7RUZGZur+7OzsNGjQoEzdx6Nk8+bNcnR01MmTJ7O7KBnirbfe0ptvvpndxch2BNGADJLUoEl6OTs7q2jRogoMDNR3332na9euPdT2g4KCtHv3bn3++eeaMWOGatasmUElT90///yjQYMGpasBjbS7ePGi3nzzTbm4uGjMmDGaMWOGXF1dba43duxY2dnZyd/fPwtKeVenTp0szvFcuXLJx8dHb731lvbt25dl5UjJvn37NGjQIAKEAB47169fV3h4uJo0aaKCBQva/PG+f/9+NWnSRG5ubipYsKA6dOig8+fPp2lfd+7cUevWrXXp0iV9++23mjFjhkqUKJFBNcGjbPny5QRIHjF9+/bVypUr1b9/f82YMUNNmjSxuc7ly5fNG7v79+/PglL+f3D33lfBggX13HPPadasWVlSBluGDx+e7k4Mn332mdq2bfvYfId+8sknWrhwoXbt2pXdRclWubK7AMDjZsiQISpZsqTu3Lmj6OhorV27Vn369NHIkSO1ZMkSValSJd3bvHnzpjZu3KjPPvtMISEhmVDqlP3zzz8aPHiwfH19Va1atSzd98MaMGCA+vXrl93FSNWWLVt07do1DR06VAEBAWleb9asWfL19dXmzZt15MgRPf3005lYyv/n5OSkH374QZIUHx+vo0ePavz48VqxYoX27dunokWLZkk5rNm3b58GDx6sBg0ayNfXN9vKAQAZ7cKFCxoyZIieeuopVa1aNdVeRH///bdeeOEFubu7a/jw4bp+/bq++eYb7d692+wVkZqjR4/q5MmTmjRpkrp06ZLBNUFm+eWXXzJ9H8uXL9eYMWMIpEkqUaKEbt68qdy5c2d3UVK1Zs0avfrqq/rwww/TvM78+fNlZ2cnb29vzZo1S8OGDcvEElrq1auXnn32WUl3bzTPnTtXb7/9ti5fvqyePXtmWTmsGT58uFq1aqWWLVumKf/OnTu1evVqbdiwIXMLloWqV6+umjVrasSIEfrPf/6T3cXJNgTRgAzWtGlTi15i/fv315o1a9S8eXO98sor2r9/v1xcXNK1zaS7x/nz58/Ioj62YmNj5erqqly5cilXrpz9NXfu3DlJ6Tu2x48f14YNG7Ro0SK9++67mjVrlsLDwzOphJZy5cqlt99+2yLtueeeU/PmzbVs2TJ17do1S8oBAE+SIkWK6OzZs/L29tbWrVvNH5nWDB8+XLGxsdq2bZueeuopSVKtWrXUqFEjTZs2Td26dUt1Xw9yXcqJEhMTdfv2bTk7O2d3UbKEreAoMkZ8fLwSExPl6Oj4SJxb586dS/dnee
bMmWrWrJlKlCih2bNnZ2kQrV69emrVqpX5d/fu3VWqVCnNnj0724No6TV16lQ99dRTeu6557K7KBnqzTffVHh4uMaOHSs3N7fsLk62YDgnkAVefPFFhYWF6eTJk5o5c6bFsgMHDqhVq1YqWLCgnJ2dVbNmTS1ZssRcPmjQILML8EcffSQ7Ozuzl83JkyfVo0cPlStXTi4uLipUqJBat26dbDhbSnOD2Zq3ZO3atWZDPTg42OxendoQkmvXrqlPnz7y9fWVk5OTvLy81KhRI23fvt0i36ZNm9SsWTMVKFBArq6uqlKlikaNGmWRZ82aNapXr55cXV2VP39+vfrqq8m6lSfVbd++fWrXrp0KFCig559/PsV629nZKSQkRIsXL1blypXl5OSkSpUqacWKFVbrX7NmTTk7O6t06dKaMGFCuuZZmz9/vmrUqCEXFxd5eHjo7bff1pkzZ8zlDRo0UFBQkCTp2WeflZ2dXZrmA5k1a5YKFCigl19+Wa1atUqxm/vevXv14osvysXFRcWLF9ewYcOUmJiYLN9///tfvfzyyypatKicnJxUunRpDR06VAkJCWmqp7e3tyQlC1geO3ZMrVu3VsGCBZUnTx4999xzWrZsWbL1z507p86dO6tw4cJydnZW1apVNX369GT55syZoxo1aihv3rzKly+f/Pz8zHNm2rRpat26tSSpYcOG5rma0+Z8AYAH4eTkZH7X2rJw4UI1b97cDKBJUkBAgMqWLat58+alum6nTp1Uv359SVLr1q1lZ2dnMc9WWq7L1hiGoWHDhql48eLKkyePGjZsqL1796apPtLdgNioUaPk5+cnZ2dneXp6qkmTJtq6dauZJ+n6PmvWLFWqVElOTk7mtX3Hjh1q2rSp8uXLJzc3N7300kv6888/LfZx584dDR48WGXKlJGzs7MKFSqk559/XqtWrTLzREdHKzg4WMWLF5eTk5OKFCmiV199NdVpBBYsWCA7Ozv99ttvyZZNmDBBdnZ22rNnjyTpr7/+UqdOnVSqVCk5OzvL29tb77zzji5evGjzPbI2J9rff/+tli1bytXVVV5eXurbt6/i4uKSrbtu3Tq1bt1aTz31lJycnOTj46O+ffvq5s2bZp5OnTppzJgxkmQx5C5JYmKiIiMjValSJTk7O6tw4cJ699139e+//1rs62HPhdTaAkkuX76svn37mm3R4sWLq2PHjrpw4YKZJy1tj6R5z7755htFRkaqdOnScnJy0r59+6zOidapUye5ubnpzJkzatmypdzc3OTp6akPP/wwWZvq4sWL6tChg/Lly6f8+fMrKChIu3btSvM8a7baWEltfMMwNGbMmGTHKyWnTp3SunXr9NZbb+mtt94yb9zeLy4uTn379pWnp6fy5s2rV155RX///XeyfGn9rZISR0dHFShQIFkbMz4+XkOHDjWPia+vrz799FOr5/fYsWPN74SiRYuqZ8+eyeYfPnz4sN544w15e3vL2dlZxYsX11tvvaUrV65IunvOx8bGavr06eZ7aavNvnjxYr344ovJ3ndfX181b97c/J3h4uIiPz8/s826aNEi87uuRo0a2rFjR7Jt2/r9KEmXLl3Shx9+KD8/P7m5uSlfvnxq2rRpsqGYSUNp582bp88//1zFixeXs7OzXnrpJR05ciTZvhs1aqTY2FiL78YnTc7uogE8Rjp06KBPP/1Uv/zyi9lbZ+/evapbt66KFSumfv36ydXVVfPmzVPLli21cOFCvfbaa3r99deVP39+9e3bV23btlWzZs3MqP+WLVu0YcMGvfXWWypevLhOnDihcePGqUGDBtq3b5/y5MnzUGWuUKGChgwZooEDB6pbt26qV6+eJKlOnToprvPee+9pwYIFCgkJUcWKFXXx4kX98ccf2r9/v5555hlJ0qpVq9S8eXMVKVJEvXv3lre3t/bv36+lS5eqd+/ekqTVq1eradOmKlWqlAYNGqSbN29q9OjRqlu3rrZv355suF7r1q
1VpkwZDR8+XIZhpFqvP/74Q4sWLVKPHj2UN29efffdd3rjjTd06tQpFSpUSNLdBneTJk1UpEgRDR48WAkJCRoyZIg
"text/plain": [
"<Figure size 1500x1500 with 7 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAjcAAAGwCAYAAABVdURTAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8o6BhiAAAACXBIWXMAAA9hAAAPYQGoP6dpAABd4UlEQVR4nO3deXhMZ/8G8HuyzCSRDZFViJ3YglQqiiLEvlVFbaGttvaXqlJL7LRKeVvltYa+KpaiWhr7LrZIbInYBVlII/ueeX5/+GXejiTMiZmMTO7Pdc1V85znnHPP6US+nvOcc2RCCAEiIiIiA2Gk7wBERERE2sTihoiIiAwKixsiIiIyKCxuiIiIyKCwuCEiIiKDwuKGiIiIDAqLGyIiIjIoJvoOUNqUSiViYmJgZWUFmUym7zhERESkASEEUlNT4ezsDCOjV4/NlLviJiYmBq6urvqOQURERCXw6NEjVK1a9ZV9yl1xY2VlBeDFwbG2ttZzGiIiItJESkoKXF1dVb/HX6XcFTcFp6Ksra1Z3BAREZUxmkwp4YRiIiIiMigsboiIiMigsLghIiIig8LihoiIiAwKixsiIiIyKCxuiIiIyKCwuCEiIiKDwuKGiIiIDAqLGyIiIjIoLG6IiIjIoOi1uDl58iR69uwJZ2dnyGQy7Nmz57XrHD9+HM2bN4dCoUDt2rURGBio85xERERUdui1uElPT0fTpk2xcuVKjfrfv38f3bt3R/v27REeHo5//etf+PTTT3HgwAEdJyUiIqKyQq8PzuzatSu6du2qcf/Vq1ejRo0aWLp0KQCgQYMGOH36NH744Qf4+vrqKiYREelZalYukjNz9R2DNCQ3MYK9lZne9l+mngoeEhICHx8ftTZfX1/861//Knad7OxsZGdnq96npKToKh4REWlRckYuDtyIwx9XY3D27t/IVwp9RyINNa9mi12jW+tt/2WquImLi4ODg4Nam4ODA1JSUpCZmQlzc/NC6yxatAhz5swprYhERPQGUrNycSgiHn9ejcWp28+Qm/+/gkZuYgSZHrOR5kyN9Xu9Upkqbkpi2rRpmDRpkup9SkoKXF1d9ZiIiIj+KSMnD4cjn+LPKzE4fusZcvKUqmX1Ha3Qo4kTujdxRg27CnpMSWVJmSpuHB0dER8fr9YWHx8Pa2vrIkdtAEChUEChUJRGPCIi0lBWbj6O3XyKP6/G4sjNeGTl/q+gqVWlAno0cUaPJk6o42Clx5RUVpWp4qZVq1bYv3+/WtuhQ4fQqlUrPSUiIno1pVLg0sPnOHrzKTJy8vQd563wd3oOjt98ivScfFVb9coW6NHECT2aOKO+oxVkMp6AopLTa3GTlpaGO3fuqN7fv38f4eHhqFSpEqpVq4Zp06bhyZMn2Lx5MwDgiy++wE8//YQpU6bg448/xtGjR7F9+3bs27dPXx+BiKgQIQTCHiXhzyux2H8tFnEpWfqO9FZysTVXFTSNXKxZ0JDW6LW4uXTpEtq3b696XzA3xt/fH4GBgYiNjUV0dLRqeY0aNbBv3z5MnDgRK1asQNWqVbFu3TpeBk5EeieEwPUnKfjzagz+vBqLJ0mZqmVWChN0auiAqhUt9Jjw7SE3lsG7th2audqyoCGdkAkhytW1dSkpKbCxsUFycjKsra31HYeIyjAhBG7GpeLPqzHYdzUWD/7OUC2rIDeGj7sDejRxRtu6dlCYGOsxKVHZJ+X3d5mac0NEhkmpFMj+xxUyb7vHzzPw59VY/Hk1BnefpavazUyN0LG+A3o0cUL7+vYwM2VBQ6QPLG6ISG8S0rKxOeQh/nvuIRLTc/Qdp0TkJkZ4v24V9GjqjI717VFBwb9WifSNP4VEVOruPkvDulP38dvlx2r3NCkrTI1laFOnCno0cUIndwdYmZnqOxIR/QOLGyIqFUIIXHzwHGtO3sPhyP/dr6ppVRt81rYW2tWrUmbuPmtqbAS5iX7vwEpExWNxQ0Q6lZevxIEb8V
hz6h6uPEoCAMhkgE8DB4xsUxPvuFXkFTNEpFUsbojotZ6mZmH35SfIzM1/fed/yMlT4o+rMXiU+OKyaLmJEfq3qIpP3quBWlUsdRGViIjFDRG92h9XYjDz9+tIysgt8TYqWphiWCs3DG1VHXaWfBwKEekWixsiKlJieg5m/n4d+67GAgAaOFmjRXVbydtxd7JB32YuMJfzsmgiKh0sboiokEMR8Zi26xoS0rJhbCTD2Pa1MbZDbZgacxItEb39WNwQkUpKVi7m/hGBnaGPAQB17C2xbIAHGle10XMyIiLNsbgh0oAQAmN+vYxjN5/pO4pO5SmVyM0XkMmAz9rUxMROdXmXXSIqc1jcEGngRkwK9l+L03eMUuFW2QLff9gUnm6V9B2FiKhEWNwQaaDgNI1vQwfM6O6u5zS65WxrDmMj3neGiMouFjdEr5Gbr8TeKzEAgIEtq8G1koWeExER0avw0gei1zgR9QyJ6Tmws1SgTW07fcchIqLXYHFD9Bq7wl6ckurj4QwTXgpNRPTW49/URK+QnJGLwxFPAQD9mlfVcxoiItIEixuiV/jjagxy8pVo4GQNd2drfcchIiINsLgheoVdl1+ckvqguYuekxARkaZ4tRSVC9svPsJPx+4gXykkrfckKRNGMqCXh7OOkhERkbaxuCGDd/VxEqbvuYbcfGmFTYEujRxhb2Wm5VRERKQrLG7IoKVm5WLc1jDk5gt0dnfAmPa1Ja1vbCRDXQcrHaUjIiJdYHFDBksIgem7r+Ph3xlwsTXHkv5NYWNhqu9YRESkY5xQTAZrx6XH2HslBsZGMvz7o2YsbIiIygkWN2SQlEqBBfsjAQBfdq6LFtUr6jkRERGVFhY3ZJCUQiA5MxcAMKhlNT2nISKi0sTihgyeDHzCNRFRecLihoiIiAwKr5ait158ShYm77iCpIxcjdcRKNk9bYiIqOxjcUNvvQ2n7+PU7YQSrWtjbgpzubGWExER0duMxQ291fLyldgd9gQA8JVvPbg7SXt4ZX0nK8hNePaViKg8YXFDb7Uzd//G09RsVLQwxcg2NVmoEBHRa/E3Bb3VCp7K3aupMwsbIiLSCH9b0FsrNSsXB27EAQD6Na+q5zRERFRWsLiht9Zf1+OQlatErSoV0KSqjb7jEBFRGcHiht5aBaek+jWvCpmMN+IjIiLNsLiht9KjxAycu5cImQzo28xF33GIiKgM4dVSpHMrj93BhfuJktaJT8kCAHjXqgxnW3NdxCIiIgPF4oZ0KjkzF0sORJV4/QGerlpMQ0RE5UGJipvc3FzExcUhIyMDVapUQaVKlbSdiwxEvvJ/j0H4rn8TGEmYO1PRwhQd6tvrIhYRERkwjYub1NRU/Pe//0VQUBAuXLiAnJwcCCEgk8lQtWpVdO7cGZ999hneeecdXealMqx/86owMuLEYCIi0i2NJhQvW7YMbm5u2LhxI3x8fLBnzx6Eh4fj1q1bCAkJQUBAAPLy8tC5c2d06dIFt2/f1nVuIiIioiJpNHJz8eJFnDx5Eg0bNixyecuWLfHxxx9j9erV2LhxI06dOoU6depoNSgRERGRJjQqbrZu3arRxhQKBb744os3CkRERET0JnifGyIiIjIokoqbK1euYP78+fj555+RkJCgtiwlJQUff/yxVsMRERERSaVxcXPw4EG0bNkSQUFB+Pbbb1G/fn0cO3ZMtTwzMxObNm3SSUgiIiIiTWlc3MyePRuTJ0/G9evX8eDBA0yZMgW9evVCcHCwLvMRERERSaLxfW5u3LiBX375BQAgk8kwZcoUVK1aFf3790dQUBDvb0NERERvBY2LG4VCgaSkJLW2QYMGwcjICH5+fli6dKm2sxERERFJpnFx4+HhgWPHjqFFixZq7QMHDoQQAv7+/loPR0RERCSVxsXNqFGjcPLkySKXffTRRxBCYO3atVoLRoYhNStX3xGIiKickQkhxOu7GY6UlBTY2NggOTkZ1tbW+o5j0JRKgaEbzuPMnb
/RtKoNfh/7nr4jERFRGSXl9zdv4kc6s+rEXZy58zfMTY2xdEBTfcchIqJygsUN6cSlB4lYdugWAGBOr4aobW+l50R
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Score and plot the AdaBoost model trained on the subset of most important features\n",
"score_ada, model_ada, most_important_features_ada = score_the_model(\n",
"    model=AdaBoostClassifier(\n",
"        estimator=RandomForestClassifier(),\n",
" n_estimators=500000,\n",
" learning_rate=0.001,\n",
" ),\n",
" model_name='AdaBoost',\n",
" random_seed=42,\n",
" X_train=X_train[most_important_features_ada],\n",
" X_test=X_test[most_important_features_ada],\n",
" y_train=y_train,\n",
" y_test=y_test,\n",
" plot=True\n",
")"
]
},
{
"cell_type": "markdown",
"id": "121d0534",
"metadata": {},
"source": [
"### Gradient Boosting with the best hyperparameters found using the `hyper_param.py` script"
]
},
{
"cell_type": "code",
"execution_count": 180,
"id": "28e3bd1e",
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAA04AAAK4CAYAAABDHK0xAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8o6BhiAAAACXBIWXMAAA9hAAAPYQGoP6dpAABX7klEQVR4nO3deZyNdf/H8ffsizGG2QyGsRUy0T32JWIsCUlJWgaFxEQUUdmKtErFnX2puG0hd1kTKqYwImv2PYNkG8uYOdfvj35z7o6Z8Z3NHMbr+XicR/f5Xt/vdX3OmWvc5z3f6/oeF8uyLAEAAAAAMuTq7AIAAAAA4FZHcAIAAAAAA4ITAAAAABgQnAAAAADAgOAEAAAAAAYEJwAAAAAwIDgBAAAAgAHBCQAAAAAMCE4AAAAAYEBwAoBsmjZtmlxcXHTw4EF7W8OGDdWwYUPj2NWrV8vFxUWrV6++afVlR0REhDp16uTsMm4pe/bsUdOmTVWoUCG5uLho4cKFzi4pz+TkHEfm8H4Ctw+CE4BM2bdvn55//nmVKVNG3t7e8vf3V926dfXxxx/r8uXLzi7vjrJ48WINHTrU2WXcMTp27KitW7dqxIgR+uKLL1StWrWbfszz589rxIgRqlatmgoVKiQvLy+VKlVK7du317fffnvTj+9s69at09ChQ3X27NlM9e/UqZNcXFzsD3d3d4WHh+uJJ57Qjh07bm6xmbBjxw4NHTrUIYACuP24O7sAALe+b7/9Vu3atZOXl5diYmJUuXJlJSUl6aefflK/fv20fft2TZgwwdll3hKWL19+04+xePFijR07lvCUBy5fvqy4uDi9/vrrio2NzZNj7t27V82aNdOhQ4f0yCOPKCYmRn5+fjpy5IgWL16sli1b6vPPP9czzzyTJ/VcLy/O8XXr1mnYsGHq1KmTAgICMjXGy8tLkyZNkiQlJydr3759GjdunJYuXaodO3aoWLFiN7HiG9uxY4eGDRumhg0bKiIiwmFbXryfAHIHwQnADR04cEBPPPGESpUqpe+//15hYWH2bT179tTevXtv+Bdwm82mpKQkeXt750W5Tufp6ensEu4IycnJstlsN/39PnXqlCRl+sN7ZiQmJqpAgQLpbktOTtYjjzyihIQErVmzRnXr1nXYPmTIEC1fvlwpKSnZPkZO3arnuLu7u55++mmHtlq1aqlly5b69ttv1bVrVydVdmO36vsJIC0u1QNwQ++9954uXryoyZMnO4SmVOXKlVPv3r3tz11cXBQbG6sZM2bonnvukZeXl5YuXSpJ+vXXX/Xggw/K399ffn5+aty4sX7++WeH/V27dk3Dhg1T+fLl5e3trcDAQNWrV08rVqyw9zlx4oQ6d+6sEiVKyMvLS2FhYXr44YdveBnMvHnz5OLiojVr1qTZNn78eLm4uGjbtm2SpN9++02dOnWyX5ZYtGhRPfvss/rzzz+N71d69yscPXpUbdq0UYECBRQSEqI+ffro6tWracb++OOPateunUqWLCkvLy+Fh4erT58+DpdCdurUSWPHjpUkh0uTUtlsNo0ePVr33HOPvL29FRoaqueff15//fWXw7Esy9Lw4cNVokQJ+fr66oEHHtD27duNry/VrFmzFBUVpYIFC8rf31+RkZH6+OOPHfqcPXtWffr0UUREhLy8vFSiRAnFxMTo9OnT9j4nT57Uc889p9DQUHl7e6tKlSqaPn26w34OHjwoFxcXffDBBxo9erTKli0rLy8v+yVYu3bt0mOPPaYiRYrI29tb1apV06JFixz2kZnz6npDhw5VqVKlJEn9+vWTi4uLw2xBZs7n1HuE1qxZox49eigkJEQlSpTI8Jhz587Vtm3bNGjQoDShKVXTpk314IMPZuoYhw4dUo8ePXT33XfLx8dHgYGBateuXbq/K9u3b1ejRo3k4+OjEiVKaPjw4bLZbGn6pXeOX716VUOGDFG5cuXs527//v3TnOep/z4sXLhQlS
tXlpeXl+655x77vxHS3+97v379JEmlS5e2n+PZucytaNGikv4OVf+0f/9+tWvXTkWKFJGvr69q1aqV7h+AMnN+Sjf+fZg2bZratWsnSXrggQfsryf1/sbr38/U+x/nzJmjESNGqESJEvL29lbjxo21d+/eNMceO3asypQpIx8fH9WoUUM//vgj900BNwkzTgBu6L///a/KlCmjOnXqZHrM999/rzlz5ig2NlZBQUGKiIjQ9u3bVb9+ffn7+6t///7y8PDQ+PHj1bBhQ61Zs0Y1a9aU9PeHppEjR6pLly6qUaOGzp8/r40bN2rTpk1q0qSJJOnRRx/V9u3b9eKLLyoiIkInT57UihUrdPjw4TSXwaR66KGH5Ofnpzlz5qhBgwYO22bPnq177rlHlStXliStWLFC+/fvV+fOnVW0aFH7pYjbt2/Xzz//7BBUTC5fvqzGjRvr8OHD6tWrl4oVK6YvvvhC33//fZq+c+fO1aVLl/TCCy8oMDBQ69ev16effqqjR49q7ty5kqTnn39ex48f14oVK/TFF1+k2cfzzz+vadOmqXPnzurVq5cOHDigMWPG6Ndff9XatWvl4eEhSRo8eLCGDx+uFi1aqEWLFtq0aZOaNm2qpKQk42tasWKFOnTooMaNG+vdd9+VJO3cuVNr1661h+iLFy+qfv362rlzp5599ln961//0unTp7Vo0SIdPXpUQUFBunz5sho2bKi9e/cqNjZWpUuX1ty5c9WpUyedPXvWIZBL0tSpU3XlyhV169ZNXl5eKlKkiLZv3666deuqePHiGjBggAoUKKA5c+aoTZs2+uqrr/TII49Iytx5db22bdsqICBAffr0UYcOHdSiRQv5+flJUqbP51Q9evRQcHCwBg8erMTExAzf2//+97+SlGbmJDPSO8aGDRu0bt06PfHEEypRooQOHjyozz77TA0bNtSOHTvk6+sr6e8/RjzwwANKTk62v48TJkyQj4+P8bg2m02tW7fWTz/9pG7duqlixYraunWrPvroI+3evTvNYho//fST5s+frx49eqhgwYL65JNP9Oijj+rw4cMKDAxU27ZttXv3bv3nP//RRx99pKCgIElScHCwsZbUUJ6SkqL9+/fr1VdfVWBgoFq2bGnvk5CQoDp16ujSpUvq1auXAgMDNX36dLVu3Vrz5s2znzOZPT9Nvw/333+/evXqpU8++USvvfaaKlasKEn2/2bknXfekaurq1555RWdO3dO7733np566in98ssv9j6fffaZYmNjVb9+ffXp00cHDx5UmzZtVLhw4RsGdADZZAFABs6dO2dJsh5++OFMj5Fkubq6Wtu3b3dob9OmjeXp6Wnt27fP3nb8+HGrYMGC1v33329vq1KlivXQQw9luP+//vrLkmS9//77mX8h/69Dhw5WSEiIlZycbG/7448/LFdXV+vNN9+0t126dCnN2P/85z+WJOuHH36wt02dOtWSZB04cMDe1qBBA6tBgwb256NHj7YkWXPmzLG3JSYmWuXKlbMkWatWrbrhcUeOHGm5uLhYhw4dsrf17NnTSu+f7x9//NGSZM2YMcOhfenSpQ7tJ0+etDw9Pa2HHnrIstls9n6vvfaaJcnq2LFjmn3/U+/evS1/f3+H9/F6gwcPtiRZ8+fPT7Mt9Zip782XX35p35aUlGTVrl3b8vPzs86fP29ZlmUdOHDAkmT5+/tbJ0+edNhX48aNrcjISOvKlSsO+69Tp45Vvnx5e5vpvMpI6rGvP98yez6nniP16tW74fuV6r777rMCAgLStF+8eNE6deqU/XHu3LlMHSO9cyouLs6SZH3++ef2tpdeesmSZP3yyy/2tpMnT1qFChUynuNffPGF5erqav34448Oxxk3bpwlyVq7dq29TZLl6elp7d271962ZcsWS5L16aef2tvef//9NMe9kY4dO1qS0jyKFy9uxcfHO/RNfa3/rPfChQtW6dKlrYiICCslJcWyrMyfn5n5fZg7d26a3/
dU17+fq1atsiRZFStWtK5evWpv//jjjy1J1tatWy3LsqyrV69agYGBVvXq1a1r167Z+02bNs2S5LBPALmDS/UAZOj
"text/plain": [
"<Figure size 1000x800 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Most important features: ['V39', 'V27', 'V36', 'V1', 'V12', 'V22', 'V15', 'V8', 'V37', 'V13', 'V18', 'V2', 'V34', 'V31', 'V38', 'V17', 'V30', 'V5', 'V10', 'V9', 'V16', 'V14', 'V28', 'V11', 'V3', 'V6', 'V7']\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAABNEAAATYCAYAAAAxo1G2AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8o6BhiAAAACXBIWXMAAA9hAAAPYQGoP6dpAAEAAElEQVR4nOzdeXwN1//H8XcSspBIkEUsFUXtW6k0qKUNKZqWqipqX1oVu7aUCFXSVaOqqNZSrdqrvqW2FG1RO7Xve4l9C4Jkfn943Pnlyk1uQja8no/HfXDPnJk5Z2buzMlnZs5xMAzDEAAAAAAAAIBkOWZ1AQAAAAAAAIDsjiAaAAAAAAAAYAdBNAAAAAAAAMAOgmgAAAAAAACAHQTRAAAAAAAAADsIogEAAAAAAAB2EEQDAAAAAAAA7CCIBgAAAAAAANhBEA0AAAAAAACwgyAakIz9+/erQYMG8vT0lIODg+bPn58h66lbt67q1q2bIcvObEOHDpWDg0NWF8OuxYsXq3LlynJ1dZWDg4MuXbqU1UVKlfbt2ysgIMAqzcHBQUOHDs2S8jyK2J4AssKGDRtUo0YN5c6dWw4ODtq6dWuq550yZYocHBx05MgRu3kDAgLUvn37+y7n48rWNk5t+23lypVycHDQypUrM6x89+NhOxaOHDkiBwcHTZkyJauLkqKYmBi99tpryp8/vxwcHBQVFZXVRUoVW8eprXYn7l9Wb8+EhASVL19eI0aMyLIypKfFixfL3d1dZ8+ezfR1E0TDQ8vSoLF8XF1dVbBgQYWEhOirr77S1atXH2j57dq10/bt2zVixAhNmzZN1apVS6eSp+y///7T0KFD09SARuqdP39er7/+utzc3DR27FhNmzZNuXPnTnGew4cPKywsTE899ZRy5cqlXLlyqWzZsurevbv+/fffTCp51pk+fXqaGoEBAQFJfpslS5bUu+++qwsXLmRcQVNp0aJFBMoApOjatWuKiIjQiy++qHz58tn943337t168cUX5e7urnz58qlNmzapbtjfvn1bzZs314ULF/Tll19q2rRpKlq0aDrVBA8zrlcPnz59+mjJkiUaOHCgpk2bphdffDHF/HFxcRozZoxq1aqlvHnzytnZWQULFtTLL7+sn3/+WfHx8ZlU8qyxa9cuDR06NFU3AaT/v2Fv+Tg6Osrf318vvfSS/vnnn4wtbCpk57/jfv75Zx0/flxhYWFZXZR08eKLL6pEiRKKjIzM9HXnyPQ1Aunsww8/VLFixXT79m2dPn1aK1euVO/evTVq1CgtWLBAFStWTPMyb9y4obVr12rQoEGZfqL577//NGzYMAUEBKhy5cqZuu4HNXjwYA0YMCCri5GiDRs26OrVqxo+fLiCg4Pt5v/tt9/UokUL5ciRQ61bt1alSpXk6OioPXv2aN68eRo3bpwOHz6cZX/w3LhxQzlyZOypfPr06dqxY4d69+6d6nkqV66sfv36SZJu3rypTZs2KSoqSqtWrdL69eszqKSps2jRIo0dO9bmHyaZsT0BZH/nzp3Thx9+qCeeeEKVKlVK8SmiEydOqHbt2vL09NTIkSN17do1ff7559q+fbvWr18vZ2fnFNd18OBBHT16VBMnTlTnzp3TuSbIKEuXLs3wdaR0vXrcFC1aVDdu3FDOnDmzuigp+uOPP/TKK6+of//+dvOePXtWDRs21KZNmxQSEqLBgwcrX758On36tJYvX65WrVrpwIEDCg8Pz4SSJzVx4kQlJCRk6Dp27dqlYcOGqW7duml6SmvcuHFyd3dXQkKCjh8/rokTJ6p27dpav359lv79lNLfcZmxPVPy2Wef6Y033pCnp2eWlSG9vfXWW+rfv7+GDRsmDw+PTFsvfyngodewYUOrp8QGDhyoP/74Qy+99JJefvll7d69W25ubmlapuXusZeXV3oW9ZEVGxur3LlzK0eOHNk+AHHmzBlJqdu3Bw8e1BtvvKGiRYsqOjpa/v7+Vt
M/+eQTffPNN3J0TPmhXsv2yQiurq4ZstwHVahQIb355pvm986dO8vd3V2ff/659u/fr5IlS2Zh6ZKXXbcngMzl7++vU6dOqUCBAtq4caOeeeaZZPOOHDlSsbGx2rRpk5544glJUvXq1VW/fn1NmTJFXbt2TXFdabkuZWcJCQm6devWY3MetRccRfq4c+eOEhIS5Ozs/FAcW2fOnEn1b7lNmzbasmWL5s6dq1dffdVq2sCBA7Vx40bt3bs3xWXcvHlTzs7Odtui9yM7Byxfe+01eXt7m9+bNGmi8uXLa/bs2dn2IYSs3J5btmzRtm3b9MUXX2RZGTJCs2bN1KNHD82ePVsdO3bMtPXyOiceSc8//7zCw8N19OhR/fjjj1bT9uzZo9dee0358uWTq6urqlWrpgULFpjThw4daj5V9O6778rBwcG8M3L06FG98847KlWqlNzc3JQ/f341b948ySPIyfUNZq/fkpUrV5oN9Q4dOpiPKqf0CsnVq1fVu3dvBQQEyMXFRb6+vqpfv742b95slW/dunVq1KiR8ubNq9y5c6tixYoaPXq0VZ4//vhDzz33nHLnzi0vLy+98sor2r17t8267dq1S61atVLevHlVq1atZOvt4OCgsLAwzZ8/X+XLl5eLi4vKlSunxYsX26x/tWrV5OrqquLFi2vChAlp6mdt9uzZqlq1qtzc3OTt7a0333xTJ0+eNKfXrVtX7dq1kyQ988wzcnBwSLE/kE8//VSxsbGaPHlykgCaJOXIkUM9e/ZUkSJFzLT27dvL3d1dBw8eVKNGjeTh4aHWrVtLkv766y81b95cTzzxhFxcXFSkSBH16dNHN27cSLJsy/ZydXVV+fLl9csvv9gso60+vE6ePKmOHTvKz8/P3N6TJk2yymPp+2LWrFkaMWKEChcuLFdXV73wwgs6cOCA1TZbuHChjh49ah6P99ufQ4ECBSQpSaA1NceddLcB0LBhQ+XJk0fu7u564YUXkjy6f/v2bQ0bNkwlS5aUq6ur8ufPr1q1amnZsmWS7u6fsWPHmtvO8rG4d3tajr8DBw6offv28vLykqenpzp06KDr169brfvGjRvq2bOnvL295eHhoZdfflknT56knzXgIeTi4mKes+yZO3euXnrpJTOAJknBwcF66qmnNGvWrBTnbd++verUqSNJat68uRwcHKz62Urt+fFehmHoo48+UuHChZUrVy7Vq1dPO3fuTFV9pLsBsdGjR6tChQpydXWVj4+PXnzxRW3cuNHMY7m+//TTTypXrpxcXFzMa3t6nK8l6fTp0+rQoYMKFy4sFxcX+fv765VXXknx1a85c+bIwcFBq1atSjJtwoQJcnBw0I4dOyRJ//77r9q3b68nn3xSrq6uKlCggDp27Kjz58/b3Ua2+kQ7ceKEmjRpoty5c8vX11d9+vRRXFxcknlT0x6wd71KSEhQVFSUypUrJ1dXV/n5+emtt97SxYsXrdb1oMfCjBkzVLVqVXl4eChPnjyqUKFCkvbjpUuX1KdPH7MtWrhwYbVt21bnzp0z85w5c0adOnWSn5+fXF1dValSJU2dOtVqOZZ+zz7//HNFRUWpePHicnFx0a5du2z2iWZpc508eVJNmjSRu7u7fHx81L9//ySvQZ4/f15t2rRRnjx55OXlpXbt2mnbtm2p7mft0KFDat68ufLly6dcuXLp2Wef1cKFC83plja+YRgaO3Zskv11r7Vr12rJkiXq2rVrkgCaRbVq1cw2pPT/bbcZM2Zo8ODBKlSokHLlyqUrV67owoUL6t+/vypUqCB3d3flyZNHDRs21LZt25IsN7XHqa0+vFJ73AUEBOill17S33//rerVq8vV1VVPPvmkfvjhB6tt1rx5c0lSvXr1zG12P/0HJtfGTM1xJ9294d2vXz8VKVJELi4uKlWqlD7//HMZhmGVb9myZapVq5a8vLzk7u6uUqVK6YMPPpBk/++4e7dn4uP922+/NY/3Z555Rhs2bEhSxt
mzZ6ts2bJWfxuktp+1+fPny9nZWbVr17ZKt7Rz9+3bpzfffFOenp7y8fFReHi4DMPQ8ePH9corryhPnjwqUKCAzSB
"text/plain": [
"<Figure size 1500x1500 with 7 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAjcAAAGwCAYAAABVdURTAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8o6BhiAAAACXBIWXMAAA9hAAAPYQGoP6dpAABbFElEQVR4nO3dd1QU19sH8O+CLEUpKiKgKPbeUSMmVhRLFDWxR9EYjTVGgolGBTsmNkwsxIr6MwE1tjcaG4qxRUXEBoIFxQIqIYL0svf9w8PGlUV3cJeyfD/n7DnsnTszzw7L7sOdZ+7IhBACRERERHrCoKgDICIiItImJjdERESkV5jcEBERkV5hckNERER6hckNERER6RUmN0RERKRXmNwQERGRXilT1AEUNoVCgSdPnsDc3BwymayowyEiIiINCCHw8uVL2Nvbw8Dg7WMzpS65efLkCRwcHIo6DCIiIiqAhw8fomrVqm/tU+qSG3NzcwCvDo6FhUURR0NERESaSEpKgoODg/J7/G1KXXKTeyrKwsKCyQ0REVEJo0lJCQuKiYiISK8wuSEiIiK9wuSGiIiI9AqTGyIiItIrTG6IiIhIrzC5ISIiIr3C5IaIiIj0CpMbIiIi0itMboiIiEivMLkhIiIivVKkyc1ff/2FPn36wN7eHjKZDPv27XvnOsHBwWjZsiWMjY1Ru3Zt+Pv76zxOIiIiKjmKNLlJSUlBs2bNsGbNGo36R0dHo3fv3ujcuTPCwsLw9ddf44svvsCRI0d0HCkRERGVFEV648yePXuiZ8+eGvf38/NDjRo1sHz5cgBAgwYNcObMGaxcuRKurq66CpOIiOithBBIy8op6jCKFVMjQ41ucqkLJequ4OfPn4eLi4tKm6urK77++ut818nIyEBGRobyeVJSkq7CIyKiUkgIgU/9zuPyg3+LOpRiJXy+K8zkRZNmlKiC4ri4OFSuXFmlrXLlykhKSkJaWpradXx8fGBpaal8ODg4FEaoRERUSqRl5TCxKWZK1MhNQcycORMeHh7K50lJSUxwiIhIJ0Jmu8BMbljUYRQLpkZFdxxKVHJja2uLp0+fqrQ9ffoUFhYWMDU1VbuOsbExjI2NCyM8IiKdYU1H8ZWa+d/vxUxuWGSnYug/Jeo30K5dOxw6dEil7dixY2jXrl0RRUREpHus6SCSpkhrbpKTkxEWFoawsDAAry71DgsLQ0xMDIBXp5RGjhyp7D9+/Hjcu3cP3377LW7duoW1a9di586dmDZtWlGET0RUKFjTUTI4VS9fpKdi6D9FOnITEhKCzp07K5/n1sa4u7vD398fsbGxykQHAGrUqIGDBw9i2rRpWLVqFapWrYqNGzfyMnAiKjVY01F8FeWlz6SqSJObTp06QQiR73J1sw936tQJV65c0WFURETFF2s6iN6NfyFEVGKU1qLa1wtWiejdmNwQUYnAoloi0lSJmsSPiEovFtWyYJVIUxy5IaISp7QW1bJglUgzTG6IqMRhUS0RvQ0/HYhIqTgX7LKolog0xeSGiACwYJeI9AcLiokIQMkp2GVRLRG9C0duiCiP4lywy6JaInoXJjdUKhTnWpLignc2JiJ9wU8v0nusJSEiKl1Yc0N6r6TUkhQXrGkhopKOIzdUqhTnWpLigjUtRFTSMbmhUoW1JERE+o+f8qS3couIOfkbEVHpwuSG9BKLiImISi8WFJNeUldEzEJZIqLSgSM3pPdyi4hZKEtEVDowuaESLb/J+TghHRFR6cVPfCqxWFdDRETqsOaGSixNJudjnQ0RUenDkRvSC/lNzsc6GyKi0ofJDekF1tUQEVEunpYiIiIivcLkhoiIiPQKkxsiIiLSK0xuiIiISK+wApOKpfwm53sdb4hJRETqMLmhYoeT8xER0fvgaSkqdjSZnO91nKiPiIhex5EbKtbym5zvdZyoj4iIXsfkhoo1Ts5HRERS8bQUERER6RUmN0RERKRXmNwQERGRXmFyQ0
RERHqFlZpUpNRN1sfJ+YiI6H0wuaEiw8n6iIhIF3haiorMuybr4+R8RERUEAUaucnKykJcXBxSU1NRqVIlVKhQQdtxUSmjbrI+Ts5HREQFofHIzcuXL7Fu3Tp07NgRFhYWcHR0RIMGDVCpUiVUr14dY8eOxaVLl3QZK+mx3Mn6Xn8wsSEiooLQKLlZsWIFHB0dsWXLFri4uGDfvn0ICwtDVFQUzp8/D29vb2RnZ6N79+7o0aMHbt++reu4qQQRQiA1M1vNg4XDRESkfRqdlrp06RL++usvNGrUSO3yNm3a4PPPP4efnx+2bNmC06dPo06dOloNlEomFg0TEVFh0yi5+e233zTamLGxMcaPH/9eAZF+0eQO3ywcJiIibeKl4FRo8rvDNwuHiYhImyRdCn716lUsXLgQa9euRXx8vMqypKQkfP7551oNjvSLuqJhFg4TEZG2aZzcHD16FG3atEFAQAB++OEH1K9fHydPnlQuT0tLw9atW3USJBEREZGmNE5u5s6dC09PT9y4cQP379/Ht99+i759++Lw4cO6jI+IiIhIEo1rbm7evInt27cDAGQyGb799ltUrVoVn376KQICAtC6dWudBUlERESkKY2TG2NjY7x48UKlbdiwYTAwMMDgwYOxfPlybcdGREREJJnGyU3z5s1x8uRJtGrVSqV9yJAhEELA3d1d68ERERERSaVxcjNhwgT89ddfapcNHToUQghs2LBBa4ERERERFYRMCCGKOojClJSUBEtLSyQmJsLCwqKow9F7qZnZaOh1BAAQPt8VZnJOrURERNJJ+f6WNM8NERERUXHHf6NJK4QQSMvKeyNM3hyTiIgKG5Mbem+8OSYRERUnPC1F7403xyQiouKkyEdu1qxZg6VLlyIuLg7NmjXDzz//jDZt2uTb39fXF+vWrUNMTAysra3x6aefwsfHByYmJoUYNeWHN8ckIqKiVqCRm7/++gshISEqbSEhIfleKp6fwMBAeHh4wNvbG6GhoWjWrBlcXV3x7Nkztf1//fVXzJgxA97e3oiIiMCmTZsQGBiI77//viAvg3SAN8ckIqKiVqDkplOnThg5cqRK24gRI9C5c2dJ21mxYgXGjh2L0aNHo2HDhvDz84OZmRk2b96stv+5c+fQvn17DBs2DI6OjujevTuGDh2Kixcv5ruPjIwMJCUlqTyIiIhIfxUouYmOjsbx48dV2oKCgnDv3j2Nt5GZmYnLly/DxcXlv2AMDODi4oLz58+rXcfZ2RmXL19WJjP37t3DoUOH0KtXr3z34+PjA0tLS+XDwcFB4xiJiIio5ClQzU316tXztNnb20vaRnx8PHJyclC5cmWV9sqVK+PWrVtq1xk2bBji4+Px4YcfQgiB7OxsjB8//q2npWbOnAkPDw/l86SkJCY4REREeqxEXS0VHByMxYsXY+3atQgNDcWePXtw8OBBLFiwIN91jI2NYWFhofIgIiIi/aXRyE358uU1LghNSEjQqJ+1tTUMDQ3x9OlTlfanT5/C1tZW7Tpz5szBiBEj8MUXXwAAmjRpgpSUFIwbNw6zZs2CgUGJytWIiIhIBzRKbnx9fbW+Y7lcjlatWiEoKAj9+vUDACgUCgQFBWHy5Mlq10lNTc2TwBgavrrsuJTdIqtYyJ2VmLMQExFRcaJRcuPu7q6TnXt4eMDd3R1OTk5o06YNfH19kZKSgtGjRwMARo4ciSpVqsDHxwcA0KdPH6xYsQItWrRA27ZtcefOHcyZMwd9+vRRJjlUODgrMRERFVcFKii+e/cutmzZgrt372LVqlWwsbHBn3/+iWrVqqFRo0Yab2fw4MF4/vw5vLy8EBcXh+bNm+Pw4cPKIuOYmBiVkZrZs2dDJpNh9uzZePz4MSpVqoQ+ffpg0aJFBXkZ9B7UzUrMWYiJiKg4kAmJ53NOnTqFnj17on379vjrr78QERGBmjVrYsmSJQgJCcHu3bt1FatWSLllOuUvNTMbDb2OAPhvVmLOQkxERL
oi5ftbcgXujBkzsHDhQhw7dgxyuVzZ3qVLF/z999/So6USRQiB1MxslTqb3FmJmdgQEVFxIPm01PXr1/Hrr7/mabe
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"best_params_gradient_boosting = {\n",
"    'validation_fraction': 0.3,\n",
"    'tol': 0.0001,\n",
"    'subsample': 0.9,\n",
"    'n_estimators': 50,\n",
"    'min_samples_split': 5,\n",
"    'min_samples_leaf': 5,\n",
"    'max_features': 'log2',\n",
"    'max_depth': 10,\n",
"    'learning_rate': 0.1,\n",
"}\n",
"\n",
"best_params_grad_boost_scores, best_params_grad_boost_model, best_params_grad_boost = score_the_model(\n",
" model=GradientBoostingClassifier(**best_params_gradient_boosting),\n",
" model_name='Gradient Boosting',\n",
" random_seed=42,\n",
" X_train=X_train,\n",
" X_test=X_test,\n",
" y_train=y_train,\n",
" y_test=y_test,\n",
" plot=True\n",
")\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "57b06b7d",
"metadata": {},
"source": [
"### Comparison of model performance"
]
},
{
"cell_type": "code",
"execution_count": 181,
"id": "1d3fdfaf",
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAB9gAAAduCAYAAACpL/aIAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8o6BhiAAAACXBIWXMAAA9hAAAPYQGoP6dpAAEAAElEQVR4nOzdeZxWZf0//tew4wIim4LE4J5pmKBIam4obpg7biG45UKaZN/UVDQrsExxjVxQKwxM0Q/mLklqaa6QZOKKGAiCyiLK4sz8/ujH5MSAZ3Bg0Hk+H4/78Zj7Otc5533u+ya779e5rqukoqKiIgAAAAAAAADACjWo6wIAAAAAAAAA4ItAwA4AAAAAAAAABQjYAQAAAAAAAKAAATsAAAAAAAAAFCBgBwAAAAAAAIACBOwAAAAAAAAAUICAHQAAAAAAAAAKELADAAAAAAAAQAECdgAAAAAAAAAoQMAOAAAAK2nSpEk59thj06lTpzRp0iQlJSUpKSnJhAkTVmsdpaWlKSkpSf/+/ZfZNmXKlMq6brnlltVa15ruoosuqnxtVpUVvTcAAAB88TSq6wIAAADqgwULFuR3v/tdxo4dm4kTJ+a9995LRUVFWrRokdLS0myzzTbp2bNn9tlnn3Tq1Kmuy6WA5557Lrvssks+/vjjui4FAAAAWE0E7AAAAKvYk08+mSOPPDJTp05dZtvs2bMze/bsPPvss7n55pvTvn37zJgxow6qpKbOPffcfPzxx2nRokWGDh2a7t27p3nz5kmSTTfdtI6rAwAAAFYFATsAAMAq9Morr6R3796ZP39+kuTAAw/MYYcdls033zxNmjTJ7NmzM3HixDz88MN59NFH67hailqyZEn+8pe/JElOPvnknHrqqXVcEQAAALA6CNgBAABWoR//+MeV4frNN99c7TrMe+21V84+++zMmjUrt99++2qukJUxe/bsLF68OEmy+eab13E1AAAAwOrSoK4LAAAA+LIqKyvLvffemyTp3r17teH6p7Vt2zann376aqiMz2vRokWVfzdu3LgOKwEAAABWJwE7AADAKjJr1qx8/PHHSWpvTe5Fixbl+uuvz/7775+OHTumadOmWXvttfO1r30tJ554Yh588MFUVFRUu++HH36YoUOHpmfPnll//fXTtGnTbLTRRjnssMPypz/9aYXn3W233VJSUpLddtstSfLqq69m4MCB2WyzzbLWWmulpKQkU6ZMqbLPwoULc80112TPPffMBhtskCZNmqRdu3bp1atXbrrppnzyyScrPOef//znHHXUUenSpUuaN2+etdZaK507d86OO+6Ys88+O3/+858Lv27VWbx4ca677rrsvvvuadu2bZo0aZINNtgg++23X37/+9+nvLx8mX0uuuiilJSUpEuXLpVtAwYMSElJSeXjoosuqlEdCxYsyOjRo3PiiSdm2223TcuWLdO4ceO0bds2u+66ay677LJ8+OGHn+taP6/+/funpKQkpaWlSZIZM2bk7LPPzuabb5611lorHTt2zBFHHJF//vOfVfabMmVKzjjjjGy++eZp3rx52rdvn2OOOSavv/76Z55zZd6f//Xvf/87p59+ejbeeOM0a9YsHTp0yIEHHphHHnmkRtc/d+7cDBkyJDvttFNlLRtuuGH69OmTO+64Y7n/5opYuHBhrrrqquy2225p27ZtGjdunPXXXz9bbLFF9t1331x++eXL/NsCAACgDlUAAACwSrz33nsVSSqSVHTt2vVzH++FF16o6NKlS+Uxl/d48803l9n3+eefr+jQocMK9zvkkEMqPv7442rPveuuu1Ykqdh1110r7r777oq11157heedMGFCRefOnVd4vu23375ixowZ1Z7v+9///mdeZ+vWrVf6tXzzzTcrttxyyxUef+edd6547733quw3ePDgz6xr8ODBNapl6Wu7okeXLl0q/vWvfy33GEtf6+OOO67aa116nJtvvrlGtS113HHHVS
Sp6Ny5c8WECRMqNthgg2rrXHvttSsef/zxioqKiopx48ZVtGzZstp+rVq1qpg0adJyz7ey78+nPfbYYxUtWrRY7v4XXXRRlfdzeR555JGK1q1br7CW/fbbr2L+/PnV7r+i92b69OkVW2211We+/z/4wQ+WWx8AAACrlzXYAQAAVpH1118/nTt3zltvvZWJEyfm0ksvzQ9/+MM0aFDzycT+9a9/ZZdddqkcyXzwwQfnyCOPzMYbb5yysrK88soreeihh3LXXXcts++0adOy55575oMPPkhJSUn69++fI488Mq1bt85LL72UX/3qV5k4cWLGjBmT/v37Z9SoUcutY+rUqTn22GOz1lpr5YILLsguu+yShg0b5plnnsk666yTJHnttdey6667Zu7cuWnRokVOP/307LDDDunUqVPee++9jB07Nr/5zW/yzDPP5Nvf/nYef/zxKtOs/+lPf8qwYcOSJF//+tdz6qmn5qtf/WpatmyZOXPm5J///GceeeSRPP300zV+HZP/jOTfc88988YbbyRJDjrooBx//PHp0KFD3nzzzVxzzTX5y1/+kieeeCJ9+vTJY489loYNGyZJTjvttBx22GGZPn16evfunST56U9/mm9/+9uVx2/Xrl2N6vnkk0+yzTbb5MADD0z37t3ToUOHVFRU5K233spdd92V22+/PW+++WYOOuigTJgwIc2aNVup664NH330UQ4++OAsXrw4P//5z7PrrrumYcOGeeCBB/Lzn/88CxYsyHe+8508/PDDOeigg9KyZcv85Cc/SY8ePfLJJ5/kzjvvzLBhw/LBBx/khBNOyFNPPbXMOT7P+7PU1KlTc8ABB2TevHlp0KBBTj755Bx22GFp2bJl/vGPf2To0KG56KKL0r179xVe71//+tfsu+++WbJkSdq3b5/vfe976dq1azp06JDp06dn9OjR+f3vf5/77rsvxx13XO68884avZ7f+9738tJLLyVJjj322BxyyCHp0KFDGjZsmHfeeSfPPvts/u///q9GxwQAAGAVq+uEHwAA4MvssssuqzIStbS0tOKMM86oGDVqVMUbb7xR+DjbbbddRZKKBg0aVPzhD39Ybr/Zs2dXfPTRR1XaDjvssMrz33jjjcvss3Dhwordd9+9ss999923TJ9Pj7Lu0KFDxVtvvbXcGr75zW9WJKn4xje+UTFr1qxq+9x///0VDRo0qEhScf3111fZ9p3vfKdytPTyRgVXVFSscPTyipx99tmV13L++ecvs728vLzimGOOqexz3XXXLdOnNkaFL/XKK6+scPvDDz9c+VpV9/5VVKy+EexJKtq0aVPx2muvLdPnmmuuqezTtm3bis0226zi3XffXabfD3/4w8p+zz///DLba+P9+fRn/rbbbltm+7x58yq6du1a5d/m/1q8eHFFaWlpRZKKffbZp2LBggXVvjbXX3995TEeeuihZbYv7735+OOPKxo3blxohPrKftYBAACofdZgBwAAWIXOOuusHH/88ZXPp0yZkquuuqpy9PkGG2yQI488Mvfcc89y13F+6KGH8vzzzydJzjjjjBx55JHLPV/r1q3TvHnzyufTp0+vHNW+zz775IQTTlhmn6ZNm2bEiBFp1Og/k5xdc801K7ymoUOH5itf+Uq12x5//PH87W9/S5LceuutadOmTbX99tlnnxx22GFJkltuuaXKthkzZiRJtttuu8pR8dVZf/31V1hndRYtWpQbb7wxSfK1r32t2vXSS0pKct1116V169ZJPvv1+Lw222yzFW7v1atXDjzwwCTJ3XffvUprKeKSSy7JJptsskz78ccfXzm6ftasWbnqqqvStm3bZfqdeuqplX8//vjjVbbVxvszY8aMys/8AQcckKOOOmqZY6y77rq5/vrrV3SZGTVqVKZMmZJmzZrlt7/9bdZaa61q+5100knZYYcdkiz7WV6R999/P0uWLEmSfOtb31ph35X5rAMAALBqCNgBAABWoQYNGuSmm2
7KQw89lH322acyxF5q5syZGT16dA488MDssMMOef3115c5xp/+9KfKv7///e/X6Pzjx49PWVlZklQbri9VWlqavfb
"text/plain": [
"<Figure size 2500x2000 with 6 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Plot the scores of all models, one panel per metric\n",
"all_accuracy = [score['Accuracy'] for score in all_scores]\n",
"all_precision = [score['Precision'] for score in all_scores]\n",
"all_recall = [score['Recall'] for score in all_scores]\n",
"all_f1 = [score['F1'] for score in all_scores]\n",
"all_roc_auc = [score['AUC'] for score in all_scores]\n",
"model_names = [score['model_name'] for score in all_scores]\n",
"\n",
"fig, ax = plt.subplots(3, 2, figsize=(25, 20))\n",
"fig.suptitle('Scores of all models', fontsize=20)\n",
"\n",
"metrics = [\n",
"    ('Accuracy', all_accuracy),\n",
"    ('Precision', all_precision),\n",
"    ('Recall', all_recall),\n",
"    ('F1', all_f1),\n",
"    ('ROC AUC', all_roc_auc),\n",
"]\n",
"\n",
"# Draw one bar chart per metric; zip stops after the fifth panel\n",
"for axis, (metric_name, values) in zip(ax.flat, metrics):\n",
"    axis.bar(model_names, values)\n",
"    axis.set_title(metric_name)\n",
"    axis.set_ylabel(metric_name)\n",
"    # Rotate the existing tick labels instead of overwriting them with set_xticklabels\n",
"    axis.tick_params(axis='x', labelrotation=90)\n",
"\n",
"\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"id": "a102a0ee",
"metadata": {},
"source": [
"### What about converting numerical features into categorical ones?"
]
},
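{
"cell_type": "code",
"execution_count": 182,
"id": "7c003720",
"metadata": {},
"outputs": [],
"source": [
"# Import the one-hot encoder\n",
"from sklearn.preprocessing import OneHotEncoder\n",
"\n",
"# Select the numerical columns that have fewer than 20 unique values\n",
"numerical_cols = [cname for cname in X_train.columns if\n",
"                  X_train[cname].dtype in ['int64', 'float64'] and X_train[cname].nunique() < 20]\n",
"\n",
"# One-hot encode the selected columns\n",
"# (`sparse_output` replaces the deprecated `sparse` argument in scikit-learn >= 1.2)\n",
"OH_encoder = OneHotEncoder(handle_unknown='ignore', sparse_output=False)\n",
"\n",
"# Fit the encoder on the training data only; keep the original index so that concat aligns the rows\n",
"encoder_df = pd.DataFrame(\n",
"    OH_encoder.fit_transform(df_train[numerical_cols]),\n",
"    columns=OH_encoder.get_feature_names_out(numerical_cols),\n",
"    index=df_train.index,\n",
")\n",
"df_train = pd.concat([df_train, encoder_df], axis=1)\n",
"\n",
"# Reuse the fitted encoder on the test data so that both sets end up with the same columns\n",
"encoder_df = pd.DataFrame(\n",
"    OH_encoder.transform(df_test[numerical_cols]),\n",
"    columns=OH_encoder.get_feature_names_out(numerical_cols),\n",
"    index=df_test.index,\n",
")\n",
"df_test = pd.concat([df_test, encoder_df], axis=1)\n",
"\n",
"# Drop the original numerical columns\n",
"df_train = df_train.drop(numerical_cols, axis=1)\n",
"df_test = df_test.drop(numerical_cols, axis=1)\n"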
]
},
{
"cell_type": "code",
"execution_count": 183,
"id": "edb83193",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 1000x800 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Most important features: ['V36', 'V1', 'V12', 'V39', 'V38', 'V34', 'V30', 'V22', 'V27', 'V13', 'V2', 'V14', 'V40', 'V8', 'V37', 'V18', 'V9', 'V28', 'V10', 'V31']\n"
]
},
{
"data": {
"text/plain": [
"<Figure size 1500x1500 with 7 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Refit the gradient boosting model with the new data\n",
"best_params_grad_boost_scores, best_params_grad_boost_model, best_params_grad_boost = score_the_model(\n",
" model=GradientBoostingClassifier(),\n",
" model_name='Gradient Boosting',\n",
" random_seed=42,\n",
" X_train=X_train,\n",
" X_test=X_test,\n",
" y_train=y_train,\n",
" y_test=y_test,\n",
" plot=True\n",
")\n"
]
},
{
"cell_type": "code",
"execution_count": 184,
"id": "34db1917",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 1000x800 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"<Figure size 1500x1500 with 7 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# What about logistic regression?\n",
"\n",
"# Score the model with default parameters\n",
"score_log_reg, model_log_reg, _ = score_the_model(\n",
" model=LogisticRegression(max_iter=100),\n",
" model_name='Logistic Regression',\n",
" random_seed=42,\n",
" X_train=X_train,\n",
" X_test=X_test,\n",
" y_train=y_train,\n",
" y_test=y_test,\n",
" plot=True\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 185,
"id": "17471596",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 1000x800 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"<Figure size 1500x1500 with 7 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# KNN should score much better now\n",
"score_knn, model_knn, _ = score_the_model(\n",
" model=KNeighborsClassifier(),\n",
" model_name='KNN',\n",
" random_seed=42,\n",
" X_train=X_train,\n",
" X_test=X_test,\n",
" y_train=y_train,\n",
" y_test=y_test,\n",
" plot=True\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 186,
"id": "7b5a37ae",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 1000x800 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Most important features: ['V36', 'V39', 'V22', 'V27', 'V1', 'V12', 'V15', 'V13', 'V18', 'V34', 'V14', 'V37', 'V30', 'V2', 'V17', 'V31', 'V8', 'V16', 'V3', 'V38', 'V10', 'V28', 'V9', 'V11', 'V5', 'V7', 'V6', 'V41']\n"
]
},
{
"data": {
"text/plain": [
"<Figure size 1500x1500 with 7 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Select the best features\n",
"score_rf, model_rf, most_important_features_rf = score_the_model(\n",
" model=RandomForestClassifier(),\n",
" model_name='Random Forest',\n",
" random_seed=42,\n",
" X_train=X_train,\n",
" X_test=X_test,\n",
" y_train=y_train,\n",
" y_test=y_test,\n",
" plot=True\n",
")"
]
},
{
"cell_type": "markdown",
"id": "3dafbf40",
"metadata": {},
"source": [
"### 2.3 Evaluation\n",
"Given that the data set is not in the “big data” category, implement a cross-validation procedure based\n",
"on five folds (approximately equal sized) of your data. Furthermore, repeat the experiment 10 times with\n",
"different folds and average the results (include standard deviation). You are expected to report the following\n",
"metrics:\n",
"\n",
"- F1\n",
"- Precision\n",
"- Recall\n",
"- AUC\n",
"\n",
"Comment on the performance of algorithms and visualize their final scores. How do they perform against\n",
"the random baseline? What about the constant one? How do different learning scenarios impact the final\n",
"score? Are the differences between the models statistically significant?"
]
},
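{
"cell_type": "markdown",
"id": "5c2e8f1a",
"metadata": {},
"source": [
"A minimal sketch of the repeated cross-validation protocol described above, assuming scikit-learn is available. The synthetic data from `make_classification` and the names `X_demo`, `y_demo`, `demo_cv`, `demo_scoring`, and `demo_models` are illustrative assumptions, not part of the assignment; swap in the notebook's own features and labels. Five stratified folds repeated 10 times give 50 scores per metric, which are summarized by mean and standard deviation. The two `DummyClassifier` entries stand in for the random and constant baselines.\n",
"\n",
"Note that with the assignment's 1/2 label coding the scorers may not agree on which class counts as positive, so recoding the target to 0/1 first is probably the safer choice."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9d4b7e30",
"metadata": {},
"outputs": [],
"source": [
"from sklearn.datasets import make_classification\n",
"from sklearn.dummy import DummyClassifier\n",
"from sklearn.metrics import f1_score, make_scorer, precision_score\n",
"from sklearn.model_selection import RepeatedStratifiedKFold, cross_validate\n",
"from sklearn.neighbors import KNeighborsClassifier\n",
"\n",
"# Synthetic stand-in data (assumption); replace with the real features and labels.\n",
"X_demo, y_demo = make_classification(\n",
"    n_samples=300, n_features=10, weights=[0.66, 0.34], random_state=42\n",
")\n",
"\n",
"# 5 stratified folds, reshuffled 10 times -> 50 train/test splits in total.\n",
"demo_cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=42)\n",
"\n",
"# zero_division=0 keeps the constant baseline from raising warnings\n",
"# when it never predicts the positive class.\n",
"demo_scoring = {\n",
"    'f1': make_scorer(f1_score, zero_division=0),\n",
"    'precision': make_scorer(precision_score, zero_division=0),\n",
"    'recall': 'recall',\n",
"    'auc': 'roc_auc',\n",
"}\n",
"\n",
"demo_models = {\n",
"    'KNN': KNeighborsClassifier(),\n",
"    'Random baseline': DummyClassifier(strategy='uniform', random_state=42),\n",
"    'Constant baseline': DummyClassifier(strategy='most_frequent'),\n",
"}\n",
"\n",
"for name, model in demo_models.items():\n",
"    results = cross_validate(model, X_demo, y_demo, cv=demo_cv, scoring=demo_scoring)\n",
"    for metric in demo_scoring:\n",
"        fold_scores = results['test_' + metric]  # one score per split\n",
"        print(name, metric, round(fold_scores.mean(), 3), '+/-', round(fold_scores.std(), 3))"
]
},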
{
"cell_type": "markdown",
"id": "74d18249",
"metadata": {},
"source": [
"### F1 score"
]
},
{
"cell_type": "markdown",
"id": "f6f42fd4",
"metadata": {},
"source": [
"The F1 score is a metric that combines precision and recall. It is often used in classification tasks as a way to balance the two metrics, as it can be difficult to optimize for both at the same time.\n",
"To compute the F1 score, you first need to calculate the precision and recall for a given model.\n",
"\n",
"**Precision** \n",
"P = TP / (TP + FP)\n",
"\n",
"**Recall** \n",
"R = TP / (TP + FN)\n",
"\n",
"Here TP, FP, and FN denote the numbers of true positives, false positives, and false negatives.\n",
"\n",
"Once you have calculated precision and recall, the F1 score is simply the harmonic mean of the two, computed using the following formula:\n",
"\n",
"F1 = 2 * (precision * recall) / (precision + recall)\n",
"\n",
"The F1 score ranges from 0 to 1, with a higher score indicating better performance. A perfect score is achieved when the precision and recall are both 1."
]
},
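{
"cell_type": "markdown",
"id": "e3a90c6d",
"metadata": {},
"source": [
"The formulas above can be checked by hand on a small example. The sketch below uses made-up labels (`f1_y_true` and `f1_y_pred` are illustrative names, not data from the assignment), counts TP, FP, and FN directly, and compares the hand-computed values with scikit-learn's `precision_score`, `recall_score`, and `f1_score`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "41f6b8d7",
"metadata": {},
"outputs": [],
"source": [
"from sklearn.metrics import f1_score, precision_score, recall_score\n",
"\n",
"# Made-up labels for illustration only; 1 is the positive class.\n",
"f1_y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]\n",
"f1_y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]\n",
"\n",
"pairs = list(zip(f1_y_true, f1_y_pred))\n",
"tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # predicted positive, actually positive\n",
"fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # predicted positive, actually negative\n",
"fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # predicted negative, actually positive\n",
"\n",
"precision = tp / (tp + fp)\n",
"recall = tp / (tp + fn)\n",
"f1 = 2 * (precision * recall) / (precision + recall)\n",
"\n",
"# The hand-computed values should agree with scikit-learn.\n",
"assert abs(precision - precision_score(f1_y_true, f1_y_pred)) < 1e-12\n",
"assert abs(recall - recall_score(f1_y_true, f1_y_pred)) < 1e-12\n",
"assert abs(f1 - f1_score(f1_y_true, f1_y_pred)) < 1e-12\n",
"print('precision:', precision, 'recall:', recall, 'F1:', f1)"
]
},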
{
"cell_type": "markdown",
"id": "2e76c17b",
"metadata": {},
"source": [
"### Precision score"
]
},
{
"cell_type": "markdown",
"id": "05e44481",
"metadata": {},
"source": [
"Precision is a metric that measures the accuracy of a classifier when it predicts the positive class. It is defined as the number of true positive predictions made by the classifier, divided by the total number of positive predictions made by the classifier.\n",
"\n",
"In other words, precision is a measure of the proportion of positive predictions that are actually correct. It is a useful metric to consider when the cost of false positives is high, such as in cases where the classifier is being used to make important decisions (e.g. medical diagnosis, fraud detection).\n",
"\n",
"Precision = True Positives / (True Positives + False Positives)"
]
},
{
"cell_type": "markdown",
"id": "c4323660",
"metadata": {},
"source": [
"### Recall score\n"
]
},
{
"cell_type": "markdown",
"id": "bf24df24",
"metadata": {},
"source": [
"Recall is a metric that measures the ability of a classifier to detect all instances of the positive class. It is defined as the number of true positive predictions made by the classifier, divided by the total number of actual positive cases in the data.\n",
"\n",
"In other words, recall is a measure of the proportion of actual positive cases that the classifier is able to identify. It is a useful metric to consider when the cost of false negatives is high, such as in cases where it is important to identify all instances of the positive class (e.g. cancer diagnosis, intrusion detection).\n",
"\n",
"Recall = True Positives / (True Positives + False Negatives)"
]
},
{
"cell_type": "markdown",
"id": "aa3f5b17",
"metadata": {},
"source": [
"### AUC score"
]
},
{
"cell_type": "markdown",
"id": "697ee032",
"metadata": {},
"source": [
"AUC stands for the area under the receiver operating characteristic (ROC) curve. The ROC curve is obtained by plotting the true positive rate (TPR) against the false positive rate (FPR) at various classification thresholds. The TPR (identical to recall) is the number of true positive predictions made by the classifier, divided by the total number of actual positive cases in the data. The FPR is the number of false positive predictions made by the classifier, divided by the total number of actual negative cases in the data.\n",
"\n",
"The AUC is the area under this curve. An AUC of 1 indicates a perfect classifier, while an AUC of 0.5 indicates a classifier that is no better than random.\n",
"\n",
"With the ROC points sorted by increasing FPR, the area can be approximated with the trapezoidal rule:\n",
"\n",
"AUC = (FPR1 - FPR0) * (TPR1 + TPR0) / 2 + (FPR2 - FPR1) * (TPR2 + TPR1) / 2 + ... + (FPRn - FPRn-1) * (TPRn + TPRn-1) / 2\n",
"\n",
"where TPRi and FPRi are the rates at the ith classification threshold, and TPRi-1 and FPRi-1 are the rates at the previous classification threshold."
]
},
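{
"cell_type": "markdown",
"id": "c8d2a5f4",
"metadata": {},
"source": [
"As a rough check of how the area under the ROC curve is obtained, the sketch below sums trapezoids between consecutive ROC points and compares the result with scikit-learn's `roc_auc_score`. The labels and scores (`roc_y_true`, `roc_scores`) are made up for illustration and are not taken from the assignment data."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0b7f3e19",
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"from sklearn.metrics import roc_auc_score, roc_curve\n",
"\n",
"# Made-up labels and classifier scores for illustration only.\n",
"roc_y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])\n",
"roc_scores = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.9])\n",
"\n",
"# roc_curve returns the ROC points ordered by increasing FPR.\n",
"fpr, tpr, thresholds = roc_curve(roc_y_true, roc_scores)\n",
"\n",
"# Trapezoidal rule: width along the FPR axis times the average TPR height.\n",
"auc_trapezoid = np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2)\n",
"\n",
"assert np.isclose(auc_trapezoid, roc_auc_score(roc_y_true, roc_scores))\n",
"print('AUC:', auc_trapezoid)"
]
},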
{
"cell_type": "markdown",
"id": "addfc3ea",
"metadata": {},
"source": [
"## Report and presentation\n",
"The assignment has to be submitted in the form of two files: a markdown file and a PDF file created from\n",
"the RStudio markdown file (in RStudio → File → New File → R Markdown), where you write both the code\n",
"and the text of the answers (the echo = T option must be enabled for each code block). Markdown files can\n",
"easily be exported to PDF using the “Knit” button in RStudio. If you are using Python, you can produce a\n",
"similar report with Jupyter Notebook."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.1"
},
"vscode": {
"interpreter": {
"hash": "916dbcbb3f70747c44a77c7bcd40155683ae19c65e1c03b4aa3499c5328201f1"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}