# Example methodology description

Both examples consider the following combinations of hyperparameters used for Bayesian network learning:

* K2 metric;
* K2 metric with Gaussian mixtures (GMM);
* ...

All the examples are executed using cross-validation.
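
As an illustration of the cross-validation step, here is a minimal sketch in plain Python. It does not use the library behind these examples: `fit_predict` is a hypothetical callback standing in for "learn a Bayesian network on the training rows and predict the target for the held-out rows", and the fold count, seed, and toy targets are assumptions chosen only to make the sketch runnable.

```python
import random
from statistics import fmean


def k_fold_indices(n_samples, n_folds, seed=0):
    """Shuffle row indices and split them into n_folds near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def cross_validated_mse(targets, fit_predict, n_folds=5, seed=0):
    """Average the held-out mean squared error over all folds.

    fit_predict(train_idx, test_idx) is a hypothetical callback: it trains
    on the rows in train_idx and returns one prediction per row in test_idx.
    """
    folds = k_fold_indices(len(targets), n_folds, seed)
    fold_mse = []
    for k, test_idx in enumerate(folds):
        # Every fold except the k-th one is used for training.
        train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        predictions = fit_predict(train_idx, test_idx)
        errors = [(targets[i] - p) ** 2 for i, p in zip(test_idx, predictions)]
        fold_mse.append(fmean(errors))
    return fmean(fold_mse)


# Toy targets and a stand-in model that predicts the training mean.
targets = [float(v) for v in range(20)]


def mean_baseline(train_idx, test_idx):
    mean = fmean([targets[i] for i in train_idx])
    return [mean] * len(test_idx)


baseline_mse = cross_validated_mse(targets, mean_baseline, n_folds=5)
print(f"baseline CV MSE: {baseline_mse:.2f}")
```

Swapping `mean_baseline` for a callback that trains and queries a network would give one MSE per hyperparameter combination, which is the kind of number reported in the table at the end of this page.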
# Geological data example

## Data Description

The data set contains 9 variables and 442 samples. The target variable for prediction in the following example is 'Depth'. This variable is also used to visually evaluate sampling quality via a distribution plot.
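
The visual check can be complemented by a numeric one. The sketch below is an assumption-laden illustration, not the procedure used to produce the plots: it bins the original and the sampled target values on a shared grid and reports the histogram overlap (1.0 means identical binned distributions). The function names, the bin count, and the synthetic 'Depth'-like values are all hypothetical.

```python
import random


def bin_fractions(values, lo, hi, n_bins):
    """Fraction of values falling into each of n_bins equal-width bins."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins or 1.0  # avoid a zero width for constant data
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]


def histogram_overlap(original, sampled, n_bins=10):
    """Sum of bin-wise minima of the two normalised histograms (0 to 1)."""
    lo = min(min(original), min(sampled))
    hi = max(max(original), max(sampled))
    p = bin_fractions(original, lo, hi, n_bins)
    q = bin_fractions(sampled, lo, hi, n_bins)
    return sum(min(a, b) for a, b in zip(p, q))


# Synthetic stand-ins for the real and the sampled target values.
rng = random.Random(0)
original = [rng.gauss(1000, 150) for _ in range(442)]
good_sample = [rng.gauss(1000, 150) for _ in range(442)]
shifted_sample = [rng.gauss(1600, 150) for _ in range(442)]

print("similar sample overlap:", round(histogram_overlap(original, good_sample), 2))
print("shifted sample overlap:", round(histogram_overlap(original, shifted_sample), 2))
```

A sample drawn from a well-fitted network should score close to the first case, while a poorly fitted one drifts towards the second.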
## Sampling

### K2 metric sampling example

![k2](https://user-images.githubusercontent.com/86363785/188129119-dfa62b6d-b1fd-4e63-aa75-fb7aafba95a1.png)

### Sampling with K2 + GMM example

![geo_k2_gmm](https://user-images.githubusercontent.com/86363785/188129748-ce239eb4-bbab-43f0-9d80-c92483f27613.png)

### Sampling with K2 + GMM + logit nodes example

![geo_k2_gmm_logit](https://user-images.githubusercontent.com/86363785/188129774-a3695199-776d-493f-8a9c-bf78125f03fb.png)

### K2 with initial structure sampling

![geo_k2_expert](https://user-images.githubusercontent.com/86363785/188129863-b8777153-eb31-4e8f-b8bf-b87e7c959035.png)
# Social data example

## Data Description

The second example is similar to the previous one but is carried out on a different data set. The social data set consists of 30000 anonymous bank records with 9 variables each; the Bayesian networks were learnt on a sample of 2000 records. The target variable is 'mean_tr', the mean transaction of a client.
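
The text does not say how the 2000 records were selected. As a sketch, assuming a uniform random draw without replacement and a fixed seed for reproducibility (both assumptions, as is the helper name), the row indices could be chosen like this:

```python
import random

N_RECORDS = 30000   # size of the full social data set (from the text)
SAMPLE_SIZE = 2000  # number of records used for learning (from the text)


def draw_sample_indices(n_records, sample_size, seed=42):
    """Pick sample_size distinct row indices uniformly at random."""
    return sorted(random.Random(seed).sample(range(n_records), sample_size))


sample_idx = draw_sample_indices(N_RECORDS, SAMPLE_SIZE)
print(len(sample_idx), "rows selected")
```

Fixing the seed keeps the subsample identical across runs, so differences between hyperparameter combinations are not confounded by different training rows.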
## Sampling

### K2 metric sampling example

![socio_k2](https://user-images.githubusercontent.com/86363785/188132481-2ae015e4-69a0-4025-84ef-c96aad6dd98e.png)

### Sampling with K2 + GMM example

![social_k2_gmm](https://user-images.githubusercontent.com/86363785/188132496-e49ebf7d-d603-406a-a199-8cbc3e3c256c.png)

### Sampling with K2 + GMM + logit nodes example

![social_k2_gmm_logit](https://user-images.githubusercontent.com/86363785/188132505-72b257f8-fb38-47b3-ad19-a670bd31c6d9.png)

### K2 with initial structure sampling

![socio_expert](https://user-images.githubusercontent.com/86363785/188132518-5d463979-3b9d-46b0-8cda-5cef791ec98c.png)
# Prediction MSE table for both examples

| Hyperparameter combination | Geological data MSE | Social data MSE |
|----------------------------|---------------------|-----------------|
| K2                         | 1014.59             | 6066.5          |
| K2 + GMM                   | 974.35              | 5149.5          |
| K2 + GMM + logit           | 1018.84             | 6657.93         |
| K2 + initial structure     | 1056.06             | 12506.47        |
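
To make the comparison easier to read, the following sketch copies the MSE values from the table and computes, for each data set, the best combination and its relative change against plain K2. The helper names are illustrative only.

```python
# MSE values copied from the table: (geological, social).
mse = {
    "K2": (1014.59, 6066.5),
    "K2 + GMM": (974.35, 5149.5),
    "K2 + GMM + logit": (1018.84, 6657.93),
    "K2 + initial structure": (1056.06, 12506.47),
}
DATASETS = ("Geological", "Social")


def best_configuration(col):
    """Name of the combination with the lowest MSE in the given column."""
    return min(mse, key=lambda name: mse[name][col])


def relative_change(name, col, baseline="K2"):
    """Signed relative MSE change of a combination against the baseline."""
    base = mse[baseline][col]
    return (mse[name][col] - base) / base


for col, dataset in enumerate(DATASETS):
    best = best_configuration(col)
    print(f"{dataset}: best = {best}, {relative_change(best, col):+.1%} vs K2")
```

On these numbers K2 + GMM has the lowest MSE on both data sets, roughly 4% below plain K2 on the geological data and roughly 15% below it on the social data.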