Submitted by Marc Boullé
Data Grid (CMA)
Prior submission (prior for ada, gina and sylva, agnostic for hiva and nova)
Data Grids extend the MODL discretization and value grouping methods to the multivariate case (paper submitted to IJCNN 2007).
An ensemble of data grids is averaged according to the compression-based averaging scheme (CMA) (see the SNB(CMA) submission).
Preprocessing for sylva: the 80 soil_type binary variables are merged into two categorical variables.
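The sylva preprocessing can be illustrated with a minimal sketch. It assumes, and the entry does not confirm this, that the 80 soil_type indicators form two contiguous one-hot blocks of 40 columns each, so each block collapses into one categorical value. The helper name `merge_one_hot`, the `soil_` prefix and the column positions are hypothetical.

```python
def merge_one_hot(row, start, width, prefix):
    """Collapse `width` 0/1 indicators beginning at index `start` into one label."""
    block = row[start:start + width]
    active = [i + 1 for i, value in enumerate(block) if value == 1]
    if not active:
        return prefix + "none"  # no indicator set in this block
    # Usually exactly one indicator is set; join the indices if several are.
    return prefix + "+".join(str(i) for i in active)


# Toy row with two blocks of 3 indicators. Under the assumption above,
# the real blocks would each be 40 columns wide.
toy_row = [0, 1, 0, 0, 0, 1]
soil_a = merge_one_hot(toy_row, start=0, width=3, prefix="soil_")
soil_b = merge_one_hot(toy_row, start=3, width=3, prefix="soil_")
print(soil_a, soil_b)
```

Replacing many sparse binary columns with a few categorical ones suits a method built on value grouping, since the grouping step can then decide which soil types to pool.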
BER guess:
ada: 0.192
gina: 0.140
hiva: 0.310 (agnostic)
nova: 0.135 (agnostic)
sylva: 0.008
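For reference, the balanced error rate (BER) is the average of the error rates on the positive and the negative class. The sketch below uses that standard definition; the function name and the toy labels are illustrative and are not taken from the challenge code.

```python
def balanced_error_rate(y_true, y_pred, positive=1):
    """Average of the per-class error rates for a two-class problem."""
    pos_total = sum(1 for t in y_true if t == positive)
    neg_total = len(y_true) - pos_total
    # Positive examples predicted as negative.
    pos_errors = sum(1 for t, p in zip(y_true, y_pred)
                     if t == positive and p != positive)
    # Negative examples predicted as positive.
    neg_errors = sum(1 for t, p in zip(y_true, y_pred)
                     if t != positive and p == positive)
    return 0.5 * (pos_errors / pos_total + neg_errors / neg_total)


# Toy labels: 4 positives (1 misclassified), 2 negatives (1 misclassified).
y_true = [1, 1, 1, 1, -1, -1]
y_pred = [1, 1, 1, -1, -1, 1]
toy_ber = balanced_error_rate(y_true, y_pred)
print(toy_ber)
```

Because each class contributes equally, the BER is not dominated by the majority class on unbalanced datasets.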
Dataset | Balanced Error (Train) | Balanced Error (Valid) | Balanced Error (Test) | Area Under Curve (Train) | Area Under Curve (Valid) | Area Under Curve (Test) | Entry type |
---|---|---|---|---|---|---|---|
ada | 0.1734 | 0.1994 | 0.1756 | 0.8507 | 0.8045 | 0.8464 | prior |
gina | 0.1052 | 0.089 | 0.1254 | 0.9588 | 0.972 | 0.9479 | prior |
hiva | 0.2651 | 0.3276 | 0.3242 | 0.7659 | 0.7647 | 0.717 | agnostic |
nova | 0.1094 | 0.096 | 0.1229 | 0.9226 | 0.9338 | 0.9159 | agnostic |
sylva | 0.0205 | 0.009 | 0.0228 | 0.981 | 0.9838 | 0.9798 | prior |
Overall | 0.1347 | 0.1442 | 0.1542 | 0.8958 | 0.8917 | 0.8814 | prior |
This entry is a complete prior knowledge entry.
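As a quick arithmetic check on the figures reported above, the sketch below recomputes the Overall test balanced error as the unweighted mean of the five datasets and measures how far each BER guess is from the test result. The numbers are copied from this entry; the variable names are illustrative, and treating Overall as a plain mean is an assumption that the figures happen to support.

```python
# Test balanced error per dataset, copied from the results table.
test_ber = {"ada": 0.1756, "gina": 0.1254, "hiva": 0.3242,
            "nova": 0.1229, "sylva": 0.0228}
# BER guesses stated in the entry.
ber_guess = {"ada": 0.192, "gina": 0.140, "hiva": 0.310,
             "nova": 0.135, "sylva": 0.008}

# Unweighted mean over the five datasets.
overall_test_ber = sum(test_ber.values()) / len(test_ber)

# Absolute gap between each guess and the measured test BER.
guess_gap = {name: abs(ber_guess[name] - test_ber[name]) for name in test_ber}

print(round(overall_test_ber, 4))
for name, gap in guess_gap.items():
    print(name, round(gap, 4))
```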