"When everything fails, ask for additional domain knowledge"
is the current motto of machine learning. Assessing the real
added value of prior/domain knowledge is therefore both a deep and a practical question.
Most commercial data mining programs accept data pre-formatted as a table,
each example being encoded as a fixed set of features.
Is it worth spending time engineering elaborate features incorporating domain knowledge
and/or designing ad hoc algorithms?
Or can off-the-shelf programs, working on simple features that encode the raw
data without much domain knowledge, put skilled data analysts out of business?
In this challenge, the participants are allowed to compete in two tracks:
- The “prior knowledge” track, for which they will have access to the original raw data representation and as much knowledge as possible about the data.
- The “agnostic learning” track, for which they will be forced to use a data representation encoding the raw data with dummy features.
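The difference between the two representations can be illustrated with a minimal sketch. The field names, values, and the one-indicator-per-category encoding below are assumptions made for illustration only; the actual preprocessing applied to the challenge datasets is not described here.

```python
# Hypothetical raw records, as a "prior knowledge" participant might see them:
# named fields with meaningful values (all names and values are invented).
raw_records = [
    {"age": 39, "occupation": "clerk", "label": 1},
    {"age": 52, "occupation": "farmer", "label": -1},
    {"age": 28, "occupation": "clerk", "label": -1},
]


def encode_agnostic(records, numeric_fields, categorical_fields):
    """Encode records as fixed-length numeric rows with anonymous columns.

    Assumption: categorical fields become 0/1 indicator ("dummy") columns,
    one per observed category value. This is one plausible encoding, not
    necessarily the one used by the challenge organizers.
    """
    # Sorted list of the values observed for each categorical field.
    categories = {
        f: sorted({r[f] for r in records}) for f in categorical_fields
    }
    rows = []
    for r in records:
        # Numeric fields are copied as floats.
        row = [float(r[f]) for f in numeric_fields]
        for f in categorical_fields:
            # One indicator column per category value.
            row.extend(1.0 if r[f] == v else 0.0 for v in categories[f])
        rows.append(row)
    return rows


# Table form accepted by a generic learner: each example is a fixed set of
# features, and the column meanings are no longer visible.
X = encode_agnostic(raw_records, ["age"], ["occupation"])
y = [r["label"] for r in raw_records]
```

In this sketch, an "agnostic learning" participant would only receive `X` and `y`, while a "prior knowledge" participant would also see `raw_records` and could design features from the field semantics.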
Final results of August 1st, 2007
"Prior Knowledge" winner
Vladimir Nikulin, with vn3
Individual dataset winners