"When everything fails, ask for additional domain knowledge"
is the current motto of machine learning. Assessing the real
added value of prior/domain knowledge is therefore both a deep and a practical question.
Most commercial data mining programs accept data pre-formatted as a table,
each example being encoded as a fixed set of features.
Is it worth spending time engineering elaborate features incorporating domain knowledge
and/or designing ad hoc algorithms?
Or can off-the-shelf programs, working on simple features that encode the raw
data without much domain knowledge, put skilled data analysts out of business?
In this challenge, participants may compete in two tracks:
- The “prior knowledge” track, for which they will have access to the original raw data representation and as much knowledge as possible about the data.
- The “agnostic learning” track, for which they will be forced to use a data representation encoding the raw data with dummy features.
Final results of August 1st, 2007
"Prior Knowledge" winner
Vladimir Nikulin, with vn3
Individual dataset winners
Statistics
Entrants | 75
Entries | 2736
Valid challenge entries (*) | 316
(*) A valid entry includes results for all datasets.