The challenge is now over. But it remains open for post-challenge submissions!
IMPORTANT: Entries made since February 1st 2007 might be using validation data, now available for training.
How to format and ship results
Results File Formats
The results on each dataset should be formatted in ASCII files according to
the following table. If you are a Matlab user, you may find some of the sample
code routines useful for formatting the data (for CLOP users, the sample code is part of CLOP). You can view an example of
each format from the filename column. CLOP users can optionally include their model
in the submission.
|File contents|Format|
|---|---|
|Classifier outputs for training examples (.resu)|+/-1 indicating the class prediction.|
|Classifier outputs for validation examples (.resu)|Same format as above.|
|Classifier outputs for test examples (.resu)|Same format as above.|
|Classifier confidence for training examples (.conf)+|Non-negative real numbers indicating the confidence in the classification (larger values indicating higher confidence). They do not need to be probabilities, and can simply be the absolute values of discriminant values. Optionally they can be normalized between 0 and 1 to be interpreted as abs(P(y=1|x)-P(y=-1|x)).|
|Classifier confidence for validation examples (.conf)+|Same format as above.|
|Classifier confidence for test examples (.conf)+|Same format as above.|
|The trained CLOP model used to compute the submitted results (.mat)|A Matlab learning object saved with the command save_model([dataname '_model'], your_model, 1, 1).*|
+ If no confidence file is supplied, equal
confidence will be assumed for each classification. If confidences are not between 0 and 1, they will be divided by their maximum value.
* Setting the last two arguments to 1 forces overwriting models with the same name and saving only the hyperparameters of the model,
not the parameters resulting from training. Because there is a limit on the size of the archive you can upload,
you will need to set the last argument to 1.
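In practice, each results and confidence file is just one ASCII value per line, one line per example. The following sketch (in Python rather than Matlab; the dataset name "foo" and the score values are placeholders, not part of the challenge) writes a .resu file of +/-1 predictions and a .conf file of confidences normalized between 0 and 1:

```python
# Sketch: write one dataset's test predictions (.resu) and confidences (.conf).
# "foo" is a placeholder dataset name; the scores are made-up discriminant values.
predictions = [1, -1, -1, 1]        # +/-1 class predictions, one example per line
scores = [2.5, -0.5, -3.0, 1.0]     # raw (signed) discriminant values

confidences = [abs(s) for s in scores]              # non-negative confidences
max_conf = max(confidences)
confidences = [c / max_conf for c in confidences]   # optional [0, 1] normalization

with open("foo_test.resu", "w") as f:
    f.write("\n".join(str(p) for p in predictions) + "\n")

with open("foo_test.conf", "w") as f:
    f.write("\n".join(f"{c:.6f}" for c in confidences) + "\n")
```

Writing the files flat in the working directory keeps the archive free of leading directory names.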
Results Archive Format
Submitted files must be in either .zip or .tar.gz archive format. You can
download the example zip archive
to help familiarise yourself with the archive structure and contents
(the results were generated with the sample
code). Submitted files must use exactly the same
filenames as in the example archive. If you use tar.gz archives, please do not
include any leading directory names for the files. Use either
zip results.zip *.resu *.conf *.mat
or
tar cvf results.tar *.resu *.conf *.mat; gzip results.tar
Synopsis of the competition rules
- Anonymity: All entrants must identify themselves with a valid email address, which will
be used for the sole purpose of communicating with them. Email addresses will not appear
on the results page. Name aliases are permitted for development entries to preserve anonymity.
For all final submissions, entrants must identify themselves by their real names.
- Data: There are 5 datasets in 2 different data representations: "Agnos" for the agnostic learning
track and "Prior" for the prior knowledge track. The validation set labels will be revealed one month before the end of the challenge.
- Models: Using CLOP models or other Spider objects is optional, except for
entering the model selection game, for
the NIPS workshop
on Multi-Level Inference (deadline December 1st, 2006).
- Deadline: Originally, results had to be submitted before March 1, 2007, and complete entries in the "agnostic learning track" made before December 1, 2006
counted towards the model selection game. The submission deadline has now been extended to August 1, 2007.
The milestone results (December 1 and March 1) are available from the workshop pages; see the IJCNN workshop page for the latest results.
- Submissions: If you wish your method to be ranked in
the overall table, you should include classification results on ALL five datasets,
but this is mandatory only for final submissions. You may make mixed submissions,
with results on "Agnos" data for some datasets and on "Prior" data for others.
A submission may count towards the "agnostic learning track" competition only if ALL five
dataset results are on "Agnos" data. Only submissions on "Agnos" data may count towards the model selection game. Your last 5 valid submissions in either track
count towards the final ranking. Submissions are made via the form on the submissions page. Please limit yourself to at most 5 submissions
per day. If you encounter problems with submission, please contact the Challenge Webmaster.
- Track determination: An entry containing results using "Agnos" data only for all 5 datasets may qualify
for the agnostic track ranking. We will ask you to fill out a fact sheet about your methods to determine whether
they are indeed agnostic. All other entries will be ranked in the prior knowledge track, including entries
mixing "Prior" and "Agnos" representations.
- Reproducibility: Participation is not conditioned on delivering your code or publishing
your methods. However, we will ask the top-ranking participants to cooperate voluntarily in reproducing their
results. This will include filling out a fact sheet about their methods and possibly sending us their
code, including the source code. If you use CLOP, saving your models will facilitate this process.
The outcome of our attempt to reproduce your results will be published and will add credibility to your results.
- Ranking: The entry ranking will be based on the test BER rank (see the evaluation page).
- Prizes: There will be a prize for the best "Agnos" overall entry and five prizes
for the best "Prior" entries, one for each dataset. [Note that for our submission book-keeping, valid final submissions
must contain answers on all five datasets, even if you compete in the "Prior" track for a particular dataset prize. This
is easy: you can use the sample submission
to fill in results on datasets you do not want to work on.]
- Cheating: Everything not explicitly forbidden is allowed. We forbid: (1) reverse-engineering
the "Agnos" datasets to gain prior knowledge and then submitting results in the agnostic learning track without disclosing what you did in the fact sheet; (2) using the original datasets
from which the challenge datasets were extracted to gain an advantage (this includes, but is not limited to,
training on test data and selecting models by cross-validation using all the data).
Top-ranking participants will be under scrutiny, and failure to reproduce their results will shed doubt
on their integrity and potentially harm their reputation.
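The ranking criterion mentioned above is the test BER. As a reference, here is a minimal sketch of the standard balanced error rate (the average of the per-class error rates); consult the evaluation page for the challenge's exact definition:

```python
def ber(y_true, y_pred):
    """Balanced Error Rate: average of the error rates on each class (+1 and -1)."""
    counts = {1: [0, 0], -1: [0, 0]}   # per class: [mistakes, total]
    for t, p in zip(y_true, y_pred):
        counts[t][0] += (p != t)
        counts[t][1] += 1
    return 0.5 * (counts[1][0] / counts[1][1] + counts[-1][0] / counts[-1][1])
```

For example, misclassifying one of two positive examples while getting both negatives right gives a BER of 0.25, even though the plain error rate is also 0.25; the two measures differ when the classes are unbalanced.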