Tensorial-Professor Anima on AI

Reproducibility Debate: Should Code Release be Compulsory for Conference Publications?

Update: Added discussions at the end based on Twitter conversations.

Yesterday, I was on the debate team at the DALI conference in gorgeous George, South Africa. The topic was:

“DALI believes it is justified for industry researchers not to release code for reproducibility because of proprietary code and dependencies.”

I was opposing the motion, and this matched my personal beliefs. I am happy to talk about my own stance, but I cannot disclose the arguments of others, since the debate was off the record (and their arguments were not necessarily their own personal opinions).

Edit: Uri Shalit and I formed the team opposing the motion. I checked with him to make sure he is fine with me mentioning it. We collaboratively came up with the points below.

This topic is timely since ICML 2019 has added reproducibility as one of the factors to be considered by reviewers. When the idea first came up, it seemed natural to set standards for reproducibility, the same way we set standards for publication at our top-tier conferences. However, I was disheartened to see vocal opposition, especially from many “big-name” industry researchers. With that background, DALI decided to focus the reproducibility debate on industry researchers.

My main reasons for opposing the motion:

Countering the arguments that support the motion:

Update from Twitter conversations

There was enthusiastic participation on Twitter. A summary is below:


Useful tools for reproducibility:
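One basic practice that comes up repeatedly is pinning every source of randomness so a run can be repeated exactly. A minimal sketch in Python (assuming NumPy and PyTorch; other frameworks have analogous calls, and this is an illustration rather than a complete recipe):

```python
# Minimal sketch: pin all common sources of randomness so a run repeats.
# Assumes NumPy and PyTorch are installed; adapt to your own framework.
import random

import numpy as np
import torch

def set_seed(seed: int = 42) -> None:
    """Fix the seeds of the Python, NumPy, and PyTorch RNGs."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)  # seeds CPU and CUDA generators
    # Make cuDNN kernels deterministic (may slow down training).
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

set_seed(42)
```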

Lessons from other communities: 

It is not just about code, but also data, replication, etc.:

Disagreements: 

I assume that the tweet above does not represent the official position of DeepMind, but I am not surprised.

I do not agree with the premise that it is a worthwhile exercise for others to reinvent the wheel, only to find out it is just vaporware. That is unfair to academia, and unfair to graduate students whose careers depend on building on published results.

I also find it ironic that the comment states that if an algorithm is so brittle to hyperparameters, we should not trust its results. YES! That describes the majority of hyped-up deep RL results (and we know who the main culprit is).
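To make that brittleness concrete, here is a hedged sketch of the kind of check reviewers could ask for: rerun the same training routine across several random seeds and report the spread, not a single best run (`train_and_evaluate` is a hypothetical stand-in that simulates a noisy outcome, not anyone's actual code):

```python
# Hedged sketch: report results across random seeds instead of a single
# cherry-picked run. `train_and_evaluate` is a hypothetical placeholder.
import random
import statistics

def train_and_evaluate(seed: int) -> float:
    # Stands in for a full training run; simulates a noisy scalar score
    # (e.g., mean episodic return) to keep the sketch self-contained.
    rng = random.Random(seed)
    return 100.0 + rng.gauss(0.0, 15.0)  # large seed-to-seed variance

scores = [train_and_evaluate(seed) for seed in range(10)]
print(f"mean = {statistics.mean(scores):.1f} "
      f"± {statistics.stdev(scores):.1f} over {len(scores)} seeds")
```

If the standard deviation dwarfs the claimed improvement over a baseline, the result should not be trusted without further evidence.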

What happens behind closed doors: Even though there is overwhelming public support, I know that such efforts get thwarted in the committee meetings of popular conferences like ICML and NeurIPS. We need to apply more pressure to get better accountability.

It is time to burst the bubble on hyped-up AI vaporware with no supporting evidence. Let the true science begin!