Well, people who operate in secrecy and say that they are "enlightening" others deserve this...

If the work is so good, so reliable, why not make it FULLY PUBLICLY AVAILABLE? Wasn't it paid for with public funds?

I mean, get the source code out for public inspection. Parameters, fitting, etc...

Let's shed some light on the inner workings of the models.

I am just advocating for public scrutiny of what they do.

Note that in many cases these projects even fail peer scrutiny, as the code and procedures are not made available for peer review. And even if they were, it would be impossible for a couple or half a dozen reviewers to understand the model implementation. So full openness would allow complete peer review (by other peers, not only journal reviewers) and public scrutiny.

Who is afraid of that?

by t-------------- on Sun Nov 22nd, 2009 at 07:01:58 AM EST
[ Parent ]
What do you imagine to be "the Data"? Do you have the slightest idea how much there would be, and the enormity of the task of even assembling it?

As for model software, have you ever waded through a significant chunk of code? Are you going to verify the operating systems they run on as well - and if not, why not? How about the instrumentation in satellites launched 30 years ago? The firmware that controlled them?

And what are the scientists supposed to do for the next 50 years while a host of hostile ignoramuses trundles through their work, making insistent non-stop demands for detailed explanations and historical trails while refusing to learn a god-damned thing?

The request is absurd on its face.

by PIGL (stevec@boreal.gmail@com) on Sun Nov 22nd, 2009 at 10:34:03 AM EST
[ Parent ]
What about making code and data accessible for other academics?
by Nomad on Sun Nov 22nd, 2009 at 11:00:34 AM EST
[ Parent ]
And if it is made available, why should you have to have an institutional affiliation to download it?

En un viejo país ineficiente, algo así como España entre dos guerras civiles, poseer una casa y poca hacienda y memoria ninguna. -- Gil de Biedma
by Migeru (migeru at eurotrib dot com) on Sun Nov 22nd, 2009 at 11:07:52 AM EST
[ Parent ]
In principle, of course, by all means...make everything available. In practice, it is a huge undertaking, for which time and resources are mostly not available.
by PIGL (stevec@boreal.gmail@com) on Sun Nov 22nd, 2009 at 11:23:11 AM EST
[ Parent ]
Then simply make it a requirement for every institution to set up a public fileserver with some standard file structure, and upload all new source code and data that a reviewer would need to see in order to review a paper. Old data that is "known in the community" wouldn't need to be put there.

Then arrange a permissions structure so that it is initially released only to reviewers, and released to everybody upon publication.
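The permissions part of that scheme is small enough to state in a few lines. A minimal sketch in Python - every name here is invented for illustration, not any repository's actual API:

```python
# Hedged sketch of the staged-release rule: artefacts are visible only to
# reviewers before publication, and to everybody afterwards.

def access_level(published: bool, is_reviewer: bool) -> str:
    """Decide who may download a paper's code-and-data bundle."""
    if published:
        return "public"         # released to everybody upon publication
    if is_reviewer:
        return "reviewer-only"  # pre-publication access for referees only
    return "denied"
```

A real system would of course key this on authenticated accounts rather than a boolean, but the policy itself really is that small.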

That would only take care of things going forward, but once the structure is established it might be possible to secure outside funding for porting "back issues."

The record will, of course, be incomplete - some data will have been lost, some code will have been modified beyond retrieval. But it's a step in the right direction.

- Jake

Friends come and go. Enemies accumulate.

by JakeS (JangoSierra 'at' gmail 'dot' com) on Sun Nov 22nd, 2009 at 11:31:02 AM EST
[ Parent ]
Nice 1st step. Especially useful in high-visibility areas like this one.
Note that in less visible areas the number of people interested in doing a review would most probably be small. But it would be good nonetheless. Very good.

2nd step: Do an evaluation of the past performance of predictive quantitative science. While some predictions are for the future, some can already be verified:

  1. Predictions on the spread of drug-resistant malaria. Did they pan out?
  2. Foot and Mouth?
  3. Flu?
  4. Economics?

3rd step: What about quantitative versus qualitative? Here I am thinking of quantitative finance and the like, versus more qualitative approaches (think Nouriel Roubini and such).
You see, most of the proposals of Eurotrib's resident economists are very "unscientific": in stark opposition to the top journals in the field, and not using the "rigorous" quantitative methods (game theory et al.) either.
In fact eurotribers are qualitative neo-liberal denialists. How unscientific!!!! ;)

4th step: Re-assessment of previous publications by scientists who are not peers. An example: in malaria, a lot of maths is used for modelling. Other people who use maths as a tool (but do not work in malaria) could read the maths and give an opinion on it. It is very difficult for peers to point out errors post-publication... without creating enemies.

I can dream.

In the meantime, things like this "email issue" will probably keep happening, putting the credibility of current scientists where it deserves to be.

by t-------------- on Sun Nov 22nd, 2009 at 12:19:41 PM EST
[ Parent ]
The current availability and relative cheapness of hardware leaves no more excuses for a lack of transparency in data usage or computer code - in any field of science. The only remaining hump is a cultural change in how science is done.

The current trend, however, is for science journals to make full transparency a prerequisite for publication, with the data then only available through the journals' own access systems - further increasing their dominance over scientific publishing.

Smart institutions hopefully will move ahead with structures like the one you propose.

by Nomad on Sun Nov 22nd, 2009 at 12:27:52 PM EST
[ Parent ]
I'd agree that science needs an internal shake-up. But keep in mind the pressures are external, and at least partly the fault of the Hobbesian values that also support neo-classical economics.

Ultra-competition, style over substance, and egotistic posturing over actual discovery are not inherent to science. They're certainly features of academia, but I'm not convinced their effects can't be minimised to the point where they're no longer a key driver of the culture.

As for peer review and data sharing - from the climate denialist point of view, this is missing the point. Even if the scientific community agreed consistently, peer reviewed all models, shared data religiously, and created a clear consensus, the denialists would find one tenured kook and plaster them all over the front pages and the wacko blogs to 'disprove' the scientists.

This is not about evidence or honesty, it's about story-telling and persuasion.

There are certainly things scientists could do, but in terms of political rather than scientific effectiveness, improved transparency comes pretty low on the list.

by ThatBritGuy (thatbritguy (at) googlemail.com) on Sun Nov 22nd, 2009 at 12:46:14 PM EST
[ Parent ]
Another, and in my opinion fairly major, advantage of a "must release upon publication" doctrine would be that it would prevent "paper chop shops" where something that intellectually and research-wise could and should and would have been a single, coherent paper is chopped into half a dozen bits and pieces and sent to as many different journals in order to maximise impact factor.

If you have to release your data after the "preliminary investigation report", there'd be an incentive to delay publishing until you have a paper that you think will actually be cited by anybody outside your own department and close friends.

- Jake

Friends come and go. Enemies accumulate.

by JakeS (JangoSierra 'at' gmail 'dot' com) on Mon Nov 23rd, 2009 at 02:58:19 PM EST
[ Parent ]
While we're on the subject of side benefits, it would also provide substantial insurance against universities being co-opted by corporate interests.

If universities have to make all data completely public, corporate attempts to hide, fabricate or spin results would be in direct conflict with the prestige of the participating scientists. Which is a rather more compelling incentive to refrain from participating in a project than vague concerns about academic ethics.

- Jake

Friends come and go. Enemies accumulate.

by JakeS (JangoSierra 'at' gmail 'dot' com) on Mon Nov 23rd, 2009 at 03:38:00 PM EST
[ Parent ]
The problem with doing it bottom-up is that there's a huge free-rider problem.

So you really have to have a group of Big Dicks who have both enough prestige to demand transparency and enough suitably inventive and painful punishments for the people who fail to comply with that demand.

This is an area where the European Union could do a lot of good. If the EU were to demand that all publicly funded research must be published in journals that demand full disclosure (to the general public, not just to the journal), the ripples would be felt worldwide.

- Jake

Friends come and go. Enemies accumulate.

by JakeS (JangoSierra 'at' gmail 'dot' com) on Mon Nov 23rd, 2009 at 03:31:53 PM EST
[ Parent ]
I do have the slightest idea. Probably more than anyone else here. In fact I am one of the coders of what is probably the biggest epidemiology project in existence. Which happens to be open source, BTW. So, not all is bad.

I also know some people involved in climate modeling software development.


But even if I did not know anything, I could make the following assertion:
Any political decision in an open society which is supported by a technical and scientific process should make that process open to the general public.

In this case it is possible. At least for some of the models that are used, I am pretty sure it is.

And, personally, I could not care less that you are a "senior scientist". Please present rational arguments, not arguments from authority. I know what I am talking about from proven first-hand experience - and you, why should we trust you?

by t-------------- on Sun Nov 22nd, 2009 at 12:51:01 PM EST
[ Parent ]
I am not a senior scientist, nor did I intend to imply that I was. I think it was fairly clear that I was referring above to that famous climatologist, not to myself.

Like yourself, I have considerable experience in working with, and writing, fairly sizable pieces of computer simulation software in a variety of languages. None of what I have worked on is remotely comparable in complexity to a major climate model, of which I have merely used the outputs... which was plenty of work on its own.

The very thought of making my data and model available in any usable form makes me quiver. I have tried to do this once or twice, and it takes a huge amount of effort. I just can't see making such availability a requirement for all scientific working groups. Their actual research productivity would grind to a halt.

At least, that is my opinion, based on my experience, which may well be less than yours.

by PIGL (stevec@boreal.gmail@com) on Sun Nov 22nd, 2009 at 01:44:38 PM EST
[ Parent ]
Sorry for my heated response.
But I stand by the substance of it (though the form was a bit rough).
by t-------------- on Sun Nov 22nd, 2009 at 01:49:36 PM EST
[ Parent ]
I think most of the problem is that people don't really know how to do data management (such things are not taught, and maybe should be), and it seems difficult. A pain it surely is.

I am in the final stages of preparing a paper and am, for the first time, undecided whether I am going to make the software available (the data I won't, as it can be regenerated from the software without much computing power). Just bundling the software is a major pain, and I am pretty sure no one will care to repeat my work, so I will probably skip it this time. If I submit to PLoS Comp Biol they will probably force me, but other journals, I very much doubt it.

They should force me.
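For what it's worth, the mechanical part of the bundling can be fairly small. A hedged sketch in Python - paths and file names are made up for the example, and this is not any journal's actual submission tooling - that packs the sources into a tarball and records a checksum so a reviewer can verify the download:

```python
import hashlib
import tarfile
from pathlib import Path

def bundle_sources(src_dir: str, out_name: str) -> str:
    """Pack src_dir into out_name.tar.gz and return its SHA-256 digest."""
    archive = f"{out_name}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # Add entries one by one (sorted, non-recursive) for a stable layout.
        for path in sorted(Path(src_dir).rglob("*")):
            tar.add(path, arcname=path.relative_to(src_dir), recursive=False)
    digest = hashlib.sha256(Path(archive).read_bytes()).hexdigest()
    # Write the checksum next to the archive, shasum-style.
    Path(f"{out_name}.sha256").write_text(f"{digest}  {archive}\n")
    return digest
```

The hard part - documenting the code and data well enough that anyone else can use them - is of course where the real pain lies.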

by t-------------- on Sun Nov 22nd, 2009 at 02:01:47 PM EST
[ Parent ]
So, to give a calmer response and address your points one by one:

  1. Regarding the data: for example, climateprediction.net uses BOINC (the binaries are available - how difficult would it be to make the source available too?). The amount and format of the data are easy to determine, and the data are most probably stored in a database. At least the data generated on user computers should be easy to make public (it was generated on user computers in a known format; it can even be inspected at checkpointing). Large quantities, I would imagine.

  2. Regarding my experience with the code of predictive software: I would be a candidate for having the most experience on the planet with a subject like this - sounds like too much? I single-handedly converted an epidemiological simulator (one of the biggest in existence) from Intel Fortran to GNU C. This I can prove.

I would imagine lots of people in climate prediction are using old code (building on top of it) which they cannot convert and really don't know what the code does (as it was written in the 70s by people who are DEAD and left no documentation).

So yes, I know very much well what I am talking about.
And I could talk hours and hours and hours about this.

Not so much about climate modelling (still, I know a few things), but about predictive science in general.

by t-------------- on Sun Nov 22nd, 2009 at 01:21:13 PM EST
[ Parent ]
I would imagine lots of people in climate prediction are using old code (building on top of it) which they cannot convert and really don't know what the code does (as it was written in the 70s by people who are DEAD and left no documentation)

Considering the kind and quality of physicist code (and - in particular - the documentation of said code), this is not only probable - it is a virtual certainty (you should excuse the pun).

Ideally, you'd want a computer code cleanup staff on permanent retainer at all major universities. But that would not be cheap.

- Jake

Friends come and go. Enemies accumulate.

by JakeS (JangoSierra 'at' gmail 'dot' com) on Mon Nov 23rd, 2009 at 03:06:59 PM EST
[ Parent ]

