Haters Gonna Hate - why you shouldn't be ashamed of releasing your code
By Neil Chue Hong.
Another social coding scandal broke loose on the twitter-sphere - and with it the fragile trust between the programmer who open sources their code and the public who comments on it was destroyed a little. So what's going wrong, and what should we do about it?
Here's the backstory. Someone writes a small script to help them with a workflow. They find it useful and think that maybe others will find it useful too. They publish it on GitHub, others use it. So far so good.
Then the script gets lots of attention on Twitter, and not all of the good kind. As the author, Heather Arthur, notes on her blog:
"At this point, I know is that by creating this project I’ve done something very wrong. It seemed liked I’d done something fundamentally wrong, so stupid that it flabbergasts someone. So wrong that it doesn’t even need to be explained. And my code is so bad it makes people’s eyes bleed. So of course I start sobbing.
Then I see these people’s follower count, and I sob harder. I can’t help but think of potential future employers that are no longer potential. My name and avatar are part of its identity, and it’s just one step for a slightly curious person to see the idiot behind this project."
Wait just a second.
Someone has gone to the effort of making their code open and accessible, and now the very community they thought was supportive of them has turned them into a crying wreck?
This chimes with comments in discussions about reasons for open-sourcing scientific software. There is a fair amount of give and take in research, but often your reputation is the most valuable asset you have.
For instance, Joon Ro writes:
This is pretty much the reason why I haven't released code for a project which has been done. I am not that worried about the ridicule, but rather I am worried that somebody else (who works in the same field) might find a bug in my code before me and it will be an embarrassment.
You are not your code
Here's the thing we need to realise - our reputation is not just built on the quality of our published code.
In reply to Joon, Randy Olson notes:
Better to be temporarily embarrassed than have a bug in your code that could cause confounding problems for you or others who use the code in the future.
This is particularly true in science now that the emphasis on reproducible research is leading people to - shock horror - challenge published results, and where we have had examples of published work requiring retraction after bugs were found in code.
In fact, in many ways it's better for you to be embarrassed by a bug in your code (hey, a scientist can't programme!) than by a bug in your science (hey, a scientist can't do science!). And as a fellow scientific computing person, I judge you on how quickly you respond to the bug being found, not on whether there is a bug (and did you put in a test to catch that bug in the future?).
The tricky one is where you're in that mixed mode career path of being a "research software engineer", i.e. people are hiring you for your coding skills. There, reputational damage can hurt on both fronts. And yet I'd still say that it is overall positive to publish your code and have a bug found in it, because you enabled the bug to be found and therefore it can be fixed. If it's poorly written/documented code it's a bit harder because in theory you should consider anything you write (open or closed) to be a potential advert to a future employer. But then you'd never complete anything.
Jessica Kerr kindly pointed me in the direction of Scott Hanselman's comment on the subject, where he notes that the code in question was actually rather useful, didn't require idiomatic command line knowledge, and, most importantly, solved someone's problem. What I like best is his analogy of publishing personal pieces of code to be like having a garage sale:
One person's junk is another person's treasure.
"I can't believe I found code that does exactly what I needed."
"Wow, I learned a lot from that algorithm."
In any case, what are the chances of your code being called out? Impact Story have been looking at this as part of their work to look at Alt-Metrics, and believe that getting a single tweet about your GitHub repository puts you in the top 15% of repositories readvertised to other people. Just three "stars" puts you in the top 20% of repositories recommended.
It seems that the maxim that "all news is good news" holds true - it's better to have people see and ridicule your code than for no one to see it at all. And in the case of Heather Arthur, the publicity has had a positive effect - it appears that there are many more people forking her repository and creating pull requests to improve her code!
So, go ahead - release something out there today! If you feel a little ashamed of it, release it anyway, maybe under Matt Might's academic-strength CRAPL license. And if you see someone else's code and find an issue, keep that criticism constructive.
PS If you're a researcher who's still a bit embarrassed at releasing their code, perhaps participating in Software Carpentry might give you the skills and confidence to do so in the future.
Posted by NeilChueHong on Friday 25 January 2013.

I'm having a hard time reconciling the critique-everything online crowd with the world of mentoring...
Don't get me wrong, constructive criticism is very useful. A good mentor understands how to criticise in a way which is beneficial for the person they are mentoring. What I think is going wrong is that being online speeds the isolation from the originator. People leap in and comment not on the original material, but on the comments and opinions of others. This has the effect of amplification. But are you amplifying the signal or the noise?
Online amplification in many cases is a good thing - event amplification through social media can enrich and diversify an event, and allow a broader set of viewpoints. In the case of code, as long as commentators take care to form their own opinions based on their own investigations and their particular expertise, this improves the signal to noise ratio.
I disagree with the idea that as a "research software engineer" your reputation suffers if bugs are found because you open-sourced your code. First, code will always have bugs. That's just a basic fact of the business. Industry counts for bugs in software are measured in bugs per thousand lines of code, and if you're in the single digits, you're on track, and if you're < 1, you're pretty awesome. Supposedly the space shuttle went down to < 1 bug in 500 000 lines of code, but they probably weren't forced to play the publish or perish game you see in science.
I'd rather have an open-sourced piece of buggy code than a closed-source piece of buggy code, all other things being equal. If I'm desperate to use it, I'll fix the bug.
I agree with your viewpoint - the chances of your code being completely bug-free are slim. Therefore being able to see the code is better than not being able to, and being allowed to fix the code yourself (or commission someone else to do it for you) is even better, hence why open sourcing your code can improve quality.
However, in the case of the "research software engineer" although you and your peers understand this, the people hiring you (who may be professors, lab managers, scientific project managers, or administrators) may not be able to see past "they published code openly that others considered buggy". This is the attitude that we need to change.
I've recently had a similar experience where I shared a small home project I had completed using a low cost Linux board. Not having used a certain scripting language before, I was part learning/experimenting and part solving a problem. Needless to say, the method I had used to handle some data was not very elegant, but it worked. I had learnt a little about the language, and I hoped that by sharing it others would be able to use/learn/modify my script for their own use.
Everything was going well. I had a number of messages asking about various parts of my script, then I stumbled across some comments that suggested that I had got it all wrong and nobody should ever use code like that. This made me feel bad and I felt I needed to justify my code. A criticism of my code is a criticism of me! Something I had forgotten, but it is good to remember.
The value of comments can be greatly enhanced by the way they are presented, but it is important not to take things too seriously. Next time I will write a better script. I should not be judged on code I have written but on the code I am able to write.