
Teaching Programming to Biologists

January 28, 2010

I’ve just finished delivering a pilot version of a one-day course that gives biologists a “taster” of programming.

The course came about, as is the way of these things, when I overheard a conversation in a meeting about something entirely different. The University is buying a license for a bioinformatics toolkit called CLCBio. You can write your own plugins for this tool, in Java. Somebody else at the meeting joked that they would need to teach all their postgraduate students Java. I could tell that they weren’t really joking, so I mentioned my software engineering for scientists course. I said I was thinking of writing a more fundamental course and they said they were interested.

It turns out there was a lot of interest in such a course. The biological science and vet faculties have both been thinking about providing this training to their PhD and masters students, which could be well over 100 students a year. As with the software carpentry course, we also had support from the training people, who provided the facilities.

There was quite a bit of to-ing and fro-ing about the format for the course. There are already a number of programming courses out there, some of them even for bioinformatics. The problem is that all the courses are several days long and are somewhat pedantic about introducing variables, loops, data structures and so on. I wanted something that could show some of the things programming can achieve in biology with a bit more immediacy – a show-off tour of some sort. I was also keen to keep it short because biologists don’t all need to program, and many of them might like to have a look without getting too serious.

I also wanted to provide inspiration to the small number of people who had a natural affinity. My experience is that some people see programming and just get it straight away – they even find it enjoyable. Maybe 20% of people might fall into this category.

So the format we came up with was a one-day taster course, including a rapid tour of some cool stuff. The 20% who really get it can go on and do something more thorough. Those who really hate it are only trapped for one day. This is one of the advantages of a postgraduate course like this. People choose their own research area, so they can pick up or drop programming with no harm to their careers. You can aim to inspire the top and not worry that you’re terrifying the bottom, because they can ignore it if they don’t like it.

My analogies for this tactic got more and more brutal as things went on: basically you drop all the students out of a helicopter into a high sea. Then you come back 15 minutes later and pick up all those who haven’t yet drowned.

After we had things set up, we sent out an email advertising the course (I didn’t mention the helicopter thing) and were three times oversubscribed within two days. That seemed like a good start – at least people were keen.

The course itself was based on Perl, which is still the lingua franca of bioinformatics, for better or worse (mostly worse). Perl in bioinformatics is a rant for another day. I had 4 sessions:

  1. Basic principles
  2. Sequence processing
  3. CSV file processing
  4. BioPerl

It’s a lot to pack into one day, especially since I covered loops, arrays and hashes in the first two sessions. As expected, some people found this completely overwhelming and were still struggling with even declaring variables by the end of the day. However, some really excelled, and by the end had written code to parse a CSV file, take the mean of a group of columns and print only the rows where that mean is above a certain threshold. You could see that some of them were starting to get that excitement from programming that comes from being able to tackle problems and see what can be achieved.
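To give a flavour of that last exercise, here is a minimal sketch of the sort of script I mean. It is not the actual course material: the data, the column layout (a name followed by three measurements) and the threshold are all made up for illustration, and it splits naively on commas, so it would not cope with quoted fields.

```perl
use strict;
use warnings;

# Hypothetical data: a gene name followed by three replicate measurements.
# In a real script these lines would be read from a file.
my @lines = (
    "gene,rep1,rep2,rep3",
    "geneA,2.0,4.0,6.0",
    "geneB,0.5,1.0,1.5",
    "geneC,9.0,8.0,10.0",
);

my $threshold = 3.5;    # assumed cut-off for the mean
my @kept;               # rows whose mean is above the threshold

my $header = shift @lines;    # remove the header row before processing

foreach my $line (@lines) {
    # First field is the name, the rest are the numeric columns.
    my ($name, @values) = split /,/, $line;

    my $sum = 0;
    $sum += $_ foreach @values;
    my $mean = $sum / scalar(@values);

    push @kept, $line if $mean > $threshold;
}

print "$_\n" foreach @kept;
```

With this made-up data, geneA (mean 4.0) and geneC (mean 9.0) are printed and geneB (mean 1.0) is dropped. For real files, a proper CSV parser such as the Text::CSV module from CPAN would be a safer choice than a bare split.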

There was a professor present who loved the course and is very keen to roll it out to all the postgrads we had discussed. We’ll push it through a few more pilot stages first, but it’s gratifying to know that this is somewhere I can make a useful contribution. In fact, as the plaudits rolled in, I realised just how much demand there is and have now started asking whether they could pay me a bit more! It’s a great course and I enjoyed teaching it, but I’ve had the University system take advantage of my good nature in the past and I’m not keen to repeat the experience.

We’re also thinking about how to make sure this course can be deployed to these numbers. It’s already at a stage where the project is too big for just me, so we’ll need to recruit other tutors, and preferably a University department to look after things if and when I leave. The obvious choice is computing science, but they seem very reluctant. Complaining about this is probably also a rant for another day.

So what have I learned from all this?

  1. Biologists want to learn how to program. No, really, they do.
  2. By allowing the weaker students to sink, you can get a lot done with the stronger students.
  3. You can cover a decent amount of programming in a day. No, really, you can.

Oh, and Perl still sucks. It also really sucks as a teaching language. Try explaining to people who an hour ago had never seen a variable declaration what “array variable in a scalar context” means. Sigh.
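For anyone who hasn’t met it, this is a small sketch of what that phrase refers to. The variable names are invented for the example; the point is only that the same array gives different results depending on what it is assigned to.

```perl
use strict;
use warnings;

my @bases = ('A', 'C', 'G', 'T');

# Scalar context: assigning the array to a scalar gives the number of elements.
my $count = @bases;

# List context: the parentheses make this a list assignment,
# so $first receives the first element instead.
my ($first) = @bases;

print "count=$count first=$first\n";
```

The only difference between the two assignments is a pair of parentheses, which is exactly the kind of thing that is hard to explain to someone on their first day.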