
Sun, 31 Aug 2008

Para_srf: Convert your SOLiD data to SRF, fast.


As you know, I released para_srf version 0.2 on GitHub some weeks ago. I have made some changes since then and have made a new version available. It doesn't bring huge changes; mainly, I added some integration tests, which are very valuable, by the way.

I am going to release another version soon that will fix a small issue Jingwei from ABi found. The current version splits the input data into smaller chunks. By doing that, we may end up losing some pairing information, since not all the reads have a pair. To ensure the SRF converter has the information for all the pairs, we have to perform the split based on the panels. For example: the first split takes panels 1 to 20, the second split takes panels 21 to 40, and so on.
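To make the idea concrete, here is a minimal Python sketch of a panel-based split. It is not the para_srf code; the function names are hypothetical, and it assumes read ids of the form panel_x_y_tag (for example 427_67_118_F3), so that both reads of a pair share the same panel number.

```python
from collections import defaultdict

def panel_of(read_id):
    # Assumption: ids look like "427_67_118_F3" (panel_x_y_tag),
    # optionally prefixed with ">" as in a csfasta header line.
    return int(read_id.lstrip(">").split("_")[0])

def chunk_for_panel(panel, panels_per_chunk=20):
    # Panels 1..20 go to chunk 0, panels 21..40 to chunk 1, and so on.
    return (panel - 1) // panels_per_chunk

def split_by_panels(read_ids, panels_per_chunk=20):
    # Group read ids by chunk; input order is preserved inside each chunk.
    chunks = defaultdict(list)
    for read_id in read_ids:
        chunk = chunk_for_panel(panel_of(read_id), panels_per_chunk)
        chunks[chunk].append(read_id)
    return dict(chunks)
```

Because the chunk is derived only from the panel number, an F3 read and its R3 mate can never land in different chunks, whether or not every read actually has a mate.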

posted at: 21:17 | path: /programming | permanent link to this entry

Using a Mac Pro for bioinformatics research


Our group decided four months ago to get a Mac Pro to help us in our research. The idea was to have a Unix box with some decent disk and RAM to run our bioinformatics code. As a matter of fact, we ended up using it in our SOLiD ABi pipeline. The machine has run most of our offline analysis for human DNA samples.

The machine has been serving pretty well for its main task. Now that our LSF cluster is starting to behave a little better, we can use the spare cycles for other jobs. A colleague contacted me asking for access to our machine. His group needed to run some Perl code that was supposed to use more than 4 GB of RAM. The first thing we checked was whether our perl distribution (MacPorts) was 64-bit. It wasn't. I must admit I had never even cared until then, but I had always thought the binary would be 64-bit. It was not:

drio@arad /hgsc/solid/corona_lite/bin $ otool -tv ./mapreads 
./mapreads:
(__TEXT,__text) section
start:
00002468        pushl   $0x00
__SNIP___
00002495        movl    %ebx,0x0c(%esp)
00002499        calll   0x00004cf2
0000249e        movl    %eax,0x00(%esp)
000024a2        calll   0x00101081
__SNIP__
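The disassembly above uses 32-bit registers (%eax, %ebx, %esp), which is what gives the binary away. A quicker check is to look at the first four bytes of the file, since the Mach-O magic number differs between 32-bit and 64-bit binaries. The following is only a rough Python sketch with hypothetical function names; it reports universal (fat) files as such without looking at the architectures inside them.

```python
# Mach-O magic numbers as raw bytes on disk, in both byte orders.
MACHO_MAGICS = {
    b"\xfe\xed\xfa\xce": "32-bit",     # MH_MAGIC, big-endian file
    b"\xce\xfa\xed\xfe": "32-bit",     # MH_MAGIC, little-endian file
    b"\xfe\xed\xfa\xcf": "64-bit",     # MH_MAGIC_64, big-endian file
    b"\xcf\xfa\xed\xfe": "64-bit",     # MH_MAGIC_64, little-endian file
    b"\xca\xfe\xba\xbe": "universal",  # FAT_MAGIC, several architectures
}

def macho_word_size(header):
    # Classify a binary from the first bytes of its contents.
    return MACHO_MAGICS.get(header[:4], "unknown")

def macho_word_size_of_file(path):
    # Read just the magic number from the file at the given path.
    with open(path, "rb") as handle:
        return macho_word_size(handle.read(4))
```

For instance, macho_word_size_of_file("./mapreads") would be expected to report a 32-bit binary for the file disassembled above.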

My first thought was: oh well, let's check the MacPorts framework; I am sure they have some flag to recompile in 64-bit mode. There isn't one. I even asked in the IRC channel. The next logical move was to try to compile everything from scratch, but the effort would be too much: I would have had to recompile not only perl but also all its dependencies.

All this made me reconsider whether a Mac Pro is an ideal machine for this kind of task. The machine has already paid off its cost with all the SOLiD analysis we have done on it. But now, if we want to run some RAM-intensive code, we are a little bit screwed. A Dell/HP/etc. Linux box would be perfect here (check my other posts, we got one). Another issue is the noise. Initially we got the Mac Pro because it was very powerful yet very quiet. There is nothing like that in the PC market. But, well, who cares: I can put the machine in the lab, in front of the sequencers. Noise is not a problem there.

posted at: 20:58 | path: /apple | permanent link to this entry