
Tue, 16 Sep 2008

Solid to SRF package from ABi in macosx


ABi has recently released a binary package that converts SOLiD data into SRF (short read format). As usual, the package was developed and tested only on Linux, and it doesn't compile on macosx out of the box. Fortunately we have the source code.

Here is the patch to make it work on osx:

From d8d85a3d4413be0c3523dbe0e977208cac3dca0e Mon Sep 17 00:00:00 2001
From: drio 
Date: Tue, 16 Sep 2008 13:06:32 -0500
Subject: [PATCH] Makes it compile on osx

---
 solid2srf-0.7.3/SRF/base/SRF_util.hh |    2 +-
 solid2srf-0.7.3/ZTR/src/ZTR_util.hh  |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/solid2srf-0.7.3/SRF/base/SRF_util.hh b/solid2srf-0.7.3/SRF/base/SRF_util.hh
index 1530700..b832065 100644
--- a/solid2srf-0.7.3/SRF/base/SRF_util.hh
+++ b/solid2srf-0.7.3/SRF/base/SRF_util.hh
@@ -33,7 +33,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 

 ADOPT char* SRF_cStrToPascalStr( const char* str );
diff --git a/solid2srf-0.7.3/ZTR/src/ZTR_util.hh b/solid2srf-0.7.3/ZTR/src/ZTR_util.hh
index 08199c3..5f48f16 100644
--- a/solid2srf-0.7.3/ZTR/src/ZTR_util.hh
+++ b/solid2srf-0.7.3/ZTR/src/ZTR_util.hh
@@ -13,7 +13,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 

 // put in a global include
-- 
1.5.6.5

As you can see, the patch is trivial: a couple of includes have to change on macosx.

posted at: 13:10 | path: /bioinformatics | permanent link to this entry

Tue, 13 May 2008

MD3000


I was in the market for an external storage array. I needed around 10 TB of space to run Illumina's offline analysis. Our current LSF cluster is not stable enough and its resources are very limited.

I ordered a Dell MD3000 and an AMD machine (2 CPUs x 4 cores) with a SASe5 RAID controller to hook up the MD3k.

My initial idea was to set up a RAID 6 11 TB virtual disk and format it with XFS. By the way, I was using CentOS since Dell supports Red Hat for that hardware.

After reading some basic documentation I hooked the MD3k to the server (I ended up calling it milhouse). I wasn't sure how the server would send commands to the md3k to configure the system. I set up a vtrak storage array two months ago, and there the configuration was done over ethernet. This array also allows that, but if your server is physically connected to the md3k, that connection is all you need.

The software installation wasn't very complicated. The CD comes with some rpms that install device drivers and some tools to control the array. I decided to start with the GUI, a Java tool.

When I started it I thought, well, this is pretty typical. But then I hit an issue: I couldn't configure volumes bigger than 2 TB. I spent a lot of time searching the documentation. The vtrak had allowed me to configure a RAID 6 11 TB volume without problems.

Tired of searching, I decided to lean on Linux once again, specifically on LVM. I created 6 volumes in the array and then exposed them to Linux. Once that was done I set up an LVM volume to join them, and finally I created the actual disk.

drio@milhouse ~/tmp $ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                      128G  5.3G  116G   5% /
/dev/sda3              99M   28M   66M  30% /boot
tmpfs                  16G     0   16G   0% /dev/shm
/dev/mapper/md3000-slx
                       11T  174G   11T   2% /mnt/slx
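
For reference, here is a rough sketch of what the LVM steps described above might look like. It only prints the commands (a dry run), it doesn't execute anything. The volume group name (md3000), logical volume name (slx) and mount point (/mnt/slx) come from the df output; the device names /dev/sdb to /dev/sdg are hypothetical, and XFS is assumed only because it was the original plan. The `lvm_plan` helper is made up for this illustration.

```ruby
# Dry-run sketch: build (but do not execute) the LVM commands that would
# join several array volumes into one logical volume.
# lvm_plan is a hypothetical helper written for this example.
def lvm_plan(devices, vg:, lv:, mount_point:)
  dev = "/dev/#{vg}/#{lv}"
  [
    ["pvcreate", *devices],                        # mark each volume as an LVM physical volume
    ["vgcreate", vg, *devices],                    # pool them into one volume group
    ["lvcreate", "-l", "100%FREE", "-n", lv, vg],  # one logical volume using all the space
    ["mkfs.xfs", dev],                             # assumption: XFS, as in the original plan
    ["mount", dev, mount_point],
  ]
end

# Hypothetical device names for the six volumes exposed by the array.
devices = ("b".."g").map { |letter| "/dev/sd#{letter}" }
plan = lvm_plan(devices, vg: "md3000", lv: "slx", mount_point: "/mnt/slx")
plan.each { |cmd| puts cmd.join(" ") }
```

Check the real device names (for example in /proc/partitions) before running anything like this as root, since pvcreate and mkfs.xfs are destructive.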

After some more research I found here (http://blogs.smugmug.com/don/2007/10/01/dell-md3000-great-das-db-storage/) that the Dell MD3000 is a re-branded LSI/Engenio array. That same blog entry has a link to the LSI/Engenio documentation, which is much better than the Dell one.

posted at: 09:40 | path: /bioinformatics | permanent link to this entry

Tue, 26 Feb 2008

How to split a fasta file the ruby way


This is the ruby way to split a fasta file. Pure poetry.

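#!/usr/bin/env ruby
# Split a multi-fasta file into one file per chromosome.
# Usage: pass the fasta file as the first argument.

sample = "indian_macaque_"

# A record is a ">Chr..." header line followed by sequence lines.
# Only uppercase A, C, T and G are matched, so a record is cut at the
# first other character (an N, for example).
File.read(ARGV[0]).scan(/^>Chr[0-9A-Za-z]+\n[ACTG\n]+/) do |chrm|
  name = chrm[/>(Chr[0-9A-Za-z]+)/, 1]
  # The block form of File.open closes (and flushes) each output file.
  File.open(sample + name, "w") { |out| out.puts(chrm) }
end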
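
The one-liner above reads the whole file into memory and only picks up records whose header starts with ">Chr". For bigger genomes, a streaming variant might look like the sketch below. It is less poetic, and `split_fasta` is a hypothetical helper written for this example. It assumes the first whitespace-delimited token of each header is safe to use as a file name.

```ruby
# Hypothetical helper: stream a fasta file line by line and write each
# record to "<prefix><record name>" inside out_dir.
# Returns the list of files written, in input order.
def split_fasta(path, prefix, out_dir = ".")
  written = []
  out = nil
  File.foreach(path) do |line|
    if line.start_with?(">")
      out&.close
      # Record name = first whitespace-delimited token after ">".
      name = line[1..-1].split.first || "unnamed"
      target = File.join(out_dir, prefix + name)
      out = File.open(target, "w")
      written << target
    end
    # Lines before the first header have no output file and are skipped.
    out&.write(line)
  end
  written
ensure
  out&.close
end
```

Calling `split_fasta(ARGV[0], "indian_macaque_")` from a script gives the same file naming as the version above, but keeps any sequence characters (N, lowercase) instead of cutting the record.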


posted at: 23:39 | path: /bioinformatics | permanent link to this entry