Para_srf: Convert your SoLiD data to srf, fast.
As you know, I released para_srf version 0.2 on GitHub some weeks ago. I have made some changes since then and a new version is now available. It doesn't bring huge changes; mainly I added some integration tests, which have already proved very valuable.
I am going to release another version soon to fix a small issue that Jingwei from ABi found. The current version splits the input data into smaller chunks. By doing that, we may end up losing some pairing information, since not all the reads have a pair and the two reads of a pair can land in different chunks. To ensure the srf converter sees the information for all the pairs, we have to perform the split based on the panels. For example: split 1 takes panels 1 to 20, split 2 takes panels 21 to 40, and so on.
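A minimal sketch of the panel-based split, in Python. This is not the para_srf code. It assumes read IDs look like `panel_x_y_tag` (for example `21_88_1830_F3`), with the panel number in the first underscore-separated field, and the function names are made up for illustration.

```python
def panel_of(read_id):
    """Return the panel number of a read ID such as '21_88_1830_F3'.

    Assumption: the panel is the first underscore-separated field.
    """
    return int(read_id.lstrip(">").split("_")[0])


def split_index(panel, panels_per_split=20):
    """Map a 1-based panel number to a 1-based split number."""
    return (panel - 1) // panels_per_split + 1


def group_by_split(read_ids, panels_per_split=20):
    """Group read IDs so every read from one panel lands in the same split."""
    splits = {}
    for read_id in read_ids:
        idx = split_index(panel_of(read_id), panels_per_split)
        splits.setdefault(idx, []).append(read_id)
    return splits


reads = ["1_10_20_F3", "1_10_20_R3", "20_5_7_F3", "21_5_7_F3", "40_1_1_R3"]
print(group_by_split(reads))
```

Because both reads of a pair share the same panel, they always end up in the same split, whatever the chunk size.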
posted at: 21:17 | path: /programming | permanent link to this entry
Para_srf: Convert your SoLiD data to srf, fast.
I have just created a git repo for my para_srf project. This software parallelizes the SRF conversion of SoLiD data.
The amount of sequence we get out of one ABi sequencer is extremely high, and performing the conversion serially can take a long time. This software parallelizes the tasks so the whole process finishes much faster. It currently works only with LSF clusters, but adding other backends is very simple.
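To give an idea of why adding another backend is cheap, here is a rough Python sketch of the idea, not the actual para_srf code. `bsub` with `-J` (job name) and `-o` (output file) is the standard LSF submit command; `convert_chunk`, the chunk names and the `BACKENDS` table are placeholders invented for the example.

```python
def lsf_command(job_name, command):
    """Wrap a command for LSF: -J names the job, -o sets the output log."""
    return ["bsub", "-J", job_name, "-o", job_name + ".log"] + command


def local_command(job_name, command):
    """Run the command as-is, with no scheduler in front of it."""
    return command


# A new scheduler only needs one more entry in this table.
BACKENDS = {"lsf": lsf_command, "local": local_command}


def build_jobs(chunks, backend="lsf"):
    """Return one command line (as a list of arguments) per input chunk."""
    wrap = BACKENDS[backend]
    jobs = []
    for i, chunk in enumerate(chunks, start=1):
        # "convert_chunk" is a hypothetical per-chunk converter command.
        jobs.append(wrap("srf_%d" % i, ["convert_chunk", chunk]))
    return jobs


for job in build_jobs(["split_1", "split_2"]):
    print(" ".join(job))
```

Each chunk becomes an independent job, so the cluster can run all of them at the same time.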
This is the git url:
git://github.com/drio/para_srf.git
Some of my friends were working on a site and they were using UTF-8 to write their HTML/JS. The main page had a drop-down where you could switch between different languages:
English
Français
Español
Deutsch
日本語
中文
한국어
NOTE: I am assuming that your browser will render these last UTF-8 characters properly. At the time I was writing this, the HTTP server sending this content to your browser was forcing this character set:
drio@simba:~/wwwroot $ curl -I http://blog.is04607.com
HTTP/1.1 302 Found
Date: Sun, 24 Jun 2007 18:45:10 GMT
Server: Apache/1.3.37 (Unix) mod_perl/1.29 PHP/4.3.11 mod_gzip/1.3.26.1a
Location: http://www.is04607.com/blog/blosxom.cgi
Content-Type: text/html; charset=iso-8859-1
I have to shoot an email to the sysadmin where I am hosting this so he can force utf8 on my virtual host.
That was exactly the same problem my friends had. Just by telling Apache to use UTF-8 (or at least not to force iso-8859-1), things get fixed.
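A small Python sketch of what goes wrong, assuming the page bytes are UTF-8 and the browser trusts the iso-8859-1 header. The variable names are just for illustration.

```python
text = "日本語"

# The bytes the server actually sends: the page is saved as UTF-8.
raw = text.encode("utf-8")

# What a browser does when the header says charset=iso-8859-1:
# every single byte becomes one (wrong) character.
wrong = raw.decode("iso-8859-1")

# What it does when the declared charset matches the real encoding.
right = raw.decode("utf-8")

print(len(raw), len(wrong), len(right))
print(wrong == text, right == text)
```

The bytes on the wire are identical in both cases; only the declared charset changes how they are read.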
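By the way, do you know how many bits UTF-8 uses to encode this Japanese character? 24 bits, 3 bytes:
drio@simba:~/wwwroot $ cat test5.html
----日-----
drio@simba:~/wwwroot $ hexdump test5.html
0000000 2d2d 2d2d 97e6 2da5 2d2d 2d2d 000a
000000d
Yes, 日 is the byte sequence e6 97 a5. hexdump groups the bytes into 16-bit words and, on a little-endian machine, prints each pair swapped, so e6 97 shows up as 97e6 and the a5 shares the word 2da5 with the dash (2d) that follows it.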
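As a cross-check on the dump above, this Python sketch encodes the same line and regroups the bytes into little-endian 16-bit words the way hexdump's default output does. It assumes the file holds four dashes, the character, five dashes and a newline, which is what the 13-byte length (0xd) in the dump suggests.

```python
import struct

# UTF-8 encoding of the single character.
encoded = "日".encode("utf-8")
print(len(encoded), encoded.hex())

# Rebuild the file contents and pad to an even length with a zero byte,
# as hexdump does for the last, incomplete word.
data = "----日-----\n".encode("utf-8")
padded = data + b"\x00" * (len(data) % 2)

# Read the bytes as little-endian unsigned 16-bit words.
words = struct.unpack("<%dH" % (len(padded) // 2), padded)
print(" ".join("%04x" % w for w in words))
```

The character itself takes three bytes, e6 97 a5; the 2d that appears next to a5 in the dump is the dash that follows it.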
I found this document highly useful for understanding what Unicode is. I think it has already become a classic.
posted at: 13:11 | path: /programming | permanent link to this entry