Dr Mikhail Matz is a researcher in the field of coral genomics. Here is his approach to doing de novo transcriptomics for an organism whose genome is unavailable.
His compute cluster is basically:
"two Dell PowerEdge 1900 servers joined together with ROCKS clustering software v5.0. Each server had: two Intel Quad Core E5345 (2.33 Ghz, 1333 Mhz FSB, 2x4MB L2 Cache) CPU’s and 16 GB of 667 Mhz DDR2 RAM. The cluster had a combined total of 580 GB disk space."
The tools used are:
- Blast executables from NCBI, including blast, blastcl3, and blastclust
- Washington University blast (Wu-blast)
- ESTate sequence clustering software
He admits that the assembled transcriptome might be incomplete (~40,000 contigs with five-fold average sequencing coverage; see Figure 2 for the size distribution of the assembled contigs).
But it is "good enough" to use as a reference transcriptome to align SOLiD reads accurately and to generate the coverage that 454 can't give for the same amount of grant money.
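To make the two numbers above concrete, here is a minimal Python sketch of how a contig size distribution and an average fold coverage could be computed from a FASTA file of assembled contigs. This is not the pipeline from the paper: the function names, the bin width, and the toy sequences are all made up for illustration, and coverage is simply taken as total sequenced bases divided by total assembled bases.

```python
from collections import Counter
from io import StringIO


def contig_lengths(fasta_handle):
    """Return {contig_name: length} for a FASTA-formatted text stream."""
    lengths = {}
    name = None
    for line in fasta_handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: keep only the first word as the contig name.
            name = line[1:].split()[0]
            lengths[name] = 0
        elif name is not None:
            # Sequence line: add its length to the current contig.
            lengths[name] += len(line)
    return lengths


def size_distribution(lengths, bin_width=500):
    """Count contigs per length bin; keys are the lower edge of each bin."""
    return Counter((n // bin_width) * bin_width for n in lengths.values())


def average_coverage(total_read_bases, lengths):
    """Average fold coverage = total sequenced bases / total assembled bases."""
    return total_read_bases / sum(lengths.values())


# Toy data, invented for illustration only (not from the paper).
toy_fasta = StringIO(">contig1 demo\nACGTACGTAC\nACGT\n>contig2\nACGTAC\n")
lengths = contig_lengths(toy_fasta)
bins = size_distribution(lengths, bin_width=10)
coverage = average_coverage(100, lengths)
print(lengths, dict(bins), coverage)
```

With a real assembly you would pass an open file handle instead of the `StringIO` object and pick a bin width that suits the contig lengths you see.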
The results are published in BMC Genomics.
Not sure if you have heard of just-in-time inventory. But I think "good enough" science takes a bit of daring to spend that money to ask those what-ifs.