I've been chasing missing reads in my 70 million short reads from the ABI SOLiD. Other than gremlins taking them, I have no idea why the code sometimes fails and sometimes works. NFS or network issues, perhaps? I'm not the sysadmin on the cluster, so I can't do much except audit my numbers each time.
I'm thinking ahead about how to speed up the process or make it more reliable, and I found Brad's blog post on his experience with document stores.
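For the auditing step, here is a minimal sketch of the kind of check I mean. It assumes the reads are in a FASTA-style file such as csfasta, where each read has a header line starting with `>`. The function names `count_reads` and `audit` are made up for illustration and are not from any pipeline or library.

```python
import io


def count_reads(handle):
    """Count FASTA-style records, i.e. lines that start with '>'.

    Comment lines (starting with '#') and sequence lines are ignored.
    """
    return sum(1 for line in handle if line.startswith(">"))


def audit(input_handle, output_handle):
    """Return (reads_in, reads_out, missing) for one processing step."""
    reads_in = count_reads(input_handle)
    reads_out = count_reads(output_handle)
    return reads_in, reads_out, reads_in - reads_out


if __name__ == "__main__":
    # Tiny in-memory stand-ins for the real input and output files.
    before = io.StringIO("# header comment\n>1_1_1_F3\nT0123\n>1_1_2_F3\nT3210\n")
    after = io.StringIO(">1_1_1_F3\nT0123\n")
    reads_in, reads_out, missing = audit(before, after)
    print(f"in={reads_in} out={reads_out} missing={missing}")
```

With real data, the two handles would come from `open()` on the files before and after a step. Running the check after every step at least narrows down where the reads disappear.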
Going to follow up and do some testing with this when I have the time.
I'm with you on this one. I'd love to see more bioinformatics applications written with schema-free databases. SQL is for banks.