Bioinformatics and RNA Analysis Using Datameer

  • “Biologists should not deceive themselves with the thought that some new class of biological molecules, of comparable importance to proteins, remains to be discovered. This seems highly unlikely.” (F. Crick, 1958)

    These words, by the co-discoverer of the double-helical structure of DNA, would have been largely affirmed by most biologists up to the end of the last century – but even great scientists can be wrong. In fact, the last ten years have seen the discovery of numerous classes of functional RNA molecules, which perform functions as diverse as signal perception, enzymatic catalysis, and extra-chromosomal (that is, DNA-independent) information transfer.

    A particularly interesting class of functional RNAs is the riboswitches: RNAs that perceive small molecules and respond by adjusting gene expression in an adaptive way. Dr. Douglas Grubb, working at the Leibniz Institute of Plant Biochemistry in Halle, Germany, developed a method that identifies riboswitches in eukaryotic genomes. Because this method is computationally intensive, it makes sense to distribute the computations across a cluster of many machines. Dr. Grubb found a way to set up the actual analysis in Datameer.

    Below are a couple of videos that give more detailed insight into how a computationally intensive bioinformatics problem can be solved with Datameer, leveraging the storage and compute scalability of Apache Hadoop.

