- We would like to extend the ssHMM software, developed by David Heller, a previous member of the lab, to other RNA-binding protein (RBP) datasets and to search for sequence-structure motif hits in uncharacterized RNA sequences. This project requires that you are proficient in Python, understand the basic concepts of hidden Markov models (HMMs), e.g. the Viterbi path, and have some familiarity with RNA secondary structure prediction and/or motif finding.
- Identification and characterization of bacterial small RNAs from NGS data during infection processes. Basic knowledge of how to handle NGS data (e.g. mapping, visualization in a genome browser, functional annotation) is required, as well as basic programming skills in R, Python or C++.
- Random Forests (RFs) are an ensemble learning method for classification and regression that is widely applied in regulatory genomics. While RFs operate by constructing a multitude of decision trees to output class labels, a tool for building a 'consensus' tree and for evaluating the goodness of the learned rules is missing. We would like to develop an R package to address this issue, exploiting concepts from phylogenetics to build a consensus tree and to visualize RF results.
- Other projects in both data analysis and machine learning are possible; let's just talk about it.
Contact Annalisa Marsico (email@example.com) if you are interested in doing your master's thesis in the RNA Bioinformatics lab.