Improving structural genome annotation using proteomics hints
Annotating and analyzing plant genomes is challenging because of their large size, extensive genetic diversity, high repeat content, and the presence of numerous pseudogenes and large gene families. These complexities hinder accurate gene prediction and call for advanced computational approaches. My research focuses on developing an automated, scalable pipeline that improves gene model predictions by using proteomics data as hints. The project involves collecting and organizing genome sequencing data, integrating and benchmarking existing gene prediction tools, and incorporating multi-omics datasets to improve annotation accuracy. In addition, comparative genome analysis will be conducted to identify conserved and species-specific features, providing insight into plant genome evolution and function. This research will contribute to improving the structural annotation of plant genomes and will support agricultural and biotechnological applications by facilitating the discovery of novel genes and functional elements.