Combining in silico prediction and ribosome profiling in a genome-wide search for novel putatively coding sORFs

Crappe, Jeroen; Van Criekinge, Wim; Trooskens, Geert; Hayakawa, Eisuke; Luyten, Walter; Baggerman, Geert; Menschaert, Gerben

doi:10.1186/1471-2164-14-648

Abstract

Background: It was long assumed that proteins are at least 100 amino acids (AAs) long. Moreover, the detection of short translation products (e.g. those encoded by small Open Reading Frames, sORFs) is very difficult, as their short length makes it hard to distinguish true coding ORFs from ORFs occurring by chance. Nevertheless, over the past few years many such non-canonical genes (with ORFs < 100 AAs) have been discovered in different organisms, such as Arabidopsis thaliana, Saccharomyces cerevisiae, and Drosophila melanogaster. Thanks to advances in sequencing, bioinformatics and computing power, it is now possible to scan the genome with unprecedented scrutiny, for example in search of this type of small ORF. Results: Using bioinformatics methods, we performed a systematic search for putatively functional sORFs in the Mus musculus genome. A genome-wide scan detected all sORFs, which were subsequently analyzed for their coding potential, based on evolutionary conservation at the AA level, and ranked using a Support Vector Machine (SVM) learning model. Finally, the ranked sORFs were overlapped with ribosome profiling data, which hints at sORF translation. All candidates were visually inspected using a genome browser developed in-house. In this way, dozens of highly conserved sORFs targeted by ribosomes were identified in the mouse genome, putatively encoding micropeptides. Conclusion: Our combined genome-wide approach leads to the prediction of a comprehensive but manageable set of putatively coding sORFs, a very important first step towards the identification of a new class of bioactive peptides, called micropeptides.

Language

English

Source (journal)

BMC genomics. - London

Publication

London : 2013

ISSN

1471-2164

DOI

10.1186/1471-2164-14-648

Volume/pages

14 (2013) , 12 p.

Article Reference

648

ISI

000326171800005

Medium

E-only publication

Full text (Publisher's DOI)

https://doi.org/10.1186/1471-2164-14-648

Full text (open access)

https://repository.uantwerpen.be/docman/irua/c85ac6/9642.pdf

Faculty/Department

Faculty of Sciences. Biology

Research group

Publication type

A1 Journal article

Subject

Biology

Human medicine

Engineering sciences. Technology

Affiliation

Publications with a UAntwerp address

Identifier

Creation

25.03.2015

Last edited

09.10.2023

To cite this reference

https://hdl.handle.net/10067/1240130151162165141