Predicting bacterial resistance from whole-genome sequences using k-mers and stability selection

Title

Predicting bacterial resistance from whole-genome sequences using k-mers and stability selection

Authors

Pierre Mahé, Maud Tournoud

Description

Abstract. Background: Several studies have demonstrated the feasibility of predicting bacterial antibiotic resistance phenotypes from whole-genome sequences. The prediction process usually amounts to detecting the presence of genes involved in antibiotic resistance mechanisms, or of specific mutations within these genes, previously identified from a training panel of strains. We address the problem from the supervised statistical learning perspective, without relying on prior information about such resistance factors. We use a k-mer based genotyping scheme and a logistic regression model, thereby combining several k-mers into a probabilistic model. To identify a small yet predictive set of k-mers, we rely on the stability selection approach (Meinshausen et al., J R Stat Soc Ser B 72:417–73, 2010), which consists of penalizing logistic regression models with a Lasso penalty, coupled with extensive resampling procedures. Results: Using public datasets, we applied the resulting classifiers to two bacterial species and achieved predictive performance equivalent to the state of the art. The models are extremely sparse, involving 1 to 8 k-mers per antibiotic, and hence are remarkably easy and fast to evaluate on new genomes (from raw reads to assemblies). Conclusion: Our proof of concept therefore demonstrates that stability selection is a powerful approach to investigate bacterial genotype-phenotype relationships.
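The approach summarized in the abstract (k-mer presence/absence genotyping, Lasso-penalized logistic regression, and repeated resampling) can be illustrated with a minimal sketch. This is not the authors' implementation. The toy genomes, the function names, the single penalty value `lam`, the selection threshold of 0.6, the class-stratified half-sampling, and the plain proximal-gradient solver are all assumptions made for illustration. In particular, stability selection is usually run over a range of penalty values rather than one.

```python
import math
import random


def kmer_set(seq, k):
    """All distinct k-mers of a sequence (presence/absence genotype)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_matrix(genomes, k):
    """Binary genomes-by-k-mers matrix plus the sorted k-mer vocabulary."""
    profiles = [kmer_set(g, k) for g in genomes]
    vocab = sorted(set().union(*profiles))
    return [[1.0 if km in p else 0.0 for km in vocab] for p in profiles], vocab


def fit_lasso_logistic(X, y, lam, n_iter=300):
    """L1-penalized logistic regression via proximal gradient descent (ISTA)."""
    n, p = len(X), len(X[0])
    # Step 1/L, with L bounded by a quarter of the largest squared row norm
    # (the +1 accounts for the unpenalized intercept).
    step = 4.0 / max(1.0 + sum(v * v for v in row) for row in X)
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for row, label in zip(X, y):
            z = b + sum(wj * v for wj, v in zip(w, row))
            err = 1.0 / (1.0 + math.exp(-z)) - label
            grad_b += err / n
            for j, v in enumerate(row):
                grad_w[j] += err * v / n
        b -= step * grad_b
        # Soft-thresholding sets small coefficients exactly to zero.
        w = [math.copysign(max(abs(u) - step * lam, 0.0), u)
             for u in (wj - step * gj for wj, gj in zip(w, grad_w))]
    return w, b


def stability_selection(X, y, lam, n_resamples=30, threshold=0.6, seed=0):
    """Selection frequency of each feature over class-stratified half-samples."""
    rng = random.Random(seed)
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    counts = [0] * len(X[0])
    for _ in range(n_resamples):
        # Assumption: draw half of each class without replacement.
        idx = rng.sample(pos, len(pos) // 2) + rng.sample(neg, len(neg) // 2)
        w, _intercept = fit_lasso_logistic(
            [X[i] for i in idx], [y[i] for i in idx], lam)
        for j, wj in enumerate(w):
            counts[j] += wj != 0.0
    freqs = [c / n_resamples for c in counts]
    return [j for j, f in enumerate(freqs) if f >= threshold], freqs


# Hypothetical toy data: resistant genomes carry the insert "TTGG".
backbone, marker = "ACGTACGT", "TTGG"
tails = ["", "", "AAAA", "AAAA", "CCCC", "CCCC"]
genomes = [backbone + marker + t for t in tails] + [backbone + t for t in tails]
phenotypes = [1] * 6 + [0] * 6  # 1 = resistant, 0 = susceptible

X, vocab = kmer_matrix(genomes, k=4)
selected, freqs = stability_selection(X, phenotypes, lam=0.05)
selected_kmers = [vocab[j] for j in selected]
print(selected_kmers)
```

In this toy setting the k-mers overlapping the insert are present in every resistant genome and in no susceptible one, so they receive a nonzero coefficient in every resample. K-mers tied to the variable tails are selected less consistently. Real genomes would require a dedicated k-mer counter and a production-grade solver.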

Date

2018

Subject

Genotype-phenotype, feature selection, k-mers, Lasso

Identifier

DOI: 10.1186/s12859-018-2403-z

Source

BMC Bioinformatics

Publisher

BMC

Coverage

Biology (General), Computer applications to medicine. Medical informatics

Language

EN

Files

https://socictopen.socict.org/files/to_import/pdfs/article 1361.pdf

Collection

Citation

Pierre Mahé, Maud Tournoud, “Predicting bacterial resistance from whole-genome sequences using k-mers and stability selection,” SOCICT Open, accessed April 19, 2026, https://www.socictopen.socict.org/items/show/1319.
