Predicting bacterial resistance from whole-genome sequences using k-mers and stability selection
Title
Predicting bacterial resistance from whole-genome sequences using k-mers and stability selection
Author
Pierre Mahé, Maud Tournoud
Description
Abstract
Background: Several studies have demonstrated the feasibility of predicting bacterial antibiotic resistance phenotypes from whole-genome sequences. The prediction process usually amounts to detecting the presence of genes involved in antibiotic resistance mechanisms, or of specific mutations within these genes, previously identified from a training panel of strains. We address the problem from the supervised statistical learning perspective, without relying on prior information about such resistance factors. We rely on a k-mer based genotyping scheme and a logistic regression model, thereby combining several k-mers into a probabilistic model. To identify a small yet predictive set of k-mers, we rely on the stability selection approach (Meinshausen et al., J R Stat Soc Ser B 72:417–73, 2010), which consists of penalizing logistic regression models with a Lasso penalty, coupled with extensive resampling procedures.
Results: Using public datasets, we applied the resulting classifiers to two bacterial species and achieved predictive performance equivalent to the state of the art. The models are extremely sparse, involving 1 to 8 k-mers per antibiotic, and are therefore remarkably easy and fast to evaluate on new genomes (from raw reads to assemblies).
Conclusion: Our proof of concept therefore demonstrates that stability selection is a powerful approach to investigate bacterial genotype-phenotype relationships.
Date
2018
Subject
Genotype-phenotype, feature selection, k-mers, Lasso
Identifier
DOI: 10.1186/s12859-018-2403-z
Source
BMC Bioinformatics
Publisher
BMC
Coverage
Biology (General), Computer applications to medicine. Medical informatics
Language
EN
Collection
Citation
Pierre Mahé, Maud Tournoud, “Predicting bacterial resistance from whole-genome sequences using k-mers and stability selection,” SOCICT Open, accessed April 19, 2026, https://www.socictopen.socict.org/items/show/1319.