The Monte Carlo validation framework for the discriminant partial least squares model extended with variable selection methods applied to authenticity studies of Viagra (R) based on chromatographic impurity profiles

Krakowska, B.; Custers, D.; Deconinck, E.; Daszykowski, M.

doi:10.1039/C5AN01656H

Title

The Monte Carlo validation framework for the discriminant partial least squares model extended with variable selection methods applied to authenticity studies of Viagra (R) based on chromatographic impurity profiles

Author

Krakowska, B.

Custers, D.

Deconinck, E.

Daszykowski, M.

Abstract

The aim of this work was to develop a general framework, based on the Monte Carlo approach, for the validation of discriminant models used in authenticity studies based on chromatographic impurity profiles. The validation approach was applied to evaluate the usefulness of the diagnostic logic rule obtained from the partial least squares discriminant model (PLS-DA) that was built to discriminate authentic Viagra® samples from counterfeits (a two-class problem). The major advantage of the proposed validation framework stems from the possibility of obtaining distributions for different figures of merit that describe the PLS-DA model, such as sensitivity, specificity, correct classification rate and area under the curve, as a function of model complexity. Therefore, one can quickly evaluate their uncertainty estimates. Moreover, the Monte Carlo model validation allows balanced sets of training samples to be designed, which is required at the stage of the construction of PLS-DA and is recommended in order to obtain fair estimates that are based on an independent set of samples. In this study, as an illustrative example, 46 authentic Viagra® samples and 97 counterfeit samples were analyzed and described by their impurity profiles, which were determined using high performance liquid chromatography with photodiode array detection and further discriminated using the PLS-DA approach. In addition, we demonstrated how to extend the Monte Carlo validation framework with four different variable selection schemes: the elimination of uninformative variables, variable importance in projection, selectivity ratio and significance multivariate correlation. The best PLS-DA model was based on a subset of variables that were selected using the variable importance in projection approach. For an independent test set, average estimates with the corresponding standard deviation (based on 1000 Monte Carlo runs) of the correct classification rate, sensitivity, specificity and area under the curve were equal to 96.42 ± 2.04%, 98.69 ± 1.38%, 94.16 ± 3.52% and 0.982 ± 0.017, respectively.
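The Monte Carlo validation loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a balanced training set drawn at random in each run, uses a nearest-centroid classifier as a plainly swapped-in stand-in for PLS-DA, treats authentic samples as the positive class, and runs on synthetic data that only mimics the 46/97 class sizes. All function names, the training size per class and the number of runs are hypothetical, and the area under the curve is omitted for brevity.

```python
import random
import statistics


def balanced_split(labels, n_per_class, rng):
    """Draw n_per_class training indices from each class; the rest form the test set."""
    train = []
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        train.extend(rng.sample(members, n_per_class))
    train_set = set(train)
    test = [i for i in range(len(labels)) if i not in train_set]
    return train, test


def figures_of_merit(y_true, y_pred, positive=1):
    """Sensitivity, specificity and correct classification rate (CCR)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ccr": (tp + tn) / len(pairs),
    }


def nearest_centroid_fit_predict(X_train, y_train, X_test):
    """Stand-in classifier (NOT PLS-DA): assign each test row to the closest class mean."""
    centroids = {}
    for cls in set(y_train):
        rows = [x for x, y in zip(X_train, y_train) if y == cls]
        centroids[cls] = [sum(col) / len(rows) for col in zip(*rows)]

    def sqdist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    return [min(centroids, key=lambda c: sqdist(x, centroids[c])) for x in X_test]


def monte_carlo_validation(X, y, fit_predict, n_runs=1000, n_per_class=30, seed=0):
    """Repeat the balanced split, fit and test; return (mean, std) per figure of merit."""
    rng = random.Random(seed)
    results = {"sensitivity": [], "specificity": [], "ccr": []}
    for _ in range(n_runs):
        train, test = balanced_split(y, n_per_class, rng)
        y_pred = fit_predict(
            [X[i] for i in train], [y[i] for i in train], [X[i] for i in test]
        )
        fom = figures_of_merit([y[i] for i in test], y_pred)
        for name, value in fom.items():
            results[name].append(value)
    return {k: (statistics.mean(v), statistics.stdev(v)) for k, v in results.items()}


# Synthetic data only: 46 "authentic" (label 1) and 97 "counterfeit" (label 0) rows.
data_rng = random.Random(1)
X = [[data_rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(46)]
X += [[data_rng.gauss(1.5, 1.0) for _ in range(5)] for _ in range(97)]
y = [1] * 46 + [0] * 97

summary = monte_carlo_validation(X, y, nearest_centroid_fit_predict, n_runs=200)
for name, (mean, sd) in summary.items():
    print(f"{name}: {mean:.3f} +/- {sd:.3f}")
```

Because every run keeps its test samples out of model fitting, the spread of each figure of merit across runs gives the kind of uncertainty estimate the abstract refers to. Replacing `nearest_centroid_fit_predict` with a PLS-DA routine, and repeating the loop for each number of latent variables, would bring the sketch closer to the described framework.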

Language

English

Source (journal)

The analyst. - Cambridge, 1876, currens

Publication

Cambridge : 2016

ISSN

0003-2654 [print]

1364-5528 [online]

DOI

10.1039/C5AN01656H

Volume/pages

141:3 (2016), p. 1060-1070

ISI

000368942600037

Pubmed ID

26730545

Full text (Publisher's DOI)

https://doi.org/10.1039/C5AN01656H

Full text (publisher's version - intranet only)

https://repository.uantwerpen.be/docman/iruaauth/0b6f57/131591.pdf

Faculty/Department

Faculty of Pharmaceutical, Biomedical and Veterinary Sciences. Pharmacy

Publication type

A1 Journal article

Subject

Chemistry

Affiliation

Publications with a UAntwerp address

Identifier

Creation

10.03.2016

Last edited

04.03.2024

To cite this reference

https://hdl.handle.net/10067/1315910151162165141