Multimodal phenotypic labelling using drug‐induced sleep endoscopy, awake nasendoscopy and computational fluid dynamics for the prediction of mandibular advancement device treatment outcome: a prospective study

Summary Mandibular advancement device (MAD) treatment outcome for obstructive sleep apnea (OSA) is variable and patient dependent. A global, clinically applicable predictive model is lacking. Our aim was to combine characteristics obtained during drug‐induced sleep endoscopy (DISE), awake nasendoscopy, and computed tomography scan‐based computational fluid dynamic (CFD) measurements in one multifactorial model, to explain MAD treatment outcome. A total of 100 patients with OSA were prospectively recruited and treated with a MAD at fixed 75% protrusion. In all, 72 underwent CFD analysis, DISE, and awake nasendoscopy at baseline in a blinded fashion and completed a 3‐month follow‐up polysomnography with a MAD. Treatment response was defined as a reduction in the apnea–hypopnea index (AHI) of ≥50% and deterioration as an increase of ≥10% during MAD treatment. To cope with missing data, multiple imputation with predictive mean matching was used. Multivariate logistic regression, adjusting for body mass index and baseline AHI, was used to combine all potential predictor variables. The strongest impact concerning odds ratios (ORs) was present for complete concentric palatal collapse (CCCp) during DISE on deterioration (OR 28.88, 95% confidence interval [CI] 1.18–704.35; p = 0.0391), followed by a C‐shape versus an oval shape of the soft palate during wakefulness (OR 8.54, 95% CI 1.09–67.23; p = 0.0416) and tongue base collapse during DISE on response (OR 3.29, 95% CI 1.02–10.64; p = 0.0464). Both logistic regression models exhibited excellent and fair predictive accuracy. Our findings suggest DISE to be the most robust examination associated with MAD treatment outcome, with tongue base collapse as a predictor for successful MAD treatment and CCCp as an adverse DISE phenotype.


INTRODUCTION
Obstructive sleep apnea (OSA) is a common syndrome, affecting almost 1 billion people worldwide in the 30-69 years age range (Benjafield et al., 2019). This sleep-related breathing disorder is characterised by recurrent events of partial or complete collapse of the upper airway, lasting ≥10 s during sleep, leading to a reduction in respiratory flow (Gottlieb & Punjabi, 2020). Currently, the most commonly used measure of sleep apnea severity is the apnea-hypopnea index (AHI), which represents the number of partial (hypopnea) and complete (apnea) collapses of the upper airway per hour of sleep.
Continuous positive airway pressure (CPAP) is currently considered the standard therapy but has a limited compliance rate (Guralnick et al., 2017). An alternative non-invasive treatment option for patients with OSA is oral appliance therapy, most commonly with custom-made, titratable mandibular advancement devices (MADs). A MAD is worn intra-orally at night and acts by protruding the mandible, which opens the upper airway and increases its volume (Chan et al., 2010). Multiple studies have demonstrated the efficacy of MADs, but not in all patients: therapy success may range from 47.7% to 75.0% (Gjerde et al., 2016; Kim et al., 2014).
The variable and patient-dependent MAD treatment response emphasises the need for careful patient selection. Selection for MAD treatment today relies on baseline patient characteristics, anthropometrics, and drug-induced sleep endoscopy (DISE) findings with or without the use of a simulation bite (Chen et al., 2020; Op de Beeck et al., 2019; Vroegop et al., 2020). However, a global, clinically applicable predictive model is lacking.

Baseline Parameters
Previous studies have described several factors associated with MAD treatment response: compared with non-responders, responders tend to be younger and female, and to have supine-dependent OSA, a lower body mass index (BMI), a lower AHI, a retracted maxilla and mandible, a narrower airway, and a shorter soft palate (Chen et al., 2020; Pahkala et al., 2020; Sutherland et al., 2015). Nevertheless, these parameters show only a weak association with MAD treatment efficacy.

Computational fluid dynamics
Functional imaging can be used to investigate OSA and the mechanism of action of MADs on upper airway morphology, using computed tomography (CT) scans and three-dimensional (3D) computer-aided design, coupled with computational fluid dynamics (CFD). Several studies have shown that MADs act by enlarging the upper airway volume and the minimal cross-sectional area, thereby preventing upper airway collapse during sleep, and that a decrease in upper airway resistance and an increase in upper airway volume are correlated with an objective clinical improvement of OSA severity (Vos et al., 2007). Moreover, a smaller minimal cross-sectional area is a marker for higher OSA severity. In this regard, recent analysis has shown that MADs may act by increasing the total upper airway volume, predominantly due to an increase in velopharyngeal volume (Van Gaver et al., 2021). Furthermore, particularly in responders to MAD treatment, Van Gaver et al. found a significant increase in total upper airway volume, emphasising that the efficacy of a MAD is associated with a larger increase in upper airway volume. On the other hand, the absence of an increase in velopharyngeal volume seems to be associated with deterioration. Previous studies have also found an association between MAD treatment response and total upper airway volume, with a predominant increase in velopharyngeal volume (Chan et al., 2010; Song et al., 2019). These markers may be used in evaluating treatment outcome in patients with OSA. Therefore, the combination of imaging techniques and CFD may play a role in future personalised patient selection for MAD treatment.

Drug-induced sleep endoscopy
The observed site(s) and pattern(s) of upper airway collapse during DISE have been shown to play a major role in personalised selection of non-CPAP therapy for patients with OSA (Op de Beeck et al., 2019; Vanderveken et al., 2013; Vroegop et al., 2020).
Accordingly, a complete concentric collapse at the level of the palate (CCCp) has been shown to be associated with a less favourable outcome of upper airway stimulation therapy (Vanderveken et al., 2013). A more recent study showed that CCCp was also associated with a negative MAD treatment outcome (Op de Beeck et al., 2019). That study further showed that complete oropharyngeal collapse had an adverse effect on MAD treatment, whereas the presence of tongue base collapse during baseline DISE was associated with a higher MAD success rate (Op de Beeck et al., 2019).

Awake nasendoscopy
Awake nasendoscopy with Müller's manoeuvre is an Ear, Nose, Throat (ENT) investigation, commonly used in the clinical examination of patients diagnosed with OSA. Müller's manoeuvre is defined as a forced inspiratory effort against a closed airway, during which the examiner endoscopically observes the narrowing of the pharyngeal walls at the retrolingual and retropalatal level. However, various studies show that Müller's manoeuvre has a number of inconsistencies (Soares et al., 2013; Zerpa Zerpa et al., 2015). Nevertheless, Van

Study aim
Multiple studies report several predictors for oral appliance therapy outcome, which have recently been reviewed (Okuno et al., 2016), although thorough validation is lacking. A recent study by Sutherland et al. aimed to derive a prediction model based on multiple awake assessments, including facial photography, spirometry, and nasendoscopy (Sutherland et al., 2018). However, these awake assessments showed no significant added value over clinical baseline characteristics alone in the prediction of MAD treatment outcome. This highlights the persistent need for a robust, clinically applicable prediction model that combines several predictors.
Accordingly, the aim of this study was to combine findings derived from DISE, awake nasendoscopy and CT scan-based CFD in one model to explain MAD treatment outcome.

Ethical considerations
The present data were prospectively obtained from the Agentschap voor Innovatie door Wetenschap en Technologie (IWT) database of the Predicting Therapeutic Outcome of Mandibular Advancement Device Treatment in Obstructive Sleep Apnea (PROMAD) trial (identifier NCT01532050 on clinicaltrials.gov) (Verbruggen et al., 2016). The ethics committee at the Antwerp University Hospital and the University of Antwerp approved the study. All patients gave written informed consent prior to participation in the PROMAD cohort study.

METHODS
This study protocol was published previously by Verbruggen et al. (Figure 1a) (Verbruggen et al., 2016). At first, patients were screened and underwent an extensive clinical examination by an ENT and dental sleep specialist. Temporomandibular joint issues were evaluated anamnestically, through palpation and with a functional assessment of opening or closing of the mouth and movements of deduction of the lower jaw. Subsequently, an objective baseline evaluation on a standard full-night polysomnography (PSG) by assessing the AHI was made to verify the eligibility criteria (Table 1). Afterwards, oral appliance therapy was initiated using a titratable, custom-made, duoblock MAD (Respident Butterfly MAD, Orthodontics Clinics NV, Antwerp, Belgium) in 75% of the individualised maximal mandibular protrusion.
FIGURE 1 Study flow chart (a) and patient flow (b). †Reasons for dropout: time constraints (seven patients), lost to follow-up despite several reminders (three), expenses (two), insufficient reduction of complaints with MAD (two), OSA resolution after weight loss (one), excessive gag reflex with MAD (one), and moving abroad (one). Abbreviations: AHI, apnea-hypopnea index; BL, baseline; CFD, computational fluid dynamics; CT, computed tomography; DISE, drug-induced sleep endoscopy; MAD, mandibular advancement device; OSA, obstructive sleep apnea; PSG, polysomnography
Each patient's maximum protrusion capacity was measured three times and averaged, using a proprietary gauge bite fork. Measurements were made according to the trajectory of the centric relation position to maximal protrusion. At 3 months after initiation of the MAD, a second PSG was performed to determine the follow-up AHI.
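As an illustration of the arithmetic described above, the fixed MAD setting can be sketched as 75% of the mean of three maximal-protrusion readings. This is a minimal sketch: the function name and the use of millimetres are assumptions made for the example, not details taken from the study protocol.

```python
from statistics import mean


def mad_protrusion_setting(max_protrusion_readings, fraction=0.75):
    """Return the fixed MAD protrusion as a fraction of the averaged maximum.

    max_protrusion_readings: three repeated readings of maximal protrusion
    (unit assumed to be millimetres for this illustration).
    """
    if len(max_protrusion_readings) != 3:
        raise ValueError("the text describes three repeated measurements")
    if any(r <= 0 for r in max_protrusion_readings):
        raise ValueError("protrusion readings must be positive")
    # Average the three readings, then take the fixed fraction (75% by default).
    return fraction * mean(max_protrusion_readings)


# Hypothetical readings, for illustration only.
print(mad_protrusion_setting([10.0, 11.0, 12.0]))
```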
The AHI and other PSG variables were scored by a sleep laboratory technician according to the American Academy of Sleep Medicine (AASM) criteria (Iber et al., 2007). Treatment outcome was measured as the difference between baseline AHI and AHI after 3 months of therapy. Deterioration was defined as an increase in AHI of ≥10% from baseline. Treatment response was defined as a reduction of ≥50% in the AHI with MAD compared to baseline PSG. After initiating MAD treatment, all patients underwent the following three investigations at baseline: a low-dose CT scan of the head and neck region with CFD analysis 1 month after the start of MAD treatment, a DISE between 1 and 3 months after start, and an awake evaluation using nasendoscopy on the day of the 3-month follow-up PSG. Furthermore, the investigators and patients remained blinded during data collection.
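The two outcome definitions above can be sketched as a small helper that computes the percentage change in AHI and applies the ≥50% reduction and ≥10% increase thresholds. The function names and the example AHI values are hypothetical; only the thresholds come from the text.

```python
def percent_change_ahi(baseline_ahi, followup_ahi):
    """Percentage change in AHI relative to baseline (negative = improvement)."""
    if baseline_ahi <= 0:
        raise ValueError("baseline AHI must be positive")
    return 100.0 * (followup_ahi - baseline_ahi) / baseline_ahi


def classify_outcome(baseline_ahi, followup_ahi):
    """Apply the response (>=50% reduction) and deterioration (>=10% increase) definitions."""
    change = percent_change_ahi(baseline_ahi, followup_ahi)
    return {
        "percent_change": change,
        "response": change <= -50.0,
        "deterioration": change >= 10.0,
    }


# Hypothetical patients (events per hour), for illustration only.
print(classify_outcome(30.0, 15.0))  # AHI halved
print(classify_outcome(20.0, 22.0))  # AHI increased by one tenth
print(classify_outcome(20.0, 18.0))  # small improvement, neither category
```

Under these definitions a patient can be neither a responder nor a deteriorating patient, which is why the two outcomes are modelled separately.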

Computational fluid dynamics
During the awake baseline low-radiation-dose CT with CFD, patients were placed in a supine position and were asked to hold their breath at the end of a normal inspiration. Based on the scanned areas from the nasopharynx down to the larynx, 3D computer-aided design models were reconstructed using Mimics software (Materialise, Leuven, Belgium). These models were subsequently converted into a computational grid by FluidDa NV (Kontich, Belgium). The upper airway volume was determined and expressed as the effective upper airway volume through which air flows, excluding leakage into the mouth. The total volume and the volume of the three individual sections of the pharynx were measured: velopharynx, oropharynx, and hypopharynx. Additional anatomical parameters, such as the minimal cross-sectional area and the upper airway resistance, were calculated.

Drug-induced sleep endoscopy
A DISE was performed in a semi-dark and silent operating theatre with the patient lying in a supine position. The investigation was performed by an experienced ENT surgeon and scored by a board of four experienced ENT surgeons. Natural sleep was mimicked by administering sedative drugs: sleep was induced with an intravenous bolus of midazolam (1.5 mg) and maintained with a target-controlled infusion of propofol (2.0-3.0 μg/ml). A flexible fiberoptic nasopharyngoscope (Olympus END-GP, diameter 3.7 mm, Olympus Europe GmbH, Hamburg, Germany) was inserted intranasally to inspect the upper airway. Collapse degree (none, partial, or complete), direction (anteroposterior, concentric, or lateral) and level were scored according to a standardised scoring system (Figure 2). The following upper airway levels were examined: soft palate, oropharynx (region at the level of the tonsils), tongue base, epiglottis, and hypopharyngeal lateral walls (region below tongue base).
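The scoring scheme described above (level, degree, direction) lends itself to a small data structure. The sketch below is one possible encoding, not the study's actual scoring software; the class and function names are hypothetical, and treating "tongue base collapse" as any partial or complete collapse at that level is an assumption made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Optional


class Level(Enum):
    SOFT_PALATE = "soft palate"
    OROPHARYNX = "oropharynx"
    TONGUE_BASE = "tongue base"
    EPIGLOTTIS = "epiglottis"
    HYPOPHARYNX = "hypopharyngeal lateral walls"


class Degree(Enum):
    NONE = "none"
    PARTIAL = "partial"
    COMPLETE = "complete"


class Direction(Enum):
    ANTEROPOSTERIOR = "anteroposterior"
    CONCENTRIC = "concentric"
    LATERAL = "lateral"


@dataclass(frozen=True)
class Collapse:
    level: Level
    degree: Degree
    direction: Optional[Direction] = None  # left empty when there is no collapse


def has_cccp(findings: Iterable[Collapse]) -> bool:
    """Complete concentric collapse at the level of the palate (CCCp)."""
    return any(
        f.level is Level.SOFT_PALATE
        and f.degree is Degree.COMPLETE
        and f.direction is Direction.CONCENTRIC
        for f in findings
    )


def has_tongue_base_collapse(findings: Iterable[Collapse]) -> bool:
    """Assumption: any partial or complete collapse at the tongue base counts."""
    return any(
        f.level is Level.TONGUE_BASE and f.degree is not Degree.NONE
        for f in findings
    )


# Hypothetical DISE record, for illustration only.
record = [
    Collapse(Level.SOFT_PALATE, Degree.COMPLETE, Direction.CONCENTRIC),
    Collapse(Level.TONGUE_BASE, Degree.PARTIAL, Direction.ANTEROPOSTERIOR),
    Collapse(Level.EPIGLOTTIS, Degree.NONE),
]
print(has_cccp(record), has_tongue_base_collapse(record))
```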

Awake nasendoscopy
Patients underwent an awake endoscopic investigation, performed using a flexible fiberoptic nasopharyngoscope (Olympus END-G […] Vroegop et al., 2020b). The soft palate was divided into three categorical shapes: the oval shape (anterior position), the C-shape (prominent uvula), and the dumbbell shape (overall narrowing of the velopharynx due to a posterior location of the soft palate). Oropharyngeal crowding was defined as the presence of large palatine tonsils or the occurrence of prominent pharyngeal arches provoking partial obscuration or compression of the tongue base. The lingual tonsils were scored according to the Friedman grading system (Friedman et al., 2015). The position of the tongue base was categorised depending on the visibility of the valleculae: completely, partially, or not visible; a fourth category consisted of a compression of the epiglottis and/or a posteriorly located tongue base. The epiglottic shape was assessed as normal, flat, or curved. Lastly, the modified Cormack-Lehane scale was used to describe the hypopharynx: complete or partial visibility of the vocal cords, visibility of the arytenoids but not of the vocal cords, and no visibility of the glottis (Torre et al., 2018).
Eligibility criteria (table fragment): Body mass index (BMI) ≤35 kg/m²; OSA as defined by the American Academy of Sleep Medicine task force (Iber et al., 2007). Diagnostic criteria: A. Anamnesis (at least one of the following criteria) […]
Table note: Values are presented as median (IQR [quartile 1-quartile 3]) for non-normally distributed data or mean (SD) for normally distributed data. All parameters were compared using the Wilcoxon signed-rank test. BMI was compared using a paired t test. AHI was scored according to the American Academy of Sleep Medicine 1999 criteria (3% oxygen desaturation or an arousal). ODI was calculated as dips of ≥3% over the total time in bed. Significant values (p < 0.05) are shown in bold.

Statistical analyses
All data were analysed using SPSS ® Statistics, version 27.0 (IBM

Evolution of clinical characteristics
A significant improvement was seen in clinical characteristics at the 3-month follow-up compared to baseline […] a median (IQR) score of 9/24 (5-12) to 6/24 (3-10) (p < 0.0001). In terms of treatment outcome, deterioration was seen in 11 patients (15.3%) and response was seen in 33 patients (45.8%). No significant differences were found in baseline clinical characteristics between responders and non-responders, or between deteriorating and non-deteriorating patients (Table 3).

Comparison of airway parameters according to treatment outcome
Between responders and non-responders, a significant difference was present during DISE for tongue base collapse (65.6% in responders and 41.0% in non-responders; p = 0.0404) and for palatal collapse (87.5% in responders and 100% in non-responders; p = 0.0240) (Table 4). There were no significant differences in CFD and awake nasendoscopy parameters regarding response or no response.
No significant differences at baseline were found between deteriorating and non-deteriorating patients regarding CFD parameters (upper airway volume, upper airway resistance, and minimal cross-sectional area) […] deterioration in a C-shaped soft palate (50.0%; p = 0.0105). A significantly higher percentage of oropharyngeal crowding was seen in deteriorating patients (40.0%; p = 0.0098) than in non-deteriorating patients (9.1%). […] AHI and BMI (OR: 8.27 and OR: 5.12, respectively). There was no relationship between MAD outcome and baseline CFD variables. Correlation tests between the different predictor variables showed only weak relationships between these variables.
However, tongue base collapse remains the most stable predictor with a relatively narrow CI, after stepwise inclusion of all the predictive parameters in two separate multivariate models according to treatment outcome (Tables 8 and 9).
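To make the link between an odds ratio and the width of its confidence interval concrete, the sketch below back-calculates the log-odds coefficient and its standard error from the ORs and 95% CIs reported in the summary. It assumes a Wald-type interval that is symmetric on the log scale and a normal approximation for the p value; the study's pooled estimates after multiple imputation may have been derived differently, so the output is only expected to be close to the reported p values. The function name is hypothetical.

```python
import math


def logit_summary(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover beta, SE, z and a two-sided p value from an OR and its 95% CI.

    Assumes ci = exp(beta -/+ z_crit * SE), i.e. a Wald interval on the log scale.
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = beta / se
    # Two-sided normal p value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


# Values as reported in the summary.
tongue_base = logit_summary(3.29, 1.02, 10.64)   # tongue base collapse, response
cccp = logit_summary(28.88, 1.18, 704.35)        # CCCp, deterioration

for name, (beta, se, z, p) in (("tongue base", tongue_base), ("CCCp", cccp)):
    print(f"{name}: beta={beta:.3f} SE={se:.3f} z={z:.2f} p={p:.4f}")
```

The larger standard error recovered for CCCp reflects its much wider interval, which is the sense in which tongue base collapse is described as the more stable predictor.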

Diagnostic statistics and ROC analysis
Diagnostic accuracy of both MAD outcome multimodal prediction models corrected for baseline AHI and BMI was determined, and ROC curves were generated (Figure 5). The optimal predictive probability

DISCUSSION
In general, MAD treatment response is variable and patient dependent. Careful patient selection is therefore necessary to identify eligible patients and avoid unfavourable treatment, which is associated with unnecessary costs and a longer delay toward successful treatment.
Multiple studies report several predictors for oral appliance therapy outcome, although multifactorial models and thorough validation are still lacking (Okuno et al., 2016).
This study outlines a clinical prediction model for MAD treatment outcome in patients with OSA, using patient characteristics obtained during DISE, awake nasendoscopy, and CT scan-based CFD.
The major findings of this prospective study suggest DISE to be the most robust examination associated with MAD treatment outcome, with tongue base collapse during baseline as a positive predictor for successful MAD treatment for OSA. Furthermore, the presence of CCCp is an adverse DISE phenotype towards MAD treatment outcome.
Firstly, the significant results of tongue base collapse regarding response and CCCp regarding deterioration during MAD treatment are preserved in a multimodal assessment adjusted for awake nasendoscopy observations and CFD findings, and after correction for AHI and BMI. A relatively narrow CI is seen for tongue base collapse, enabling a more precise interpretation of the results.
CCCp presents a larger margin of error and requires a larger sample to adequately confirm these results, which makes tongue base collapse the most robust characteristic.
Regarding awake nasendoscopy, only the presence of a prominent uvula (C-shaped position of the soft palate) during tidal breathing remains strongly correlated with MAD treatment deterioration after multimodal labelling. This adds to the results of previous studies indicating that a MAD primarily acts on the soft palate (Kent et al., 2015; Ryan et al., 1999).
Subsequently, no clear correlations have been found between MAD treatment outcome and baseline upper airway volume, as measured with CT scan-based CFD. This is not completely surprising, as it is mainly the presence or absence of an increase in upper airway volume with the use of a MAD that seems to be significantly associated with treatment outcome in previous studies (Chan et al., 2010; Song et […]
Moreover, both logistic regression models exhibit an excellent (AUC 0.8-0.9) and fair (AUC 0.7-0.8) predictive accuracy (Mandrekar, 2010). Implementation of these predictor variables may avoid treatment of patients with a lower probability of response or a high probability of deterioration with a MAD. However, at this point, these models are purely exploratory and are rather a representation of the presence or absence of their clinical applicability. Further validation of these results in a large cohort is thus needed.
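For readers less familiar with the AUC bands quoted above, the following sketch computes the area under the ROC curve as the probability that a randomly chosen positive case receives a higher predicted probability than a randomly chosen negative case, and then maps it to the two bands named in the text. The data are invented for illustration, the function names are hypothetical, and assigning an AUC of exactly 0.8 to the "excellent" band is an assumption.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def accuracy_band(auc_value):
    """Map an AUC to the two bands quoted in the text."""
    if 0.8 <= auc_value <= 0.9:
        return "excellent"
    if 0.7 <= auc_value < 0.8:
        return "fair"
    return "outside the bands quoted in the text"


# Hypothetical outcomes (1 = responder) and predicted probabilities.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
value = auc(labels, scores)
print(round(value, 3), accuracy_band(value))
```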

Strengths and limitations
All examinations were performed in a blinded fashion for the patient and multidisciplinary research team. Consequently, MAD treatment was not affected by the characteristics obtained during CT scan-based CFD, DISE or awake nasendoscopy.
Although the various baseline prediction methods were performed at different timepoints, the timing only varied by up to 2 months between these examinations and between patients, minimising differences related to the time frame. Moreover, all examinations were performed at the earliest 1 month after MAD start, allowing a habituation period of 1 month.
Furthermore, both awake nasendoscopy and CT scans were performed during wakefulness, so the observed results may differ from a sleeping state, as changes in muscle tone at the upper airway occur predominantly during sleep (Fogel et al., 2004). However, the creation of a multimodal model with the addition of endoscopic examinations during drug-induced sleep may overcome this problem and optimise the predictive value for MAD treatment outcome.
In this regard, treatment outcome was assessed using the difference between the baseline AHI value and the AHI with the use of a MAD, in which response was defined as a reduction in the AHI of ≥50% from base- […] Both measurements were determined from a single-night PSG only, and therefore do not account for internight variability. Postural changes, sleep structure, and a first-night effect may also influence OSA severity (Sforza et al., 2019). However, in most studies, the proportion of patients who exhibit internight variability remains limited to between 18% and 35% (Alshaer et al., 2018; Bliwise et al., 1991).
Therefore, the authors postulate that the effect of this limitation remains limited.
As standardised MAD-titration guidelines are lacking, the MAD was fixed at 75% of maximal protrusion to create uniformity among patients and to objectify outcome prediction of MAD treatment. In contrast, in clinical practice in our hospital, an optimal personalised titration is performed (generally ranging between 75% and 100% of maximal protrusion), which will probably improve treatment response.
Moreover, endoscopic examinations are rather subjective in nature and predisposed to high intra- and interobserver variability. In previous DISE studies, a poor to good interobserver agreement has been observed (Kilavuz & Bayram, 2019; Vroegop et al., 2013b). To limit this constraint, a uniform classification system was implemented to score both awake nasendoscopic and DISE observations: all awake nasendoscopic findings were evaluated by the same ENT specialist and reviewed by a second experienced investigator, and the DISE observations were scored by consensus among four experienced ENT surgeons to reduce possible interobserver variability.
Lastly, a multiple imputation approach was implemented to deal with missing data, obtaining a complete dataset with approximately unbiased estimates of all potential predictive characteristics. This gives the advantage of performing further statistical analyses on a larger number of patients, which increases overall statistical power.
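The idea behind predictive mean matching can be sketched in a few lines. The version below is a deliberately simplified illustration, not the SPSS procedure used in the study: it fits a single-predictor linear regression on the observed cases, and for each missing value draws at random from the observed values of the k cases whose predicted means are closest. A full implementation would also draw the regression parameters anew for each imputed dataset; that step is omitted here. All names and data are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("predictor has no variance")
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope


def pmm_impute(xs, ys, k=3, rng=None):
    """Return a completed copy of ys; None entries are filled by donor matching."""
    rng = rng or random.Random()
    observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
    intercept, slope = fit_line([x for x, _ in observed], [y for _, y in observed])
    # Each donor is (predicted mean, actually observed value).
    donors = [(intercept + slope * x, y) for x, y in observed]
    completed = []
    for x, y in zip(xs, ys):
        if y is not None:
            completed.append(y)
            continue
        target = intercept + slope * x
        nearest = sorted(donors, key=lambda d: abs(d[0] - target))[:k]
        completed.append(rng.choice(nearest)[1])  # imputed value is a real observation
    return completed


def multiple_imputation(xs, ys, m=5, k=3, seed=0):
    """Create m completed datasets that differ only in the random donor draws."""
    rng = random.Random(seed)
    return [pmm_impute(xs, ys, k=k, rng=rng) for _ in range(m)]


# Hypothetical predictor (xs) and partially missing variable (ys).
xs = [20, 25, 30, 35, 40, 45]
ys = [10, None, 20, 24, None, 33]
for dataset in multiple_imputation(xs, ys):
    print(dataset)
```

Because imputed values are always borrowed from observed cases, they stay within the range of plausible measurements. In a multiple imputation analysis, each completed dataset is typically analysed separately and the estimates are then pooled.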
In conclusion, to our knowledge, this is the first study to combine DISE findings and awake examinations in one model for predicting MAD treatment outcome. A recent study concluded that a prediction model with awake assessments has no added value in the prediction of MAD treatment outcome compared to the use of clinical baseline characteristics alone (Sutherland et al., 2018 […]
All authors contributed to revision and final approval of the manuscript.

ACKNOWLEDGMENTS
This study was funded by a 3-year grant of the Flemish government agency for Innovation by Science and Technology (IWT-090864).

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request.