A Non-invasive Procedure for Early Stage Discrimination of Malignant and Precancerous Vocal Fold Lesions Based on Laryngeal Dynamics Analysis

The authors report no declaration of interest. Maria Schuster received remuneration for lectures

Running title: Early stage detection of malignant vocal fold lesions

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.

Abstract
About two thirds of laryngeal cancers originate at the vocal cords. Early stage detection of malignant vocal fold alterations, including their discrimination from premalignant lesions, is a major challenge in laryngology because precancerous vocal fold lesions and small carcinomas are difficult to distinguish by regular endoscopy alone. We report a procedure to discriminate between malignant and precancerous lesions by measuring the characteristics of vocal fold dynamics by means of a computerized analysis of laryngeal high-speed videos. Ten patients with squamous cell T1a carcinoma, ten with precancerous lesions with hyperkeratosis, and ten subjects without laryngeal disease underwent high-speed laryngoscopy yielding 4,000 images per second. By means of wavelet-based phonovibrographic analysis, a set of three clinically meaningful vibratory measures was extracted from the videos, which comprised a total of 15,000 video frames. Statistical analysis (ANOVA with post hoc two-sided t tests, P < 0.05) revealed that vocal fold dynamics are significantly affected in the presence of precancerous lesions and T1a carcinoma. On the basis of the three measures, a discriminating pattern was extracted using a support vector machine-learning algorithm that classifies each individual with respect to the different clinical groups. By applying a leave-one-out cross-validation strategy, we could show that the proposed measures discriminate with very high performance between precancerous lesions and T1a carcinoma (sensitivity: 100%, specificity: 100%). Although a large-scale study will be necessary to confirm clinical significance, the set of vibratory measures derived in this study may be applicable to improve the accuracy and reliability of non-invasive diagnostics of vocal fold lesions.


Introduction
Upper aerodigestive tract cancer ranks seventh among worldwide cancer incidences, with laryngeal malignancies being among the most common representatives (1). About two thirds of laryngeal cancers originate at the vocal cords, and more than 90% are squamous cell carcinomas (2). An early diagnosis is directly correlated with a favorable prognosis; more than 90% of patients with early laryngeal cancer can be cured without losing laryngeal function (3). In laryngology, one of the greatest challenges is thus the early detection of malignant alterations of the vocal folds and their distinction from premalignant lesions. Both are similar in appearance, showing irregular or thickened mucosa due to structural changes with invasion of the subepithelial space in the case of malignant lesions (4) or hyperkeratosis and hyperplasia to severe dysplasia (3,5) in the case of premalignant lesions. Dysphonia, initially caused by defective mucosal vibration, is the first presenting symptom of malignant processes. Thus, the assessment of anomalous properties of the laryngeal dynamics induced by infiltrative processes of the deep submucosal structures of the vocal folds serves as a predictor of glottic cancer. Clinically, laryngostroboscopic imaging constitutes the most important tool for functional investigation of the larynx. Because the diagnostic evaluation of stroboscopic videos or fiberendoscopy is based solely on subjective visual inspection, slight changes of the vocal fold dynamics induced by early infiltrative processes may not be observable by the human eye. Furthermore, small carcinomas and precancerous alterations can hardly be distinguished. As a consequence, biopsy or total excision of the suspicious lesion is currently necessary, including histopathologic examination of the excised tissue for reliable diagnostics. However, a clear distinction before invasive diagnostics would be essential for therapy planning. On the one hand, as malignant transformation in low dysplasia degrees is rather rare (6,7), unnecessarily extensive excision and the risk of unfavorable scarring could be prevented if presurgical diagnostics were sufficiently reliable (8). On the other hand, knowing about the malignant nature of a lesion could accelerate adequate therapy and lead to a better functional outcome either by excision or radiotherapy. Thus, pretreatment diagnostics should be optimized for adequate surgical or nonsurgical therapies.
Several presurgical diagnostic methods referring to the vibratory function or the structure of a lesion have been introduced for this purpose. The vibratory function of the vocal folds, including the mucosal wave propagation, is commonly judged visually using stroboscopy with rigid laryngoscopes (9). As malignant lesions infiltrate subepithelial structures, mucosal wave propagation can no longer be observed. However, precancerous lesions and carcinomas can considerably disturb the temporal structure of affected voices. Hence, the visually assessed virtual stroboscopic slow-motion video is distorted because of a desynchronized triggering of the light source flashes. Moreover, this method is based on perceptual assessment with limited reliability (10).
Other optical methods, such as autofluorescence or narrowband imaging, aim at a more detailed two-dimensional (2D) imaging of the superficial structure of mucosal alterations (11-13). Although these two methods are both highly sensitive in detecting tissue abnormalities and quite reliable in showing the exact superficial extension of the lesions, they are somewhat unspecific with regard to differentiating noninvasive from invasive lesions. In contrast, both optical coherence tomography and confocal laser endomicroscopy provide insights into subsurface tissue architecture at the microscopic level in vivo. It was reported that optical coherence tomography can differentiate well between precancerous lesions and early invasive cancer, whereas confocal laser endomicroscopy serves to monitor malignant alterations at the cellular level (14,15). As both methods are usually applied during direct laryngoscopy under general anesthesia, however, they have not yet found their way into clinical routine.
In this article, we present a new method that examines the vibratory function of the vocal folds with high temporal resolution. In normal larynges, vocal fold vibrations show a three-dimensional (3D) pattern depending on the myoelastic characteristics of the vocal folds and aerodynamics during expiration (16). Using endoscopy from above, a 2D regular opening and closing pattern of the glottis can be observed. Symmetry of the vocal fold vibrations and glottal closure is a precondition of a normal voice (17). Nowadays, high-speed video (HSV) laryngoscopy allows for new insight into physiologic and pathologic mechanisms of the function of vocal folds (18). HSV captures vocal fold dynamics in real time and is suited for a quantitative analysis. Because of its objective nature, HSV has been shown to be more reliable than stroboscopy (19). Supplemented with appropriate computerized analysis procedures, HSV allows for the detection of even slight disturbances of vocal fold vibration (20).
Several image processing approaches have been developed for the analysis of the vibratory characteristics of vocal folds. Most current approaches analyze the time-varying glottal area (21) enclosed by the vocal folds or the lateral deflection given by the distance between both vocal folds along the glottal axis (22). For phase asymmetries concerning the left and right vocal fold and the anterior-posterior (AP) dimension, a high diagnostic relevance was shown (23-25). Lateral and AP asymmetries are considered to characterize changes of mass or tension of the vocal folds during vibration (17,25,26), and therefore need to be considered for a full and detailed quantitative analysis.
Lohscheller and Eysholdt (27) introduced a comprehensive analysis approach that allows the quantitative description of the entire 2D vibration pattern of both vocal folds along the visible AP dimension (glottal axis). The procedure extracts and describes the time-varying contours of the two medial vocal fold edges as a function of the glottal axis. The extracted spatiotemporal information of vocal fold vibration is mapped into a clinically meaningful 2D image that encodes the time-varying distances from both vocal folds to the glottal midline axis as color information. The resulting graph is termed phonovibrogram (PVG; ref. 27). It was demonstrated that PVGs transform the relevant motion information into characteristic static geometric structures that comprehensively describe vocal fold vibration patterns (28).
Besides providing a powerful diagnostic tool for visual assessment, PVGs further form the basis for a computerized analysis of vocal fold vibrations. For this purpose, the geometric shapes of the PVG structures are extracted and quantitatively described using image processing and machine-learning approaches. Voigt and colleagues demonstrated the applicability of PVGs for the discrimination of healthy and paralytic larynges (29), as well as of healthy voices and functional voice disorders (30). In this first approach, more than 300 features were extracted from the PVG, which are, however, rather difficult to interpret in the clinical context. To overcome this problem of high dimensionality and low interpretative power, a wavelet-based approach was recently proposed (31). This method is capable of condensing the entire information about the opening and closing mechanism of the vocal folds encoded within a PVG into a very low number of parameters, which are clinically interpretable because they are related to the glottal closure types defined in the basic protocol for functional assessment of voice pathology elaborated by the European Laryngological Society (32). The wavelet-based PVG analysis approach showed promising results for the classification of functional and organic voice disorders (33).
As the distinction between precancerous diseases and malignant alterations of the vocal folds relies primarily on endoscopy, in this study, we examine whether high-speed laryngoscopy combined with PVG analysis might contribute to the differentiation between healthy vocal folds, precancerous lesions, and carcinomas of the vocal folds. By extending the PVG-wavelet approach, we investigated anomalies of vocal fold vibratory characteristics in the presence of precancerous lesions and T1a carcinoma, and compared the results with data obtained from healthy subjects.
All subjects were examined while sitting on a chair with the head held straight but tilted slightly back (to facilitate endoscopy). They were instructed to phonate a sustained vowel /ae/ at a comfortable pitch and loudness during the examination procedure. Recordings of subjects with vocal fold precancerous lesions or carcinomas were performed before surgical interventions. All diagnoses were made after surgery according to the histologic report of excised tissue of the vocal folds.
For the high-speed endoscopy, the HRES Endocam 5562 High-Speed Video System (Richard Wolf GmbH) was used. All laryngeal recordings were captured at 4,000 frames per second in color with a spatial resolution of 256 × 256 pixels. For each subject, a sequence of 500 frames (125 ms) was considered. Images were captured with a rigid 70° endoscope (Model HRES laryngoscope; Richard Wolf GmbH) and a 300-W Xenon light source (LP 5132; Richard Wolf GmbH). The rigid laryngoscope was coupled to the high-speed digital camera head.

Phonovibrogram computation
To compare objectively the specific characteristics of the vibration patterns between the different groups, all high-speed recordings were quantitatively analyzed using PVGs (28). The approach includes the following steps. Initially, the vibrating medial vocal fold edges were extracted from all high-speed recordings as shown in Fig. 1A. In this study, a total of 15,000 high-speed images were successfully segmented using a specially designed and clinically evaluated segmentation procedure (34). Within a subsequent transformation step (Fig. 1B), the distances between the vocal fold edges and the glottal midline were computed, and the left vocal fold was virtually turned around the posterior glottal ending represented by the point P. For visualization purposes, the computed distance values were color-coded, resulting in a corresponding color strip; here, a gray-scale representation was chosen. Iterating the described procedure for an entire high-speed sequence and concatenating the resulting gray-scale strips resulted in a 2D image that was denoted as PVG. A PVG contains the entire information about the full spatiotemporal vibration pattern of the left (top) and right (bottom) vocal fold along the glottal axis. In Fig. 1C, three PVG oscillation cycles of the male subject H8 are depicted. For each oscillation cycle, the shape of the geometric pattern, which is highlighted by the white dotted contour line within the second oscillation cycle as an example, contains precise information about the opening and closing process of the respective vocal fold. For a detailed description of the PVG construction process, refer to Lohscheller and colleagues (28).
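The transformation step described above can be illustrated with a small sketch. It assumes that segmentation has already produced, for every frame, the lateral pixel positions of the left edge, the right edge, and the glottal midline at the same K points along the glottal axis (index 0 at the posterior end), and that the left fold lies at smaller pixel coordinates than the midline. The function names `pvg_strip` and `build_pvg` are hypothetical and are not taken from the cited PVG software.

```python
import numpy as np

def pvg_strip(left_edge, right_edge, midline):
    """Distance strip for one frame: left fold on top, right fold below."""
    midline = np.asarray(midline, dtype=float)
    d_left = midline - np.asarray(left_edge, dtype=float)
    d_right = np.asarray(right_edge, dtype=float) - midline
    # Turn the left fold around the posterior end (index 0) so that the
    # posterior point sits in the middle of the strip.
    return np.concatenate([d_left[::-1], d_right])

def build_pvg(left_edges, right_edges, midlines):
    """Concatenate the strips of all N frames into a (2K, N) image."""
    strips = [pvg_strip(l, r, m)
              for l, r, m in zip(left_edges, right_edges, midlines)]
    return np.stack(strips, axis=1)
```

Each column of the returned array corresponds to one video frame, so a 500-frame sequence yields a PVG that is 500 columns wide; mapping the distance values to gray levels is left to the plotting routine.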

Analysis of phonovibrograms
As shown before, the entire information about the 2D vocal fold dynamics can be captured within a PVG. Therefore, the analysis of vocal fold vibrations is equivalent to the quantitative analysis of the geometric patterns within a PVG. In the following, we present the procedure by which the relevant information about the vocal fold vibrations is extracted from PVGs and further condensed into a set of distinct, clinically useful measures.
Initially, the glottal area waveform (GAW) as well as the hemi-GAWs (H-GAW$_{L/R}$), which reflect each vocal fold side individually, were derived from the PVGs. As shown in Fig. 1A, the H-GAW$_{L/R}$ are defined as the areas spanned between the left/right vocal fold and the glottal main axis, representing the proportional change of the glottal area induced by the particular vocal fold movement. As an example, the time-varying H-GAW$_{L/R}$ derived from healthy subject H8 are shown in Fig. 1D. PVGs and H-GAWs provided the basis for the further objective analysis of vocal fold vibrations.
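As a rough illustration of how the hemi-GAWs follow from a PVG, the sketch below sums the midline distances of each half of the PVG per frame. It assumes a PVG stored as a (2K, N) array with the left fold in the top K rows and unit pixel spacing along the glottal axis, so that the sum of distances approximates the enclosed area in pixels. The function name `hemi_gaws` is hypothetical.

```python
import numpy as np

def hemi_gaws(pvg):
    """Return (H-GAW_L, H-GAW_R, GAW) as arrays of length N."""
    pvg = np.asarray(pvg, dtype=float)
    k = pvg.shape[0] // 2
    h_left = pvg[:k].sum(axis=0)    # area between left edge and midline
    h_right = pvg[k:].sum(axis=0)   # area between right edge and midline
    return h_left, h_right, h_left + h_right
```

Under these assumptions the total GAW is simply the sum of the two hemi-GAWs, which is why each hemi-GAW can be read as the share of the glottal area contributed by one vocal fold.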

Measure M1: lateral phase delay
In the case of unilateral alterations of the elasticity of vocal fold tissue, it can be assumed that temporal shifts occur between the vibrations of the left and right vocal fold. Figure 2 shows, for subject P3, the variations in time of the left and right hemi-GAWs with respect to the total GAW signal. For each side, the horizontal arrows represent the temporal displacement between the respective H-GAW and the GAW signal. The left vocal fold is in advance, whereas the right vocal fold lags behind.
As H-GAW and GAW signals can be regarded as narrow-band oscillations, the phasing of each signal can be expressed as a function of time by the corresponding phase signals $\varphi_{\text{H-GAW}_{L,R}}(n)$ and $\varphi_{\text{GAW}}(n)$ using a complex wavelet analysis (31). As shown in Fig. 2B, for each point in time the angle difference represents the phase shift between the particular H-GAW and the GAW signal. To avoid phase differences greater than $\pi$ and less than $-\pi$, the first measure $\Theta_{L,R}$ is derived by transforming the angles onto the complex unit circle via $z = \exp(i\varphi)$ (Fig. 2C) and computing for an entire high-speed sequence the mean phase difference in the complex plane as:

$$\Theta_{L,R} = \arg\left( \frac{1}{N} \sum_{n=1}^{N} \exp\left( i \left[ \varphi_{\text{GAW}}(n) - \varphi_{\text{H-GAW}_{L,R}}(n) \right] \right) \right)$$

where $N$ denotes the total number of video frames. For an entire high-speed sequence, the measure $\Theta_{L,R}$ reaches zero for perfect synchronism. For negative values, the corresponding vocal fold is in advance of the GAW signal, whereas a positive sign indicates a delay. In this study, the measure $\Theta_{L,R}$ is used to describe the degree of lateral phase delay occurring in the different clinical groups.
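A minimal numerical sketch of measure M1 follows. It is not the implementation of ref. 31: the phase is estimated here with a single complex Morlet wavelet centered on an assumed known fundamental frequency `f0`, the wavelet width `n_cycles` is an arbitrary choice, and the function names are hypothetical. The sign is chosen so that a negative value means the hemi-GAW is in advance of the GAW, as stated in the text.

```python
import numpy as np

def wavelet_phase(x, f0, fs, n_cycles=6.0):
    """Instantaneous phase of x near frequency f0 (complex Morlet wavelet)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                              # remove the offset
    sigma = n_cycles / (2.0 * np.pi * f0)         # Gaussian width in seconds
    t = np.arange(-3.0 * sigma, 3.0 * sigma, 1.0 / fs)
    wavelet = np.exp(2j * np.pi * f0 * t) * np.exp(-t**2 / (2.0 * sigma**2))
    return np.angle(np.convolve(x, wavelet, mode="same"))

def lateral_phase_delay(h_gaw, gaw, f0, fs):
    """Mean phase difference on the unit circle, in radians.

    Negative: the hemi-GAW leads the GAW. Positive: it lags.
    """
    dphi = wavelet_phase(gaw, f0, fs) - wavelet_phase(h_gaw, f0, fs)
    # Averaging unit vectors instead of raw angles avoids jumps at +/- pi.
    return float(np.angle(np.mean(np.exp(1j * dphi))))
```

For a 500-frame recording at 4,000 frames per second, one would call `lateral_phase_delay(h_gaw_left, gaw, f0, 4000.0)` with `f0` set to the fundamental frequency of the recording. Samples near the start and end of the sequence are affected by the truncated wavelet, so a stricter analysis might discard them.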

Measure M2: lateral asymmetry of oscillation modes
To derive quantitative information about the degree of lateral asymmetry with respect to the vocal fold oscillation modes, we applied the PVG wavelet analysis approach recently introduced by Unger and colleagues (31). The wavelet-based approach objectively describes vocal fold dynamics by quantifying the geometrical patterns occurring within a PVG for each vocal fold independently. Figure 3A shows two PVG examples: one for the healthy subject H9 (upper row) and one for the subject C4 (lower row), who has a unilateral T1a carcinoma. The goal of the analysis is to extract automatically, for each vocal fold, the geometric structure from the PVG, which is highlighted here by the white dotted lines within the third oscillation cycle of the PVGs. These contours represent the spatiotemporal information about the particular vocal fold oscillation pattern. In a former work, it was shown that these geometrical patterns within the PVGs are extractable by performing a PVG-wavelet decomposition (31). The results of applying the procedure to a PVG are shown for the two subjects in Fig. 3B. To quantify the extracted curve progressions, a principal component analysis (PCA) was applied as described in ref. 31. The PCA condenses the main information about the curve characteristics within the first three dominant eigenvalues (see Fig. 3C) for the left ($\lambda_{L1}$, $\lambda_{L2}$, $\lambda_{L3}$) and the right vocal fold ($\lambda_{R1}$, $\lambda_{R2}$, $\lambda_{R3}$). Because of the direct relation between PVG and vocal fold dynamics, these eigenvalues represent the information about the spatiotemporal characteristics of the vibratory patterns for each vocal fold independently.
Unger and colleagues (31) showed that the eigenvalues $\lambda_{1-3}$ constitute a quantitative representation of the glottal closure types that are defined in the basic protocol for functional assessment of voice pathology elaborated by the European Laryngological Society (32) and that formerly could only be rated subjectively. Using the quantifiable eigenvalues, the degree of the lateral differences of the oscillation modes can thus easily be determined as the Euclidean distance between the eigenvalues of the left and right side:

$$\Delta\lambda = \sqrt{\sum_{j=1}^{3} \left( \lambda_{Lj} - \lambda_{Rj} \right)^{2}}$$

In Fig. 3 on the right, the second measure $\Delta\lambda$ is given as an example for the physiologic and the pathologic case. In the case of an increased lateral vibration asymmetry, the proposed measure $\Delta\lambda$ is considerably raised compared with the physiologic case. In this study, we used this second measure to quantify the vibration asymmetry of the vocal folds induced by precancerous lesions or T1a carcinoma.
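The caption of Fig. 3 describes the second measure as the Euclidean distance between the eigenvalues of the left and right side. Under that reading, the computation reduces to a few lines; the function name and the eigenvalue triples in the usage note are hypothetical and serve only to illustrate the arithmetic.

```python
import math

def oscillation_mode_asymmetry(lam_left, lam_right):
    """Euclidean distance between left and right PCA eigenvalue triples."""
    if len(lam_left) != len(lam_right):
        raise ValueError("both sides need the same number of eigenvalues")
    return math.sqrt(sum((l - r) ** 2 for l, r in zip(lam_left, lam_right)))
```

Identical eigenvalues on both sides, for example `oscillation_mode_asymmetry([3, 2, 1], [3, 2, 1])`, give zero, whereas any lateral difference in the oscillation modes increases the value.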

Measure M3: lateral asymmetry of AP phase delay progression
Depending on the particular oscillation mode, vocal folds start opening first either at their anterior, posterior, or medial parts. Figure 4A shows a PVG extracted from subject H5, which incorporates two oscillation cycles. Here, vocal folds started opening first at the anterior third at position $k_2 = 90\%$ from the most posterior to the anterior ending of the vocal folds. The opening process continued over time from anterior to posterior (indicated by the white arrows), resulting in a characteristic AP phase delay progression.
For each position $k$ along the glottal axis and for each vocal fold, the AP phase delay can be analyzed in detail by evaluating the temporal shifts between the trajectories $T_{L,R}(n,k)$ and the corresponding H-GAW$_{L,R}(n)$ (see Fig. 4B). Analogously to measure M1, the phase delay between the trajectories $T_{L,R}(n,k)$ and H-GAW$_{L,R}$ can be expressed via the complex wavelet phase, denoted as $\varphi_{T_{L,R}}(n,k)$. In contrast to M1, for each vocal fold the phase delays were computed between each line of a PVG and the appropriate hemi-GAW, resulting in an AP phase delay along the glottal axis as shown in Fig. 4C. Here, the resulting phase delay of the left or right vocal fold at position $k$ and frame $n$ quantifies a phase difference along the AP direction. Consequently, the lateral difference between the left and right phase delays at a given position represents the degree of lateral AP phase asymmetry for a single trajectory. By averaging over all $N$ frames of a high-speed sequence and over all $K$ trajectories, the measure constitutes the mean AP phase asymmetry, which reaches zero when the AP phase displacements are perfectly identical for the left and right vocal fold.

Figure 2. A, GAW and hemi-GAWs of subject P3 with a precancerous lesion on the right side. B, phase signals of GAW, H-GAW$_L$, and H-GAW$_R$. The phase was estimated using a wavelet-based analysis of PVGs. The phase difference provides information about the relationship of the temporal phasing; the left vocal fold waveform is in advance of the GAW, whereas the right vocal fold is slightly delayed. C, to avoid potential phase jumps, phases were transformed to the complex unit circle.
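The computation of measure M3 can be sketched along the same lines as M1. Several points below are assumptions rather than details taken from the text: the PVG is a (2K, N) array with the left fold in the top K rows (mirrored about the posterior end), the phase is estimated with a single complex Morlet wavelet at a known fundamental frequency `f0`, and the lateral difference is averaged as an absolute, wrapped angle. The function names are hypothetical.

```python
import numpy as np

def wavelet_phase(x, f0, fs, n_cycles=6.0):
    """Instantaneous phase of x near frequency f0 (complex Morlet wavelet)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    sigma = n_cycles / (2.0 * np.pi * f0)
    t = np.arange(-3.0 * sigma, 3.0 * sigma, 1.0 / fs)
    wavelet = np.exp(2j * np.pi * f0 * t) * np.exp(-t**2 / (2.0 * sigma**2))
    return np.angle(np.convolve(x, wavelet, mode="same"))

def ap_phase_asymmetry(pvg, f0, fs):
    """Mean absolute left/right difference of the AP phase delays (radians)."""
    pvg = np.asarray(pvg, dtype=float)
    k = pvg.shape[0] // 2
    # Undo the mirroring of the left fold so both run posterior -> anterior.
    left, right = pvg[:k][::-1], pvg[k:]
    ref_left = wavelet_phase(left.sum(axis=0), f0, fs)    # hemi-GAW phase
    ref_right = wavelet_phase(right.sum(axis=0), f0, fs)
    delay_left = np.array([wavelet_phase(row, f0, fs) - ref_left for row in left])
    delay_right = np.array([wavelet_phase(row, f0, fs) - ref_right for row in right])
    # Wrap the lateral difference to (-pi, pi] before averaging over k and n.
    diff = np.angle(np.exp(1j * (delay_left - delay_right)))
    return float(np.mean(np.abs(diff)))
```

With this definition the value is zero when both vocal folds show the same AP phase progression and grows as the progressions diverge. Rows with almost no movement have a poorly defined phase, so a practical analysis would probably need to mask them.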

Analysis procedure
To identify potential differences in the three measures described above between the groups, statistical tests were performed using MATLAB R2013b. The Shapiro-Wilk test was applied to test for normality. Bonferroni-Holm corrected ANOVA was subsequently used with post hoc two-sided t tests. The significance level was taken as P < 0.05.
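The text names a Bonferroni-Holm correction but does not spell out how it was applied. As a hedged illustration, the sketch below implements the standard Holm step-down procedure for a family of P values; the function name is hypothetical, and the P values in the usage note are made up for illustration and are not results of this study.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return one boolean per P value: True if the null hypothesis is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # smallest P first
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The threshold relaxes from alpha/m up to alpha as rank increases.
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # step-down: all remaining (larger) P values are retained
    return reject
```

For three illustrative tests with P values 0.01, 0.04, and 0.03, only the first would remain significant at the 0.05 level, because 0.03 exceeds the second threshold of 0.025 and the procedure stops there.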
Besides the identification of group differences using statistics, we further investigated whether the proposed measures were sufficient for a personalized classification of the subjects into the different clinical groups. For this purpose, we used a machine-learning approach. A support vector machine (SVM) with a radial basis function (RBF) kernel was trained on those measures having the most distinctive power according to the results of the statistical analysis. For training and classification, a leave-one-out strategy was applied.

Figure 3. A, PVGs shown from the healthy subject H9 (top) and the subject C4 with a T1 carcinoma on the left vocal fold (bottom). B, a wavelet-based analysis was applied (29) to extract the recurring geometric PVG pattern for left and right vocal fold separately. C, the geometric contour patterns can be quantified adequately with merely three coefficients for each vocal fold side obtained from principal component analysis as described previously (29). The Euclidean distance between the eigenvalues of the left and right side provides a measure of lateral asymmetry of oscillation modes, which is considerably increased for the subject C4.
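The leave-one-out SVM classification described in the text can be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the authors' pipeline: the feature standardization step, the hyperparameters `C=1.0` and `gamma="scale"`, and the helper name `loo_predictions` are choices made here, and the data are synthetic clusters, not patient measurements.

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def loo_predictions(features, labels):
    """Predict each subject with an RBF-SVM trained on all other subjects."""
    model = make_pipeline(StandardScaler(),
                          SVC(kernel="rbf", C=1.0, gamma="scale"))
    return cross_val_predict(model, features, labels, cv=LeaveOneOut())

# Synthetic stand-in: three groups of ten subjects in a 2D feature space.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
features = np.vstack([c + 0.3 * rng.standard_normal((10, 2)) for c in centers])
labels = np.repeat([0, 1, 2], 10)

pred = loo_predictions(features, labels)
accuracy = float(np.mean(pred == labels))
```

Because every subject is held out once, the scaler and the SVM are refitted 30 times here. With only ten subjects per group, each held-out prediction carries considerable weight in the resulting sensitivity and specificity estimates.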

Results
The three measures M1, M2, and M3 were evaluated for all 30 subjects to demonstrate differences in vocal fold dynamics among the three groups.

Measure M1: lateral phase delay
Figure 5A shows the phase delays $\Theta_{L,R}$ between the H-GAWs and the GAW signals for each clinical group. As expected, for the control group, the values of $\Theta_{L,R}$ are close to zero, representing just minor lateral phase delays. The value range of the right vocal fold can thus be regarded as the normal range, which is highlighted by the gray-shaded band within the graph.
For the precancerous lesions group, the measure $\Theta_{L,R}$ is shown for the subgroups "unilateral left," "unilateral right," and "bilateral." For all subgroups, the values of $\Theta_{L,R}$ are slightly enlarged and show an increased variance. Despite this alteration, the median values of $\Theta_{L,R}$ are in most cases within the normal range.
For the T1a carcinoma group, the measure $\Theta_{L,R}$ is presented in Fig. 5A, subdivided into the subgroups "unilateral left" and "unilateral right." For both subgroups, the affected vocal fold clearly vibrates temporally ahead of the contralateral side, as indicated by the negative sign of the measure $\Theta_{L,R}$, whereas the phase delays of the nonaffected vocal folds are within the normal range. The consistent temporal lead of the cancerous vocal fold side is a distinct vibration characteristic that is selectively present in T1a carcinoma and not visible in the precancerous lesions group.
For clinical interpretation, the lateral difference between the left and right phases, $\Delta\Theta$, is of particular interest. For the three clinical groups, the values of $\Delta\Theta$ are displayed as boxplots in Fig. 5B. The phase differences were lowest for the control group. Both for the precancerous lesions group and for the T1a carcinoma group, the distances $\Delta\Theta$ were increased in comparison with the control group. ANOVA revealed significant differences between the groups. The post hoc tests showed that the phase difference $\Delta\Theta$ of the T1a carcinoma group is significantly increased with respect to the control group. The results of the statistical tests, including the P values, are summarized in Table 1.

Measure M2: lateral asymmetry of oscillation modes
For each clinical group, the results of the measure $\Delta\lambda$ are presented in Fig. 6A, representing the degree of lateral asymmetry of vocal fold dynamics. As expected, the control group was characterized by low values, indicating a high level of symmetry. ANOVA revealed significant differences between the groups (Table 1). The pairwise post hoc tests further showed that all groups exhibit significantly different asymmetry values $\Delta\lambda$. The tests show that significant differences exist even between the precancerous lesions and the T1a cancer group.

Figure 5. A, lateral phase delays between the H-GAWs and the GAW signals (measure M1). The control group exhibited relatively small phase displacements, with just a minor spreading representing the normal range (emphasized by the gray-shaded band). For the carcinoma group, the affected vocal folds tended to vibrate in advance of the contralateral side, whereas no preferential phase direction was seen for the precancerous lesions. B, the absolute phase between the left and right side. Statistics revealed a significantly increased phase delay of the carcinoma group with respect to the control group. The increase for the precancerous lesions group was not significant.

Measure M3: lateral asymmetry of AP phase delay progression
For each clinical group, the results concerning the degree of lateral AP phase delay asymmetry (measure M3) are displayed in Fig. 6B. The control group was characterized by very low asymmetry values and a small within-group variance. The ANOVA test revealed significant differences between all groups (Table 1). The post hoc tests further corroborated significantly increased asymmetry values of the precancerous lesions group and the T1a carcinoma group with respect to the control group.
The results of the statistical evaluation showed that the measures $\Delta\lambda$ (M2) and the lateral AP phase delay asymmetry (M3) exhibited the most distinctive power to discriminate efficiently between the groups. To investigate whether a personalized classification of the subjects into the different clinical groups was feasible, an SVM (RBF kernel) was trained on the basis of these two measures, which span a 2D parameter space. On the basis of a leave-one-out strategy, the parameter space could be automatically subdivided into three different parameter regions that distinguish the healthy group, the precancerous lesions group, and the T1a carcinoma group. Overall, all subjects except one could be classified correctly. The misclassified subject, originally belonging to the "precancerous lesions" group, was classified as healthy (see Fig. 7).
Considering the automated classification of the groups "T1a carcinoma" versus "control group," a sensitivity (true positive rate, TPR) of 100% as well as a specificity (true negative rate, TNR) of 100% could be achieved. In the case of the two-class problem "precancerous lesions" versus "control group," sensitivity and specificity are TPR = 100% and TNR = 90%. For the intermediate two-class problem, "precancerous lesions" versus "T1a carcinoma," all subjects were classified correctly.

Discussion
In this article, we describe the first objective approach to distinguish squamous cell T1a carcinoma from precancerous lesions of the vocal folds with reference to normal vocal folds based on a computerized analysis of laryngoscopic HSVs. The analysis of vocal fold dynamics presented here is performed by extracting in a first step the vibrating vocal fold edges from the HSV stream (33) and condensing the extracted vocal fold dynamics into PVGs (27,28). Former studies already showed that PVGs are in principle suitable for detecting even slight inter- and intraindividual changes of vocal fold dynamics (29,35).
In this study, we applied a wavelet-based approach for the quantitative analysis of PVGs to describe objectively the vibratory characteristics along the entire glottal axis with a limited set of clinically meaningful parameters (31,33). Because carcinoma as well as precancerous lesions change the internal structure of the vocal folds, we propose three parameters that are designed to measure the medio-lateral (coronal) and AP (sagittal) asymmetry and phasing of the vocal folds for the squamous cell T1a carcinoma, the precancerous lesions, and the healthy group.

Figure 6. A, measure M2, lateral asymmetry of oscillation modes. B, measure M3, lateral asymmetry of AP phase delay progression. Both measures are significantly higher for the pathologies than for the control group. Moreover, lateral asymmetry of oscillation modes is significantly different for precancerous lesions compared with the carcinoma group.

Figure 7. Parameter space of the two most distinct measures M2 and M3. The space is divided into three regions using an SVM with RBF kernel. For separating the control group and precancerous lesions, the boundary causing the misclassification of one subject with precancerous lesions is shown. Carcinoma and precancerous lesions are separated correctly.
The first measure $\Theta_{L,R}$ quantifies the medio-lateral phase displacement with respect to the GAW. Because of the inertia of a unilateral mass augmentation in the T1a group, one would expect a slight delay of the affected vocal fold. However, the results show that the affected vocal folds exhibit a distinct negative phase delay with respect to the GAW signal. This means that the movement of the affected vocal fold tends to be predominantly in advance of the contralateral side. The reason can be seen in the characteristic stiffness of the affected vocal fold, which was found to be significantly greater than that of normal vocal fold tissue (36). For the precancerous lesions group, the sign and amount of the phasing are distributed over a broader range. However, statistical testing showed that for carcinoma the absolute lateral phase delay between the vocal folds is significantly increased, whereas the increase for the precancerous lesions is not significant.
Likewise, statistical analysis confirmed significantly increasing values of the lateral asymmetry of oscillation modes $\Delta\lambda$ and of the lateral asymmetry of the AP phase delay (M3) from healthy subjects over the precancerous lesions group to the T1a group. In contrast to conventional approaches, which frequently analyze the lateral asymmetry of amplitude values, the measure $\Delta\lambda$ applied here gives information about the lateral asymmetry with regard to the entire spatiotemporal oscillation pattern. It therefore comprises much more information about the entire laryngeal dynamics than conventional asymmetry measures. The asymmetry of the lateral AP phase delay is described here for the first time. It shows minimal variance for the healthy group and is thus very selective for the differentiation between the physiologic and pathologic cases. Furthermore, in combination with $\Delta\lambda$, it allows a correct classification of each subject to its appropriate group, as shown by the machine-learning approach. The different clinical groups showed clearly delimitable regions within the parameter space.
In this first evaluation, high-speed laryngoscopy in combination with the PVG analysis is a promising approach to distinguish malignant from precancerous lesions and from healthy vocal folds. The parameters obtained by the wavelet-based analysis were highly sensitive in differentiating the diverse vocal fold vibration patterns. The results suggest that the presented method has the potential to improve and complement the assessment of specific properties of vocal fold dynamics in a quantitative way. It needs, however, to be mentioned that the results presented here refer to the identification of squamous cell carcinomas, which are the most frequent type of vocal fold cancer. Thus, a direct transfer to other cancers, such as the Ackerman tumor, is not possible because it grows more superficially. A multicenter large-scale study is to be pursued to confirm and generalize the results of this research study.
The computing time for the entire procedure is in the range of a few minutes, so that it can be applied directly after the laryngeal examination. The only preconditions are the availability of a laryngeal HSV system and sufficient image quality of the video data (adequate illumination, no fogging of the optics, and no vocal fold occlusion). Concerning the high-speed system, it can be expected that, owing to technological change in imaging systems, high-speed laryngoscopes will replace the currently available stroboscopes in the foreseeable future. Because of its high sensitivity and direct clinical applicability, the described procedure has the potential to improve the diagnostic workflow by lowering the number of unnecessary biopsies, and thereby the risk of unfavorable scarring and persistent hoarseness after unnecessary removal of nonmalignant tissue.
By combining this method with further new imaging technologies, such as autofluorescence, narrow-band imaging, optical coherence tomography coupled to endoscopy, or confocal laser endoscopy, a further step toward improved objective diagnosis may be feasible (15). In particular, the combination with optical coherence tomography or narrow-band imaging might identify not only the nature of a lesion but also its borders (37). We aim for a more precise indication for, and extent of, surgical intervention in vocal fold pathology, resulting in a better outcome for the patient's voice.

Figure 1. A, sequence of endoscopic HS recordings of the larynx. The glottal area, enclosed by both vocal folds, is divided by the glottal axis into left and right hemi-glottal areas. B, transformation and intensity coding of the segmented vocal fold contours for a single high-speed image. White regions indicate large distances from the glottal axis, and dark regions indicate small distances to the midline. C, PVG representation of vocal fold dynamics obtained from a segmented HSV containing three oscillation cycles. D, temporal progression of the left and right hemi-glottal areas extracted from a PVG.

Figure 2. A, GAW and hemi-GAWs of subject P3 with a precancerous lesion on the right side. B, phase signals of GAW, H-GAWL, and H-GAWR. The phase was estimated using a wavelet-based analysis of PVGs. The phase difference provides information about the relationship of the temporal phasing; the left vocal fold waveform is in advance of the GAW, whereas the right vocal fold is slightly delayed. C, to avoid potential phase jumps, phases were transformed to the complex unit circle.
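Why the transformation to the complex unit circle avoids phase jumps can be shown with a short sketch. It assumes phases given in radians; the function names and the numerical values are hypothetical and are not taken from the study. Two phases lying just on either side of the 0/2π wrap-around differ by almost 2π when subtracted directly, although the underlying oscillations are nearly in phase. Multiplying the corresponding unit phasors yields the true small difference.

```python
import cmath
import math

def wrapped_phase_difference(phase_a, phase_b):
    """Return phase_a - phase_b mapped into [-pi, pi] via unit-circle phasors."""
    # e^{i a} * e^{-i b} = e^{i (a - b)}; cmath.phase reads off the wrapped angle.
    return cmath.phase(cmath.exp(1j * phase_a) * cmath.exp(-1j * phase_b))

def mean_phase_delay(phases_a, phases_b):
    """Circular mean of the pointwise phase differences between two signals."""
    # Averaging phasors instead of raw angles keeps wrap-arounds from biasing the mean.
    total = sum(cmath.exp(1j * (a - b)) for a, b in zip(phases_a, phases_b))
    return cmath.phase(total)

# Hypothetical example: one phase just after the wrap, the other just before it.
phase_left = 0.1
phase_reference = 2 * math.pi - 0.1

naive = phase_left - phase_reference                             # close to -2*pi
wrapped = wrapped_phase_difference(phase_left, phase_reference)  # close to +0.2

print(f"naive difference:   {naive:.3f} rad")
print(f"wrapped difference: {wrapped:.3f} rad")
```

With this convention a positive value means that the first signal leads the second; which sign corresponds to "in advance" in the study's own definition is not specified here and is an assumption of the sketch.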

Figure 3. A, PVGs of the healthy subject H9 (top) and of subject C4 with a T1 carcinoma on the left vocal fold (bottom). B, a wavelet-based analysis was applied (29) to extract the recurring geometric PVG pattern for the left and right vocal fold separately. C, the geometric contour patterns can be quantified adequately with only three coefficients for each vocal fold side, obtained from principal component analysis as described previously (29). The Euclidean distance between the eigenvalues of the left and right side provides a measure of the lateral asymmetry of oscillation modes, which is considerably increased for subject C4.

Figure 4. A, PVG composed of two oscillation cycles showing an AP phase displacement. The vocal folds first open at the anterior part, shown at the position k2 = 90%. The opening process gradually continues toward the posterior parts of the glottis, as indicated by the white arrows. B, trajectories extracted at the k1 = 10% and k2 = 90% positions show different phase delays with respect to the GAW. C, the phase delay progression at frame no. 35 is shown. The AP phase displacement is reflected by the continuous progression from negative to positive trajectory angles for the left (L) and right (R) side. The lateral asymmetry of AP phase delay progression (M3) is finally given by the absolute phase difference between the left and right trajectory phases.

Figure 5. A, lateral phase delays between the H-GAWs and the GAW signals (measure M1). The control group exhibited relatively small phase displacements with only minor spread, representing the normal range (emphasized by the gray shaded band). For the carcinoma group, the affected vocal folds tended to vibrate in advance of the contralateral side, whereas no preferential phase direction was seen for the precancerous lesions. B, the absolute phase delay between the left and right side. Statistics revealed a significantly increased phase delay of the carcinoma group with respect to the control group. The increase for the precancerous lesions group was not significant.

Figure 6. A, measure M2, lateral asymmetry of oscillation modes. B, measure M3, lateral asymmetry of AP phase delay progression. Both measures are significantly higher for the pathologies than for the control group. Moreover, the lateral asymmetry of oscillation modes differs significantly between the precancerous lesions group and the carcinoma group.

Figure 7. Parameter space of the two most distinct measures M2 and M3. The space is divided into three regions using an SVM with RBF kernel. For the separation of the control group from the precancerous lesions group, the boundary that causes the misclassification of one subject with precancerous lesions is shown. Carcinoma and precancerous lesions are separated correctly.

Table 1. Results of statistical analysis concerning the identification of potential differences between the healthy control group, precancerous lesions group, and T1a carcinoma group. Six of the nine post hoc tests showed significant differences between the groups. Bold data indicate statistical significance (P < 0.05). Abbreviations: H, healthy control group; P, precancerous lesions group; C, T1a carcinoma group.
NOTE: Bonferroni-Holm-corrected ANOVA and two-sided post hoc t tests were performed on the measures M1-M3.
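How a Bonferroni-Holm correction acts on a family of post hoc P values can be sketched as follows. This is a generic illustration of the step-down procedure, not the statistical code used for Table 1; the function name and the three raw P values are hypothetical.

```python
def holm_adjust(p_values):
    """Return Bonferroni-Holm adjusted p-values in the original order."""
    m = len(p_values)
    # Visit the hypotheses from the smallest to the largest raw p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted value never falls below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

# Hypothetical raw p-values for three pairwise comparisons (H-P, H-C, P-C).
p_raw = [0.010, 0.040, 0.030]
p_adj = holm_adjust(p_raw)
significant = [p < 0.05 for p in p_adj]

for raw, adj, sig in zip(p_raw, p_adj, significant):
    print(f"raw P = {raw:.3f}, adjusted P = {adj:.3f}, significant: {sig}")
```

In this made-up example all three raw values are below 0.05, but only the smallest remains significant after correction, which shows why a corrected test can report fewer significant comparisons than an uncorrected one.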