Development of a Novel Facial Age Assessment Model in a Multiethnic Population for Evaluation of Topical Anti-Aging Products

J Clin Aesthet Dermatol. 2025;18(11):24–29.

by Denise DiCanio, EdD, MBA; Edward (Ted) Lain, MD, MBA; James Del Rosso, DO; Eric Yovine, MBA; Hillary Kerns, PharmD; Elizabeth Bruning, BSc Hons, LLB; Catherine Fennessy, BS; Hao Ouyang, PhD; and Claude Saliou, PharmD, PhD

Drs. DiCanio, Kerns, and Saliou, and Ms. Fennessy are with The Estée Lauder Companies, Research and Development, in Melville, New York. Ms. Bruning and Dr. Ouyang were with The Estée Lauder Companies, Research and Development, in Melville, New York at the time of writing. Dr. Lain is with Sanova Dermatology in Austin, Texas. Dr. Del Rosso is with Del Rosso Dermatology Research Center in Las Vegas, Nevada. Mr. Yovine is with EGY Statistical Services, Inc. in Bronx, New York.

FUNDING: Funding for this paper was received from The Estée Lauder Companies in Melville, New York.

DISCLOSURES: Drs. DiCanio, Kerns, Ouyang, and Saliou, and Ms. Bruning and Ms. Fennessy are current or former employees of The Estée Lauder Companies. Mr. Yovine, of EGY Statistical Services, Inc., has received consulting fees from The Estée Lauder Companies. Drs. Lain and Del Rosso have served as advisors for and received honoraria from The Estée Lauder Companies.

ABSTRACT: Objective: The aim of this study was to develop a novel validated computational algorithm for calculating facial age based on the key universal parameters of the face and eyes that contribute to facial aging in women, independent of ethnicity and Fitzpatrick skin type. Methods: Digital facial images of women (n=2825) of 4 different ethnicities, across all Fitzpatrick skin types (I-VI), were evaluated and scored on 15 facial aging markers using a 0 to 10 photonumeric scale (0=no sign of aging; 10=severe signs of aging). Least squares linear multiple regression analysis was performed to identify parameters that contribute independently to observable skin aging and to develop a mathematical algorithm to calculate facial age based on these parameters. Results: The identified key universal parameters contributing to facial age, independent of ethnicity or skin type, were nasolabial folds, under eye lines, elongated cheek pores, forehead lines, under eye puffiness, uneven skin tone, and marionettes, which explained 71% of the variation in age. All parameters increased with age but at different rates, with forehead and marionette lines showing the largest changes with each progressive decade. Conclusion: The algorithm developed for calculating facial age based on visual assessments of 7 identified key parameters may be an objective method to evaluate the efficacy of topical anti-aging products. Keywords: Facial aging, calculated facial age, multiethnic age model, skin aging, age prediction

Introduction

The increasing global demand for facial rejuvenation procedures and anti-aging therapies has led to the development of various scales, algorithms, and assessment tools to evaluate facial aging markers and measure changes in these markers after cosmetic interventions. Validated assessment scales are available for the evaluation of individual facial aging markers such as marionette lines, crow’s feet, and forehead lines^1-3; facial regions such as the upper face, midface, and lower face^4-6; and the overall face.^7-11 These types of scales were designed to enable evaluation of the treatment effects of cosmetic procedures such as botulinum toxin A injections, collagen implants, and facelifts.^11-14 However, these assessment tools may be limited in their ability to measure more subtle and gradual changes in signs of facial aging over time that may occur with the use of topical anti-aging products.

A need remains within the dermatology and skin aesthetics community for a validated assessment tool or algorithm that defines the clinically relevant parameters of facial aging, independent of Fitzpatrick skin type and ethnicity, and enables an objective estimation of facial age based on those parameters. Such a model could be used to quantify the effects of topical cosmetic products purported to reduce the signs of facial aging in terms of years, and to compare anti-aging products in head-to-head clinical studies. Several age prediction models have used scales to quantify global facial aging parameters and estimate perceived age based on those aging scores^7,8,11,15; however, these models are based on a population of a single or limited range of skin types or ethnicity and therefore may not be applicable to multiethnic populations. In addition, the rating scales used in these models are broad 4-point or 5-point scales, which may not be suitable to detect the subtler changes in facial aging parameters that may occur following the use of topical anti-aging products. Moreover, the models may not include aging parameters such as uneven skin tone, dyspigmentation, or eye-area aging signs, which may not necessarily be affected by aesthetic procedures such as dermal fillers, but are clinically relevant in the evaluation of topical anti-aging products.

We describe herein an age prediction model that both identifies the key parameters of the face and eyes that contribute to facial aging, independent of ethnicity and skin type, and derives an algorithm that enables an objective estimation of age based on these parameters. The goal in developing this model was to create a method to objectively evaluate the efficacy of topical cosmetic products in reducing signs of facial aging in a multiethnic population.

Methods

Subjects and image dataset acquisition. The image data set for this study was generated from digital facial photographs of women (n=2825) of different ethnicities across the range of Fitzpatrick skin types. Study photographs were taken from Estée Lauder’s database of over 5000 facial images of women obtained over an 18-year period (from 2005 to 2023) for the purposes of facial aging research.

Subjects were women aged 18 to 85 years, in generally good health, of any Fitzpatrick skin type (I-VI), with normal, dry, or oily skin. Key exclusion criteria included: a systemic illness that contraindicated participation; any dermatological disorders in the skin areas evaluated in the study (i.e., face and eye areas); treatment by a dermatologist for any conditions in the skin areas evaluated in the study; pregnancy or lactation; use of systemic or topical retinoids in the past year; use of systemic antihistamines; and cosmetic procedures (injectable antiwrinkle products, facial cosmetic surgery, laser procedure, etc.) that might affect the natural skin aging process.

All subjects signed an informed consent and photograph release form for the capture of digital images as related to this study. Subjects were instructed not to participate in any other cosmetic or clinical trials during their participation in this study. Institutional review board approval was not sought as the study used only de-identified photographs to analyze facial skin parameters in a general population sample.

The image data set was designed to cover a broad range of age groups, ethnicities, and skin types (Table 1). The subjects included East Asian (Japanese and Chinese; n=1337, Fitzpatrick skin type I-IV); Hispanic/Latina (n=548, Fitzpatrick skin type I-VI); European American (n=490, Fitzpatrick skin type I-IV); and African American (n=450, Fitzpatrick skin type I-VI) women.

Digital photography. Digital facial images were obtained using VISIA-CR™ photoimaging equipment (Canfield Scientific), consisting of a fixed head support and image preview tools to ensure proper positioning of each subject.

Photographs were taken by reproducibly positioning each subject’s head using stationary chin and forehead supports and by maintaining consistent camera and lighting settings. Eyes-open and eyes-closed images were captured and saved directly to an electronic record in Canfield’s Mirror software. Front-facing and left and right ¾ side view images of each subject were captured (Figure 1).

Clinical photographic assessment and grading. Photographic assessments were conducted to identify and quantify the presence of specific facial aging parameters. Evaluations were conducted by 4 to 12 trained graders using a 0 to 10 analog scale (0=no sign of aging; 10=severe signs of aging). Grades could be applied in 0.5 increments. A lexicon was developed and assigned to each parameter (e.g., nasolabial folds), and photonumeric visual grading scales (i.e., a representation of photos for each level of severity) were provided for each of the facial aging parameters to be graded (Figure 2).

This system of grading was previously validated using a pilot data set of 100 images of European American women, which served as the training/validation set for the human graders to become experts. The raw grading data of the trained graders was validated by external statisticians.¹⁵

The inter-rater reliability was analyzed across different graders for the entire data set (n=2825) using correlation analysis.
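As an illustration of what a correlation-based agreement check between graders can look like, the following minimal Python sketch computes pairwise Pearson correlations between graders who scored the same images. The choice of Pearson correlation, the function names, and the grades are assumptions made for illustration; the text does not specify the exact statistic beyond "correlation analysis" and an intraclass correlation coefficient, and the values below are not study data.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of grades."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def pairwise_grader_correlations(scores):
    """Correlate every pair of graders.

    `scores` maps a grader label to that grader's 0 to 10 grades for the
    same images, listed in the same order for every grader.
    """
    return {
        (a, b): pearson(scores[a], scores[b])
        for a, b in combinations(sorted(scores), 2)
    }


# Hypothetical grades (0.5 increments) for five images; not study data.
scores = {
    "grader_a": [1.0, 2.5, 4.0, 6.5, 8.0],
    "grader_b": [1.5, 2.5, 4.5, 6.0, 8.5],
    "grader_c": [1.0, 3.0, 4.0, 7.0, 8.0],
}

correlations = pairwise_grader_correlations(scores)
for pair, r in correlations.items():
    print(pair, round(r, 3))
```

Pairwise correlation captures whether graders rank images consistently but not whether they agree on absolute grade levels; an intraclass correlation coefficient addresses the latter.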

Graders were blinded with respect to subject demographics. Cropped de-identified images (Figure 3) were used for evaluation of parameters related to specific anatomical areas of the face (e.g., forehead lines), whereas ¾ side view full facial images were used for parameters related to overall facial appearance (e.g., uneven skin tone). Graders viewed images on high-resolution, color-balanced monitors.

Selection of facial aging parameters for evaluation. Digital images were graded using the 0 to 10 photonumeric scale for 45 parameters deemed to be potential contributors to overall skin aging. Of these, 15 parameters (Table 2) were selected as clinically relevant to facial skin aging and the evaluation of topical anti-aging products aimed at improving the appearance of facial and eye area skin.

Computational model/algorithm development and statistical analysis. Multiple regression analysis was performed to identify which of the 15 selected parameters are critical to an objective estimation of age. Using the mean grading scores for these 15 parameters across the sample of 2825 subjects, we first measured the degree of association, by correlation, between the dependent variable (actual age) and each independent measure. We then determined successively, using least squares linear multiple regression analysis, how many independent measures were needed to account for the observed results, thereby identifying the parameters that contributed independently to the estimation of age.

Statistical analysis was performed using SPSS/PC+ V5.0.2 statistical software.
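The successive variable-entry procedure described above can be sketched in Python as a simple forward selection. This is an illustrative sketch under stated assumptions, not the SPSS procedure used in the study: the entry rule (a minimum gain in R², `min_gain`) is a hypothetical stand-in for a significance-based entry criterion, the feature names are placeholders, and the data are synthetic.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def r_squared(columns, y):
    """R^2 of a least squares fit of y on the given columns plus an intercept."""
    n = len(y)
    design = [[1.0] * n] + columns  # each entry is one column of the design matrix
    k = len(design)
    xtx = [[sum(design[i][t] * design[j][t] for t in range(n)) for j in range(k)]
           for i in range(k)]
    xty = [sum(design[i][t] * y[t] for t in range(n)) for i in range(k)]
    beta = solve(xtx, xty)  # normal equations: (X'X) beta = X'y
    fitted = [sum(beta[i] * design[i][t] for i in range(k)) for t in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((y[t] - fitted[t]) ** 2 for t in range(n))
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - ss_res / ss_tot


def forward_select(features, y, min_gain=0.01):
    """Add, one at a time, the feature that most increases R^2.

    Stops when the best remaining feature improves R^2 by less than
    `min_gain` (hypothetical entry rule). Returns [(name, cumulative R^2)].
    """
    chosen, history, best_r2 = [], [], 0.0
    remaining = list(features)
    while remaining:
        scored = [
            (r_squared([features[name] for name in chosen + [cand]], y), cand)
            for cand in remaining
        ]
        r2, cand = max(scored)
        if r2 - best_r2 < min_gain:
            break
        chosen.append(cand)
        remaining.remove(cand)
        history.append((cand, r2))
        best_r2 = r2
    return history


# Synthetic data: age depends on two graded features; a third is unrelated.
rng = random.Random(0)
n = 200
features = {
    "nasolabial_folds": [rng.uniform(0, 10) for _ in range(n)],
    "cheek_pores": [rng.uniform(0, 10) for _ in range(n)],
    "unrelated_marker": [rng.uniform(0, 10) for _ in range(n)],
}
age = [
    25 + 3.0 * f + 1.5 * p + rng.gauss(0, 2)
    for f, p in zip(features["nasolabial_folds"], features["cheek_pores"])
]

history = forward_select(features, age)
for name, r2 in history:
    print(f"{name}: cumulative R^2 = {r2:.3f}")
```

In the first step, the feature with the highest single-variable R² is also the one with the highest absolute correlation with age, which mirrors the description of the most strongly correlated variable entering the equation first.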

Results

Results from the clinical photographic assessments and grading are shown in Table 3. In the first part of the regression analysis, the independent variable with the highest correlation to the dependent variable (actual age) was determined to be nasolabial folds, with a correlation coefficient of 0.722, and was therefore the first variable to enter the equation.

The multiple regression analysis found that 7 of the 15 variables were statistically significant contributors (in the order shown) to the regression equation (Table 4). Of the remaining 8 variables, 6 were nonsignificant contributors and 2 were deemed noncontributors. Using these 7 variables, we were able to explain 71.21% of the variation in actual age in this multiethnic sample. Including all 13 contributing variables (significant and nonsignificant) rather than the 7 significant contributors alone provided no significant difference or added value in the estimated age calculation: the explained variation was 72.00% with the 13 contributing variables vs. 71.21% with the 7 parameters.

Based on the regression analysis of the 7 identified facial aging parameters, the predicted age equation for this regression was determined to be:

Predicted age (Y) = 26.335797 +
(2.486784 × [nasolabial folds]) +
(1.551839 × [under eye lines]) +
(4.083671 × [elongated cheek pores]) +
(1.216420 × [forehead lines]) +
(1.014862 × [under eye puffiness]) +
(0.944117 × [uneven skin tone]) +
(0.965936 × [marionettes]),

where [variable] is the graded value of the skin aging parameter, and the order of the parameters represents the weight of the respective variables in the equation. The regression equation indicates that nasolabial folds (the first variable in the equation) is the most predictive of actual (chronological) age, whereas elongated cheek pores (the variable with the highest regression coefficient) is the most sensitive in predicting actual age.
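To make the equation concrete, the following Python sketch evaluates it for a set of grades and expresses a change in grades as a change in predicted age in years. The intercept and coefficients are copied from the equation above; the function name, dictionary keys, input checks, and example grades are assumptions introduced for illustration and are not study data.

```python
# Intercept and coefficients copied from the published regression equation.
INTERCEPT = 26.335797
COEFFICIENTS = {
    "nasolabial_folds": 2.486784,
    "under_eye_lines": 1.551839,
    "elongated_cheek_pores": 4.083671,
    "forehead_lines": 1.216420,
    "under_eye_puffiness": 1.014862,
    "uneven_skin_tone": 0.944117,
    "marionettes": 0.965936,
}


def predicted_age(grades):
    """Return the predicted age for a dict of 0 to 10 grades, one per parameter."""
    missing = set(COEFFICIENTS) - set(grades)
    if missing:
        raise ValueError(f"missing grades: {sorted(missing)}")
    for name in COEFFICIENTS:
        if not 0 <= grades[name] <= 10:
            raise ValueError(f"{name} must be on the 0 to 10 scale")
    return INTERCEPT + sum(
        coef * grades[name] for name, coef in COEFFICIENTS.items()
    )


# Hypothetical grades, for illustration only (not study data).
baseline = {name: 3.0 for name in COEFFICIENTS}
follow_up = dict(baseline, nasolabial_folds=2.5)  # half-point improvement

change_in_years = predicted_age(follow_up) - predicted_age(baseline)
print(f"change in predicted age: {change_in_years:.2f} years")
```

Because the model is linear, the effect of a grade change does not depend on the other grades: in this example, a half-point reduction in the nasolabial folds grade lowers the predicted age by about 1.24 years (0.5 × 2.486784).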

All 7 variables were found to increase with chronological age (based on expert grading of digital images) but at different rates (Figure 4). Elongated cheek pores showed smaller incremental changes over time compared to the other markers. Forehead and marionette lines showed the largest changes with each progressive decade.

Discussion

The goals of this study were to identify the key universal parameters of the face and eye area that contribute to aging and to develop a method to calculate the age of an individual based on these parameters to aid in the clinical assessment of topical products designed to reduce signs of facial aging. Through a multiple regression analysis, we determined that the following 7 facial parameters are the primary contributors to facial aging, independent of ethnicity and skin type: nasolabial folds, under eye lines, elongated cheek pores, forehead lines, under eye puffiness, uneven skin tone, and marionettes. These 7 variables, comprising both superficial skin features (e.g., uneven skin tone) and deeper structural ones (e.g., nasolabial folds), accounted for roughly 71% of the variation in actual (chronological) age, underscoring the predictive strength of the model. From the multiple regression analysis, we derived a computational algorithm based on these 7 parameters by which the facial age of an individual could be calculated, regardless of Fitzpatrick skin type or ethnicity.

It is well established that the aging process differs across Fitzpatrick skin types and ethnicities. Darker skin ages at a slower rate than lighter skin due primarily to the protective effects of melanin against photoaging.^16,17 Moreover, the prevalence of specific facial aging signs and their rate of change over time has been found to vary markedly in the skin of different ethnicities.^16-18 The skin of Asian and Black individuals, for example, is more vulnerable to dyspigmentation than skin of White individuals, but it is characterized by a thicker and more compact dermis, which results in a lower incidence of facial wrinkles.¹⁶ Significant differences in the facial aging process have been observed between Chinese and European women, with wrinkles, ptosis or sagging, and vascular signs developing earlier (as early as age 18-29 years) and with greater severity in European women, and pigmentation signs developing earlier and with greater severity in Chinese women.¹⁸ Despite these differences, however, there are “universal” facial aging markers that span across ethnicities and skin type. The present study aimed to identify these universal markers by studying images of a large and diverse population, encompassing multiple ethnicities and Fitzpatrick skin types, to ensure that the age estimation model thereby derived would have wide applicability. Previously developed facial aging and age estimation models are based on data obtained from smaller, less diverse populations in terms of ethnicity and/or skin type, limiting their applicability. For example, the FACE-Objective Assessment Scale (FACE-OAS) developed by La Padula et al¹¹ to calculate an overall facial aging score and estimated age was based on clinical photographic assessments of 1000 subjects, all of whom were White. 

In earlier work using methodology similar to that of the current study, we developed an algorithm to obtain a calculated estimated age based on 12 independent variables, including clinical, biochemical, and biophysical markers.¹⁵ However, that study also included only White women (n=500) and the derived algorithm may not be applicable to women of other ethnicities and skin types. Similarly, Sen et al⁸ developed a 13-item rating scale and age prediction algorithm, but these were based on a relatively small (n=105) Indian population with darker skin (Fitzpatrick skin type IV-VI only). The facial aging model developed by Rzany et al⁷ was based on a population with a broad range of skin types (Fitzpatrick skin type I-IV), but again, darker skin types were not represented.

The present model was based on clinical grading assessments of a large, diverse population of 2825 women from 4 ethnicities spanning the full range of Fitzpatrick skin types (I-VI), suggesting that the parameters included in the algorithm to calculate age are universal, and transcend ethnicity and skin type. Indeed, facial aging parameters that are particularly relevant to assessments of lighter skin types (e.g., redness) or darker skin types (e.g., hyperpigmentation) were not found to be independent contributors to the age prediction equation in our model but were likely captured in the “universal” uneven skin tone variable. Of note, crow’s feet, which was included in previously published age prediction models based on European populations,^11,15 was not found to be a significant independent contributor to predicted age in our multiethnic model, suggesting that this parameter may not be independent of skin type and/or ethnicity.

The population on which our model was based also spanned a wide age range, from 18 to 85 years. In the recruitment process, we aimed to include at least 10 subjects of each age within an age group (i.e., 10 subjects 18 years of age, 10 subjects 19 years of age, etc.), enabling us to measure how facial aging parameters change over time. All 7 key facial aging parameters were found to increase with chronological age but at different rates, with forehead and marionette lines showing the largest changes with each progressive decade and elongated cheek pores showing only small incremental changes.

In the clinical grading assessments for our model, graders used a 0 to 10 scale (0=no signs of aging; 10=severe signs of aging), with 0.5 increments, to rate the severity of facial aging parameters. Other age prediction models based on clinical grading assessments have commonly used 4-point (0-3) or 5-point (0-4) scales with only whole number grades possible.^7,8,11 Our scale and model were developed specifically to enable clinical efficacy assessments of topical anti-aging products targeted at the face and eye areas, for which such scales need to be larger and more nuanced than traditional dermatology scales to detect more subtle and granular changes.

We considered 3 specific eye area aging parameters—under eye lines, under eye puffiness, and dark circles—of which 2 (under eye puffiness and under eye lines) emerged in the linear regression analysis as variables that significantly contribute to predicted age. The eye area was of particular interest during development of this model because unpublished internal eye-tracking studies showed that consumers look at the eyes first and longest, even when assessing the effects of topical cosmetic products that are applied to the face overall. The eyes are also an area of interest for consumers when estimating age. In a study assessing the contribution of individual facial skin attributes on the perceived age of women, the eye area was a primary attribute related to age prediction by untrained graders.¹⁹ Despite the importance of the eye area in age estimation, some models of facial aging do not include eye parameters. Among validated scales developed by Rzany et al⁷ and Flynn et al⁶ for assessment of age-related changes in the upper face, none focused on the eye area specifically, likely because these models were designed primarily to assess the efficacy of aesthetic procedures, which may not target parameters such as under eye lines and puffiness.

The facial aging assessments used to develop this model were performed by trained human graders based on high-resolution digital images. Use of artificial intelligence (AI)-based grading systems and algorithms may help streamline clinical grading of images, but currently developed models using AI-based systems have not been validated for all skin types or populations. Flament and colleagues have developed an AI-based facial skin grading system using smartphone self-images (“selfies”) that has been validated against human expert grading in several populations.^20-23 However, in a study comparing the grading of 7 facial signs in a diverse, multiethnic United States population by this AI-based system versus by trained dermatologists, the AI system did not perform well in grading cheek pores and density of pigmentation spots, and it was relatively less accurate in the darker phenotypes (Fitzpatrick skin type V-VI) and at the ends of the age spectrum (18-29 years and 65-80 years).⁹ Park et al^24,25 developed an AI-based system that was found to accurately predict facial age based on facial images, as demonstrated by high correlations with expert-evaluated age and actual age. However, this system was developed based on images of Korean women exclusively, and its applicability in other populations has not been studied.

Limitations

The present study has several limitations. First, the clinical grading assessments of facial aging parameters were performed by trained nondermatologist graders using digital images of subjects. However, the intraclass correlation coefficient (ICC) value for up to 12 graders was determined to be >0.95, indicating excellent inter-rater reliability.²⁶ In addition, studies by our group have shown that photographic grading of visible aging markers by human graders most closely predicts perceived age compared with biomarker- or instrumentation-based grading.¹⁵ Validation of the clinical grading assessments by dermatologists and/or other medical professionals may help improve the model.

Second, while our model was based on a large and diverse population of European American, African American, East Asian, and Hispanic/Latina women, the image data set did not include women from other ethnicities, such as those from the Indian subcontinent or the Middle East. Inclusion of other subpopulations may help refine the model further and widen its applicability.

Third, while our model was built based on 15 discrete clinically relevant facial aging parameters, there may be other parameters that were not captured that also contribute to facial aging, such as eyelid laxity, glabellar lines, and vertical lip wrinkles or lines. These parameters are included in other published age prediction models.^7,8,11

Conclusion

We developed an algorithm to calculate estimated age based on 7 key facial parameters that account for a significant proportion of facial aging, independent of ethnicity and Fitzpatrick skin type. Our model is based on clinical photography grading assessments of facial images from a large (n=2825) and diverse population representing 4 ethnicities, an age range spanning 6 decades, and the full range of Fitzpatrick skin types, underscoring its potential for wide applicability. To our knowledge, this is the first multiethnic facial aging model to use such a large and comprehensive data set aimed at enabling objective clinical assessment of topical anti-aging products.

References

1. Carruthers A, Carruthers J, Hardas B, et al. A validated grading scale for marionette lines. Dermatol Surg. 2008;34(Suppl 2):S167-S172.
2. Carruthers A, Carruthers J, Hardas B, et al. A validated grading scale for crow’s feet. Dermatol Surg. 2008;34(Suppl 2):S173-S178.
3. Carruthers A, Carruthers J, Hardas B, et al. A validated grading scale for forehead lines. Dermatol Surg. 2008;34(Suppl 2):S155-S160.
4. Narins RS, Carruthers J, Flynn TC, et al. Validated assessment scales for the lower face. Dermatol Surg. 2012;38(2 Spec No.):333-342.
5. Carruthers J, Flynn TC, Geister TL, et al. Validated assessment scales for the mid face. Dermatol Surg. 2012;38(2 Spec No.):320-332.
6. Flynn TC, Carruthers A, Carruthers J, et al. Validated assessment scales for the upper face. Dermatol Surg. 2012;38(2 Spec No.):309-319.
7. Rzany B, Carruthers A, Carruthers J, et al. Validated composite assessment scales for the global face. Dermatol Surg. 2012;38(2 Spec No.):294-308.
8. Sen S, Choudhury S, Gangopadhyay A, et al. A clinical rating scale for the assessment of facial aging in Indian population. Indian J Dermatol Venereol Leprol. 2016;82(2):151-161.
9. Flament F, Jiang R, Houghton J, et al. Accuracy and clinical relevance of an automated, algorithm-based analysis of facial signs from selfie images of women in the United States of various ages, ancestries and phototypes: A cross-sectional observational study. J Eur Acad Dermatol Venereol. 2023;37(1):176-183.
10. Buranasirin P, Pongpirul K, Meephansan J. Development of a global subjective skin aging assessment score from the perspective of dermatologists. BMC Res Notes. 2019;12(1):364.
11. La Padula S, Hersant B, SidAhmed M, Niddam J, et al. Objective estimation of patient age through a new composite scale for facial aging assessment: The face – Objective assessment scale. J Craniomaxillofac Surg. 2016;44(7):775-782.
12. Carruthers A, Carruthers J, Coleman WP 3rd, et al. Multicenter, randomized, phase III study of a single dose of incobotulinumtoxinA, free from complexing proteins, in the treatment of glabellar frown lines. Dermatol Surg. 2013;39(4):551-558.
13. Monheit GD, Gendler EC, Poff B, et al. Development and validation of a 6-point grading scale in patients undergoing correction of nasolabial folds with a collagen implant. Dermatol Surg. 2010;36(Suppl 3):1809-1816.
14. Ruiz R, Hersant B, La Padula S, Meningaud JP. Facelifts: Improving the long-term outcomes of lower face and neck rejuvenation surgery: The lower face and neck rejuvenation combined method. J Craniomaxillofac Surg. 2018;46(4):697-704.
15. Dicanio D, Sparacio R, Declercq L, et al. Calculation of apparent age by linear combination of facial skin parameters: a predictive tool to evaluate the efficacy of cosmetic treatments and to assess the predisposition to accelerated aging. Biogerontology. 2009;10(6):757-772.
16. Vashi NA, de Castro Maymone MB, Kundu RV. Aging differences in ethnic skin. J Clin Aesthet Dermatol. 2016;9(1):31-38.
17. Hudson C, Brissett A, Carniol P. Analysis and assessment of facial aging. Curr Otorhinolaryngol Rep. 2021;9:415-421.
18. Flament F, Jacquet L, Ye C, et al. Artificial Intelligence analysis of over half a million European and Chinese women reveals striking differences in the facial skin ageing process. J Eur Acad Dermatol Venereol. 2022;36(7):1136-1142.
19. Nkengne A, Bertin C, Stamatas GN, et al. Influence of facial skin attributes on the perceived age of Caucasian women. J Eur Acad Dermatol Venereol. 2008;22(8):982-991.
20. Jiang R, Kezele I, Levinshtein A, et al. A new procedure, free from human assessment that automatically grades some facial skin structural signs. Comparison with assessments by experts, using referential atlases of skin ageing. Int J Cosmet Sci. 2019;41(1):67-78.
21. Flament F, Lee YW, Lee DH, et al. The continuous development of a complete and objective automatic grading system of facial signs from selfie pictures: Asian validation study and application to women of three ethnic origins, differently aged. Skin Res Technol. 2021;27(2):183-190.
22. Zhang Y, Jiang R, Kezele I, et al. A new procedure, free from human assessment, that automatically grades some facial skin signs in men from selfie pictures. Application to changes induced by a severe aerial chronic urban pollution. Int J Cosmet Sci. 2020;42(2):185-197.
23. Flament F, Hofmann M, Roo E, et al. An automatic procedure that grades some facial skin structural signs: agreements and validation with clinical assessments made by dermatologists. Int J Cosmet Sci. 2019;41(5):472-478.
24. Park SR, Park H, Lee S, et al. Facial age evaluated by artificial intelligence system, Dr.AMORE®: An objective, intuitive, and reliable new skin diagnosis technology. J Cosmet Dermatol. 2024;23(4):1510-1512.
25. Park H, Park SR, Lee S, et al. Development and application of artificial intelligence-based facial skin image diagnosis system: Changes in facial skin characteristics with ageing in Korean women. Int J Cosmet Sci. 2024;46(2):199-208.
26. Koo TK, Li MY. A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J Chiropr Med. 2016;15(2):155-163.