The term “Dark Triad” refers to a personality structure comprising Machiavellianism, narcissism, and psychopathy (Paulhus & Williams, 2002). Machiavellianism entails manipulativeness, cynicism, and a strategic interpersonal orientation (Christie & Geis, 1970). Narcissism involves grandiosity and dysfunctional self-esteem strategies (Jones & Paulhus, 2014), while psychopathy is characterized by lack of empathy, impulsivity, and antisocial behavior (Jones & Paulhus, 2014).
The three components of the Dark Triad are subclinical traits, and research suggests that Machiavellianism and psychopathy can be considered interchangeable (O’Boyle et al., 2015; Vize et al., 2018; Watts et al., 2017). However, narcissism measured with different scales does not consistently converge within the same trait (O’Boyle et al., 2015; Watts et al., 2017). This lack of consistency may be attributed to the different scales used to measure these personality traits (Watts et al., 2017).
The Short Dark Triad (SD3; Jones & Paulhus, 2014) and the Dirty Dozen (DD; Jonason & Webster, 2010) are commonly used self-report scales to measure Dark Triad traits (Maples et al., 2014). Studies comparing the psychometric properties of these scales suggest that SD3 has higher validity than DD (Gamache et al., 2018; Geng et al., 2015; Maples et al., 2014). Notably, the narcissism scales of SD3 and DD show a weak correlation, possibly due to SD3 measuring grandiose narcissism while DD assesses both grandiose and vulnerable narcissism (Maples et al., 2014). Narcissism can manifest as grandiose or vulnerable forms, characterized by different traits (Gore & Widiger, 2016).
One of the major problems in psychological research is the jingle-jangle fallacy, a family of conceptual mistakes that arise when psychological constructs are confused with their labels. The jingle fallacy occurs when two scales use the same name for different constructs (e.g., two scales of motivation, one of which measures intrinsic motivation while the other measures extrinsic motivation); the jangle fallacy occurs when two scales have different names but measure the same construct (e.g., a scale of “self-efficacy” and a scale of “self-confidence” that measure the same construct). The consequences are theoretical confusion, difficulty in discriminating between constructs and therefore in building predictive models, and the risk of drawing wrong generalizations or conclusions from empirical data. For example, Marsh et al. (2019) found that the constructs “self-efficacy” and “self-concept” are often confused and overlap. A nomological analysis based on Multi-Trait Multi-Method analyses can therefore help researchers improve the discrimination between constructs and increase their theoretical and empirical utility (Marsh et al., 2019). Scales with the same name but with different degrees of convergence and divergence with other measures probably assess different constructs. In the case of Dark Triad traits, SD3 and DD use the same labels for their Machiavellianism, narcissism, and psychopathy scores, yet they could be assessing different constructs.
Apart from convergent and discriminant analyses, additional criteria such as prediction consistency and nomological consistency are essential for assessing the consistency of Dark Triad traits (Thielmann & Hilbig, 2019). Prediction consistency evaluates whether SD3 and DD scales predict the same outcomes or behavioral criteria, highlighting their substantial interchangeability in research or applied contexts. On the other hand, nomological consistency examines the overall pattern of correlations between the scales and other relevant constructs, ensuring that these correlations are consistent with theoretical expectations (Thalmayer et al., 2011). Therefore, prediction and nomological consistency offer a comprehensive framework for assessing whether SD3 and DD accurately measure the same underlying traits or represent distinct conceptualizations. To assess prediction and nomological consistency of SD3 and DD, we collected data from a community sample of individuals without psychological syndromes, as Dark Triad traits should manifest consistently even among the general population (Fleeson & Noftle, 2009).
In this study, we aimed to test the prediction consistency of SD3 and DD by comparing how well their subscales predict four sets of psychological variables. Previous research has linked Dark Triad traits to maladaptive behavior, substance abuse, and mental health variables (Azizli et al., 2016; Egan et al., 2014; Muris et al., 2017; Stenason & Vernon, 2016). Additionally, Dark Triad traits have been connected to the Five-Factor Model (FFM) of personality, with conscientiousness and agreeableness negatively associated with psychopathy and Machiavellianism, and openness and extraversion positively associated with narcissism (O’Boyle et al., 2015). Considering the potential overlap between Machiavellianism and psychopathy, their connection with disinhibition was explored, as psychopathy shows a positive relation with disinhibition while Machiavellianism exhibits a moderate association with self-control (Jones & Paulhus, 2014; Miller & Lynam, 2015).
To evaluate the nomological consistency of SD3 and DD, we examined their relationships with four sets of variables: psychopathy and empathy, FFM personality traits, mental health indicators, and disinhibition versus constraint measures. By investigating these variables, we aimed to gain a comprehensive understanding of the consistency of Dark Triad traits and their associations across different constructs.
Method
Participants and Procedure
A total of 504 Italian participants (58% females) with ages ranging from 19 to 89 years (M = 40.79; SD = 20.62) took part in the study. The largest group of participants were full-time or part-time students (37.6%), who were recruited by the first author of this paper while they were attending university courses. The remaining participants were non-students employed in various occupations such as dependent workers (13.5%), housewives (12.7%), retired individuals (12.1%), employees (4.2%), freelance professionals (4%), trade officers (3.8%), teachers and professors (2%), soldiers and military personnel (1.6%), farmers (0.6%), seasonal or unskilled workers (1.4%), managers or business owners (0.8%), and drivers (0.4%). These participants were contacted through announcements or by word of mouth. Using G*Power, we estimated the minimum required sample size for linear regression models with 3 predictors, setting α = .05, 1 – β = .80, and a medium effect size (f² = .15). The calculation indicated that the sample should consist of at least 43 participants. The participants completed pencil-and-paper questionnaires administered by the same expert examiner. Informed consent was obtained from all participants, and their participation was voluntary. The study followed the ethical principles outlined in the Declaration of Helsinki and was approved by the regional ethical committee for biomedical research (Ref: rich99n3w). The materials used for this study and the R scripts for the analyses are available at Tommasi (2025).
Measures
Dark Triad Traits
The Short Dark Triad (SD3) scale, consisting of 27 items measured on a five-point Likert scale, assessed the components of Dark Triad: Machiavellianism, Narcissism, and Psychopathy. The Dirty Dozen (DD) scale, consisting of 12 items measured on a seven-point Likert scale, also assessed the same Dark Triad facets. The Italian adaptations of the DD (Schimmenti et al., 2019) and the SD3 (Somma et al., 2019) were used in this study.
Psychopathy and Empathy
The Levenson Self-Report Psychopathy Scale (LSRP; Italian adaptation: Somma et al., 2014) comprised 26 items measured on a four-point Likert scale. It yielded subscales for Primary Psychopathy and Secondary Psychopathy. The Balanced Emotional Empathy Scale (BEES; Italian adaptation: Meneghini et al., 2006) included 30 items measured on a seven-point Likert scale to assess empathy. The Interpersonal Reactivity Index (IRI; Italian adaptation: Ingoglia et al., 2016) measured the cognitive (Fantasy-F, Perspective Taking-PT) and affective (Empathic Concern-EC, Personal Distress-PD) components of empathy using 28 items on a seven-point Likert scale.
FFM Personality Traits
The Big Five Questionnaire-Short Form (BFQ-SF) developed by Caprara et al. (1993) assessed the five factors of personality: Extraversion, Agreeableness, Emotional stability, Conscientiousness, and Openness. The BFQ-SF consisted of 60 items scored on a five-point Likert scale.
Psychological Well-Being and Mental Health
The Subjective Happiness Scale (SHS; Italian adaptation: Iani et al., 2014) comprised four items measured on a seven-point Likert scale, assessing the level of life satisfaction. The Basic Psychological Needs Scale (BPNS; Italian adaptation: Szadejko, 2003) included 21 items measured on a five-point Likert scale, measuring autonomy, competence, and relatedness. The State-Trait Inventory of Cognitive and Somatic Anxiety (STICSA; Italian adaptation: Carlucci et al., 2018) consisted of 21 items measured on a four-point Likert scale, assessing anxiety characteristics. The Teate Depression Inventory (TDI; Balsamo & Saggino, 2013) comprised 21 items measured on a five-point Likert scale, evaluating depressive symptoms.
Disinhibition vs. Constraint
The Disinhibition vs. Constraint Inventory (DvC; Dindo et al., 2009) consisted of 65 items measured on a five-point Likert scale. The items were grouped into five subscales: prosociality, manipulativeness, distractibility, risk-taking, and orderliness.
Social Desirability
Social desirability was assessed using the Marlowe-Crowne (MC) scale-short form (Italian adaptation: Manganelli Rattazzi et al., 2000), which included nine items measured on a five-point Likert scale. Positive correlations of psychological variables with MC scores indicated an inclination to overestimate psychological traits, while negative correlations indicated a tendency to underestimate them. The assessment of social desirability aimed to evaluate the validity of individual responses on the SD3, DD, and other psychological scales, as previous research has shown that subjective ratings, particularly those related to the Dark Triad traits, could be influenced by this response set (Kowalski et al., 2016).
Statistical Analyses
Descriptive Analysis
Descriptive statistics, including the number of valid cases, means, standard deviations, skewness, and kurtosis, were calculated for each psychological scale. Skewness and kurtosis values within the range of ±2 were considered acceptable (Gravetter & Wallnau, 2014). Correlations between each psychological measure and the MC scale were examined to assess the impact of social desirability on subjective responses. Frequencies were calculated for sociodemographic variables.
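The ±2 screening rule can be illustrated with a minimal Python sketch (the study's own analyses were run in R). It assumes simple moment-based estimators and excess kurtosis; statistical packages often apply small-sample corrections, so exact values may differ slightly. The `scores` and `outlier_scores` vectors are hypothetical.

```python
def _central_moment(xs, order):
    """Central moment of the given order, using the population (1/n) formula."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** order for x in xs) / len(xs)

def skewness(xs):
    """Moment-based skewness: m3 / m2^(3/2)."""
    return _central_moment(xs, 3) / _central_moment(xs, 2) ** 1.5

def excess_kurtosis(xs):
    """Moment-based excess kurtosis: m4 / m2^2 - 3 (0 for a normal distribution)."""
    return _central_moment(xs, 4) / _central_moment(xs, 2) ** 2 - 3.0

def within_range(xs, limit=2.0):
    """True when both skewness and excess kurtosis fall inside +/- limit."""
    return abs(skewness(xs)) <= limit and abs(excess_kurtosis(xs)) <= limit

# Hypothetical scale scores: a symmetric distribution and one with an extreme outlier.
scores = [1, 2, 2, 3, 3, 3, 4, 4, 5]
outlier_scores = [0] * 20 + [100]

print(within_range(scores))          # symmetric data pass the screen
print(within_range(outlier_scores))  # a single extreme value fails it
```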
Multi-Trait Multi-Method Analyses
To evaluate convergent and discriminant validity of DT traits measured by the SD3 and DD methods, we fit a six-factor Confirmatory Factor Analysis (CFA) model in which latent factors represented Machiavellianism, narcissism, and psychopathy measured by both the SD3 and DD. Each latent factor was defined by its respective items. Because these items were ordinal variables, diagonally weighted least squares (DWLS) was used for estimation. DWLS has no distributional assumptions and mitigates biases associated with Likert response scales (Rhemtulla et al., 2012). Model fit was examined using the Comparative Fit Index (CFI), Tucker–Lewis Index (TLI), Root Mean Square Error of Approximation (RMSEA), and Standardized Root Mean Square Residual (SRMR). Fit is considered excellent if CFI and TLI exceed .95 (acceptable if > .90), and good if RMSEA < .06 and SRMR < .08 (Schermelleh-Engel et al., 2003). The six-factor model assumed free covariances among latent variables. This allowed us to assess convergent and discriminant validity properties using the Heterotrait–Monotrait (HTMT) ratio and the Fornell–Larcker criterion, respectively. The HTMT ratio is computed as the average of heterotrait–heteromethod correlations (e.g., Machiavellianism items from the SD3 with psychopathy and narcissism items from the DD), divided by the average of monotrait–heteromethod correlations (e.g., Machiavellianism items measured by both SD3 and DD). Ideally, if two sets of measures are entirely independent, heterotrait–heteromethod correlations should be zero, while monotrait–heteromethod correlations should be one. Conversely, if the item sets measure the same construct, both types of correlations should be 1.00, resulting in an HTMT ratio of 1.00. In practical applications, an HTMT ratio threshold of .85 is recommended to determine whether two constructs are empirically distinct (HTMT < .85) or overlapping (HTMT ≥ .85) (Henseler et al., 2015). 
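As an illustration of the HTMT logic, the following Python sketch computes the ratio for two item sets from an item correlation matrix. It assumes the common formulation in Henseler et al. (2015), in which the average between-set correlation is divided by the geometric mean of the average within-set correlations; the authors' R computation may differ in detail. The 4 × 4 matrix and the item indices are hypothetical.

```python
import math
from itertools import combinations

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def htmt(corr, items_a, items_b):
    """HTMT ratio for two item sets, given a full item correlation matrix."""
    # Average correlation between items belonging to different sets.
    between = mean(corr[i][j] for i in items_a for j in items_b)
    # Average correlation among items within each set.
    within_a = mean(corr[i][j] for i, j in combinations(items_a, 2))
    within_b = mean(corr[i][j] for i, j in combinations(items_b, 2))
    return between / math.sqrt(within_a * within_b)

# Hypothetical correlations: items 0-1 form one scale, items 2-3 another.
corr = [
    [1.0, 0.6, 0.5, 0.5],
    [0.6, 1.0, 0.5, 0.5],
    [0.5, 0.5, 1.0, 0.6],
    [0.5, 0.5, 0.6, 1.0],
]

ratio = htmt(corr, [0, 1], [2, 3])
overlapping = ratio >= 0.85  # threshold recommended by Henseler et al. (2015)
print(round(ratio, 3), overlapping)
```

With these hypothetical values the ratio falls just below .85, so the two item sets would be treated as empirically distinct.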
According to the Fornell–Larcker criterion (Voorhees et al., 2016) for assessing discriminant validity, each factor’s Average Variance Extracted (AVE), a measure of how much variance in a set of observed indicators is explained by the latent construct, must exceed its shared variance with other factors, calculated as the square of their correlation. Equivalently, if the square root of a factor’s AVE is greater than its correlation with any other factor, that factor is considered to have discriminant validity. Based on these analyses, we formally tested the convergent and discriminant validity of corresponding factors across the two methods by comparing modified models, in which the correlations between identically labeled latent factors were constrained to 1.00, with the unconstrained model, in which those correlations were freely estimated. These analyses were conducted using R software (https://cran.r-project.org) with the lavaan package (Rosseel, 2012).
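The criterion itself reduces to a simple comparison, sketched below in Python. The sketch assumes that AVE is computed as the mean of squared standardized loadings; the loadings and factor correlations are hypothetical and are not taken from the fitted model.

```python
import math

def average_variance_extracted(loadings):
    """AVE as the mean of squared standardized factor loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)

def passes_fornell_larcker(ave, correlations_with_others):
    """True when sqrt(AVE) exceeds the factor's correlation with every other factor."""
    root = math.sqrt(ave)
    return all(root > abs(r) for r in correlations_with_others)

# Hypothetical standardized loadings for one latent factor.
loadings = [0.8, 0.7, 0.6]
ave = average_variance_extracted(loadings)

print(round(ave, 3))
print(passes_fornell_larcker(ave, [0.5, 0.6]))  # factor is distinct from both others
print(passes_fornell_larcker(ave, [0.9]))       # factor overlaps with a highly correlated one
```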
Nomological Analysis
To test the nomological consistency of SD3 and DD, we followed the multifaceted analytic approach of Thielmann and Hilbig (2019), which consists of three steps. This approach is holistic in that it considers all relevant indicators that have been shown to be connected to the principal components of the Dark Triad. It refrains from pre-selecting criteria on the basis of correlation matrices, which supports the generalizability of outcomes across contexts and reduces potential biases tied to a narrow selection of outcomes. The use of Bayesian regression models allows a more reliable selection of outcomes than the null hypothesis significance testing approach. The method is not free from limitations: because it relies on prior research, it is subject to potential selection bias and to differences in inventory properties. In the first step, the difference between the mean absolute Fisher’s z values of zero-order correlations for each correlation matrix was calculated to identify significant convergences or divergences between Dark Triad traits measured by SD3 and DD and the target variables. A mean absolute difference in zero-order correlations greater than .10 may suggest the potential influence of additional factors, beyond those considered, in shaping the relationships between variables. Additionally, the absolute difference between the mean R2 values of the regression models for prediction consistency of SD3 and DD was computed, and the Intraclass Correlation Coefficient (rICC) was used to analyze the correlation profiles for each equivalent trait indicator. In the second step, Fisher’s z-tests were conducted to compare correlation matrices pairwise and determine the number of comparisons resulting in significant differences between the correlation coefficients.
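The first two steps rest on Fisher's z transformation, which the following Python sketch illustrates. It makes two assumptions: the step-one index is read as the mean absolute difference between z-transformed correlations of the two inventories, and the step-two comparison uses the textbook z-test for independent correlations. The authors ran their comparisons with the cocor package in R, and the exact variant they used is not specified here. The correlation values are hypothetical; only N = 504 comes from the study.

```python
import math

def fisher_z(r):
    """Fisher's z transformation of a correlation coefficient."""
    return math.atanh(r)

def mean_abs_z_difference(rs_a, rs_b):
    """Mean absolute difference between paired, z-transformed correlations."""
    diffs = [abs(fisher_z(a) - fisher_z(b)) for a, b in zip(rs_a, rs_b)]
    return sum(diffs) / len(diffs)

def independent_z_test(r1, n1, r2, n2):
    """Two-sided z-test for the difference between two independent correlations."""
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (fisher_z(r1) - fisher_z(r2)) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return z, p

# Hypothetical correlations of one trait with three criteria, per inventory.
sd3_rs = [0.30, -0.20, 0.45]
dd_rs = [0.25, -0.05, 0.40]
print(round(mean_abs_z_difference(sd3_rs, dd_rs), 3))

z, p = independent_z_test(0.50, 504, 0.20, 504)
print(round(z, 2), p < 0.001)
```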
In the third step, regression coefficients resulting from multiple regression analyses were estimated to predict each criterion using all Dark Triad trait indicators from the scales. Bayes factors (BF01) were calculated for each regression model across sets of criteria. Because BF01 expresses the evidence for the null hypothesis (H0) relative to the alternative hypothesis (H1), BF01 values were classified into three groups: BF01 ≥ 3 indicated strong evidence in favor of H0, BF01 ≤ 1/3 indicated strong evidence in favor of H1, and 1/3 < BF01 < 3 indicated inconclusive evidence for H0 or H1. The interpretation of the Bayesian regression analysis involved assessing the prediction validity of the regression model when a specific Dark Triad trait was omitted. Low BF01 values indicated high importance of the omitted Dark Triad trait.
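The three-way classification can be sketched in Python as follows. The sketch follows the usual definition of BF01 as evidence for H0 relative to H1, so that low values favor H1, consistent with the statement that low BF01 values mark important traits. The function names and the example values are hypothetical.

```python
def classify_bf01(bf01, threshold=3.0):
    """Classify a Bayes factor BF01 as favoring H0, favoring H1, or inconclusive."""
    if bf01 >= threshold:
        return "H0"
    if bf01 <= 1.0 / threshold:
        return "H1"
    return "inconclusive"

def agreement_rate(bf_a, bf_b):
    """Share of paired Bayes factors that lead to the same classification."""
    pairs = list(zip(bf_a, bf_b))
    same = sum(classify_bf01(a) == classify_bf01(b) for a, b in pairs)
    return same / len(pairs)

# Hypothetical BF01 values for one trait across three criteria, per inventory.
inventory_a = [0.01, 5.0, 1.0]
inventory_b = [0.20, 0.1, 2.0]

print([classify_bf01(b) for b in inventory_a])
print(round(agreement_rate(inventory_a, inventory_b), 2))
```

An agreement rate computed this way can then be compared with whatever minimum percentage of similar conclusions is adopted as a benchmark.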
There are no established cutoffs for the nomological consistency of inventories. However, the following criteria proposed by Thielmann and Hilbig (2019) were considered to indicate satisfactory nomological consistency: small differences in absolute correlation coefficients (mean absolute difference ≤ .10), small differences in the amount of explained variance (difference in mean R2 ≤ 5%), strong correlations between correlation matrices (rICC ≥ .80), non-significant differences between correlation matrices (pairwise z-tests with p ≥ .05), and a substantial percentage of similar conclusions about null and alternative hypotheses in multiple regressions based on Bayes factor analysis (≥ 80%).
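For illustration, three of these benchmarks can be applied mechanically, as in the Python sketch below. The function name and dictionary keys are hypothetical; the two example calls use values reported in the Results (the Psychopathy and Empathy set with the Machiavellianism rICC, and the Mental Health set with the narcissism rICC).

```python
def check_benchmarks(mean_abs_r_diff, r2_diff, r_icc):
    """Apply three of the Thielmann and Hilbig (2019) benchmarks to one set of criteria."""
    return {
        "correlation difference <= .10": mean_abs_r_diff <= 0.10,
        "explained-variance difference <= .05": r2_diff <= 0.05,
        "rICC >= .80": r_icc >= 0.80,
    }

# Values as reported in the Results section of this paper.
psychopathy_empathy_mach = check_benchmarks(0.080, 0.015, 0.993)
mental_health_nar = check_benchmarks(0.315, 0.087, 0.208)

print(all(psychopathy_empathy_mach.values()))
print(any(mental_health_nar.values()))
```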
These analyses were conducted using R software (https://cran.r-project.org) with the BayesFactor (Morey et al., 2015) and cocor (Diedenhofen & Diedenhofen, 2016) packages.
Results
Descriptive Analysis
Descriptive statistics for each psychological scale, including means, standard deviations, skewness, kurtosis, Cronbach’s alpha values, and 95% confidence intervals for alphas, are provided in Supplementary Table S1. The skewness and kurtosis values were acceptable for all variables. Cronbach’s alpha values were generally acceptable or good, except for the autonomy subscale of the BPNS and the secondary psychopathy subscale of the LSRP, which had alpha values slightly below .60 but still close to acceptable standards. Significant correlations with the MC scale indicated that subjective responses on the SD3 and DD subscales were influenced by social desirability, with correlations ranging from -.55 to -.19.
Multi-Trait Multi-Method Analyses
Although the chi-square test was significant, χ2(687, N = 504) = 1801.18, p < .001, the unconstrained model, in which latent-factor correlations were freely estimated, had a good fit (CFI = .967; TLI = .965; RMSEA = .060; SRMR = .069). Across both methods (SD3 and DD), each trait (Machiavellianism, narcissism, and psychopathy) was measured by items with statistically significant loadings (all p < .001), demonstrating that the items generally captured their intended constructs (details in Supplementary Table S2).
In the SD3, Machiavellianism showed predominantly moderate to high factor loadings. Apart from Item i1 (λ = .067), all items loaded between .433 and .810, with Item i4 displaying the strongest coefficient. For narcissism, most SD3 items loaded in the low-to-moderate range (.217–.642), with Item i14 as the highest. Similarly, psychopathy items ranged from .238 to .731, with Item i15 presenting the strongest loading. In the DD instrument, Machiavellianism was characterized by consistently strong loadings (λ = .778–.850), while narcissism (λ = .502–.819) and psychopathy (λ = .373–.786) also showed moderate to strong loadings. These results suggested some variation in how effectively each method measured the respective Dark Triad trait. Internal consistency indexes highlighted this variation (details in Supplementary Table S2). The ordinal alpha (α) values for the SD3 subscales ranged from .684 (narcissism) to .760 (Machiavellianism), whereas average variance extracted (AVE) values ranged from .207 (narcissism) to .302 (Machiavellianism). By contrast, the DD subscales showed higher ordinal alpha values, from .702 (psychopathy) to .892 (Machiavellianism), and higher AVE, ranging from .420 (psychopathy) to .676 (Machiavellianism). These findings indicated that while the SD3 subscales demonstrated acceptable internal consistency, the DD subscales generally yielded stronger reliability and extracted a greater proportion of shared variance in measuring each Dark Triad trait.
Next, we examined the latent variable correlations. Each Dark Triad trait (Machiavellianism, narcissism, and psychopathy) correlated strongly across the two methods (SD3 and DD), in some cases showing apparent overlap. For Machiavellianism (r = .927) and psychopathy (r = .876), both methods showed strong convergence. In contrast, narcissism demonstrated weaker convergence, with a moderate correlation between SD3 and DD methods (r = .718). The Fornell–Larcker analysis revealed some differences in convergence and discriminant validity (details in Supplementary Table S3). While both methods showed strong convergence for Machiavellianism and psychopathy, the SD3 method struggled with discriminant validity, as its AVE square roots were consistently lower than correlations with other latent variables. For narcissism, the weaker convergence between SD3 and DD, coupled with the SD3 scale’s poor discriminant validity, highlighted the need for refinement in its measurement. Overall, the DD scale appeared to better differentiate the Dark Triad traits. This conclusion was further supported by the analysis of the HTMT ratios (details in Supplementary Table S3). The HTMT ratios for the same trait were, as expected, high for Machiavellianism and psychopathy (.888 and .877, respectively), indicating good convergent validity between methods. However, the HTMT ratio for narcissism (.722) was comparatively lower, raising some concerns about the convergent validity of narcissism across instruments.
We formally tested the overlap of DT traits by comparing the fit of the unconstrained model to constrained models where correlations between identically labeled latent factors were fixed at 1.00, indicating perfect similarity. For Machiavellianism, Δχ2(1, N = 504) = 13.42, p < .001, and psychopathy, Δχ2(1, N = 504) = 14.44, p < .001, there was a significant loss of fit compared to the unconstrained model; however, differences in RMSEA, SRMR, and CFI were null, and ΔTLI was negligible (.001 for both). For narcissism, the constrained model showed not only a significant loss of fit, Δχ2(1, N = 504) = 73.23, p < .001, but also appreciable declines in other fit indices, with increases in RMSEA (+.002) and SRMR (+.001) and decreases in CFI (-.002) and TLI (-.003). These findings, and those previously reported, aligned with the conclusion that while Machiavellianism and psychopathy showed strong similarity across SD3 and DD methods, narcissism exhibited notable differences between the two methods. While these conclusions capitalize on findings within the SEM framework, further evidence is needed, such as examining the nomological network of each trait to assess their relationships with external variables and clarify the distinctiveness of the constructs across measurement methods.
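The p values of these one-degree-of-freedom difference tests can be checked with a short Python sketch. It relies on the fact that a chi-square variable with 1 df is the square of a standard normal variable, so the upper-tail probability can be obtained from the complementary error function. The function name is hypothetical; the three Δχ2 values are those reported above.

```python
import math

def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    # P(chi2_1 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(x / 2.0))

# Chi-square differences between constrained and unconstrained models (df = 1).
delta_chi2 = {"Machiavellianism": 13.42, "psychopathy": 14.44, "narcissism": 73.23}
p_values = {trait: chi2_sf_df1(value) for trait, value in delta_chi2.items()}

for trait, p in p_values.items():
    print(trait, p < 0.001)
```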
Nomological Analysis
In the first step of the analysis, mean absolute differences in zero-order correlations were calculated for each set of criteria. The mean absolute difference in zero-order correlations for the Psychopathy and Empathy set was .080, for the FFM personality traits set was .074, for the Mental Health set was .315, and for the Disinhibition vs. Constraint set was .078. Table 1 presents the standardized coefficients and R2 values from linear regression models used to assess the prediction consistency of SD3 and DD scales.
Table 1
Regression Analysis for Prediction Consistency Between the SD3 and DD Subscales and the Four Sets of Criteria
| Set of criteria | SD3 Mach | SD3 Nar | SD3 Psych | SD3 R2 | DD Mach | DD Nar | DD Psych | DD R2 |
|---|---|---|---|---|---|---|---|---|
| Psychopathy and Empathy | ||||||||
| BEES | -0.189*** | -0.070 | -0.202*** | 0.143 | -0.064 | 0.052 | -0.452*** | 0.222 |
| IRI F | 0.026 | -0.071 | -0.023 | 0.006 | 0.219*** | 0.077 | -0.361*** | 0.091 |
| IRI EC | -0.155** | -0.045 | -0.230*** | 0.131 | -0.165** | 0.118* | -0.383*** | 0.203 |
| IRI PT | -0.087 | -0.006 | -0.154** | 0.047 | -0.137* | 0.068 | -0.189*** | 0.068 |
| IRI PD | 0.114* | -0.283*** | -0.045 | 0.076 | 0.155** | -0.076 | -0.247*** | 0.048 |
| LSRP PP | 0.331*** | 0.135*** | 0.371*** | 0.467 | 0.382*** | 0.066 | 0.331*** | 0.451 |
| LSRP SP | 0.121* | -0.260*** | 0.401*** | 0.194 | 0.277*** | -0.104* | 0.092 | 0.086 |
| FFM Personality Traits | ||||||||
| BFQ-SF E | -0.068 | 0.513*** | 0.086 | 0.277 | 0.001 | 0.322*** | 0.064 | 0.124 |
| BFQ-SF A | -0.330*** | 0.096* | -0.238*** | 0.219 | -0.148** | 0.031 | -0.324*** | 0.170 |
| BFQ-SF C | 0.003 | 0.226*** | -0.228*** | 0.064 | -0.184** | 0.261*** | -0.114* | 0.062 |
| BFQ-SF Em-St | -0.071 | 0.204*** | -0.212*** | 0.065 | -0.065 | -0.111* | 0.039 | 0.020 |
| BFQ-SF O | -0.027 | 0.144** | -0.046 | 0.017 | -0.046 | 0.166** | -0.099 | 0.023 |
| Mental Health | ||||||||
| SHS | -0.123* | 0.375*** | -0.020 | 0.120 | -0.031 | 0.057 | 0.082 | 0.010 |
| BPNS Aut | -0.167** | 0.286*** | -0.008 | 0.075 | -0.226*** | 0.090 | 0.205*** | 0.040 |
| BPNS Com | -0.128* | 0.431*** | -0.169*** | 0.159 | -0.188** | 0.247*** | 0.012 | 0.044 |
| BPNS Rel | -0.186*** | 0.389*** | -0.185*** | 0.151 | -0.237*** | 0.169** | -0.018 | 0.042 |
| STICSA | 0.096 | -0.241*** | 0.180*** | 0.069 | 0.115 | 0.021 | -0.060 | 0.011 |
| TDI | 0.085 | -0.333*** | 0.186*** | 0.102 | 0.105 | -0.069 | -0.032 | 0.006 |
| Disinhibition vs. Constraint | ||||||||
| DvC Pros-total | 0.064 | 0.198*** | -0.320*** | 0.085 | -0.280*** | 0.309*** | -0.128* | 0.104 |
| DvC Distr | 0.146** | -0.305*** | 0.277*** | 0.138 | 0.237*** | -0.142** | 0.055 | 0.050 |
| DvC Manip | 0.337*** | 0.128*** | 0.361*** | 0.458 | 0.497*** | 0.089* | 0.233*** | 0.506 |
| DvC Order | -0.029 | 0.083 | -0.245*** | 0.058 | -0.097 | 0.001 | -0.141** | 0.044 |
| DvC Risk | -0.100* | 0.148** | 0.347*** | 0.143 | 0.021 | 0.048 | 0.245*** | 0.079 |
Note. Standardized coefficients and R2 values for each model are reported.
SD3 = Short Dark Triad; DD = Dirty Dozen; Mach = Machiavellianism; Nar = Narcissism; Psych = Psychopathy; BEES = Balanced Emotional Empathy Scale; IRI = Interpersonal Reactivity Index (subscales: F = Fantasy; PT = Perspective Taking; EC = Empathic Concern; PD = Personal Distress); LSRP = Levenson Self-Report Psychopathy Scale (subscales: PP = Primary psychopathy; SP = Secondary psychopathy); BFQ-SF = Big Five Questionnaire Short Form (E = Extraversion, A = Agreeableness, C = Conscientiousness, Em-St = Emotional Stability, O = Openness); SHS = Subjective Happiness Scale; BPNS = Basic Psychological Needs Scale (subscales: Aut = Autonomy; Com = Competence; Rel = Relatedness); STICSA = State-Trait Inventory of Cognitive and Somatic Anxiety; TDI = Teate Depression Inventory; DvC = Disinhibition vs. Constraint Inventory (subscales: Pros-total = Prosociality total score, Distr = Distractibility, Manip = Manipulativeness, Order = Orderliness, Risk = Risk Taking).
*p < .05. **p < .01. ***p < .001.
The absolute difference in mean R2 values between SD3 and DD regression models for the Psychopathy and Empathy set of criteria was .015 (1.5%). For the FFM personality traits set, it was .049 (4.9%). The Mental Health set showed a value of .087 (8.7%), and the Disinhibition vs. Constraint set had a value of .020 (2.0%).
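These differences can be recomputed from the R2 columns of Table 1, as in the Python sketch below. It assumes that the index is the absolute difference between the mean R2 of the SD3 models and the mean R2 of the DD models within each set of criteria, a reading that matches the reported values to three decimals. The variable names are hypothetical.

```python
# R2 values transcribed from Table 1 (SD3 R2 and DD R2 columns), by set of criteria.
r2_sd3 = {
    "Psychopathy and Empathy": [0.143, 0.006, 0.131, 0.047, 0.076, 0.467, 0.194],
    "FFM Personality Traits": [0.277, 0.219, 0.064, 0.065, 0.017],
    "Mental Health": [0.120, 0.075, 0.159, 0.151, 0.069, 0.102],
    "Disinhibition vs. Constraint": [0.085, 0.138, 0.458, 0.058, 0.143],
}
r2_dd = {
    "Psychopathy and Empathy": [0.222, 0.091, 0.203, 0.068, 0.048, 0.451, 0.086],
    "FFM Personality Traits": [0.124, 0.170, 0.062, 0.020, 0.023],
    "Mental Health": [0.010, 0.040, 0.044, 0.042, 0.011, 0.006],
    "Disinhibition vs. Constraint": [0.104, 0.050, 0.506, 0.044, 0.079],
}

def mean(values):
    return sum(values) / len(values)

# Absolute difference between the mean R2 of the two inventories, per set.
delta_r2 = {name: abs(mean(r2_sd3[name]) - mean(r2_dd[name])) for name in r2_sd3}

# Sets whose difference exceeds the 5% benchmark.
exceeds = [name for name, diff in delta_r2.items() if diff > 0.05]

for name, diff in delta_r2.items():
    print(f"{name}: {diff:.3f}")
print("Above .05:", exceeds)
```

Only the Mental Health set exceeds the 5% benchmark under this reading.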
Regarding the Intraclass Correlation Coefficient (rICC), for the Psychopathy and Empathy set, rICC values were .993 for Machiavellianism, .854 for narcissism, and .943 for psychopathy. For the FFM personality traits set, rICC values were .944 for Machiavellianism, .786 for narcissism, and .937 for psychopathy. The Mental Health set showed rICC values of .960 for Machiavellianism, .208 for narcissism, and .444 for psychopathy. Finally, for the Disinhibition vs. Constraint set, rICC values were .959 for Machiavellianism, .872 for narcissism, and .976 for psychopathy.
The first step analysis revealed moderate prediction consistency between SD3 and DD in the sets of criteria related to psychopathy and empathy, FFM personality traits, and disinhibition vs. constraint. However, prediction consistency was not achieved for the Mental Health set, with the narcissism and psychopathy scales showing the most divergent predictions. Specifically, SD3 narcissism was a valid predictor for all psychological well-being variables, and SD3 Psychopathy yielded valid predictions for almost all well-being variables, except for SHS and BPNS autonomy. On the other hand, DD Narcissism was only a valid predictor for BPNS competence and relatedness, while DD Psychopathy was only a valid predictor for BPNS autonomy.
In the second step of the analysis, Fisher's z-tests were conducted to compare pairwise correlations. The p values of these tests for each set of criteria are presented in Table 2. Results indicated that narcissism showed the largest number of significant pairwise differences, particularly in the Mental Health criteria. This suggests a notable disparity between the correlation matrices of SD3 and DD Narcissism.
Table 2
P Values of Fisher’s Z-Tests for Pairwise Comparison of Independent Correlation Coefficients for the Four Sets of Criteria
| Set of criteria | Mach | Nar | Psych |
|---|---|---|---|
| Psychopathy and Empathy | |||
| BEES | 0.596 | 0.349 | 0.009b |
| IRI F | 0.277 | 0.035a | 0.006b |
| IRI EC | 0.701 | 0.292 | 0.062 |
| IRI PT | 0.577 | 0.825 | 0.534 |
| IRI PD | 0.782 | 0.004b | 0.116 |
| LSRP PP | 0.455 | 0.768 | 0.557 |
| LSRP SP | 0.642 | 0.016a | 0.007b |
| FFM Personality Traits | |||
| BFQ-SF E | 0.361 | 0.001b | 0.358 |
| BFQ-SF A | 0.049a | 0.293 | 0.742 |
| BFQ-SF C | 0.311 | 0.656 | 0.724 |
| BFQ-SF Em-St | 0.921 | < 0.001c | 0.035a |
| BFQ-SF O | 0.889 | 0.813 | 0.378 |
| Mental Health | |||
| SHS | 0.453 | < 0.001c | 0.624 |
| BPNS Aut | 0.879 | 0.003b | 0.114 |
| BPNS Com | 0.715 | 0.003b | 0.234 |
| BPNS Rel | 0.917 | < 0.001c | 0.428 |
| STICSA | 0.822 | 0.001b | 0.044a |
| TDI | 0.782 | 0.001b | 0.098 |
| Disinhibition vs. Constraint | |||
| DvC Pros-total | 0.035a | 0.928 | 0.365 |
| DvC Distr | 0.981 | 0.012a | 0.090 |
| DvC Manip | 0.008b | 0.168 | 0.365 |
| DvC Order | 0.495 | 0.172 | 0.522 |
| DvC Risk | 0.408 | 0.148 | 0.237 |
Note. SD3 = Short Dark Triad; DD = Dirty Dozen; Mach = Machiavellianism; Nar = Narcissism; Psych = Psychopathy; BEES = Balanced Emotional Empathy Scale; IRI = Interpersonal Reactivity Index (subscales: F = Fantasy; PT = Perspective Taking; EC = Empathic Concern; PD = Personal Distress); LSRP = Levenson Self-Report Psychopathy Scale (subscales: PP = Primary psychopathy; SP = Secondary psychopathy); BFQ-SF = Big Five Questionnaire Short Form (E = Extraversion, A = Agreeableness, C = Conscientiousness, Em-St = Emotional Stability, O = Openness); SHS = Subjective Happiness Scale; BPNS = Basic Psychological Needs Scale (subscales: Aut = Autonomy; Com = Competence; Rel = Relatedness); STICSA = State-Trait Inventory of Cognitive and Somatic Anxiety; TDI = Teate Depression Inventory; DvC = Disinhibition vs. Constraint Inventory (subscales: Pros-total = Prosociality total score, Distr = Distractibility, Manip = Manipulativeness, Order = Orderliness, Risk = Risk Taking).
a difference significant at p < .05. b difference significant at p < .01. c difference significant at p < .001.
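The tally of significant pairwise differences can be reproduced from Table 2 with the Python sketch below. The p values are transcribed from the table; entries printed as "< 0.001" are stored as 0.0005, an arbitrary placeholder below .001 that does not affect the counts at α = .05. The variable and function names are hypothetical.

```python
LT_001 = 0.0005  # placeholder for entries reported as "< 0.001"

# (set of criteria, criterion, p for Mach, p for Nar, p for Psych), from Table 2.
table2 = [
    ("Psychopathy and Empathy", "BEES", 0.596, 0.349, 0.009),
    ("Psychopathy and Empathy", "IRI F", 0.277, 0.035, 0.006),
    ("Psychopathy and Empathy", "IRI EC", 0.701, 0.292, 0.062),
    ("Psychopathy and Empathy", "IRI PT", 0.577, 0.825, 0.534),
    ("Psychopathy and Empathy", "IRI PD", 0.782, 0.004, 0.116),
    ("Psychopathy and Empathy", "LSRP PP", 0.455, 0.768, 0.557),
    ("Psychopathy and Empathy", "LSRP SP", 0.642, 0.016, 0.007),
    ("FFM Personality Traits", "BFQ-SF E", 0.361, 0.001, 0.358),
    ("FFM Personality Traits", "BFQ-SF A", 0.049, 0.293, 0.742),
    ("FFM Personality Traits", "BFQ-SF C", 0.311, 0.656, 0.724),
    ("FFM Personality Traits", "BFQ-SF Em-St", 0.921, LT_001, 0.035),
    ("FFM Personality Traits", "BFQ-SF O", 0.889, 0.813, 0.378),
    ("Mental Health", "SHS", 0.453, LT_001, 0.624),
    ("Mental Health", "BPNS Aut", 0.879, 0.003, 0.114),
    ("Mental Health", "BPNS Com", 0.715, 0.003, 0.234),
    ("Mental Health", "BPNS Rel", 0.917, LT_001, 0.428),
    ("Mental Health", "STICSA", 0.822, 0.001, 0.044),
    ("Mental Health", "TDI", 0.782, 0.001, 0.098),
    ("Disinhibition vs. Constraint", "DvC Pros-total", 0.035, 0.928, 0.365),
    ("Disinhibition vs. Constraint", "DvC Distr", 0.981, 0.012, 0.090),
    ("Disinhibition vs. Constraint", "DvC Manip", 0.008, 0.168, 0.365),
    ("Disinhibition vs. Constraint", "DvC Order", 0.495, 0.172, 0.522),
    ("Disinhibition vs. Constraint", "DvC Risk", 0.408, 0.148, 0.237),
]

TRAIT_COLUMN = {"Mach": 2, "Nar": 3, "Psych": 4}

def count_significant(rows, trait, alpha=0.05):
    """Number of rows whose p value for the given trait is below alpha."""
    return sum(1 for row in rows if row[TRAIT_COLUMN[trait]] < alpha)

counts = {trait: count_significant(table2, trait) for trait in TRAIT_COLUMN}
mental_health = [row for row in table2 if row[0] == "Mental Health"]
nar_mental_health = count_significant(mental_health, "Nar")

print(counts)
print(nar_mental_health, "of", len(mental_health))
```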
The third step of the analysis involved estimating Bayes factors (BF01) for regression models when a particular Dark Triad trait was omitted. The results are presented in Table 3.
Table 3
Bayes Factor Values (BF01) of the Regression Models in Relation to Omitted Dark Triad Traits
| Set of criteria | SD3 Mach | SD3 Nar | SD3 Psych | DD Mach | DD Nar | DD Psych |
|---|---|---|---|---|---|---|
| Psychopathy and Empathy | ||||||
| BEES | 0.008 | 2.387 | 0.003 | 4.511 | 4.999 | < 0.001 |
| IRI F | 4.743 | 1.916 | 4.843 | 0.007 | 2.283 | < 0.001 |
| IRI EC | 0.074 | 4.673 | < 0.001 | 0.101 | 0.487 | < 0.001 |
| IRI PT | 1.594 | 5.941 | 0.107 | 0.471 | 2.832 | 0.014 |
| IRI PD | 0.630 | < 0.001 | 4.554 | 0.230 | 2.204 | < 0.001 |
| LSRP PP | < 0.001 | 0.015 | < 0.001 | < 0.001 | 3.538 | < 0.001 |
| LSRP SP | 0.402 | < 0.001 | < 0.001 | < 0.001 | 0.956 | 1.464 |
| FFM Personality Traits | ||||||
| BFQ-SF E | 3.401 | < 0.001 | 1.826 | 7.283 | < 0.001 | 3.359 |
| BFQ-SF A | < 0.001 | 0.810 | < 0.001 | 0.254 | 6.646 | < 0.001 |
| BFQ-SF C | 6.268 | < 0.001 | 0.001 | 0.059 | < 0.001 | 0.654 |
| BFQ-SF Em-St | 2.571 | 0.001 | 0.003 | 3.129 | 0.701 | 4.278 |
| BFQ-SF O | 4.845 | 0.087 | 3.873 | 4.241 | 0.058 | 1.097 |
| Mental Health | ||||||
| SHS | 0.407 | < 0.001 | 6.702 | 4.719 | 3.147 | 1.794 |
| BPNS Aut | 0.046 | < 0.001 | 6.398 | 0.006 | 1.494 | 0.005 |
| BPNS Com | 0.304 | < 0.001 | 0.030 | 0.052 | < 0.001 | 5.794 |
| BPNS Rel | 0.009 | < 0.001 | 0.011 | 0.003 | 0.047 | 5.594 |
| STICSA | 1.240 | < 0.001 | 0.023 | 0.970 | 5.018 | 3.012 |
| TDI | 1.842 | < 0.001 | 0.013 | 1.277 | 2.438 | 4.515 |
| Disinhibition vs. Constraint | ||||||
| DvC Pros-total | 3.153 | 0.002 | < 0.001 | < 0.001 | < 0.001 | 0.364 |
| DvC Distr | 0.132 | < 0.001 | < 0.001 | 0.003 | 0.197 | 3.607 |
| DvC Manip | < 0.001 | 0.032 | < 0.001 | < 0.001 | 1.056 | < 0.001 |
| DvC Order | 5.291 | 1.473 | < 0.001 | 1.679 | 5.915 | 0.212 |
| DvC Risk | 1.116 | 0.048 | < 0.001 | 6.111 | 4.289 | < 0.001 |
Note. SD3 = Short Dark Triad; DD = Dirty Dozen; Mach = Machiavellianism; Nar = Narcissism; Psych = Psychopathy; BEES = Balanced Emotional Empathy Scale; IRI = Interpersonal Reactivity Index (subscales: F = Fantasy; PT = Perspective Taking; EC = Empathic Concern; PD = Personal Distress); LSRP = Levenson Self-Report Psychopathy Scale (subscales: PP = Primary psychopathy; SP = Secondary psychopathy); BFQ-SF = Big Five Questionnaire Short Form (E = Extraversion, A = Agreeableness, C = Conscientiousness, Em-St = Emotional Stability, O = Openness); SHS = Subjective Happiness Scale; BPNS = Basic Psychological Needs Scale (subscales: Aut = Autonomy; Com = Competence; Rel = Relatedness); STICSA = State-Trait Inventory of Cognitive and Somatic Anxiety; TDI = Teate Depression Inventory; DvC = Disinhibition vs. Constraint Inventory (subscales: Pros-total = Prosociality total score, Distr = Distractibility, Manip = Manipulativeness, Order = Orderliness, Risk = Risk Taking).
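To make the BF01 values in Table 3 easier to read, the following minimal sketch shows one common way to approximate a Bayes factor for nested regression models, the BIC approximation. This is an assumption for illustration only: the text does not state how the Bayes factors were computed, and the function name `bf01_from_r2`, its parameters, and the example numbers are hypothetical. Here the reduced model (one trait omitted) plays the role of H0 and the full model the role of H1, so BF01 below 1 suggests that the omitted trait contributes to prediction.

```python
from math import exp, log


def bf01_from_r2(n, r2_full, r2_reduced, dropped=1):
    """Approximate BF01 (reduced model vs. full model) from R-squared values.

    Assumes Gaussian linear regression and the BIC approximation
    BF01 ~= exp((BIC_full - BIC_reduced) / 2). This is an illustrative
    choice, not necessarily the method used in the study.
    """
    # BIC = n * ln(RSS / n) + k * ln(n); the total sum of squares cancels,
    # leaving only the unexplained-variance ratio and the parameter penalty.
    delta_bic = n * log((1.0 - r2_full) / (1.0 - r2_reduced)) + dropped * log(n)
    return exp(delta_bic / 2.0)


# Hypothetical numbers: n = 300 respondents, full model R^2 = 0.25.
small_loss = bf01_from_r2(300, 0.25, 0.24)  # omitted trait adds little
large_loss = bf01_from_r2(300, 0.25, 0.15)  # omitted trait adds a lot
print(small_loss > 1, large_loss < 1)
```

Under this approximation, dropping a trait that barely changes R-squared yields BF01 above 1 (the simpler model is favored), whereas dropping a trait that carries much of the explained variance yields BF01 close to zero.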
The findings indicate that SD3 Narcissism is a crucial trait for the predictive validity of the regression models for the Mental Health criteria, whereas the same trait is less critical in the DD scale. The percentage of similar conclusions between the SD3 and DD scales regarding the alternative hypothesis, based on BF01 values, was 38.1% for the Psychopathy and Empathy criteria, 46.67% for the FFM Personality Traits criteria, 38.89% for the Mental Health criteria, and 46.67% for the Disinhibition vs. Constraint criteria. All of these percentages fall below the 80% threshold, indicating a substantial discrepancy between the SD3 and DD scales across all sets of criteria.
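The agreement percentages can be illustrated with a short sketch. The decision rule used here is an assumption, since the text does not state it: a scale is taken to support the alternative hypothesis for a given criterion and trait when BF01 < 1, and the two scales are counted as agreeing when both do. The function name `agreement_rate` is hypothetical, and entries reported as "< 0.001" in Table 3 are entered as 0.001.

```python
def agreement_rate(sd3_rows, dd_rows, threshold=1.0):
    """Count cells where both scales have BF01 below `threshold`.

    Assumed rule (not stated in the text): BF01 < threshold means the
    scale supports H1 for that criterion/trait cell.
    Returns (number of agreeing cells, total number of cells).
    """
    agree = total = 0
    for sd3_row, dd_row in zip(sd3_rows, dd_rows):
        for bf_sd3, bf_dd in zip(sd3_row, dd_row):
            total += 1
            if bf_sd3 < threshold and bf_dd < threshold:
                agree += 1
    return agree, total


# BFQ-SF rows of Table 3; columns are Mach, Nar, Psych.
SD3_FFM = [
    [3.401, 0.001, 1.826],  # E
    [0.001, 0.810, 0.001],  # A
    [6.268, 0.001, 0.001],  # C
    [2.571, 0.001, 0.003],  # Em-St
    [4.845, 0.087, 3.873],  # O
]
DD_FFM = [
    [7.283, 0.001, 3.359],  # E
    [0.254, 6.646, 0.001],  # A
    [0.059, 0.001, 0.654],  # C
    [3.129, 0.701, 4.278],  # Em-St
    [4.241, 0.058, 1.097],  # O
]

agree, total = agreement_rate(SD3_FFM, DD_FFM)
print(f"{agree} of {total} cells agree ({100 * agree / total:.2f}%)")
# prints "7 of 15 cells agree (46.67%)"
```

For the FFM rows, this assumed rule happens to coincide with the reported 46.67%. That coincidence does not confirm that the same rule or threshold was used in the study, and the rule is not guaranteed to reproduce the percentages reported for the other sets of criteria.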
Discussion and Conclusion
The present study examined the convergent and discriminant validity of two widely used measures of the Dark Triad (DT) traits, the Short Dark Triad (SD3) and the Dirty Dozen (DD), using a Structural Equation Modeling (SEM) framework. Both the SD3 and DD scales demonstrated acceptable internal consistency in assessing Dark Triad traits and strong factor intercorrelations, consistent with previous research (Gamache et al., 2018; Geng et al., 2015; Maples et al., 2014; Schimmenti et al., 2019). Contrary to previous studies, which identified the SD3 scale as demonstrating greater validity (Gamache et al., 2018; Geng et al., 2015; Maples et al., 2014), our findings highlight the DD scale as the more robust instrument due to its superior item loadings, higher reliability indices, and greater Average Variance Extracted (AVE).
The findings revealed strong convergence for Machiavellianism and psychopathy across the two scales, as evidenced by high correlations between identically labeled factors. In contrast, narcissism demonstrated weaker convergence, with only moderate correlations between the SD3 and DD scales. These results highlight important differences in the measurement focus of the two instruments, particularly for narcissism. The Fornell–Larcker analysis provided further evidence regarding the discriminant validity of the DT traits. While the DD scale consistently demonstrated acceptable discriminant validity, the SD3 scale exhibited significant overlap between Machiavellianism, psychopathy, and narcissism, suggesting challenges in distinguishing these constructs. Similarly, the heterotrait-monotrait (HTMT) ratios supported strong discriminant validity for the DD but raised concerns about the SD3, particularly for narcissism.
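For readers less familiar with the two discriminant validity checks mentioned above, the following minimal sketch shows how they are conventionally defined. It is not the code used in this study: the function names, loadings, and correlation matrix are hypothetical, and the HTMT variant shown uses raw (not absolute) item correlations. The Fornell–Larcker criterion requires each construct's AVE to exceed its squared correlation with every other construct; HTMT divides the mean between-construct item correlation by the geometric mean of the mean within-construct item correlations.

```python
from itertools import combinations
from math import sqrt


def ave(loadings):
    """Average Variance Extracted: mean of squared standardized loadings."""
    return sum(l * l for l in loadings) / len(loadings)


def fornell_larcker_ok(ave_a, ave_b, factor_corr):
    """True if both AVEs exceed the squared correlation between the factors."""
    return min(ave_a, ave_b) > factor_corr ** 2


def htmt(R, items_a, items_b):
    """Heterotrait-monotrait ratio for two constructs.

    R is an item correlation matrix (list of lists); items_a and items_b
    are the item indices of each construct (at least two items each).
    """
    def mean(xs):
        return sum(xs) / len(xs)

    hetero = [R[i][j] for i in items_a for j in items_b]
    mono_a = [R[i][j] for i, j in combinations(items_a, 2)]
    mono_b = [R[i][j] for i, j in combinations(items_b, 2)]
    return mean(hetero) / sqrt(mean(mono_a) * mean(mono_b))


# Hypothetical 4-item example: items 0-1 form construct A, items 2-3 form B.
R = [
    [1.0, 0.6, 0.3, 0.3],
    [0.6, 1.0, 0.3, 0.3],
    [0.3, 0.3, 1.0, 0.5],
    [0.3, 0.3, 0.5, 1.0],
]
print(round(htmt(R, [0, 1], [2, 3]), 3))
print(fornell_larcker_ok(ave([0.8, 0.7, 0.6]), 0.60, 0.5))
```

HTMT values are often compared against conventional cutoffs of about 0.85 or 0.90, with higher values suggesting that two constructs are not empirically distinct; the specific cutoff is a choice left to the analyst.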
Nomological network analyses further confirmed these findings by examining the relationships of DT traits with external constructs, including the Big Five, empathy, and mental health outcomes. The SD3 and DD scales converged with respect to psychopathy and Machiavellianism, both of which were negatively correlated with agreeableness and conscientiousness (Maples et al., 2014; O’Boyle et al., 2015; Vize et al., 2018). Notably, narcissism measured by the SD3 and DD showed distinct patterns of association: SD3 Narcissism was more strongly linked to extraversion and openness, while DD Narcissism showed stronger associations with neuroticism and lower empathy. These differences, already observed in previous literature (Maples et al., 2014), further underscore the divergent operationalizations of narcissism across scales. Previous literature has reported that narcissism is not a single, cohesive trait but comprises multiple, variable aspects (Crowe et al., 2019). The SD3 appeared to focus predominantly on grandiose narcissism, emphasizing traits such as assertiveness and dominance, whereas the DD scale likely encompassed both grandiose and vulnerable aspects of narcissism. The distinct nomological profiles of these measures supported this interpretation.
In terms of mental health, our study replicated and extended previous research (Aghababaei & Błachnio, 2015), indicating that both Machiavellianism and psychopathy were negatively associated with positive functioning and well-being (i.e., subjective happiness, basic psychological needs satisfaction) and positively associated with negative outcomes (i.e., anxiety, depression). SD3 Narcissism demonstrated significant positive associations with measures of well-being and extraversion, consistent with previous research (Egan et al., 2014). In contrast, DD Narcissism did not show significant correlations with most measures of well-being, except for the BPNS competence and relatedness scales. These divergences further supported previous findings that the SD3 and DD measure different forms of narcissism (Maples et al., 2014).
Regarding disinhibition and constraint, both SD3 and DD Psychopathy significantly predicted various aspects of disinhibition and prosocial attitudes. Machiavellianism and psychopathy positively predicted distractibility and manipulativeness, and negatively predicted prosociality. Narcissism showed positive correlations with prosociality and manipulativeness, and negative correlations with distractibility. These associations are consistent with the notion that individuals with negative personality traits are more prone to misconduct and substance abuse (Azizli et al., 2016; Stenason & Vernon, 2016).
Collectively, the findings of the present study highlighted a potential ‘jingle-jangle’ fallacy in measuring DT traits: similar labels (e.g., narcissism) may not represent identical constructs across different measures, and different labels (e.g., Machiavellianism and psychopathy) may sometimes overlap significantly. Furthermore, our findings underscore the need for researchers to carefully select DT scales based on their specific objectives and the construct clarity provided by each measure.
This study contributed methodologically by integrating a robust SEM framework with Fornell–Larcker and HTMT analyses to assess the validity of DT measures. These approaches provided a comprehensive evaluation of both convergent and discriminant validity, highlighting strengths and weaknesses in existing measures. Additionally, the inclusion of a nomological network analysis added depth to our understanding of how these traits relate to broader personality constructs and mental health outcomes. By comparing two prominent DT measures, this study offered valuable insights for both researchers and practitioners. Future research should focus on revising the SD3 items, particularly in the Italian version, as the observed lack of discriminant validity may arise from item heterogeneity.
Several limitations should be acknowledged. First, the reliance on self-report measures might have introduced some bias due to social desirability or response tendencies. Future research could complement self-reports with behavioral or observational methods to enhance validity. Second, the study was conducted on an Italian sample, which may limit generalizability to other cultural contexts. Given potential cultural variations in the expression of DT traits, replication in diverse populations is necessary. Third, the study design did not include an explicit measure of vulnerable narcissism, which may partly explain the weaker convergence observed for this trait.
Notwithstanding these limitations, this study demonstrated that while the SD3 and DD scales are consistent in measuring Machiavellianism and psychopathy, their operationalizations of narcissism diverge significantly. The SD3’s challenges with discriminant validity highlight the need for careful scale selection depending on research objectives. By leveraging advanced psychometric methods and nomological network analyses, this study contributes to the ongoing refinement of DT measurement and emphasizes the importance of aligning measurement tools with theoretical clarity.
This is an open access article distributed under the terms of the Creative Commons Attribution License (