Satisfied With Your Satisfaction Scale? Validity Evidence on Bipolar, Branching, and Unipolar Questions
Abstract
The survey research community has long debated optimal questionnaire design for attitude measurement. Consumer satisfaction, specifically, has historically been treated as a bipolar construct (Thurstone, 1931; Likert, 1932), but some argue it is actually two separate unipolar constructs, which may yield signals with separable and interactive dynamics (Cacioppo & Berntson, 1994).
Earlier research has explored whether attitude measurement validity can be optimized with a branching design that involves two questions: a question about the direction of an attitude (e.g., positive, negative), followed by a unipolar question about the intensity of the selected direction (Krosnick & Berent, 1993).
The current experiment evaluated differences across a variety of question designs for in-product contextual satisfaction surveys (Sedley & Müller, 2016). Specifically, we randomly assigned respondents to one of the following designs:
Traditional 5-point bipolar satisfaction scale (fully labeled)
Branched: a directional question (satisfied, neither satisfied nor dissatisfied, dissatisfied), followed by a unipolar question on intensity (5-point scale from “not at all” to “extremely,” fully labeled)
Unipolar satisfaction scale, followed by a unipolar dissatisfaction scale (both use 5-point scale from “not at all” to “extremely,” fully labeled)
Unipolar dissatisfaction scale, followed by a unipolar satisfaction scale (both use 5-point scale from “not at all” to “extremely,” fully labeled)
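To compare conditions, responses from each design must be recoded onto a common scale. The sketch below is a hypothetical illustration of one way to do this, normalizing each design to 0–1 (an approach in the spirit of the 0–1 normalization Malhotra et al., 2009, used for regression analysis); the function names and the specific mapping for the branched design are assumptions, not the study's actual coding scheme.

```python
# Hypothetical recoding of responses from each condition onto a common 0-1 scale.
# All names and mappings are illustrative, not the study's actual scheme.

def recode_bipolar(response: int) -> float:
    """5-point bipolar item, coded 1 (very dissatisfied) .. 5 (very satisfied)."""
    return (response - 1) / 4

def recode_branched(direction: str, intensity: int = 0) -> float:
    """Directional item plus 5-point unipolar intensity (1 = not at all .. 5 = extremely).
    Respondents choosing 'neither' are assumed to skip the intensity question."""
    if direction == "neither":
        return 0.5
    step = intensity / 5 * 0.5  # map intensity into the upper or lower half of the scale
    return 0.5 + step if direction == "satisfied" else 0.5 - step

def recode_two_unipolar(satisfaction: int, dissatisfaction: int) -> float:
    """Two 5-point unipolar items (1..5 each); net score rescaled to 0-1."""
    net = satisfaction - dissatisfaction  # ranges from -4 to +4
    return (net + 4) / 8
```

With scores on a common scale, the designs can then be evaluated against the same external criteria.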
The experiment adds to the attitude question design literature by evaluating designs based on criterion validity evidence, namely the relationship between survey responses and linked user behaviors.
Results show that, on the criteria examined, no format clearly outperformed the ‘traditional’ bipolar scale format. Separate unipolar scales performed poorly and may be awkward or annoying for respondents. Branching performed similarly to the traditional bipolar design but showed no gain in validity; it is therefore also undesirable, because it requires two questions instead of one, increasing respondent burden.
REFERENCES
Cacioppo, J. T., & Berntson, G. G. (1994). Relationship between attitudes and evaluative space: A critical review, with emphasis on the separability of positive and negative substrates. Psychological Bulletin, 115, 401–423.
Krosnick, J. A., & Berent, M. K. (1993). Comparisons of party identification and policy preferences: The impact of survey question format. American Journal of Political Science, 37, 941–964.
Note: examines test–retest reliability of responses, comparing branched vs. unbranched formats; not a validity analysis, so orthogonal to the current study.
Malhotra, N., Krosnick, J. A., & Thomas, R. K. (2009). Optimal design of branching questions to measure bipolar constructs. Public Opinion Quarterly, 73, 304–324.
Note: their analyses appear to be within-condition, rather than comparing single-question versions to branched versions as we do; p. 308 summarizes how they coded the variants and normalized responses to 0–1 for regression analysis.
O’Muircheartaigh, C., Gaskell, G., & Wright, D. B. (1995). Weighing anchors: Verbal and numeric labels for response scales. Journal of Official Statistics, 11, 295–308.
Wang, R., & Krosnick, J. A. (2020). Middle alternatives and measurement validity: A recommendation for survey researchers. International Journal of Social Research Methodology, 23, 169–184.
Thurstone, L. L. (1927). A law of comparative judgment. Psychological Review, 34, 273–286.
Thurstone, L. L. (1931). Rank order as a psychological method. Journal of Experimental Psychology, 14, 187–201.
Likert, R. (1932). A technique for the measurement of attitudes. Archives of Psychology, 22, 5–55.
Sedley, A., & Müller, H. (2016, May). User experience considerations for contextual product surveys on smartphones. Paper presented at 71st annual conference of the American Association for Public Opinion Research, Austin, TX. Retrieved from https://ai.google/research/pubs/pub46422/