This webpage first gives an introductory definition of validity and reliability that applies both to research with a quantitative approach and to research with a qualitative approach. It then explains how the concepts are used within each of the two approaches. The summary below does not claim to be the only way to explain and organize these concepts. Reading this webpage will make you better able to evaluate the quality of research projects.
You will understand this page best if you have read Introduction to research.
Quantitative and qualitative research methods use different data collection techniques to gather observations, which then form the basis for the results. For all data collection techniques, one wants to assess how well they measure what they are intended to measure. The concepts of validity and reliability are used to describe how well the data collection has worked. Good validity and reliability are a prerequisite for the results to be generalizable, that is, to apply to others beyond those who were studied.
Validity and reliability are concepts whose original definitions were developed for studies with a quantitative approach, but they have later begun to be applied to studies with a qualitative approach as well. This drift is partly unfortunate, as it is better to use other concepts within qualitative research. However, because the concepts of validity and reliability have become firmly established, they are also used—more or less successfully—within research with a qualitative approach. This webpage aims to describe all these different uses of the concepts of validity and reliability, as well as describe alternative, partially synonymous, concepts used within qualitative research.
There are certain differences (but also similarities) in how the concepts of validity and reliability are used within quantitative and qualitative research approaches, respectively. In a study with a quantitative approach, a data collection method with known and acceptable validity and reliability for the intended purpose has generally been chosen before the data collection begins. In a study with a qualitative approach, one works continuously with validity and reliability throughout the entire project. In a study with a quantitative approach, the concepts of validity and reliability primarily relate to the data collection—that the right kind of observations are collected in a reliable manner. In a study with a qualitative approach, the concepts of validity and reliability concern both the data collection and the subsequent analysis of the collected data.
Validity and reliability
Validity refers to measuring what is relevant in the context, while reliability refers to measuring in a trustworthy manner. If the final results are to be credible, one must have high validity and reliability.
Validity is about using the right thing at the right time. To easily understand the concept of validity, we can compare it to, for example, credit cards or bus passes. They are valid in certain situations but not in others. You can use the bus pass on a bus but not in a taxi. Sometimes there is even an expiration date. In research, validity is about being able to state in which situation and for which population the data collection technique is valid and, consequently, that the results are valid.
Reliability is about trustworthiness (dependability). Can I trust a mechanic who repairs my car? What might happen if the car mechanic is not dependable? What decisions might be made if they are based on observations that cannot be trusted?
Suppose I want to measure the degree of obesity in people. Measuring the foot size of each individual will likely not make me any wiser. It does not help to claim that I performed my measurements very accurately. High reliability is thus no guarantee that we will get high validity. Suppose that we instead had measured height and weight to calculate the body mass index (BMI). Then we are measuring something that is more relevant in the context than foot size. Suppose further that we perform our measurement by quickly looking at the individual and then estimating their height and weight. Our measurement is then done with low trustworthiness (low reliability). Even though we measured the right thing, it was measured so poorly that we did not get a good measure of obesity. Low reliability thus always leads to low credibility. The following two rules are good to remember:
- High reliability does not guarantee high credibility.
- High credibility requires high reliability.
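As a worked illustration of the BMI example above, the sketch below computes BMI as weight in kilograms divided by the square of height in metres. The person and both sets of numbers are hypothetical, and the cut-off of 30 for obesity is the commonly used adult threshold, not something stated on this page. It shows how a careless estimate of the right quantities can still lead to the wrong classification.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Return body mass index: weight (kg) divided by height (m) squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


OBESITY_CUTOFF = 30.0  # assumption: commonly used adult threshold, not from this page

# Hypothetical person measured carefully with a scale and a measuring rod.
measured = bmi(weight_kg=95.0, height_m=1.75)

# The same person, with height and weight estimated at a glance (hypothetical errors).
eyeballed = bmi(weight_kg=85.0, height_m=1.80)

for label, value in [("measured", measured), ("eyeballed", eyeballed)]:
    verdict = "obese" if value >= OBESITY_CUTOFF else "not obese"
    print(f"{label}: BMI {value:.1f} -> {verdict}")
```

Both calculations measure the relevant thing (high validity), but only the careful measurement is reliable enough for the classification to be trusted.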
Quantitative (empiric-atomistic) approach
Validity within research with a quantitative approach
If we want to measure body weight, it is not too difficult to know whether our measurement captures what we want to measure. If, instead, we want to measure well-being, intelligence, knowledge, perceptions, or experiences, it becomes more difficult. The concept of validity is used slightly differently depending on whether the study has a quantitative or qualitative approach. The most common concepts of validity used in studies with a quantitative approach are described below (comments about qualitative studies are given in parentheses).
- First, we must determine if our data collection technique gives us information about the phenomenon that interests us. Compare this with the example of obesity and foot size. The easiest way to make this assessment of “content validity” = “face validity” is by asking a few outsiders, who are well-versed in the problem area, to give their opinion. This is preferably done in quantitative studies (can also be done in qualitative studies).
- Second, we should, if possible, ensure the “concurrent validity” = “correspondence validity” = “criterion validity.” This means that the result we obtain corresponds with the results from investigations done by others or concurrent measurements using a different method/technique. In studies with a quantitative orientation, measures such as kappa, sensitivity, specificity, likelihood ratio, and predictive values are often used. (Within studies with a qualitative orientation, it is not possible to determine concurrent validity).
- Third, we want to see if related concepts align with our measurements. This is called “construct validity”. Assume we are investigating the prevalence of anemia in women. At the same time as we take our blood sample, they answer a survey where they, among other things, state their perceived level of energy. If we saw that individuals with low blood values (hemoglobin) reported experiencing high energy and vice versa, it would suggest that either our device for measuring the blood value is not working, or our question is not measuring the experience of energy. (Construct validity is seldom applicable in studies with a qualitative orientation).
- Fourth, we can speak of “communicative validity”. The researcher’s ability to communicate their journey during the research process affects the validity of the knowledge. In a quantitative study, communicative validity means having a thorough methods description and a non-response analysis. (This also exists in qualitative studies, see below).
- Fifth, one can speak of “pragmatic validity”. Is the knowledge one arrives at useful? Without usefulness, the knowledge is limited. This is a sensitive point that can be misinterpreted as meaning that basic research is wrong. That is not the case. Even within basic research, however, it is important that the knowledge, sooner or later, becomes useful in some way. It is therefore important that in the reporting of a project, one tries to point out the significance of the results that have emerged.
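Several of the measures named under concurrent validity above (sensitivity, specificity, predictive values, and likelihood ratios) can be calculated from a two-by-two table that compares a new test against a reference method. The following is a minimal sketch using the standard definitions. The class name `TwoByTwo` and all counts are hypothetical, and the sketch assumes that no row or column of the table is empty.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from comparing a new test against a reference method."""
    true_pos: int   # new test positive, reference positive
    false_pos: int  # new test positive, reference negative
    false_neg: int  # new test negative, reference positive
    true_neg: int   # new test negative, reference negative

    @property
    def sensitivity(self) -> float:
        # Share of reference-positive cases that the new test detects.
        return self.true_pos / (self.true_pos + self.false_neg)

    @property
    def specificity(self) -> float:
        # Share of reference-negative cases that the new test clears.
        return self.true_neg / (self.true_neg + self.false_pos)

    @property
    def positive_predictive_value(self) -> float:
        return self.true_pos / (self.true_pos + self.false_pos)

    @property
    def negative_predictive_value(self) -> float:
        return self.true_neg / (self.true_neg + self.false_neg)

    @property
    def positive_likelihood_ratio(self) -> float:
        return self.sensitivity / (1 - self.specificity)

    @property
    def negative_likelihood_ratio(self) -> float:
        return (1 - self.sensitivity) / self.specificity


# Hypothetical counts, for illustration only.
table = TwoByTwo(true_pos=40, false_pos=20, false_neg=10, true_neg=130)
print(f"sensitivity {table.sensitivity:.2f}, specificity {table.specificity:.2f}")
print(f"PPV {table.positive_predictive_value:.2f}, "
      f"NPV {table.negative_predictive_value:.2f}")
print(f"LR+ {table.positive_likelihood_ratio:.2f}, "
      f"LR- {table.negative_likelihood_ratio:.2f}")
```

Note that the predictive values depend on how common the condition is in the group studied, so they say less about other populations than sensitivity and specificity do.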
Reliability within research with a quantitative approach
By reliability, we mean that the knowledge produced is obtained in a trustworthy manner, that there are no uncontrolled, random errors clouding the development of knowledge. Within quantitative research, reliability equals reproducibility, something that can often be estimated and given a numerical value. To investigate how well the measurement in a quantitative study can be reproduced, one can discuss reliability from three different perspectives:
- Is the measurement free from bias from the person measuring? This is called “inter-rater reliability” = “inter-rater agreement.” It is tested by having several people measure and then comparing how well the different individuals’ measurements correspond.
- Is the measurement affected by time? If the same person performs several measurements, how well do they correspond? This is called “test-retest reliability”.
- Is there consistency in the outcome between different parts of a survey that address the same phenomenon? This is called “internal consistency reliability” and can be estimated using Cronbach’s alpha.
Inter-rater reliability, test-retest reliability, and internal consistency reliability are concepts used only in studies with a quantitative orientation. (For information about reliability in qualitative studies, see below.)
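Two of the perspectives above can be given a numerical value with short calculations. The sketch below uses Cohen’s kappa as one common way to express inter-rater agreement between two raters (this page names kappa but does not prescribe a particular variant) and the standard formula for Cronbach’s alpha. The function names and all ratings are hypothetical. The sketch assumes complete data, at least two respondents and two items for alpha, and raters who do not both use a single category throughout.

```python
from collections import Counter
from statistics import variance


def cohen_kappa(rater_a: list[str], rater_b: list[str]) -> float:
    """Agreement between two raters, corrected for agreement expected by chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of each rater's share per category, summed.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    return (observed - expected) / (1 - expected)


def cronbach_alpha(scores: list[list[float]]) -> float:
    """Internal consistency; each row is one respondent, each column one item."""
    k = len(scores[0])
    items = list(zip(*scores))  # transpose rows into per-item columns
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical data: two raters classify the same ten cases.
rater_a = ["yes"] * 5 + ["no"] * 5
rater_b = ["yes"] * 4 + ["no"] * 5 + ["yes"]
kappa = cohen_kappa(rater_a, rater_b)

# Hypothetical data: five respondents answer three survey items on a 1-5 scale.
survey = [[3, 4, 3], [4, 4, 5], [2, 3, 2], [5, 5, 4], [1, 2, 2]]
alpha = cronbach_alpha(survey)

print(f"Cohen's kappa: {kappa:.2f}")
print(f"Cronbach's alpha: {alpha:.2f}")
```

Test-retest reliability can be examined in a similar way, by comparing two rounds of measurement made by the same person.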
Qualitative (empiric-holistic) approach
Validity and reliability must be assessed somewhat differently in studies with a qualitative orientation compared to studies with a quantitative orientation. Within qualitative research, trustworthiness cannot be estimated with numbers.
Validity and reliability in studies with a qualitative orientation are about being able to describe that one has collected and processed data in a systematic and honest manner. In the final report, one also describes the preconditions for the project and how the results have emerged during the process. Other, partially overlapping, concepts are used.
Below is an attempt to provide an overview of these concepts. The overview does not claim to be the only way to present how the concepts relate to one another. Nor does the overview claim to be complete.
Internal validity [=trustworthiness =credibility]
Communicative validity
The researcher’s ability to communicate how the research process was carried out affects the credibility of the knowledge. Communicative validity consists of:
- Description of pre-understanding: The author describes their own pre-understanding (prejudices/biases). What background, education, and personal experiences the author has.
- Description of data collection: How the data collection was done must be described in detail. If the data collection has been done over a longer period, it can sometimes increase credibility, as experiences from the first part of the data collection have had time to improve the data collection towards the end.
- Description of selection: How the participants were selected must be described in detail.
- Description of the analysis process: A detailed description of what happened during the analysis process. How it was done and what decisions were made. What appears directly in the material and what are interpretations?
Participant validation (member check)
If the informants (those who provide information, for example, people who are interviewed) can themselves correct erroneous perceptions and misunderstandings, this can sometimes strengthen credibility. If this is done already during an interview, it is called “dialogic validation” = “clarification.” Another alternative is for the person who was interviewed to read a transcript of the entire interview to correct misunderstandings and make clarifications.
Triangulation
Triangulation means that one looks at the problem from several viewpoints. For example, one can interview people with different relationships to the problem (“source triangulation”). Different researchers with different professional perspectives can participate in data collection (“observer triangulation”). During the analysis phase, one can analyze the material with different paradigms (“theory triangulation”). An example of the latter could be to first analyze the material from a systems theory perspective and then to analyze the same material from a life-world perspective.
External validity [=transferability]
Simplified, one can say that in a study with a quantitative approach, it is the researcher/author who defines the generalizability. The reader can then decide whether they agree with the author or not. In a qualitative study, the researcher/author does not define the generalizability but rather presents the path and the findings that were made at the end of the path. The reader then determines the generalizability.
The possibility of applying results outside the study (transferability) depends on whether the findings offer a meaning that transcends their own horizon. When assessing how the results can be applied, one asks: Which parts of the results can be applied? For whom are the results applicable? Under what conditions can the results be applied?
Reliability [=trustworthiness =dependability]
Are the measurement instruments reliable? In studies with a qualitative approach, both technical equipment and humans are used as “instruments.” Both must fulfill their task in a trustworthy manner. Reliability consists of:
The quality of technical equipment
Some technical equipment is almost always used to record a conversation. The sound quality varies with the quality of the equipment and the choice of microphone. The data become poorer if it is difficult to hear what is actually said on the recording. Here, there can be a big difference between a microphone built into the equipment and a separate external microphone.
The quality of the researcher
- Description of pre-understanding: (See above).
- The researcher’s ability to make good observations/interviews: This is an important aspect that is both difficult to measure and to describe.
- The researcher’s fidelity to data: Have experiences from the beginning of the data collection influenced the rest of the data collection in a good or a bad way? This is partly dependent on which qualitative method the researcher uses. In some of them (for example, Grounded Theory), it is intended that the data collection be influenced by data that is collected and analyzed at the beginning. In other methods, it is preferred that data is analyzed only after all data has been collected.
- The quality of the researcher’s supervision: If the researcher does not hold a PhD themselves, one or more supervisors are needed. Who are the supervisors, and what reputation do they have?
Objectivity
This is largely the same as the internal validity described above. By this is meant the researcher’s ability to be neutral and not color the data with their own pre-understanding. In certain types of projects, one can investigate objectivity by selecting transcripts of some interviews and letting different researchers assess these transcripts. One then compares what they concluded.