Choosing statistical analysis

This webpage explains different statistical methods and when to use them. Reading this page (a few times) will give you an understanding of the choices involved and how to select a statistical method. After studying this for a while, you will likely be able to choose the appropriate statistical method for your own project. Don’t forget to first review the reading recommendations, which provide a foundation for understanding this page.

You will understand this webpage best if you have first read the pages on Introduction to Statistics, Observations and variables, Level of significance, and Correlation and regression. Understanding the page Observations and Variables is absolutely essential for choosing the correct statistical method, so please read it more than once.

A bird’s-eye view of statistics

[Figure: A bird's-eye view of inferential statistics]

Statistics consists of two main parts: descriptive statistics and inferential statistics. Descriptive statistics tries to describe the observations, usually by giving a measure of central tendency and a measure of dispersion. Inferential statistics tries to draw conclusions from the observations.

Analytical statistics is a somewhat broader term and is sometimes used less formally. It generally refers to the process of analyzing data to discover patterns, relationships, and insights. This often involves inferential statistical methods, but it can also include descriptive statistics, data visualization, and more complex modeling. Inferential statistics is a narrower term, referring to the mathematical calculation of p-values, effect sizes, or measures of agreement.

This webpage is about the choice of statistical method for inferential statistics. The figure shows a bird’s-eye view of inferential statistics, which has two main branches:

  • Comparing groups (one group against a fixed value, or matched or unmatched groups) with no or limited adjustment for other factors.
  • Covariation, which is always analyzed within a single group (even if it might appear to involve multiple groups).

Parametric or non-parametric methods?

Statistical methods used within inferential statistics can be divided into parametric and non-parametric methods. So, when you are finished with the descriptive statistics, it is time to decide whether the inferential statistics should use parametric or non-parametric methods (follow the link and read more about this before continuing here).

Parametric tests have stricter requirements, especially regarding how the observations are distributed. The first and most important requirement is that the variables must be measured on an interval scale. Furthermore, the variable should be normally distributed. Additionally, when comparing two or more groups, the variance (spread) in the different groups must be approximately equal. If your variable is measured on the interval scale, you should investigate whether your measurements meet the conditions for using parametric tests.

The basic rule is to use parametric methods if your observations meet the conditions for them. Otherwise, use non-parametric methods. Parametric methods are a bit more sensitive and have a greater chance of finding what you are looking for. It is common in a single study to analyze some variables with parametric methods and other variables with non-parametric methods.
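As an illustration of this decision (not part of the original page), the normality condition can be checked with a Shapiro-Wilk test in Python using SciPy. The data and variable names below are invented for the example:

```python
import numpy as np
from scipy import stats

# Simulated data; the values are invented for illustration only.
rng = np.random.default_rng(seed=1)
normal_var = rng.normal(loc=120, scale=15, size=80)   # roughly bell-shaped
skewed_var = rng.exponential(scale=3.0, size=80)      # clearly skewed

def suggested_approach(values, alpha=0.05):
    """Shapiro-Wilk test: H0 = the observations come from a normal distribution."""
    _, p = stats.shapiro(values)
    return "parametric" if p > alpha else "non-parametric"

print(suggested_approach(normal_var))
print(suggested_approach(skewed_var))
```

Note that failing to reject normality is not proof of normality; with small samples the test has little power, so inspecting a histogram or Q-Q plot alongside the test is a common practice.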

Simple group comparison or analysing associations?

It can be proven that performing a simple group comparison or evaluating the same thing using “association analysis with multifactorial models” (usually with some form of regression analysis) yields the same result. In fact, most group comparisons are actually just a special case of association analysis using multifactorial models. Does it matter then if I use statistical methods for group comparison or for association analysis with multifactorial models? Yes, it does!

It is quite common for confounding factors to influence the result in a group comparison. Examples of such confounding factors can be gender, age, being a smoker, having diabetes, etc. If you use group comparison, you will probably want to perform subgroup analyses of different subgroups to see how they affect the result. This leads to some serious problems:

  • You probably have to perform several separate group comparisons, which then leads to multiple p-values (one for each time you perform a group comparison). Let’s assume, as an example, that you have three variables where you can evaluate the difference between the groups. Let’s assume these outcome variables are reduction in mortality, reduction in the proportion of patients having a heart attack, and finally, reduction in blood cholesterol level. This then results in three different p-values (if we use p-values to indicate a difference between the groups) when we analyze the difference between our two groups, one p-value for each outcome variable. Furthermore, if we also want to perform subgroup analyses for gender, age (over or under 65 years), whether the patient has diabetes, and whether they are smokers, we need to calculate 3*2*2*2*2=48 p-values. Calculating many p-values requires adjusting the level of significance due to multiple testing. With many subgroups, the adjustment of the significance level quickly becomes so large that it becomes very difficult, perhaps impossible, to demonstrate a difference between the groups.
  • When you divide into subgroups, the number of available observations in each subgroup becomes much smaller, and this increases the risk that your statistical analyses will have too few observations to achieve reasonable statistical power.

The alternative to performing group comparisons is to use association analysis with multifactorial models. You can then include all relevant variables in a single statistical analysis that encompasses all observations. You then perform a separate analysis for each of the outcome variables (in that calculation, the outcome variables are called the dependent variable). All other variables such as gender, age, diabetes, and smoking, as well as group membership, are included as independent variables. In the example above, this involves calculating 3*(1+1+1+1+1)=15 measures (such as p-values or odds ratios, etc.) of what is significant in the group comparison. The magnitude of the adjustment in the level of significance you need to make to produce 15 measures (for example, p-values) is much smaller compared to calculating 48 measures. Furthermore, you have been able to use the entire dataset, which is not possible if you analyse multiple subgroups separately.
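The arithmetic in the example above can be made concrete with a Bonferroni adjustment of the significance level (a simple sketch; Bonferroni is only one of several possible multiple-testing corrections):

```python
# Bonferroni-adjusted significance thresholds for the example above.
alpha = 0.05

n_subgroup_tests = 3 * 2 * 2 * 2 * 2        # 3 outcomes x four binary subgroup splits = 48
n_model_measures = 3 * (1 + 1 + 1 + 1 + 1)  # 3 outcomes x 5 independent variables = 15

adjusted_subgroup = alpha / n_subgroup_tests   # about 0.00104, very hard to reach
adjusted_model = alpha / n_model_measures      # about 0.00333, noticeably less strict

print(n_subgroup_tests, round(adjusted_subgroup, 5))
print(n_model_measures, round(adjusted_model, 5))
```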

The conclusion is: use simple group comparison with no adjustment for other factors if you have no need to adjust for other factors. This only occurs in well-conducted randomized controlled trials. In all other situations (and even sometimes when you have conducted a randomized controlled trial), it is much better to perform the group comparison by analyzing association using multifactorial models. This is especially important when comparing groups using historically collected data, so-called retrospective studies (for example, retrospective medical record reviews), where confounding factors are always present.

Important aspects to consider in different situations

Randomized Controlled Trials (RCTs)

The main purpose of randomization into different groups is to reduce the risk of systematic sources of error. The main purpose of randomization is not to create groups that are equal, although randomization often has that desired side effect. It is important to understand that a difference between groups can arise even if individuals are randomly allocated (randomized) to the different groups. If a difference between the groups arises, one can ask why (read more about this on the page about randomization) and one can also consider adjusting for this by doing the group comparison using association analysis with multifactorial models. If you use association analysis with multifactorial models, you let group membership be one of the independent variables, while the variables where the groups differ at baseline (at the first measurement) are also included as independent variables.
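A minimal sketch of such an adjusted group comparison (not from the original page) is an ordinary least-squares fit with group membership and a baseline variable as independent variables. The data are simulated and the effect sizes are invented:

```python
import numpy as np

rng = np.random.default_rng(seed=42)
n = 200
group = rng.integers(0, 2, n)    # 0 = control, 1 = treatment
age = rng.normal(65, 10, n)      # baseline variable where the groups may differ

# Simulated outcome: a true treatment effect of 2.0 plus an age effect and noise.
outcome = 2.0 * group + 0.1 * age + rng.normal(0, 1, n)

# Design matrix: intercept, group membership, baseline covariate.
X = np.column_stack([np.ones(n), group, age])
coef, *_ = np.linalg.lstsq(X, outcome, rcond=None)

# coef[1] is the group effect adjusted for age.
print(round(coef[1], 2))
```

In practice one would use a regression routine that also reports p-values and confidence intervals; the point here is only that group membership enters the model as one independent variable among the others.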

Dichotomous tests for screening or diagnostics

Normally, a “Gold standard” (=reference method) is needed to evaluate a new test using sensitivity, specificity, likelihood ratio, or predictive values. The Gold standard is an accepted reference method that hopefully also provides a good measure of the true value to be measured.

Sensitivity and specificity indicate the “health” (performance characteristics) of the test you want to evaluate, something that is very interesting for those who develop and manufacture tests. The likelihood ratio tells how much information the test adds, which is of interest to those developing new guidelines. Predictive values inform about the health status of the patient, and this is much more interesting for healthcare personnel.
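These quantities follow directly from a 2x2 table of test result against gold standard. A small worked example with invented counts:

```python
# Hypothetical 2x2 table: rows = test result, columns = gold standard.
tp, fp = 90, 30     # test positive: disease present / disease absent
fn, tn = 10, 170    # test negative: disease present / disease absent

sensitivity = tp / (tp + fn)                   # 90/100  = 0.90
specificity = tn / (tn + fp)                   # 170/200 = 0.85
lr_positive = sensitivity / (1 - specificity)  # 0.90/0.15 = 6.0
ppv = tp / (tp + fp)                           # 90/120  = 0.75
npv = tn / (tn + fn)                           # 170/180 ≈ 0.944

print(sensitivity, specificity, round(lr_positive, 2), ppv, round(npv, 3))
```

Note that the predictive values depend on the prevalence of disease in the studied population, whereas sensitivity and specificity do not, which is why the former matter most to healthcare personnel.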

It is common that the test we want to evaluate measures something that does not necessarily mean the individual is sick. For example, one can carry bacteria, but an illness might be caused by something else, such as a virus. We must therefore understand that in some cases, there is a difference between detecting the presence of a bacterium and trying to prove that the person is sick specifically because of that bacterium. Please have a look at this presentation which attempts to exemplify this:

For each test to be evaluated, one should discuss what the gold standard actually detects: is it the presence of a marker or the presence of a disease?

Assume we want to compare a new, very good test with an established reference method (which we designate as the gold standard). If the new test is better than our reference test, the new test will incorrectly appear to be poor. The reason is that every time the new test and the reference method do not agree, it is classified as an error in the new test, even though in reality, it might be the opposite. Therefore, remember to always question whether the reference method is truly as good as or better than the test being evaluated.

Case-control or cohort studies

This is sometimes done as a review of patient records when one wants to know if one way of managing patients is better or worse than another. Let’s discuss an example: For kidney stone surgery, either open surgery, keyhole surgery, or shockwave therapy is used (there are more methods, but for the sake of this discussion, we will stick to these three). Someone has been tasked with looking at the outcomes for patients who were operated on using one method or another. This is not a randomized controlled trial, and this is the main problem. The consequence in this example is that we are comparing three groups of patients who are not comparable. Differences in outcomes may very well be due to differences between the groups rather than differences in the effect of the different treatments.

Reviewing medical records or databases containing patient information is sometimes used for the purpose of comparing different treatments. It is important to remember that there are almost always confounding factors that one should account for by performing group comparisons using association analysis with multifactorial models. A special variant of the latter is “propensity score matching”.

Association with multifactorial models

It is common to want to predict the risk of something happening. This could involve, for example, changes in quality of life, the occurrence or worsening of a disease, or death. You then run a statistical method for each outcome variable you have (often called the dependent variable). You then investigate how a number of independent variables correlate with your outcome variable. If your outcome variable is dichotomous (like 0/1, or sick / healthy), you will probably want to use logistic regression. If your outcome variable is time to an event (for example, deterioration/improvement or death), you will probably want to use Cox regression.

Using association analysis with multifactorial models is a good alternative to conducting case-control studies. Group membership then becomes one of several independent variables. The interpretation of the relationship is that there is an “association” rather than cause-and-effect. The latter often requires randomized controlled trials to be established.

Generalized Linear Mixed Models (GLMM)

Many statistical software programs have a function called “Generalized Linear Mixed Model (GLMM)” (or something similar). This is mentioned below in several situations. You can read more about what this is on the page about Mixed-effects models. The concept of random effect is also explained there.

Specific advice for choosing a statistical method

Association

Association is almost always analyzed using some form of regression analysis. In regression analyses, one speaks of dependent and independent variables. One looks for how much of the variation in the dependent variable is explained by variations in the independent variables. In most regression analyses, there is a single dependent variable that is examined together with one or more independent variables. It is common to investigate what is associated with several dependent variables, but in that case, one almost always examines one dependent variable at a time.

There are very complex statistical analyses that simultaneously evaluate what is associated with multiple dependent variables. One can also analyze what is associated with a virtual dependent variable (a conceptual dependent variable believed to exist but not directly measurable).

Association with one dependent variable

Simple association without building multifactorial models

This involves looking at how two variables (one dependent and one independent) are associated. When there are only two variables, one rarely specifies which is the dependent and which is the independent variable. As soon as more than two variables are involved, we are dealing with multifactorial models. Suggestions for appropriate statistical tests are given in the table below:

| Scale of measure | Description | Type of test | Suitable tests | Comments |
|---|---|---|---|---|
| Nominal scale | Both variables are dichotomous and can only assume two values/classes. | Non-parametric | Odds ratio | Used often |
| | | | Binary logistic regression | This is just another way of calculating the odds ratio. It gives the same result as calculating the odds ratio using the standard method. |
| | | | Relative risk | Used often |
| | | | Phi coefficient | |
| | | | Cramér's phi coefficient = Cramér's V | |
| | | | Yule's coefficient of association = Yule's Q | Rarely used |
| Nominal scale | At least one of the variables can assume more than two values/classes. | Non-parametric | Cramér's phi coefficient = Cramér's V | |
| Nominal and ordinal scale | One variable is measured on the nominal scale and is dichotomous. The other is measured on the ordinal scale, or on the interval/ratio scale but is skewed. | Parametric | Binary logistic regression | |
| Nominal and interval scale | One variable is measured on the nominal scale and is dichotomous. The other is measured on the interval/ratio scale and is normally distributed. | Parametric | Eta squared | This is the association-analysis equivalent of the one-way ANOVA used for group comparisons. |
| Ordinal scale | Both variables are measured on an ordinal scale. | Non-parametric | Spearman's rank correlation | Used often |
| | | | Gamma coefficient = Goodman and Kruskal's gamma | Rarely used |
| | | | Kendall's tau = Kendall's rank correlation | Rarely used |
| | | | Somers' D | Rarely used |
| Interval scale | At least one of the variables is skewed. | Non-parametric | Spearman's rank correlation | Used often |
| | Both variables are normally distributed. | Parametric | Pearson's correlation | Used often |
| Ratio scale | One of the variables is time to an event. | | Cox regression = proportional hazards regression | Used often |
| | Count data, usually non-negative integers. | | Poisson regression | Used often |
| | | | Negative binomial regression | Commonly used if the conditions for Poisson regression are not fulfilled. |
| | | | Zero-inflated models | Suitable if there are many observations with the value zero. |
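As a small worked example of the first rows of the table above, the odds ratio and relative risk can be computed directly from a hypothetical 2x2 table (the counts below are invented):

```python
# Hypothetical 2x2 table (invented counts): exposure (rows) vs outcome (columns).
a, b = 40, 60    # exposed:   outcome yes / outcome no
c, d = 20, 80    # unexposed: outcome yes / outcome no

odds_ratio = (a / b) / (c / d)                  # (40/60) / (20/80), about 2.67
relative_risk = (a / (a + b)) / (c / (c + d))   # 0.40 / 0.20 = 2.0

print(round(odds_ratio, 2), round(relative_risk, 2))
```

As the table notes, a binary logistic regression with this single dichotomous independent variable would reproduce the same odds ratio (as the exponential of the fitted coefficient).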
Advanced association with multifactorial models

Here we have more than two variables. One of them is then always designated as the dependent variable, and the others are called independent variables. Some of the methods mentioned in the table above are also used when we have more than one independent variable. Thus, there are both similarities and differences between the table above and the one immediately below.

| Scale of measure | Description | Type of test | Suitable tests | Comments |
|---|---|---|---|---|
| Nominal scale | The dependent variable is dichotomous and can only assume two values/classes. | Parametric | Unmatched logistic regression = unconditional binary logistic regression | Used often |
| | | Non-parametric | (Mantel-Haenszel stratified analysis) | Use this only if all independent variables are dichotomous. However, logistic regression is always a better alternative. |
| | | “Semi-parametric” | Propensity score matching (PSM) | Mostly used when you want to compare groups but there are many confounding variables that need to be taken into account. PSM is actually not a statistical method but rather a preparatory procedure that prepares the data for statistical testing. |
| | The dependent variable can assume more than two values/classes. | Parametric | Multinomial logistic regression = multiclass logistic regression | |
| | | “Semi-parametric” | Propensity score matching (PSM) | Using various techniques, propensity score matching can be made to work with more than two groups in the dependent variable. |
| Ordinal scale | The dependent variable is measured on an ordinal scale. | Parametric | Ordinal regression = ordered logistic regression | |
| | | | Unmatched logistic regression = unconditional binary logistic regression | You can introduce a cut-off and analyse the data with logistic regression. |
| Interval scale | The dependent variable is normally distributed. | | Standard linear regression | Used often. This is also labelled analysis of covariance (ANCOVA) if at least one of the independent variables is dichotomous. |
| | | | Unmatched logistic regression = unconditional binary logistic regression | You can introduce a cut-off and analyse the data with logistic regression. |
| | | | Mixed linear regression | Uses both “fixed effects” and “random effects”. Often used in multi-level analysis. |
| | | “Semi-parametric” | Propensity score matching (PSM) | (See the comment above about PSM.) |
| Ratio scale | Time to an event. This is a special case of the interval or ratio scale. | Parametric | Cox regression = proportional hazards regression | Used often |
| | The dependent variable is count data, usually non-negative integers. | | Poisson regression | Used often |
| | | | Negative binomial regression | The preferred method if the conditions for Poisson regression are not fulfilled. |
| | | | Zero-inflated models | Suitable if you have many observations with the value zero. |

Association with multiple dependent variables

This is advanced statistics. An example is factor analysis or multivariate probit analysis.

Association with a virtual dependent variable

One can analyze what is associated with a virtual dependent variable that is a conceptual dependent variable believed to exist but not directly measurable. This is advanced statistics. An example is factor analysis.

Simple group comparison (no adjustment for confounding factors)

Observations are compared between groups. These observations are often labelled: result variable, outcome variable, outcome measure, or dependent variable. If one also wants to adjust for confounders, use “Advanced association with multifactorial models” (see above); group membership and the confounders are then called independent variables.

  1. Determine how many variables (factors) are used to divide the participants/observations into groups:
    - Zero-factor design: No variables are used for grouping. A single group is then compared against a fixed target value. Alternatively, a before-after comparison is made within a single group.
    - One-factor design: One variable (factor) is used to divide observations/participants into groups. Most often, the observations/participants are divided into two groups (but there can be more). This is the most common type of group comparison.
    - Two-factor design: Two factors (for example, different treatments as one factor and different timing/initiation of treatment as another factor) are used to divide observations/participants into groups. If each factor had two options, we would have a two-factor design with four groups. It would still be a two-factor design if each factor had three options, but then we would have a two-factor design with nine groups.
    - N-factor design: There are studies where more than two factors determine the group division. These studies are complex and therefore very uncommon.
  2. If you have at least two groups (at least a one-factor design), clarify whether the groups are matched or unmatched.
  3. Clarify which scale of measure is appropriate for each of your variables. This influences the choice of statistical method.
  4. If the interval or ratio scale is used for some variables, are these observations normally distributed? If yes, you can use parametric statistical methods; otherwise, you must choose non-parametric methods.
  5. If the nominal scale is used for the outcome variable, does the outcome variable have only two options (dichotomous variable) or are there more options?
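Steps 3 and 4 can be sketched in code for the common case of an interval-scale outcome in two unmatched groups (a simplified illustration, not from the original page; the simulated data are invented):

```python
import numpy as np
from scipy import stats

def compare_two_unmatched_groups(x, y, alpha=0.05):
    """Use the t-test if both groups look normally distributed (Shapiro-Wilk),
    otherwise fall back to the non-parametric Mann-Whitney test."""
    _, px = stats.shapiro(x)
    _, py = stats.shapiro(y)
    if px > alpha and py > alpha:
        return "Student's t-test", stats.ttest_ind(x, y).pvalue
    return "Mann-Whitney U test", stats.mannwhitneyu(x, y).pvalue

rng = np.random.default_rng(seed=7)
control = rng.exponential(2.0, 60)   # skewed outcome, e.g. days in hospital
treated = rng.exponential(3.0, 60)
name, p = compare_two_unmatched_groups(control, treated)
print(name, round(p, 4))
```

This only automates the normality decision; the choice of scale of measure (step 3) and matched versus unmatched design (step 2) still has to be made by the analyst.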

Simple group comparison – Zero factor design

| Scale of measure for the outcome variable | Description | Type of test | Suitable tests | Comments |
|---|---|---|---|---|
| Nominal scale | The outcome variable is dichotomous and can only assume two values/classes. | Non-parametric | Chi-square test | Used often. Requires at least 5 observations in each cell. |
| | | (Parametric) | Z-test | |
| | The outcome variable can assume more than two values/classes. | Non-parametric | Chi-square test | Used often. Requires at least 5 observations in each cell. |
| Ordinal scale | | | Wilcoxon one-sample signed-rank test | |
| Interval scale | The outcome variable does not fulfill the requirements for parametric testing (skewed distribution of observations). | | Wilcoxon one-sample signed-rank test | |
| | | Parametric | Z-test | |
| | The outcome variable fulfills the requirements for parametric testing (observations are normally distributed). | | Student's one-sample t-test | Used often. |
| | | | Z-test | The t-test (above) is more sensitive and should be used primarily if the conditions for that test are met. |
| Ratio scale | Time to an event. | | Kaplan-Meier curve | This is not a statistical test, just a graphical representation of change over time. |
| | The outcome variable is count data, usually non-negative integers. | Parametric | Mixed Poisson regression | “Mixed Poisson regression” is a generalized linear mixed model (GLMM) that uses the Poisson family. Create a new variable that is a unique ID for each individual, and tell the program to treat this variable as a random-effect variable. |
| | | | Mixed negative binomial regression | Same as above, but mixed negative binomial regression is used instead of Poisson regression. Good if the assumptions for Poisson regression are not met. |
| | | | Mixed zero-inflated models | Same as above, but a mixed zero-inflated model is used instead of mixed Poisson regression. Appropriate if there are many observations of the outcome variable with the value zero. |
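A zero-factor comparison against a fixed target value can be sketched as follows (an illustration with invented numbers, not part of the original page):

```python
import numpy as np
from scipy import stats

# Zero-factor design: one group compared against a fixed target value.
target = 140.0  # hypothetical reference value, e.g. a systolic blood pressure limit
rng = np.random.default_rng(seed=3)
measurements = rng.normal(132.0, 10.0, 50)  # simulated observations

# Parametric: one-sample t-test (interval scale, normally distributed).
t_p = stats.ttest_1samp(measurements, popmean=target).pvalue

# Non-parametric alternative: Wilcoxon one-sample signed-rank test on the differences.
w_p = stats.wilcoxon(measurements - target).pvalue

print(round(t_p, 4), round(w_p, 4))
```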

Simple group comparison – One factor design

Simple group comparison – One factor design 2 unmatched groups
| Scale of measure for the outcome variable | Description | Type of test | Suitable tests | Comments |
|---|---|---|---|---|
| Nominal scale | The outcome variable is dichotomous and can only assume two values/classes. | Non-parametric | Chi-square test | Used often. Requires at least 5 observations in each cell. If you have fewer, do Fisher's exact test instead. |
| | | | Fisher's exact test | Has basically no requirement for a minimum number of observations. Gives a similar (but more exact) answer than chi-square. |
| | The outcome variable can assume more than two values/classes. | | Chi-square test | Used often. Requires at least 5 observations in each cell. |
| Ordinal scale | The outcome variable is measured on an ordinal scale. | | Mann-Whitney test = Wilcoxon two-sample unpaired test = rank-sum test | Used often. |
| | | | Kruskal-Wallis test | Can compare >2 groups. With only 2 groups, you get the same result as the Mann-Whitney test. This is the non-parametric equivalent of one-way ANOVA. |
| | | | Fisher's permutation test | |
| | | | Cochran-Mantel-Haenszel stratified analysis | |
| Interval or ratio scale | The outcome variable does not fulfill the requirements for parametric testing (skewed distribution of observations). | | Mann-Whitney test = Wilcoxon two-sample unpaired test = rank-sum test | Used often. |
| | | | Kruskal-Wallis test | Can compare >2 groups. With only 2 groups, you get the same result as the Mann-Whitney test. This is the non-parametric equivalent of one-way ANOVA. |
| | | | Cochran-Mantel-Haenszel stratified analysis | |
| | | | Fisher's permutation test | |
| | | Parametric | Z-test | |
| | The outcome variable fulfills the requirements for parametric testing (observations are normally distributed). | | Student's two-sample unpaired t-test | Used often. |
| | | | One-way analysis of variance | |
| | | | Cohen's d | |
| | | | Z-test | The t-test (above) is more sensitive and should be used primarily if the conditions for that test are met. |
| | | | Simple linear regression | Let the group allocation be the independent variable. |
| Ratio scale | Time to an event. This is a special case of the interval or ratio scale. | | Kaplan-Meier curves | This is not a statistical test, just a graphical representation of change over time. |
| | | Non-parametric | Log-rank test = Mantel-Cox test = time-stratified Cochran-Mantel-Haenszel test | The log-rank test is used if you have two unmatched groups and no need to adjust for other confounding factors. |
| | | Parametric | Cox regression = proportional hazards regression | Cox proportional hazards regression is used to compare unmatched groups and also to adjust for confounding factors (the adjustment then becomes an advanced group comparison – a variant of covariation). |
| Ratio scale | The dependent variable is count data, usually non-negative integers. | | Poisson regression | Used often. Let the variable for group allocation be the independent variable. |
| | | | Negative binomial regression | The preferred method if the conditions for Poisson regression are not fulfilled. |
| | | | Zero-inflated models | Suitable if you have many observations with the value zero. |
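For a dichotomous outcome in two unmatched groups, the chi-square and Fisher's exact tests from the table can be run in a few lines (a sketch with invented counts):

```python
from scipy import stats

# Hypothetical 2x2 table: two unmatched groups x dichotomous outcome.
table = [[18, 7],   # group A: improved / not improved
         [9, 16]]   # group B: improved / not improved

# Chi-square is fine here: all expected cell counts are at least 5.
chi2, chi_p, dof, expected = stats.chi2_contingency(table)

# Fisher's exact test is the alternative when cells are small.
odds_ratio, fisher_p = stats.fisher_exact(table)

print(round(chi_p, 4), round(fisher_p, 4))
```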
Simple group comparison – One factor design 2 matched groups
| Scale of measure for the outcome variable | Description | Type of test | Suitable tests | Comments |
|---|---|---|---|---|
| Nominal scale | The outcome variable is dichotomous and can only assume two values/classes. | Non-parametric | Sign test | The sign test and McNemar's test are interchangeable, but the sign test is often slightly better. |
| | | | McNemar's test | Used often. |
| | | | Stuart-Maxwell's test | Stuart-Maxwell's test can be used even if you have >2 matched groups. |
| | The outcome variable can assume more than two values/classes. | | | This situation rarely occurs. Should it happen, the scale should be converted to a dichotomous scale or an ordinal scale. |
| Ordinal scale | The outcome variable is measured on an ordinal scale. | Non-parametric | Sign test | Used often. |
| | | | Fisher's paired permutation test = Fisher-Pitman permutation test for paired data | |
| Interval scale | The outcome variable does not fulfill the requirements for parametric testing (skewed distribution of observations). | | Sign test | Used often. |
| | | | Fisher's paired permutation test = Fisher-Pitman permutation test for paired data | |
| | | Parametric | Z-test | |
| | The outcome variable fulfills the requirements for parametric testing (observations are normally distributed). | | Student's paired t-test (= a one-sample t-test on the pair differences) | Used often. Can only handle two matched groups. |
| | | | (Z-test) | |
| | | | Mixed linear regression | Mixed linear regression is a generalized linear mixed model (GLMM). Create a new variable that is a unique ID for each pair, and tell the program to treat this variable as a random-effect variable. |
| Ratio scale | Time to an event. This is a special case of the interval or ratio scale. | Non-parametric | Stratified log-rank test | The stratified log-rank test is used if you have two matched groups and no need to adjust for other confounding factors. |
| | | Parametric | Mixed Cox regression | Mixed Cox proportional hazards regression is used to compare matched groups and also to adjust for confounding factors (the adjustment then becomes an advanced group comparison – a variant of covariation). |
| | The dependent variable is count data, usually non-negative integers. | Parametric | Mixed Poisson regression | Mixed Poisson regression is a generalized linear mixed model (GLMM) that uses the Poisson family. Create a new variable that is a unique ID for each pair, and tell the program to treat this variable as a random-effect variable. |
| | | | Mixed negative binomial regression | Same as above, but mixed negative binomial regression is used instead of Poisson regression. Good if the assumptions for Poisson regression are not met. |
| | | | Mixed zero-inflated models | Same as above, but a mixed zero-inflated model is used instead of mixed Poisson regression. Appropriate if there are many observations of the outcome variable with the value zero. |
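For two matched groups with an interval-scale outcome, the paired tests can be sketched as follows (invented simulated data). The sketch also demonstrates that the paired t-test is identical to a one-sample t-test on the pair differences:

```python
import numpy as np
from scipy import stats

# Two matched groups: e.g. the same patients before and after a treatment.
rng = np.random.default_rng(seed=5)
before = rng.normal(100.0, 12.0, 30)
after = before - rng.normal(6.0, 4.0, 30)  # simulated mean reduction of 6 units

# Parametric: paired t-test.
t_p = stats.ttest_rel(before, after).pvalue
# Equivalent formulation: one-sample t-test on the pair differences.
t_p_diff = stats.ttest_1samp(before - after, popmean=0.0).pvalue

# Non-parametric alternative for skewed differences.
w_p = stats.wilcoxon(before - after).pvalue

print(round(t_p, 4), round(w_p, 4))
```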
Simple group comparison – One factor design >2 unmatched groups
| Scale of measure for the outcome variable | Description | Type of test | Suitable tests | Comments |
|---|---|---|---|---|
| Nominal scale | The outcome variable is dichotomous and can only assume two values/classes. | Non-parametric | Chi-square test | Used often. Requires at least 5 observations in each cell. |
| | The outcome variable can assume more than two values/classes. | | Chi-square test | |
| Ordinal scale | The outcome variable is measured on an ordinal scale. | | Kruskal-Wallis test | Can compare >2 groups. With only 2 groups, you get the same result as the Mann-Whitney test. This is the non-parametric equivalent of one-way ANOVA. |
| Interval scale | The outcome variable does not fulfill the requirements for parametric testing (skewed distribution of observations). | | Kruskal-Wallis test | |
| | The outcome variable fulfills the requirements for parametric testing (observations are normally distributed). | Parametric | One-way analysis of variance | One-way ANOVA is used if there are more than two unmatched groups in a one-factor experiment. (If there are only two unmatched groups, you get the same result with Student's two-sample unpaired t-test.) |
| | | | Simple linear regression | Let the group allocation be the independent variable. |
| Ratio scale | Time to an event. | | Kaplan-Meier curve | This is not a statistical test, just a graphical representation of change over time. |
| | The outcome variable is time to an event. This is a special case of the interval or ratio scale. | Non-parametric | Log-rank test | The log-rank test is used if you have unmatched groups and no need to adjust for other confounding factors. |
| | | Semi-parametric | Cox regression | Cox proportional hazards regression is used to compare unmatched groups and also to adjust for confounding factors (the adjustment then becomes an advanced group comparison – a variant of covariation). |
| | The dependent variable is count data, usually non-negative integers. | Parametric | Poisson regression | Used often. Let the variable for group allocation be the independent variable. |
| | | | Negative binomial regression | The preferred method if the conditions for Poisson regression are not fulfilled. |
| | | | Zero-inflated models | Suitable if you have many observations with the value zero. |
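The two workhorses of this table, one-way ANOVA and its non-parametric counterpart the Kruskal-Wallis test, can be run side by side (a sketch on invented simulated data):

```python
import numpy as np
from scipy import stats

# Three unmatched groups in a one-factor design (simulated data).
rng = np.random.default_rng(seed=11)
g1 = rng.normal(10.0, 2.0, 25)
g2 = rng.normal(11.0, 2.0, 25)
g3 = rng.normal(13.0, 2.0, 25)

anova_p = stats.f_oneway(g1, g2, g3).pvalue    # parametric, assumes normality
kruskal_p = stats.kruskal(g1, g2, g3).pvalue   # non-parametric equivalent

print(round(anova_p, 4), round(kruskal_p, 4))
```

Note that a significant result only says that at least one group differs; identifying which groups differ requires post-hoc pairwise comparisons with an adjusted significance level.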
Simple group comparison – One factor design >2 matched groups
Scale of measure for the outcome variable | Description | Type of test | Suitable tests | Comments
Nominal scale | The outcome variable is dichotomous and can only assume two values / classes. | Non-parametric | Stuart-Maxwell test | This is similar to McNemar's test but can handle more than two matched groups.
| The outcome variable can assume more than two values / classes. | – | – | (This situation probably never occurs. Should it occur, the scale should be converted to a dichotomous scale or an ordinal scale.)
Ordinal scale | The outcome variable is measured with an ordinal scale. | Non-parametric | Friedman's test | The one-factor experiment is converted into a two-factor experiment by treating the groups as one variable and the individuals as another. Then analyze with Friedman's test as if it were a two-factor experiment.
Interval scale | The outcome variable does not fulfill the requirements for parametric testing (skewed distribution of observations). | Non-parametric | Friedman's test |
| The outcome variable fulfills the requirements for parametric testing (observations are normally distributed). | Parametric | Two-way ANOVA |
| | Parametric | Mixed linear regression | The key here is to use a Generalized Linear Mixed Model (GLMM). Include the variable that controls the group allocation as an independent variable. You can then add more independent variables that you want to adjust for, e.g., sex, age, etc. (the adjustment then becomes an advanced group comparison – a variant of covariation). You also create a new variable that is a unique ID for each block used to allocate individuals to the matched groups, and tell the software to treat this variable as a random effect. You also tell the software whether the type of regression should be ordinal, simple, Cox, Poisson, etc.
Ratio scale | Time to an event. | | Mixed Cox regression |
| The dependent variable is count data, usually non-negative integers. | | Mixed Poisson regression |
| | | Mixed negative binomial regression | Same as above, but 'Mixed negative binomial regression' is used instead of Poisson regression. Good if the assumptions for Poisson regression are not met.
| | | Mixed zero-inflated models | Same as above, but a 'Mixed zero-inflated model' is used instead of mixed Poisson regression. Appropriate if there are many observations of the outcome variable with the value zero.
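The non-parametric choice in this table can be illustrated with a short sketch: Friedman's test applied to the same subjects measured under three matched conditions. The data are simulated for the example (a baseline plus two assumed shifts), and SciPy is assumed to be available.

```python
# Friedman's test for >2 matched groups: the same 20 subjects are
# measured under three conditions (simulated repeated measurements).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
baseline = rng.normal(50, 5, 20)               # first measurement
cond_a = baseline + rng.normal(2, 1, 20)       # same subjects, condition A
cond_b = baseline + rng.normal(4, 1, 20)       # same subjects, condition B

# Friedman's test ranks the conditions within each subject,
# so each subject acts as its own block.
chi2, p_value = stats.friedmanchisquare(baseline, cond_a, cond_b)
print(f"Friedman chi-square = {chi2:.2f}, p = {p_value:.4f}")
```

Because the matching (same subject across conditions) is handled by within-subject ranking, no explicit block-ID variable is needed here, unlike in the mixed-model rows above.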

Simple group comparison – Two factor design

Two-factor design means you have two variables for group allocation. For example, one might be treatment allocation and the other the timing of treatment initiation. If each of these factors (variables) is binary, you have four groups.
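The four groups in this hypothetical example can be enumerated directly as the combinations of the two binary factors (the labels below are invented for illustration):

```python
# Two binary group-allocation factors combine into four distinct groups.
from itertools import product

treatment = ["placebo", "drug"]            # factor 1: treatment allocation
timing = ["early", "late"]                 # factor 2: timing of initiation

groups = list(product(treatment, timing))  # all (treatment, timing) combinations
print(groups)
```

Each observation then belongs to exactly one of these four combinations, while the analysis still treats the two factors as separate independent variables.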

Simple group comparison – Two factor design unmatched groups
Scale of measure for the outcome variable | Description | Type of test | Suitable tests | Comments
Nominal scale | The outcome variable is dichotomous and can only assume two values / classes. | Parametric | Unconditional binary logistic regression | Include the variables that control the group allocation as independent variables (this will require at least 2 variables). You can then add more independent variables that you want to adjust for, e.g., sex, age, etc. (the adjustment then becomes an advanced group comparison – a variant of covariation).
| The outcome variable can assume more than two values / classes. | Parametric | Multinomial logistic regression (= multiclass logistic regression) |
Ordinal scale | The outcome variable is measured with an ordinal scale. | Parametric | Ordered logistic regression |
| | Non-parametric | Friedman's test |
Interval scale | The outcome variable does not fulfill the requirements for parametric testing (skewed distribution of observations). | Non-parametric | Friedman's test |
| | Parametric | Ordered logistic regression | Include the variables that control the group allocation as independent variables (this will require at least 2 variables). You can then add more independent variables that you want to adjust for, e.g., sex, age, etc. (the adjustment then becomes an advanced group comparison – a variant of covariation).
| The outcome variable fulfills the requirements for parametric testing (observations are normally distributed). | Parametric | Two-way analysis of variance (two-way ANOVA) | A better approach than two-way ANOVA is to use standard linear regression (see row below), where the group allocations become independent variables (this will require at least two variables). You can then also adjust for other covariates.
| | Parametric | Standard linear regression | Include the variables that control the group allocation as independent variables (this will require at least 2 variables). You can then add more independent variables that you want to adjust for, e.g., sex, age, etc. (the adjustment then becomes an advanced group comparison – a variant of covariation).
Ratio scale | Time to an event. | Semi-parametric | Cox regression |
| The dependent variable is count data, usually non-negative integers. | Parametric | Poisson regression |
| | Parametric | Negative binomial regression | This is the preferred method if the conditions for Poisson regression are not fulfilled.
| | Parametric | Zero-inflated models | Suitable if you have many observations with the value zero.
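To make the logistic-regression rows concrete, here is a minimal sketch of unconditional binary logistic regression for a two-factor design, with both group-allocation variables entered as independent variables. The data and effect sizes are simulated assumptions, and the model is fitted by Newton-Raphson in plain NumPy purely for illustration; a real analysis would use a statistics package such as R's glm or Python's statsmodels.

```python
# Binary logistic regression with two binary group-allocation factors
# as independent variables, fitted by Newton-Raphson (illustrative sketch).
import numpy as np

rng = np.random.default_rng(2)
n = 400
treatment = rng.integers(0, 2, n)        # factor 1: treatment allocation (0/1)
early_start = rng.integers(0, 2, n)      # factor 2: timing of initiation (0/1)

# Assumed "true" model used only to simulate a dichotomous outcome.
logit = -1.0 + 1.0 * treatment + 0.5 * early_start
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(float)

# Design matrix: intercept plus the two group-allocation variables.
X = np.column_stack([np.ones(n), treatment, early_start])
beta = np.zeros(3)
for _ in range(25):                      # Newton-Raphson iterations
    p = 1 / (1 + np.exp(-X @ beta))      # predicted probabilities
    w = p * (1 - p)                      # weights for the Hessian
    grad = X.T @ (y - p)                 # gradient of the log-likelihood
    hess = (X * w[:, None]).T @ X        # observed information matrix
    beta = beta + np.linalg.solve(hess, grad)

odds_ratios = np.exp(beta[1:])           # ORs for the two factors
print("coefficients:", beta, "odds ratios:", odds_ratios)
```

Adjusting for covariates such as sex or age would simply mean appending further columns to `X`, exactly as the Comments column above describes.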
Simple group comparison – Two factor design matched groups
Scale of measure for the outcome variable | Description | Type of test | Suitable tests | Comments
Nominal scale | The outcome variable is dichotomous and can only assume two values / classes. | Parametric | Mixed binary logistic regression | The key here is to use a Generalized Linear Mixed Model (GLMM). Include the variables that control the group allocation as independent variables (this will require at least 2 variables). You can then add more independent variables that you want to adjust for, e.g., sex, age, etc. (the adjustment then becomes an advanced group comparison – a variant of covariation). You also create a new variable that is a unique ID for each block used to allocate individuals to the matched groups, and tell the software to treat this variable as a random effect. You also tell the software whether the type of regression should be ordinal, simple, Cox, Poisson, etc.
| The outcome variable can assume more than two values / classes. | Parametric | Mixed multinomial logistic regression (= mixed multiclass logistic regression) |
Ordinal scale | The outcome variable is measured with an ordinal scale. | Parametric | Mixed ordered logistic regression |
Interval scale | The outcome variable does not fulfill the requirements for parametric testing (skewed distribution of observations). | Parametric | Mixed ordered logistic regression |
| The outcome variable fulfills the requirements for parametric testing (observations are normally distributed). | Parametric | Mixed standard linear regression |
Ratio scale | Time to an event. | | Mixed Cox regression |
| The dependent variable is count data, usually non-negative integers. | | Mixed Poisson regression |
| | | Mixed negative binomial regression |
| | | Mixed zero-inflated models |

Show agreement

This is typically done when one wants to evaluate new or existing tests used for screening or diagnostics.

Scale of measure for the outcome variable | Description | Type of test | Suitable tests | Comments
Nominal scale | The outcome variable is dichotomous and can only assume two values / classes. | Non-parametric | Cohen's kappa coefficient |
| | Non-parametric | Sensitivity and specificity | Indicates the performance characteristics ("the health") of the test. Good for test manufacturers.
| | Non-parametric | Likelihood ratio | Indicates whether the test provides new information. Useful for those who create guidelines. This is a special variant of the odds ratio.
| | Non-parametric | Predictive value of the test | Informs about the health status of the patient (assuming the test is applied to patients). Good for healthcare personnel.
| | Non-parametric | Etiologic predictive value | Predictive value of the test while adjusting for possible carriers ill from another agent than the one the test is looking for. Does not require a gold standard.
| The outcome variable can assume more than two values / classes. | Non-parametric | Cohen's kappa coefficient |
Ordinal scale | The outcome variable is measured with an ordinal scale. | Non-parametric | Cohen's kappa coefficient |
| | Non-parametric | Weighted kappa coefficient |
Interval or ratio scale | The outcome variable does not fulfill the requirements for parametric testing (usually due to a skewed distribution of observations). | – | Bland-Altman plot on transformed data | Transform the data so they become normally distributed.
| | Non-parametric | Non-parametric variant of limits of agreement | Instead of the mean and standard deviation, use the median and interquartile range of the differences between the tests.
| The outcome variable fulfills the requirements for parametric testing (and observations are usually normally distributed). | – | Bland-Altman plot | This is not a statistical test but a graphical representation of how well two tests agree. This graph is used very often.
| | Parametric | Limits of agreement | Often combined with a Bland-Altman plot.
| | Parametric | Lin's concordance correlation coefficient |
| | Parametric | Intraclass correlation (ICC) |
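For the dichotomous case in the table above, the performance measures and Cohen's kappa can all be computed from a single 2x2 table of the new test against a gold standard. The counts below are hypothetical, chosen only to show the arithmetic.

```python
# Performance and agreement measures for a dichotomous diagnostic test,
# computed from a hypothetical 2x2 table against a gold standard.
tp, fp, fn, tn = 80, 10, 20, 90           # hypothetical counts

sensitivity = tp / (tp + fn)              # fraction of diseased detected
specificity = tn / (tn + fp)              # fraction of healthy ruled out
ppv = tp / (tp + fp)                      # predictive value of a positive test
npv = tn / (tn + fn)                      # predictive value of a negative test
lr_positive = sensitivity / (1 - specificity)   # positive likelihood ratio

# Cohen's kappa: observed agreement corrected for chance agreement
n = tp + fp + fn + tn
po = (tp + tn) / n                        # observed agreement
pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2  # expected by chance
kappa = (po - pe) / (1 - pe)

print(sensitivity, specificity, ppv, npv, lr_positive, round(kappa, 3))
```

With these counts the test has sensitivity 0.80, specificity 0.90, a positive likelihood ratio of 8, and kappa of about 0.70, illustrating how the same 2x2 table serves both the performance measures and the agreement measure.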
