Video
Transcript
Okay. So we're going to jump right in. If you had failed normality for the repeated measures ANOVA, but you still want to be able to say, “are there differences between your three groups”, you might switch to the non-parametric test. So you might switch to the test that doesn't require normal data and that's going to be our Friedman's test.
So for the Friedman's test, this is used to determine whether one group's ranking on three or more continuous or ordinal variables differs. So it's similar to a repeated measures ANOVA: you've got three (or more) groups. And those groups must have either continuous data or ordinal data.
This test does NOT assume normality; you don't have to meet that expectation of normality. You don't have to do a Shapiro-Wilk test or a Kolmogorov-Smirnov. You just have to have either continuous or ordinal data to use this test. And I've left you some help guides for if you're looking for a little bit of extra help running those tests.
All right, so we're going to have to check a couple assumptions. Every test we have has certain assumptions that must be met in order for the results of the test to be valid.
For assumption #1: our dependent variable must be either ordinal or continuous. So for continuous data, there are three data types in RStudio for that: <int> which is integer, <dbl> which is double, <num> which is numeric. If you have one of those data types, you've met this assumption. You could also use ordinal data, so this is if you had (for example) Likert data. So if you've got the options like one to five, you've got a neutral midpoint, you've got symmetry around that midpoint and you're looking to say: “Are there differences based on some Likert data?”, you could do that. But it has to be listed as “ordered factor” as a data type.
If you're just joining for the first time, you might need to Run line 142; you might need to install the tidyverse package in order to run the next thing we're trying to do. [install.packages("tidyverse")]. I already have this on my computer, I don't need to Run this. I've already Run line 143 today, but I'll Run it again: we're going to load the tidyverse library so we can use some of the functions in that package. [library(tidyverse)]. I'm going to take a glimpse() at my data [glimpse(Fake_Data)]. And when I do that, it says glimpse of Fake_Data…we have 30 rows and 7 columns, and it tells us which data type everything is in my dataset. So I've got some <int> data and some <dbl> (or double) data. Our dependent variable: we're going to be using Fake_Data1, 2, and 3; these are listed as <dbl>, so we've met this assumption.
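If you would rather check the data types in code than read them off the glimpse() output, a minimal sketch in base R could look like the following. The Fake_Data object built here is a simulated stand-in (an assumption, not the workshop's real dataset), and dv_cols and likert are hypothetical names used only for illustration.

```r
# Simulated stand-in for the workshop's Fake_Data (assumption: values are made up)
set.seed(1)
Fake_Data <- data.frame(
  Participant = 1:30,
  Fake_Data1 = rnorm(30, mean = 87, sd = 10),
  Fake_Data2 = rnorm(30, mean = 92, sd = 10),
  Fake_Data3 = rnorm(30, mean = 113, sd = 10)
)

# Hypothetical helper vector naming the three dependent-variable columns
dv_cols <- c("Fake_Data1", "Fake_Data2", "Fake_Data3")

# is.numeric() is TRUE for both integer and double columns
is_continuous <- sapply(Fake_Data[dv_cols], is.numeric)
print(is_continuous)

# Likert-style data would need to be stored as an ordered factor instead
likert <- factor(c(3, 5, 1, 4, 2), levels = 1:5, ordered = TRUE)
print(is.ordered(likert))
```

If every entry of is_continuous is TRUE (or is.ordered() is TRUE for ordinal data), assumption #1 is met.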
So we've got our continuous data. We'd like to run this test. What do we do next? Well, assumption #2: the independent variable must be categorical with three or more related, matched, paired groups. There's different ways to say this, but you have to have…if I was a participant in the study, I need a score for all of my groups.
So we can look at our data and have a look [View(Fake_Data)]. In our fake dataset: each row is a participant. So if I was a participant in the study, I have a score for Fake_Data1, Fake_Data2, and Fake_Data3. So we pass this assumption, because each person has three related, matched, or paired groups. They have all three.
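Instead of scrolling through View(), one way to confirm that every participant has all three scores is to look for missing values row by row. This is only a sketch on simulated stand-in data; Fake_Data here is made up, and dv_cols and has_all_three are hypothetical names.

```r
# Simulated stand-in for the workshop's Fake_Data (assumption: values are made up)
set.seed(1)
Fake_Data <- data.frame(
  Participant = 1:30,
  Fake_Data1 = rnorm(30, mean = 87, sd = 10),
  Fake_Data2 = rnorm(30, mean = 92, sd = 10),
  Fake_Data3 = rnorm(30, mean = 113, sd = 10)
)
dv_cols <- c("Fake_Data1", "Fake_Data2", "Fake_Data3")

# TRUE for each row (participant) that has a score in all three columns
has_all_three <- complete.cases(Fake_Data[dv_cols])

# Should be TRUE if every participant has all three related scores
print(all(has_all_three))

# Lists any participants who are missing at least one score
print(Fake_Data$Participant[!has_all_three])
```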
I've left you an expert tip here: non-parametric tests do not care about outliers. So if you started with the repeated measures ANOVA and you removed a few outliers for certain reasons, you could (if you wanted) put those back in. Because this test actually doesn't care about outliers; this test works with ranks (and we summarize the groups with medians), where outliers don't matter as much as they do with the tests that use means. So it's up to you whether you would like to put those outliers back in.
And that's it for our assumptions! It's pretty easy. We've got the correct data types, we didn't actually do anything with outliers, so we're ready to go. We can run the Friedman's test.
So what we're trying to do: we would like to check whether the medians of Fake_Data1, 2, and 3 are statistically significantly different from each other.
Are the medians different? We're going to need the rstatix package to make this work. So if you haven't already, you can Run line 161, which is install.packages("rstatix"). I already have this on my computer, I don't need to Run this today. I can Run line 162, which is: library(rstatix). I get some red text, it says: Attention – or, sorry – “Attaching package: ‘rstatix’. The following object is masked from ‘package:stats’: filter”. This means the stats package and the rstatix package are fighting for what to do with the filter() function. This is not an error code, it's not saying it didn't work. It's warning me that if I use filter(), those two packages are going to fight, which is fine.
If you get an error code, either right here or in a minute saying you need the “coin” package, you will need to Run line 164 [install.packages("coin")]. So if you want, you can just Run it now and then that way you won't have that issue. It might ask you to install the coin package.
Okay, so I'm trying to use the Friedman's test. What if I've never done this? I don't know what I'm doing. I can Run line 166 which is: ?friedman_test(), and it will give me some output down here in our Help window to help me figure out what I'm doing. It says: friedman_test (that's our function) {rstatix} (that's the package the function comes from). And it says: “Friedman Rank Sum Test. Description: Provides a pipe-friendly framework to perform a Friedman rank sum test, which is the non-parametric alternative to the one-way repeated measures ANOVA test. Wrapper around the function friedman.test().” To read more, there's a button you can click. Here's the usage, it has some arguments, there's some values you can give it, and it gives you some examples. Might look a little concerning; that's okay, we're going to run through this together.
The thing that we care about is we actually need this data to be in long format. If you were just here for the repeated measures ANOVA version, we already did this, we already transformed the data into long format. And what I mean by “long format” is we need one column for the “group” (so it's going to say, for some parts, it's going to say Fake_Data1, for some of them it's going to say Fake_Data2, and for some of them it's going to say Fake_Data3)…we have a “score” for every person (for each of those groups)…and we have a “participant” column saying, are you participant 1, 2, or 3 [etc.]. And each participant has a score for Fake_Data1, they also have a score for Fake_Data2, and they also have a score for Fake_Data3. So that's what we mean by long data going down the page. So if you need a review for that, you can watch the repeated measures ANOVA video if you're following after-the-fact. To Run this, on line 169, we have: friedman_test(Fake_Data_score ~ Fake_Data_group | Participant, data = Fake_Data_Long). We need the long format data for this.
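To make “long format” concrete, here is a minimal sketch that stacks three wide columns into the three long columns described above. It uses base R on simulated stand-in data, so the values are made up, and it is not necessarily how Fake_Data_Long was built in the repeated measures ANOVA session; the column names are taken from the friedman_test() call.

```r
# Simulated stand-in for the wide Fake_Data (assumption: values are made up)
set.seed(1)
Fake_Data <- data.frame(
  Participant = 1:30,
  Fake_Data1 = rnorm(30, mean = 87, sd = 10),
  Fake_Data2 = rnorm(30, mean = 92, sd = 10),
  Fake_Data3 = rnorm(30, mean = 113, sd = 10)
)
dv_cols <- c("Fake_Data1", "Fake_Data2", "Fake_Data3")

# Stack the three score columns: one row per participant per group
Fake_Data_Long <- data.frame(
  Participant = factor(rep(Fake_Data$Participant, times = length(dv_cols))),
  Fake_Data_group = factor(rep(dv_cols, each = nrow(Fake_Data))),
  Fake_Data_score = c(Fake_Data$Fake_Data1, Fake_Data$Fake_Data2, Fake_Data$Fake_Data3)
)

# 30 participants x 3 groups = 90 rows going down the page
print(dim(Fake_Data_Long))
print(head(Fake_Data_Long))
```

In a tidyverse workflow, tidyr's pivot_longer() is a common way to get the same shape; the base R version is used here only so the sketch runs without extra packages.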
So what this is doing, is we're saying: “Here's your dependent variable, here's your independent variable, here's what we're doing for our error term, here's the data that we need”. We can highlight this and click Run, and it will give us some output. It says we have a tibble (a 1 by 6 tibble); it's just a data type in RStudio, don't worry about that part too much. It says you're going to look at the scores. Our “n” (the number of participants) is 30. It gives you a statistic. It gives you your degrees of freedom for that statistic. It gives you a p-value. And under “method”, it tells you what test you ran (that's the Friedman test). So the thing you probably care about the most here is the p-value; our p-value says 9.36e-14, which means take that decimal point and move it 14 places to the left. So it's going to have a lot of zeros. So our p-value…let's see if I can read it…is: .0000000000000936. That means there is a statistically significant difference in the medians of Fake_Data1, Fake_Data2, Fake_Data3, and that difference is somewhere between those groups. We don't know where, we're going to have to follow that up and ask in a second “Where is the difference?”, but somewhere in there, there is a difference.
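Since the help page describes friedman_test() as a wrapper around friedman.test(), the same test can be sketched with base R alone. The data below are simulated stand-ins, so the statistic and p-value will not match the workshop output; res and p_example are hypothetical names.

```r
# Simulated long-format stand-in (assumption: values are made up)
set.seed(42)
n <- 30
Fake_Data_Long <- data.frame(
  Participant = factor(rep(1:n, times = 3)),
  Fake_Data_group = factor(rep(c("Fake_Data1", "Fake_Data2", "Fake_Data3"), each = n)),
  Fake_Data_score = c(rnorm(n, 87, 10), rnorm(n, 92, 10), rnorm(n, 113, 10))
)

# Same formula shape as in the transcript: score ~ group | participant
res <- friedman.test(Fake_Data_score ~ Fake_Data_group | Participant,
                     data = Fake_Data_Long)

print(res$statistic)  # Friedman chi-squared statistic
print(res$parameter)  # degrees of freedom (number of groups minus 1)
print(res$p.value)    # p-value, often shown in scientific notation

# Writing a scientific-notation value out with all of its zeros
p_example <- 9.36e-14
print(format(p_example, scientific = FALSE))
```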
You might want to report the effect size for the Friedman's test. The effect size is called Kendall's W. And we can get that using the friedman_effsize() function. You give it essentially the same thing you just gave it for the Friedman's test, but now you pass it to the function friedman_effsize(). So: friedman_effsize(Fake_Data_score ~ Fake_Data_group | Participant, data = Fake_Data_Long). I can Run this…and it says you're checking “score”, your “n” is 30, here's your effect size, here's your Kendall W, and it was a large effect size. So it gives you a little bit of extra information. And if you wanted to write the results you would do so like this: Oh, we've got some question marks in there for some reason…interesting. So this should be for your Friedman's test, I don't know why it's giving us some question marks. So you've got your Friedman's test. You put your degrees of freedom. You say this was my statistic for the Friedman's test, here. You give it the p-value. You can give it the Kendall's W, which is the effect size so that people know “how big was this effect?”. That's how you would write it.
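For a sense of where Kendall's W comes from, it can be computed by hand from the Friedman statistic as W = chi-squared / (n × (k − 1)), where n is the number of participants and k is the number of groups. The sketch below does this in base R on simulated stand-in data, so the value will differ from the workshop output.

```r
# Simulated long-format stand-in (assumption: values are made up)
set.seed(42)
n <- 30
Fake_Data_Long <- data.frame(
  Participant = factor(rep(1:n, times = 3)),
  Fake_Data_group = factor(rep(c("Fake_Data1", "Fake_Data2", "Fake_Data3"), each = n)),
  Fake_Data_score = c(rnorm(n, 87, 10), rnorm(n, 92, 10), rnorm(n, 113, 10))
)

res <- friedman.test(Fake_Data_score ~ Fake_Data_group | Participant,
                     data = Fake_Data_Long)

k <- nlevels(Fake_Data_Long$Fake_Data_group)       # number of groups
n_participants <- nlevels(Fake_Data_Long$Participant)

# Kendall's W ranges from 0 (no agreement in rankings) to 1 (complete agreement)
W <- unname(res$statistic) / (n_participants * (k - 1))
print(W)
```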
The next thing you might want are your medians. Because you've got three groups and you're saying the medians are different somewhere, you might want to tell people what those medians are. So I can just use the median() function on all three and click Run. [e.g.: median(Fake_Data$Fake_Data1)]. The median for Fake_Data1 is 87.1945. The median for Fake_Data2 is 92.14947. And the median for Fake_Data3 was 112.7586. So you can write that out as well if you want.
It's also a good idea to visualize your data. The best visualization for a Friedman's test is a boxplot because the Friedman's test deals with medians. So it's not a graphing exercise today, you can just Run the line if you want, we can cover it if you have questions afterwards. [boxplot(Fake_Data$Fake_Data1, Fake_Data$Fake_Data2, Fake_Data$Fake_Data3, names = c("Fake_Data1", "Fake_Data2", "Fake_Data3"), ylab = "Median", xlab = "Group")]. But we're going to do a boxplot. It looks a little squished, so I can click the Zoom button to make it a bit bigger. So we've got medians on our y-axis. And on our x-axis we have Fake_Data1, we have a median, an interquartile range, and then our whiskers. We've got Fake_Data2, median, interquartile range, and whiskers. And we have Fake_Data3, median, interquartile range, and whiskers. The medians and the graph do not tell you which groups are different. You might be able to look at this and think you know which groups are different, but you still have to run the follow-up tests to say statistically which ones are different from each other. My guess is Fake_Data3 is going to be different, because look how high this is compared to everything else. But the graph is not enough to draw that conclusion. So we're going to do some post hoc testing.
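If you want all three medians in one go rather than three separate median() calls, a short sketch like this would do it. The data are simulated stand-ins, so the medians will not match the values read out above, and the y-axis label "Score" is my own choice, since the axis shows the raw scores with each median drawn as the line inside its box.

```r
# Simulated stand-in for the workshop's Fake_Data (assumption: values are made up)
set.seed(1)
Fake_Data <- data.frame(
  Participant = 1:30,
  Fake_Data1 = rnorm(30, mean = 87, sd = 10),
  Fake_Data2 = rnorm(30, mean = 92, sd = 10),
  Fake_Data3 = rnorm(30, mean = 113, sd = 10)
)
dv_cols <- c("Fake_Data1", "Fake_Data2", "Fake_Data3")

# One median per column, returned as a named vector
medians <- sapply(Fake_Data[dv_cols], median)
print(medians)

# Boxplot of the three columns; box names come from the column names
boxplot(Fake_Data[dv_cols], ylab = "Score", xlab = "Group")
```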
So if you found a significant Friedman's test, if your p-value was less than (<) .05, it means there's a difference somewhere and you need to follow that up with additional testing to say where are the differences. So we found a significant Friedman's test, we're going to follow that up with some Wilcoxon (essentially non-parametric) t-tests. So we're going to use: wilcox_test(Fake_Data_score ~ Fake_Data_group, data = Fake_Data_Long, paired = TRUE, p.adjust.method = "bonferroni"). What this is doing, is saying: this is our dependent variable, this is our independent variable, this is the dataset we're trying to use. It's paired data because it's repeated data (I'm a participant, I have a score 1, 2, and 3). And we're going to make an adjustment, the Bonferroni adjustment, because we're doing three tests at the same time. So I can highlight this, I can click Run, and it gives you the results of all three tests. It says which groups are different from each other. So we're looking at the “score” variable, and Fake_Data1 versus Fake_Data2, we have a p-value which is right here…the adjusted value…and it gives you some stars to say whether that is statistically significant. You would generally report the adjusted p-value, because that's saying: “I ran three tests. I don't want to make a mistake by accident, and say something is significant if maybe it's not; I'm going to add a Bonferroni adjustment to make it less likely for me to make a mistake”. So the p.adj [p adjust] column is normally what you would want to report. So we've got our p-values here. Fake_Data1 is significantly different than Fake_Data2. Fake_Data1 is significantly different than Fake_Data3. And Fake_Data2 is significantly different than Fake_Data3. We know they're significant because the p-values are less than (<) .05. So we can say there is a difference in the medians between 1 and 2, between 1 and 3, and between 2 and 3.
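To see what the Bonferroni adjustment is doing behind the scenes, here is a sketch that runs the three paired Wilcoxon tests one at a time with base R's wilcox.test() and then adjusts the p-values with p.adjust(). This is a stand-in for the rstatix call, not a copy of it, and the data are simulated, so the p-values will not match the workshop output.

```r
# Simulated stand-in for the workshop's Fake_Data (assumption: values are made up)
set.seed(1)
Fake_Data <- data.frame(
  Participant = 1:30,
  Fake_Data1 = rnorm(30, mean = 87, sd = 10),
  Fake_Data2 = rnorm(30, mean = 92, sd = 10),
  Fake_Data3 = rnorm(30, mean = 113, sd = 10)
)
dv_cols <- c("Fake_Data1", "Fake_Data2", "Fake_Data3")

# All pairs of groups: 1 vs 2, 1 vs 3, 2 vs 3
group_pairs <- combn(dv_cols, 2, simplify = FALSE)

# Paired (signed-rank) Wilcoxon test for each pair; rows line up by participant
p_raw <- sapply(group_pairs, function(pair) {
  wilcox.test(Fake_Data[[pair[1]]], Fake_Data[[pair[2]]], paired = TRUE)$p.value
})
names(p_raw) <- sapply(group_pairs, paste, collapse = " vs ")

# Bonferroni: multiply each p-value by the number of tests, capped at 1
p_adj <- p.adjust(p_raw, method = "bonferroni")

print(data.frame(p = p_raw, p.adj = p_adj))
```

The adjusted p-values are never smaller than the raw ones, which is why reporting p.adj is the more cautious choice when you run several tests at once.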
If the p-values were greater than (>) .05, all we would be allowed to say is “we failed to find a difference” between the medians of those groups. I didn't say there's NO difference, I say we “failed to find a difference”, and that's a very specific little stats thing for you. But that is how you do the Friedman's test.
Time commitment
Greater than 15 minutes
Description
RStudio Workshop Series: Friedman Test shows the process of conducting a Friedman's test in RStudio, including assumption checking, interpretation, and graphing.
License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.