Video
Transcript
Welcome everyone to lesson 8. Today, we are doing one-way ANOVA and the Kruskal-Wallis H test. Jumping right in. A one-way ANOVA: ANOVA stands for “ANalysis Of VAriance”. So a one-way ANOVA is used to determine whether the means of the same continuous variable from three or more groups differ. So we generally want one type of variable (e.g., whether we're measuring something like “happiness”), we're going to be checking the means on that variable across three separate groups in this case. This is a parametric test, so it assumes normality. And when we say normality, we should think that typical bell-shaped curve with the highest amount of data in the middle and the lowest amount of data at the ends. And if you need a little bit of extra help running a one-way ANOVA, we have the RStudio LibGuide that has a section on this, so that looks something like that. We also have the RStudio documentation. And the Laerd statistics guide that I recommend.
The first thing we're going to want to do if we're going to be doing a one-way ANOVA today, is we need to tell the computer where our file is located on our computer, and then open that file. So the first thing we're going to do is set our working directory, which is the fancy way of saying: “Tell the computer where the file is”. There are lots of different ways to do this in RStudio, but the easiest way is we're going to click the button that says “Session”, we're going to click the button that says “Set Working Directory”, and we're going to click the button that says “Choose Directory”. That will open up our File Explorer on our computer, so we can tell the computer: “Where is my file”. Today, I'm in: Documents > Classes and Workshops > Lindsay's workshops > MICRO WORKSHOPS > RStudio. It doesn't show me anything in this File Explorer window, but I know this is the correct location, so I can say “Open”. So again, if you're doing this on your own computer, you're going to say: Session > Set Working Directory > Choose Directory, go through your computer files in the File Explorer, and pick where you have stored YOUR file today. Your file folder structure is different than mine.
We know this worked because down in the bottom-left in the Console, we get some blue text saying: setwd(). That's our function saying “set the working directory”. And in the top-right…ooh, it should say…oh! We haven't opened the file yet. So for now, we just have the working directory. So you want to copy this line of code and replace what's on line 23 of my script (if you're using my script), or just copy this into a blank script if you're working from blank. So we've set the working directory, we've told the computer: “This is where my file is located”. In the bottom-right in the Files window you should see the files for the folder that you have selected. So this is my File Explorer, it's saying this is what's in this folder for me.
The next thing we want to do, is actually open the file; so we haven't opened the file yet. We're going to use the read.csv() function to say: “Open up today's file”. We're going to give it a name. I have called mine Fake_Data. So I have: Fake_Data = read.csv("Fake_Data.csv"). You have to spell the file exactly as the file is spelled; with the file extension, with all underscores and all spaces and everything like that, all capitals…to make sure that you're opening the file exactly as written. So if I highlight this and click Run, we get some blue text down in the bottom-left in the Console. And in the top-right in our Global Environment, we now have a data tab, which has Fake_Data. It has 30 observations of 7 variables, and if I click on this, it opens like an Excel sheet, and I can see my data. So we know that this worked.
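As a minimal sketch of these two steps in code form: the folder and the three-row file below are made-up stand-ins (the real Fake_Data.csv has 30 rows and 7 columns), created in a temporary folder only so the example runs anywhere. The `<-` arrow assigns in the same way as the `=` used in the workshop script.

```r
# Stand-in for the workshop file: write a tiny CSV into a temporary folder.
# (The folder and the values are invented for illustration only.)
demo_dir  <- tempdir()
demo_rows <- data.frame(Fake_Data1 = c(85.2, 90.1, 87.4),
                        Colour     = c(2L, 1L, 3L))
write.csv(demo_rows, file.path(demo_dir, "Fake_Data.csv"), row.names = FALSE)

# Same effect as Session > Set Working Directory > Choose Directory.
# setwd() hands back the previous directory, so we can restore it afterwards.
old_wd <- setwd(demo_dir)

# The file name must match exactly, including the .csv extension.
Fake_Data <- read.csv("Fake_Data.csv")

setwd(old_wd)    # put the working directory back where it was

dim(Fake_Data)   # number of rows, then number of columns
```

On your own computer you would skip the stand-in file and pass setwd() the path to the folder that actually holds your data.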
So we've opened our file, now we want to run the test. How do we do this?
Every test that we want to run has certain assumptions that must be met in order for the results of that test to be valid. So assumption #1 for a one-way ANOVA is that the dependent variable must be continuous. In RStudio, there are a couple ways to denote continuous data: that's <int> for integer, <dbl> for double, or <num> for numeric. Those are the data types we're looking for. So we can ask RStudio: “What data types do you think we're working with today?”, and if it's wrong, we can fix it. If you haven't come to a workshop before, you'll probably have to run line 30; that's: install.packages("tidyverse"). That takes the tidyverse package from the Internet and puts it on your computer so you can work with it. I already have tidyverse on my computer, I don't need to take that from the Internet today. But I will run line 31, which is: library(tidyverse), because that tells the computer: “I have this on my computer and I would like to use it right now”. So we're going to use the tidyverse package, and the first thing I'm going to do is I'm going to use the function glimpse() to take a glimpse of my fake data (i.e., I want to look at my data and say: “What data types do we have?”). So: glimpse(Fake_Data). And if I Run this, in the bottom-left in the Console we get some blue text showing that this worked because I've run a glimpse() of Fake_Data. It tells me 30 rows and 7 columns, which matches what is up in the Global Environment, so that's good. And it's got dollar signs ($) to denote which column I'm looking in, and then it tells me what data type it is. So our continuous variable today is Fake_Data1 and that is listed as <dbl> or double, which is one of RStudio's ways of saying: “This is continuous data”. So we have met assumption 1: our dependent variable is continuous. Excellent.
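If tidyverse is not installed, the same data-type check can be sketched with base R. This swaps in str() and class() for glimpse(), and it uses a simulated two-column stand-in for Fake_Data (the values are random and are not the workshop data).

```r
# Simulated stand-in: 30 rows, a continuous score and an integer-coded group.
set.seed(8)
Fake_Data <- data.frame(
  Fake_Data1 = rnorm(30, mean = 87, sd = 5),       # continuous (double)
  Colour     = rep(1:4, times = c(9, 8, 7, 6))     # stored as integer codes
)

# Base R counterpart of glimpse(): one line per column with its type.
str(Fake_Data)

# Just the types, one per column.
sapply(Fake_Data, class)

# Assumption 1 check: is the dependent variable stored as a number?
is.numeric(Fake_Data$Fake_Data1)
```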
We can then move on to assumption 2. For assumption #2, the independent variable must be categorical with three or more independent groups. So that means if you were a participant in the study, you were only in Group 1, or in Group 2, or in Group 3; you were not in all of the groups, you were not in multiple groups, you were only in one group. So today we're going to be using our Colour variable as our grouping variable, and there are four different colours in this variable.
You can either be (if you were in this experiment) in the red Group, the blue group, the yellow group, or the purple group for this dataset. So we just took a glimpse() of our data, and if we look at $Colour, $Colour is listed as <int> or integer, so it's treating it like a number. We know – because I just said it's four different colours – we know that this actually shouldn't be treated as integer, it should be treated like it is categorical data. Which means it needs to be set to what we call “factor” data type; it needs to be <fct> data in RStudio to be treated properly.
We can actually change the data type to make sure it's correct. So if you have an independent variable, and it's listed as the wrong data type, we want to make sure it's set to <fct> or factor, and we can do that using what's on line 44. So we use the as.factor() function to say: “In our Fake_Data dataset, we would like to change the Colour column”. So we say $Colour. So we're setting the Colour column to a factor, and we overwrite what's already in the Colour column by saying Fake_Data$Colour is equal to (=) all of this. So to read the line of code from left to right, we have: Fake_Data$Colour = as.factor(Fake_Data$Colour). So we're saying: “Take the Colour column, make it a factor, and overwrite what's already in that column”.
We can Run this...it doesn't look like much happened, we just get some blue text down in the bottom-left. So we can check that this actually works by re-running the glimpse() line. So we can say glimpse(Fake_Data). And what we see has changed, is it still says we have 30 rows and 7 columns…but $Colour is now listed as <fct> which is factor data type. Which means it's going to treat it like four different groups, four categorical groups, which is correct. So we've treated our data like it's factor type data. It's still listed as numbers, so if we look in this – right now, it's listed as a row, but it's from our column, our Colour column – we have options 2-1-3-3-4-2, et cetera. That might be confusing, because we might not want to have to remember which number means which colour. So if we wanted, we don't have to, but we could change the numbers to words so we know which number means which word. And we can do that on line 49. So we're going to use the recode() function. We're going to say: “What are we recoding? Fake_Data$Colour”. It's a Colour column. And how would you like to recode() it? The number “1” we will set to the word “blue”; the number “2” we will set to the word “purple”; the number “3” we will set to the word “yellow”; and the number “4” we will set to the word “red”.
And to make this stick, we have to overwrite what is in Fake_Data$Colour. So we say: take the Colour column and replace what's already there with the words. So what this is (if we read from left to right), what this is saying: Fake_Data$Colour = recode(Fake_Data$Colour, "1" = "blue", "2" = "purple", "3" = "yellow", "4" = "red"). Here I have the numbers AND the words in quotation marks. Only the words NEED to be in quotation marks, but sometimes RStudio makes a mistake, so sometimes it treats the numbers like they are words as well; so just to make sure we never have any problems, you can just put the numbers in quotation marks as well, which is what we have here. So if we highlight this and click Run…we get some blue text, it looks like it worked. And I can either (in the Global Environment at the top-right) click on Fake_Data and open it that way, or I can run line 50 (they do the same thing): View(Fake_Data). And this will show us our data again. And the change we need to see is in the Colour column, we now have colours. We have words instead of numbers; we have purple, blue, yellow, and red. So this worked. We have changed the numbers to words so it's just easier for us to work with. And again, you don't need to do that, but this way you don't have to have another file open to say: “Well, what was Group 2 again? What was Group 3?”, you just already know what's what.
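As a hedged alternative sketch, base R's factor() can do both steps at once (convert to a factor and attach the colour words) without the recode() function. The integer codes below are a simulated stand-in for the real Colour column; the code-to-word mapping is the one given above.

```r
# Simulated stand-in for the integer-coded Colour column (1 to 4).
Fake_Data <- data.frame(Colour = rep(1:4, times = c(9, 8, 7, 6)))

# levels = the codes as stored; labels = the words to show instead.
# The order of labels must line up with the order of levels.
Fake_Data$Colour <- factor(Fake_Data$Colour,
                           levels = 1:4,
                           labels = c("blue", "purple", "yellow", "red"))

class(Fake_Data$Colour)     # now a factor, not an integer
levels(Fake_Data$Colour)    # the four colour words
table(Fake_Data$Colour)     # how many rows fall in each group
```

Either route ends in the same place: a factor column whose groups are labelled with words.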
So we've got our assumption. We have independent groups; you can only have one of these options – each participant lists one colour. So we have four different groups, you can only have one of those groups, and we put the words on top just to make it easy. So we passed that assumption.
Assumption #3 data must be independent. So we talked about this if you've come to any workshops before. Independence generally means something kind of specific in stats-land; it means you want to make sure, in this case you only get one of the options for Colour. So we have independent groups, that's one version of independence. This version of independence normally means independence between people; so each row in this data set should be a unique participant (one person per row), and, generally speaking, you want to do something like random sampling, so it's not all of your best friends filling out the same survey, because that data might not be independent. They might know each other in some way, they might all respond similarly in some way because the data aren't independent. It's hard to track for independence after-the-fact. But today's a fake dataset, we can say we've met this assumption. So this is something you want to be thinking about and checking and working towards even before you run your study.
So next we have assumption #4: The dependent variable should be approximately normally distributed for each independent group. Here we have four Colour options, so within Colour there are four different groups; so we need to be checking normality for each group. And there are a bunch of ways to check normality, so you could do what's called visual inspection (make some sort of graph); you could make a histogram or a boxplot and say: “Is there any data that looks really far away from the rest of the data?”. If yes, maybe we remove it. You could also do something like a statistic. So today we're going to be running a statistic, so it gives you a p-value to be very objective and say: “Is this data statistically normal, or do we say it's statistically not normal?”. So today, we don't have very many data points per group, we have less than 50 per group, so we're going to be doing what's called the Shapiro-Wilk statistic because this works best with smaller data sets.
So we have to check EACH group. The easiest way to do this is just filter() your four groups, so make smaller scales of data, or make subsets of your data (so one for blue, one for purple, one for yellow, one for red) to say: “I want to check those groups individually”. So on lines 64, 65, 66, and 67, that's what we've done. I have blue = filter(Fake_Data, Colour == "blue"). And we've got that for purple, for yellow, and for red. The double equal sign (==) here is important, and it means “is equal to”.
If the value in the Colour column is equal to (==) the word blue, that participant will be put into the “blue” dataset. If the value is purple, that participant will be put in the “purple” dataset. So “is equal to” (==) is different than a single equal sign (=) which is “set equal to”. We're going to check these four different options, so we can highlight this and click Run. It will Run all of these, so we've got blue, purple, red, and yellow down here. And in our Global Environment, we now have four additional smaller datasets: we have 9 observations in blue, 8 in purple, 6 in red, and 7 in yellow. So it looks like it worked. We've got our different datasets, and if I wanted to open up “blue”, I could see that everyone in the blue dataset has listed blue as their Colour. So we've got our small datasets.
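The subsetting step can also be sketched in base R, with subset() standing in for the tidyverse filter() used above. The data are simulated, with group sizes chosen to match the 9, 8, 6, and 7 observations reported for blue, purple, red, and yellow.

```r
# Simulated stand-in data (random values, not the workshop file).
set.seed(8)
Fake_Data <- data.frame(
  Fake_Data1 = rnorm(30, mean = 87, sd = 5),
  Colour     = factor(rep(c("blue", "purple", "yellow", "red"),
                          times = c(9, 8, 7, 6)))
)

# `==` asks a question ("is Colour equal to blue?");
# `<-` (or `=`) stores the answer under a name.
blue   <- subset(Fake_Data, Colour == "blue")
purple <- subset(Fake_Data, Colour == "purple")
yellow <- subset(Fake_Data, Colour == "yellow")
red    <- subset(Fake_Data, Colour == "red")

# split() builds all four subsets in one go, as a named list.
by_colour <- split(Fake_Data, Fake_Data$Colour)
sapply(by_colour, nrow)   # rows per group
```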
To actually run the Shapiro-Wilk statistic, we use those smaller datasets. So blue, purple, red, and yellow. And we use the shapiro.test() function. So for example, we've got shapiro.test(blue$Fake_Data1). And the capitalization has to match exactly. We have called this dataset lowercase blue, but we have called Fake_Data capital F “Fake”, capital D “Data”. So we have to make sure that matches exactly or this won't work. We can highlight all four of these at once, or we can go one at a time. So if I look at just blue, I can run just blue, and it says: “Shapiro-Wilk normality test”. I have used our “blue” dataset and I'm looking at the Fake_Data1 variable in the blue dataset. So for just these nine people here, did we pass normality on Fake_Data1?
We get a W-statistic and we get a p-value. And the p-value tells you whether you have passed normality. This is an assumption check, so if p is less than (<) .05, we have a problem; something is wrong, we might need to stop and make an adjustment.
Here, our p-value is .542 which is above (>) .05, which means we're okay. The blue group has passed normality. Let us check the other groups. So if I now Run the “purple” group…we've Run the purple group on Fake_Data1, and our p-value is .1815; we have passed normality. I can Run the “yellow” group…we've Run yellow Fake_Data1, and our p-value is .1493; we have passed normality. And I can Run the “red” group…we have red for Fake_Data1, and our p-value is .2285. For all four of our groups, our p-value is greater than (>) .05, which means we've passed the Shapiro-Wilk statistic. All of our groups are statistically normal. We've met the assumption of normality. So this means we passed.
If you had NOT passed, if even one of your groups fails, you have failed normality for the entire test, and you might want to do something like switch to the non-parametric test (which is the Kruskal-Wallis H test). We're going to run both versions today, but we've passed, so actually the one-way ANOVA would be appropriate. So we get a checkmark there, we’ve passed this assumption.
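A compact sketch of the per-group normality check: instead of four separate shapiro.test() lines, tapply() runs the test once per Colour group and keeps only the p-values. The data are simulated stand-ins, so the p-values will not match the ones read out above.

```r
# Simulated stand-in data (random values, not the workshop file).
set.seed(8)
Fake_Data <- data.frame(
  Fake_Data1 = rnorm(30, mean = 87, sd = 5),
  Colour     = factor(rep(c("blue", "purple", "yellow", "red"),
                          times = c(9, 8, 7, 6)))
)

# One Shapiro-Wilk test per group; keep just the p-value from each.
shapiro_p <- tapply(Fake_Data$Fake_Data1, Fake_Data$Colour,
                    function(x) shapiro.test(x)$p.value)
round(shapiro_p, 3)

# The assumption is met only if EVERY group has p > .05.
all(shapiro_p > 0.05)
```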
Moving on to assumption #5. I haven't left you any code here, but I left you a note. Assumption 5 is no significant outliers for each independent group. So just like we checked normality for each group individually (four groups, four normality checks), same with outliers; if we were checking outliers today, we would need to check this for each of our four groups. Different fields and areas of study look for and remove outliers in different ways. So if you're working on your own data, you'll want to touch base with your research team and see how do YOU identify and remove outliers in your field. There's a bunch of ways to do this: some people do histograms and box plots (if there's any data that's really far away, we'll remove it), some folks will do things like the mean and standard deviation of each group (if you're too far away from the data based on your field, they'll remove that data). So chat with your research team about how you folks remove outliers. And if you remove outliers, you should check normality again, because if you had normality problems, removing outliers will sometimes fix normality. So that’s assumption 5.
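The workshop script has no outlier code, so the following is only one possible sketch, not the workshop's method: it applies the boxplot convention (points beyond 1.5 times the interquartile range from the box) to each group separately. Your field may use a different rule. The data are simulated, and one extreme value (150) is planted in the blue group on purpose so there is something to find.

```r
# Simulated stand-in data with one deliberately planted extreme value.
set.seed(8)
Fake_Data <- data.frame(
  Fake_Data1 = rnorm(30, mean = 87, sd = 5),
  Colour     = factor(rep(c("blue", "purple", "yellow", "red"),
                          times = c(9, 8, 7, 6)))
)
Fake_Data$Fake_Data1[1] <- 150   # planted outlier in the blue group

# boxplot.stats()$out lists the points a boxplot would draw as outliers.
# split() + lapply() applies that rule to each Colour group on its own.
outliers <- lapply(split(Fake_Data$Fake_Data1, Fake_Data$Colour),
                   function(x) boxplot.stats(x)$out)
outliers
```

If you do remove any values, re-run the normality checks afterwards.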
We can move on to assumption #6, homogeneity of variances (it’s a $5 word). This means that your three or more groups must have approximately equal variance.
We can test this by running what we call the Levene's test. You might have to run line 89; if you've been to a couple of my workshops before, you might have already run this. I don't need to run line 89 because I already have the “car” package on my computer, so line 89 is: install.packages("car"). I do need to run line 90 here, which is library(car), so I'm going to run this one. This will give us the Levene's test function, so it's leveneTest(Fake_Data$Fake_Data1, Fake_Data$Colour, center = mean). This test is saying: “Check Fake_Data1 for each of my four Colour groups”, and “center = mean” (spelled “center” in the code) is something we want to say if we're doing Levene's test for a one-way ANOVA. If you forget to write “center = mean”, it will use “center = median”, which might be the same or it might be a little bit different. So just make sure you say “center = mean” if you're doing one-way ANOVA.
So if I highlight this and click Run, we will run the leveneTest() to check for homogeneity. So we've got: “Levene's Test for Homogeneity of Variance (center = mean)”. It says group, it's got the degrees of freedom, we've got an F-value (which is 1.0064), and Pr(>F) that's our p-value (so that's 0.4057). Again, this is an assumption check; if p-value is less than (<) .05, we have a problem…we have failed this assumption, we might need to do something else. Here, our p-value is greater than (>) .05. So we've actually passed homogeneity of variance. It's saying for my four colour groups, the variances are approximately equal across those four groups. So we get a check mark, we're good to move on, we've passed this assumption. If you fail this assumption, you might need to make an adjustment to your calculations. But today we've passed, so we're good. I also said make sure you say “center = mean”, that's the Levene's test. If you leave it as “center = median”, it's running a different kind of test called the Brown-Forsythe test. Sometimes those give the same answer, sometimes they don't, so just be aware that there is what we call an “argument”, and we can change what that argument is saying to change the result of our test and to change which test it's running.
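To see what the mean-centred Levene's test is doing under the hood, here is a base R sketch that does not need the car package: take each score's absolute distance from its own group mean, then run a one-way ANOVA on those distances. The data are simulated stand-ins, and the helper column names (group_mean, abs_dev) are made up for this example.

```r
# Simulated stand-in data (random values, not the workshop file).
set.seed(8)
Fake_Data <- data.frame(
  Fake_Data1 = rnorm(30, mean = 87, sd = 5),
  Colour     = factor(rep(c("blue", "purple", "yellow", "red"),
                          times = c(9, 8, 7, 6)))
)

# Step 1: each row gets the mean of its own Colour group.
Fake_Data$group_mean <- ave(Fake_Data$Fake_Data1, Fake_Data$Colour, FUN = mean)

# Step 2: absolute distance of each score from its group mean.
Fake_Data$abs_dev <- abs(Fake_Data$Fake_Data1 - Fake_Data$group_mean)

# Step 3: one-way ANOVA on those distances. The F and p on the Colour row
# are the Levene's test result (mean-centred version).
levene_table <- anova(lm(abs_dev ~ Colour, data = Fake_Data))
levene_table

levene_p <- levene_table[["Pr(>F)"]][1]
levene_p > 0.05   # TRUE means the equal-variance assumption is not rejected
```

Replacing FUN = mean with FUN = median in step 1 gives the median-centred (Brown-Forsythe) version instead.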
Okay, we have checked all of our assumptions and today we have actually passed all of our assumptions, which I didn't do on purpose, but sometimes that works out. So in our fake dataset, we're good to go. We are ready to run the one-way ANOVA. How do you actually run a one-way ANOVA?
So today what we're going to check: we're going to check to see whether Fake_Data1’s mean differs based on the participants’ reported fake Colour, what's their favourite Colour? Again, all of this is fake data. It's all made-up. What is Fake_Data1? You can make it whatever you want! Maybe it's a happiness score, maybe it's how much sleep they got last night, maybe it's their salt intake. Whatever works. So whatever you're thinking, that can be what we're looking at for Fake_Data1. Does that value differ based on their reported fake favourite Colour? Let's see.
We're going to use the aov() function today for a one-way ANOVA. And if you've never used this before, you can always ask for a little bit of help using: ?aov(). If we Run this, in the bottom-right in our “Help” window, it will pop out the documentation for RStudio to tell us: “What is this doing? What do we need?”. So we have here: “aov {stats}” (that means it's from the stats package). The stats package comes with R and is loaded automatically, so we're good, we've got everything we need here. It says: “Fit an Analysis of Variance Model. Description: Fit an analysis of variance model by a call to lm”. lm(), as a side note, that's what we use for linear regression normally. And it says: “(for each stratum if an Error(.) is used).”. Sounds a little confusing. What else does it say? “Usage: aov(formula, data = NULL, projections = FALSE, qr = TRUE, contrasts = NULL, …)”. Whoa, that's a lot going on! And we can scroll…and we can get a little bit more information about each of the different arguments within this function. And we can get a little bit more detail in the “Details” section, so it talks about lm, which is for linear models, which we use again for regression a lot of times. It's got stuff about print and summary…this looks like a lot going on if you don't know how to read R code. So what we can do is say, let's put that to the side for now, and I'll show you what we need.
What we need: pretty simple, we've already got the data in the right format in our fake dataset. We've got Fake_Data1 going down the page and we have Colour going down the page; those are the things we care about the most. We're going to use the aov() function. First we give it what we call our dependent variable: what's the thing that you want to see “is this different between our groups”? So we've got Fake_Data1. We have the tilde (~) symbol, which is generally in the top-left of your keyboard; it's a little squiggle. And then we've got our grouping variable, so what's your independent variable (this is Colour: which Colour did they pick)? One of four options here. Then we say data; what data are we using? We're using Fake_Data. Reminder, we have other datasets [blue, purple, red, yellow], but we want to use the big dataset with all four colour groups. And we need to give that a name: very creative, I called mine “my_anova”. So reading from left to right we have: my_anova = aov(Fake_Data1~Colour, data = Fake_Data). If we Run this, we get some blue text in the bottom-left in the Console. And in the “Data” tab in the top-right in the Global Environment, we now have “my_anova”, and it says “List of 13”. It's stored a bunch of stuff for us, but you'll see you don't actually have any results. Nothing popped up on the screen. It didn't tell you what your p-value is, told you nothing.
It's stored the result in my_anova because that's what we asked it to do. And to get the results of the ANOVA, you have to run the next line, which is: summary(my_anova). Because you're saying: “I have stored this in my_anova, give me the summary, give me my results.”. If we Run this, we get a little bit more information. So you're going to want to look on the “Colour” line here. The Colour line is your group. Residuals is important for if you're doing some other calculations or maybe if you need to report some stuff; so if you needed to report your degrees of freedom, for example, you'll need stuff here. But generally speaking, you want to look on the line that is listed as the name of your grouping variable. So here we're looking at Colour. It tells us our degrees of freedom. It tells us our sums of squares. It tells us about mean squares. It gives us our F-value, so this one here is the F-value .159. And Pr(>F) is our p-value. It uses weird notation, but that's your p-value and it's 0.923. So our p-value is greater than (>) .05; we have non-significant results.
And we are not allowed to say there is NO difference between the Colour groups. But what we are allowed to say is “we failed to find a difference between our Colour groups”; we cannot say whether Fake_Data1 differs based on the participants’ reported favourite Colour.
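Putting the two ANOVA lines together, here is a sketch that fits the model, prints the summary table, and then pulls the F-value and p-value out of it by name. The data are simulated stand-ins, so the numbers will differ from the .159 and .923 read out above; the names anova_table, f_value, and p_value are made up for this example.

```r
# Simulated stand-in data (random values, not the workshop file).
set.seed(8)
Fake_Data <- data.frame(
  Fake_Data1 = rnorm(30, mean = 87, sd = 5),
  Colour     = factor(rep(c("blue", "purple", "yellow", "red"),
                          times = c(9, 8, 7, 6)))
)

# Dependent variable ~ grouping variable, using the full dataset.
my_anova <- aov(Fake_Data1 ~ Colour, data = Fake_Data)

# Nothing prints until we ask for the summary.
summary(my_anova)

# The summary holds one table: row 1 is Colour, row 2 is Residuals.
anova_table <- summary(my_anova)[[1]]
f_value <- anova_table[["F value"]][1]
p_value <- anova_table[["Pr(>F)"]][1]

# Build the sentence-style report: F(df1, df2) = ..., p = ...
cat(sprintf("F(%d, %d) = %.3f, p = %.3f\n",
            as.integer(anova_table[["Df"]][1]),
            as.integer(anova_table[["Df"]][2]),
            f_value, p_value))
```

With four groups and 30 participants the degrees of freedom are 3 (groups minus one) and 26 (participants minus groups), whatever the data values are.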
The next thing you might want to do is calculate an effect size; so we have a non-significant finding, but how big is that effect? Is it really tiny? Do I need more people? Or is it a really big effect, and maybe we just got unlucky! What's going on? So let's go find out how big this effect actually is. The effect size for a one-way ANOVA is generally partial eta-squared, which we can get from the rstatix package. If you've been to a workshop before, you've probably already Run line 108 [install.packages("rstatix")]. If you haven't, make sure you Run that. I already have the rstatix package, I don't need that right now. But I will Run line 109, which is: library(rstatix). So I'll Run this…I get a little bit of red text, it says: “Attaching package: ‘rstatix’. The following object is masked from ‘package:stats’: filter”. That's saying the stats package and the rstatix package are fighting on my computer about what they should be doing for the filter() function. This is not an error code, this is just a warning saying: “If you're trying to use the filter() feature or the filter() function, these two packages are kind of fighting in the background and it's going to tell you which one it's using”. So we don't have to worry about this. It's not an error code. But just be aware, we did get a little bit of red text, and that's what it means.
To actually run the partial eta-squared function, it's literally: partial_eta_squared(my_anova) because we're trying to calculate the partial eta-squared calculation or the partial eta-squared number for the ANOVA you just ran. So we can highlight this and click Run…and it spits out Colour = .01806467. So our partial eta-squared value is like .02 if you round it. That's really small. We didn't find anything statistically significant, which makes sense because the effect size here is itty bitty; it's very tiny. You would need a lot of participants to find something here.
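For a sense of where that number comes from, partial eta-squared can be worked out by hand from the sums of squares in the ANOVA table: the Colour sum of squares divided by (Colour plus Residuals). This is a base R sketch on simulated stand-in data, not the rstatix function itself, so the value will not be .018.

```r
# Simulated stand-in data (random values, not the workshop file).
set.seed(8)
Fake_Data <- data.frame(
  Fake_Data1 = rnorm(30, mean = 87, sd = 5),
  Colour     = factor(rep(c("blue", "purple", "yellow", "red"),
                          times = c(9, 8, 7, 6)))
)
my_anova <- aov(Fake_Data1 ~ Colour, data = Fake_Data)

# Sums of squares: element 1 is Colour (effect), element 2 is Residuals (error).
ss <- summary(my_anova)[[1]][["Sum Sq"]]

# partial eta-squared = SS_effect / (SS_effect + SS_error)
partial_eta_sq <- ss[1] / (ss[1] + ss[2])
round(partial_eta_sq, 3)
```

With only one grouping variable, this is the same quantity as ordinary eta-squared, and it equals the R-squared of the matching linear model.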
I've also given you how you would actually write this thing. So we've got all this data in our Console, all this stuff down here. How do we write about this if we were writing a paper? You would write capital and italicized F, left bracket, 3 for your numerator degree of freedom, comma, 26 for your denominator degree of freedom, right bracket, is equal to…what's your F-value? It's this one right here: .159. Then you give it your p-value. Your p-value is this one right here: .923. And then you have to give it your partial eta-squared value, because that's your effect size, and you would give it this one right here, which is: .018. That is how you write it [F(3, 26) = 0.159, p = 0.923, partial eta-squared = 0.018].
The next thing you might want to do is actually give the means of the different groups. What was the mean for blue? What was the mean for red? There's a bunch of ways to do this, but because we've already broken up our dataset into those four mini datasets, we can run lines 115 to 118, where it says: mean(blue$Fake_Data1). I do that for blue, I do it for red, for purple, for yellow. And this will give me my four values! So the mean for blue is 87.3. The mean for red is (if we round this, this is) 87.0. The mean for purple is 87.0. And the mean for yellow is 87.0. So it makes sense that our p-value is non significant, because three of those values are actually the same value with rounding. They're a little bit different if you don't round, but they're literally the same value. So it makes sense that we didn't find a difference here.
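As one of the other ways to get the group means, tapply() can compute all four from the full dataset in a single line, without the mini datasets. This sketch uses simulated stand-in data, so the means will not be the 87.3 and 87.0 values above.

```r
# Simulated stand-in data (random values, not the workshop file).
set.seed(8)
Fake_Data <- data.frame(
  Fake_Data1 = rnorm(30, mean = 87, sd = 5),
  Colour     = factor(rep(c("blue", "purple", "yellow", "red"),
                          times = c(9, 8, 7, 6)))
)

# Apply a function to Fake_Data1 separately within each Colour group.
group_means <- tapply(Fake_Data$Fake_Data1, Fake_Data$Colour, mean)
group_sds   <- tapply(Fake_Data$Fake_Data1, Fake_Data$Colour, sd)
group_n     <- tapply(Fake_Data$Fake_Data1, Fake_Data$Colour, length)

# Side-by-side table of the descriptives, rounded for reading.
round(cbind(n = group_n, mean = group_means, sd = group_sds), 1)
```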
The last thing you might want to do here, is you might want to visualize the data. It's a good idea to have a look at your data to be able to say: “Well, why does it look like there's no differences here?”. We can't say there are no differences, but let's visually see and look and say statistically there's no difference, what does the graph show us? Let's have a look. It's not a graphing class today, so we're going to skim over this really quickly, but you can highlight this chunk of code, and what it's going to do is plot the means of each of your groups in a bar plot (because a bar plot is the best option for a one-way ANOVA). And if we click Run…it will spit out a plot down in the bottom-right in our Plots window. It's a little hard to see, so you can click the “Zoom” button to make it bigger, and that will pop open your graph. And, uh, yeah, that kind of makes sense that we didn't find any statistically significant differences, because if we look at our different Colours we have blue, purple, yellow, and red (we didn't put colours on them today, it's not a graphing class, but you could put the colours)…but just for the words, the only difference is if you look really closely, “blue” is a little bit higher than your three other groups. So it kind of makes sense that, statistically, we're not showing much going on here. It's not like blue is all the way down here and the other three groups are all the way up there. So we can't say there are any statistically significant differences, and it makes sense if you look at the graph. And I lied, the very last thing you might want to do; if your p-value for your ANOVA was significant, that would tell you there's a difference SOMEWHERE between your groups. You have four groups, you don't know where the differences are…looking at the graph might help you maybe figure out where those differences are…but the best way to tell where the actual differences are, is to run some sort of follow-up test.
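The plotting chunk from the workshop script is not shown in this transcript, so the following is only a stand-in sketch of a bar plot of group means using base R's barplot(). It is not the workshop's own plotting code, and the data are simulated.

```r
# Simulated stand-in data (random values, not the workshop file).
set.seed(8)
Fake_Data <- data.frame(
  Fake_Data1 = rnorm(30, mean = 87, sd = 5),
  Colour     = factor(rep(c("blue", "purple", "yellow", "red"),
                          times = c(9, 8, 7, 6)))
)

# One mean per Colour group; the names become the bar labels.
group_means <- tapply(Fake_Data$Fake_Data1, Fake_Data$Colour, mean)

# One bar per group. barplot() also returns the x-position of each bar,
# which is handy if you later want to add error bars or labels.
bar_mid <- barplot(group_means,
                   xlab = "Colour",
                   ylab = "Mean of Fake_Data1",
                   ylim = c(0, max(group_means) * 1.1))
```

In RStudio the plot appears in the Plots window in the bottom-right, where the “Zoom” button enlarges it.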
So if you found a significant one-way ANOVA, if your p-value was less than (<) .05, you don't know where the differences are, but you would like to find out! So you can follow it up doing what we call some post-hoc comparisons. There's lots of different kinds of these tests, but today I'm going to show you how to do the Tukey’s HSD. Again, lots of options; this is just one of them. So TukeyHSD(my_anova). “Tukey” is capitalized, so capital T on Tukey. And “HSD” is capitalized (honestly significant difference). If we Run this…it's going to spit out every possible combination of your groups. So it'll do: blue versus purple, and blue versus red, and blue versus yellow, and all of them. So how you read this, is it tells you: “Tukey multiple comparisons of means, 95% family-wise confidence level”. It tells you what model you fit, so we used aov(), and it tells you the formula you used, and what data you used. It says $Colour because we're looking at different Colour groups. And then it gives you the pairings. So purple versus blue, and it gives you an adjusted p-value. Yellow versus blue, it gives you an adjusted p-value. Red versus blue, it gives you an adjusted p-value. Et cetera. It gives you the difference between the groups (which are all very tiny), and it gives you a lower bound and an upper bound (which all cross over zero). And all of our p-values are non-significant, which makes sense because our ANOVA is non-significant, and which matches the graph because we can't say any of these are different. The important thing to note here, is it says “adj”, adjusted; p adjusted. It's an adjusted p-value because we have applied a correction. We won't get into this too much today, but any time you do a test, you have a chance to make a mistake; the more tests you do, the greater your chance of potentially making a mistake. You can apply a correction to make it less likely for you to make a mistake.
Some common corrections: you might have heard of Tukey’s, you might have heard of Bonferroni’s…those are some common corrections that we can apply.
Today, we did the Tukey HSD, which gives you an adjustment for how many tests you were doing. So that's what it means when it says “p adjusted”. And that's everything! We have run the one-way ANOVA from start to finish.
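To round things off, here is a sketch of the post-hoc step on simulated stand-in data: Tukey's HSD on the fitted ANOVA, plus (as an assumption about how you might apply the other correction mentioned) Bonferroni-adjusted pairwise t-tests using base R's pairwise.t.test(). The numbers will not match the workshop output.

```r
# Simulated stand-in data (random values, not the workshop file).
set.seed(8)
Fake_Data <- data.frame(
  Fake_Data1 = rnorm(30, mean = 87, sd = 5),
  Colour     = factor(rep(c("blue", "purple", "yellow", "red"),
                          times = c(9, 8, 7, 6)))
)
my_anova <- aov(Fake_Data1 ~ Colour, data = Fake_Data)

# Every pair of groups: difference, lower/upper bound, adjusted p-value.
tukey <- TukeyHSD(my_anova)
tukey

# Four groups give choose(4, 2) = 6 pairwise comparisons.
nrow(tukey$Colour)

# Just the adjusted p-values, one per pairing.
round(tukey$Colour[, "p adj"], 3)

# A different correction for the same set of comparisons: Bonferroni.
pairwise.t.test(Fake_Data$Fake_Data1, Fake_Data$Colour,
                p.adjust.method = "bonferroni")
```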
Attribution
By Lindsay Plater
Time commitment
Greater than 15 minutes
Description
RStudio Workshop Series: One Way ANOVA shows the process of conducting a one-way ANOVA in the RStudio software (including assumption checks and graphing).
License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.