Time commitment
2 - 5 minutes
Description
The purpose of this video is to explain the different types of data, including categorical and continuous data, and how they relate to statistical analysis. It also demonstrates how to set data types correctly in SPSS to ensure proper analysis.
Video
Transcript
We're going to start with data types.
So data types are super important. You might have been taught this if you've ever taken any kind of intro stats class before. It's potentially a little boring, you're like, “Okay, why is anyone actually going to care about this?”, but the reason we care about this so much is that the kind of statistic you're trying to run is determined by what kind of data you have. So this slide here is all text; my next slide is more of a visual, if you're more of a visual learner and want to see it as a flow chart. We have two main kinds of data: we've got categorical data and we've got continuous data.
We talked a little bit about this last week, but categorical data is essentially those buckets: you've got different buckets of information. There are two kinds of categorical data we normally talk about. We've got nominal data, which is a name with no order. This could be something like gender: male, female, nonbinary. There's no inherent order; female is not better than male, male is not better than female. We also have ordinal data, which is a name with an order. If you've worked with Likert data before, “strongly disagree” to “strongly agree”, those go in order 1, 2, 3, 4, 5, for example. This could also be the order in which you finished a race: 1st place means you came in first, 2nd place means you came in second. They are different buckets, but there's an inherent order there, which is a little bit different.
For continuous data, we have interval data. This is data that has meaningful distances between numbers, but there is no true zero; that is, zero does not mean that there's an absence of the measure. So if I say it is 0°C, it doesn't mean that there's no temperature, it means that the temperature is actually 0 degrees. For continuous data we also have ratio data. This kind of data has meaningful distances between the numbers, and there is a true zero; the data can take any possible value or fraction. If I said, for example, “0 height”, you literally have no height. If I had “0 weight”, you literally have no weight. If your reaction time was “0 milliseconds”, you did not have a reaction time; it was 0 seconds. Zero actually means something in this case.
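The four levels described above can be written down as a small Python sketch. This is an illustration only and was not part of the workshop: the names `Level`, `is_categorical`, `has_order`, and `has_true_zero` are hypothetical, and the three middle Likert labels are assumed, since the transcript names only the two endpoints.

```python
from enum import Enum


class Level(Enum):
    """The four measurement levels described in the transcript."""
    NOMINAL = "nominal"    # names with no order (e.g. favourite colour)
    ORDINAL = "ordinal"    # names with an order (e.g. Likert responses)
    INTERVAL = "interval"  # meaningful distances, no true zero (e.g. degrees Celsius)
    RATIO = "ratio"        # meaningful distances and a true zero (e.g. height)


def is_categorical(level: Level) -> bool:
    """Nominal and ordinal data are the two categorical kinds."""
    return level in (Level.NOMINAL, Level.ORDINAL)


def has_order(level: Level) -> bool:
    """Everything except nominal data has an inherent order."""
    return level is not Level.NOMINAL


def has_true_zero(level: Level) -> bool:
    """Only ratio data has a zero that means 'none of the measure'."""
    return level is Level.RATIO


# Ordinal example: list position stands in for the 1-5 coding.
# The middle three labels are assumed for illustration.
LIKERT = ["strongly disagree", "disagree", "neutral", "agree", "strongly agree"]
likert_code = {label: position + 1 for position, label in enumerate(LIKERT)}

print(is_categorical(Level.ORDINAL))   # ordinal data is categorical
print(has_true_zero(Level.INTERVAL))   # 0 degrees Celsius is not "no temperature"
print(likert_code["strongly agree"] > likert_code["strongly disagree"])
```

The point of the sketch is that each level adds one property to the one before it: order, then meaningful distances, then a true zero.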
I said I've got a graphic as well, if that's easier for folks.
[Slide shows the information presented in a table with five rows and five columns.]
So this is in the slides for you if you ever want to come back to it. When we talk about data types, a lot of folks shorten this to “NOIR”, n-o-i-r, for the four different kinds of data. The first two are [categorical], the second two are continuous, so this is the same information I told you in the previous slide, but now you can see it as a pretty graphic. The reason I bring this up, again as a reminder: the kind of data you have determines which statistic is appropriate, and we can tell SPSS, the software we're learning today, what kind of data we have for each variable we've got.
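As one small illustration of how the data type determines the appropriate statistic, here is a Python sketch that picks a measure of central tendency by level. The pairing used here (mode for nominal, median for ordinal, mean for interval and ratio) is a common textbook convention and is an assumption; the transcript does not list specific statistics. The function name `typical_value` and the sample values are hypothetical.

```python
import statistics


def typical_value(values, level):
    """Return a central-tendency summary suited to the measurement level.

    Assumed convention: mode for nominal, median for ordinal,
    mean for interval and ratio data.
    """
    if level == "nominal":
        return statistics.mode(values)        # most common category
    if level == "ordinal":
        return statistics.median_low(values)  # middle value that is an actual code
    if level in ("interval", "ratio"):
        return statistics.mean(values)        # arithmetic average
    raise ValueError(f"unknown measurement level: {level!r}")


colours = ["blue", "green", "blue", "purple"]  # nominal
likert = [1, 2, 2, 4, 5]                       # ordinal, coded 1-5
heights_cm = [160.0, 170.0, 180.0]             # ratio

print(typical_value(colours, "nominal"))
print(typical_value(likert, "ordinal"))
print(typical_value(heights_cm, "ratio"))
```

Averaging the colour names would make no sense, which is the same kind of mistake a wrong data type can invite in a statistics package.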
[Slide shows a screenshot of SPSS.]
So we can see this here. This is variable view; last week we talked about the difference between data view and variable view. If you click variable view in the bottom left-hand corner of SPSS, it includes your additional metadata. We can see what kind of data we have in the column that's labeled “Measure”. SPSS distinguishes between categorical data (it's got “Nominal” and “Ordinal”) and continuous data, which in SPSS is listed as “Scale”. So if you've got any kind of continuous data, you say it's scale data. If it's a category, you have to work out whether it's nominal (i.e., just separate buckets: for favourite colour, blue is not better than purple is not better than green, so the order doesn't matter) or ordinal.
I heard a ding but I don't see anyone joining the room, so we'll keep going.
So it is super important within SPSS that you set your measure properly; your measure is going to tell SPSS what kind of data is in each of your columns in data view. If you set this up properly, you should be able to do all the statistics you're trying to do. If you set this up incorrectly, SPSS might let you do something you're not supposed to do, or it might not let you do something that you should be able to do because it thinks you have the wrong data type.
This fake data set, you know it's set up properly because I've done it for you. If you're getting data from somewhere like Statistics Canada or you've collected your own research data and you're putting it into SPSS for the first time, you just want to double check that the “Measure” is set properly before you do any analyses.
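The double check described above can be sketched in Python. This is only an illustration under stated assumptions: it does not read an SPSS file, the two dictionaries are typed in by hand from what you would see in the “Measure” column of variable view, and the variable names and the helper `find_mismatches` are hypothetical. The mapping itself follows the transcript: nominal and ordinal keep their names, and both kinds of continuous data are listed as “Scale”.

```python
# Measurement level -> label shown in the SPSS "Measure" column,
# as described in the transcript.
SPSS_MEASURE = {
    "nominal": "Nominal",
    "ordinal": "Ordinal",
    "interval": "Scale",
    "ratio": "Scale",
}


def find_mismatches(intended_levels, current_measures):
    """Return {variable: (expected, found)} where the Measure looks wrong."""
    mismatches = {}
    for variable, level in intended_levels.items():
        expected = SPSS_MEASURE[level]
        found = current_measures.get(variable)  # None if the variable is missing
        if found != expected:
            mismatches[variable] = (expected, found)
    return mismatches


# Hypothetical variables: what each one really is...
intended = {
    "favourite_colour": "nominal",
    "satisfaction": "ordinal",
    "temperature_c": "interval",
}
# ...and what the Measure column currently says (typed in by hand).
current = {
    "favourite_colour": "Nominal",
    "satisfaction": "Scale",
    "temperature_c": "Scale",
}

print(find_mismatches(intended, current))
```

Here the Likert-style `satisfaction` variable is flagged because it is set to “Scale” when it should be “Ordinal”.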
[Questions? Contact us. UG Library. Website: lib.uoguelph.ca. Email: library@uoguelph.ca.]
License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.