Video
Transcript
OpenRefine video 3. In this video, we’ll walk through several common data-cleaning steps in OpenRefine. You’ll learn how to rename columns, adjust letter case, and see how OpenRefine tracks your changes so you can safely experiment with your data.
These tasks are part of many data-cleaning workflows, which often include things like removing duplicates, running find-and-replace, cleaning up spacing or invisible characters, and converting data types.
Let’s start with renaming columns. Right now, the order in which gender is listed in our Adult and Children columns is inconsistent. In the Adult columns, gender comes first—such as Male Adults. In the Children columns, it comes second—such as Children Male.
To make these column headings consistent, we’ll begin with Male Adults. Click the arrow within the Male Adults column header. From the menu that appears, select Edit column, then choose Rename this column. A small window will open. In the text field, replace the existing name with Adults Male. When you’re finished, click OK to confirm the change.
Now repeat the same steps for Female Adults: click the arrow within the Female Adults column header, choose Edit column, then Rename this column, and change the name to Adults Female. Click OK to apply the update.
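For readers who like to see the logic written out, the two renames above can be sketched in plain Python. This is only an illustration of the idea, not how OpenRefine works internally. The header list is an assumption (the transcript names only some of the columns), and `rename_columns` is a hypothetical helper.

```python
def rename_columns(headers, renames):
    """Return a new header list with each old name replaced by its new name."""
    missing = [old for old in renames if old not in headers]
    if missing:
        # Fail loudly rather than silently skipping a misspelled column name.
        raise KeyError(f"columns not found: {missing}")
    return [renames.get(name, name) for name in headers]


# Assumed header row for illustration; the real dataset may have other columns.
headers = ["Ship", "Male Adults", "Female Adults", "Children Male", "Children Female"]

# The two renames described in the transcript: gender now comes second everywhere.
renames = {"Male Adults": "Adults Male", "Female Adults": "Adults Female"}

new_headers = rename_columns(headers, renames)
print(new_headers)
```

The helper builds a new list instead of editing `headers` in place, so the original header row is still available for comparison.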
Another common data-cleaning task is adjusting the letter case of text. In our dataset, we’ve decided that we want all ship names to appear in uppercase.
To do this, click the arrow within the Ship column header.
In the menu that appears, hover over Edit cells.
A secondary menu will appear. From there, hover over Common transformations.
This opens a third menu listing several transformation options.
You’ll see three choices for changing letter case—uppercase, lowercase, and titlecase.
Below these are options for converting data into different formats, such as numbers, dates, or plain text, which we’ll explore later.
For now, select To uppercase.
The table will update immediately, showing all ship names in uppercase letters.
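As a rough equivalent outside OpenRefine, the uppercase step can be sketched in plain Python. The ship names below are made up for illustration and `uppercase_column` is a hypothetical helper. Inside OpenRefine, the To uppercase menu item corresponds to the GREL expression `value.toUppercase()`.

```python
def uppercase_column(rows, column):
    """Return new rows with the text values in `column` converted to uppercase."""
    result = []
    for row in rows:
        new_row = dict(row)  # copy, so the input rows stay untouched
        value = new_row.get(column)
        if isinstance(value, str):  # leave blanks and numbers as they are
            new_row[column] = value.upper()
        result.append(new_row)
    return result


# Hypothetical sample rows; these are not values from the course dataset.
rows = [
    {"Ship": "Sea Lark", "Adults Male": 12},
    {"Ship": "north star", "Adults Male": 7},
    {"Ship": None, "Adults Male": 3},
]

cleaned = uppercase_column(rows, "Ship")
print([row["Ship"] for row in cleaned])
```

Only text values are changed; the empty cell in the third row is passed through unchanged rather than raising an error.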
As covered in the previous video, it is important to remember that changes in OpenRefine are not permanent. The edits you make do not alter your original file, so your raw data remains unchanged.
All changes can be reversed at any time using the Undo / Redo tab on the left side of the workspace. You can click any entry in the history to jump back to that point.
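The idea behind a step-by-step history can be sketched as a list of snapshots with a pointer to the current one. This is a minimal sketch of a common undo design, assumed for illustration only; it is not a description of OpenRefine's internals, and the `History` class and its method names are hypothetical.

```python
class History:
    """Keep a snapshot of the data after every step so any step can be revisited."""

    def __init__(self, initial):
        self._states = [("Create project", initial)]
        self._position = 0

    @property
    def current(self):
        return self._states[self._position][1]

    def apply(self, description, transform):
        # Assumption: a new change made after undoing discards the undone steps.
        del self._states[self._position + 1:]
        self._states.append((description, transform(self.current)))
        self._position += 1

    def jump_to(self, index):
        """Move to the state recorded at `index` (0 is the untouched data)."""
        if not 0 <= index < len(self._states):
            raise IndexError(f"no history entry {index}")
        self._position = index


def rename(old, new):
    # Build a step that returns a new header list and never edits its input.
    return lambda headers: [new if name == old else name for name in headers]


original = ["Ship", "Male Adults", "Female Adults"]
history = History(original)
history.apply("Rename Male Adults", rename("Male Adults", "Adults Male"))
history.apply("Rename Female Adults", rename("Female Adults", "Adults Female"))

history.jump_to(1)  # step back to just after the first rename
print(history.current)
```

Because every step produces a new list, jumping back to entry 0 always returns the data exactly as it was loaded.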
In this video, we focused on some fundamental data-cleaning steps in OpenRefine, including renaming columns and changing the case of text.
When you’re ready, move on to the next video, where we’ll explore additional cleaning techniques, like removing duplicates, trimming extra spaces, and converting data types to make your dataset even more consistent and ready for analysis.
Time commitment
2 - 5 minutes
License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.