It’s been a minute! I’m not sure how this will last, but I’m hoping to put together semi-regular blog posts again. Ideally, someone out there will find *something* I have to say useful! As always, what I’m sharing with you today is based on work I’ve done at my own institution–but the visualizations in this post have been created using fabricated data.
Today’s topic? Visualizing achievement gaps in students. I plan to break this topic into two parts. In part one, I’ll be building a scatterplot looking at failure rates on a course-by-course basis. This scatterplot will then be used as a filter for part two; part two will lay out differences in failure rate by gender.
The final product is a dashboard with the scatterplot at the top. Each dot represents a course; the further to the right the dot falls, the higher the D, F, or W (withdraw) rate for that course. The higher it is place on the chart, the larger the enrollment for the course. The size and color of each dot represents the raw number of students who failed that course.
The bottom of the dashboard shows the achievement gap between men and women on a course-by-course basis. For the purpose of this visualization, we are defining “achievement gap” as the difference in pass rates by gender. Here is a look at the end result:
I built the database for this project out of our demographic and enrollment census files. The demographic files are unique at the level of [Student x Term], so each student is only included once per term. The enrollment files are unique at the level of [Student x Term x Course], so each individual student may have multiple rows in a given term, assuming they took more than one course. I included fields indicating Term, Course Department, College Name, Course Level (Grad, Undergrad), Course Subject (e.g., PSY, PHL), Catalog Number, Course Description, and Course Grade and Gender. I also included a field we call “Component” which indicates the *type* of course (e.g., Lab, Lecture, Fieldwork, etc). It’s kinda’ tiny, but here’s what the data looks like:
Now the fun part. I’m making the assumption that you, dear reader, have some basic Tableau knowledge so I’m not going to step through the initial setup for the Tableau workbook; except to say that I fired up Tableau and connected to the data source. My first order of business was to turn the Course Grade field into a binary. I coded all “failing” grades as 1 and all passing grades as 0. This way, when you average the binary you get a percentage of failing grades. As a point of clarification: for this project, we are considering grades of F, D, W [withdraw], and U [unsatisfactory] to be failing). This new field was named “Fail.”
Once I recoded the grades I started building out the first chart. I placed the “Fail” field in the columns shelf and changed the measure type from “Sum” to “Average.” Next, I placed the sum of the “Number of Records” field in the rows shelf”
I then placed the “Course Description” field into the marks shelf as a “detail” and selected the mark type of “Circle” from the dropdown menu. This converts the chart to something that much more clearly resembles a scatterplot:
Finally, I added the sum of the “Fail” field to the marks shelf as two attributes–Color and Size–and changed the color palette to *reversed* “red-green diverging.” This allows the end user to better differentiate the higher-fail courses from the lower-fail courses:
At this point, I had a functional visualization so the rest is mostly cosmetic. I added a more descriptive title, labeled the axes, converted the horizontal to display percentages, and added a few filters to give end users a little more ability to drill into the information based on Term, College Name, and Department. I also added a sliding filter to allow users the ability to filter out courses by the total number of students enrolled in a given class. To do this, I calculated enrollment at the level of Term and Course Description using the “fixed” formula:
I also made some edits to the tooltip to provide users with additional information when they rollover circles on the chart. Here is how that looked:
The end result for part one of this visualization is a worksheet that looks like this: