Data Analytics

Project 2: Data Analysis for Distance Learning Courses

This project is part of the University-wide Experimental Teaching and Learning Analytics group organized by Northwestern Information Technology.


The data used to derive these results is hit data from our Canvas LMS. Each row represents one click, that is, one action by a user. When a user clicks in Canvas, several data points are collected: the user’s identity, the time of the click, the resource clicked, and the geographical location from which the click appears to originate. This information is collected for all users of Canvas courses, including students, professors, and DL staff.
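
To make the row structure concrete, here is a minimal sketch of a single hit as a record. The field names and the sample values are assumptions chosen for illustration; they are not the actual Canvas export schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Hit:
    """One click in the LMS. All field names here are hypothetical."""
    user_id: str         # the user's identity
    timestamp: datetime  # the time of the click (assumed to be stored in UTC)
    resource: str        # the resource clicked
    country: str         # where the click appears to originate
    region: str          # state or province, when available


# A made-up example row, not taken from the real data set.
example = Hit(
    user_id="u-001",
    timestamp=datetime(2015, 10, 15, 15, 30, tzinfo=timezone.utc),
    resource="discussion_topics/42",
    country="US",
    region="Illinois",
)
print(example)
```

Every analysis described here reduces to grouping and counting records of this kind.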

This kind of data can tell us how students interact with the course, as individuals and in aggregate. Clickstream data analysis has been a driving force on the commercial Web for years, but we are at the beginning stages of learning how, and whether, this information can be useful in supporting and enhancing learning at the post-secondary level.

This project focused on three main areas:

  • Geographical data: where are users?
  • Device data: what devices (phones, tablets, laptops) do students use, and how?
  • Temporal data: when do students do their work with DL courses?

Each of these areas is covered here. The results are preliminary; the data still has much more to tell us.

The Dataset

Findings listed here are based on MSGH courses offered in Fall 2015:

  • MSGH 405 — Shannon Galvin
  • MSGH 408 — Chad Achenbach
  • MSGH 417 — Michael Diamond
  • MSGH 427 — Sharon DeJoy
  • MSGH 452 — Suzen Moeller

These courses were chosen in part because this small set represents a complete program.  We also knew beforehand that these courses included non-US students.

The data set includes both instructor and student hits. A manual process was used to identify instructors in the data set. While most of the graphs and results below include only student data, some also include instructor data. These are noted in the commentary.

The data set we used included about 1,100,000 rows. While the results, in general, cover Fall quarter 2015 (September 21 – December 12), our data covers a wider date range than the quarter itself. For students, the first hit in the set came on August 31, 2015 and the last on January 13, 2016. For instructors and staff, the first hit came on July 27, 2015 and the last on January 13, 2016.
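
The date ranges above can be derived with a single pass over the hits. The sketch below assumes each hit has already been reduced to a (role, day) pair; the role labels and the sample rows are hypothetical, and only the first and last dates per role are taken from the report.

```python
from datetime import date

# Hypothetical (role, day) pairs, one per hit. The rows are made up; only the
# earliest and latest dates per role match the figures quoted in the text.
hits = [
    ("student", date(2015, 10, 5)),
    ("instructor", date(2015, 7, 27)),
    ("student", date(2015, 8, 31)),
    ("student", date(2016, 1, 13)),
    ("instructor", date(2016, 1, 13)),
]


def date_range_by_role(hits):
    """Return {role: (first_hit_day, last_hit_day)}."""
    ranges = {}
    for role, day in hits:
        first, last = ranges.get(role, (day, day))
        ranges[role] = (min(first, day), max(last, day))
    return ranges


ranges = date_range_by_role(hits)
for role, (first, last) in ranges.items():
    print(role, first.isoformat(), last.isoformat())
```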

Finally, it is important to note that these results came from a pilot project. Actionable data should be produced via a formalized process with a system of checks.

Geographical Data

Geographical Data — Goals

  • Visualize the geographical distribution of Distance Learning students


Geographical Distribution

Students access MSGH courses from all over the world. The darker the color of the country, the more hits came from that location. In this graph, the preponderance of United States participation is easy to see. The MSGH courses got 451,054 hits from the US. The second-highest number, 10,132, came from Germany.

These results include only student data.

Students were also distributed across the United States.

US Distance Learning Students

It may be useful to view a map without students from Illinois, because the range of non-Illinois participation appears a little more clearly:

DL Students Without Illinois

The raw data for the top 10 states shows that MSGH students are distributed widely within the US:


State              Hits
Illinois        119,945
California       49,954
Massachusetts    36,109
Minnesota        22,671
New York         22,022
Kentucky         18,887
Ohio             17,346
Texas            15,574
North Carolina   15,026
Indiana          13,311
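
A ranking like the one above can be produced by counting hits per state. This is a minimal sketch, assuming each US student hit has already been resolved to a state name; the sample list is hypothetical and far smaller than the real data.

```python
from collections import Counter

# Hypothetical sample: one state name per US student hit.
hit_states = ["Illinois", "California", "Illinois", "Ohio", "California", "Illinois"]


def top_states(states, n=10):
    """Return the n states with the most hits as (state, hits) pairs."""
    return Counter(states).most_common(n)


for state, hits in top_states(hit_states, n=2):
    print(f"{state:<15}{hits:>8,}")
```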


MSGH courses draw from a wide geographical range. This distribution, across the country and the globe, means that DL instructors must be aware of time zone differences in planning their courses. This is obviously the case for any synchronous online activity, but rhythms of comments and posting will also be affected by the geographical distribution of online students.

Student Mobility

There are 92 students in the MSGH courses we surveyed. Of these, 18 — almost 20% — accessed the courses from more than one country. The most mobile student is enrolled in four different MSGH courses — 417, 405, 408 and 452 — and had hits from 9 different countries during the fall quarter.

Number of Countries   Number of Students
9                      1
6                      1
5                      2
4                      1
3                      4
2                      9
1                     61
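
A table of this shape can be built by collecting the distinct countries seen for each student and then counting how many students fall into each bucket. The sketch below assumes each hit is reduced to a (student_id, country) pair; the identifiers and sample rows are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical (student_id, country) pairs, one per hit.
hits = [
    ("s1", "US"), ("s1", "US"), ("s1", "DE"),
    ("s2", "US"),
    ("s3", "KE"), ("s3", "US"), ("s3", "IN"),
    ("s4", "US"), ("s4", "US"),
]


def mobility_distribution(hits):
    """Map number of countries -> number of students seen in that many countries."""
    countries_by_student = defaultdict(set)
    for student, country in hits:
        countries_by_student[student].add(country)
    return Counter(len(countries) for countries in countries_by_student.values())


distribution = mobility_distribution(hits)
for n_countries in sorted(distribution, reverse=True):
    print(n_countries, distribution[n_countries])
```

Using a set per student means repeated hits from the same country are counted once, which is what the table needs.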

Geographical Data — Next steps

  • Gather feedback from faculty and ID stakeholders on geographical distribution statistics
  • Examine differences in the distribution of students within courses
  • Examine differences in student behavior. For example, does the location of students affect the way they participate in discussions?

Mobile Device Data

Mobile Devices — Goals

  • Identify patterns in mobile and desktop usage by students and instructors

Mobile Devices — Findings

In the Fall 2015 MSGH courses, we found the following patterns in mobile usage:

  • Mobile devices represented about 4% of usage overall
  • Mobile usage varied significantly by course
  • Instructors were less likely than students to use a mobile device
  • Students who used mobile devices viewed different parts of the course than did students who used desktops

Overall Mobile Usage

Overall, about 4% of course accesses by both students and instructors came from mobile devices.

These results include both student and instructor data.  Some users appear in the raw click data but not in the user role data; these users are not included in the totals below, to maintain consistency with the other queries in the set.

However, mobile usage varies significantly course by course. This graph shows the percentage of mobile device usage across our data set. This data includes both student and instructor clicks.
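
The per-course percentage can be computed as mobile hits divided by all hits for that course. This sketch assumes each hit already carries a mobile/desktop flag; how that flag is derived from the raw click data is not covered here, and the sample rows are hypothetical.

```python
from collections import defaultdict

# Hypothetical (course, is_mobile) pairs, one per hit.
hits = [
    ("MSGH 405", False), ("MSGH 405", True), ("MSGH 405", False), ("MSGH 405", False),
    ("MSGH 408", False), ("MSGH 408", False),
]


def mobile_share_by_course(hits):
    """Return {course: percentage of that course's hits that came from mobile devices}."""
    totals = defaultdict(int)
    mobile = defaultdict(int)
    for course, is_mobile in hits:
        totals[course] += 1
        if is_mobile:
            mobile[course] += 1
    return {course: 100.0 * mobile[course] / totals[course] for course in totals}


shares = mobile_share_by_course(hits)
for course, pct in shares.items():
    print(f"{course}: {pct:.1f}% mobile")
```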

A logical next step would be to determine why some courses seem to attract more mobile usage than others.

Mobile Usage: Instructors v. Students

Instructors were less likely to use a mobile device to access the MSGH courses than were students.

These results include both student and instructor data.


These results imply that instructors rarely see their course sites on a mobile device; in raw numbers, the MSGH courses saw almost 25,000 hits from students using mobile devices, but only 238 from instructors. Going forward, it might be worth encouraging instructors to check their course sites with a mobile device, especially if mobile usage grows over time, to ensure that course materials are easy for mobile users to view.

Mobile Usage: Resource Types

Students use mobile and desktop devices to access different types of resources.

These results include only student data. The bars show the percentage of device hits for each resource category. Resource categories were determined by a manual process of grouping Canvas content categories into meaningful groups.  “Content” includes files, folders and pages.

Discussions and modules are the top destinations for mobile student users and student users overall, but mobile student users are more likely to view assignments, less likely to view content, and much less likely to view quizzes.  Knowing which resources students access with mobile devices can help prioritize design work.
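
The comparison above depends on two steps: mapping raw content types to resource categories, and then computing, for each device type, the share of its hits that went to each category. The sketch below is one way to do that. Only the grouping of files, folders and pages into "Content" comes from the report; the raw type names and the rest of the mapping are assumptions, as are the sample rows.

```python
from collections import Counter, defaultdict

# Hypothetical mapping from raw content types to report categories.
# Only the "Content" grouping (files, folders, pages) is stated in the report.
CATEGORY = {
    "files": "Content",
    "folders": "Content",
    "pages": "Content",
    "discussion_topics": "Discussions",
    "modules": "Modules",
    "assignments": "Assignments",
    "quizzes": "Quizzes",
}

# Hypothetical (device, content_type) pairs, one per student hit.
hits = [
    ("mobile", "discussion_topics"), ("mobile", "assignments"),
    ("desktop", "files"), ("desktop", "pages"),
    ("desktop", "quizzes"), ("desktop", "discussion_topics"),
]


def category_share_by_device(hits):
    """Return {device: {category: percentage of that device's hits}}."""
    counts = defaultdict(Counter)
    for device, content_type in hits:
        counts[device][CATEGORY.get(content_type, "Other")] += 1
    return {
        device: {cat: 100.0 * n / sum(per_cat.values()) for cat, n in per_cat.items()}
        for device, per_cat in counts.items()
    }


category_shares = category_share_by_device(hits)
for device, per_cat in category_shares.items():
    print(device, per_cat)
```

Normalizing within each device type is what makes the two groups comparable despite the much smaller number of mobile hits.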

Mobile Devices — Next steps

  • Gather feedback from faculty and ID stakeholders on mobile statistics
  • Explore data around access times to see if there are differences in device/time-of-day access patterns


Temporal Data

Temporal Data–Goals

  • Visualize the temporal distribution of LMS hits: when do students do their work?

Temporal Data — Findings

Temporal Distribution

As we can see from our analysis of geographical distribution, MSGH students are active from all over the globe. Overall, however, most of the traffic in MSGH courses takes place between mid-morning and late evening, Chicago time. These results include only student data.

Time Distribution, Student Access

This graph shows hits from students for each hour of the day during the fall as seen from the Central Time Zone, taking Daylight Saving Time into account. In other words, we see the temporal distribution of student hits from the point of view of a teacher located in Chicago.
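
An hourly distribution of this kind can be sketched as follows, assuming hit timestamps are stored in UTC (an assumption about the data, not something the report states). Converting through the `America/Chicago` zone rather than applying a fixed offset handles the switch from daylight time to standard time, which in 2015 fell on November 1, inside the fall quarter. The sample timestamps are hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library in Python 3.9+

CHICAGO = ZoneInfo("America/Chicago")

# Hypothetical hit timestamps, assumed to be stored in UTC.
hits_utc = [
    datetime(2015, 10, 15, 15, 30, tzinfo=timezone.utc),  # daylight time in Chicago (UTC-5)
    datetime(2015, 12, 1, 15, 30, tzinfo=timezone.utc),   # standard time in Chicago (UTC-6)
    datetime(2015, 12, 1, 3, 0, tzinfo=timezone.utc),     # previous evening in Chicago
]


def hits_by_chicago_hour(timestamps):
    """Count hits per hour of day (0-23) as seen on a Chicago clock."""
    return Counter(ts.astimezone(CHICAGO).hour for ts in timestamps)


hourly = hits_by_chicago_hour(hits_utc)
for hour in range(24):
    print(f"{hour:02d}:00  {hourly[hour]}")
```

Note that the first two sample hits share the same UTC time of day but land in different Chicago hours, which is the effect a fixed offset would miss.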

The relatively even distribution of student access times throughout the day and evening, including the workday, is interesting. Although many MSGH students are employed, they do not appear to be doing all their work in the evening. Rather, they are finding time during the course of the day for their coursework, as well as in the evening. There is a drop-off from midnight to early morning Chicago time, but hits are still coming in; much of this traffic may come from outside the US.

Temporal Data — Next steps

  • Break student access down into day of week
  • Look at US and non-US access times
  • Look at the time distribution based on students’ local times
  • Look at time variance for individual students. Do individuals tend to work at the same time every day or does it change?
  • Look at changes in access frequency and behavior in light of the course syllabus, especially when assignments or assessments are due
  • Look at how many students are enrolled in multiple MSGH courses simultaneously. Do students enrolled in multiple courses tend to bunch their work (do they log in and work in all courses during a single session) or do they distribute it?