How Predictive Analytics Is Improving Student Retention in Higher Ed

Loraine Lawson

For years, college and university admissions officers placed their faith in SAT and other test scores as a barometer of who would succeed and who would fail in college.

“Faith” is the right word, it turns out. A 2014 report shows that there isn’t a statistically significant correlation between test scores and success in college.

When it comes to college retention rates, there are a lot of “old wives’ tales,” says Bill Thirsk, vice president of IT/CIO at Marist College in New York. So when it came time to set the IT department’s Constant Improvement goals, Thirsk decided to take a data-driven approach to reducing course drop-out rates.

“It was interesting to us to move the needle away from these touchy feeling things to statistical and analytical activities that can tell us exactly when the student starts to falter,” Thirsk said in a recent interview with IT Business Edge.

It wasn’t that Marist College had a problem with student retention; in fact, he said, the school’s numbers are higher than the national average. Rather, he wanted to apply data to a growing, industry-wide problem. Only 37.9 percent of full-time students at four-year institutions complete a bachelor’s degree within four years, according to the National Center for Education Statistics. The six-year completion rate isn’t much more impressive, with only 59 percent finishing their degree.

“We're really good at data and we're really good at predictive analytics,” Thirsk said. “This was research for us and trying to help education as an industry better meet the student's needs as far as starting a degree and finishing a degree.”

IT partnered with the mathematics department to launch the Open Academic Analytics Initiative (OAAI). The project received a $250,000 grant from EDUCAUSE’s Next Generation Learning Challenges program, which is funded primarily by the Bill and Melinda Gates Foundation. The goal: Use predictive analytics to identify at-risk students as early as possible and steer them back on track.

The project required pulling data from a number of sources, including the school’s ERP system and Sakai, a learning management system used at Marist and other institutions of higher education. Students attend courses on-site, but the syllabus, course materials, discussions and other work take place in Sakai.

Sakai allowed them to monitor “softer” functions such as starting the work early versus procrastinating tasks, participating in discussions and so on, he said.

That proved key, because the data showed that the earliest indicator of academic problems is course engagement. In fact, the OAAI model can predict which students will have problems with 60 to 85 percent accuracy within two to three weeks.
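To make the idea of an engagement-based early warning concrete, here is a minimal sketch in Python. It is not the OAAI model, whose details the article does not describe. It is a simple heuristic that flags students whose early activity falls well below the course median. The field names, the 0.5 threshold and the sample records are all made up for illustration.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Engagement:
    student_id: str
    logins: int       # LMS sessions in the first weeks (hypothetical field)
    forum_posts: int  # discussion posts in the same window (hypothetical field)


def flag_low_engagement(records, ratio=0.5):
    """Return IDs of students whose logins or posts fall below
    `ratio` times the course median (an assumed rule, not OAAI's)."""
    med_logins = median(r.logins for r in records)
    med_posts = median(r.forum_posts for r in records)
    flagged = []
    for r in records:
        if r.logins < ratio * med_logins or r.forum_posts < ratio * med_posts:
            flagged.append(r.student_id)
    return flagged


# Made-up records, for illustration only.
sample = [
    Engagement("s1", logins=20, forum_posts=6),
    Engagement("s2", logins=18, forum_posts=4),
    Engagement("s3", logins=3, forum_posts=0),
    Engagement("s4", logins=22, forum_posts=5),
]

print(flag_low_engagement(sample))  # ['s3']
```

A real predictive model would be trained on historical outcomes and would weigh many more signals. The sketch only shows why activity data from the first few weeks can surface a struggling student before any grade exists.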

To build the system, IT needed a way to pull data from all these sources, run transformations and load the correct data into an analytics platform. The team tested a number of analytics tools, including commercial platforms, but ultimately chose Pentaho’s Business Analytics platform, which includes the open source ETL tool Pentaho Data Integration (formerly known as Kettle).
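The extract, transform and load steps described above can be sketched in a few lines of Python. This is not Pentaho Data Integration and not Marist's pipeline. It is a standard-library illustration of the ETL pattern, with two hypothetical extracts (one standing in for the ERP system, one for the learning management system) joined into a single per-student view. The column names and values are assumptions.

```python
import csv
import io
import sqlite3

# Hypothetical extracts; real ERP and Sakai exports would look different.
ERP_CSV = "student_id,credits\ns1,15\ns2,12\n"
LMS_CSV = "student_id,logins\ns1,20\ns2,3\n"


def extract(text):
    """Parse a CSV export into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(erp_rows, lms_rows):
    """Join both sources on student_id into one row per student."""
    logins = {r["student_id"]: int(r["logins"]) for r in lms_rows}
    return [
        (r["student_id"], int(r["credits"]), logins.get(r["student_id"], 0))
        for r in erp_rows
    ]


def load(rows, conn):
    """Write the combined view into a table the analytics step can query."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS student_view "
        "(student_id TEXT PRIMARY KEY, credits INTEGER, logins INTEGER)"
    )
    conn.executemany("INSERT INTO student_view VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(ERP_CSV), extract(LMS_CSV)), conn)
rows = conn.execute("SELECT * FROM student_view ORDER BY student_id").fetchall()
print(rows)  # [('s1', 15, 20), ('s2', 12, 3)]
```

A dedicated ETL tool adds scheduling, error handling and connectors for many source systems, but the underlying flow is the same three steps.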

The college planned to make the model freely available under a Creative Commons license, so Thirsk wanted an open source tool that colleges could use without buying a proprietary solution. He was also impressed by Pentaho’s performance against its commercial competitors, which he said were overly complicated to use.

“A lot of CIOs are very wary of open source. I've never been able to figure that out because I've been very successful in avoiding costs by using great open source tools,” he said. “This would have taken a year longer if I'd used a commercial product, I'm convinced of that.”

Both parts, the ETL tool and the platform’s ability to help weigh the value of the data sources, are critical for predictive analytics, he added.

“The neat thing about Pentaho, the piece that we love the most, is I can take data from different sources, transform them into a singular view of what's going on, then load them into our analytics models and come back with really good predictive models,” he said. “So it's a very high quality product and we're going to continue to use it.”

Loraine Lawson is a veteran technology reporter and blogger. She currently writes the Integration blog for IT Business Edge, which covers all aspects of integration technology, including data governance and best practices. She has also covered IT/Business Alignment and IT Security for IT Business Edge. Before becoming a freelance writer, Lawson worked at TechRepublic as a site editor and writer, covering mobile, IT management, IT security and other technology trends. Previously, she was a webmaster at the Kentucky Transportation Cabinet and a newspaper journalist. Follow Lawson at Google+ and on Twitter.
