The world is creating mountains of data at unprecedented rates, with Forbes estimating that by 2020, 1.7 megabytes of new information will be created each second for every human on the planet. But what becomes of all that data? At the moment, just one half of one percent is ever used or analyzed. That gap represents a huge opportunity for organizations looking to improve and grow.

Predictive analytics is the practice of applying statistical models to a wide variety of data to identify trends and opportunities. When coupled with open source technologies capable of processing large and diverse datasets, predictive analytics is now well within the reach of organizations eager to leverage and monetize their data. Even organizations that have just started their predictive analytics journeys see immense potential.

Early Adopters Lay Groundwork; Begin Realizing “Wins”

Companies like Panera Bread, the St. Louis-headquartered chain of fast-casual bakery-cafes, have been laying the groundwork for predictive analytics, using the open source Hadoop ecosystem to process large, diverse datasets, coupled with a schema-on-read analysis strategy (storing data in its raw form and applying structure only when it is queried). The combination, along with rapidly emerging data analysis technologies, makes predictive analytics faster, less expensive and more accurate than ever.
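To make the schema-on-read idea concrete, here is a minimal sketch in Python. It is an illustration only, not Panera's implementation: the store codes, field names and records are hypothetical, and a real Hadoop deployment would use tools from that ecosystem rather than an in-memory list. The point it shows is that the raw records are stored untouched, and each analysis applies its own schema when it reads them.

```python
import json

# Hypothetical raw records, kept exactly as they arrived (no schema enforced on write).
RAW_LINES = [
    '{"store": "STL-01", "item": "bagel", "qty": "3", "ts": "2016-05-01T08:15:00"}',
    '{"store": "STL-02", "item": "soup", "qty": 1}',
    '{"store": "STL-01", "item": "bagel", "qty": "2", "loyalty_id": "A17"}',
]

def read_with_schema(lines, schema):
    """Apply a schema at read time: select fields, coerce types, default missing ones to None."""
    for line in lines:
        record = json.loads(line)
        yield {
            field: cast(record[field]) if field in record else None
            for field, cast in schema.items()
        }

# Two different questions, two different schemas over the same raw data.
sales_schema = {"store": str, "item": str, "qty": int}
loyalty_schema = {"store": str, "loyalty_id": str}

sales = list(read_with_schema(RAW_LINES, sales_schema))
loyalty = list(read_with_schema(RAW_LINES, loyalty_schema))

# The sales view coerces "qty" to an integer even where it was stored as text.
total_bagels = sum(row["qty"] for row in sales if row["item"] == "bagel")
print(total_bagels)
```

Because nothing is discarded or reshaped when the data is written, a new question later only needs a new schema, not a reload of the data.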

“We are getting all those pieces in place so we can be agile enough to provide both heavy-duty batch processing of large datasets and handle real-time requests for predictive analytics use cases. It’s a whole different answer now,” said Jim Foppe of Panera Bread, speaking at a TDK Technologies roundtable discussion on Data Solutions. “Our CIO has a deep background in analytics and data, so I was able to leverage a question his leadership was asking about infrastructure and get some answers that were reliable and more accurate.”

Predictive analytics offers organizations new tools for analyzing markets, making capital investment decisions and reducing exposure to risk. Key executives at Graybar, the St. Louis-based industrial and electrical supplier, backed the open source approach and were rewarded with some early success stories.

“There’s a little bit of ‘Field of Dreams’ with this. It’s a vision of what we’ll do with predictive around the opportunity to monetize data. We have a hundred thousand SKUs and hundreds of thousands of customers,” said Dan Sherman of Graybar. “For us, it was building the platform and showing a couple of wins that were never available before. It’s not a huge bet the executives are making with us to support this endeavor.”

Degrees of Confidence Help Build Support

Sherman told the roundtable attendees it is important to build support across the organization for what predictive analytics can do to drive decision-making. For example, if the analysis could accurately predict the commercial construction market for a certain geography over a specific period of time, what action would the company take with that information? Sherman said companies must avoid tendencies to use data to validate preconceived conclusions.

“We need the skills to grow the practice of predictive analytics while also building awareness around the operational team. That is not a small undertaking,” Sherman said. “You have to help your audience understand that predictive is about a level of confidence, not that it will show a 100 percent outcome. In other words, to what degree of confidence does that model show you can get the result that it’s showing?”
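Sherman's point about confidence can be shown with a small Python sketch. The numbers below are hypothetical, and the method is a deliberately simple stand-in (a normal approximation around the historical average) that ignores trend and seasonality; it is not a description of Graybar's models. What it illustrates is that a forecast is reported as a range tied to a confidence level, and that asking for more confidence widens the range.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical monthly order counts for one product line (illustrative numbers only).
history = [120, 135, 128, 142, 131, 138, 125, 140]

def forecast_with_interval(values, confidence=0.95):
    """Return (point_forecast, low, high) under a simple normal assumption."""
    center = mean(values)
    spread = stdev(values)
    # z-score that covers the requested share of a normal distribution
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return center, center - z * spread, center + z * spread

point, low, high = forecast_with_interval(history, confidence=0.95)
print(f"Forecast: {point:.0f} orders, 95% range roughly {low:.0f} to {high:.0f}")

# A lower confidence level gives a narrower, less cautious range.
_, low80, high80 = forecast_with_interval(history, confidence=0.80)
print(f"80% range roughly {low80:.0f} to {high80:.0f}")
```

Presenting the range alongside the point forecast helps an operational audience see that the model is stating how sure it is, not promising an outcome.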

The St. Louis Federal Reserve Bank is building an advanced analytics team to study government payments and to vet certain types of payments using predictive models. It’s one of the Fed’s first endeavors in analytics, even though the agency has lots of data available. Beyond putting the correct technology and skill sets in place, the team found that developing proper data governance procedures was critical.

“To be good at it, we need lots of data. So we are procuring data and setting up governance because we’re in a highly regulated space that requires a lot of security,” said Jeromey Farmer of the Fed. “But people are starting to see some of the benefits. In a quasi-government or government situation, it just takes time for people to become at ease. We are dealing with all agencies. And regardless of what agency you are dealing with, some accept it right away where others don’t.”

Curiosity, Hunger Coupled with Programming Skills

Panera created an internal data science users group to help spread awareness and increase buy-in about using predictive analytics. Members attend a wide variety of networking groups that focus specifically on analytics, and then share the resulting best practices internally.

“We call it data science but in essence it’s analytics. There are people with a programming background, statistics and people with domain knowledge. It’s to help collaborate on any level,” Panera’s Foppe told the roundtable attendees.

All three panelists agreed that hiring problem solvers with high levels of curiosity, in addition to the technical skills required to analyze data, is important when assembling predictive analytics teams.

“We are looking for people who are curious, problem solvers who can communicate and who want to work on a team. We are not looking for people who want to sit in a corner and crunch data. Writing skills are big in our space, because most of our deliverables are published artifacts,” said Jeromey Farmer of the Fed. “We try to find people who have experience with programming of some sort. We do like people who have some database mining experience.”

“What has helped us is a partnership with LaunchCode (the non-profit that trains and places aspiring developers). That has been a huge asset to the team. They all have Java backgrounds, C++, R, Python and some of these new open source capabilities,” Sherman said. “And they are hungry. They want to take it to the next level.”

“As this matures, there will be a spectrum. On one end you’ll have hard-core data scientists: specialized and dedicated analytics people with a high skill set in statistics and programming. On the other end will be people who just want access to the data to see what it’s doing. And there will be others in the middle who want to take the data and do a little more with it,” Foppe said.

The roundtable was hosted by Larry Steele, Director of Data Solutions at TDK Technologies. The Data Solutions practice leverages TDK's existing core competencies of Software Technology and Management Services. Learn more about how these services can help your organization.
