Machine Learning Times

This excerpt is from Cmswire. To view the whole article click here.  

9 years ago
Shine a Light on Your Dark Data

 


For those worrying about the data security issues caused by enterprise file sharing or poorly constructed information management strategies, add a new item to your ‘things to worry about’ list — dark data.

Never heard of dark data? Gartner coined the term to describe enterprise data that’s fallen into disuse due to a lack of ownership, poor visibility, poor accessibility and similar problems.

And the problem will only get worse as data gathering gets more efficient and companies continue to ignore content management.

Any unmanaged or unsupervised data poses a potential security risk. Any data not actively in use in the organization (or required for e-discovery purposes) is freeloading, and storage space costs too much for that.

New technologies that have emerged over the last 12 months in content analytics, predictive analytics and process management aim to help organizations bring their dark data to light.

But is dark data really all that dark? Or is it just another facet of poor data management?

Dark Data, By the Numbers

According to a recent AIIM report (registration required) by Doug Miles, in spite of content analytics’ potential, 80 percent of those surveyed had yet to allocate a senior role to initiate and coordinate content analytics applications.

The lack of designated leadership and the shortage of analytics skills are holding back the deployment of content analytics tools, according to almost two-thirds (63 percent) of respondents.

Dark data was named as a big business driver for deploying content analytics, with other drivers including process productivity improvements, additional business insight, and adding value to legacy content.

Seventy-three percent of respondents felt that enhancing the value of legacy content was better than wholesale deletion, while more than half (53 percent) said that auto-classification using content analytics was the only way to get content chaos under control.

Content Management And Metadata

Should organizations be concerned about dark data? According to Greg Milliken, vice president of marketing at M-Files Corporation, the answer is no. Milliken shared the M-Files take on dark data in an interview with CMSWire: dark data is not sinister, just badly managed.

“One of the whole premises of our architecture is that by classifying information by what it is — it’s a proposal, it’s an invoice, it’s a support ticket — and relating it to other key elements that themselves are often the fundamental drivers of the business, it allows this data to show up dynamically based on the context without the individual [searcher] necessarily being aware of it,” he said.

The M-Files enterprise information management platform provides users with a metadata-driven system for organizing and managing data.

“What we believe is that what drives the discovery and utilizations of this data that could go dark are those relationships, those connections to the intelligent layer that we believe is metadata. So if I am searching for something relating to customers, if other assets have been tagged with this customer, or information, that now happens dynamically in the M-Files and it throws this up,” Milliken said.
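To make the idea of metadata-driven discovery concrete, here is a minimal Python sketch. It is an illustration only, not the M-Files API or architecture: the `Asset` and `MetadataIndex` names, the tag keys and the sample file names are all hypothetical. It assumes each asset is classified by what it is and tagged with the business entities it relates to, so that a lookup on a customer surfaces every related asset without the searcher knowing those assets exist.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Asset:
    """A piece of content plus its metadata (hypothetical model)."""
    name: str
    doc_type: str  # what the item is, e.g. "proposal" or "invoice"
    related: dict = field(default_factory=dict)  # e.g. {"customer": "Acme"}


class MetadataIndex:
    """Maps (key, value) metadata pairs to the assets that carry them."""

    def __init__(self):
        self._by_tag = defaultdict(list)

    def add(self, asset):
        # Index the asset by its type and by every relationship it declares.
        self._by_tag[("type", asset.doc_type)].append(asset)
        for key, value in asset.related.items():
            self._by_tag[(key, value)].append(asset)

    def related_to(self, key, value):
        # Return the names of all assets tagged with this metadata pair.
        return [asset.name for asset in self._by_tag.get((key, value), [])]


index = MetadataIndex()
index.add(Asset("Q3 proposal.docx", "proposal", {"customer": "Acme"}))
index.add(Asset("INV-001.pdf", "invoice", {"customer": "Acme"}))
index.add(Asset("Ticket 42", "support ticket", {"customer": "Globex"}))

# Searching on the customer surfaces both Acme assets, whatever their type.
acme_assets = index.related_to("customer", "Acme")
print(acme_assets)
```

An asset that is never tagged can only be found by someone who already knows it exists, which is one way data goes dark in the sense Milliken describes.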

Wasting Content

Another approach comes from predictive analytics vendor idio. Andrew Davies, CMO at idio, shared some numbers from SiriusDecisions research: between 60 and 70 percent of content produced by business-to-business companies goes unused. Corporate Visions data puts this figure as high as 90 percent. Whichever number you believe, we can agree that a lot of data goes unused and that can mean unrealized business potential.

Like Milliken, Davies believes that technology can solve the problem:

“The key to solving this problem is technological. To make content ‘useful’ it has to be understood and served to those for whom it will be most relevant. This might work manually when you only have a few assets, but it doesn’t scale in any serious organization running simultaneous campaigns and marketing channels. It’s not possible for humans to both have a global understanding of every piece of content created within the enterprise and know what it is about, this has to be turned over to machine learning systems that can cope with large volumes of content and customer interactions at scale,” he said.

Big data technologies like idio’s Content Intelligence offering are a response to the ‘dark data’ issue, Davies continued, stating it was designed to ensure that useful content, dark or otherwise, is associated with the right customers and prospects.

By: David Roe