Predictive Analytics Times Newsletter:

We've got some outstanding original content in this month's issue of Predictive Analytics Times that I'm sure you will enjoy. We start with an important question from Dean Abbott – "Do predictive modelers need to know math?" – and Eric Siegel uses his predictive skills to tell us what the world of Predictive Analytics will look like in 2020. Continuing with the future outlook, Tom Reamy provides an industry Q&A on the future of Text Analytics.

Thanks for taking the time to read this month's articles. If you're not already signed up, I encourage you to do so today to take advantage of all things Predictive. For feedback, thoughts or ideas, don't hesitate to contact us.

Kind regards,

Adam Kahn
Publisher, Predictive Analytics Times

Do Predictive Modelers Need to Know Math?
The Future of Prediction: Predictive Analytics in 2020
Join SmartData Collective now
Reclaim Your Edge: How Advanced Analytics Is Helping Macy's Transform The Customer Experience
Training Program in Predictive Analytics – April in New York City
The Future Directions for Text Analytics
Online Course: Predictive Analytics Applied – On demand any time

Do Predictive Modelers Need to Know Math?

By Dean Abbott, President, Abbott Analytics, Inc.

Predictive analytics is just a bunch of math, isn't it? After all, algorithms in the form of matrix algebra, summations, integrals, multiplies and adds are the core of what predictive modeling algorithms do. Even rule-based approaches need math to compute how good the if-then-else rules are.

I was participating in a predictive analytics course recently, and at the end of two days of instruction a participant asked this question: "It's been a long time since I've had to do this kind of math and I'm a bit rusty. Is there a book that would help me learn the techniques without the math?"

The question about math was interesting. But do we need to know the math to build models well? Anyone can build a bad model, but to build a good model, don't we need to know what the algorithms are doing?

The answer, of course, depends on the role of the analyst. I contend, however, that for most predictive analytics projects, the answer is "no".

Let's consider building decision tree models. What options does one need to set to build good trees? Here is a short list of common knobs that can be set by most predictive analytics software packages:

  1. Splitting metric (CART style trees, C5 style trees, CHAID style trees, etc.)
  2. Terminal node minimum size
  3. Parent node minimum size
  4. Maximum tree depth
  5. Pruning options (standard error, Chi-square test p-value threshold, etc.)

The most mathematical of these knobs is the splitting metric. CART-style trees use the Gini Index, C5-style trees use Entropy (information gain), and CHAID-style trees use the chi-square test as the splitting criterion.
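To make two of these splitting metrics concrete, here is a minimal sketch in Python that scores candidate splits with the Gini Index and with Entropy. The function names and the toy "yes"/"no" labels are assumptions made for illustration; they do not come from any particular software package.

```python
import math
from collections import Counter


def class_proportions(labels):
    """Share of each class among the records in a node."""
    n = len(labels)
    return [count / n for count in Counter(labels).values()]


def gini(labels):
    """Gini Index: 1 minus the sum of squared class proportions."""
    return 1.0 - sum(p * p for p in class_proportions(labels))


def entropy(labels):
    """Entropy in bits: -sum(p * log2(p)) over the classes present."""
    return -sum(p * math.log2(p) for p in class_proportions(labels))


def split_gain(parent, children, impurity):
    """Parent impurity minus the size-weighted impurity of the child nodes."""
    n = len(parent)
    weighted = sum(len(child) / n * impurity(child) for child in children)
    return impurity(parent) - weighted


# Hypothetical node with 5 "yes" and 5 "no" records, and two candidate splits.
parent = ["yes"] * 5 + ["no"] * 5
split_a = [["yes"] * 4 + ["no"], ["yes"] + ["no"] * 4]          # fairly clean
split_b = [["yes"] * 3 + ["no"] * 2, ["yes"] * 2 + ["no"] * 3]  # barely helps

for name, split in (("A", split_a), ("B", split_b)):
    print(name,
          round(split_gain(parent, split, gini), 4),
          round(split_gain(parent, split, entropy), 4))
```

In this toy case both metrics rank split A above split B, although the size of the improvement is measured on different scales.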

A book I consider the best technical book on data mining and statistical learning methods, "The Elements of Statistical Learning", has this description of the splitting criteria for decision trees, including the Gini Index and Entropy:
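
For reference, these node impurity measures are commonly written as follows. The notation is an assumption chosen here for illustration, not a quotation from the book: $\hat{p}_{mk}$ is the proportion of records in node $m$ that belong to class $k$, out of $K$ classes.

```latex
\begin{align*}
\text{Misclassification error:} \quad & 1 - \max_{k} \hat{p}_{mk} \\
\text{Gini Index:} \quad & \sum_{k=1}^{K} \hat{p}_{mk}\,\bigl(1 - \hat{p}_{mk}\bigr) \\
\text{Entropy (cross-entropy):} \quad & -\sum_{k=1}^{K} \hat{p}_{mk} \log \hat{p}_{mk}
\end{align*}
```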

Models can be biased toward splitting on particular variables or even particular records. In some cases, it may appear that the models are performing well but in actuality they are brittle. Understanding the math can help remind us that this may happen and why.

To a mathematician, these make sense. But without a mathematics background, these equations will be at best opaque and at worst incomprehensible. (And these are not very complicated. Technical textbooks and papers describing machine learning algorithms can be quite difficult to understand, even for seasoned but out-of-practice mathematicians.)

As a predictive modeler with a mathematics background, I must say that the actual splitting equations almost never matter to me. Gini and Entropy often produce the same splits or at least similar splits. CHAID differs more, especially in how it creates multi-way splits.

There are, however, very important reasons for someone on the team to understand the mathematics or at least the way these algorithms work qualitatively. First and foremost, understanding the algorithms helps us uncover why models go wrong.

The fact that linear regression uses a quadratic cost function tells us that outliers affect overall error disproportionately. Understanding how decision trees measure differences between the parent population and sub-populations informs us why a high-cardinality variable may be showing up at the top of our tree, and why additional penalties may be in order to reduce this bias.
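
Both effects can be seen with small numbers. The following is a minimal sketch in Python, using made-up residuals and a made-up 16-record data set; the helper names are hypothetical. The gain ratio shown (information gain divided by the entropy of the splitting variable, as used in C4.5-style trees) is only one example of a penalty, not necessarily the one a given tool applies.

```python
import math
from collections import Counter


def outlier_share(residuals, power):
    """Fraction of the total |residual|**power cost due to the largest residual."""
    costs = [abs(r) ** power for r in residuals]
    return max(costs) / sum(costs)


def entropy(values):
    """Entropy in bits of the empirical distribution of `values`."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def info_gain(labels, feature):
    """Entropy reduction from a multi-way split on each distinct feature value."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(labels, feature):
    """Information gain penalized by the split information of the feature."""
    split_info = entropy(feature)
    return info_gain(labels, feature) / split_info if split_info > 0 else 0.0


# Quadratic versus absolute cost: one residual of 10 among four residuals of 1.
residuals = [1, 1, 1, 1, 10]
squared_share = outlier_share(residuals, 2)   # 100 / 104 of the squared error
absolute_share = outlier_share(residuals, 1)  # 10 / 14 of the absolute error

# High cardinality: a unique record ID versus a genuinely informative segment.
labels = ["yes"] * 8 + ["no"] * 8
record_id = list(range(16))                          # one value per record
segment = ["A"] * 7 + ["B"] + ["B"] * 7 + ["A"]      # mostly tracks the label

print(round(squared_share, 3), round(absolute_share, 3))
print(info_gain(labels, record_id), round(info_gain(labels, segment), 3))
print(gain_ratio(labels, record_id), round(gain_ratio(labels, segment), 3))
```

Raw information gain favors the record ID because every child node holds a single record and is trivially pure, while the penalized score favors the segment variable. Likewise, the single large residual accounts for a much larger share of squared error than of absolute error.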

The answer to the question of whether predictive modelers need to know math is this: no, they don't need to understand the mathematical notation, but neither should they ignore the mathematics. Instead, we all need to understand the effects of the mathematics on the algorithms we use. "Those who ignore statistics are condemned to reinvent it," warns Bradley Efron of Stanford University. The same applies to mathematics.

The Future of Prediction: Predictive Analytics in 2020

By Eric Siegel, president of Prediction Impact, Inc., author of the acclaimed book Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die, Executive Editor of the Predictive Analytics Times, and founder of Predictive Analytics World and Text Analytics World

Good morning. It's January 2, 2020, the first workday of the year. As you drive to the office, the only thing predictive analytics doesn't do for you is steer the car (yet that's coming soon as well).

  1. Anti-theft. As you enter your car, a predictive model establishes your identity based on several biometric readings, rendering it virtually impossible for an imposter to start the engine.
  2. Entertainment. Pandora plays new music it predicts you will like.
  3. Traffic. Your navigator pipes up and suggests alternative routing due to predicted traffic delays. Because the new route has hills and your car's battery -- its only energy source -- is low, your maximum acceleration is decreased.
  4. Breakfast. An en-route drive-through restaurant is suggested by a recommendation system that knows its daily food preference predictions must be accurate or you will disable it.
  5. Social. Your Social Techretary offers to read you select Facebook feeds and Match.com responses it predicts will be of greatest interest. Inappropriate comments are accurately filtered out. CareerBuilder offers to read job postings to which you're predicted to apply. When playing your voicemail, solicitations such as robo call messages are screened by predictive models just like email spam.
  6. Deals. You accept your smartphone's offer to read to you a text message from your wireless carrier. Apparently, they've predicted you're going to switch to a competitor, because they are offering a huge discount on the iPhone 13.
  7. Internet search. As it's your colleague's kid's birthday, you query for a toy store that's en route. Siri, available through your car's audio, has been greatly improved -- better speech recognition and proficiently tailored interaction.
  8. Driver inattention. Your seat vibrates as internal sensors predict your attention has wavered -- perhaps you were distracted by a personalized billboard a bit too long.
  9. Collision avoidance. A stronger vibration plus a warning sound alert you to a potential imminent collision -- possibly with a child running toward the curb or another car threatening to run a red light.
  10. Reliability. Your car says to you, "Please take me in for service soon, as I have predicted my transmission will fail within the next three weeks."

Predictive analytics not only enhances your commute – it was instrumental to making this drive possible in the first place:

Car loan. You could afford this car only because a bank correctly scored you as a low credit risk and approved your car loan.

Insurance. Sensors you volunteered to have installed in your car transmit driving behavior readings to your auto insurance company, which in turn plugs them into a predictive model in order to continually adjust your premium. Your participation in this program will reduce your payment by $30 this month.

Wireless reliability. The wireless carrier that serves to connect to your phone – as well as your car – has built out its robust infrastructure according to demand prediction.

Cyber-security. Unbeknownst to you, your car and phone avert crippling virus attacks by way of analytical detection.

Road safety. Impending hazards such as large potholes and bridge failures have been efficiently discovered and preempted by government systems that predictively target inspections.

No reckless drivers. Dangerous repeat moving violation offenders have been scored as such by a predictive model to help determine how long their licenses should be suspended.

Your health. Predictive models helped determine the medical treatments you have previously received, leaving you healthier today.

Tomorrow's Just a Day Away

All the preceding capabilities are available now or have similar incarnations actively under development. Many are delayed more by the (now imminent) integration of your smartphone with your car than by the development of predictive technology itself. The advent of mobile devices built into your glasses, such as Google Glass, will provide yet another multiplicative effect on the moment-to-moment integration of prediction, as well as further accelerating the accumulation of data with which to develop predictive models.

Today, predictive analytics' all-encompassing scope already reaches the very heart of a functioning society. Organizations – be they companies, governments, law enforcement, charities, hospitals or universities – undertake many millions of operational decisions in order to enact services. Prediction is key to guiding these decisions. It is the means with which to improve the efficiency of massive operations.

Several mounting ingredients promise to spread prediction even more pervasively: bigger data, better computers, wider familiarity, and advancing science. A growing majority of interactions between the organization and the individual will be driven by prediction.

The Future of Prediction

Of course, the details and timing of these developments are up to conjecture; predictive analytics has not conquered itself. But we can confidently predict more prediction. Every few months another big story about predictive analytics rolls off the presses. We're sure to see the opportunities continue to grow and surprise. Come what may, only time will tell what we'll tell of time to come.

Excerpted with permission of the publisher, Wiley, from Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die (February 2013) by Eric Siegel. Copyright (c) 2013 by Eric Siegel. This book is available at all bookstores and online booksellers.

Join SmartData Collective now: the moderated business community for business intelligence, predictive analytics, and data professionals. SmartData Collective, an online community moderated by Social Media Today, provides enterprise leaders access to the latest trends in Business Intelligence and Data Management. Our innovative model serves as a platform for recognized, global experts to share their insights through peer contributions, custom content publishing and alignment with industry leaders. SmartData Collective is a key resource for executives who need to make informed data management decisions.

Reclaim Your Edge: How Advanced Analytics Is Helping Macy's Transform The Customer Experience

By: Joseph Dennis Kelly
Originally published at http://blogs.sap.com/

The traditional balance of business power has shifted. Today, the advantage rests – literally, through mobile devices – in the buyer's hands.

Disrupting the Balance

Consumers worldwide are getting more for their money. Because they can easily tap the Web, through their desktops, notebooks, and smartphones, they can quickly compare, price-shop, and purchase a wide range of products and services. Because of the Web, as Bloomberg Businessweek Research Services reports, the dynamics driving business have changed:

  • "Consumers and business buyers are in the driver's seat thanks to online forums, social communities and social media sites.
  • The Web and mobile devices have increased expectations for transparency, immediate response and intuitive business processes.
  • Differentiation is more difficult to maintain, with competitive offerings just a mouse click and a "free-shipping" offer away.
  • Whereas customer relations used to be considered a sales or service function, customer experience encompasses everything from the first impression of the brand all the way to sales, fulfillment, invoicing, billing, collections and after-sales service." (p. 5)

To put their companies back in the driver's seat, business leaders must think differently and learn how to leverage the Web to deliver the kind of value that builds and nurtures customer trust and loyalty. It's a shift which can strengthen companies and give them the foresight to generate the streams of revenue which enable them to profitably sustain their operations over the long term. It's a shift better realized by running advanced analytics to glean from Big Data the real-time insights that help decision makers spot those previously untraceable opportunities so their companies can more strategically navigate their markets.


"The goal," says LiquidAnalytics Partner Ravi Kalakota, "is to leverage the shopping, spending, inventory data [held in each company's databases to help them] make thousands of micro pricing, merchandize, and assortment decisions in a week instead of ten..to customize and deliver one hundred assortments to shopper segments, instead of ten..to predict one hundred stockouts about to occur, instead of ten..."

To make this change, so that companies can outperform competitors and, more important, provide customers with the offerings and level of service they desire, business leaders must focus on developing new ideas and using better methods for testing each possibility. With advanced analytics, companies can see their customers more clearly and create strategies that better engage their core customer segments. One company that is using advanced analytics in this way is retail leader Macy's, with its much-lauded omni-channel shopping strategy.

Energizing Engagement
To help Macy's create an exceptional customer experience, Chairman Terry Lundgren appointed himself chief customer officer and sponsored an initiative to build an omni-channel strategy that uses insights from advanced analytics to inform decision-making.

Since roll out, this strategy has helped Macy's develop in-store experiences that "mirror the online shopping experience," says Lundgren, while "...adding functionality and content online to provide customers with additional assistance in product selection." The goal: "to build deeper relationships with customers and to ensure Macy's and Bloomingdale's are accessible no matter how or when our customers prefer to explore or shop."

From its advanced analytics findings, Macy's identified and developed several opportunities to turbo-charge its strategy: self-service kiosks, inventory-locating registers, and True Fit, a Macys.com "tool that helps women select jeans that are best-suited for their 'unique body and style preferences.'" Tools like True Fit can do much to ease the trepidation that e-shoppers sometimes feel when they wonder whether a product purchased on the Web will actually look and function, once it is out of the box, like the product promised in the imagery and reviews online.

For Macy's VP of Customer Centricity Julie Bernard, this omni-channel strategy enables the retailer to finely tune its merchandising decisions.

Considering that Macy's annually invests US$40B in its displays, it could boost profits significantly by simply developing merchandising plans which appeal to a particular customer segment, plans which are informed by archived data on each segment's previous product preferences.

Finding the Most Fitting Option
There are many choices available. With advanced analytics, business leaders can understand which of the options identified is most feasible for their company to implement. In the case of Macy's, omni-channel retailing is proving a successful way to dissuade customers from the two practices that most challenge large retailers in the Internet age: showrooming and price-shopping.

As Macy's is also proving, advanced analytics can help companies better understand how they can refocus a customer's attention away from price and toward their encounter with the products (i.e., in-store and online), people (i.e., clerks, reps, and customers), and places (i.e., physical and virtual) that humanize the transactional process, and so infuse into the shopping experience a bit of retailtainment.

Through these kinds of efforts, businesses can put the right offers into the hands of the right customers via the right channel – before those customers find what they need elsewhere, either on the Web or in their neighborhood.

The Future Directions for Text Analytics

Tom Anderson: I think data visualization today is incredibly poor. I can't believe many of our competitors in the text analytics field still offer simple "word clouds" as output.

Conversely, I think clients have to realize that data visualization techniques are generally best used as exploration tools, and not one click export to a management level PowerPoint slide.

There is currently an opportunity in finding the best ways to communicate insights from text analytics. Having powerful software and the right data is half the battle. But we also need more creative analysts who understand the respective business and data and who can communicate the findings effectively. This is more a shortage of good analysts with the time to use these tools than a need for additional technology.

Tom Reamy: Do you see any revolutionary changes for text analytics on the horizon?

Tom Anderson: Yes, what I've been talking about a lot is domain expertise. OdinText, for instance, is focused on the use of text analytics for consumer insights. That is a very different thing than using text analytics for engaging with Twitter users or detecting terrorists or fraud, etc. All these require special knowledge and rule and code modification.

I think there will be fewer "Enterprise" as well as "Twitter Monitoring" firms, and a lot more domain- and industry-specific text analytics tools and firms.

Also, this technology will be incorporated by most of the companies that own sizeable amounts of unstructured data. So there will be more licensing and acquisitions going on.

Tom Reamy: Is there anything else you would like to say about the future of text analytics?

Tom Anderson: I'm so glad I got into text analytics as early as I did. It's still in its infancy, not in terms of its power and what we can already do with it, but in terms of adoption and of thinking creatively about how to leverage it in different ways. Very exciting times ahead!

Tom Reamy: What do you see as the major trends in text analytics in the next year or two?

Jeremy Bentley: To borrow from Big Data parlance – Velocity, Volume and Variety mean text analytics in real time over a lot of it, in different formats and from different places.

Content Intelligence (which includes text analytics) brings structure to unstructured information so it can be joined with the data world. Data tells you what happened, and content tells you why. Associating the what with the why is the major requirement for organizations that protect, value and make money from their information.


Tom Reamy: What are the problems and issues that are slowing down the field?

Jeremy Bentley: The reality check that content is not clean, properly managed or sufficiently findable today. Information overload (the often cited big issue) is nothing but a filter problem – the problem is that the filter parameters are not present in the current information management systems of CMS, ERDMS and search engines. This will remain the case until the gritty and unglamorous task of metadata management, and of automatically applying whatever metadata is needed for a particular view of the content at any particular point in time, is recognized. Once it is addressed, content becomes processable and valuable.

Tom Reamy: What new technologies and developments in text analytics or related fields (predictive analytics, machine learning, artificial intelligence, etc.) do you see or want to see in the next year or two?

Jeremy Bentley: There is a balance to be drawn between what is fully automatic and what requires some human oversight. Classification and text analysis should be fully automatic; the methods and rules used to drive the analysis should be subject to user oversight. Machine learning and AI have a role to play in the latter: as software becomes more sophisticated, the effort needed to achieve quality analytics and metadata derivation will go down.

Tom Reamy: Do you see any revolutionary changes for text analytics on the horizon?

Jeremy Bentley: Most users see text analytics as pretty cutting edge as it is, so to answer this question we have to widen it from text to content, in all of its forms, to see where the revolution will come.

Content Intelligence for Big Data will revolutionize how organizations use their information to gain insight and competitive advantage. This is already happening in forward-thinking enterprises – increasingly it will not just be the larger organizations that benefit from such an approach.

Tom Reamy: Is there anything else you would like to say about the future of text analytics?

Jeremy Bentley: Being able to process content as we do data in a database will seem standard in a decade's time.

At the Boston Text Analytics World, held on October 3-4, one of the themes of the conference was the future of text analytics. The program chair, Tom Reamy, gave a keynote presentation on that theme, Future Directions in Text Analytics.

For more on Text Analytics visit www.textanalyticsworld.com
