


Listed below are links to weblogs that reference The Limits of Machine Analysis:

Comments

Wonderful post!

An analogy comes to mind. We have wonderful tools and models to do hypothesis testing, but we don't have any great models for generating hypotheses in the first place - at least not from statistics.

Really enjoyed your posting, and I'm flattered that I was able to get a good discussion going on what I find to be a fascinating subject. And thanks for the kind words about the talk. I agree that there are real-world challenges that are not easily solved by AI or machine-assisted learning alone.

Of course there are very good examples of where this technology CAN complement the analysis process, but it can't simply replace it. Finding non-obvious patterns in other structures of data using AI and machine learning (beyond free text, as in the example) is both possible and practical in many cases.

In my opinion, the answer is a combination: use machines for what they are good at (calculations across very large sets of information) and recognize people for what they are good at: hypothesizing, intuition, creativity, validation and investigative instinct. The example you point out makes a key point about why machine learning is not THE final answer, but rather possibly a part of the dialogue.

For the foreseeable future, I agree, we humans are still in control.

Thanks for posting this, Gary - it's very thought-provoking. I agree with you that the role of the "analyst" is more than just asking great questions. I also agree with Ryan's comment above. In fact, when thinking about a standard approach to statistical modeling, there are three areas where the role of the analyst stands out for me:
1. Defining the question.
3. Identifying the meta-data (or semantic relationships) - what you mention above.
5. Selecting the best model - based on the combination of statistical indicators (R² and VIF in a regression), contextual explainability (is that a word?), and the ability to act on the model's findings.

The other steps are better candidates for automation and ML, IMHO:
2. Collecting (and some cleaning of) the data.
4. Building and refining multiple models.
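
To make the statistical indicators in step 5 concrete, here is a minimal sketch (not the commenter's own method) of computing R² and VIF for an ordinary least-squares regression with NumPy. The synthetic data, the variable names, and the helper functions `r_squared` and `vif` are all assumptions made up for illustration.

```python
import numpy as np


def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on X (intercept added)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


def vif(X):
    """Variance inflation factor of each predictor: 1 / (1 - R_j^2),
    where R_j^2 comes from regressing column j on the other columns."""
    X = np.asarray(X, dtype=float)
    factors = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        factors.append(1.0 / (1.0 - r_squared(others, X[:, j])))
    return factors


# Hypothetical data: x3 is nearly a copy of x1, so both should show a high VIF.
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + 0.1 * rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 2.0 * x1 - x2 + rng.normal(size=n)

r2 = r_squared(X, y)
vifs = vif(X)
print("R^2:", round(r2, 3))
print("VIF:", [round(v, 1) for v in vifs])
```

A common rule of thumb treats a VIF above roughly 5 to 10 as a sign of problematic collinearity. The code can compute the numbers, but deciding whether to drop a predictor, and whether the resulting model is explainable and actionable, remains the analyst's call.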

Great post (and blog overall). I am obviously very late to the discussion but I don't think the relevance of the topic has expired since November :-).

Here is another topic that this blog touches on. Despite the limitations of machine analysis that you very eloquently pointed out, many companies with a large web presence still use only a small portion of the data they could tap into to understand what is going on with the online portion of their business. We are only at the dawn of BigData (although for some firms it is already afternoon in that respect), but we can already point to some of the big opportunities opened up by the ability to sift through large amounts of data more cheaply than ever. Nevertheless, a large number of companies are still grappling with how to incorporate BigData into their existing organizational architecture (both the technical one and the human one). While, as you mention above, the human is still (and will remain for the foreseeable future) the central point of this architecture, I'd argue that a medium-to-large company that gets a significant portion (or all) of its revenue through its online presence cannot survive without setting up a BigData shop and incorporating it firmly into its organizational structure. One of the big benefits of BigData is the ability to bring together multiple data sources more cheaply (web analytics data, bid management tool data, advertising server data, logged data, etc.) and to enable deep analytics on the combined data, of a kind that none of the individual tools can do by itself.

If one accepts the view above (which is relatively easy nowadays), then here are some questions that I think are worth answering:

- What are the best practices for introducing BigData into an organization that is a traditional RDBMS and web tools shop (e.g. should it sit in the technology organization, the analytics organization, or somewhere else)?

- What is the best way to combine the traditional analysis currently done through web tools (Omniture, Google Analytics, etc.) with the deep analysis (what you call "machine analysis") that a BigData framework makes possible, in an organization that is mostly attuned to using traditional tools only?

I am aware that the answers to these questions may be obvious to many of the readers and posters here, but I think there is a large audience out there that may be interested in hearing them.

Thanks again for the insightful post.
