The goal of big data is to find patterns in large volumes of noisy data and to extract value from them. In the general case, this means finding things that are similar to one another, so that solutions that worked well in the past can be applied to a new opportunity.

Whether your goal is to increase revenue or cut expenses, or to alter public perception of your company or an issue that is important to you, the technology is the same. Big data compares past behavior with desired outcomes, and guides you to maximize ROI. Big data lets you apply mathematical precision to the human relations side of your business.

Robert Bushman has been researching and building big data systems for fifteen years, and has worked with companies from smaller enterprises to the Fortune 10. The following examples come from two technology demonstration projects: one using Wikipedia as the dataset, the other using Reddit discussions.

Wikipedia Dendrograms

300,000 articles were analyzed and clustered on Amazon Elastic MapReduce to find associated groups of Wikipedia entries. This is similar to how you might group your leads, customers, or users for closer analysis or to target advertisements, marketing materials, or sales representatives.
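
To give a feel for how a dendrogram of related articles can be built, here is a minimal sketch of hierarchical (agglomerative) clustering. It is an illustration under stated assumptions, not the code behind the demo: the four toy "articles", the bag-of-words vectors, the cosine distance, and the average-linkage rule are all choices made for this example, and it runs on one machine rather than on Elastic MapReduce.

```python
import math
from collections import Counter
from itertools import combinations

def vectorize(text):
    """Turn text into a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine_distance(a, b):
    """1 - cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm = (math.sqrt(sum(c * c for c in a.values()))
            * math.sqrt(sum(c * c for c in b.values())))
    return 1.0 - (dot / norm if norm else 0.0)

def agglomerate(docs):
    """Average-linkage agglomerative clustering.

    Returns the merges in order, each as (cluster_a, cluster_b, distance),
    where a cluster is a sorted tuple of titles. Read from first to last,
    the merges describe a dendrogram from the leaves up to the root.
    """
    vectors = {title: vectorize(text) for title, text in docs.items()}

    def linkage(pair):
        # Average pairwise distance between the members of two clusters.
        a, b = pair
        total = sum(cosine_distance(vectors[x], vectors[y])
                    for x in a for y in b)
        return total / (len(a) * len(b))

    clusters = [(title,) for title in sorted(docs)]
    merges = []
    while len(clusters) > 1:
        # Merge the two closest clusters, then repeat.
        a, b = min(combinations(clusters, 2), key=linkage)
        merges.append((a, b, linkage((a, b))))
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(tuple(sorted(a + b)))
    return merges

# Hypothetical toy articles, invented for this example.
docs = {
    "Lion": "large cat predator africa savanna",
    "Tiger": "large cat predator asia jungle",
    "Python": "programming language interpreter code",
    "Java": "programming language compiler code",
}

for left, right, distance in agglomerate(docs):
    print(f"{distance:.2f}  {left} + {right}")
```

On this toy data the two programming languages merge first, then the two big cats, and the two groups join only at the root. Comparing every pair of clusters on each pass is fine for four documents but would not scale to hundreds of thousands, which is one reason to distribute the work.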

Reddit Similar Stories Engine

Fifty million Reddit comments from more than a quarter of a million discussions have been stored and analyzed in a 100-gigabyte database. This interface lets you find the discussions most similar to some of the stories that are hot on Reddit right now.
Note: This demo runs on a development machine and may be offline. If you cannot reach it, please try again in a few minutes, or contact Robert Bushman.
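
One common way to rank "most similar" discussions is to weight each word by TF-IDF and compare discussions by cosine similarity. The sketch below shows that idea only; it is not necessarily how the demo computes similarity. The discussion titles and texts are invented, and `tfidf_vectors`, `cosine`, and `most_similar` are hypothetical helper names.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build a TF-IDF weighted sparse vector for each document."""
    counts = {title: Counter(text.lower().split())
              for title, text in docs.items()}
    # Number of documents each word appears in.
    doc_freq = Counter(word for c in counts.values() for word in c)
    n = len(docs)
    return {
        title: {w: tf * math.log(n / doc_freq[w]) for w, tf in c.items()}
        for title, c in counts.items()
    }

def cosine(a, b):
    """Cosine similarity between two sparse weight vectors."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm = (math.sqrt(sum(w * w for w in a.values()))
            * math.sqrt(sum(w * w for w in b.values())))
    return dot / norm if norm else 0.0

def most_similar(query_title, docs, k=3):
    """Return the k titles most similar to query_title, best first."""
    vectors = tfidf_vectors(docs)
    query = vectors[query_title]
    scores = [(cosine(query, vec), title)
              for title, vec in vectors.items() if title != query_title]
    return [title for _, title in sorted(scores, reverse=True)[:k]]

# Hypothetical discussions, invented for this example.
discussions = {
    "New telescope images released": "telescope space images galaxy nasa",
    "Mars rover finds water": "mars rover space nasa water",
    "Best pizza dough recipe": "pizza dough recipe flour oven",
    "Galaxy photographed by amateur": "galaxy telescope amateur images sky",
}

print(most_similar("New telescope images released", discussions, k=2))
```

Words that appear in only a few discussions get more weight than common ones, so the ranking favors shared distinctive vocabulary. A production system over millions of comments would need an index rather than a full pairwise scan.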

Reddit Similar Stories Examples

I have posted a few demo lists of related stories on Reddit, as a way of testing the results and getting feedback. You can see some of the best, and compare my results with Reddit's standard "Related" feature, at the link below.
