Is Big Data a Good Thing?

(Image: Data from Star Trek)

Big Data is a crazy reality that we have created with society’s many digital input devices, from street cameras to the common smartphone (sorry, Trekkies). There is so much data available that computing algorithms are needed to extrapolate and contextualize the information. Companies are actively looking at ways to mine and extrapolate Big Data for analytics and market use.

McKinsey & Company’s Business Technology Office says Big Data will become a key basis of competition, underpinning new waves of productivity growth, innovation, and consumer surplus. The report goes on to list five ways Big Data can be used by companies and nonprofits:

1) Big Data can unlock significant value by making information transparent and usable at much higher frequency.

2) As organizations create and store more transactional data in digital form, they can collect more accurate and detailed performance information on everything from product inventories to sick days, and therefore expose variability and boost performance.

3) Big Data allows ever-narrower segmentation of customers and therefore much more precisely tailored products or services.

4) Sophisticated analytics can substantially improve decision making.

5) Big Data can be used to improve the development of the next generation of products and services.

Given the incredible amounts of data available about people, will companies abuse the data to take advantage of people and society in general? This is a tough issue because, in general, Big Data will improve our ability to serve each other with higher-quality information, products and services. Semantic information is already making search markedly better.

However, there will be repercussions including further polarization and perhaps an unhappy realization of the picture that Big Data shows of ourselves as a society. Society may not be ready to see itself in the mirror.

Further, the continuing trials of Facebook illustrate just how serious an issue Big Data has become. Facebook’s consistent use of user data to benefit its corporate customers at the expense of privacy has triggered investigation requests to the FTC, and continues to get exposed by the media. Yet Facebook continues its practices in the face of media protests and potential lawsuits or worse.

For every Facebook whose data issues become well known (and make the company suspect), there are dozens of companies that get away with Big Data abuses, often under the radar. Really, in every technology, in every sector, there are abuses. Big Data is no different, and never will be.

Will we accept Big Data’s negatives as a trade-off for better results? Or do we even have a choice? What do you think?

The Mounting Challenges of an Established Social Content Market

Cowboy # 5
Image by Randy Pertiet

When blogging was new, anyone with vertical subject matter expertise could create their own site and become a success. These voices were integral role players within communities that shared the same interest. Today, the corporatization of social media by content farms, the use of algorithmic content sourcing, and an established tier of “A-List bloggers” have drastically reduced the chances of success for the individual voice. Increasingly, the desired outcomes of blogging seem like a myth of the past, just like the romantic cowboy of the Wild West.

It’s not impossible, but the dream remains big while the real opportunity has become significantly more challenging. Great content is not enough.

The pioneering era prompted the rise of books like Naked Conversations with story after story of content marketing success, and folks like Sarah Lacy who espoused the theory that anyone could create their own successes online. And there was a time when these things were true.

Years later — whether it’s traditional print, video content, imagery or applications — individual voices find it difficult to break through, unless there’s a sudden new “green field” such as the iPad application marketplace one year ago. The social content market evolved and embraced power dynamics, mostly in the pursuit of monetization. Established power structures weigh down on newcomers, forcing them to navigate a much more complicated field of competitors.

The Weight of Established Social Media

Content Farming

Last week’s AOL acquisition of the Huffington Post thrust content farming back into the spotlight as a viable means of generating ad revenue. Whether they are actual content farms or editorially driven sites that harness collective paid content and free “guest” columns, these corporate sites dominate the top tier of content-producing social sites. Many of them are really vertical-specific digital publications running on a blog platform.

Publishing on these mega-content sites is often the only way for new writers without an established following to garner tens of thousands of eyeballs. But it’s a serious trade-off, sacrificing all copyright, search engine optimization (SEO), and the ability to create calls to action on one’s own site. Many writers use content farms to market their own blogs, or simply because they would rather have the eyeballs than launch a unique site.

Algorithm Sourced Social Content


Popularity-driven, algorithm-sourced content exists on almost every social networking site with a significant user base, from Facebook and Twitter to Delicious and YouTube. Thanks to Facebook’s Open Graph protocol (Like feature), algorithm-sourced content is now featured on many traditional 1.0 sites, too. These algorithms serve stories that have the highest probability of provoking engagement. Depending on the site, they even incorporate personal semantic data preferences to further encourage interaction.

The challenge for the new voice remains getting sourced by algorithms as a popular voice for content. This requires intense network development, interaction and hot content, much more so than in the open era of blogging’s initial days or even the first couple of years of Twitter and Facebook’s market availability. In the maturing market of 2011, new voices have significant organic network development hurdles to overcome. Either that, or they need a runaway hit to break them into the idea market.

Competing with the A List

It’s hard to find any social content marketplace that doesn’t have entrenched voices already. While none will admit to holding newcomers back, all will fight to maintain position. Further, these voices often have years of community building behind them. Their tactics for holding position include ignoring new voices, blackballing and punishing dissenting voices, and stealing content ideas and positions without attribution or cross-links. The rare winners highlight other voices, and welcome them.

If new voices are lucky, the existing blogging and content producing corps within their vertical lack strength in conversation. This allows for obvious differentiation. Otherwise, expect a thinly veiled dog fight.

Search Algorithms

Using social media to drive search has been a long-standing tactic for bloggers. The rise of personalized and semantic data-based search changes the picture. Like the algorithms driving popular content, these algorithms reward not only linking behavior, but also personal behaviors, social context (including tonality), and popularity.

This creates tremendous issues for new voices who have not built their networks yet. Stellar content needs to perform well to drive the linking behavior necessary to be sourced. Breaking through without a strong peer network to help out requires stellar content backed by great SEO practices, such as keyword usage and titling.

Immediate Social Network Referrals


Referred content continues to be a great source of readership. Many people trust their social networks to bring them the news they need to hear. While the 2011 Edelman Trust Barometer shows that we trust our peers less than we used to, this is still a crucial component of marketing content. In fact, as evidenced by the placement of algorithms, these referrals drive several tenets of the current content marketplace.

It’s not enough to write, produce and/or create anymore. Community centric content that drives two-way participation has become a must in 2011.


This assessment means to provide an accurate market picture of the competitive forces facing a new content effort. The 2011 social content marketplace requires a much stronger marketing effort behind it than in past years. Instead of the conditions of the pioneering days, new content creators find a rapidly maturing media marketplace with strong power structures.

Start-ups have faced big companies and smaller entrenched competitors as long as there have been free market economies. In that sense, the content farms and A-Listers represent the traditional challenges of an established market. The technology-charged online media environment of 2011 adds further hurdles for content creators, such as algorithms and social network referrals, all of which point to the need for savvy community marketing practices.

From traditional blogging practices and SEO to high powered social networking and visibility in top tier social content farms, new voices need to deploy a wide range of marketing tools to rise to the top. This becomes easier if the voice has traditional marketing strengths to leverage such as a house file of email contacts, and a functioning PR and events program. Integrating traditional marketing into social outreach creates greater opportunities for success.

How would you approach the modern social content marketplace?

How Social Semantic Search Defines People

(Cartoon by David G. Klein from the New York Times)

Search is the underpinning of the Internet today, from the 1 billion traditional searches every day on Google to providing references about a person on Twitter and delivering their stream feed on Facebook. Search has moved from simple page rank to an increasingly complex algorithm that weights social and semantic data points to deliver the outcomes most likely to please you. Personalization of search continues to evolve, but in turn it defines people and their choices.

Search — the technology itself — doesn’t bear responsibility for this. People do. People who use the Internet and its many free tools without understanding how the information is provided to them. They blindly accept search results or the search-based content feed without considering the source.

Consider the DecorMyEyes fiasco broken by the New York Times. Owner Vitaly Borker explained how he used intentionally created negative complaints about DecorMyEyes to game search results and place himself as a top ranked eyeglasses vendor. To Google’s credit, they promptly changed their algorithm to include more semantic weight (all negative or all positive disqualifying you), and the Department of Justice followed up with charges.

Social networks and applications also use search to source preferred content. Facebook’s activity feed is designed to source the most “interesting” content from people in your friends network, using the Open Graph API and likes. Search on Facebook is completely driven by the Open Graph (Like) protocol.

Of course, hashtags have demonstrated the power of search on Twitter. Twitter search was originally based on the acquired Summize search technology, and has been used to reference mentions and trends, too. Now Twitter (and other services) suggests people like you using semantic data.

The Danger of Homogeneous Definition

(Chart: organic click-through rates on Google, by Neil Walker)

The danger in all of this personalized search, particularly when it’s largely based on peer interests, is creating a society of homogeneous sycophants that blindly accept the content sourced to them, either via search or feeds. Lest we think that people actually think through the click, consider organic click-through rates on Google (as pictured above by SEO’s Neil Walker). Clicking through on the first few search results is and has been the norm.

The addition of local semantic data to search only further complicates concepts of popularity. Algorithms tell people which burger joints, music venues, theaters, etc. are most likely to meet their interests.

When popularity is defined by an algorithm and served to people, homogeneous or mob thinking becomes the norm. This thinking feeds on the popular. Society is not currently trained to question the information presented to it. Thus algorithms, designed to create the output that will generate the most click-throughs, become a critical determinant in defining people’s lives, and society as a whole.

That’s not necessarily a bad thing. Semantic information can weigh in when a system is gamed, and social search can provide the latest information based on people’s actual use and check-ins. However, idea markets are increasingly influenced by the popular, and not necessarily in a good way. Algorithms can keep bad ideas popular for longer periods of time.

It all points back to the need for society to teach better information skills. In an information economy, the ability to question and discern quality data presented via a plethora of media is an essential quality for democracy and individualism. It’s important to look deeper at online search, whether that’s because a search provided direct information or because an algorithm sourced a friend or influencer touting an idea or product. Quoting Doug Haslam, “Think for yourself. …you needn’t be part of some pack that can’t brook disagreement with your heroes.”

An educated Fifth Estate creates an evolutionary society; a mindless one creates results like Kim Kardashian as the number one search term on Bing for 2010. While many people find Kardashian attractive, should social semantic search tell every person, man and woman alike, what the icon of attractiveness is? Parents across America may object.

What do you think about how search and algorithms are defining our society?