The Semantic Web and the Big Data Revolution

So what's the connection between the Semantic Web and Big Data? Big Data is the new challenge for all kinds of IT operations. The pressure is on to act smart by leveraging connections between disparate data sources.
In this series of articles, I hope to show that Semantic Web technologies hold the promise of integrating data from a variety of sources: structured data sitting in relational databases, semi-structured data in XML and JSON, and the huge influx of unstructured data coming from social network feeds, transaction logs, and the audio and video streaming in from everywhere.

Big Data

Let's begin with Big Data. What are we talking about here? Something more than lots of data. It's BIG because it's the UNION of three different universes of data.
  1. Structured Data. It's what you find in relational databases: tables with schemas that tell you what kind of data to expect in the fixed rows of your relational tables. Got Oracle? MySQL? You've got structured data.
  2. Semi-structured Data. In practice, XML or JSON. In theory, data that carries with it some descriptive meta-data that says something about what you've got. For XML it's the element and attribute names. For JSON it's the name part of the name-value pairs.
  3. Unstructured Data. The stuff of server logs, audio and video streams, paragraphs of text, bitstreams of all kinds with no inherent meta-data, semantic or structural.
The challenge of Big Data is how to make connections between these different universes, each with its own technology subculture.
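
To make the three universes concrete, here is a rough Python sketch of one and the same fact showing up in each of them. Everything in it is made up for illustration: the customer table, the field names, and the log line format are assumptions, not anything taken from a real system. It only uses the standard library.

```python
import json
import re
import sqlite3

# 1. Structured: the schema lives in the database, separate from the rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER, name TEXT, city TEXT)")
conn.execute("INSERT INTO customer VALUES (1, 'Ada', 'London')")
cursor = conn.execute("SELECT id, name, city FROM customer")
columns = [col[0] for col in cursor.description]  # names come from the schema
structured = dict(zip(columns, cursor.fetchone()))

# 2. Semi-structured: the names travel along with the values.
semi_structured = json.loads('{"id": 1, "name": "Ada", "city": "London"}')

# 3. Unstructured: no names at all, so we have to impose a pattern ourselves.
#    (Hypothetical log format, invented for this example.)
log_line = "customer 1 (Ada) logged in from London"
match = re.search(r"customer (\d+) \((\w+)\) logged in from (\w+)", log_line)
unstructured = {
    "id": int(match.group(1)),
    "name": match.group(2),
    "city": match.group(3),
}

# All three end up describing the same customer.
print(structured == semi_structured == unstructured)
```

Notice where the meaning comes from in each case: the table definition, the names embedded in the JSON, and a hand-written pattern. Getting to a common shape took three different tools, which is the integration problem in miniature.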

Data rules, and Big Data rules big time.
