AtScale Blog

BIG BI: Data isn’t a Process. It’s an Asset.


Data. It isn't a process. It's an asset.

Welcome to the first in a series of eight blogs, where I will dive in to separate and clarify the concepts and relationships across Business Intelligence, its active component OLAP, predecessor technologies, and data. OLAP in particular has suffered from issues of scale and speed, but the need for this type of analysis is greater than ever. And while the analytics industry has been overrun with big data and data science, there is a general lack of understanding that the previous drawbacks of BI and OLAP have been solved by the new architecture of big data: Hadoop/Spark and the cloud.

Join me as I uncover both the vital need for multi-dimensional analysis and the vastly improved capabilities that exist for the big, nay massive, data of today. I believe you will find some rather interesting surprises.

So let me repeat. Data. It isn't a process. It's an asset.

How many layer-cake data-flow, architecture, and process-flow diagrams have you seen? In almost every one, data is invisible except as a cylinder (or trash can) or as an arrow. What's in the arrow? And where are all the arrows going? All of these diagrams are devised from a technical point of view, not a practical one. And they don't address one fundamental question...what's the point of all this data?

That’s one of those questions that has one simple answer and a million more complicated ones, but the simple one is...to inform decision-making.

What’s Missing?

What is missing from virtually every diagram I've ever seen is just how all of that information technology actually informs decision-making. Looking at them, one would assume that after all the transfers, cleansing, integration, modeling, optimization, and scale-up/scale-out, some miracle happens and analysts are rewarded with knowledge like an oyster exuding pearls. But, in my experience, that isn't what really happens.

What Really Happens

When the triple threat of Big Data/Cloud/Hadoop hit the world like a tsunami, attention to 'how' data informs decision-making was inundated with options. All eyes focused on 'data scientists,' who, with their combined programming, statistical, and subject-matter skills, were the new data alchemists. The problem was that the domain of data scientists' investigations was very different from what everyone else in the organization was doing. In fact, the 'old' way of making decisions, using data warehouses and Business Intelligence, became so derided as old-school that it wasn't even discussed. After all, the 'old' way didn't deal well with the scale of big data, and it didn't involve mysterious practices like Machine Learning or Support Vector Machines. And existing business analysts were excluded from the party because they didn't know Java, R, or SQL. No; instead they used tools that didn't even require coding or scripting; they used Business Intelligence (BI).

Business Intelligence is actually a terrible term, in my opinion, because it isn't always about business, and 'intelligence' implies something passive. When analysts in organizations such as businesses, governments, and NGOs need to understand what happened, what is happening, and what they should do about it, they turn to tools grouped under the heading of BI. But there is a vast array of tools, types, and capabilities under that heading. One of them is OLAP.

OLAP. Old-School?

OLAP is a term coined in 1993 as an acronym for On-line Analytical Processing. It is an archaic term, because everything is online today and no one really talks about 'processing.' OLAP is multi-dimensional analytics. It looks at the world as an arrangement of dimensions (time, part, product, flight, etc.), facts (numbers), hierarchies (what rolls up to what), and extremely powerful calculations across all of those elements. Most importantly, it is interactive, so much so that you can literally zoom through all of these elements at will. The best part is that it is far easier to use than to explain. Applications of OLAP range from simple Sales by Month, YTD, and by Region to extensive KPIs, allocations, and market-basket analysis, to name a few. It can handle time series and spatial data, and many OLAP tools include specialized libraries for finance, statistics, supply chain, and more.
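To make the dimension/fact/roll-up idea concrete, here is a minimal sketch in Python with pandas, standing in for any particular OLAP engine. The table and its column names (region, month, product, amount) are made up for illustration; the point is just to show dimensions, a fact, and a Sales by Month, by Region view with roll-up totals:

```python
import pandas as pd

# A tiny fact table: each row is a sale, with dimension columns
# (region, month, product) and a numeric fact (amount).
sales = pd.DataFrame({
    "region":  ["East", "East", "West", "West", "West"],
    "month":   ["Jan", "Feb", "Jan", "Jan", "Feb"],
    "product": ["Widget", "Widget", "Widget", "Gadget", "Gadget"],
    "amount":  [100, 150, 200, 75, 125],
})

# Sales by Month, by Region: one dimension on rows, one on columns,
# the fact summed at each intersection. margins=True adds the "All"
# row and column, i.e., the roll-up of each hierarchy.
cube = pd.pivot_table(
    sales,
    values="amount",
    index="region",
    columns="month",
    aggfunc="sum",
    margins=True,
)
print(cube)
```

In a real OLAP tool, of course, none of this is typed; the analyst simply drags the dimensions into place.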

If you want to earn the disdain of a big data poser, just mention OLAP. To a big data person, OLAP represents everything that is wrong with computing. They will tell you it can't scale, that it relied on prebuilt structures of only aggregated data, and that OLAP "cubes" have to be rebuilt from scratch for even minor changes. All of these things were true of previous generations of OLAP, but it has all changed because, ironically, BIG DATA CHANGED IT. Using big data technologies, new ways of providing OLAP eliminate all of the old drawbacks.

What OLAP-haters won’t mention, mainly because they don’t completely understand it, is the true magic of this old, but oh so powerful, BI functionality.

The magic of OLAP is that it works the way people think.

The Magic of OLAP

OLAP arranges things by attributes/dimensions/hierarchies, treats them as NAVIGATION points and filters, and instantly calculates numbers (metrics). Using the concepts of OLAP, one can arrange data in multidimensional arrays with a point and a click. No code. Just click. And the real magic happens when you make subsequent analyses based on the previous ones and follow a trail that you can navigate forward or backward, drilling into detail, rolling up, pivoting, and more. Try that in Java. Or SQL. Yeah, good luck with that.
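As a rough sketch of what that navigation trail looks like, here are the same moves (roll up, drill down, pivot) against the toy sales table from the earlier sketch, with pandas again standing in for an OLAP tool. In an OLAP front end each step is a single click, not a line of code:

```python
# Reuses the `sales` DataFrame and pandas import from the sketch above.

# Roll up: total sales per region (the coarsest grain).
by_region = sales.groupby("region")["amount"].sum()

# Drill down: follow the trail from a region of interest
# into its monthly detail.
east_by_month = (
    sales[sales["region"] == "East"]
    .groupby("month")["amount"]
    .sum()
)

# Pivot: swap which dimension sits on rows vs. columns,
# without touching the underlying data.
by_product = pd.pivot_table(
    sales,
    values="amount",
    index="product",
    columns="region",
    aggfunc="sum",
)
print(by_region, east_by_month, by_product, sep="\n\n")
```

The point is the sequence: each result suggests the next question, and the dimensional model lets you ask it without starting a new query from scratch.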

So, in summary, I've found that many who are busy with big data have missed a major point when it comes to deriving value from their big data assets: they don't really understand how people do their jobs and how they think. If you want to use data to run your business and to drive your decisions, you need to deliver interactive, navigable data tools that work the way people think.

And here is a dirty little side secret: most of the output from innovative data science is ultimately presented in BI tools anyway. And guess what those BI tools, including visualizations, use to enable business-user navigation and decisions? You guessed it: measures/dimensions/attributes (aka OLAP).

As you look to the future, to innovation, and to the technology that will support you in this big data world, make sure not to throw the baby out with the bathwater. Old doesn't always mean old-school. Old can mean experienced, tried-and-true, and proven.

A Lesson to Learn

The big lesson here is not to confuse the 'tool' with the 'process.' OLAP emerged because it met a need, and that need didn't disappear when technology leapt ahead with big data and data science. Big data and the attendant physical resources it brings are mostly about analytics, not operations. Data flowing into Hadoop and massively parallel databases is 'used' data: secondhand, sourced from other systems, or even streaming from sensors. That data is there for analysis, for data science, and to drive an unlimited number of data-driven applications not possible before. And for all the innovative companies wanting to be data-driven, yet swimming in oceans of big data accessible only to a limited population of the ever-elusive data scientist, BI and OLAP are getting a turbo-boost of energy.

Coming Soon...

I'll describe in deeper detail how it all (BI, OLAP, big data...) fits together in the rest of this blog series. So remember to subscribe to get the next one, titled 'Running with the Red Queen.'

A Conversation, Not Just a Post

Have a thought or response to this post? Please comment. This world of BI and data is ever-evolving, and I welcome the chance to hear thoughts and perspectives that broaden the conversation.

Until next time, ~ Neil Raden

 

About the Author: Neil Raden is an author, consultant, industry analyst and founder of Hired Brains Research, based in Santa Fe, NM. He has a passion for analytics based on decades of experience and strives to express it through his work in writing, speaking and advising clients.  Neil began his career as an actuary with AIG. In 1985, he started Archer Decision Sciences, consulting on analytics projects for Fortune 500 companies. Archer was one of the first to develop large-scale data warehouses and BI environments. In 2003, Neil expanded into a role as an industry analyst, publishing over 50 white papers, hundreds of articles, blogs, keynote addresses and research reports. He is also co-author of the book, Smart (Enough) Systems, about decision automation systems driven by predictive analytics. Please feel free to contact Neil directly at nraden@hiredbrains.com


Topics: Hadoop, Business Intelligence, Big Data
