AtScale Blog

Joshua Klahr

Recent Posts

TECH TALK: BI Performance Benchmarks with Google BigQuery

Posted by Joshua Klahr on Apr 6, 2017

In the world of Business Intelligence and Big Data there continue to be a number of exciting innovations as new and improved options for processing large data sets appear on the market. You may be familiar with AtScale’s BI-on-Hadoop Benchmarks, where we focus on evaluating the top SQL-on-Hadoop engines and their fitness to support traditional BI-style queries. As we continue to work with customers who are navigating their journey to BI on Big Data, we are increasingly getting questions about the emerging cloud-based data processing engines.

In this blog post, we will take a deeper look at Google’s BigQuery and how it stacks up in the BI-on-Big Data ecosystem.

Read More

Topics: Business Intelligence, Big Data, olap, BI

Announcing AtScale 5.0: Rapid Innovation for our Customers

Posted by Joshua Klahr on Mar 7, 2017

CONTINUING OUR TRACK RECORD OF RAPID DELIVERY & INNOVATION

Today we announced the general availability of AtScale 5.0 and I couldn’t be more excited about the host of great new features included in this release. As we’ve gained traction in a number of industries, ranging from healthcare to retail to financial services to telco to online, we continue to learn from our customers and feed those learnings directly back into our product features. With the release of 5.0, AtScale customers now have an even richer set of capabilities that they can use to derive business insights and value from their Big Data investments. I’ve included some of the highlights of the release in the sections below.

Read More

Topics: Business Intelligence, Big Data, olap, BI

TECH TALK: Multi-Level Metric Analysis. Uncover the Hidden Relationships

Posted by Joshua Klahr on Mar 2, 2017

I’ve asked it before and I’ll ask it again. Wouldn’t it be great if you could easily analyze ALL your data from a single Excel file? We all know this isn’t feasible, especially when dealing with big data and complex business analytics needs.

In working at the intersection of Big Data and traditional Business Intelligence, the AtScale team has encountered a number of complex business analytics use cases that are difficult, if not near-impossible, to solve using typical table-based data models and SQL. Today, I’m going to share why and how complex analysis, such as multi-level metrics, is no longer as ‘difficult’ or ‘near-impossible’ as it once was.

Read More

Topics: Business Intelligence, Big Data, olap, BI

TECH TALK: Solving the Unrelated Dimension Dilemma. A Connect the Dots Story of Sorts.

Posted by Joshua Klahr on Feb 15, 2017

Wouldn’t it be great if you could load all of your data from a single file into an Excel pivot table for easy analysis? 

Unfortunately, this approach isn’t usually viable when dealing with complex business analytics and big data. Take for example a typical use case found in the world of healthcare insurance. A large insurance provider has tens of millions of members, and processes hundreds of millions of claims a year. As flexible as Excel is, we all know it won’t handle this volume or velocity of data.

As a result, more and more enterprises store large data sets in big data platforms like Hadoop. And while Hadoop provides a low-cost and performant way to store and process this information, there is still the challenge of supporting the many types of analytics required on claims and member data sets. But why? How, with all of the advances in technology, can a simple calculation cause so much complexity?

Read More

Topics: Business Intelligence, Big Data, olap, BI

The 6 Principles of Modern Data Architecture

Posted by Joshua Klahr on Nov 15, 2016

A version of this article originally appeared on the Cloudera VISION blog.

One of my favorite parts of my role is that I get to spend time with customers and prospects, learning what’s important to them as they move to a modern data architecture. Lately, a consistent set of six themes has emerged during these discussions. The themes span industries, use cases and geographies, and I’ve come to think of them as the key principles underlying an enterprise data architecture.

Whether you’re responsible for data, systems, analysis, strategy or results, you can use these principles to help you navigate the fast-paced modern world of data and decisions. Think of them as the foundation for data architecture that will allow your business to run at an optimized level today, and into the future.

Read More

Topics: Hadoop, Business Intelligence, Big Data, Hadoop Summit

TECH TALK:  BI-on-Hadoop Engine Wars Continue...Everybody Wins

Posted by Joshua Klahr on Oct 18, 2016

Just this week, AtScale published the Q4 Edition of our BI-on-Hadoop Benchmark, and we found 1.5X to 4X performance improvements across SQL engines Hive, Spark, Impala and Presto for Business Intelligence and Analytic workloads on Hadoop.

Bottom line, the benchmark results are great news for any company looking to analyze their big data in Hadoop because you can now do so faster, on more data, for more users than ever before.

While this blog provides a high-level summary of our findings, you can access the full Q4 2016 Edition of the BI-on-Hadoop Benchmarks here, and also listen to our webinar replay discussing this in more detail here.

Read More

Topics: Hadoop, Business Intelligence, spark, hive, bi-on-hadoop, Big Data, impala, presto

What to Watch for at Hadoop Summit San Jose (June 2016)

Posted by Joshua Klahr on Jun 23, 2016

With Hadoop Summit San Jose just around the corner, I thought it might be helpful to preview what to watch out for at the conference. In some ways, not much has changed in the past few months - streaming data is a hot topic, more and more people are adopting adjacent technologies (like Spark), and “in memory” is “in vogue” in the world of big data. However, a quick tour around the Hadoop Summit website reveals a few more trends that deserve some additional attention.

Read More

Topics: Hadoop, Business Intelligence, Big Data, Hadoop Summit

TECH TALK: Scale-Out Business Intelligence with Hadoop

Posted by Joshua Klahr on Apr 21, 2016

The growing popularity of big data analytics, coupled with the adoption of technologies like Spark and Hadoop, has allowed enterprises to collect an ever-increasing amount of data, in terms of both breadth and volume. At the same time, the need for traditional business analysis of these data sets using widely adopted tools like Microsoft Excel, Tableau, and Qlik still remains. Historically, data has been provided to these visualization front ends using OLAP interfaces and data structures. OLAP makes the data easy for business users to consume, and offers interactive performance for the types of queries that the business intelligence (BI) tools generate.

However, as data volumes explode, reaching hundreds of terabytes or even petabytes, traditional OLAP servers have a hard time scaling. To surmount this modern data challenge, many leading enterprises are now searching for the next generation of business intelligence capabilities, which fall into the category of scale-out BI. In this blog I'll share how you can leverage the familiar interface and performance of an OLAP server while scaling out to the largest of data sets.

And if you don't have time to read the whole thing, don't miss the 10-minute 'cliff-note' video of scale-out BI on Hadoop near the end.

 

Read More

Topics: Hadoop, Business Intelligence, spark, hive, bi-on-hadoop, Big Data

TECH TALK:  First-Child & Last-Child Measures in Hadoop

Posted by Joshua Klahr on Mar 24, 2016

As more and more enterprises adopt Hadoop as their next generation data platform, the demands of traditional enterprise workloads, including support for Business Intelligence use cases, are creating challenges.  While Hadoop excels at low-cost distributed storage and parallel data processing, interactive support for BI-style queries remains a challenge.  Additionally, multi-dimensional queries often demand complex OLAP-style calculations and functions.  In this post we will share how AtScale helps to bridge the gap between Business Intelligence users and data that resides in Hadoop.

In many typical business analyses or applications it is important to be able to directly query the first or last value of a particular metric across a hierarchy.  For example:

  • What was the starting or ending price of a security on a particular day?
  • What were inventory levels for a SKU at the beginning and end of the month?
  • What were the first and last payment amounts for a loan agreement?

Not Always as Easy as it Sounds

Executing such a query using SQL may involve complex queries consisting of unions, sub-queries, and/or temporary tables. In MDX (Multidimensional Expressions) such a query is easier to support, given MDX’s rich support for analytical queries and hierarchical representation. AtScale has implemented support for First Child and Last Child measures in a way that supports BOTH SQL and MDX clients, which means that virtually any data visualization client can take advantage of this advanced functionality.
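
To make the idea concrete, here is a minimal Python sketch of what a first-child and last-child measure computes for the security-price example above. It is only an illustration of the concept, not AtScale's implementation; the sample ticks, the security name ACME, and the function name first_last_by_day are all hypothetical. Unlike a sum or an average, the measure picks the value of the earliest and latest child (here, a timestamped tick) under each parent (here, a trading day).

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical intraday price ticks: (timestamp, security, price).
TICKS = [
    ("2016-03-23 09:30", "ACME", 10.00),
    ("2016-03-23 12:00", "ACME", 10.40),
    ("2016-03-23 16:00", "ACME", 10.25),
    ("2016-03-24 09:30", "ACME", 10.30),
    ("2016-03-24 16:00", "ACME", 10.90),
]


def first_last_by_day(ticks):
    """Return {(security, date): (first_price, last_price)}."""
    groups = defaultdict(list)
    for ts, security, price in ticks:
        when = datetime.strptime(ts, "%Y-%m-%d %H:%M")
        # The parent level of the hierarchy is the calendar day.
        groups[(security, when.date())].append((when, price))

    result = {}
    for key, rows in groups.items():
        # Order the children by time, then take the first and last values.
        rows.sort(key=lambda row: row[0])
        result[key] = (rows[0][1], rows[-1][1])
    return result


if __name__ == "__main__":
    for (security, day), (first, last) in sorted(first_last_by_day(TICKS).items()):
        print(security, day, "open:", first, "close:", last)
```

The same pattern applies to the inventory and loan examples: group by the parent level of the hierarchy, order the children, and keep only the two endpoints.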

Read More

Topics: Hadoop, Business Intelligence, spark, hive, bi-on-hadoop, Big Data

Unprecedented Concurrency with AtScale and Cloudera Impala

Posted by Joshua Klahr on Sep 23, 2015

Just last week Cloudera released some impressive performance numbers showing how the Impala SQL-on-Hadoop engine scales to support concurrent query workloads. The Cloudera blog post confirms what we at AtScale have experienced with real-world customer installations – that Impala plus AtScale is a scalable solution for running concurrent, interactive business intelligence (BI) queries on Hadoop.

Read More
