Big Data analytics: data management moves centre-stage

Suddenly, analytics has become hot. While Business Intelligence is hardly new, a series of recent high-profile examples has alerted companies to the scale of the insights lurking in their datasets. As a result, tools such as dashboards, predictive analytics, machine learning and cluster analysis are becoming commonplace in a growing number of businesses.

But there’s a problem. And it’s one with which we at eBECS are all too familiar. Namely this: that all too often, the focus is on the visualisation tools that deliver the answers, rather than on the data management tools that actually enable those visualisation tools to provide answers in the first place.

In Excel terms, it’s a bit like lauding pivot tables, without first recognising that the starting point is a well-organised spreadsheet, free from missing values and data inaccuracies.
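To make the spreadsheet analogy concrete, here is a minimal sketch in Python of the clean-up that has to happen before any pivot-style summary can be trusted. The rows, field names and helper functions are all hypothetical, invented purely for illustration; they are not taken from any eBECS project.

```python
# Hypothetical rows standing in for a spreadsheet; all names and values are illustrative.
ROWS = [
    {"region": "North", "product": "towel", "units": "120"},
    {"region": "South", "product": "sheet", "units": ""},     # missing value
    {"region": "North", "product": "sheet", "units": "abc"},  # not a number
    {"region": "South", "product": "towel", "units": "80"},
]


def split_clean_and_rejected(rows, numeric_field="units"):
    """Separate rows whose numeric field parses as an integer from those that do not."""
    clean, rejected = [], []
    for row in rows:
        try:
            value = int(row.get(numeric_field) or "")
        except ValueError:
            rejected.append(row)  # keep the bad row for inspection, do not drop it silently
        else:
            clean.append({**row, numeric_field: value})
    return clean, rejected


def total_by(rows, key, numeric_field="units"):
    """A pivot-table style summary: sum the numeric field for each value of `key`."""
    totals = {}
    for row in rows:
        totals[row[key]] = totals.get(row[key], 0) + row[numeric_field]
    return totals


clean, rejected = split_clean_and_rejected(ROWS)
print(total_by(clean, "region"))
print(f"{len(rejected)} of {len(ROWS)} rows rejected")
```

Even in this toy case, half the rows never reach the summary, which is the point: the pivot is only as good as the data that feeds it.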

And as datasets routinely approach Big Data scale, this misplaced focus on visualisation rather than data management is set to become an ever-greater challenge.

20% visualisation, 80% data management

To date, it’s fair to say that this issue hasn’t been fully appreciated. And one reason isn’t difficult to see: a lot of businesses are still focusing on analysing simple sales-oriented datasets, which provide few data management challenges.

But broaden the focus away from sales-based datasets, and that’s far from the case. CRM datasets, for instance, while conceptually very similar to sales-based datasets, in fact turn out to be of an altogether greater order of complexity.

And start looking at datasets that aren’t primarily customer-facing at all—operations-based datasets, accessed through Internet of Things technologies, for example—and the data management challenge quickly dwarfs the visualisation and analytics challenge.

Put another way, we estimate that in such circumstances, the data management front-end of such a project should constitute about 80% of the focus, with the back-end analysis making up a mere 20%.

Not your father’s technology stack

But it’s fair to say that there’s another reason why businesses are struggling to engage with this data management challenge.

Which is this: that significant portions of it can’t be met—especially at Big Data scale—by conventional data warehousing solutions. And this simple fact rapidly takes many IT professionals out of their comfort and expertise zones.

Because the tools that do meet their data management requirements are likely to be tools that have emerged through the open source community—Apache Hadoop, for instance, or MongoDB, or even the R analytics language and its plug-in data handling and conversion packages.

And for IT executives long accustomed to staying close to their chosen vendor’s technology stack, such tools are often anathema, and generally firmly terra incognita.

Moreover, even when IT executives are happy to countenance the use of such tools for exploratory projects, or line-of-business workgroup tasks, they’re generally loath to see such open source tools operate at the enterprise level, on mission-critical or high-value tasks.

Big Data meets the Internet of Things

The good news: it doesn’t have to be that way.

For proof, look no further than a recent analytics solution built by Business Intelligence experts here at eBECS, one that sits firmly in Big Data and Internet of Things territory.

Built for FTSE 250 workwear and linen rental business Berendsen, the solution helps the company track some four million RFID tags attached to towels and sheets despatched to its customers from 30 plants nationwide, which collectively process a million items a day.

And at the heart of the solution: the Microsoft Azure platform’s cloud-based bundled instance of Apache Hadoop, a technology purpose-designed to scalably handle Big Data volumes at speed.

Hadoop in the Cloud—from Microsoft

Termed HDInsight, and fully supported by Microsoft, it can be deployed under Windows or Linux, can process unstructured, semi-structured, or structured data, and readily scales to petabytes on demand, yet is fully integrated with familiar Microsoft analysis and visualisation tools such as Excel and Power BI.

And so, at Berendsen, approximately 20 files of XML messages are sent to the Microsoft Azure Event Hub for processing every minute, each containing multiple XML transactions, collectively equating to over four transactions per second.
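As a rough illustration of what those numbers imply, the sketch below parses one batch file into individual transactions and works through the throughput arithmetic. The XML layout, element names and attribute names are assumptions made up for this example, since the real Berendsen message schema is not described here; only the figures of 20 files a minute and four transactions a second come from the text above.

```python
import xml.etree.ElementTree as ET

# Hypothetical batch layout: the real message schema is not published in this article.
SAMPLE_BATCH = """
<batch plant="P01">
  <transaction tag="RFID-0001" item="towel" event="despatched"/>
  <transaction tag="RFID-0002" item="sheet" event="despatched"/>
  <transaction tag="RFID-0003" item="towel" event="returned"/>
</batch>
"""


def parse_batch(xml_text):
    """Return one dict per <transaction> element, tagged with the batch's plant."""
    root = ET.fromstring(xml_text.strip())
    plant = root.get("plant")
    return [dict(tx.attrib, plant=plant) for tx in root.iter("transaction")]


def transactions_per_second(files_per_minute, transactions_per_file):
    """Average transaction rate implied by a file rate and a batch size."""
    return files_per_minute * transactions_per_file / 60


def min_transactions_per_file(files_per_minute, target_per_second):
    """Average batch size needed to reach a given per-second rate."""
    return target_per_second * 60 / files_per_minute


records = parse_batch(SAMPLE_BATCH)
print(len(records), "transactions in the sample batch")

# 20 files a minute at four transactions a second works out to 12 per file on average.
print(min_transactions_per_file(20, 4))
```

In other words, "over four transactions per second" from about 20 files a minute means each file carries more than a dozen transactions on average.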

Once processed into HDInsight, these XML messages are transferred to Microsoft Azure Blob Storage for subsequent off-line processing before being archived in the Microsoft Azure SQL Database—a cloud-based relational database as a service.

From an end user’s perspective, it’s undeniably the visualisation and back-end analytics tools, a combination of Microsoft Azure Stream Analytics and Power BI, that get the plaudits, delivering such things as dashboards on managers’ mobile devices.

But the real heavy lifting is carried out much earlier in the process, by Azure-based tools such as the Microsoft-supported HDInsight instance of Apache Hadoop.

Data management: ignore at your peril

The conclusion to draw from all this? To us, it’s inescapable: as datasets become larger, and data sources more diverse, the nuts and bolts of data management are going to become an increasingly central part of any analytics project.

So it’s vital to understand the underlying data management requirement, and select the right tool to handle it.

Because otherwise, pain and disappointment undoubtedly beckon.

Author: 
eBECS
