Why organisations are spending time at the lakehouse

James Wright, Senior Director, Asia Pacific and Japan, Cloudian, warns that the lack of broad understanding of ‘spending time at the lakehouse’ won’t be a viable excuse for risking data security or breaking compliance rules and regulations around sovereignty.

First there were data lakes, then data warehouses, now – the inevitable data lakehouse.

Most people don’t really understand what any of these systems are, and most definitions do little to change that.

But to put it in terms of value people can understand: it’s about storing and managing (you guessed it) data more easily and cheaply, and creating value from it.

Organisations have been storing unfathomable amounts of data on their business, customers, partners, and other areas for years now. The latest estimates indicate 0.33 zettabytes or 328.77 billion gigabytes are created every single day.

That’s the equivalent of more than 164 billion hours of HD video or nearly half a trillion physical filing cabinets worth of paper, every day.

Cloud and data centre directory Cloudscene says Australia now has more than 300 data centres crunching our share of that pie, and that doesn’t account for the many thousands of on-premises and edge data centres across the country.

Still, we’ve done very little to extract value out of all that data, and the more we add to the pile the harder it can become to do so.

Essentially, the data lakehouse promises to change that, with companies such as HPE, Snowflake and Vertica leading the revolution. There is a whole stack of technologies, applications, databases, storage components and other data ‘things’ involved that ingest data from diverse sources, but it’s easier to understand through an example.

Imagine an Australian telco wants to quickly pull up every single transcript of customer service calls and web chats that have occurred in the last year to assess key issues, customer sentiment, and the effectiveness of the company’s response.

A human would take years to do that, whereas an older IT system might take days, weeks or months, or be unable to do it at all. Even if it could, it would come at enormous cost, both in real terms and to the IT capacity the telco needs for a range of business functions.

The right data lakehouse, with a touch of AI, can pull up that data and query the issues, sentiment and responses in minutes, at little financial or resource cost. Business leaders can look at it, think ‘huh, interesting’ and make some real decisions based on it. The IT team can move on to the next query.
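To make that concrete, below is a minimal sketch of the kind of query a lakehouse makes cheap. It assumes a Spark-based lakehouse; the table name and columns (support.transcripts, with channel, created_at, sentiment_score and resolved) are hypothetical placeholders, not any particular vendor’s schema.

```python
# A minimal sketch of the telco query above, assuming a Spark-based
# lakehouse. Table and column names are illustrative placeholders.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("transcript-review").getOrCreate()

# One declarative query scans a year of call and chat transcripts and
# summarises sentiment and resolution rates per channel and month.
summary = spark.sql("""
    SELECT channel,
           date_trunc('month', created_at)           AS month,
           avg(sentiment_score)                      AS avg_sentiment,
           avg(CASE WHEN resolved THEN 1 ELSE 0 END) AS resolution_rate,
           count(*)                                  AS interactions
    FROM support.transcripts
    WHERE created_at >= date_sub(current_date(), 365)
    GROUP BY channel, date_trunc('month', created_at)
    ORDER BY month, channel
""")

summary.show()
```

In practice the sentiment scores would come from an AI model applied as transcripts are ingested; the point is that the year-long review itself becomes a single, cheap, declarative query.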

But as with all things technology, and particularly our data, issues like security and sovereignty must be top of mind.

The advent of AI, data lakehouses and everything that comes with them means that more and more data is going to be pulled up, analysed, and shared within and between organisations.

Data is the crown jewel cybercriminals are after, and organisations need to be careful about where that data is stored, who has access, and where and how it’s backed up.

These technologies will only increase the value of our data, which is already any organisation’s most valuable resource. That means ransomware gangs will demand even more to release the data they steal and encrypt.

There are compliance standards to be mindful of too, such as the OAIC’s Consumer Data Right system which requires businesses to follow strict information security requirements around governance, minimum system controls, testing, monitoring, evaluation and reporting.

And as the Government moves to reshape the cyber legislative landscape to support the Cyber Security Strategy, new requirements may come into effect.

It’s also well established in the cyber security industry that there’s no way to fully shut out cybercriminals, which means organisations using more data for more uses need to think creatively.

They have to create an immutable version of their key data, one that can’t be encrypted or deleted, and that stays out of cybercriminals’ reach even after they infiltrate the network.

Data sovereignty is important in this context as data lakehouses regularly use public cloud infrastructure to host data, meaning that data can be removed from Australian shores and be accessible from abroad.

Immutable copies of sensitive and critical datasets should be kept on-premises, regardless of the rounds they’re doing at the lakehouse.
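As an illustration, here is a minimal sketch of how such an immutable copy might be written to an on-premises, S3-compatible object store using the standard S3 Object Lock API, which Cloudian’s HyperStore, among others, supports. The endpoint URL, bucket, credentials, file names and retention period are illustrative assumptions, and the bucket would need to be created with Object Lock enabled.

```python
# A minimal sketch of writing an immutable backup copy to an on-premises,
# S3-compatible object store via S3 Object Lock. Endpoint, bucket,
# credentials and file names are placeholders; the bucket must have been
# created with Object Lock enabled.

from datetime import datetime, timedelta, timezone

import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://objectstore.internal.example.com",  # hypothetical on-prem endpoint
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

# COMPLIANCE mode means no one, not even an administrator or an attacker
# holding stolen admin credentials, can overwrite or delete the object
# until the retention date passes.
with open("customer-db-snapshot.bak", "rb") as snapshot:
    s3.put_object(
        Bucket="crown-jewel-backups",
        Key="customer-db/2024-06-01.snapshot",
        Body=snapshot,
        ObjectLockMode="COMPLIANCE",
        ObjectLockRetainUntilDate=datetime.now(timezone.utc) + timedelta(days=90),
    )
```

Because retention is enforced by the object store itself, an attacker who compromises the application layer still can’t encrypt or delete the locked copy before it expires.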

With this in place, companies can restore their crown jewel data without paying the ransom and keep operating, selling, and using tools like AI and data lakehouses to help them do it.

Whether they fully grasp the lakehouse world or not, Australian organisations need to understand that security and sovereignty must be prioritised when spending time there.
