In the face of data complexity, IT architects are responsible for making sure the backend data plumbing is well-conceived, so data teams across the organization can effectively leverage data. François Sergot of Dataiku offers a guide for IT architects on how to navigate the rise of data democratization.
IT architects work with teams across the organization, including data teams, to manage data integration globally – receiving data from and pushing it into multiple sources, connecting sources together, and so on. Although IT architects are not involved in the actual data science projects, their job is to make sure the systems that data teams are using work, as well as to understand the business implications of these resources.
These key players (and their greater IT teams) need to have their finger on the pulse of the entire technology stack as it evolves and course-correct whenever necessary.
Despite challenges associated with data democratization and tasks like data integration and processing, IT architects can ease their own burden by using collaboration-driven data science tools that mitigate risk through more holistic data integration approaches.
What does ‘orchestrate data efforts’ really mean?
According to a recent survey of 200 IT executives, most companies have more analysts than data scientists, a finding that demonstrates the need for AI to be accessible to a wider population within the enterprise. As businesses take note and swiftly charge into the age of data democratization and race to adopt an Enterprise AI strategy, the number of people working with data inside an organization is skyrocketing.
While this accessibility and growing use of data is a positive sign, it marks a particularly challenging moment for IT teams as they work to keep pace with this accelerated demand, ensure systems are functioning as they should and maintain security across the organization. It is worth noting that while data democratization can cause more work for IT architects, it is not their primary concern. Integration is. IT architects work with teams across the organization, including data teams, to manage integration globally.
The good news is that IT architects can leverage data science tools to make their jobs easier: such tools allow people to gather data themselves (even from multiple sources) and merge it without IT intervention. For that to be possible, though, IT architects need to ensure the proper security protocols and systems are in place, so they still have a critical role to play in the process.
Data orchestration involves automating the process of taking siloed data from multiple data storage systems and locations, combining it, and making it available for analysis and insight extraction. This orchestration is pivotal in today’s enterprise, given the rising complexity of the data landscape. This is mainly due to the variety of data repositories, the growing use of alternative and unconventional data pools and the use of hybrid infrastructure – not to mention the fact that this process likely carries different meaning for different companies.
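As a rough illustration of that definition, the minimal Python sketch below pulls records from two siloed sources, combines them, and loads the result into one queryable table. Everything in it is an assumption made for the example: the CRM export, the billing dump, the `customer_revenue` table and the function names are hypothetical, and an in-memory SQLite database stands in for whatever analytics store an organization actually uses.

```python
import csv
import io
import json
import sqlite3

# Hypothetical siloed sources: a CSV export from a CRM and a JSON dump from billing.
CRM_CSV = """customer_id,name
1,Acme
2,Globex
"""
BILLING_JSON = (
    '[{"customer_id": 1, "amount": 120.0},'
    ' {"customer_id": 1, "amount": 30.0},'
    ' {"customer_id": 2, "amount": 75.5}]'
)


def extract_crm(text):
    """Extract: parse the CRM export into a list of customer records."""
    reader = csv.DictReader(io.StringIO(text))
    return [{"customer_id": int(r["customer_id"]), "name": r["name"]} for r in reader]


def extract_billing(text):
    """Extract: parse the billing dump into a list of invoice records."""
    return json.loads(text)


def combine(customers, invoices):
    """Transform: sum invoice amounts per customer and join them to the CRM names."""
    totals = {}
    for inv in invoices:
        totals[inv["customer_id"]] = totals.get(inv["customer_id"], 0.0) + inv["amount"]
    return [(c["customer_id"], c["name"], totals.get(c["customer_id"], 0.0)) for c in customers]


def load(rows, conn):
    """Load: write the combined rows to a single table that analysts can query."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer_revenue "
        "(customer_id INTEGER PRIMARY KEY, name TEXT, total REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO customer_revenue VALUES (?, ?, ?)", rows)
    conn.commit()


def run_pipeline(conn):
    """Run the three steps in order; a scheduler would normally trigger this."""
    rows = combine(extract_crm(CRM_CSV), extract_billing(BILLING_JSON))
    load(rows, conn)
    return rows


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    run_pipeline(conn)
    for row in conn.execute("SELECT name, total FROM customer_revenue ORDER BY customer_id"):
        print(row)
```

In practice the sources would be live systems rather than inline strings, and the automation would come from a scheduler or a data science platform rather than a script run by hand. The shape of the work is the same, though: extract from each silo, combine, and publish to one place.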
Extract, Transform, Load (ETL)
IT architects usually bear the brunt of ETL tasks. However, it doesn’t need to be that way. By equipping data science users with a data science tool that can handle everything data-related, IT architects can put data integration into the hands of the many, including those working closely with the data for a given data science project.
According to Dataiku’s survey of IT leaders, the majority of companies perform data preparation and ETL in one system and machine learning in another, which can cause extra work for IT.
There are many benefits to implementing a data science tool with ETL capabilities, notably the fact that it enables IT architects to avoid going through multiple tools for a snapshot of where and how data is flowing. It can also help organizations maintain a consolidated view of the data that fuels business decisions (making it easier for folks like analysts to examine and report on data relevant to their projects). Further, it fosters consistency by ensuring that data policies, or even the data itself, are consistent across each data location.
Security in the age of data democratization
IT architects are heavily involved in compliance and cybersecurity initiatives across the organization. As more data becomes accessible, their role becomes increasingly critical in order to make sure policies are enforced and audits can be carried out easily and without any bad surprises (e.g. for GDPR or ISO 27001). Without a unified workspace, those policies and audits can become extremely complex and require a lot of time both from IT architects and the auditors.
Using a unified data science tool can make a significant difference when it comes to securing data. One additional important advantage is that such a tool greatly mitigates the risk added by the use of multiple cloud solutions. The shift from pure on-premises infrastructure to hybrid or pure cloud has undeniably brought many advantages to companies, but it also adds risk.
With data accessed through a unified, secured and audited workspace, companies can benefit from cloud platforms without increasing their risks. As an example, an IBM report estimates the average cost of a data breach to be US$3.92 million, demonstrating the importance of a robust data governance plan (which should always include data quality and security).
Successful data orchestration is the key to extracting impactful insights
There’s no arguing that, individually, instruments sound nice when they are played by someone who knows how to play them. However, when orchestra instruments are played together in unison, they go a step further than just sounding nice – they give the conductor a job and allow the audience to watch the one-of-a-kind performance, all from one stage.
Similarly, data orchestration takes disparate datasets from different systems and locations, brings them together, and makes them available for the extraction of deeply informed insights (which can then help business leaders make decisions that have real-world impact).
In an era of data democratization, the role of IT architects is pivotal to the true success of data projects: they support and provide transparency around both self-service analytics (SSA) and operationalization (o16n) initiatives, ensure elasticity and security, and provide data accessibility to a wide variety of users.