The rise of multi-cloud, data-first architecture, and the broad portfolio of advanced data-driven applications that has arrived as a result, requires cloud data management systems that collect, manage and govern enterprise data and build pipelines for it. Cloud data management architectures span private, multi-cloud and hybrid cloud environments and connect to data sources not just in transaction systems, but also on file servers, the Internet and multi-cloud repositories.
The scope of cloud data management includes enterprise data lakes, enterprise archiving, enterprise content services and consumer data privacy solutions. These solutions address the utility, risk and compliance challenges of storing large amounts of data.
Cloud Data Platforms
Cloud data platforms are the centerpiece of cloud data management programs and provide uniform data collection and data storage at the lowest cost. Archives, data lakes, and content services enable cloud migration projects to connect, ingest, and manage any type of data from any source. For instance, cloud data platforms collect legacy and real-time data from mainframes, ERP, CRM, file stores, relational and non-relational databases, and even SaaS environments like Salesforce or Workday.
Studies have shown that data is accessed less frequently as it ages. Current data, such as online data, is accessed most frequently, but after two years most enterprise data is hardly ever accessed. As data growth accelerates, the load on production infrastructure grows and it becomes harder to maintain application performance.
Application portfolios should be screened regularly for legacy applications that are no longer in use, and those applications should be retired or decommissioned. In addition, historical data from production databases should be archived to improve performance, optimize infrastructure and reduce overall costs. Information Lifecycle Management (ILM) should be used to establish data governance and compliance controls.
Enterprise archiving supports all enterprise data, including databases, streaming data, file servers and email. Using ILM, enterprise archiving moves less frequently accessed data from production systems to nearline repositories. The archived data remains highly accessible and is stored in low-cost buckets. Large organizations operating silos of file servers across departments and divisions use enterprise archiving to consolidate these silos into a unified and compliant cloud repository.
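As a rough illustration of the ILM rule described above, the following sketch splits records into those that stay in production and those that move to a nearline archive. The Record type, the split_for_archive function and the two-year threshold are assumptions made for this example, not part of any particular archiving product.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumed threshold: data untouched for roughly two years is an archive candidate.
ARCHIVE_AFTER = timedelta(days=2 * 365)


@dataclass
class Record:
    record_id: str
    last_accessed: date


def split_for_archive(records, today):
    """Return (keep_in_production, move_to_archive) based on last access date."""
    keep, archive = [], []
    for rec in records:
        if today - rec.last_accessed > ARCHIVE_AFTER:
            archive.append(rec)
        else:
            keep.append(rec)
    return keep, archive


records = [
    Record("invoice-001", date(2019, 3, 1)),
    Record("invoice-002", date(2023, 11, 15)),
]
keep, archive = split_for_archive(records, today=date(2024, 1, 1))
print([r.record_id for r in archive])  # ['invoice-001']
```

A real policy would usually combine age with other criteria, such as record type or regulatory retention class, rather than rely on last access date alone.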
Enterprise Data Lake
Data-driven enterprises rely on vast and complex networks of data and services, and enterprise data lakes deliver the connections necessary to move data from any source to any target location. Enterprise data lakes handle very large volumes of data and scale horizontally on commodity cloud infrastructure to deliver data pipeline and data preparation services for downstream applications such as SQL data warehouses, artificial intelligence (AI) and machine learning (ML).
A data pipeline is a series of data flows in which the output of one element is the input of the next. Data lakes serve as the collection and access points in a data pipeline and are responsible for data organization and access control.
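The chaining idea can be sketched in a few lines of Python. The stage functions here (collect, drop_empty, to_upper) and the run_pipeline helper are hypothetical names chosen for illustration; a production pipeline would typically use a dedicated orchestration tool.

```python
from functools import reduce


def run_pipeline(stages, data):
    """Feed data through each stage in order; each output is the next input."""
    return reduce(lambda acc, stage: stage(acc), stages, data)


def collect(rows):
    # Stage 1: trim surrounding whitespace from raw input.
    return [r.strip() for r in rows]


def drop_empty(rows):
    # Stage 2: discard rows that are empty after trimming.
    return [r for r in rows if r]


def to_upper(rows):
    # Stage 3: normalize case for downstream consumers.
    return [r.upper() for r in rows]


result = run_pipeline([collect, drop_empty, to_upper], [" alpha ", "", "beta"])
print(result)  # ['ALPHA', 'BETA']
```

Because each stage only depends on the output of the previous one, stages can be added, removed or reordered without changing the others.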
Data preparation makes data fit for use by improving data quality. Data preparation services include data profiling, data cleansing, data enrichment, data transformation and data modeling. Built on open source and industry standard technology, enterprise data lakes safely and securely collect and store large amounts of data for cloud migration, and provide enterprise-grade services to explore, manage, govern, prepare and control access to the data.
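To make three of these services concrete, here is a minimal sketch of profiling, cleansing and enrichment over a list of records. The field names, the COUNTRY_TO_REGION lookup and the function names are all assumptions for the example; real data preparation tools offer far richer rules.

```python
# Hypothetical lookup table used for enrichment.
COUNTRY_TO_REGION = {"DE": "EMEA", "US": "AMER"}


def profile(rows):
    """Profiling: count missing (None or empty) values per field."""
    missing = {}
    for row in rows:
        for key, value in row.items():
            if value in (None, ""):
                missing[key] = missing.get(key, 0) + 1
    return missing


def cleanse(rows):
    """Cleansing: normalize emails and country codes, drop rows without an email."""
    cleaned = []
    for row in rows:
        email = (row.get("email") or "").strip().lower()
        if not email:
            continue
        country = (row.get("country") or "").strip().upper()
        cleaned.append({"email": email, "country": country})
    return cleaned


def enrich(rows):
    """Enrichment: add a region derived from the country code."""
    return [
        {**row, "region": COUNTRY_TO_REGION.get(row["country"], "UNKNOWN")}
        for row in rows
    ]


raw = [
    {"email": " Ana@Example.com ", "country": "de"},
    {"email": "", "country": "US"},
    {"email": "bo@example.com", "country": None},
]
print(profile(raw))  # {'email': 1, 'country': 1}
prepared = enrich(cleanse(raw))
print(prepared[0])
```

Profiling runs on the raw data so that quality problems are measured before cleansing hides them.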
Enterprise Content Services (ECS)
Corporate file shares are overflowing with files and long-abandoned data. Enterprise Content Services collect and store historical enterprise data that would otherwise be spread out across various islands of storage, on personal devices, file shares, Google Drive, Dropbox, or personal OneDrives. Organizations planning cloud data migration to tackle content sprawl should consider ECS for secure and compliant file storage at the lowest cost. Cloud data migration with ECS consolidates enterprise data onto a single platform and unifies silos of file servers, helping organizations become more efficient and reduce costs.
Consumer Data Privacy
Consumer data privacy regulations are proliferating, with nearly 100 countries now adopting regulations. The California Consumer Privacy Act (CCPA) and Europe’s General Data Protection Regulation (GDPR) are perhaps the best known laws, but new regulations are on the rise everywhere as security breaches, cyberattacks and unauthorized releases of personal information continue to grow unabated. These new regulations mandate strict controls over the handling of personally identifiable information (PII), yet variations across geographies make legal compliance a complex requirement.
Information Lifecycle Management (ILM) manages data throughout its lifecycle and establishes a system of controls and business rules, including data retention policies and legal holds. Security and privacy tools like data classification, data masking and sensitive data discovery help data administrators achieve compliance with data governance standards and regulations such as NIST 800-53, PCI, HIPAA, and GDPR. Consumer data privacy and data governance are not only essential for legal compliance; they improve data quality as well.
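The interaction between retention policies, legal holds and data masking can be illustrated with a short sketch. The ArchivedRecord type, the may_purge rule and the mask_email format are hypothetical and simplified; actual retention periods and masking requirements depend on the applicable regulation.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ArchivedRecord:
    record_id: str
    created: date
    retention_years: int
    legal_hold: bool = False


def may_purge(record, today):
    """Allow purging only after retention expires and when no legal hold applies."""
    if record.legal_hold:
        return False
    expiry_year = record.created.year + record.retention_years
    try:
        expires = record.created.replace(year=expiry_year)
    except ValueError:
        # Created on Feb 29 and the expiry year is not a leap year.
        expires = record.created.replace(year=expiry_year, day=28)
    return today >= expires


def mask_email(email):
    """Masking: keep the first character and the domain, hide the rest."""
    local, _, domain = email.partition("@")
    return local[:1] + "***@" + domain


today = date(2024, 1, 1)
print(may_purge(ArchivedRecord("r1", date(2015, 1, 1), 7), today))        # True
print(may_purge(ArchivedRecord("r2", date(2015, 1, 1), 7, True), today))  # False
print(mask_email("ana@example.com"))  # a***@example.com
```

Checking the legal hold first means a hold always overrides an expired retention period, which is the usual intent of such a control.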
What’s The Urgency?
Exponential data growth is well known, but enterprises have only begun to feel its implications in the last couple of years. On the one hand, more and more data is required to support data-driven applications and analytics. On the other hand, data growth results in operational inefficiencies, technical debt and increased compliance risks. Data growth is a double-edged sword: left unmanaged, it adds cost and risk, but enterprises that manage their data effectively can turn it into great value.