Some 60 years ago the early adopters of data-driven enterprise implemented Material Requirements Planning (MRP) to improve manufacturing throughput and efficiency. For the first time plant managers used a data platform to foresee outcomes based on optimized plans and schedules.
From this great success, the data-driven enterprise evolved to Enterprise Resource Planning (ERP) and Enterprise Data Warehousing (EDW) to meet the ever-growing demands for online reporting, analytics, and improved business intelligence.
“If you can’t measure it, you can’t improve it,” goes the maxim widely attributed to management guru Peter Drucker, and the quest for the data-driven enterprise went into warp drive.
These early data platforms established a single version of the truth to describe enterprise business performance using online transaction processing, online reporting, and business intelligence. By storing and processing critical transaction data in a single database instance, organizations gained enormous control and efficiency to free capital and increase throughput. At last, business leaders could measure results in real time and leverage analytics to improve situational awareness and decision making.
Still, these enterprise data management blueprints provided only a partial picture because they dealt solely with structured data. Today, up to 80% of enterprise data is unstructured or semi-structured and includes images, email, social media, audio, you name it. And since unstructured data is growing at up to 65% per year, much faster than structured data, these early-stage data management blueprints are no longer sufficient.
Common Data Platform (CDP)
A next-generation information architecture is required to support the modern data-driven enterprise. These new data platforms ingest any data and use scalable commodity hardware and data governance to turn the data growth crisis into a data-driven business opportunity. Big data market revenues are surging as companies retool and evolve their enterprise data management strategies to take advantage.
A Common Data Platform (CDP) collects, organizes, and governs all enterprise data in a single, scalable repository to achieve infrastructure optimization and low-cost bulk data storage. Data is typically stored “as-is” or transformed to meet special requirements, but always remains available for real-time access by downstream applications. These downstream use cases are the next generation of data-driven applications, and they are fueled by data from across every functional area of the enterprise.
Modern data platforms require a cloud-native technology stack featuring NoSQL databases, containers, microservices, and object stores. Data is everywhere, so deployment options must cover public, private, hybrid, and multi-cloud data platform solutions. The prize is a modern data-driven enterprise leveraging real-time, schema-on-read access to all enterprise data. With this new information architecture, data scientists can run more advanced analytics, data visualization, artificial intelligence (AI), and machine learning (ML) applications.
The solution scope for the next-generation data platform is end-to-end and includes the complex data fabric connecting disparate data sources with target data stores. Source data is archived as-a-service from production systems based on Information Lifecycle Management (ILM) controls. ILM grounds data ingestion in best practice, with data governance policies, compliance plans, and data security rules to properly protect the data.
Critical data governance, data security, and compliance policies are managed centrally throughout the entire data life cycle. A centralized metadata repository enables policy-driven controls and retention plans to ensure data is safely integrated, accessed, shared, linked, analyzed, and maintained to the best effect across the organization. The goal is to enable proper and secure access for text searches, forms, reports, queries, visualizations, analytics, and AI/ML applications.
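To make the idea of policy-driven retention concrete, here is a minimal sketch of how a central policy table might decide whether archived data may be purged. Everything in it is an assumption for illustration: the `RetentionPolicy` class, the `POLICIES` table, the data class names, and the retention periods are hypothetical and are not taken from any regulation or product.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RetentionPolicy:
    """Hypothetical policy entry held in a central metadata repository."""
    retention_days: int
    legal_hold: bool = False  # a hold blocks purging regardless of age


# Illustrative values only; real retention periods come from compliance plans.
POLICIES = {
    "invoice": RetentionPolicy(retention_days=7 * 365),
    "web_log": RetentionPolicy(retention_days=90),
    "hr_record": RetentionPolicy(retention_days=365, legal_hold=True),
}


def is_purge_eligible(data_class: str, archived_on: date, today: date) -> bool:
    """Return True once the retention period has elapsed and no hold applies."""
    policy = POLICIES[data_class]
    if policy.legal_hold:
        return False
    return today >= archived_on + timedelta(days=policy.retention_days)


today = date(2024, 1, 1)
print(is_purge_eligible("web_log", date(2023, 9, 1), today))    # older than 90 days
print(is_purge_eligible("invoice", date(2020, 1, 1), today))    # still inside retention
print(is_purge_eligible("hr_record", date(2000, 1, 1), today))  # blocked by legal hold
```

Keeping the rules in one table rather than in each application is the point of the sketch: every consumer asks the same question of the same policy source.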
Big Data Application Framework
The application framework for the Common Data Platform includes both IT and functional use cases critical to today’s data-driven enterprise. IT use cases include infrastructure optimization and data ingestion through enterprise archiving and application retirement.
Data governance capabilities on a cloud data platform are versatile and support a wide variety of business-critical compliance requirements. Governance, risk, and compliance policies are enforced by business rules that together form a system of controls for even the most demanding requirements. Policy sets for ILM, GDPR, and the NIST 800 series are available in out-of-the-box configurations.
Data-driven finance enables next-generation financial apps such as advanced consolidation for order-to-cash and procure-to-pay. Industry application use cases include packaged data fabrics, data governance and data transformation for industry-specific solutions in healthcare, government, and banking.
Enterprise Data Lakes are data repositories with data governance, metadata management, and advanced tools for data analysis. Operational Data Stores (ODS) are enterprise data lakes whose data fabric supports real-time updates. By storing data “as-is” and by updating the data in real time, data lakes and ODS reduce heavy-lift ETL processes and free users from the canonical, fixed schema design of enterprise data warehouses. NoSQL database architectures support schema-on-read data access, which lets data scientists describe their data at query time and achieve more powerful data insights.
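Schema-on-read can be illustrated with a short sketch: records are stored as raw JSON with no shared shape, and each reader supplies its own schema when it queries. The `RAW_RECORDS` sample, the `read_with_schema` helper, and the `EMAIL_VIEW` schema are all hypothetical names invented for this example, not part of any particular NoSQL product.

```python
import json

# Raw records stored "as-is"; they deliberately do not share a fixed shape.
RAW_RECORDS = [
    '{"id": 1, "type": "email", "subject": "Q3 forecast", "bytes": 2048}',
    '{"id": 2, "type": "image", "width": 800, "height": 600}',
    '{"id": 3, "type": "email", "subject": "Invoice 1007"}',
]


def read_with_schema(raw_records, schema, where=lambda record: True):
    """Project each raw record onto the reader's schema at query time.

    schema maps a field name to (converter, default); fields missing from
    a record receive the default instead of failing the read.
    """
    for line in raw_records:
        record = json.loads(line)
        if not where(record):
            continue
        yield {
            field: convert(record[field]) if field in record else default
            for field, (convert, default) in schema.items()
        }


# One reader's view of the data; another reader could define a different one.
EMAIL_VIEW = {"id": (int, None), "subject": (str, ""), "bytes": (int, 0)}

emails = list(
    read_with_schema(RAW_RECORDS, EMAIL_VIEW, where=lambda r: r.get("type") == "email")
)
for row in emails:
    print(row)
```

Nothing about the stored records had to change to support this view, which is the contrast with a warehouse whose fixed schema is enforced when data is written.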
The rise of data platforms features open-source, cloud-native architecture and a broad portfolio of advanced NoSQL applications. Common Data Platforms perform uniform data collection as-a-service to feed the enterprise archives and data lakes powering the next-generation data-driven enterprise.
Through rich-text search, reports, queries, advanced analytics, AI/ML, and NoSQL applications, the data-driven enterprise helps business leaders manage beyond a single version of the truth and project “what will happen?” Running advanced parallel computing models, CDP supports a wide variety of workloads and leverages commodity infrastructure and object stores to process and store bulk data at the lowest cost.