Data technologies have evolved to handle data at scale along five dimensions: volume, velocity, variety, veracity (accuracy), and value.

2 Types of Data & Why an Enterprise Needs Them

Say we have an enterprise with:

  • a group of databases that runs the business (i.e. operational data plane)
  • a group of analysts who study business patterns to boost revenue (i.e. analytical data consumers)

The consumers may run their analyses directly against the operational data, as depicted below.

The problems with this approach are:

  • a consumer's needs may affect the operational data plane. For example, an analytical consumer wants to know how many times each user has logged in, which requires adding a column to track the login count. The business itself doesn't need this column, and additional columns like it will bloat the operational data plane.
  • managing private data is difficult

The solution is to introduce a separate data plane solely used for analysis.
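
As a rough illustration of that separation, the sketch below keeps login counting out of the operational schema. It assumes two in-memory SQLite databases standing in for the two planes. The table names (`users`, `login_events`), the sample rows, and the `login_counts` helper are all hypothetical and chosen only for this example.

```python
import sqlite3

# Operational plane (assumed schema): only what the business application needs.
operational = sqlite3.connect(":memory:")
operational.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
operational.executemany(
    "INSERT INTO users (id, email) VALUES (?, ?)",
    [(1, "ann@example.com"), (2, "bob@example.com")],
)

# Analytical plane (assumed schema): a separate store of login facts over time.
analytical = sqlite3.connect(":memory:")
analytical.execute("CREATE TABLE login_events (user_id INTEGER, logged_in_at TEXT)")
analytical.executemany(
    "INSERT INTO login_events (user_id, logged_in_at) VALUES (?, ?)",
    [
        (1, "2024-01-01T09:00"),
        (1, "2024-01-02T09:05"),
        (2, "2024-01-02T10:00"),
    ],
)


def login_counts(conn):
    """Return {user_id: number_of_logins} computed on the analytical store."""
    rows = conn.execute(
        "SELECT user_id, COUNT(*) FROM login_events "
        "GROUP BY user_id ORDER BY user_id"
    )
    return dict(rows.fetchall())


counts = login_counts(analytical)

# The operational table still has only its original columns.
operational_columns = [
    row[1] for row in operational.execute("PRAGMA table_info(users)")
]
```

The analyst's question is answered entirely from the analytical store, so the `users` table never gains a login-count column.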

The 2 Types of Data

  • operational data plane - sits in databases behind business capabilities, has a transactional nature, keeps the current state, and serves the needs of the applications running the business
  • analytical data plane - is a temporal and aggregated view of the facts of the business over time, often modeled to provide retrospective or future-perspective insights; it trains the ML models or feeds the analytical reports
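
The contrast between the two planes can be sketched in a few lines. This is a minimal illustration under assumed data, not a reference design: the `OrderFact` record, the order identifiers, and the `revenue_by_month` helper are hypothetical names introduced for the example.

```python
from collections import defaultdict
from dataclasses import dataclass

# Operational view: one record per order, holding only the current state.
current_order_status = {"order-1": "placed"}
current_order_status["order-1"] = "shipped"  # the earlier state is overwritten


# Analytical view: immutable facts recorded over time.
@dataclass(frozen=True)
class OrderFact:
    day: str       # ISO date, e.g. "2024-01-15"
    customer: str
    amount: float


facts = [
    OrderFact("2024-01-15", "ann", 40.0),
    OrderFact("2024-01-20", "bob", 10.0),
    OrderFact("2024-02-03", "ann", 25.0),
]


def revenue_by_month(order_facts):
    """Aggregate the facts into a temporal view: total revenue per month."""
    totals = defaultdict(float)
    for fact in order_facts:
        totals[fact.day[:7]] += fact.amount  # "YYYY-MM" prefix of the date
    return dict(totals)


monthly = revenue_by_month(facts)
```

The operational dictionary can only say what the order looks like now, while the fact list supports retrospective questions such as revenue per month.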

Analytical Data Plane Architecture Types

There are 3 main architectures to choose from for setting up an analytical data plane:

Usually, a single architecture is used for the entire enterprise.

Data Warehouse & Data Lake - Similarities

Data Warehouse & Data Lake - Tech Stacks

| Factors | On-Premise | Private Cloud | Software as a Service (SaaS) |
|---|---|---|---|
| Maintenance | hard | hard | easy |
| Monthly Cost | economic with large datasets | predictable | predictable |
| Vendor Lock-in | avoidable | avoidable | not avoidable |
| Suitability | for large corporations | for all businesses | ideal for startups |
| Investment | substantial in the beginning | increases as data grows | increases as data grows |

Examples

  • Hadoop On-Cloud
  • Azure Data Lake (ADL)
  • Amazon S3
  • Google Cloud Storage (GCS)
