Practical Guidelines for Implementing Data Products


OVERVIEW

Most companies today store data and run applications in a hybrid, multi-cloud environment. Analytical systems tend to be centralised and siloed: data warehouses and data marts for BI, Hadoop or cloud storage data lakes for data science, and stand-alone streaming systems for real-time analysis. These centralised systems rely on data engineers and data scientists working within each silo to ingest data from many different sources, then clean and integrate it for use in a specific analytical system or machine learning model. This centralised, siloed approach has many problems: multiple tools are needed to prepare and integrate data, data integration pipelines are reinvented in each silo, and centralised data engineering teams with a poor understanding of source data cannot keep pace with business demands for new data. In addition, master data is not well managed.

To address these issues, a new approach has emerged that aims to accelerate the creation of data for use in multiple analytical workloads: data products. This is a decentralised, business domain-oriented approach to data ownership and data engineering in which reusable data products are created once and shared across multiple analytical systems and workloads. Multiple data architecture options are available for creating data products, including an organised cloud storage data lake spanning one or more cloud storage accounts, a lakehouse, a cloud data warehouse, Kafka, or data virtualisation. Data products can then be consumed in other pipelines for use in streaming analytics, data warehouses or lakehouse gold tables, and in business intelligence, data science and other analytical workloads.
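To make the "create once, share many times" idea concrete, the sketch below builds a customer data product from raw source data and publishes it as a lakehouse table that BI, data science and streaming consumers can all read. This is a minimal illustrative sketch only, assuming PySpark with Delta Lake; the paths, table and column names are hypothetical and not taken from the class material.

    # Minimal sketch: build a reusable "customer" data product once and
    # publish it as a lakehouse table for multiple analytical workloads.
    # Assumes PySpark with Delta Lake; all names and paths are illustrative.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("customer_data_product").getOrCreate()

    # Ingest raw source data owned by the customer domain (path is hypothetical)
    raw = spark.read.json("s3://raw-zone/crm/customers/")

    # Clean and standardise once, in the owning domain, rather than in every silo
    customers = (
        raw.dropDuplicates(["customer_id"])
           .withColumn("email", F.lower(F.trim(F.col("email"))))
           .filter(F.col("customer_id").isNotNull())
    )

    # Publish as a governed, reusable data product table; BI, data science
    # and streaming pipelines all consume this single shared table
    customers.write.format("delta").mode("overwrite").saveAsTable("data_products.customer")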

This 2-day class looks at data products in detail and examines their strengths and weaknesses, along with the strengths and weaknesses of the implementation options. Which architecture is best for implementing data products? How do you coordinate multiple domain-oriented teams and use common data infrastructure software, such as a data fabric, to create high-quality, compliant, reusable data products? And how can you use a data marketplace to govern and share data products? The objective is to shorten time to value while ensuring that data is correctly governed and engineered in a decentralised environment. The class also looks at the organisational implications of democratised data product development, and at how to create shareable data products for master data management and for use in multi-dimensional analysis on a data warehouse, data science, graph analysis and real-time streaming analytics to drive business value. Technologies discussed include data catalogs, data fabric for collaborative development of the data integration pipelines that create data products, DataOps to speed up the process, data orchestration automation, data marketplaces and data governance platforms.

AUDIENCE

This seminar is intended for business data analysts, data architects, chief data officers, master data management professionals, data scientists, IT ETL developers, and data governance professionals. It assumes an understanding of basic data management principles and data architecture, plus a reasonable understanding of data cleansing, data integration, data catalogs, data lakes and data governance.

LEARNING OBJECTIVES

Attendees will learn about:

    • Strengths and weaknesses of centralised data architectures used in analytics
    • The problems caused in existing analytical systems by a hybrid, multi-cloud data landscape
    • What are data products and how do they differ from other approaches?
    • What benefits do data products offer and what are the implementation options?
    • What are the principles, requirements, and challenges of implementing a data product approach?
    • A best-practice organisational model for coordinating the development of data products across different domains
    • The critical importance of a data catalog in understanding what data is available
    • How business glossaries can help ensure data products are understood and semantically linked
    • An operating model for effective federated data governance
    • What software is required to build, operate and govern data products for use in a data lakehouse, data science, a data warehouse, graph analysis or other analytical workloads?
    • What is data fabric software, and how does it integrate with data catalogs and connect to data in your data estate?
    • An implementation methodology to produce ready-made, trusted, reusable data products
    • Collaborative domain-oriented development of modular and distributed DataOps pipelines to create data products
    • How a data catalog, Generative AI and data automation software can be used to generate DataOps pipelines to create data products
    • Managing data quality, privacy, access security, versioning, and the lifecycle of data products
    • Pros and cons of different data architecture options for implementing data products
    • Publishing semantically linked data products in a data marketplace for others to consume and use (a sketch of this kind of product metadata appears after this list)
    • Federated data architecture and data products – the emergence of lakehouse open tables as a way for multiple analytical workloads to access shared data products
    • Persisting master data products in an MDM system
    • Consuming and assembling data products in multiple analytical systems like data warehouses, lakehouses and graph databases to shorten time to value
    • How to implement federated data governance
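
As a concrete illustration of the marketplace publishing step mentioned in the list above, the sketch below shows the kind of descriptive metadata a domain team might register alongside a data product so that consumers can find, understand and trust it. This is a hypothetical Python example; every field name and value is an illustrative assumption rather than a prescribed standard.

    # Hypothetical sketch of data product metadata published to a marketplace.
    # Field names are illustrative assumptions, not a prescribed standard.
    from dataclasses import dataclass

    @dataclass
    class DataProduct:
        name: str                  # business name shown in the marketplace
        domain: str                # owning business domain
        owner: str                 # accountable data product owner
        glossary_terms: list[str]  # links to business glossary definitions
        location: str              # physical address of the published table
        version: str               # semantic version for lifecycle management

    customer_product = DataProduct(
        name="Customer",
        domain="Marketing",
        owner="customer-domain-team@example.com",
        glossary_terms=["Customer", "Email Address"],
        location="data_products.customer",
        version="1.0.0",
    )
    # In practice this record would be registered via the publishing API of
    # your marketplace or data catalog.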

MODULES

  • Module 1: What are data products and why are they needed?
  • Module 2: Organising and standardising your environment to support democratised data product development
  • Module 3: Methodologies for creating data products
  • Module 4: Defining and designing data products using a catalog business glossary and data modelling
  • Module 5: Sourcing, mapping and data quality profiling of data for your data products
  • Module 6: Building DataOps pipelines to create reusable data products
  • Module 7: Implementing federated data governance to produce and use compliant data products

Trainer:

MIKE FERGUSON

Managing Director, Intelligent Business Strategies Limited

Mike Ferguson is Managing Director of Intelligent Business Strategies Limited. As an independent IT industry analyst and consultant, he specialises in BI/analytics and data management. With over 40 years of IT experience, Mike has consulted for dozens of companies on BI/analytics, data strategy, technology selection, data architecture, and data management. He is also conference chairman of Big Data LDN, the fastest-growing data and analytics conference in Europe, has spoken at events all over the world, and has written numerous articles. Formerly he was a principal and co-founder of Codd and Date Europe Limited (the consultancy of the inventors of the relational model) and a Chief Architect at Teradata, working on the Teradata DBMS. He teaches popular master classes in Data Warehouse Modernisation, Big Data Architecture & Technology, Centralised Data Governance of a Distributed Data Landscape, Practical Guidelines for Implementing a Data Mesh (Data Catalog, Data Fabric, Data Products, Data Marketplace), Real-Time Analytics, Embedded Analytics, Intelligent Apps & AI Automation, Migrating your Data Warehouse to the Cloud, Modern Data Architecture, and Data Virtualisation & the Logical Data Warehouse.



Trainer:
MIKE FERGUSON
Language:
English
Duration:
2 days
Location:
LIVE Helsinki / Huone Kamppi
Start dates:
  • 10.04.2025, 8.45-17
Pricing: 1900 € + VAT
Ilmoittaudu mukaan
