
Introduction

Given the size of Novo Nordisk's data and organisational landscape, it is easy for multiple variations of the same dataset to exist in different organisational units. These variations obscure the authoritative derivative or feature of the data, whether that is an inferred KPI for a business user or the correct data source for a data engineer. The resulting distrust in data must be controlled to ensure the quality of decisions, and this is the problem the solution addresses. Furthermore, there is no standard way to define a data product or data contract: negotiating data contracts, defining data products, and managing usage agreements can be complex, time-consuming, and prone to human error.

Current Challenges

  1. Templates: There are no standard templates for defining a data product and a data contract cohesively across cloud and data solutions within NN.
  2. Data Assertions: There is currently no viable way, apart from relying on self-assertions in ETL, to report data inconsistencies against data contracts or to fail refreshes when data quality rules are not met. In both cases, external consumers have no way of knowing that a data assertion failed, and cannot resume their data refresh once the assertions are completed.
  3. Availability/Findability: Novo Access and Datahub are the two main delegation points for access to Data Products, but they do not provide an intuitive way to search by metadata, glossary, formal naming or entity.
  4. Programmatic Access: Current tools do not provide a non-user-authenticated way to query metadata, tags, contract definitions and quality assertion rules.
  5. Versioning compatibility: No current system enables a versioning strategy for contracts.
  6. Audit: No current system enables audit trails for changes to the Product or Contract.
  7. Lateral Push: There is no programmatic way to let metadata flow from the central system to localized findability tools such as iNNdex.
  8. E2E Lineage: End-to-end lineage is not available across system boundaries.

Solution Overview

The Novo Nordisk Data Marketplace is a solution to the challenges described above. Acting as a bridge between the technical data infrastructure and end data consumers (systems or users), it provides a unified and comprehensible view of source systems, domain teams, data assets, data products and data contracts. The approach is code-first and user-friendly: its base offerings are central metadata management and a repository for data contracts and data products with their associated data assertions. As an offshoot of the current implementation, it can also be used to implement governance on metadata structures, data products and data contracts (partially).
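
To make the code-first idea concrete, the following is a minimal sketch of how a data contract with associated data assertions could be expressed and evaluated. It is not the NNDM implementation: the class names (DataContract, Assertion), the evaluate function, the field names and the sample product "sales_kpi" are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable

Row = dict[str, Any]


@dataclass(frozen=True)
class Assertion:
    """A named data quality rule evaluated per row (hypothetical shape)."""
    name: str
    check: Callable[[Row], bool]


@dataclass(frozen=True)
class DataContract:
    """Illustrative contract: product, version, owner, columns and rules."""
    product: str
    version: str
    owner: str
    columns: tuple[str, ...]
    assertions: tuple[Assertion, ...] = ()


def evaluate(contract: DataContract, rows: list[Row]) -> dict[str, int]:
    """Return the number of failing rows per assertion name."""
    failures = {a.name: 0 for a in contract.assertions}
    for row in rows:
        for a in contract.assertions:
            if not a.check(row):
                failures[a.name] += 1
    return failures


# Hypothetical contract and sample data, for illustration only.
contract = DataContract(
    product="sales_kpi",
    version="1.0.0",
    owner="example-domain-team",
    columns=("region", "revenue"),
    assertions=(
        Assertion("revenue_non_negative", lambda r: r["revenue"] >= 0),
        Assertion("region_present", lambda r: bool(r.get("region"))),
    ),
)

rows = [
    {"region": "EU", "revenue": 120.0},
    {"region": "", "revenue": -5.0},
]

report = evaluate(contract, rows)
print(report)
```

In a setup like this, a refresh job could publish the report alongside the contract version, so that consumers can see that an assertion failed instead of relying on self-assertions hidden inside the ETL.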

Key Features

WORK IN PROGRESS


Solution Components

The solution architecture includes the following components:

  1. NNDM UI: Custom UI
  2. NNDM Db: Azure Postgres Database
  3. Purview: Tenant default

Domain Team


WORK IN PROGRESS