
Certification/Recommendation Tag similar in functionality and visibility to TIER #18639

Open
DovileKr opened this issue Nov 14, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

@DovileKr

Context:
The current Tier system is excellent for identifying the importance of data assets based on downstream usage (the right side of the data lineage). However, there is a need for an additional tag that represents the trust and readiness of assets created on the left (upstream) side of the lineage, such as curated, governed, and documented tables or views. These assets may not yet have downstream usage, but they are trusted and recommended for use.

Proposal:
Introduce a Certification/Recommendation Tag alongside the existing Tier tag. The two tags would serve distinct purposes:

  • TIER: Highlights the importance of an asset to the organization (e.g., Tier 1, Tier 2).
  • Certification/Recommendation: Highlights the trust and quality of an asset, regardless of its usage.

The Certification/Recommendation Tag could have classification levels such as:

  • Grade 1: Certified and ready for use.
  • Grade 2: Trusted but still in evaluation.
  • Grade 3: Under development.
  • Grade 4: Not-for-use.
  • Grade 5: Uncertified or unknown origin.

Benefits:
  • Discoverability: Users can find important assets (via Tier) and trusted assets (via Certification) separately or in combination.
  • Governance: Promotes use of high-quality, governed data assets over raw or ad-hoc alternatives.
  • Flexibility: Enables clear differentiation between importance (Tier) and quality/trust (Certification).
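
To illustrate the "separately or in combination" idea, here is a minimal Python sketch of the two tags as independent attributes of an asset. It is not the project's actual API: `Tier`, `Certification`, `Asset`, and `find_assets` are hypothetical names, the three tier levels and the asset names are made up for illustration, and treating the grades as an ordered scale with Grade 5 as the default is an assumption.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Iterable, List, Optional


class Tier(IntEnum):
    """Importance to the organization (hypothetical: lower value = more important)."""
    TIER_1 = 1
    TIER_2 = 2
    TIER_3 = 3


class Certification(IntEnum):
    """Trust/readiness levels as proposed in this issue (assumed to be ordered)."""
    GRADE_1 = 1  # Certified and ready for use
    GRADE_2 = 2  # Trusted but still in evaluation
    GRADE_3 = 3  # Under development
    GRADE_4 = 4  # Not-for-use
    GRADE_5 = 5  # Uncertified or unknown origin


@dataclass(frozen=True)
class Asset:
    name: str
    # A curated asset may have no tier yet because nothing consumes it downstream.
    tier: Optional[Tier] = None
    # Assumption: anything not explicitly certified defaults to Grade 5.
    certification: Certification = Certification.GRADE_5


def find_assets(
    assets: Iterable[Asset],
    max_tier: Optional[Tier] = None,
    max_grade: Optional[Certification] = None,
) -> List[Asset]:
    """Filter by tier, by certification, or by both; None means 'do not filter'."""
    result = []
    for asset in assets:
        if max_tier is not None and (asset.tier is None or asset.tier > max_tier):
            continue
        if max_grade is not None and asset.certification > max_grade:
            continue
        result.append(asset)
    return result


# Hypothetical catalog entries.
catalog = [
    Asset("finance.revenue_report", Tier.TIER_1, Certification.GRADE_1),
    Asset("curated.customer_view", None, Certification.GRADE_1),
    Asset("raw.clickstream_dump", Tier.TIER_2, Certification.GRADE_5),
]

# Trusted assets, regardless of downstream usage.
certified = find_assets(catalog, max_grade=Certification.GRADE_1)

# Assets that are both important and trusted.
important_and_certified = find_assets(
    catalog, max_tier=Tier.TIER_1, max_grade=Certification.GRADE_1
)
```

In this sketch `curated.customer_view` is returned by the certification-only query even though it has no tier, which is the case the Tier tag alone cannot express.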

DovileKr added the enhancement label on Nov 14, 2024
@DovileKr
Author

In a data catalog, the concepts of TIER and GRADE serve distinct purposes for classifying and assessing data assets. Here’s how they differ:


TIER

Definition:
Tiers are typically used to categorize data assets based on criticality, business impact, or operational importance. They focus on how a data asset is used in the organization and on its overall priority.

Key Characteristics:

  • Purpose: Prioritization and resource allocation.
  • Common Criteria:
    • Business-critical vs. non-critical
    • Data availability requirements (e.g., high availability for Tier 1 data)
    • Support levels (e.g., Tier 1 might require 24/7 support)
    • Regulatory importance
  • Examples:
    • Tier 1: Mission-critical assets like financial reports or operational dashboards.
    • Tier 2: Important but not business-critical assets.
    • Tier 3: Informational or exploratory data with minimal impact if unavailable.

GRADE

Definition:
Grades evaluate the quality or trustworthiness of the data itself, based on factors like accuracy, completeness, timeliness, and reliability.

Key Characteristics:

  • Purpose: Establish confidence in the data and its usability for decision-making.
  • Common Criteria:
    • Data quality metrics (e.g., error rates, null values)
    • Data lineage and governance adherence
    • User feedback or ratings
    • Compliance with data standards
  • Examples:
    • Grade A: High-quality, fully governed, and highly trusted data assets.
    • Grade B: Reasonably good quality with minor issues.
    • Grade C: Lower quality, incomplete, or unreliable data requiring caution.

Key Differences

| Aspect | TIER | GRADE |
| --- | --- | --- |
| Focus | Importance to the business | Quality and trustworthiness |
| Purpose | Operational prioritization | Usability and confidence |
| Based On | Business impact, availability | Accuracy, completeness, governance |
| Typical Use Case | Resource allocation, risk planning | Decision-making, quality assurance |
| Assessment Level | Macro (organization-level) | Micro (data quality-level) |

By using both TIER and GRADE, an organization can effectively manage its data assets by prioritizing critical resources for the most important data (TIER) while ensuring that decisions are made based on reliable and high-quality information (GRADE).
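
One way the two dimensions could be combined in practice is a review queue that surfaces the most important assets with the weakest grades first. The Python sketch below is only an illustration under stated assumptions: `CatalogEntry`, `review_order`, and the sample entries are hypothetical, tiers are plain integers (1 = most important), and grades are the single letters A to C used in the examples above.

```python
from typing import List, NamedTuple


class CatalogEntry(NamedTuple):
    name: str
    tier: int   # 1 = mission-critical ... 3 = informational
    grade: str  # "A" (highly trusted) ... "C" (requires caution)


def review_order(entries: List[CatalogEntry]) -> List[CatalogEntry]:
    """Sort by tier (most important first), then by grade (weakest first)."""
    # Negating the character code puts "C" ahead of "A" within the same tier.
    return sorted(entries, key=lambda e: (e.tier, -ord(e.grade)))


# Hypothetical entries for illustration only.
entries = [
    CatalogEntry("ops_dashboard", 1, "A"),
    CatalogEntry("finance_report", 1, "C"),
    CatalogEntry("sandbox_export", 3, "C"),
    CatalogEntry("marketing_summary", 2, "B"),
]

queue = review_order(entries)
```

Here `finance_report` lands at the front of the queue: it is Tier 1 but only Grade C, so it is both important and not yet trustworthy.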
