Metadata Automation Management
Managing vast amounts of technical data without adequate metadata descriptions is a significant barrier to efficiency: validating and using the data takes extensive resources.
This project leveraged AI to automate metadata generation, greatly enhancing data usability and decision-making.
The Problem
The Technical Data Assets (TDC) system at The Club contained comprehensive metadata for tables (table names, schemas, asset types, and various IDs) but was missing table and column descriptions for the vast majority of its entries. The gap affected approximately 1.2 million records, with less than 1% of the metadata descriptions available. As a result, searching for tables and validating their usefulness was highly inefficient and labor-intensive.
Manually adding descriptions, at an estimated 5 minutes per table/column for 1,300,000 production tables, would have taken roughly 108,333 hours (over 13,500 eight-hour working days) of engineering time.
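The estimate above can be reproduced with a few lines of arithmetic. This is a minimal sketch using the figures quoted in this section; the 8-hour working day is an assumption that happens to match the "over 13,500 days" figure, and the constant names are illustrative only.

```python
# Figures quoted in the case study.
MINUTES_PER_DESCRIPTION = 5
TABLE_COUNT = 1_300_000

# Assumption: one working day is 8 hours of engineering time.
HOURS_PER_WORKDAY = 8

total_minutes = MINUTES_PER_DESCRIPTION * TABLE_COUNT
total_hours = total_minutes / 60
working_days = total_hours / HOURS_PER_WORKDAY

print(f"{total_hours:,.0f} hours")          # 108,333 hours
print(f"{working_days:,.0f} working days")  # 13,542 working days
```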
The Strategy
To address this critical bottleneck, we spearheaded the integration of Guru ML GenAI (our internal AI service) into the TDC system.
The Result
Increased column description coverage from under 1% to 98%.
Saved approximately 108,333 hours of engineering time, turning what would have been a multi-year manual effort into an automated process.
Improved searchability and discoverability of data assets, making it significantly easier for users to find and validate tables and supporting better decision-making.
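One way to track a coverage figure like the one reported above is to count the records whose description is present and non-blank. The sketch below is illustrative only: the record layout, the `description` field name, and the sample rows are assumptions, not the actual TDC schema or data.

```python
from typing import Iterable, Mapping, Optional


def description_coverage(records: Iterable[Mapping[str, Optional[str]]]) -> float:
    """Return the share of records with a non-blank description (0.0 to 1.0)."""
    total = 0
    described = 0
    for record in records:
        total += 1
        text = record.get("description")
        # Treat missing, None, and whitespace-only descriptions as uncovered.
        if text and text.strip():
            described += 1
    return described / total if total else 0.0


# Hypothetical sample rows; real catalog entries would come from the metadata store.
sample = [
    {"column": "member_id", "description": "Unique identifier for a member."},
    {"column": "txn_ts", "description": ""},
    {"column": "store_nbr", "description": None},
    {"column": "sku", "description": "Stock keeping unit code."},
]

coverage = description_coverage(sample)
print(f"Coverage: {coverage:.0%}")
```

Running the same function before and after the automated generation step gives a like-for-like comparison of coverage.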
My Contribution
Discovery
Identified the inefficiency in manual metadata entry and quantified the potential cost in time and resources.
Collaborated with cross-functional teams to define requirements and ensure alignment with data governance practices.
Design
Coordinated implementation planning for the AI-driven automation.
Integrated Guru ML GenAI services to automatically generate table and column descriptions.
Designed a workflow in which AI-generated descriptions were passed to Sam’s Data Club for validation by the respective data owners, ensuring both accuracy and compliance.
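The generate-then-validate workflow described above can be sketched roughly as follows. This is not the actual integration: the Guru ML GenAI interface is internal and not shown in this write-up, so `placeholder_generate` stands in for it, and every class, function, and field name here is hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Tuple


class ReviewStatus(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class DescriptionDraft:
    table: str
    column: str
    owner: str
    text: str
    status: ReviewStatus = ReviewStatus.PENDING


def draft_descriptions(
    columns: List[Tuple[str, str, str]],
    generate: Callable[[str, str], str],
) -> List[DescriptionDraft]:
    """Create one pending draft per (table, column, owner) entry."""
    return [
        DescriptionDraft(table, column, owner, generate(table, column))
        for table, column, owner in columns
    ]


def record_review(draft: DescriptionDraft, approved: bool) -> None:
    """Store the data owner's decision on a draft."""
    draft.status = ReviewStatus.APPROVED if approved else ReviewStatus.REJECTED


def placeholder_generate(table: str, column: str) -> str:
    # Stand-in for the internal AI service call (assumption, not the real API).
    return f"Draft description for {table}.{column}"


drafts = draft_descriptions(
    [("sales", "txn_ts", "owner_a"), ("sales", "sku", "owner_b")],
    placeholder_generate,
)
record_review(drafts[0], approved=True)
record_review(drafts[1], approved=False)

# Only owner-approved descriptions would be written back to the catalog.
publishable = [d for d in drafts if d.status is ReviewStatus.APPROVED]
```

Keeping every draft in a pending state until its owner responds means no generated description reaches the catalog without review.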
Delivery
Conducted rigorous testing and pilot runs to validate the quality of the generated descriptions.
Fine-tuned the AI model based on feedback and quality checks to improve accuracy before a full-scale rollout.