Data Product Recommender#
Analyze database query logs to identify high-value tables and logical groupings for data product prioritization.
Overview#
The data_product_recommender module analyzes query log files to identify which tables should be prioritized as data products in a data marketplace.
Key Features#
- Multi-Platform Support
Supports Snowflake, Databricks, BigQuery, and watsonx.data query log formats.
- File-Based Input
Works with CSV and JSON query log files (no direct database connection required).
- Intelligent Scoring
Scores each table by combining query frequency, user diversity, recency, and consistency metrics.
- Table Grouping
Identifies tables frequently used together for logical data product groupings.
- Multiple Output Formats
Generates both Markdown (human-readable) and JSON (agent-consumable) reports.
- CLI and Python API
Use from the command line or integrate into Python applications.
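The scoring idea described above can be illustrated with a small standalone sketch. This is not the module's actual formula: the `TableUsage` fields, the normalization, the 30-day window, and the weights are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class TableUsage:
    """Hypothetical per-table usage summary derived from query logs."""
    name: str
    query_count: int        # total queries touching the table
    distinct_users: int     # number of different users querying it
    last_queried: datetime  # most recent query timestamp
    active_days: int        # days in the window with at least one query


def score_table(usage, max_queries, max_users, now,
                window_days=30, weights=(0.4, 0.3, 0.2, 0.1)):
    """Combine four normalized metrics (each in [0, 1]) into one weighted score.

    The weights and the linear recency decay are illustrative assumptions.
    """
    frequency = usage.query_count / max_queries if max_queries else 0.0
    diversity = usage.distinct_users / max_users if max_users else 0.0
    age_days = (now - usage.last_queried).days
    recency = max(0.0, 1.0 - age_days / window_days)
    consistency = min(1.0, usage.active_days / window_days)
    w_freq, w_div, w_rec, w_con = weights
    return (w_freq * frequency + w_div * diversity
            + w_rec * recency + w_con * consistency)


# Rank two made-up tables by score, highest first.
now = datetime(2024, 6, 30)
tables = [
    TableUsage("ORDERS", 100, 10, now, 30),
    TableUsage("AUDIT_LOG", 20, 2, now - timedelta(days=15), 5),
]
max_q = max(t.query_count for t in tables)
max_u = max(t.distinct_users for t in tables)
ranked = sorted(tables, key=lambda t: score_table(t, max_q, max_u, now),
                reverse=True)
```

A weighted sum keeps each metric's contribution easy to inspect. The module may weight or normalize these metrics differently.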
Quick Start#
Command Line#
python -m wxdi.data_product_recommender.cli \
    --platform snowflake \
    --input-file query_logs.csv \
    --output output \
    --num-recommendations 20
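To run the same command from a script, for example once per log file, the argument list can be assembled in Python. The sketch below uses only the flags shown above. The helper names `build_cli_args` and `run_cli` are hypothetical, and `snowflake` is the only platform identifier confirmed here, so identifiers for the other platforms would need to be checked against the usage guide.

```python
import subprocess
import sys

CLI_MODULE = "wxdi.data_product_recommender.cli"


def build_cli_args(platform, input_file, output, num_recommendations=20):
    """Return the argv list for the recommender CLI (hypothetical helper)."""
    return [
        sys.executable, "-m", CLI_MODULE,
        "--platform", platform,
        "--input-file", str(input_file),
        "--output", str(output),
        "--num-recommendations", str(num_recommendations),
    ]


def run_cli(platform, input_file, output, num_recommendations=20):
    """Run the CLI; check=True raises CalledProcessError on a non-zero exit."""
    args = build_cli_args(platform, input_file, output, num_recommendations)
    return subprocess.run(args, check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.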
Python API#
from wxdi.data_product_recommender.platforms import SnowflakeQueryParser
from wxdi.data_product_recommender.recommender import DataProductRecommender

# Initialize the recommender with a platform-specific query parser
parser = SnowflakeQueryParser()
recommender = DataProductRecommender(parser)

# Load the query logs, compute usage metrics, and rank the tables
recommender.load_query_logs_from_csv_file('query_logs.csv')
recommender.calculate_metrics()
recommendations = recommender.recommend_data_products(num_recommendations=20)

# Export the recommendations as a Markdown report
recommender.export_recommendations_markdown(recommendations, 'output/recommendations.md')
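The four calls above can be wrapped in one helper that validates the input path and prepares the output directory first. This is a minimal sketch: `run_recommendation` is a hypothetical name, and it assumes that the export method does not create missing directories itself, which this page does not confirm. It calls only the recommender methods shown above.

```python
from pathlib import Path


def run_recommendation(recommender, csv_path, output_path, num_recommendations=20):
    """Load logs, compute metrics, recommend, and export in one call.

    `recommender` is expected to expose the four methods used in the
    Python API example above.
    """
    csv_file = Path(csv_path)
    if not csv_file.is_file():
        raise FileNotFoundError(f"Query log file not found: {csv_file}")

    # Assumption: the exporter may not create parent directories itself.
    out_file = Path(output_path)
    out_file.parent.mkdir(parents=True, exist_ok=True)

    recommender.load_query_logs_from_csv_file(str(csv_file))
    recommender.calculate_metrics()
    recommendations = recommender.recommend_data_products(
        num_recommendations=num_recommendations
    )
    recommender.export_recommendations_markdown(recommendations, str(out_file))
    return recommendations
```

With the objects from the example above, a call would look like `run_recommendation(recommender, 'query_logs.csv', 'output/recommendations.md')`.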
Use Cases#
- Accelerate Data Product Onboarding
Leverage existing usage patterns rather than starting from scratch.
- Identify High-Value Assets
Find tables with demonstrated business value through real usage.
- Discover Logical Groupings
Identify tables commonly used together for cohesive data products.
- Prioritize Catalog Promotion
Focus efforts on tables with the highest user demand and diversity.
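One simple way to find tables that are commonly used together is to count how often each pair of tables appears in the same query. The sketch below shows that co-occurrence idea on made-up data. It is not the module's grouping algorithm, and the function names and the `min_count` threshold are assumptions.

```python
from collections import Counter
from itertools import combinations


def co_occurrence_counts(queries):
    """Count how many queries reference each unordered pair of tables.

    `queries` is an iterable of table-name lists, one list per query.
    """
    counts = Counter()
    for tables in queries:
        # De-duplicate and sort so (a, b) and (b, a) share one key.
        for pair in combinations(sorted(set(tables)), 2):
            counts[pair] += 1
    return counts


def frequent_pairs(queries, min_count=2):
    """Return (pair, count) tuples seen at least `min_count` times."""
    counts = co_occurrence_counts(queries)
    return [(pair, n) for pair, n in counts.most_common() if n >= min_count]


# Made-up example: each inner list is the set of tables one query touched.
sample_queries = [
    ["orders", "customers"],
    ["orders", "customers", "products"],
    ["orders", "products"],
    ["events"],
]
pairs = frequent_pairs(sample_queries, min_count=2)
```

Pairs that clear the threshold are candidates for bundling into one data product. Larger groups could be formed by merging overlapping pairs.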
Next Steps#
Usage Guide - Detailed usage guide
Examples - Code examples
Data Product Recommender Reference - API reference