Data Provenance and Set Sources
Learn how to track the origin of your trading card data using the Set Sources API to maintain data quality, provide proper attribution, and verify information accuracy.
What are Set Sources?
Set Sources allow you to track where your trading card set data came from. Every piece of information in your database - whether it's a checklist, metadata, or images - originated from somewhere. The Set Sources API provides a standardized way to record and manage these data origins.
Why Track Data Sources?
Data Provenance
Understanding where your data came from is essential for:
- Transparency - Users can see where information originated
- Credibility - Verified sources build trust in your data
- Compliance - Some data sources require attribution
- Quality Control - Track which sources provide accurate information
Verification
Source tracking enables you to:
- Mark sources as verified after validation
- Identify which data needs review
- Update information from authoritative sources
- Remove unreliable data sources
Attribution
Properly crediting data sources:
- Respects intellectual property
- Maintains good relationships with data providers
- Meets licensing requirements
- Builds community trust
Source Types
The API supports three distinct types of sources:
Checklist Sources
Track where your card lists came from.
Common checklist sources:
- Trading card databases (TCDB, COMC)
- Price guides (Beckett, PSA)
- Manufacturer checklists
- Community-contributed lists
- Retailer catalogs
Example:
{
  "source_type": "checklist",
  "source_name": "COMC Database",
  "source_url": "https://www.comc.com"
}
Metadata Sources
Track where set information originated.
Metadata includes:
- Set name and year
- Manufacturer details
- Print run information
- Set descriptions
- Release dates
Common metadata sources:
- CardboardConnection
- Manufacturer press releases
- Industry publications
- Collector guides
- Historical archives
Example:
{
  "source_type": "metadata",
  "source_name": "CardboardConnection",
  "source_url": "https://www.cardboardconnection.com"
}
Image Sources
Track where card images were obtained.
Common image sources:
- Trading Card Database
- COMC scans
- Personal collection photos
- Official manufacturer images
- Community contributions
Example:
{
  "source_type": "images",
  "source_name": "Trading Card Database",
  "source_url": "https://www.tradingcarddb.com"
}
Key Fields
source_type
The type of data this source provides: checklist, metadata, or images.
source_name
A human-readable name for the source (e.g., "Beckett Price Guide", "TCDB").
source_url
The URL where this data can be found or verified. Make it as specific as possible: prefer a link to the page for the set or checklist itself over the site's home page.
verified_at
An ISO 8601 timestamp indicating when this source was last verified as accurate. The value is null if the source has not been verified.
set_id
The UUID of the set this source applies to. Each source is specific to one set.
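The fields above can be checked locally before a record is sent anywhere. The following is a minimal sketch, not part of the API client: the `SetSource` class and its `validate` method are hypothetical names introduced here for illustration, and the rules only reflect what this guide states (three source types, a human-readable name, a URL).

```python
from dataclasses import dataclass
from typing import List, Optional
from urllib.parse import urlparse

# Source types listed in this guide.
VALID_SOURCE_TYPES = {"checklist", "metadata", "images"}


@dataclass
class SetSource:
    """Hypothetical local model of a set source (illustration only)."""
    set_id: str
    source_type: str
    source_name: str
    source_url: str
    verified_at: Optional[str] = None  # ISO 8601 string, or None if unverified

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means no problems were found."""
        problems = []
        if self.source_type not in VALID_SOURCE_TYPES:
            problems.append(f"unknown source_type: {self.source_type!r}")
        if not self.source_name.strip():
            problems.append("source_name is empty")
        parsed = urlparse(self.source_url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"source_url is not an absolute http(s) URL: {self.source_url!r}")
        return problems
```

Returning a list of problems rather than raising on the first one lets an import script report every issue with a record in a single pass.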
Common Use Cases
When Importing Data
Record sources immediately when importing new sets:
# Import set data from external source
set_data = import_from_beckett(set_name)
set_id = create_set(set_data)

# Record where this data came from
client.set_sources.create({
    'type': 'set_sources',
    'attributes': {
        'set_id': set_id,
        'source_type': 'checklist',
        'source_name': 'Beckett Online Price Guide',
        'source_url': 'https://www.beckett.com/price-guides'
    }
})
Multiple Sources Per Set
A single set can have multiple sources for different types of data:
# Different sources for different data types
sources = [
    {
        'type': 'checklist',
        'name': 'COMC Database',
        'url': 'https://www.comc.com'
    },
    {
        'type': 'metadata',
        'name': 'CardboardConnection',
        'url': 'https://www.cardboardconnection.com'
    },
    {
        'type': 'images',
        'name': 'Trading Card Database',
        'url': 'https://www.tradingcarddb.com'
    }
]
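The list above uses shorthand keys (`type`, `name`, `url`), while the import example sends `source_type`, `source_name`, and `source_url` inside an `attributes` object. A minimal sketch of bridging the two is shown below. The helper `build_source_payloads` and the placeholder set ID are hypothetical, and the payload shape is assumed to match the one in the import example.

```python
def build_source_payloads(set_id, sources):
    """Map shorthand entries (type/name/url) to create payloads for one set."""
    payloads = []
    for entry in sources:
        payloads.append({
            "type": "set_sources",
            "attributes": {
                "set_id": set_id,
                "source_type": entry["type"],
                "source_name": entry["name"],
                "source_url": entry["url"],
            },
        })
    return payloads


# Illustrative data; "example-set-id" stands in for a real set UUID.
example_sources = [
    {"type": "checklist", "name": "COMC Database", "url": "https://www.comc.com"},
    {"type": "images", "name": "Trading Card Database", "url": "https://www.tradingcarddb.com"},
]
payloads = build_source_payloads("example-set-id", example_sources)
```

Each resulting payload could then be passed to `client.set_sources.create(...)` in a loop, one call per source.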
Verification Workflow
Track when sources are verified:
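from datetime import datetime, timezone

# After manually verifying the source, record the time in UTC as ISO 8601.
# datetime.utcnow() is deprecated since Python 3.12, so use an aware datetime.
verified_at = datetime.now(timezone.utc).strftime('%Y-%m-%dT%H:%M:%SZ')
client.set_sources.update(source_id, {
    'type': 'set_sources',
    'id': source_id,
    'attributes': {
        'verified_at': verified_at
    }
})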
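Once `verified_at` is recorded, it can also drive the "which data needs review" check mentioned earlier. The sketch below is one possible approach: the function name `needs_review` and the `max_age_days` threshold are assumptions made for illustration, since this guide does not define a re-verification interval.

```python
from datetime import datetime, timedelta, timezone


def needs_review(source_attrs, max_age_days=365, now=None):
    """Return True if a source is unverified or was verified more than max_age_days ago."""
    verified_at = source_attrs.get("verified_at")
    if not verified_at:
        return True  # null or missing means the source was never verified
    # datetime.fromisoformat only accepts a trailing 'Z' on Python 3.11+, so normalize it.
    verified = datetime.fromisoformat(verified_at.replace("Z", "+00:00"))
    if verified.tzinfo is None:
        verified = verified.replace(tzinfo=timezone.utc)  # assume UTC when no offset is given
    now = now or datetime.now(timezone.utc)
    return now - verified > timedelta(days=max_age_days)
```

The optional `now` argument exists so the check can be tested with a fixed clock; in normal use it can be omitted.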
Displaying Attribution
Show users where your data comes from:
# Get set with all sources
response = client.sets.get(set_id, include='sources')

# Display attribution
if 'included' in response:
    sources = [s for s in response['included'] if s['type'] == 'set_sources']
    print("Data Sources:")
    for source in sources:
        attrs = source['attributes']
        verified = " ✓" if attrs.get('verified_at') else ""
        print(f"  {attrs['source_type']}: {attrs['source_name']}{verified}")
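The snippet above relies on the response containing an `included` list whose items carry `type` and `attributes` keys. The sketch below makes that assumed shape explicit with hand-written sample data, so the formatting logic can be tried without a live client. The `attribution_lines` helper and the sample values are hypothetical, not actual API output.

```python
def attribution_lines(response):
    """Return one display line per set source found in the response's 'included' list."""
    lines = []
    for item in response.get("included", []):
        if item.get("type") != "set_sources":
            continue  # skip any other included resource types
        attrs = item["attributes"]
        mark = " (verified)" if attrs.get("verified_at") else ""
        lines.append(f"{attrs['source_type']}: {attrs['source_name']}{mark}")
    return lines


# Hand-written sample in the assumed response shape (illustration only).
sample_response = {
    "data": {"type": "sets", "id": "example-set-id"},
    "included": [
        {"type": "set_sources", "attributes": {
            "source_type": "checklist", "source_name": "COMC Database",
            "verified_at": "2024-12-01T00:00:00Z"}},
        {"type": "set_sources", "attributes": {
            "source_type": "images", "source_name": "Trading Card Database",
            "verified_at": None}},
    ],
}

for line in attribution_lines(sample_response):
    print(line)
```

Returning the lines instead of printing them inside the helper makes it easier to reuse the same formatting in a web template or an export.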