
Best Practices

Best practices for getting the most out of Sellestial while maintaining data quality and controlling costs.

Why this matters

Testing on a small list before scaling helps you:

  • Catch configuration errors early
  • Verify output quality
  • Understand costs
  • Identify issues quickly

How to test safely

  1. Create a small test list in HubSpot (10-50 records)
  2. Configure the pipeline to use the test list
  3. Deploy and process
  4. Review every result manually
  5. Adjust the configuration as needed
  6. Scale up to larger lists

When to require review

For high-impact operations, always require human review:

  • Data deletions
  • Record merges
  • Bulk updates to critical fields
  • New pipeline deployments

You can switch to auto-confirm after:

  • 50-100 successful reviews
  • You are confident in the results
  • Pattern validation is complete

Monitoring cadence

Daily checks:

  • Processing status
  • Error rates
  • Success rates

Weekly reviews:

  • Cost analysis
  • Quality spot checks
  • Performance trends

Monthly audits:

  • Configuration review
  • Results validation
  • ROI assessment

Recommended sequence
  1. Validate — Code pipelines for basic validation
  2. Classify — Classifier pipelines for quality assessment
  3. Clean — Agent pipelines for normalization
  4. Deduplicate — Deduplication + merge pipelines
  5. Maintain — Agent pipelines for ongoing monitoring

Deduplication

Set up rules:

  1. Start with exact matches on unique identifiers (domain, LinkedIn URL)
  2. Add fuzzy matching cautiously (see the sketch below)
  3. Test with a small dataset
  4. Publish when confident
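
The exact-then-fuzzy idea, independent of Sellestial's actual rule engine, looks roughly like this in Python (the field names and the 0.92 threshold are illustrative assumptions):

from difflib import SequenceMatcher

def is_likely_duplicate(a: dict, b: dict, threshold: float = 0.92) -> bool:
    # Rule 1: an exact match on a unique identifier is the safest signal.
    for key in ("domain", "linkedin_url"):
        if a.get(key) and a.get(key) == b.get(key):
            return True
    # Rule 2: fuzzy name matching only as a cautious fallback, with a high bar.
    name_a, name_b = a.get("name", "").lower(), b.get("name", "").lower()
    if name_a and name_b:
        return SequenceMatcher(None, name_a, name_b).ratio() >= threshold
    return False

# An exact domain match wins even though the names differ.
print(is_likely_duplicate({"domain": "acme.com", "name": "Acme Inc"},
                          {"domain": "acme.com", "name": "ACME Incorporated"}))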

Review pairs:

  • Check all pairs before merging
  • Document merge decisions
  • Keep audit trail

Large merges:

  • Extra caution for companies with more than 30 associations
  • Verify data quality first
  • Consider manual review

Ongoing maintenance

Continuous:

  • New record validation (Code and Classifier pipelines)

Weekly:

  • Recent record cleaning (Agent pipelines on last 7 days)

Monthly:

  • Full database cleanup pass
  • Deduplication review
  • Employment verification

Quarterly:

  • Comprehensive data audit
  • Rule review and adjustment
  • Archive obsolete records

Filter before enriching

Run the cheap stages first and enrich only what survives (sketched in code below):

Step 1: Validation (free Code pipeline)
Step 2: Quality assessment (low-cost Classifier)
Step 3: Filter for high-quality records
Step 4: Data cleaning (moderate-cost Agent)
Step 5: Enrichment (higher-cost service)

Savings: you avoid paying to enrich invalid or low-quality records
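
A rough sketch of the staged filter in Python; the field names, the email check, and the 0.7 quality cutoff are illustrative assumptions, not Sellestial's API:

import re

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def validate(records):
    # Stage 1 (free): drop records that fail basic rule checks.
    return [r for r in records if EMAIL_RE.match(r.get("email", ""))]

def assess_quality(records, min_score=0.7):
    # Stage 2 (low-cost): keep only records a classifier scored highly.
    return [r for r in records if r.get("quality_score", 0) >= min_score]

records = [
    {"email": "ana@acme.com", "quality_score": 0.9},
    {"email": "not-an-email", "quality_score": 0.9},
    {"email": "bob@example.com", "quality_score": 0.2},
]
to_enrich = assess_quality(validate(records))
print(to_enrich)  # only the first record reaches the expensive stages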

What to enrich

Don’t enrich:

  • All contacts (expensive)
  • Test data
  • Invalid emails
  • Archived records

Do enrich:

  • Active opportunities
  • Recent leads (last 30 days)
  • High-value segments
  • Missing critical data

Choosing write modes

Three modes control when a pipeline may overwrite a field (a behavioral sketch follows the lists below).

Default to “Write if empty”:

  • Safest option
  • Preserves existing data
  • Fills gaps only

Use “Always overwrite” for:

  • Standardization fields (Country, Industry)
  • Known poor-quality data
  • Enrichment refresh

Use “Write if not modified by user” for:

  • Job titles (change frequently)
  • Company data (needs updates)
  • Phone numbers (may update)
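
Behaviorally, the three modes differ only in when an incoming value may replace the stored one. A minimal sketch (the record structure and user_modified_fields tracking are illustrative assumptions; the mode labels mirror the UI):

def apply_write(record: dict, field: str, value, mode: str) -> None:
    if mode == "write_if_empty" and record.get(field):
        return  # safest: never touch existing data, only fill gaps
    if mode == "write_if_not_modified_by_user" \
            and field in record.get("user_modified_fields", set()):
        return  # preserve manual edits, refresh everything else
    record[field] = value  # "always_overwrite" always falls through

record = {"country": "USA", "user_modified_fields": set()}
apply_write(record, "country", "United States", "write_if_empty")
print(record["country"])  # still "USA": the field was not empty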

Standardize values

Configure standardization for (example mapping below):

  • Country names
  • Industry classifications
  • State abbreviations
  • Standard enums

Benefits:

  • Consistent segmentation
  • Clean reporting
  • Reliable automation
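
The standardization idea as a toy mapping, one canonical value per concept (the alias table is a hypothetical example, not a shipped list):

# Collapse common variants onto one canonical country name so segments,
# reports, and automations all group on the same value.
COUNTRY_ALIASES = {
    "usa": "United States",
    "u.s.": "United States",
    "united states of america": "United States",
    "uk": "United Kingdom",
    "great britain": "United Kingdom",
}

def normalize_country(value: str) -> str:
    cleaned = value.strip()
    return COUNTRY_ALIASES.get(cleaned.lower(), cleaned)

print(normalize_country(" USA "))  # United States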

Daily processing limits

Set limits that match your volume:

High-volume (1000s/day):

Enrollment: 1000/day
Generation: 1000/day
Weekend blocking: OFF

Medium-volume (100s/day):

Enrollment: 100/day
Generation: 100/day
Weekend blocking: Optional

Low-volume (10s/day):

Enrollment: 50/day
Generation: 50/day
Weekend blocking: ON
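
Weekend blocking is effectively a calendar check before processing; conceptually (a sketch, not the platform's implementation):

from datetime import date

def should_process(today: date, weekend_blocking: bool) -> bool:
    # weekday() returns 5 for Saturday and 6 for Sunday.
    return not (weekend_blocking and today.weekday() >= 5)

print(should_process(date(2024, 6, 8), weekend_blocking=True))   # Saturday: False
print(should_process(date(2024, 6, 10), weekend_blocking=True))  # Monday: True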

Review settings

Use auto-confirm for:

  • Well-tested pipelines
  • Low-risk operations (reads, logging)
  • High-volume, low-impact work

Require human review for:

  • New pipelines (first 50-100 records)
  • High-impact operations (merges, deletes)
  • Critical data updates
  • Uncertain classifications

Use timed auto-confirm (10 minutes) for:

  • Sequences (review content quality)
  • Medium-impact updates
  • Allows quick review without blocking
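
The timed mode behaves like a review window: a result sits in a queue and is confirmed automatically once the window lapses unless a reviewer acts first. A sketch of that logic (the queue-item shape is an illustrative assumption):

from datetime import datetime, timedelta

REVIEW_WINDOW = timedelta(minutes=10)

def resolve(item: dict, now: datetime) -> str:
    # A reviewer decision always wins; otherwise confirm after the window.
    if item.get("reviewer_decision"):
        return item["reviewer_decision"]
    if now - item["queued_at"] >= REVIEW_WINDOW:
        return "auto_confirmed"
    return "pending"

queued_at = datetime(2024, 6, 10, 9, 0)
print(resolve({"queued_at": queued_at}, datetime(2024, 6, 10, 9, 5)))   # pending
print(resolve({"queued_at": queued_at}, datetime(2024, 6, 10, 9, 15)))  # auto_confirmed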

Data sources

Always enable:

  • HubSpot contact/company (core data)
  • Previous communication (avoid duplicates)

Selectively enable:

  • LinkedIn profiles (if personalization needed)
  • Company news (if timing important)

Rarely enable:

  • LinkedIn posts (expensive, often low value)
  • All external sources simultaneously

Estimate costs

Calculate expected spend:

Pipeline cost per record × Expected volume = Daily cost

Example:
Agent pipeline: 1 credit/contact
500 contacts/day
= 500 credits/day
= ~15,000 credits/month
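
The same arithmetic as a small helper (the credit rates come from the example above, not a price list):

def monthly_credits(credits_per_record: float, records_per_day: int,
                    days_per_month: int = 30) -> float:
    # daily cost = cost per record × volume; a month is roughly 30 processing days
    return credits_per_record * records_per_day * days_per_month

print(monthly_credits(credits_per_record=1, records_per_day=500))  # 15000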

Set daily limits: Match budget constraints with processing limits in pipeline settings.

Monitor credit usage

Weekly:

  • Check Usage → Credits tab
  • Review category distribution
  • Identify unexpected spikes

Monthly:

  • Analyze trends
  • Compare to budget
  • Adjust limits or strategy

Cost reduction strategies

1. Filter aggressively:

Before enrichment:
- Remove test data
- Validate with Code pipelines (free or low-cost)
- Assess quality with Classifiers
- Filter out invalid records
Result: only quality records are enriched

2. Target high-value segments:

Instead of:
50,000 contacts × enrichment cost = high total
Target:
2,500 active opportunities × enrichment cost = lower total
Savings: 95% fewer records enriched

3. Use cheaper pipelines first:

Code (Free/Low) → Filter
Classifier (Low-Moderate) → Filter
Then: Agent and enrichment on validated data

4. Adjust processing frequency:

Daily: New records only
Weekly: Recent activity
Monthly: Full database
Quarterly: Comprehensive audit

Performance

Parallel processing:

  • Use multiple pipelines simultaneously
  • Process different segments
  • Separate hygiene from enrichment

Efficient data sources:

  • Disable unnecessary sources
  • Use only required fields
  • Minimize external API calls

Batch sizing:

  • Find optimal batch size
  • Too small = overhead
  • Too large = timeouts
  • Test batches of 50-200 records (sketched below)
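
Chunking work into fixed-size batches is a generic pattern; a sketch (the 100-record default is an arbitrary starting point within the suggested 50-200 range):

from itertools import islice

def batched(records, size: int = 100):
    # Yield successive fixed-size chunks; the final chunk may be smaller.
    it = iter(records)
    while chunk := list(islice(it, size)):
        yield chunk

for batch in batched(range(250), size=100):
    print(len(batch))  # 100, 100, 50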

Result quality

For enrichment:

  1. Clean data first
  2. Ensure required fields are present
  3. Validate company associations
  4. Use quality source lists

For classification:

  1. Provide sufficient context
  2. Use appropriate model
  3. Add validation if needed
  4. Review borderline cases

HubSpot lists

Create specific lists:

"Contacts - Missing Email"
"Contacts - Invalid Domain"
"Companies - Need LinkedIn URL"
"Contacts - Created Last 7 Days"

Use active lists:

  • Auto-updates membership
  • Continuous processing
  • No manual maintenance

Document criteria:

  • Clear naming
  • Note purpose
  • Track usage

Custom properties

Naming conventions:

  • Use the sellestial_ prefix
  • Descriptive names (for example, a hypothetical sellestial_email_status)
  • Consistent format

Documentation:

  • Add help text
  • Explain values
  • Link to pipelines

Organization:

  • Group related properties
  • Standard property groups
  • Easy to find

Field mappings

Review mappings:

  • Correct target fields
  • Appropriate write modes
  • Proper data types

Test writes:

  • Verify updates work
  • Check data format
  • Validate calculations

Compliance

GDPR/CCPA:

  • Document data processing
  • Provide opt-out mechanisms
  • Respect preferences
  • Maintain records

Internal policies:

  • Follow company guidelines
  • Legal team review
  • Compliance checks

Security

User permissions:

  • Admin: Full access
  • User: Pipeline management
  • Viewer: Read-only

API keys:

  • Secure storage
  • Regular rotation
  • Scope limiting
  • Audit usage

Document your pipelines

For each pipeline, record:

  • Purpose and goals
  • Configuration details
  • Input sources
  • Expected outputs
  • Review process
  • Owner/contacts

Change management

Document:

  • Configuration changes
  • Deployment dates
  • Version numbers
  • Reasons for changes
  • Impact analysis

Communicate:

  • Notify affected teams
  • Explain changes
  • Training if needed
  • Support available

Team roles

Admin:

  • Platform configuration
  • Pipeline deployment
  • Settings management
  • User access control

Pipeline Owner:

  • Pipeline configuration
  • Quality monitoring
  • Issue resolution
  • Performance optimization

Reviewer:

  • Manual review tasks
  • Quality spot checks
  • Feedback reporting

Regular sync:

  • Weekly pipeline review
  • Monthly performance review
  • Quarterly strategy planning

Documentation:

  • Share best practices
  • Document decisions
  • Maintain runbooks
  • Update guides

Troubleshooting

When issues arise:

  1. Check basics:

    • Pipeline active?
    • Sources configured?
    • Limits reasonable?
    • Recent deployments?
  2. Review logs:

    • Execution logs
    • Error messages
    • Status codes
    • Timing information
  3. Test in isolation:

    • Single record test
    • Verify each component
    • Check data sources
    • Review output
  4. Adjust & retry:

    • Fix identified issues
    • Deploy changes
    • Retest
    • Monitor results
  5. Escalate if needed:

    • Document issue
    • Gather logs
    • Contact support
    • Provide context