
Best Practices

Best practices for getting the most out of Sellestial while maintaining data quality and controlling costs.

Why this matters

Testing on a small list before scaling helps you:

  • Catch configuration errors early
  • Verify output quality
  • Understand costs
  • Identify issues quickly

How to test safely

  1. Create a small test list in HubSpot (10-50 records)
  2. Configure the pipeline to use the test list
  3. Deploy and process
  4. Review every result manually
  5. Adjust the configuration as needed
  6. Scale up to larger lists

When to require review

For high-impact operations, always require human review:

  • Data deletions
  • Record merges
  • Bulk updates to critical fields
  • New pipeline deployments

You can switch to auto-confirm after:

  • 50-100 successful reviews
  • You are confident in the results
  • Pattern validation is complete

Monitoring cadence

Daily checks:

  • Processing status
  • Error rates
  • Success rates

Weekly reviews:

  • Cost analysis
  • Quality spot checks
  • Performance trends

Monthly audits:

  • Configuration review
  • Results validation
  • ROI assessment

Recommended sequence
  1. Validate — Code pipelines for basic validation
  2. Classify — Classifier pipelines for quality assessment
  3. Clean — Agent pipelines for normalization
  4. Deduplicate — Deduplication + merge pipelines
  5. Maintain — Agent pipelines for ongoing monitoring

Deduplication

Set up rules:

  1. Start with exact matches on unique identifiers (domain, LinkedIn URL)
  2. Add fuzzy matching cautiously (see the sketch below)
  3. Test with a small dataset
  4. Publish when confident
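
The exact-then-fuzzy idea, independent of Sellestial's actual rule engine, looks roughly like this in Python (the field names and the 0.92 threshold are illustrative assumptions):

from difflib import SequenceMatcher

def is_likely_duplicate(a: dict, b: dict, threshold: float = 0.92) -> bool:
    # Rule 1: an exact match on a unique identifier is the safest signal.
    for key in ("domain", "linkedin_url"):
        if a.get(key) and a.get(key) == b.get(key):
            return True
    # Rule 2: fuzzy name matching only as a cautious fallback, with a high bar.
    name_a, name_b = a.get("name", "").lower(), b.get("name", "").lower()
    if name_a and name_b:
        return SequenceMatcher(None, name_a, name_b).ratio() >= threshold
    return False

# An exact domain match wins even though the names differ.
print(is_likely_duplicate({"domain": "acme.com", "name": "Acme Inc"},
                          {"domain": "acme.com", "name": "ACME Incorporated"}))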

Review pairs:

  • Check all pairs before merging
  • Document merge decisions
  • Keep audit trail

Large merges:

  • Extra caution for companies with more than 30 associations
  • Verify data quality first
  • Consider manual review

Ongoing maintenance

Continuous:

  • New record validation (Code and Classifier pipelines)

Weekly:

  • Recent record cleaning (Agent pipelines on last 7 days)

Monthly:

  • Full database cleanup pass
  • Deduplication review
  • Employment verification

Quarterly:

  • Comprehensive data audit
  • Rule review and adjustment
  • Archive obsolete records

Filter before enriching

Run the cheap stages first and enrich only what survives (sketched in code below):

Step 1: Validation (free Code pipeline)
Step 2: Quality assessment (low-cost Classifier)
Step 3: Filter for high-quality records
Step 4: Data cleaning (moderate-cost Agent)
Step 5: Enrichment (higher-cost service)

Savings: you avoid paying to enrich invalid or low-quality records
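
A rough sketch of the staged filter in Python; the field names, the email check, and the 0.7 quality cutoff are illustrative assumptions, not Sellestial's API:

import re

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def validate(records):
    # Stage 1 (free): drop records that fail basic rule checks.
    return [r for r in records if EMAIL_RE.match(r.get("email", ""))]

def assess_quality(records, min_score=0.7):
    # Stage 2 (low-cost): keep only records a classifier scored highly.
    return [r for r in records if r.get("quality_score", 0) >= min_score]

records = [
    {"email": "ana@acme.com", "quality_score": 0.9},
    {"email": "not-an-email", "quality_score": 0.9},
    {"email": "bob@example.com", "quality_score": 0.2},
]
to_enrich = assess_quality(validate(records))
print(to_enrich)  # only the first record reaches the expensive stages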

What to enrich

Don’t enrich:

  • All contacts (expensive)
  • Test data
  • Invalid emails
  • Archived records

Do enrich:

  • Active opportunities
  • Recent leads (last 30 days)
  • High-value segments
  • Missing critical data

Choosing write modes

Three modes control when a pipeline may overwrite a field (a behavioral sketch follows the lists below).

Default to “Write if empty”:

  • Safest option
  • Preserves existing data
  • Fills gaps only

Use “Always overwrite” for:

  • Standardization fields (Country, Industry)
  • Known poor-quality data
  • Enrichment refresh

Use “Write if not modified by user” for:

  • Job titles (change frequently)
  • Company data (needs updates)
  • Phone numbers (may update)
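
Behaviorally, the three modes differ only in when an incoming value may replace the stored one. A minimal sketch (the record structure and user_modified_fields tracking are illustrative assumptions; the mode labels mirror the UI):

def apply_write(record: dict, field: str, value, mode: str) -> None:
    if mode == "write_if_empty" and record.get(field):
        return  # safest: never touch existing data, only fill gaps
    if mode == "write_if_not_modified_by_user" \
            and field in record.get("user_modified_fields", set()):
        return  # preserve manual edits, refresh everything else
    record[field] = value  # "always_overwrite" always falls through

record = {"country": "USA", "user_modified_fields": set()}
apply_write(record, "country", "United States", "write_if_empty")
print(record["country"])  # still "USA": the field was not empty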

Standardize values

Configure standardization for (example mapping below):

  • Country names
  • Industry classifications
  • State abbreviations
  • Standard enums

Benefits:

  • Consistent segmentation
  • Clean reporting
  • Reliable automation
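
The standardization idea as a toy mapping, one canonical value per concept (the alias table is a hypothetical example, not a shipped list):

# Collapse common variants onto one canonical country name so segments,
# reports, and automations all group on the same value.
COUNTRY_ALIASES = {
    "usa": "United States",
    "u.s.": "United States",
    "united states of america": "United States",
    "uk": "United Kingdom",
    "great britain": "United Kingdom",
}

def normalize_country(value: str) -> str:
    cleaned = value.strip()
    return COUNTRY_ALIASES.get(cleaned.lower(), cleaned)

print(normalize_country(" USA "))  # United States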

Daily processing limits

Set limits that match your volume:

High-volume (1000s/day):

Enrollment: 1000/day
Generation: 1000/day
Weekend blocking: OFF

Medium-volume (100s/day):

Enrollment: 100/day
Generation: 100/day
Weekend blocking: Optional

Low-volume (10s/day):

Enrollment: 50/day
Generation: 50/day
Weekend blocking: ON
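
Weekend blocking is effectively a calendar check before processing; conceptually (a sketch, not the platform's implementation):

from datetime import date

def should_process(today: date, weekend_blocking: bool) -> bool:
    # weekday() returns 5 for Saturday and 6 for Sunday.
    return not (weekend_blocking and today.weekday() >= 5)

print(should_process(date(2024, 6, 8), weekend_blocking=True))   # Saturday: False
print(should_process(date(2024, 6, 10), weekend_blocking=True))  # Monday: True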

Review settings

Use auto-confirm for:

  • Well-tested pipelines
  • Low-risk operations (reads, logging)
  • High-volume, low-impact work

Require human review for:

  • New pipelines (first 50-100 records)
  • High-impact operations (merges, deletes)
  • Critical data updates
  • Uncertain classifications

Use timed auto-confirm (10 minutes) for:

  • Sequences (review content quality)
  • Medium-impact updates
  • Allows quick review without blocking
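
The timed mode behaves like a review window: a result sits in a queue and is confirmed automatically once the window lapses unless a reviewer acts first. A sketch of that logic (the queue-item shape is an illustrative assumption):

from datetime import datetime, timedelta

REVIEW_WINDOW = timedelta(minutes=10)

def resolve(item: dict, now: datetime) -> str:
    # A reviewer decision always wins; otherwise confirm after the window.
    if item.get("reviewer_decision"):
        return item["reviewer_decision"]
    if now - item["queued_at"] >= REVIEW_WINDOW:
        return "auto_confirmed"
    return "pending"

queued_at = datetime(2024, 6, 10, 9, 0)
print(resolve({"queued_at": queued_at}, datetime(2024, 6, 10, 9, 5)))   # pending
print(resolve({"queued_at": queued_at}, datetime(2024, 6, 10, 9, 15)))  # auto_confirmed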

Data sources

Always enable:

  • HubSpot contact/company (core data)
  • Previous communication (avoid duplicates)

Selectively enable:

  • LinkedIn profiles (if personalization needed)
  • Company news (if timing important)

Rarely enable:

  • LinkedIn posts (expensive, often low value)
  • All external sources simultaneously

Estimate costs

Calculate expected spend:

Pipeline cost per record × Expected volume = Daily cost

Example:
Agent pipeline: 1 credit/contact
500 contacts/day
= 500 credits/day
= ~15,000 credits/month
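
The same arithmetic as a small helper (the credit rates come from the example above, not a price list):

def monthly_credits(credits_per_record: float, records_per_day: int,
                    days_per_month: int = 30) -> float:
    # daily cost = cost per record × volume; a month is roughly 30 processing days
    return credits_per_record * records_per_day * days_per_month

print(monthly_credits(credits_per_record=1, records_per_day=500))  # 15000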

Set daily limits: Match budget constraints with processing limits in pipeline settings.

Monitor credit usage

Weekly:

  • Check Usage → Credits tab
  • Review category distribution
  • Identify unexpected spikes

Monthly:

  • Analyze trends
  • Compare to budget
  • Adjust limits or strategy

Cost reduction strategies

1. Filter aggressively:

Before enrichment:
- Remove test data
- Validate with Code pipelines (free or low-cost)
- Assess quality with Classifiers
- Filter out invalid records
Result: only quality records are enriched

2. Target high-value segments:

Instead of:
50,000 contacts × enrichment cost = high total
Target:
2,500 active opportunities × enrichment cost = lower total
Savings: 95% fewer records enriched

3. Use cheaper pipelines first:

Code (Free/Low) → Filter
Classifier (Low-Moderate) → Filter
Then: Agent and enrichment on validated data

4. Adjust processing frequency:

Daily: New records only
Weekly: Recent activity
Monthly: Full database
Quarterly: Comprehensive audit

Performance

Parallel processing:

  • Use multiple pipelines simultaneously
  • Process different segments
  • Separate hygiene from enrichment

Efficient data sources:

  • Disable unnecessary sources
  • Use only required fields
  • Minimize external API calls

Batch sizing:

  • Find optimal batch size
  • Too small = overhead
  • Too large = timeouts
  • Test batches of 50-200 records (sketched below)
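
Chunking work into fixed-size batches is a generic pattern; a sketch (the 100-record default is an arbitrary starting point within the suggested 50-200 range):

from itertools import islice

def batched(records, size: int = 100):
    # Yield successive fixed-size chunks; the final chunk may be smaller.
    it = iter(records)
    while chunk := list(islice(it, size)):
        yield chunk

for batch in batched(range(250), size=100):
    print(len(batch))  # 100, 100, 50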

Result quality

For enrichment:

  1. Clean data first
  2. Ensure required fields are present
  3. Validate company associations
  4. Use quality source lists

For classification:

  1. Provide sufficient context
  2. Use appropriate model
  3. Add validation if needed
  4. Review borderline cases

HubSpot lists

Create specific lists:

"Contacts - Missing Email"
"Contacts - Invalid Domain"
"Companies - Need LinkedIn URL"
"Contacts - Created Last 7 Days"

Use active lists:

  • Auto-updates membership
  • Continuous processing
  • No manual maintenance

Document criteria:

  • Clear naming
  • Note purpose
  • Track usage

Custom properties

Naming conventions:

  • Use the sellestial_ prefix
  • Descriptive names (for example, a hypothetical sellestial_email_status)
  • Consistent format

Documentation:

  • Add help text
  • Explain values
  • Link to pipelines

Organization:

  • Group related properties
  • Standard property groups
  • Easy to find

Field mappings

Review mappings:

  • Correct target fields
  • Appropriate write modes
  • Proper data types

Test writes:

  • Verify updates work
  • Check data format
  • Validate calculations

Compliance

GDPR/CCPA:

  • Document data processing
  • Provide opt-out mechanisms
  • Respect preferences
  • Maintain records

Internal policies:

  • Follow company guidelines
  • Legal team review
  • Compliance checks

Security

User permissions:

  • Admin: Full access
  • User: Pipeline management
  • Viewer: Read-only

API keys:

  • Secure storage
  • Regular rotation
  • Scope limiting
  • Audit usage

Document your pipelines

For each pipeline, record:

  • Purpose and goals
  • Configuration details
  • Input sources
  • Expected outputs
  • Review process
  • Owner/contacts

Change management

Document:

  • Configuration changes
  • Deployment dates
  • Version numbers
  • Reasons for changes
  • Impact analysis

Communicate:

  • Notify affected teams
  • Explain changes
  • Training if needed
  • Support available

Team roles

Admin:

  • Platform configuration
  • Pipeline deployment
  • Settings management
  • User access control

Pipeline Owner:

  • Pipeline configuration
  • Quality monitoring
  • Issue resolution
  • Performance optimization

Reviewer:

  • Manual review tasks
  • Quality spot checks
  • Feedback reporting

Regular sync:

  • Weekly pipeline review
  • Monthly performance review
  • Quarterly strategy planning

Documentation:

  • Share best practices
  • Document decisions
  • Maintain runbooks
  • Update guides

Troubleshooting

When issues arise:

  1. Check basics:

    • Pipeline active?
    • Sources configured?
    • Limits reasonable?
    • Recent deployments?
  2. Review logs:

    • Execution logs
    • Error messages
    • Status codes
    • Timing information
  3. Test in isolation:

    • Single record test
    • Verify each component
    • Check data sources
    • Review output
  4. Adjust & retry:

    • Fix identified issues
    • Deploy changes
    • Retest
    • Monitor results
  5. Escalate if needed:

    • Document issue
    • Gather logs
    • Contact support
    • Provide context