How it works: Data

Uncompromising Quality

Private company intelligence requires a unique approach. Because the data comes from thousands of sources of varying quality, SourceScrub built an operation that combines advanced data acquisition technologies with a human approach to data quality. The result is the most accurate, complete and fresh private company data available.


Our data process

We've designed our system to optimize for Private Company Intelligence. A set of systems, processes and a team of over 650 analysts work around the clock to ensure maximum accuracy and precision, essential for a private equity deal sourcing platform.

Data Collection

Privately-held company data is challenging to source and digitize. Crawlers and researchers work methodically across 115,000+ sources to find and ingest data. Our data operations team is organized by data dimension to ensure a deep knowledge and understanding of the data types.

Structuring the Data

With a complex data set spanning 4 dimensions and hundreds of data fields, normalizing and structuring the data is critical to getting the data model right. Structuring the data ensures users can create the right connections across companies and across verticals. We use a dimensional data model to create linkages across the four data dimensions: companies, sources, investors and people.

Data Quality Operations

Our world-class Data Operations team works 24/7 to normalize, edit and QA our data. It's the combination of web technologies with human editorial review that gives SourceScrub a unique advantage. Some of the quality processes we have in place include:

Hand-written company descriptions which ensure accurate understanding of the company as well as richer search and discovery.

Cross-referencing critical data points such as employee counts to ensure the most accurate representation of the data.

Outlier data QA ensures data from disparate sources makes sense. Our human QA process cleans data in a way that machine learning cannot.
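As an illustration only, the cross-referencing and outlier checks described above could be sketched as follows. This is a minimal sketch, not SourceScrub's actual process: the function name `cross_reference`, the source labels, and the 50% tolerance are all assumptions made for the example. It takes the employee counts reported by several sources, uses the median as a consensus value, and flags any source that deviates too far so that a human analyst can review it.

```python
from statistics import median

def cross_reference(counts_by_source, tolerance=0.5):
    """Return (consensus, outliers) for one company's reported employee counts.

    counts_by_source maps a source label to the count it reports.
    tolerance is a hypothetical threshold: the relative deviation from
    the median beyond which a value is flagged for human review.
    """
    consensus = median(counts_by_source.values())
    outliers = {
        source: count
        for source, count in counts_by_source.items()
        # Skip the ratio when the consensus is 0 to avoid dividing by zero.
        if consensus and abs(count - consensus) / consensus > tolerance
    }
    return consensus, outliers

# Hypothetical sample data: three sources disagree on headcount.
reported = {
    "company_website": 120,
    "conference_list": 130,
    "industry_directory": 1200,
}
consensus, outliers = cross_reference(reported)
```

In this sketch the automated step only flags the disagreeing value. Deciding which figure is right is left to the human QA step the page describes.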

Delivering Data to Customers

Once our data is collected, organized and quality assured, we give customers access to it in the way that makes the most sense for them. Whether through data exports, a web interface, CRM integration, API access or even a Data Warehouse, SourceScrub delivers the data the way you need it.

4-Dimensional Data

Our approach to data starts with a deep understanding of our customers' goals. We built a 4-dimensional data model to help customers quickly find, research and connect with privately-held companies. This purpose-built approach gives users exactly what they need to be successful.


The company dimension captures core details on the company such as employee count, revenue, job postings and website ranking.


The sources dimension captures where companies show up on the web. This includes buyer's guides, best-of lists, conference attendance and industry associations.


The people dimension captures contact details and professional background of the people associated with a company.


The investor dimension captures information on the investors behind the companies. This includes transaction details, portfolio companies and deal history.
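To make the four dimensions concrete, here is one way such a model might be represented. This is a hedged sketch under stated assumptions rather than SourceScrub's actual schema: every class name, field name, and sample value below is hypothetical, and only a few of the fields mentioned above are included.

```python
from dataclasses import dataclass, field

@dataclass
class Source:
    name: str
    kind: str  # e.g. "conference", "buyer's guide", "best-of list"

@dataclass
class Person:
    name: str
    title: str

@dataclass
class Investor:
    name: str
    # Portfolio held as company names to keep the sketch free of cycles.
    portfolio: list = field(default_factory=list)

@dataclass
class Company:
    name: str
    employee_count: int
    # Linkages from the company dimension to the other three dimensions.
    sources: list = field(default_factory=list)
    people: list = field(default_factory=list)
    investors: list = field(default_factory=list)

def companies_in_source(companies, source_name):
    """Pivot from the sources dimension back to companies."""
    return [
        c.name
        for c in companies
        if any(s.name == source_name for s in c.sources)
    ]

# Hypothetical sample records.
expo = Source(name="Example Expo", kind="conference")
acme = Company(
    name="Acme Co",
    employee_count=85,
    sources=[expo],
    people=[Person(name="Jane Doe", title="CEO")],
    investors=[Investor(name="Example Capital", portfolio=["Acme Co"])],
)
globex = Company(name="Globex", employee_count=40)

attendees = companies_in_source([acme, globex], "Example Expo")
```

Because each company record links to its sources, people and investors, a lookup can start from any one dimension and reach the others, which is the kind of linkage the dimensional model is meant to support.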

9 core signals

While there are hundreds of signals to choose from, we've built unique data processes around 9 core signals. These signals are surfaced in our web platform as filters and pivots that accelerate your "time to insight".