Building a Data Stack: Founding Stage


Hey, startup founder! You already know you need analytics in your product or on your website. What you might not know is that you need fewer tools than you think. Yes, that’s controversial, but trust us and read on for our first proper stack recommendation post.

In a series of blog posts, we’ll set out what we think you need at each stage of your company’s lifecycle, and how to keep your data stack remorselessly focussed on being useful to the business.

Today: founding stage! What stack do you really need when you have just started up and are still establishing product-market fit? We typically place any company of under 10 people in this stage, although you may already have raised seed funding and be slightly larger. What matters most is that you are still experimenting with your product and your business model, and still making large, impactful decisions.

This means getting rough insight fast is much more important than surfacing sophisticated analytics slowly. For us, that means that rather than having a centralised data warehouse as a single source of truth, at founding stage you can probably get by with a handful of SaaS analytics tools. Some examples:

  • For product & user journey analytics on mobile (but also decent on web), find a tool that does good out-of-the-box reporting but also has ways to export data. We have had great experiences with Amplitude: it has lots of inbuilt reporting functionality which will get you up and running in no time, and it is pretty cheap (or even free) for most seed-stage companies. It can connect to most data warehouses, so you can also keep the full history of raw event data, and it allows you and your team to build your own reporting very easily. We also quite like the newer combination of mParticle and Indicative, but it requires a bit more set-up work to get right, so keep that in mind.
  • GA4 for web analytics: it is very easy to set up, and can optionally export data into BigQuery, meaning you can always return to it for further analysis later down the line. We recommend this particularly for web attribution (and find it less useful for other product analysis). (Update April 2023: make sure to understand the impact of the new attribution model changes!)
  • You could consider AppsFlyer or Adjust as mobile attribution tools if you are reliant on an app, but only if it fits into a marketing strategy that recognises that perfect attribution of installations is not possible. These tools are expensive, so make sure you have a plan to use them properly.
  • You do not really need a data warehouse at this point! You are still very much changing the fundamentals of your business, so do not overinvest in a stack just yet. There are exceptions here of course: if you already know the general shape of things to come and want to get up and running with analytics as fast as possible, then it does make sense to have some infrastructure in place. But if in doubt, just wait.

All tool choices have compromises, bottlenecks and other risks. How do you hedge against them? We see three main concerns:

  • Exit strategy. You want to make sure you do not lock yourself into any long-term commitment with any of the tools you choose now: options and flexibility are your friends. When you grow, you want to be able to swap one tool for another relatively easily, e.g. to move your event tracking from Amplitude to RudderStack.
  • Data validation. As you do not have a data warehouse (yet), you will have limited ability to sense-check or validate the data yourself. There will be no single customer view without some extensive duct-taping together of different reports. This will not matter for e.g. simple funnel analysis, but it will be a critical gap once you start growing!
  • Price. Most analytics SaaS tools are priced on a proxy for data volume, meaning that you will start out on a really cheap deal that gets progressively more expensive (and less attractive) as your usage grows.
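
One way to keep the exit-strategy concern manageable is to route all tracking calls through a thin wrapper of your own, so that changing vendors means writing one new adapter and leaving your product code alone. The sketch below is only an illustration under assumptions: `EventSink`, `InMemorySink` and `Tracker` are hypothetical names, and the in-memory sink stands in for a real vendor adapter, which would call that vendor's own SDK instead.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Optional, Protocol


class EventSink(Protocol):
    """Hypothetical interface: anything that can receive a tracked event."""

    def send(self, event: dict[str, Any]) -> None: ...


@dataclass
class InMemorySink:
    """Stand-in for a vendor adapter; a real one would call the vendor SDK."""

    events: list[dict[str, Any]] = field(default_factory=list)

    def send(self, event: dict[str, Any]) -> None:
        self.events.append(event)


@dataclass
class Tracker:
    """The only tracking API the rest of the product code should touch."""

    sinks: list[EventSink]

    def track(
        self,
        user_id: str,
        name: str,
        properties: Optional[dict[str, Any]] = None,
    ) -> dict[str, Any]:
        # Build one vendor-neutral event shape and fan it out to every sink.
        event = {
            "user_id": user_id,
            "event": name,
            "properties": properties or {},
            "timestamp": datetime.now(timezone.utc).isoformat(),
        }
        for sink in self.sinks:
            sink.send(event)
        return event


# Usage: start with one tool, then add a second sink while you migrate.
current_tool = InMemorySink()
next_tool = InMemorySink()

tracker = Tracker(sinks=[current_tool])
tracker.track("user-123", "signup_completed", {"plan": "free"})

tracker.sinks.append(next_tool)  # run both side by side during a switch
tracker.track("user-123", "project_created")
```

Sending to both sinks for a while also gives you a rough way to sense-check one tool's numbers against another's before you have a warehouse.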

With that in mind, we think the seed series stage is where it starts to make a lot of sense to build a data warehouse.