How We Tackle Attribution at Tasman - Part 2 (And Package Release!)

13.02.24 • 5 min read

We have spent years building and refining a reconfigurable dbt package that empowers both our analytics engineers and data analysts. We are now ready to share this tool with the world 🙌

Find the other parts in this series here:

Part 1: The state of attribution in 2024
Part 2: How we do attribution (this post)
Part 3: Attribution outputs & actions

Deterministic attribution is certainly not dead yet, particularly for web. It is still one of the most important methods for extracting value from your first-party event data. Even though a lot of performance marketing will have to start using more statistical models, marketeers still like to have a deterministic attribution model as a frame of reference for advanced data-driven techniques.

But building a configurable attribution model is not simple, and there are very few dbt packages that enable parametrised and reconfigurable attribution. Most attribution packages hardcode key business logic, which makes them very hard to tailor to your unique situation.

So we built our own and have stress-tested it over the last few years. Now we are ready to make it open source.

🔗 👉 Go to our GitHub page to explore further — and read on for more information about its mechanics!

With the Tasman MTA engine, you can easily configure rules and templates to unlock two things:

Effortless setup: Get started quickly with a clear set of configurable settings within dbt.
Actionable insights: Generate two essential outputs that illuminate your multi-touch attribution data.

We’ll cover some elements of the settings in the next section.

Models

The engine can handle multiple models at the same time. Each model has a consistent identifier that is used across all files in the package to indicate which attribution model a given configuration relates to.

For a last-touch model with a 30-day attribution window on a payment conversion, this might be a good name to use: last_touch_30_days_payment

Touch and Conversion Rules

Touch rules define what interactions qualify as relevant touchpoints – a website visit, an ad click, an email open. They can be used to filter based on specific characteristics, like channel, campaign, or content type.

Conversion rules, on the other hand, set the criteria for what constitutes a successful conversion – a purchase, a signup, a download. These rules work together to paint a clear picture of which touchpoints played a role in driving that desired outcome. By adjusting and refining these rules, you can tailor the engine to your specific business goals and gain deeper insights, ultimately optimising your marketing efforts for maximum impact.

These rules are defined in seed files that can be easily amended if the logic changes over time.
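To make the idea of rule-driven filtering concrete, here is a minimal Python sketch rather than the package's actual implementation. The rule columns (model_id, attribute, value), the event fields and the helper matches_any_rule are all hypothetical names chosen for illustration; the real seed files may be laid out differently.

```python
# Hypothetical rule rows, loosely mimicking what a seed file could hold.
# Column names are illustrative, not the package's actual schema.
TOUCH_RULES = [
    {"model_id": "last_touch_30_days_payment", "attribute": "channel", "value": "paid_search"},
    {"model_id": "last_touch_30_days_payment", "attribute": "channel", "value": "email"},
]


def matches_any_rule(event, rules, model_id):
    """Return True if the event satisfies at least one rule for the given model."""
    return any(
        rule["model_id"] == model_id and event.get(rule["attribute"]) == rule["value"]
        for rule in rules
    )


# Made-up events standing in for first-party event data.
events = [
    {"event_id": "e1", "channel": "paid_search"},
    {"event_id": "e2", "channel": "direct"},
    {"event_id": "e3", "channel": "email"},
]

# Keep only the events that qualify as touches for this model.
touches = [e for e in events if matches_any_rule(e, TOUCH_RULES, "last_touch_30_days_payment")]
touch_ids = [e["event_id"] for e in touches]
```

Because the rules live in data rather than in code, changing which channels count as touches means editing a row, not rewriting the filter.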

Attribution Rules

The attribution rules seed defines how touches are attributed to conversions for each attribution model. Each set of rules is grouped into a spec, and each spec can be assigned a different conversion share value in the conversion shares seed.

Conversion Shares

In simple words, conversion share is a metric that defines the percentage of credit each touchpoint receives for contributing to a successful conversion.

Imagine each customer interaction as a step on a journey, with the final conversion being the desired destination. Conversion share then measures how much each “step” contributed to reaching that endpoint.

In the engine, the conversion shares seed is used to map attribution rules specs to decimal percentage conversion credits that are applied to matching touches.
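As a rough illustration of mapping specs to credits, the Python sketch below splits one conversion across a three-touch journey. The spec names (first_touch, middle_touch, last_touch), the share values and the assign_spec helper are assumptions made for this example and are not taken from the package.

```python
# Hypothetical spec-to-share mapping; names and values are illustrative only.
CONVERSION_SHARES = {"first_touch": 0.25, "middle_touch": 0.25, "last_touch": 0.5}


def assign_spec(position, total):
    """Pick a spec name from a touch's 0-based position in the journey."""
    if position == total - 1:
        return "last_touch"
    if position == 0:
        return "first_touch"
    return "middle_touch"


# Touch ids ordered by time, all leading to a single conversion.
journey = ["t1", "t2", "t3"]

# Look up the decimal credit for each touch via the spec it matches.
credits = {
    touch_id: CONVERSION_SHARES[assign_spec(i, len(journey))]
    for i, touch_id in enumerate(journey)
}
total_credit = sum(credits.values())
```

These particular shares only sum to 1 for a journey of exactly three touches. Journeys of other lengths would need their own specs and shares, which is the kind of logic the seeds are meant to hold.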

Attribution windows

Attribution windows define the timeframe within which interactions can be credited for influencing a conversion. A window can be expressed in days, hours or minutes.

Attribution windows are important for three main reasons:

Accuracy: Different customer journeys take different lengths of time. A longer window ensures all relevant interactions are captured, especially for complex purchases or lengthy consideration periods.
Fairness: By setting a clear window, you avoid giving undue credit to touchpoints that happened too far back in the journey and might not have been very influential.
Insights: Analysing how conversion rates change with different window lengths can reveal interesting patterns about your customer journey and the effectiveness of different touchpoints at different stages.

Different industries have different standard window lengths, which reflect their typical customer journey as well as their marketing goals.

Apps: 7 days (typical install window)
E-commerce: 30 days (longer consideration periods)
Lead generation: 90 days (complex B2B sales cycles)

In the engine, the attribution window seed is used to define the maximum time between a touch and conversion for each attribution model.
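The window check itself is simple date arithmetic. The following Python sketch shows one way to express it, assuming a hypothetical ATTRIBUTION_WINDOWS lookup keyed by model name; the package's seed columns and units may differ.

```python
from datetime import datetime, timedelta

# Hypothetical window settings keyed by model id; the real seed's columns may differ.
ATTRIBUTION_WINDOWS = {"last_touch_30_days_payment": timedelta(days=30)}


def within_window(touch_ts, conversion_ts, model_id):
    """True if the touch precedes the conversion by no more than the model's window."""
    elapsed = conversion_ts - touch_ts
    return timedelta(0) <= elapsed <= ATTRIBUTION_WINDOWS[model_id]


conversion_ts = datetime(2024, 2, 13, 12, 0)
recent_touch = datetime(2024, 2, 1, 9, 30)    # about 12 days before the conversion
stale_touch = datetime(2023, 12, 25, 18, 0)   # about 50 days before the conversion

recent_ok = within_window(recent_touch, conversion_ts, "last_touch_30_days_payment")
stale_ok = within_window(stale_touch, conversion_ts, "last_touch_30_days_payment")
```

Here the recent touch is eligible for credit and the stale one is not. Touches that happen after the conversion are also excluded, since the elapsed time would be negative.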

Final Outputs

Output 1: attributed_touches contains all filtered touches (based on the touch rules) across all attribution models that have been attributed to a conversion. Where touches have been attributed, there will be a conversion_event_id for that touch_event_id, as well as a conversion share value if appropriate.
Output 2: attributed_conversions is the inverse of attributed_touches and contains all filtered conversions (based on the conversion rules) across all attribution models, whether or not they have been attributed to a touch. Each conversion_event_id may appear once or multiple times depending on the number of attributed touches. Where touch_event_id is null, the conversion is unattributed. This is the model that should be used in downstream models to analyse attribution performance.
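As an example of downstream use, this Python sketch separates attributed from unattributed conversions in a handful of made-up rows shaped like attributed_conversions. Only conversion_event_id and touch_event_id come from the description above; the conversion_share column name and the data are assumptions.

```python
# Made-up rows shaped like attributed_conversions.
# "conversion_share" is an assumed column name for the share value.
rows = [
    {"conversion_event_id": "c1", "touch_event_id": "t1", "conversion_share": 0.5},
    {"conversion_event_id": "c1", "touch_event_id": "t2", "conversion_share": 0.5},
    {"conversion_event_id": "c2", "touch_event_id": None, "conversion_share": None},
]

# A conversion can span several rows, so deduplicate on its id.
all_conversions = {r["conversion_event_id"] for r in rows}

# A null touch_event_id marks an unattributed conversion.
attributed = {r["conversion_event_id"] for r in rows if r["touch_event_id"] is not None}
unattributed = all_conversions - attributed

attribution_rate = len(attributed) / len(all_conversions)
```

Counting distinct conversion ids rather than rows matters here, because a multi-touch conversion would otherwise be counted once per touch.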

Bonus point: Touches vs Sessions

The term used throughout this package to describe the actions taken by a given user is a touch. However, many organisations prefer to think about attribution from a session perspective. The engine supports both and isn’t opinionated in its approach.

If working with individual touches, it's important to filter out touches that come from an internal referrer, particularly when using a last-touch model. Otherwise the last touch will almost always be an internal touch and provide little insight.
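One possible way to drop internal touches is to compare the referrer's hostname against a list of your own domains. The Python sketch below assumes a hypothetical referrer field and uses example.com as a placeholder for your own site; it is not how the package itself does the filtering.

```python
from urllib.parse import urlparse

# Hypothetical list of our own hosts; replace with your real domains.
INTERNAL_HOSTS = {"www.example.com", "example.com"}


def is_internal(referrer):
    """True if the referrer URL points at one of our own hosts."""
    if not referrer:
        return False  # a missing referrer (e.g. direct traffic) is not internal
    return urlparse(referrer).hostname in INTERNAL_HOSTS


touches = [
    {"touch_event_id": "t1", "referrer": "https://www.google.com/search?q=shoes"},
    {"touch_event_id": "t2", "referrer": "https://www.example.com/checkout"},
    {"touch_event_id": "t3", "referrer": None},
]

# Keep only touches that did not originate from our own pages.
external_touches = [t for t in touches if not is_internal(t["referrer"])]
external_ids = [t["touch_event_id"] for t in external_touches]
```

In this example the checkout page view is removed, so a last-touch model would credit the search visit or the direct visit instead.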

If working with sessions, it's important that sessionisation is completed upstream of the engine, and that the model contains one row per session.
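A quick sanity check for the one-row-per-session requirement can look like the Python sketch below. The session_id and user_id field names and the rows are hypothetical; in a dbt project the same check is usually expressed as a unique test on the session key.

```python
from collections import Counter

# Made-up session rows; session "s2" appears twice by mistake.
sessions = [
    {"session_id": "s1", "user_id": "u1"},
    {"session_id": "s2", "user_id": "u1"},
    {"session_id": "s2", "user_id": "u1"},
]

# Count rows per session id and collect any id that appears more than once.
counts = Counter(row["session_id"] for row in sessions)
duplicate_session_ids = sorted(sid for sid, n in counts.items() if n > 1)
```

An empty duplicate_session_ids list means the input is at the expected grain.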