Crafting a robust marketing strategy across diverse channels and platforms, each with its own intricate user journeys, is a complex task. Yet assessing the effectiveness of your marketing expenditure and calculating its ROI should be straightforward.
Grasping how your marketing expenditure contributes to conversions and sales is vital. It empowers you to continually adapt your strategy to evolving customer preferences and market conditions. While numerous commercial solutions exist to help you understand your marketing campaigns’ performance, they often suffer from two main drawbacks:
- They are siloed. For instance, Facebook provides an overview of your performance on their platform, but comparing that to the performance of other channels can be challenging.
- They are one-size-fits-all black boxes that remove flexibility and control. For example, Google Analytics attribution can only attribute one channel to each session.
This is where GPT (Generative Pre-trained Transformer) and other Large Language Models (LLMs) come into play. They can help early-stage marketers apply scientific methods to measure marketing effectiveness.
Modeling Sessions
The Snowplow JavaScript tracker also captures a session ID with each event, the domain_sessionid, as well as a session index. The session cookie is set against the same domain as the domain_userid cookie (a first-party cookie set against the domain the tracking is on). By default, it expires after 30 minutes of inactivity, but a different interval can be chosen in the tracker initialization (e.g. sessionCookieTimeout: 3600).
The session ID is used to model sessions in Snowplow’s web data model. This SQL model aggregates out-of-the-box page views and page pings into a set of derived tables: page_views, sessions, and users. These tables have one row per page view ID (as captured in the web page context), session ID (i.e. the domain_sessionid), or user ID (i.e. the domain_userid).
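To make the aggregation concrete, here is a minimal Python sketch that collapses page view and page ping events into one row per domain_sessionid. It is not Snowplow’s web data model, which is written in SQL; the in-memory event dictionaries, the `build_sessions` helper, and the reduced set of fields (`event_name`, `derived_tstamp`) are simplifying assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime


def build_sessions(events):
    """Collapse raw events into one summary row per domain_sessionid."""
    grouped = defaultdict(list)
    for event in events:
        grouped[event["domain_sessionid"]].append(event)

    sessions = {}
    for session_id, session_events in grouped.items():
        timestamps = [e["derived_tstamp"] for e in session_events]
        sessions[session_id] = {
            "domain_userid": session_events[0]["domain_userid"],
            "session_start": min(timestamps),
            "session_end": max(timestamps),
            # Count each event type separately, as the derived tables do.
            "page_views": sum(e["event_name"] == "page_view" for e in session_events),
            "page_pings": sum(e["event_name"] == "page_ping" for e in session_events),
        }
    return sessions


# Hypothetical sample events: one user, two sessions.
events = [
    {"domain_userid": "u1", "domain_sessionid": "s1", "event_name": "page_view",
     "derived_tstamp": datetime(2024, 1, 1, 10, 0, 0)},
    {"domain_userid": "u1", "domain_sessionid": "s1", "event_name": "page_ping",
     "derived_tstamp": datetime(2024, 1, 1, 10, 0, 10)},
    {"domain_userid": "u1", "domain_sessionid": "s1", "event_name": "page_view",
     "derived_tstamp": datetime(2024, 1, 1, 10, 1, 0)},
    {"domain_userid": "u1", "domain_sessionid": "s2", "event_name": "page_view",
     "derived_tstamp": datetime(2024, 1, 1, 11, 0, 0)},
]

sessions = build_sessions(events)
```

The same grouping key could be swapped for the page view ID or the domain_userid to get rows at the page view or user level.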
Adding Marketing Costs (Google Ads Example)
If marketing costs are pulled into the data warehouse (using an ETL tool such as Stitch), they can be added to the sessions table based on the marketing parameters. For example, if the Google click and keyword performance reports are available, the average cost per click can be added to sessions that originated from a paid Google search using the marketing click ID:
```sql
CREATE TABLE .ad_kw_click_perf
  DISTKEY(gclickid)
  SORTKEY(gclickid)
AS (
  -- One row per click: the keyword and ad group it belonged to, and the day it occurred
  WITH click_perf AS (
    SELECT
      cpr.googleclickid AS gclickid,
      cpr.day::date AS date,
      cpr.adgroupid AS adgid,
      cpr.keywordid AS kwid
    FROM .click_performance_report AS cpr
    WHERE cpr.googleclickid IS NOT NULL
    GROUP BY 1, 2, 3, 4
  )
  -- Attach that day's keyword performance to each click
  SELECT
    cp.gclickid,
    kpr.keywordid AS kw_id,
    kpr.keyword AS kw,
    kpr.adgroup AS ad_group,
    kpr.adgroupid AS ad_gid,
    kpr.adgroupstate AS ad_g_state,
    kpr.campaign AS campaign,
    kpr.campaignid AS camp_id,
    kpr.campaignstate AS camp_state,
    kpr.customerid AS cust_id,
    kpr.clicks AS clicks,
    kpr.impressions AS impressions,
    -- Cost fields are divided by 1,000,000 to convert micros to currency units
    CAST((kpr.cost::float / 1000000::float) AS numeric(38,6)) AS total_cost,
    CAST((kpr.avgcpc::float / 1000000::float) AS numeric(38,6)) AS avg_cpc,
    kpr.day::date AS date
  FROM .keywords_performance_report AS kpr
  INNER JOIN click_perf AS cp
    ON kpr.keywordid = cp.kwid
    AND kpr.day::date = cp.date
    AND kpr.adgroupid = cp.adgid
  GROUP BY 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
);
```
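With a click-level lookup table like the one above in place, attaching costs to sessions comes down to a join on the click ID. The Python sketch below shows that step on in-memory rows. The `add_click_costs` helper and the sample data are hypothetical, and it assumes the sessions table exposes the marketing click ID as `mkt_clickid` and that the lookup holds one row per click ID.

```python
from decimal import Decimal


def add_click_costs(sessions, click_costs):
    """Return copies of the session rows with an avg_cpc column attached."""
    # Assumes one lookup row per click ID; a later duplicate would overwrite an earlier one.
    cost_by_click = {row["gclickid"]: row["avg_cpc"] for row in click_costs}

    enriched = []
    for session in sessions:
        row = dict(session)
        # Left-join semantics: sessions without a matching click get a None cost.
        row["avg_cpc"] = cost_by_click.get(session.get("mkt_clickid"))
        enriched.append(row)
    return enriched


# Hypothetical sample rows.
click_costs = [{"gclickid": "abc123", "avg_cpc": Decimal("0.42")}]
session_rows = [
    {"domain_sessionid": "s1", "mkt_clickid": "abc123"},  # paid Google search
    {"domain_sessionid": "s2", "mkt_clickid": None},      # no click ID, e.g. direct traffic
]

enriched_sessions = add_click_costs(session_rows, click_costs)
```

In the warehouse, the equivalent would be a left join from the sessions table to the lookup table, so that sessions from unpaid channels are kept with an empty cost.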
Next Steps
Once you have developed an understanding of which channels drive customers to your digital products, you can proceed with defining which activities you want to attribute, whether it's newsletter signups, PDF downloads, product purchases, subscriptions, etc. This information can also be added to sessions as additional metrics or flags. The resulting table can then be used as the basis for your various attribution models.
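As one illustration of what such attribution models might look like, the Python sketch below computes first-touch, last-touch, and linear credit from session rows that carry a conversion flag. These three models are common examples rather than anything the text prescribes, and the `attribute` helper and the `channel`, `session_index`, and `converted` fields are hypothetical names chosen for this sketch.

```python
from collections import defaultdict


def attribute(sessions, model="last_touch"):
    """Assign each conversion's credit to the channels in the journey before it."""
    credit = defaultdict(float)
    journeys = defaultdict(list)  # channels seen per user since their last conversion

    for session in sorted(sessions, key=lambda s: s["session_index"]):
        journey = journeys[session["domain_userid"]]
        journey.append(session["channel"])
        if not session["converted"]:
            continue
        if model == "first_touch":
            credit[journey[0]] += 1.0
        elif model == "last_touch":
            credit[journey[-1]] += 1.0
        elif model == "linear":
            for channel in journey:
                credit[channel] += 1.0 / len(journey)
        else:
            raise ValueError(f"unknown attribution model: {model}")
        journey.clear()  # the next conversion starts a fresh journey
    return dict(credit)


# Hypothetical session rows with a conversion flag added.
flagged_sessions = [
    {"domain_userid": "u1", "session_index": 1, "channel": "paid_search", "converted": False},
    {"domain_userid": "u1", "session_index": 2, "channel": "email", "converted": False},
    {"domain_userid": "u1", "session_index": 3, "channel": "direct", "converted": True},
    {"domain_userid": "u2", "session_index": 1, "channel": "social", "converted": True},
]

first_touch = attribute(flagged_sessions, "first_touch")
last_touch = attribute(flagged_sessions, "last_touch")
linear = attribute(flagged_sessions, "linear")
```

Comparing the three outputs for the same sessions shows how much the choice of attribution logic changes which channels appear to perform.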
Marketing Attribution with GPT and Other LLMs
Data is driving more high-stakes decisions across companies and industries, and marketing strategies are no exception. As your channel mix and user journeys grow more complex, it becomes less likely that siloed or one-size-fits-all commercial tools will deliver what you need to attribute and optimize your marketing spend accurately.
Attributing credit to different events in the journey provides evidence of what is and isn’t working. But without being able to take charge of your data to choose the attribution logic that reflects your customers’ journeys (and their touchpoints), you cannot truly understand the real return on your investment. With GPT and other LLMs, you have that flexibility and control.

About Sharad Jain
Sharad Jain is an AI Engineer and Data Scientist specializing in enterprise-scale generative AI and NLP. Currently leading AI initiatives at Autoscreen.ai, he has developed ACRUE frameworks and optimized LLM performance at scale. Previously at Meta, Autodesk, and WithJoy.com, he brings extensive experience in machine learning, data analytics, and building scalable AI systems. He holds an MS in Business Analytics from UC Davis.