Cross-elasticity and SKU price differentiation in dynamic pricing

PricingHUB
12 min read · Apr 9, 2021

In my previous article, I talked about the different approaches for building a dynamic pricing engine. There I stated that, in my opinion, target-based pricing using continuous experimentation was in general better positioned to create value than other dynamic pricing approaches. This statement was based on two observations:

  • Pricing approaches focusing on pursuing a target can align better to a company’s objectives.
  • Experimentation allows for faster adaptation to ever-changing contexts (such as competition, pandemic situations, proximity to holidays etc.).

In this new post, I’m focusing on target-based and experimentation-based pricing methodologies. I’ll be examining whether it’s worth implementing price differentiation at SKU level, and how cross-elasticity can be managed. Let’s tackle these questions alongside a fictitious pricing specialist, Jean, who works at an online retailer that sells n products. The products present cross-elasticity effects (the price of one affects the sales of another), and the retailer can change prices on a daily basis. Jean’s role is to design the target-based and experimentation-based pricing engine that the company wants to create for KPI optimisation.

Should targets be defined at portfolio level or SKU level?

The first decision that Jean has to make is: ‘Should targets be defined at portfolio level (i.e. the group of SKUs under optimisation) or at SKU level?’ Jean imagines a simplified scenario with only two SKUs that can cannibalise each other, and considers how the target could be set:

Setting objectives at SKU level (Option 1): “Optimise profit but always respecting a minimum sales volume of 100 units for product A and a minimum of 100 units for product B as well.”

Setting objectives at an overall level (Option 2): “Optimise profit but always respecting that the combined sales of products A and B stay above 200 units.”

Note that expressing an optimisation problem as in option 2 is more generic than option 1. Any solution able to fulfil the constraints of option 1 (a minimum number of sales for both product A and product B) will also satisfy the constraint of option 2 (sales of product A + sales of product B > 200). However, the opposite is not true: not all solutions for option 2 are also valid for option 1.

As an example, see Table 1 and Table 2. In Table 1 we can see a scenario where at least 100 items of both products are sold. In Table 2 this constraint is not met; only 50 sales of product A are achieved. The decrease in price for product B has cannibalised sales for product A in favour of sales of product B. The price decrease has also generated extra sales for product B. Overall, the scenario in Table 2 generates more volume, more revenue, and more profit. However, it won’t be a valid scenario if the optimisation problem is stated as in option 1.

Is there a reason why the scenario in Table 2 isn’t more attractive than the one in Table 1? There might be a very valid one: managing stock, or the relationships and dependencies with suppliers. However, Jean thinks that defining goals at SKU level is equivalent to throwing a spanner in the works, as potentially very attractive results may be lost. He thinks that a customer-centric focus can be much more powerful than a supplier-centric focus because, in the end, elasticity is an end-consumer characteristic, not a supplier one. Jean would try to manage the suppliers later on.

Jean has decided to go with defining the target at portfolio level, and not at SKU level. Now he needs to come up with a model that explains the KPI he wants to optimise as a function of the prices of the products:
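Schematically, this can be sketched as (with f standing for the relationship to be learned):

$$\mathrm{KPI}_t = f(p_1, p_2, \ldots, p_n, C_t)$$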

Where KPI (in capital letters) refers to the daily value of the variable that Jean wants to optimise at portfolio level, pᵢ refers to the price of each individual SKU, and i identifies the SKUs. Cₜ involves any predictor or signal incorporated into the model to explain how the KPI changes with seasonality and context, and hence it is time dependent. Knowing such a relationship, Jean would be able to engage in an optimisation routine to find which combination of prices (p₁, p₂, …, pₙ) produces higher KPI levels.

It looks to Jean like there are many parameters (or predictors) to explain a single daily KPI (or response). The situation gets even worse when client segmentation comes into play. Products are not elastic or inelastic, consumers are (that’s why we talk about customer price sensitivity). So, for Jean it’s kind of mandatory to account for consumer segmentation, meaning the problem results in:
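Sketching it with s = 1…j indexing the consumer segments (my labels), where $p_{i,s}$ is the price of SKU i shown to segment s:

$$\mathrm{KPI}_t = f\left(p_{1,1}, \ldots, p_{i,s}, \ldots, p_{n,j},\; C_t\right)$$

This seems to be the formulation the text later refers to as Equation 2.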

So, a portfolio of n=100 products and four consumer segments makes 400 parameters to explain one single signal, the KPI!

Models are likely to overfit badly. Lasso and other similar models with feature selection would reduce the number of predictors. But this is equivalent to explaining the portfolio performance with just a few products, disregarding the contribution that many SKUs could make if unexplored low prices were to be tested. This, again, does not look like a good idea to Jean. Clearly, Jean needs to reduce the number of predictors in the modelling!
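To make the feature-selection point concrete, here is a minimal sketch on synthetic data (not Jean’s actual setup): one daily portfolio KPI regressed on 400 SKU-segment prices with a Lasso.

```python
# Minimal sketch with synthetic data: one daily portfolio KPI explained by
# 400 SKU-segment price predictors. Lasso keeps only a subset of them.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n_days, n_predictors = 365, 400                   # ~1 year of data, 100 SKUs x 4 segments

prices = rng.uniform(5, 50, size=(n_days, n_predictors))
true_effects = rng.normal(0, 1, n_predictors)     # here, every predictor genuinely matters
kpi = prices @ true_effects + rng.normal(0, 50, n_days)

model = Lasso(alpha=5.0).fit(prices, kpi)
print(f"Non-zero coefficients: {(model.coef_ != 0).sum()} of {n_predictors}")
# SKUs whose coefficients are zeroed out effectively disappear from the
# portfolio story, even though their prices do matter in the synthetic truth.
```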

Working at SKU level

One possible solution is to assume that the KPI of the portfolio can be broken down into smaller functions, each dependent on a single product. So, the portfolio KPI is the sum of the kpis at SKU level. This way the problem results in:
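In symbols, the assumed decomposition is something like:

$$\mathrm{KPI}_t = \sum_{i=1}^{n} \mathrm{kpi}_i(p_i, C_t)$$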

Where kpiᵢ (in lowercase) refers to the contribution of each SKU i to the KPI variable at portfolio level. Assuming SKU independence as per the formula above allows Jean to move from one model with hundreds of predictors to hundreds of models with a handful of predictors each. This lowers the risk of overfitting while keeping maximum price granularity. Plus, historical data at SKU level can be used to fit the models, which is great!
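In code, the idea looks roughly like this (illustrative names and synthetic data): one small model per SKU, each with just a few predictors, whose predictions are then summed to get the portfolio KPI.

```python
# Sketch: hundreds of small per-SKU models instead of one huge portfolio model.
# Synthetic data; in practice the per-SKU kpi and context come from history.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n_days, n_skus = 365, 100
context = rng.normal(size=(n_days, 3))               # e.g. seasonality / promo signals

sku_models = {}
for i in range(n_skus):
    price_i = rng.uniform(5, 50, size=n_days)
    X = np.column_stack([price_i, context])          # 4 predictors per SKU model
    kpi_i = 200 - 3 * price_i + rng.normal(0, 30, n_days)
    sku_models[i] = LinearRegression().fit(X, kpi_i)

# Under the independence assumption, the portfolio KPI forecast for a candidate
# price vector is simply the sum of the per-SKU model predictions.
```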

However, there are three drawbacks:

  1. Cross-elasticity (the effect of the price of a product j, pⱼ, on the kpi performance of another product i, kpiᵢ) is not acknowledged at all. End consumers may or may not compare Jean’s products against other retailers. But they’ll definitely be exposed to different products during the purchase experience in the retail store or on the web. SKU comparison is unavoidable and, thus, SKU independence is a strong assumption. In other words, acknowledging cross-elasticity is a must.
  2. Data at SKU level is noisy. For example, if a portfolio of 100 products shows a noise level of 10% of the value of a given KPI (note the upper case), then at SKU level a noise level of around 100% with respect to the kpi (note the lower case) is expected on average (see the back-of-the-envelope check after this list). Summing up hundreds of noisy models doesn’t look right to Jean.
  3. The number of free parameters to optimise the KPI is still high (equal to the number of products times the number of client segments, n × j). After understanding how the KPI evolves with prices, Jean would need to find the price combination that maximises the KPI. Having such a large number of free parameters would make the optimisation routine challenging. I personally would dare to say that the inability to solve such a complex optimisation problem at portfolio level is one of the two reasons at the core of why many companies end up setting business targets at SKU level rather than at portfolio level. It’s just easier… and it might be enough to create sufficient value.
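A back-of-the-envelope check of the noise claim in point 2, assuming the 100 SKUs contribute roughly equally and their noise is independent (so it adds in quadrature):

$$\sigma_{\mathrm{kpi}} \approx \frac{\sigma_{\mathrm{KPI}}}{\sqrt{n}} = \frac{0.10 \cdot \mathrm{KPI}}{\sqrt{100}} = 0.01 \cdot \mathrm{KPI} \approx \overline{\mathrm{kpi}}$$

since the average SKU contribution is $\overline{\mathrm{kpi}} = \mathrm{KPI}/100$. In other words, at SKU level the noise is of the same order as the signal itself.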

Working at group level

Jean finds an alternative solution: reducing the number of predictors in Equation 2 by grouping the SKUs. That way, Jean would only have to manage the prices of a handful of groups rather than of every single SKU.

However, what is the price of a group of SKUs? Is it the mean of the prices across all SKUs belonging to the group? Or the median? Or some other statistic? Different distributions of prices across SKUs can deliver the same statistic at group level. But, very probably, the different distributions would produce very different product mixes in terms of sales, which in turn would yield different KPI performance. This leads to new problems related to the reproducibility of the model predictions and to ambiguity in what an optimal price really means. Let’s illustrate this with an example using the mean as the statistic. Imagine two different price mixes for a group of SKUs (Figure 1), both producing the same mean, P3 = P3’ = P3’’, but different performance (Figures 2 and 3).

Figure 1
Figure 2
Figure 3
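A toy numeric version of the same point (made-up numbers and an assumed linear demand per SKU, purely for illustration): two price mixes with exactly the same mean can sell very different volumes.

```python
# Two price mixes for a 4-SKU group with the same mean price but, under the
# assumed per-SKU price sensitivities, very different sales volumes.
import numpy as np

mix_1 = np.array([10.0, 10.0, 10.0, 10.0])    # mean = 10
mix_2 = np.array([ 4.0,  6.0, 14.0, 16.0])    # mean = 10 as well
base  = np.array([200., 300., 150., 100.])    # assumed demand at price zero
slope = np.array([  5.,  20.,   3.,   1.])    # assumed units lost per unit of price

print(mix_1.mean(), mix_2.mean())              # same 'group price'
print((base - slope * mix_1).sum())            # 460 units
print((base - slope * mix_2).sum())            # 552 units
```

A model keyed on the mean alone cannot distinguish the two mixes, which is exactly the reproducibility problem described above.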

After gathering data and modelling the KPIs for a group of SKUs, Jean obtained the models shown in Figures 2 and 3. The models show that the maximum KPI is achieved when the price (remember, the mean of the SKU prices in the group) is equal to P3 (see Figure 2). However:

  • The model prediction is not achievable by setting P3’ or P3’’; it is a mid-point between their performances. Model results/predictions are not reproducible.
  • Given that P3 is optimal, which price mix should we set? P3’, P3’’, or any other as long as the mean equals P3? It’s ambiguous. If there is no constraint, it’s obvious that P3’’ is better than P3’. But what happens if we add a constraint to the optimisation problem? Imagine a minimum level for a secondary KPI (KPI’) is requested (Figure 3). Then P3’’ shouldn’t be an option, as it doesn’t fulfil the constraint on KPI’, and nor should P3’, since P2 is a better performer! We would be misled by the model indicating that P3 is optimal.

So then, how do we define the price for a group of SKUs in order to avoid such problems of irreproducibility and ambiguity?

There is a famous price experiment in which people were asked the following question, either in the form of version A or version B (A/B):

Imagine that you are about to purchase a jacket for $125/$15, and a calculator for $15/$125. The calculator salesman informs you that the calculator you wish to buy is on sale for $10/$120 at the other branch of the store, located 20 minutes drive away. Would you make the trip to the other store?

When version A of the question was posed, 68% of the respondents were willing to make the extra trip to save $5 on a $15 calculator. But when version B was posed, the share of respondents willing to make the extra trip (in order to save $5 on a $125 calculator) dropped to 29% (Tversky and Kahneman, 1981). Same inconvenience, same saving, but completely different answers. This irrational behaviour is explained by Tversky and Kahneman (1981): the subjective perceived value of money follows a concave curve (Figure 4). Thus, a discount of $5 has a greater subjective value when the price of the calculator is low than when it is high.

Figure 4 (adapted from Thaler, 1985)

What does this price experiment imply in practice? It implies that the mark-up can be higher for the most expensive or valuable products. So a pricing model that takes this effect into account can be built following the rule:
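As a sketch consistent with the definitions just below (this is what the text later calls Equation 5):

$$p_i = RP_i \times (1 + IR)$$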

Where pᵢ refers to the price of SKU i, RPᵢ refers to the reference price for SKU i, and IR refers to an incremental rate with respect to the reference price. The reference price can be set in many ways: it can be the break-even price of each product, the supplier price, the average price of the product over the last year, etc.

Taking that formula into account, Jean decides that it would be a good idea to select a proper reference price for each SKU, but use the same IR for all SKUs in the group. This way he can use the IR to define the price for a group. So then the original problem results in:
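Sketching it with one IR per group and m groups in total (my labels, leaving the segment dimension aside for readability):

$$\mathrm{KPI}_t = f\left(IR_1, IR_2, \ldots, IR_m,\; C_t\right)$$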

By introducing the IR as a predictor in the model, and by following the rule that all products within a group have the same IR at the same time, Jean removes the ambiguity and the problems related to irreproducibility. Once the RPᵢ are defined for all SKUs, one value of IR can only produce one single combination of prices pᵢ.
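A minimal sketch of that mapping (illustrative SKU names and reference prices):

```python
# Fixed reference prices + one IR per group => exactly one price per SKU.
reference_prices = {"SKU-001": 19.90, "SKU-002": 34.50, "SKU-003": 7.25}

def group_prices(reference_prices, ir):
    """Apply the group's incremental rate (IR) to every SKU, as in Equation 5."""
    return {sku: rp * (1 + ir) for sku, rp in reference_prices.items()}

print(group_prices(reference_prices, ir=0.05))   # a single, unambiguous price vector
```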

The effect of the IR of one group on the performance of another group (that is, the cross-elasticity across groups) is captured by the model, since no independence between groups is assumed beforehand. The effects of cross-elasticity within groups are reflected in the group performance: since SKU prices are tied within a group, there is no need to understand how the different prices of one product affect another product’s sales. The risk of overfitting with this approach is much lower, as the number of predictors is reduced significantly, and the optimisation process is much easier since, correspondingly, the number of free parameters (prices) is much lower.

The cost of this solution is that, historically, SKUs weren’t grouped and their prices didn’t follow the rule of having the same IR for all SKUs in a group. So the model can’t draw conclusions from historical data. An experimental approach, in which many combinations of IRs across groups are tested, is needed. This experimentation generates enough data for the model to learn and identify a pattern that explains the KPI as a function of prices (expressed here as IRs). On top of that, this solution brings some additional restrictions with it:

  • Price differentiation at SKU level is limited to the choices associated with the reference price, RPᵢ, in Equation 5 and to the price differentiation between groups. All products within a group move their prices up and down at the same time.
  • Groups should be stable in terms of product content. There should be no products moving between groups, otherwise the rule of a single IR for all SKUs within a group at all times will break.
  • The reference price for a product shouldn’t be changed regularly over time.

Jean has no problem accepting the cost of ditching historical data and engaging in an experimentation process. He understands that spending time continuously experimenting is essential if he wants to track the constant changes in context. In the end, Jean decides on an approach that yields:

  • A simplification of how pricing is done at granular level;
  • A simplification in the optimisation routine; and
  • A simplification of the KPI modelling.

Jean accepts that, sometimes, simple means easier to adapt and easier to control, and that this brings benefits of its own.

Conclusions:

A good way of using the pricing lever to pursue company objectives is to set those objectives at portfolio level and not at product level. However, this comes with certain barriers to overcome and some hard truths to accept.

If one adopts an SKU-level approach, one benefits from being able to use historical data. But unless the number of SKUs in the portfolio is relatively small, one probably needs to disregard cross-elasticity effects, deal with noisy models, and invest in a very proficient optimisation routine.

If one prefers to group SKUs to facilitate model learning and prediction, then one must be willing to accept the following: limited price differentiation at SKU level; linked pricing between SKUs within groups; and the need to engage in a learning period, because historical data is not suitable.

The final decision will depend on the size of the categories and on the typical time span within which the context of the portfolio changes. Portfolios of a few SKUs but with good sales levels would allow cross-elasticity effects to be introduced even in the SKU-level approach and would limit the effect of noise. If agility is a key factor, then the grouping approach is more appropriate: the signal-to-noise ratio is better, and continuous experimentation allows for rapid and accurate price changes while pursuing a KPI objective.

References:

Richard Thaler, “Toward a Positive Theory of Consumer Choice”, Journal of Economic Behavior and Organization, vol. 1, 1980, pp. 39–60.

Richard Thaler, “Mental Accounting and Consumer Choice”, Marketing Science, vol. 4, no. 3, 1985, pp. 199–214. doi:10.1287/mksc.4.3.199.

Amos Tversky and Daniel Kahneman, “The Framing of Decisions and the Psychology of Choice”, Science, New Series, vol. 211, no. 4481, 1981, pp. 453–458.

Eduardo Rodríguez Quintana, “Decision-Making: The Behavioural Economy”, 2012, https://digibuo.uniovi.es/dspace/bitstream/handle/10651/13074/Trabajo%20fin%20de%20m%E1ster%20Eduardo%20Rodr%EDguez%20Quintana.pdf;jsessionid=E96F13B812B3F3A2F3F2B7324E1621B8?sequence=1

Juan Manuel Mayén Gijón
Head of Data Science at PricingHUB, PhD
