
Machine Learning applied to Store and Account Clustering

The Problem Statement

Store clustering has been treated for far too long as a necessary evil and a nuisance in the fashion retail industry. The classical approaches of using store grading and store size to cluster stores are outdated and do not reflect the retail complexity of today. Because store clustering is mostly performed manually, analysis of rich retail data is impossible. Even Excel cannot perform this task.

Before we go any further, let us first review the objective of store clustering, i.e. what represents a store cluster:

A cluster is a set of stores that are similar in terms of category and product distribution, meaning that the same selling behavior exists throughout the cluster. Hence, the stores in a cluster should have the same assortment composition and width, varied only in depth depending on sales volume and/or store size.

ebp Global, in conjunction with its technology partners Datacrag and Retailisation, has pioneered the application of machine learning to store clustering as an integral part of the merchandizing process, both as a historical evaluation of assortment performance in the store cluster and as a determination of the best assortment going forward.

As such, it supports vital functions in the merchandize planning process, namely sales planning, range planning and buy planning. The buy plan in particular requires an assortment at store level to determine the correct buying budget per store, the initial allocation and open-to-buy management in-season.

What is Machine Learning?

Machine learning has two main domains: supervised learning and unsupervised learning. Clustering methods belong to the unsupervised learning domain, which means that they are not trained on labeled data sets; instead, they analyze each data set as given, without relying on past experience or previous data sets. A clustering method uses features to analyze a data set and, as a result, determines the main drivers of a cluster. Features can be numeric values or classifications such as store types or consumer types.

The application integrates both the k-means and hierarchical clustering methods, allowing users to pick the best-performing clustering method for a given data set. One of the outcomes of the clustering analysis is a ranking of features, which lets users see how each feature affects the final clustering result.
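To make the k-means idea concrete, here is a minimal sketch in plain Python. It is not the application's implementation: the store IDs, the three category-share features and the deterministic farthest-point initialisation are all assumptions chosen for illustration. In practice a library such as scikit-learn (KMeans, AgglomerativeClustering) would normally be used, and a hierarchical method could be run on the same feature vectors for comparison.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iterations=100):
    """Lloyd's k-means on a list of equal-length feature vectors."""
    # Assumption: deterministic farthest-point initialisation so the sketch is
    # reproducible; libraries usually use a randomised scheme instead.
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(
            points, key=lambda p: min(squared_distance(p, c) for c in centroids)
        )
        centroids.append(list(farthest))

    labels = None
    for _ in range(max_iterations):
        # Assignment step: each store goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its member stores.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical stores described by their share of sales in three categories.
stores = {
    "S01": [0.70, 0.20, 0.10],
    "S02": [0.65, 0.25, 0.10],
    "S03": [0.72, 0.18, 0.10],
    "S04": [0.20, 0.30, 0.50],
    "S05": [0.15, 0.35, 0.50],
    "S06": [0.22, 0.28, 0.50],
}

labels, centroids = kmeans(list(stores.values()), k=2)
cluster_of = dict(zip(stores, labels))
print(cluster_of)
```

Stores with a similar category distribution end up with the same cluster label, which is the behavior the definition of a store cluster above calls for.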

Application of Machine Learning to Store Clustering

The clustering application generates 3 main results:

  • Store Cluster Assignment for each store
  • Category or Assortment Mix to determine sales channels and market opportunities to be addressed
  • Ideal Assortment per cluster at SKU level

One of the key metrics used to assess cluster assignment performance, and hence the effectiveness of the clustering method, is called the Silhouette. With this metric we are able to determine how “clean” the cluster assignment of each store has been. Below is an example of such a clustering result:

Experience shows that any fashion retail organization should limit the number of clusters it uses; store populations of 300 to 500 stores should not have more than 5 or 6 clusters. Even in store populations of 3,000 to 4,000 stores, 8 to 12 store clusters are usually sufficient. Too many store clusters can lead to performance degradation of the related assortment from both selling and buying perspectives.
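The Silhouette metric mentioned above can be sketched as follows. For each store, a is the mean distance to the other stores in its own cluster and b is the smallest mean distance to the stores of any other cluster; the silhouette value is (b - a) / max(a, b), ranging from -1 (poor fit) to +1 (clean fit). The store vectors and labels below are hypothetical, Euclidean distance is an assumption, and at least two clusters are required. The average silhouette across all stores is also commonly used to compare candidate numbers of clusters.

```python
import math


def silhouette_values(points, labels):
    """Per-point silhouette values; needs at least two distinct clusters."""
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)  # convention: a single-store cluster scores 0
            continue
        # a: mean distance to the other stores in the same cluster
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: mean distance to the nearest other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denominator = max(a, b)
        scores.append((b - a) / denominator if denominator else 0.0)
    return scores


# Hypothetical category-share vectors for six stores.
points = [
    [0.70, 0.20, 0.10],
    [0.65, 0.25, 0.10],
    [0.72, 0.18, 0.10],
    [0.20, 0.30, 0.50],
    [0.15, 0.35, 0.50],
    [0.22, 0.28, 0.50],
]

clean = silhouette_values(points, [0, 0, 0, 1, 1, 1])
# Same stores, but the third store is deliberately placed in the wrong cluster.
misassigned = silhouette_values(points, [0, 0, 1, 1, 1, 1])

print([round(s, 2) for s in clean])
print([round(s, 2) for s in misassigned])
```

In the clean assignment every store scores close to 1, while the deliberately misplaced store gets a negative value, which flags it as a candidate for reassignment.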

The category distribution or assortment mix provides key insights into the market and sales channels the cluster addresses. As shown below, such insight will indicate if this cluster addresses a general sales channel or a specialty sales channel, includes a certain marketing theme, or covers a broad range of categories and products:
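As a rough illustration of how such a category mix could be derived, the sketch below sums sales per category across the stores of each cluster and converts the totals into shares. The sales rows, store IDs, category names and cluster labels are all hypothetical, and the application may well compute the mix differently.

```python
from collections import defaultdict

# Hypothetical sales rows: (store, category, sales value).
sales = [
    ("S01", "Footwear", 700),
    ("S01", "Apparel", 300),
    ("S02", "Footwear", 500),
    ("S02", "Apparel", 500),
    ("S04", "Accessories", 400),
    ("S04", "Apparel", 600),
]

# Hypothetical cluster assignment per store.
cluster_of = {"S01": "A", "S02": "A", "S04": "B"}


def category_mix(sales_rows, assignment):
    """Return {cluster: {category: share of the cluster's total sales}}."""
    totals = defaultdict(lambda: defaultdict(float))
    for store, category, value in sales_rows:
        totals[assignment[store]][category] += value

    mix = {}
    for cluster, by_category in totals.items():
        cluster_total = sum(by_category.values())
        mix[cluster] = {
            category: value / cluster_total
            for category, value in by_category.items()
        }
    return mix


mix = category_mix(sales, cluster_of)
for cluster, shares in sorted(mix.items()):
    print(cluster, {category: round(share, 2) for category, share in shares.items()})
```

A cluster whose sales concentrate in one or two categories points to a specialty channel, while a flatter distribution suggests a general one.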

Improving Sales in Stores through Assortment Optimization

The master assortment represents the “ideal assortment” used in the cluster and will be used to phase out products which have no meaningful contribution to the cluster performance and therefore sales.

The master assortment is coupled with the quantity sold in a season or selling period. The clustering method integrated in the application generates the master assortment from the top % tier of the stores within the cluster and then uses two approaches to determine the range of the assortment composition:

  • Top x % of products
  • Pareto analysis of the sales within that season or selling period, meaning the products driving 80% of the sales value will be included

Both views provide the range of products to be included in the master assortment, allowing the merchandize planner to determine which products to push and which to drop.
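The two selection approaches can be sketched as below. The product IDs and sales values are hypothetical, and the input is assumed to be already aggregated over the top-tier stores of one cluster; the 40% cut-off is an arbitrary example value, while the 80% threshold comes from the Pareto rule described above.

```python
import math

# Hypothetical sales value per product for one cluster and one season.
sales_by_product = {"P1": 500, "P2": 250, "P3": 150, "P4": 60, "P5": 40}


def ranked_products(sales):
    """Products sorted from highest to lowest sales value."""
    return sorted(sales, key=sales.get, reverse=True)


def top_percent(sales, percent):
    """Keep the best-selling `percent` of products (at least one)."""
    ranked = ranked_products(sales)
    keep = max(1, math.ceil(len(ranked) * percent / 100))
    return ranked[:keep]


def pareto(sales, threshold=0.80):
    """Keep the best sellers until they cover `threshold` of total sales value."""
    total = sum(sales.values())
    selected, running = [], 0.0
    for product in ranked_products(sales):
        if running >= threshold * total:
            break
        selected.append(product)
        running += sales[product]
    return selected


top_view = top_percent(sales_by_product, 40)
pareto_view = pareto(sales_by_product)
# Products in neither view are candidates to be dropped.
drop_candidates = [
    p for p in sales_by_product if p not in top_view and p not in pareto_view
]

print(top_view, pareto_view, drop_candidates)
```

Comparing the two lists gives the planner a simple starting point: products in both views are clear candidates to push, and products in neither are candidates to phase out.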

Using Store Clusters for the Wholesale Channel at Account Level

Recently, ebp Global has applied the store clustering approach to the wholesale channel, which includes distribution centers, distributors and different store types owned by accounts. Replacing stores with ship-to addresses yielded very good results. The methods described above were able to determine account clusters of ship-to addresses with a clean category distribution and master assortment.