top of page

Sales Data using RFM Analysis | SQL & Tableau

  • Writer: Igli Ferati
    Igli Ferati
  • Jul 12, 2023
  • 5 min read

Updated: Jul 28, 2023

Introduction:

The project aims to explore a sales dataset and generate various analytics and insights from customers' past purchase behavior. By analyzing sales revenue, customer segmentation using the RFM technique, and utilizing SQL and Tableau, the project delves into understanding and deriving valuable information from the dataset.


We go from basic SQL queries to exploring complex problems using SUB QUERY, CTEs, AGGREGATE, and WINDOW functions. We work on the following skills on this project:

  • Importing a file into SQL Server Database

  • SQL Aggregate Functions

  • SQL Window Functions

  • SQL Sub Query

  • Common Table Expressions (CTEs)

  • SQL XML Path Function

  • Tableau

Inspecting Data: Initially, the dataset is inspected by retrieving all the records from the sales dataset. This step helps gain a general understanding of the data structure and contents.

/**Inspecting Data**/

select * from sales_data_sample$

/**CHecking unique values**/
select distinct status from sales_data_sample$
select distinct year_id from sales_data_sample$
select distinct PRODUCTLINE from sales_data_sample$
select distinct COUNTRY from sales_data_sample$
select distinct DEALSIZE from sales_data_sample$
select distinct TERRITORY from sales_data_sample$

select distinct MONTH_ID from sales_data_sample$
where year_id = 2003  --- Change to see the rest

Checking Unique Values: Several distinct queries are performed to examine the unique values within specific columns of interest. This includes exploring the unique values for status, year, product line, country, deal size, territory, and month. Such analysis provides insights into the diversity and distribution of these attributes in the dataset.

/*ANALYSIS
We have to group sales by productline**/

select PRODUCTLINE, sum(sales) Revenue
from sales_data_sample$
group by PRODUCTLINE
order by 2 desc

Analysis by Product Line: The sales dataset is grouped by product line, and the sum of sales revenue for each product line is calculated. The results are sorted in descending order, showcasing the product lines with the highest revenue. This analysis helps identify which product lines contribute the most to overall sales.


Analysis by Year: The dataset is grouped by year, and the sales revenue for each year is calculated. This analysis provides a comparison of sales performance across different years, highlighting any notable trends or anomalies.

/** Checking car based on year**/
select YEAR_ID, sum(sales) Revenue
from sales_data_sample$
group by YEAR_ID
order by 2 desc
/** Based on bad results on 2005, if we go up and do a distinct on 2005 we can see that they operated only 5 months during 1 year**/

Sales Dashboard 2 from Tableau:


Analysis by Deal Size: The dataset is grouped by deal size, and the sales revenue for each deal size category is determined. This analysis enables the identification of deal sizes that generate the highest profits, contributing to a better understanding of customer preferences and buying patterns.

/** Checking dealsize to see where they got more profits**/

select DEALSIZE, sum(sales) Revenue
from sales_data_sample$
group by DEALSIZE
order by 2 desc

Best Sales Month: Focusing on a specific year, the dataset is filtered to find the month with the highest sales revenue. The monthly sales revenue and the frequency of orders are calculated. This analysis helps identify the most successful sales month within a given year.

/** What was the best month for sales in a specific year? How much was earned in that month?**/

select MONTH_ID, sum(sales) Revenue, count(ORDERNUMBER) Frequency
from sales_data_sample$
where YEAR_ID = 2005  --- change to see the rest
GROUP BY MONTH_ID
order by 2 desc

Sales by Product Line in November: By selecting a particular year and month, the dataset is further filtered to determine the product line that generates the highest sales revenue in November. This analysis provides insights into the product lines that perform exceptionally well during this specific period.

/** November seems to be the month with higher Revenues, so what product do they sell more in November? **/
select MONTH_ID, PRODUCTLINE, sum(sales) Revenue,count(ORDERNUMBER) Frequency
FROM sales_data_sample$
WHERE YEAR_ID = 2004 and MONTH_ID = 11 ---chane year to see the rest
group by MONTH_ID, PRODUCTLINE
order by 3 desc

Sales Dashboard 1 from Tableau:


Customer Segmentation using RFM: A customer segmentation analysis is conducted using the RFM technique. The RFM metrics—Recency, Frequency, and Monetary Value—are calculated for each customer. Customers are then categorized into different segments based on their RFM scores. This analysis helps identify customer groups, such as lost customers, potential churners, loyal customers, and new customers.

/** Who is our best customer (this could be answered with RFM)**/

DROP TABLE IF EXISTS #rfm
;with rfm as 
(
	select 
		CUSTOMERNAME, 
		sum(sales) MonetaryValue,
		avg(sales) AvgMonetaryValue,
		count(ORDERNUMBER) Frequency,
		max(ORDERDATE) last_order_date,
		(select max(ORDERDATE) from sales_data_sample$) max_order_date,
		DATEDIFF(DD, max(ORDERDATE), (select max(ORDERDATE) from sales_data_sample$)) Recency
	from sales_data_sample$
	group by CUSTOMERNAME
),
rfm_calc as
(

	select r.*,
		NTILE(4) OVER (order by Recency desc) rfm_recency,
		NTILE(4) OVER (order by Frequency) rfm_frequency,
		NTILE(4) OVER (order by MonetaryValue) rfm_monetary
	from rfm r
)
select 
	c.*, rfm_recency+ rfm_frequency+ rfm_monetary as rfm_cell,
	cast(rfm_recency as varchar) + cast(rfm_frequency as varchar) + cast(rfm_monetary  as varchar)rfm_cell_string
into #rfm
from rfm_calc c

select CUSTOMERNAME,rfm_recency, rfm_frequency, rfm_monetary,
	case
		when rfm_cell_string in (111, 112 , 121, 122, 123, 132, 211, 212, 114, 141) then 'lost_customers'  --lost customers
		when rfm_cell_string in (133, 134, 143, 244, 334, 343, 344, 144) then 'slipping away, cannot lose' -- (Big spenders who haven’t purchased lately) slipping away
		when rfm_cell_string in (311, 411, 331) then 'new customers'
		when rfm_cell_string in (222, 223, 233, 322) then 'potential churners'
		when rfm_cell_string in (323, 333,321, 422, 332, 432) then 'active' --(Customers who buy often & recently, but at low price points)
		when rfm_cell_string in (433, 434, 443, 444) then 'loyal'
		else 'other' -- not inluded on our interest list
end rfm_segment
from #rfm

Products Frequently Sold Together: A query is executed to identify which products are frequently sold together. By examining the order numbers and corresponding product codes, patterns of product combinations are revealed. This analysis offers insights into product associations and potential cross-selling opportunities.

/** What products are more often sold together? **/
/**select * from sales_data_sample$ where ORDERNUMBER = 10411**/
select distinct OrderNumber, stuff(

	(select ','+ PRODUCTCODE
	from sales_data_sample$ p
	where ORDERNUMBER in (
		select ORDERNUMBER
			FROM (
							select ORDERNUMBER, count(*) rn
							FROM sales_data_sample$
							where STATUS = 'Shipped'
							group by ORDERNUMBER
			)m
			WHERE RN = 3
	)
	and p.ORDERNUMBER = s.ORDERNUMBER
	for xml path (''))

		,1,1,'') ProductCodes
from sales_data_sample$ s
order by 2 desc

/** On the 12 & 13 rows we can see that we have different order nuber with the same products **/

Executive Overview from Tableau:

Conclusion:

In conclusion, this project demonstrates a comprehensive analysis of a sales dataset, starting from data inspection and exploration to more complex SQL queries and customer segmentation using RFM analysis. Through the use of SQL functions and techniques such as aggregate functions, window functions, subqueries, and common table expressions, valuable insights about sales revenue, customer behavior, and product associations have been extracted. Furthermore, Tableau has been employed to visualize and present the generated analytics in an interactive and visually appealing manner. By leveraging these analytical approaches, businesses can gain a deeper understanding of their sales data, make data-driven decisions, and identify opportunities for growth and optimization.









Follow Me

vecteezy_linkedin-logo-png-linkedin-icon-transparent-png_18930587_72.png
vecteezy_github-logo-black-transparent-png_24555266_956.png
tableau.png

©2023 by Igli Ferati

bottom of page