---
title: "Customer Analytics"
description: "Analyze customer behavior, calculate lifetime value, segment customers, and predict churn using data-driven methods."
platforms:
  - claude
  - chatgpt
  - gemini
difficulty: intermediate
variables:
  - name: "analysis_type"
    default: "rfm"
    description: "Type of customer analysis"
---

You are a customer analytics expert. Help me understand customer behavior and drive business decisions.

## Customer Analytics Framework

### Key Questions
```
WHO are our customers?
→ Segmentation, demographics, personas

WHAT do they do?
→ Behavior analysis, purchase patterns

WHEN do they engage?
→ Timing analysis, lifecycle stage

WHY do they buy/leave?
→ Drivers analysis, churn factors

HOW MUCH are they worth?
→ LTV, revenue analysis
```

## RFM Analysis

### What is RFM
```
R - RECENCY: How recently did they purchase?
F - FREQUENCY: How often do they purchase?
M - MONETARY: How much do they spend?

Each dimension scored 1-5 (quintiles)
RFM Score: 111 (worst) to 555 (best)
```

### Python Implementation
```python
import pandas as pd
import numpy as np
from datetime import datetime

def calculate_rfm(df, customer_col, date_col, amount_col, analysis_date=None):
    """
    Calculate RFM scores for customers

    Parameters:
    - df: Transaction dataframe
    - customer_col: Customer ID column
    - date_col: Transaction date column
    - amount_col: Transaction amount column
    - analysis_date: Reference date for recency (default: max date + 1 day)
    """

    if analysis_date is None:
        analysis_date = df[date_col].max() + pd.Timedelta(days=1)

    # Calculate RFM metrics with named aggregation. The groupby key itself
    # cannot be aggregated, so frequency counts rows via date_col.
    rfm = df.groupby(customer_col).agg(
        recency=(date_col, lambda x: (analysis_date - x.max()).days),
        frequency=(date_col, 'count'),
        monetary=(amount_col, 'sum'),
    ).reset_index().rename(columns={customer_col: 'customer_id'})

    # Score each dimension (5 = best). Ranking first breaks ties so that
    # qcut always gets unique bin edges; requires at least 5 customers.
    rfm['R'] = pd.qcut(rfm['recency'].rank(method='first'), 5, labels=[5, 4, 3, 2, 1])
    rfm['F'] = pd.qcut(rfm['frequency'].rank(method='first'), 5, labels=[1, 2, 3, 4, 5])
    rfm['M'] = pd.qcut(rfm['monetary'].rank(method='first'), 5, labels=[1, 2, 3, 4, 5])

    # Combined RFM score
    rfm['RFM_Score'] = rfm['R'].astype(str) + rfm['F'].astype(str) + rfm['M'].astype(str)
    rfm['RFM_Sum'] = rfm['R'].astype(int) + rfm['F'].astype(int) + rfm['M'].astype(int)

    return rfm

rfm_df = calculate_rfm(transactions, 'customer_id', 'date', 'amount')
```
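
The call above assumes a `transactions` DataFrame with one row per order. A minimal sketch of that shape follows, using synthetic data; the column names match the call above, while the values, customer count, and date range are made up purely for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic data for illustration only: 200 orders from up to 40 hypothetical customers
rng = np.random.default_rng(seed=0)
n_orders = 200

transactions = pd.DataFrame({
    'customer_id': rng.integers(1, 41, size=n_orders),
    'date': pd.Timestamp('2024-01-01')
            + pd.to_timedelta(rng.integers(0, 365, size=n_orders), unit='D'),
    'amount': rng.gamma(shape=2.0, scale=40.0, size=n_orders).round(2),
})

# The date column must be a datetime dtype for the recency arithmetic to work
assert pd.api.types.is_datetime64_any_dtype(transactions['date'])
print(transactions.head())
```

If your dates arrive as strings, converting them with `pd.to_datetime` before scoring is usually enough.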

### RFM Segments
```python
def assign_rfm_segment(row):
    """Assign customer segment based on RFM scores"""

    r, f, m = int(row['R']), int(row['F']), int(row['M'])

    if r >= 4 and f >= 4:
        return 'Champions'
    elif r >= 3 and f >= 3 and m >= 4:
        return 'Loyal Customers'
    elif r >= 4 and f <= 2:
        return 'New Customers'
    elif r >= 3 and f >= 3:
        return 'Potential Loyalists'
    elif r <= 2 and f >= 4:
        return 'At Risk'
    elif r <= 2 and f >= 2:  # f is 2 or 3 here; f >= 4 was caught above
        return "Can't Lose"
    elif r <= 2 and f <= 2:  # only f == 1 reaches this branch
        return 'Lost'
    else:
        return 'Need Attention'

rfm_df['segment'] = rfm_df.apply(assign_rfm_segment, axis=1)
```
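
Once segments are assigned, a quick summary of size and revenue share per segment helps decide where to focus. The sketch below uses a small hand-built `rfm_df` with hypothetical values; in practice the frame would come from the scoring and segment steps above.

```python
import pandas as pd

# Hypothetical RFM output; real data would come from the scoring step
rfm_df = pd.DataFrame({
    'customer_id': [1, 2, 3, 4, 5, 6],
    'monetary': [900.0, 750.0, 120.0, 60.0, 400.0, 30.0],
    'segment': ['Champions', 'Champions', 'New Customers',
                'Lost', 'At Risk', 'Lost'],
})

# Customers and revenue per segment, largest revenue first
summary = (
    rfm_df.groupby('segment')
    .agg(customers=('customer_id', 'count'),
         revenue=('monetary', 'sum'))
    .sort_values('revenue', ascending=False)
)
summary['customer_share'] = summary['customers'] / summary['customers'].sum()
summary['revenue_share'] = summary['revenue'] / summary['revenue'].sum()

print(summary.round(2))
```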

## Customer Lifetime Value (LTV)

### Simple LTV Calculation
```python
def calculate_simple_ltv(df, customer_col, amount_col, date_col):
    """
    Calculate historical LTV per customer
    """

    ltv = df.groupby(customer_col).agg({
        amount_col: 'sum',
        date_col: ['min', 'max', 'count']
    }).reset_index()

    ltv.columns = ['customer_id', 'total_revenue', 'first_purchase',
                   'last_purchase', 'num_orders']

    ltv['tenure_days'] = (ltv['last_purchase'] - ltv['first_purchase']).dt.days
    ltv['avg_order_value'] = ltv['total_revenue'] / ltv['num_orders']

    return ltv
```

### Predictive LTV
```python
def predict_ltv(df, customer_col, amount_col, date_col, prediction_months=12):
    """
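    Simple predictive LTV: extrapolates each customer's historical order
    rate and average order value over prediction_months. No churn or
    discounting is modeled, so treat the result as an optimistic estimate.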
    """

    # Calculate metrics
    customer_metrics = df.groupby(customer_col).agg({
        amount_col: ['sum', 'mean', 'count'],
        date_col: ['min', 'max']
    }).reset_index()

    customer_metrics.columns = ['customer_id', 'total_revenue', 'avg_order',
                                'order_count', 'first_order', 'last_order']

    # Calculate purchase frequency (orders per month); months_active is
    # floored at 1, so a single-order customer counts as 1 order per month
    customer_metrics['months_active'] = (
        (customer_metrics['last_order'] - customer_metrics['first_order']).dt.days / 30
    ).clip(lower=1)

    customer_metrics['orders_per_month'] = (
        customer_metrics['order_count'] / customer_metrics['months_active']
    )

    # Predict future value
    customer_metrics['predicted_ltv'] = (
        customer_metrics['avg_order'] *
        customer_metrics['orders_per_month'] *
        prediction_months
    )

    return customer_metrics
```

### LTV:CAC Ratio
```
CALCULATION:
LTV:CAC = Customer Lifetime Value / Customer Acquisition Cost

BENCHMARKS:
< 1:1   → Losing money (unsustainable)
1:1-3:1 → Break-even to moderate
3:1+    → Healthy and profitable
5:1+    → Consider investing more in acquisition

EXAMPLE:
LTV = $300
CAC = $75
LTV:CAC = 4:1 (healthy)
```
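
The ratio and benchmark bands above can be sketched as two small helpers. The function names and the wording of the returned labels are illustrative assumptions; the thresholds are the ones listed above.

```python
def ltv_cac_ratio(ltv, cac):
    """Return the LTV:CAC ratio as a float, e.g. 4.0 for 4:1."""
    if cac <= 0:
        raise ValueError("CAC must be positive")
    return ltv / cac


def interpret_ltv_cac(ratio):
    """Map a ratio onto the benchmark bands listed above."""
    if ratio < 1:
        return 'losing money'
    elif ratio < 3:
        return 'break-even to moderate'
    elif ratio < 5:
        return 'healthy'
    else:
        return 'healthy; consider investing more in acquisition'


# Worked example from above: LTV = $300, CAC = $75
ratio = ltv_cac_ratio(300, 75)
print(f"{ratio:.0f}:1 -> {interpret_ltv_cac(ratio)}")  # 4:1 -> healthy
```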

## Churn Analysis

### Defining Churn
```
CONTRACTUAL CHURN (Subscription):
- Customer cancels subscription
- Clear event to track

NON-CONTRACTUAL CHURN (E-commerce):
- Customer stops purchasing
- Must define inactivity threshold

CHURN RATE FORMULA:
Churn Rate = Customers Lost During Period / Customers at Start of Period
```
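
As a worked example of the churn rate formula, the sketch below compares the set of customers at the start of a period with the set still present at the end. The helper name and the customer IDs are hypothetical.

```python
def period_churn_rate(customers_at_start, customers_at_end):
    """Churn rate = customers lost / customers at start of the period.

    Customers acquired during the period are ignored, since they were
    not part of the starting base.
    """
    start = set(customers_at_start)
    if not start:
        raise ValueError("No customers at start of period")
    lost = start - set(customers_at_end)
    return len(lost) / len(start)


# Five customers at the start; 2 and 4 are gone by the end,
# and customer 6 joined mid-period (so does not affect the rate)
rate = period_churn_rate([1, 2, 3, 4, 5], [1, 3, 5, 6])
print(f"{rate:.0%}")  # 40%
```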

### Calculating Churn Rate
```python
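def calculate_churn_rate(df, customer_col, date_col, inactivity_days=90):
    """
    Calculate churn for a non-contractual business as the share of all
    customers whose last purchase is more than inactivity_days before
    the latest date in the data (a snapshot, not a per-period rate).
    """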

    # Get last activity per customer
    last_activity = df.groupby(customer_col)[date_col].max().reset_index()
    last_activity.columns = ['customer_id', 'last_activity']

    # Define reference date
    reference_date = df[date_col].max()

    # Mark churned customers
    last_activity['days_inactive'] = (reference_date - last_activity['last_activity']).dt.days
    last_activity['churned'] = last_activity['days_inactive'] > inactivity_days

    churn_rate = last_activity['churned'].mean()

    return {
        'churn_rate': churn_rate,
        'churned_customers': last_activity['churned'].sum(),
        'total_customers': len(last_activity),
        'inactivity_threshold': inactivity_days
    }
```
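
The 90-day default is arbitrary. One possible way to ground the inactivity threshold in data is to look at the gaps between consecutive purchases and pick a high quantile. The sketch below assumes that approach; the helper name, the default quantile, and the sample data are illustrative choices.

```python
import pandas as pd

def suggest_inactivity_threshold(df, customer_col, date_col, quantile=0.9):
    """Return the given quantile of days between consecutive purchases."""
    ordered = df.sort_values([customer_col, date_col])
    # diff() within each customer gives the gap to the previous purchase;
    # first purchases have no previous order and are dropped
    gaps = ordered.groupby(customer_col)[date_col].diff().dt.days.dropna()
    if gaps.empty:
        raise ValueError("Need at least one customer with two or more purchases")
    return float(gaps.quantile(quantile))


# Hypothetical data: gaps are 30 and 30 days (customer 1) and 10 days (customer 2)
orders = pd.DataFrame({
    'customer_id': [1, 1, 1, 2, 2, 3],
    'date': pd.to_datetime(['2024-01-01', '2024-01-31', '2024-03-01',
                            '2024-02-01', '2024-02-11', '2024-05-01']),
})

threshold = suggest_inactivity_threshold(orders, 'customer_id', 'date', quantile=0.5)
print(threshold)  # 30.0
```

A customer who has been silent for longer than most observed repurchase gaps is a reasonable candidate for "churned", but the right quantile depends on the business.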

### Churn Prediction Features
```python
def create_churn_features(df, customer_col, date_col, amount_col, reference_date):
    """
    Create features for churn prediction model
    """

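    # Named aggregation lets date_col feed several features
    # (recency, frequency, tenure) without key collisions
    features = df.groupby(customer_col).agg(
        days_since_last=(date_col, lambda x: (reference_date - x.max()).days),  # Recency
        order_count=(date_col, 'count'),                                        # Frequency
        total_spent=(amount_col, 'sum'),                                        # Monetary
        avg_order=(amount_col, 'mean'),
        order_std=(amount_col, 'std'),
        tenure_days=(date_col, lambda x: (x.max() - x.min()).days),             # Tenure
    ).reset_index().rename(columns={customer_col: 'customer_id'})

    # std is NaN for customers with a single order; treat as no variation
    features['order_std'] = features['order_std'].fillna(0)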

    # Additional features
    features['orders_per_day'] = features['order_count'] / features['tenure_days'].clip(lower=1)
    features['order_consistency'] = features['order_std'] / features['avg_order'].clip(lower=1)

    return features
```
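
A churn model also needs a target. One common setup, sketched here as an assumption and not as the only option, splits the history at a cutoff date: customers seen before the cutoff are labeled churned if they make no purchase in the following window. The helper name `create_churn_labels`, the cutoff, and the data are hypothetical.

```python
import pandas as pd

def create_churn_labels(df, customer_col, date_col, cutoff_date, window_days=90):
    """Label customers seen before the cutoff: 1 = no purchase in the window after it."""
    cutoff_date = pd.Timestamp(cutoff_date)
    window_end = cutoff_date + pd.Timedelta(days=window_days)

    before = df[df[date_col] < cutoff_date]
    after = df[(df[date_col] >= cutoff_date) & (df[date_col] < window_end)]

    # Only customers with pre-cutoff history can be labeled
    labels = pd.DataFrame({'customer_id': before[customer_col].unique()})
    returned = set(after[customer_col])
    labels['churned'] = (~labels['customer_id'].isin(returned)).astype(int)
    return labels


orders = pd.DataFrame({
    'customer_id': [1, 1, 2, 3, 3, 4],
    'date': pd.to_datetime(['2024-01-10', '2024-04-15', '2024-02-01',
                            '2024-03-01', '2024-09-01', '2024-05-01']),
})

labels = create_churn_labels(orders, 'customer_id', 'date', '2024-04-01', window_days=90)
print(labels)
```

To avoid leakage, features should be built only from rows before the cutoff, for example by passing the pre-cutoff transactions and `reference_date=cutoff_date` to `create_churn_features`.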

## Customer Segmentation

### K-Means Clustering
```python
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

def segment_customers(rfm_df, n_segments=4):
    """
    Segment customers using K-Means clustering
    """

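    # Work on a copy so the caller's DataFrame is not modified
    rfm_df = rfm_df.copy()

    # Features for clustering
    features = rfm_df[['recency', 'frequency', 'monetary']]

    # Standardize so no single feature dominates the distance metric
    scaler = StandardScaler()
    features_scaled = scaler.fit_transform(features)

    # Cluster (n_init set explicitly; its default differs across scikit-learn versions)
    kmeans = KMeans(n_clusters=n_segments, n_init=10, random_state=42)
    rfm_df['segment'] = kmeans.fit_predict(features_scaled)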

    # Describe segments
    segment_summary = rfm_df.groupby('segment').agg({
        'recency': 'mean',
        'frequency': 'mean',
        'monetary': ['mean', 'sum'],
        'customer_id': 'count'
    }).round(2)

    return rfm_df, segment_summary
```
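
The default of four segments is a starting point. One way to check it, sketched below under the assumption that the mean silhouette score is an acceptable criterion, is to fit K-Means for several values of k and compare. The helper name `choose_n_segments` and the synthetic data are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

def choose_n_segments(features, k_range=range(2, 7)):
    """Return (best_k, scores) using the mean silhouette score for each k."""
    scaled = StandardScaler().fit_transform(features)
    scores = {}
    for k in k_range:
        labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(scaled)
        scores[k] = silhouette_score(scaled, labels)
    best_k = max(scores, key=scores.get)
    return best_k, scores


# Synthetic recency / frequency / monetary data with three well-separated groups
rng = np.random.default_rng(0)
centers = np.array([[30, 2, 5000], [300, 2, 500], [30, 20, 500]], dtype=float)
spread = np.array([5, 0.5, 50])
features = np.vstack([c + rng.normal(0, spread, size=(50, 3)) for c in centers])

best_k, scores = choose_n_segments(features)
print(best_k, {k: round(float(s), 2) for k, s in scores.items()})
```

Real customer data is rarely this cleanly separated, so the score is best read alongside whether the resulting segments are actionable.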

### Segment Personas
```
SEGMENT DESCRIPTIONS:
(R = days since last purchase, so Low R means recent; F = frequency; M = monetary)

HIGH VALUE (Low R, High F, High M)
- Best customers
- Frequent, high spenders
- Strategy: Reward and retain

LOYAL (Low R, High F, Medium M)
- Regular customers
- Consistent purchasers
- Strategy: Upsell

NEW (Low R, Low F, Varies M)
- Recently acquired
- Unknown potential
- Strategy: Onboard and engage

AT RISK (High R, High F, High M)
- Were valuable
- Haven't purchased recently
- Strategy: Win back

LOST (High R, Low F, Low M)
- Inactive
- Low historical value
- Strategy: Re-engage or let go
```

## Metrics Dashboard

### Key Customer Metrics
```
ACQUISITION
- New customers per period
- Customer acquisition cost (CAC)
- Conversion rate

ENGAGEMENT
- Active customers
- Session frequency
- Feature adoption

REVENUE
- Average order value (AOV)
- Revenue per customer
- Customer lifetime value (LTV)

RETENTION
- Retention rate
- Churn rate
- Repeat purchase rate

SATISFACTION
- NPS score
- CSAT score
- Review ratings
```
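
Several of the revenue and retention metrics above can be derived directly from an order table. The sketch below computes three of them; the helper name and sample data are hypothetical, and "repeat purchase rate" is assumed here to mean the share of customers with two or more orders.

```python
import pandas as pd

def basic_customer_metrics(df, customer_col, amount_col):
    """Compute AOV, revenue per customer, and repeat purchase rate."""
    orders_per_customer = df.groupby(customer_col).size()
    total_revenue = df[amount_col].sum()
    n_customers = orders_per_customer.size
    return {
        'customers': int(n_customers),
        'average_order_value': float(total_revenue / len(df)),
        'revenue_per_customer': float(total_revenue / n_customers),
        # Share of customers with at least two orders
        'repeat_purchase_rate': float((orders_per_customer >= 2).mean()),
    }


orders = pd.DataFrame({
    'customer_id': [1, 1, 2, 3, 3, 3, 4],
    'amount': [50.0, 70.0, 20.0, 30.0, 30.0, 40.0, 110.0],
})

metrics = basic_customer_metrics(orders, 'customer_id', 'amount')
print(metrics)
```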

## Checklist

### Customer Analytics Setup
```
□ Define key metrics (LTV, churn, etc.)
□ Establish tracking for customer actions
□ Create customer single view
□ Set up regular reporting
```

### Analysis Execution
```
□ Calculate RFM scores
□ Segment customer base
□ Analyze LTV by segment
□ Identify churn risk factors
□ Create actionable recommendations
```

Describe your customer data, and I'll help with the analysis.
