Back to Blog

How Distance Impacts Delivery Time and Ways to Optimize for Customer Satisfaction

Problem Statement:

Quick commerce is a rapidly evolving business in India, projected to grow to $10 billion by 2029, according to a Times of India report. In this context, businesses want to understand how distance impacts delivery time and explore ways to optimize it to improve customer satisfaction.

To address this, we’ll solve the problem using SQL. Let’s start by creating a sample data table in MySQL.

Sample code for analysing relation between distance and time


    CREATE TABLE delivery (
        order_id INT,
        delivery_start_time DATETIME,
        delivery_end_time DATETIME,
        distance_covered FLOAT
    );
    
    INSERT INTO delivery (order_id, delivery_start_time, delivery_end_time, distance_covered) VALUES
    (1, '2024-10-25 10:00:00', '2024-10-25 10:30:00', 5.0),
    (2, '2024-10-25 11:00:00', '2024-10-25 11:45:00', 10.0),
    (3, '2024-10-25 12:00:00', '2024-10-25 12:35:00', 7.0),
    (4, '2024-10-25 13:00:00', '2024-10-25 13:20:00', 4.0),
    (5, '2024-10-25 14:00:00', '2024-10-25 14:55:00', 12.0),
    (6, '2024-10-25 15:00:00', '2024-10-25 15:25:00', 6.5),
    (7, '2024-10-25 16:00:00', '2024-10-25 16:40:00', 9.0),
    (8, '2024-10-25 17:00:00', '2024-10-25 17:30:00', 8.0),
    (9, '2024-10-25 18:00:00', '2024-10-25 18:50:00', 11.0),
    (10, '2024-10-25 19:00:00', '2024-10-25 19:35:00', 10.5);
        

Above delivery details in table format as follows

Order IDStart TimeEnd TimeDistance (km)
12024-10-25 10:002024-10-25 10:305.0
22024-10-25 11:002024-10-25 11:4510.0
32024-10-25 12:002024-10-25 12:357.0
42024-10-25 13:002024-10-25 13:204.0
52024-10-25 14:002024-10-25 14:5512.0
62024-10-25 15:002024-10-25 15:256.5
72024-10-25 16:002024-10-25 16:409.0
82024-10-25 17:002024-10-25 17:308.0
92024-10-25 18:002024-10-25 18:5011.0
102024-10-25 19:002024-10-25 19:3510.5

Now we have order_id, start_time, end_time, distance.In order to analyse the relation between distance covered and delivery time we can use correlation metrics. To do that lets proceed with Pearson's correlation coefficient.

\( r = \frac{\sum (X - \overline{X})(Y - \overline{Y})}{\sqrt{\sum (X - \overline{X})^2 \cdot \sum (Y - \overline{Y})^2}} \)

Where:
\( \overline{X} \): Means Average Time
\( \overline{Y} \): Means Average Distance

  • Pearson's correlations(PC) tell us about the linear relationship between 2 variables. In this case 2 variables are distance and delivery time. PC gives values between +1 and -1, if value gets near +1 then there is positive relation meaning if one variable increases then other also follows the same. 0 means no relation and approach to -1 means they are divergent to each other(means one increases & other decreases)
  •             
    WITH cte AS (
        SELECT order_id, delivery_start_time, delivery_end_time,
        distance_covered, 
        TIMESTAMPDIFF(MINUTE, delivery_start_time, delivery_end_time) AS delivery_time
        FROM delivery
    )
    ,cte2 AS (
        SELECT ROUND(AVG(distance_covered),2) AS avg_distance, 
        ROUND(AVG(TIMESTAMPDIFF(MINUTE, delivery_start_time,delivery_end_time) ),2) AS avg_delivery_time
        FROM delivery
    )
    , cte3 AS(
        SELECT c1.distance_covered,
        c1.delivery_time,
        c2.avg_distance,c2.avg_delivery_time,
        (c1.distance_covered - c2.avg_distance) * (c1.delivery_time - c2.avg_delivery_time) AS covariance,
        POWER((c1.distance_covered - c2.avg_distance),2) AS covariance_distance,
        POWER((c1.delivery_time - c2.avg_delivery_time),2) AS covariance_time
        FROM cte c1, cte2 c2 
    )
    SELECT  
    ROUND(sum(covariance) *1.0 / 
                            SQRT(sum(covariance_distance) * sum(covariance_time) ),3) AS pearson_correlation
    FROM cte3 
                
            

    Back to Blog