How Distance Impacts Delivery Time and Ways to Optimize for Customer Satisfaction
Problem Statement:
Quick commerce is a rapidly evolving business in India, projected to grow to $10 billion by 2029, according to a Times of India report. In this context, businesses want to understand how distance impacts delivery time and explore ways to optimize it to improve customer satisfaction.
To address this, we’ll solve the problem using SQL. Let’s start by creating a sample data table in MySQL.
Sample code for analysing relation between distance and time
CREATE TABLE delivery (
order_id INT,
delivery_start_time DATETIME,
delivery_end_time DATETIME,
distance_covered FLOAT
);
INSERT INTO delivery (order_id, delivery_start_time, delivery_end_time, distance_covered) VALUES
(1, '2024-10-25 10:00:00', '2024-10-25 10:30:00', 5.0),
(2, '2024-10-25 11:00:00', '2024-10-25 11:45:00', 10.0),
(3, '2024-10-25 12:00:00', '2024-10-25 12:35:00', 7.0),
(4, '2024-10-25 13:00:00', '2024-10-25 13:20:00', 4.0),
(5, '2024-10-25 14:00:00', '2024-10-25 14:55:00', 12.0),
(6, '2024-10-25 15:00:00', '2024-10-25 15:25:00', 6.5),
(7, '2024-10-25 16:00:00', '2024-10-25 16:40:00', 9.0),
(8, '2024-10-25 17:00:00', '2024-10-25 17:30:00', 8.0),
(9, '2024-10-25 18:00:00', '2024-10-25 18:50:00', 11.0),
(10, '2024-10-25 19:00:00', '2024-10-25 19:35:00', 10.5);
Above delivery details in table format as follows
Order ID | Start Time | End Time | Distance (km) |
---|---|---|---|
1 | 2024-10-25 10:00 | 2024-10-25 10:30 | 5.0 |
2 | 2024-10-25 11:00 | 2024-10-25 11:45 | 10.0 |
3 | 2024-10-25 12:00 | 2024-10-25 12:35 | 7.0 |
4 | 2024-10-25 13:00 | 2024-10-25 13:20 | 4.0 |
5 | 2024-10-25 14:00 | 2024-10-25 14:55 | 12.0 |
6 | 2024-10-25 15:00 | 2024-10-25 15:25 | 6.5 |
7 | 2024-10-25 16:00 | 2024-10-25 16:40 | 9.0 |
8 | 2024-10-25 17:00 | 2024-10-25 17:30 | 8.0 |
9 | 2024-10-25 18:00 | 2024-10-25 18:50 | 11.0 |
10 | 2024-10-25 19:00 | 2024-10-25 19:35 | 10.5 |
Now we have order_id, start_time, end_time, distance.In order to analyse the relation between distance covered and delivery time we can use correlation metrics. To do that lets proceed with Pearson's correlation coefficient.
\( r = \frac{\sum (X - \overline{X})(Y - \overline{Y})}{\sqrt{\sum (X - \overline{X})^2 \cdot \sum (Y - \overline{Y})^2}} \)
Where:
\( \overline{X} \): Means Average Time
\( \overline{Y} \): Means Average Distance
- Here +1 means as distance increases so does the time. 0 means there is no relation between distance and time. -1 means there is negative relation between distance and time, that means, if distance increases then time decreases.
- Now lets find out how Pearson correlation Coefficient can be implemented here.
WITH cte AS (
SELECT order_id, delivery_start_time, delivery_end_time,
distance_covered,
TIMESTAMPDIFF(MINUTE, delivery_start_time, delivery_end_time) AS delivery_time
FROM delivery
)
,cte2 AS (
SELECT ROUND(AVG(distance_covered),2) AS avg_distance,
ROUND(AVG(TIMESTAMPDIFF(MINUTE, delivery_start_time,delivery_end_time) ),2) AS avg_delivery_time
FROM delivery
)
, cte3 AS(
SELECT c1.distance_covered,
c1.delivery_time,
c2.avg_distance,c2.avg_delivery_time,
(c1.distance_covered - c2.avg_distance) * (c1.delivery_time - c2.avg_delivery_time) AS covariance,
POWER((c1.distance_covered - c2.avg_distance),2) AS covariance_distance,
POWER((c1.delivery_time - c2.avg_delivery_time),2) AS covariance_time
FROM cte c1, cte2 c2
)
SELECT
ROUND(sum(covariance) *1.0 /
SQRT(sum(covariance_distance) * sum(covariance_time) ),3) AS pearson_correlation
FROM cte3
- First from the delivery data we have calculated delivery time using delivery_start_time & delivery_end_time(i.e. from restaurant to customer).
- Next, to implement pearson correlation, we need to have average time and average distance which was calculated in second cte i.e. CTE2.
- In third CTE i.e. CTE3, we have calculated covariance of distance and time to see the magnitude of relation between them.
- Finally, Pearson correlation implemented in the last CTE and result of the value is 0.889. The result indicates that there is indeed strong positive Pearon correlation between distance covered and Delivery Time. This Suggests that, as distance increases, do does the time to deliver the food to the customer.
- This metric can be useful for future orders if we know the distance then we can able to predict the time to reach the product to the customer(i.e. Estimating Delivery Time)
- From this metric, we can use our staff efficiently during peak hours, also for route optimisation, dark store location, identifying any anomalies in the data and improving customer satisfaction.
- The Pearson correlation in this case provides a quantitative measure to support decisions and improvements around delivery efficiency and cost-effectiveness.



