Looping Over a Small List Is Fast Enough -> You Must Quit Coding Right Now
Apr 20, 2023
Benchmark over list sizes [10, 100, 1000, 10000, 100000], fetching 1000 random elements per run.
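Test targets:
- 0. Dumbest for loop (scan the whole list for every lookup)
- 1. Dumbest for loop with indexing (Python loop, one container[idx] per lookup)
- 2. Pure indexing (operator.itemgetter, all lookups in one call)
- 3. SIMD indexing (NumPy fancy indexing, container[random_index])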
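To make the four targets concrete, here is a minimal sketch of each one on a tiny container. The function names and the `wanted` index array are made up for illustration; only the lookup techniques come from the benchmark. Note that `itemgetter` returns a tuple only when it is given at least two indices.

```python
from operator import itemgetter

import numpy as np

container = np.arange(10)         # small stand-in for the benchmarked list
wanted = np.array([3, 7, 3, 0])   # hypothetical lookup indices


def linear_scan(container, wanted):
    # 0. Walk the whole container for every requested value.
    found = []
    for idx in wanted:
        for x in container:
            if x == idx:
                found.append(x)
    return found


def loop_indexing(container, wanted):
    # 1. Still a Python loop, but each lookup is a direct index.
    return [container[idx] for idx in wanted]


def itemgetter_indexing(container, wanted):
    # 2. One itemgetter call performs all lookups (needs >= 2 indices).
    return list(itemgetter(*wanted)(container))


def fancy_indexing(container, wanted):
    # 3. NumPy fancy indexing: the loop runs inside NumPy's compiled code.
    return container[wanted]


for fn in (linear_scan, loop_indexing, itemgetter_indexing, fancy_indexing):
    print(fn.__name__, [int(v) for v in fn(container, wanted)])
```

All four return the same elements; they differ only in how much work the Python interpreter does per lookup.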
Result (speedup relative to target 0):
size =       1e1      1e2       1e3        1e4         1e5
0.            1X       1X        1X         1X          1X
1.           10X      61X      520X      5253X      51794X
2.           15X      86X      732X      7287X      70721X
3.          107X    3727X    31271X    279104X    2476500X
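The post does not show how the "X" factors were derived. A plausible reading, assumed here, is baseline time divided by each target's time at the same size. The helper name and the timings below are hypothetical placeholders, not measurements.

```python
def speedup_factors(timings):
    """timings[m][s] is the time in seconds of target m at size s.

    Row 0 is treated as the baseline, so its factors are all 1.0.
    """
    baseline = timings[0]
    return [[base / t for base, t in zip(baseline, row)] for row in timings]


# Hypothetical timings (NOT measured): 2 targets x 3 sizes.
example = [
    [0.5, 4.0, 32.0],    # baseline grows with the list size
    [0.25, 0.25, 0.25],  # indexing cost stays flat
]

factors = speedup_factors(example)
for row in factors:
    print("  ".join(f"{f:g}X" for f in row))
```

With the transposed `log_np` array from the benchmark, the same table would be `log_np[0] / log_np`.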
And basically you have zero justification for this kind of slowness.
Imagine the slowest, dumbest for loop is everywhere in your server code, and every request has to run it once. The server handles 100,000 requests per day, and even 1,000,000 at peak. How much time and energy are you wasting?
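The waste is simple arithmetic: requests per day times the extra time per request. A rough sketch follows, where the per-request costs (1 ms for the scan, 1 µs for the indexed lookup) are assumed placeholder values rather than measurements from this benchmark.

```python
def wasted_seconds_per_day(requests_per_day, slow_seconds, fast_seconds):
    # Extra time spent per day by choosing the slow lookup over the fast one.
    return requests_per_day * (slow_seconds - fast_seconds)


# Assumed per-request costs, for illustration only.
SLOW = 1e-3   # hypothetical: linear scan takes 1 ms
FAST = 1e-6   # hypothetical: indexed lookup takes 1 microsecond

normal_day = wasted_seconds_per_day(100_000, SLOW, FAST)
peak_day = wasted_seconds_per_day(1_000_000, SLOW, FAST)
print(f"normal day: {normal_day:.1f} s wasted, peak day: {peak_day:.1f} s wasted")
```

Plug in your own measured timings; the gap grows with the list size, as the result table shows.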
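%%time
# Jupyter cell: time four ways of fetching 1000 random elements from a container.
from time import perf_counter as now
import numpy as np
from operator import itemgetter as itg

search_grid = [1, 2, 3, 4, 5]   # container sizes are 10 ** exponent
log = [[] for _ in search_grid]
repeat = 10                     # every measurement is repeated this many times

for i, exponential in enumerate(search_grid):
    test_size = int(10 ** exponential)
    loop = 1000                 # number of lookups per repeat
    random_index = np.random.randint(0, test_size, loop)
    container = np.arange(test_size)

    # 0. Dumbest for loop: scan the whole container for every lookup
    t0 = now()
    for _ in range(repeat):
        for idx in random_index:
            for x in container:
                if x == idx:
                    temp = x
    t1 = now()

    # 1. Dumbest for loop with indexing: one direct index per lookup
    t10 = now()
    for _ in range(repeat):
        for idx in random_index:
            temp = container[idx]
    t11 = now()

    # 2. Pure indexing: itemgetter fetches all indices in one call
    t30 = now()
    for _ in range(repeat):
        temp = itg(*random_index)(container)
    t31 = now()

    # 3. SIMD indexing: NumPy fancy indexing with the whole index array
    t20 = now()
    for _ in range(repeat):
        temp = container[random_index]
    t21 = now()

    # Total seconds per target, in the order 0, 1, 2, 3
    log[i] = [t1-t0, t11-t10, t31-t30, t21-t20]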
import matplotlib.pyplot as plt

# Average seconds per repeat; after the transpose, one row per test target
log_np = np.array(log) / repeat
log_np = log_np.T
log_np

plt.rcParams.update({'font.size': 18})
width = 0.2  # bar width; four bars per group leave a gap between groups
multiplier = 0
plt.figure(figsize=(20, 10))
x_axis = np.array(search_grid)
for i, row in enumerate(log_np):
    offset = width * multiplier
    plt.bar(x_axis + offset, row, width, label=f"code{i}", log=True)
    multiplier += 1
# Center each tick under its group and label it with the actual list size
plt.xticks(x_axis + 1.5 * width, [f"1e{e}" for e in search_grid])
plt.xlabel("len(list)")
plt.ylabel("time per repeat (s)")
plt.tight_layout()
plt.grid()
plt.legend(loc="best")
plt.show()