Performance Tips

Generator vs List (using yield)

| Feature | Generator | List |
| --- | --- | --- |
| Memory usage | Low, because data is generated on demand | High, as all elements are stored in memory |
| Speed | Faster for large data, as items are produced one at a time | Slower for large data, as all items are processed and stored up front |
| Iteration | One-time iteration (cannot be reused) | Multiple iterations are possible |
| Syntax | Uses the yield keyword | Standard list comprehension or manual appending |
| Access | Sequential access only | Random access is possible |
| Use case | Processing large datasets or streaming data | When you need random access or must store the entire dataset |

# Using generator
def generate_numbers(n):
    for i in range(n):
        yield i

gen = generate_numbers(5)
for num in gen:
    print(num)

# Using list
numbers = [i for i in range(5)]
print(numbers)
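One practical consequence of the "one-time iteration" row above is that a generator is exhausted after a single pass, while a list can be traversed again and again. A small sketch:

```python
# A generator can only be consumed once.
gen = (i for i in range(5))
print(sum(gen))   # 10
print(sum(gen))   # 0 -- the generator is already exhausted

# A list supports repeated iteration.
nums = [i for i in range(5)]
print(sum(nums))  # 10
print(sum(nums))  # 10
```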

Time and Space Complexity
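As a rough illustration of the space tradeoff: a list stores all n elements up front (O(n) memory), while a generator object keeps only its current state (O(1) memory). `sys.getsizeof` makes the difference visible (the exact byte counts vary by Python version):

```python
import sys

n = 1_000_000
numbers_list = [i for i in range(n)]  # all n integers stored up front: O(n) space
numbers_gen = (i for i in range(n))   # values produced lazily: O(1) space

print(sys.getsizeof(numbers_list))  # grows with n (megabytes here)
print(sys.getsizeof(numbers_gen))   # small and constant, regardless of n
```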


Memory Profiling (Using memory_profiler Module)

Memory profiling is an important part of performance optimization. To monitor the memory usage of our code, we can use the third-party memory_profiler module (installable with pip).

By applying the @profile decorator, or by running scripts through the mprof command, we can inspect memory usage line by line.

from memory_profiler import profile

@profile
def my_func():
    a = [i * 2 for i in range(1000)]
    b = [i * 3 for i in range(1000)]
    return a, b

if __name__ == "__main__":
    my_func()  # prints a line-by-line memory report for my_func

Least Recently Used (LRU) Caching technique

Caching is a powerful technique for optimizing performance. The functools.lru_cache decorator is a simple and effective way to cache the results of expensive function calls. It is particularly useful for functions that are repeatedly called with the same arguments.

The Least Recently Used (LRU) caching mechanism ensures that only a limited number of results are stored, so the cache does not consume excessive memory. When a function is called with arguments that are already cached, it returns the cached result instead of recalculating it.

from functools import lru_cache

@lru_cache(maxsize=128)
def expensive_function(n):
    print(f"Calculating {n}")
    return n * 2

# First call, will calculate and cache
print(expensive_function(5))  # Output: Calculating 5, 10

# Second call with the same argument, will use the cached result
print(expensive_function(5))  # Output: 10 (no "Calculating" print)
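The payoff is most dramatic for recursive functions with overlapping subproblems, which are otherwise recomputed exponentially many times. A classic sketch (fib here is an illustrative function, not part of the example above):

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # unbounded cache: keep every result
def fib(n):
    # Naive recursion is exponential; with caching each fib(k)
    # is computed only once, so this runs in linear time.
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(30))           # 832040
print(fib.cache_info())  # hit/miss statistics exposed by lru_cache
```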

Performance optimization is a broad area, and this only scratches the surface. The right optimization always depends on the problem being solved, and choosing the right tools and techniques goes hand in hand with building a performance-critical application. Please refer to the Python documentation for more detailed tips.
