Advanced Python Guide: Type Hints, Async, Metaclasses, Pattern Matching & Performance Optimization
A comprehensive deep-dive into Python's most powerful features, from advanced type annotations to concurrency patterns and design patterns.
- ✓ Type hints with TypeVar, ParamSpec, and Protocol enable fully typed generic APIs with zero runtime cost.
- ✓ Pydantic v2 provides 5-50x faster validation than v1 with Rust-powered core.
- ✓ Parameterized decorators and class decorators unlock metaprogramming for cross-cutting concerns.
- ✓ asyncio task groups (Python 3.11+) provide structured concurrency with automatic cleanup.
- ✓ Pattern matching with guard clauses replaces complex if/elif chains with declarative logic.
- ✓ __slots__ reduces per-instance memory by 40-60% and speeds up attribute access.
- ✓ uv is the modern Python package manager: 10-100x faster than pip with built-in venv management.
Why Advanced Python Matters
Python's simplicity makes it the most popular language for beginners, but its advanced features make it equally powerful for building production systems, data pipelines, and high-performance APIs. Modern Python (3.10+) includes a type system rivaling TypeScript, structural pattern matching like Rust, and async capabilities that handle thousands of concurrent connections.
This guide walks through 13 advanced topics with production-ready code examples. Whether you are building a FastAPI service, a data engineering pipeline, or a CLI tool, mastering these features will make your code more maintainable, performant, and correct.
1. Type Hints & Generics
Python's type system has evolved dramatically since PEP 484. Modern type hints support generics with TypeVar, callable signatures with ParamSpec, structural subtyping with Protocol, and self-referencing types. These annotations are checked by static type checkers such as mypy and pyright at development time with zero runtime overhead.
TypeVar & Generic Constraints
TypeVar creates generic type variables that preserve type relationships. ParamSpec (PEP 612) captures function parameter signatures for decorator typing. Protocol (PEP 544) enables structural subtyping where any class with matching methods satisfies the protocol without explicit inheritance.
from typing import TypeVar, Generic, Protocol, ParamSpec, Callable
from collections.abc import Sequence
# Basic TypeVar with constraints
T = TypeVar("T")
S = TypeVar("S", bound=str) # Upper bound: must be str or subclass
Num = TypeVar("Num", int, float) # Value restriction: only int or float
def first(items: Sequence[T]) -> T:
"""Return the first item, preserving the exact type."""
return items[0]
reveal_type(first([1, 2, 3])) # int
reveal_type(first(["a", "b"])) # str
reveal_type(first([(1, 2), (3,)])) # tuple[int, ...]
# Generic class with TypeVar
class Stack(Generic[T]):
def __init__(self) -> None:
self._items: list[T] = []
def push(self, item: T) -> None:
self._items.append(item)
def pop(self) -> T:
return self._items.pop()
def peek(self) -> T:
return self._items[-1]
int_stack: Stack[int] = Stack()
int_stack.push(42) # OK
# int_stack.push("oops") # Type error!
ParamSpec & Protocol
Python 3.12 introduced the new type parameter syntax (PEP 695) that simplifies generic definitions with a cleaner [T] syntax instead of TypeVar declarations.
# ParamSpec — preserve function signatures in decorators
P = ParamSpec("P")
R = TypeVar("R")
def logged(func: Callable[P, R]) -> Callable[P, R]:
def wrapper(*args: P.args, **kwargs: P.kwargs) -> R:
print(f"Calling {func.__name__}")
return func(*args, **kwargs)
return wrapper
@logged
def add(x: int, y: int) -> int:
return x + y
add(1, 2) # OK — type checker knows signature
# add("a", "b") # Type error: expected int
# Protocol — structural subtyping (duck typing with types)
class Renderable(Protocol):
def render(self) -> str: ...
class Button:
def render(self) -> str:
return "<button>Click</button>"
class Chart:
def render(self) -> str:
return "<svg>...</svg>"
def display(widget: Renderable) -> None:
print(widget.render())
display(Button()) # OK — Button has render() -> str
display(Chart()) # OK — Chart has render() -> str
# Python 3.12+ new syntax (PEP 695)
def first_new[T](items: Sequence[T]) -> T:
return items[0]
2. Dataclasses & Pydantic Models
Dataclasses (PEP 557) eliminate boilerplate for data-holding classes by auto-generating __init__, __repr__, __eq__, and more. They support default values, field factories, post-init processing, slots, and frozen (immutable) instances.
Advanced Dataclass Usage
from dataclasses import dataclass, field, asdict, astuple
from typing import Optional
from datetime import datetime
@dataclass(frozen=True, slots=True) # Immutable + memory efficient
class Point:
x: float
y: float
@property
def distance(self) -> float:
return (self.x ** 2 + self.y ** 2) ** 0.5
@dataclass
class User:
name: str
email: str
age: int
tags: list[str] = field(default_factory=list)
created_at: datetime = field(default_factory=datetime.now)
_password_hash: str = field(default="", repr=False, compare=False)
def __post_init__(self) -> None:
"""Validate after __init__ runs."""
if self.age < 0:
raise ValueError("Age must be non-negative")
if "@" not in self.email:
raise ValueError("Invalid email")
user = User("Alice", "alice@example.com", 30, ["admin"])
print(asdict(user)) # Convert to dict
print(astuple(user)) # Convert to tuple
Pydantic v2 Model Validation
Pydantic v2 takes data modeling further with runtime validation, serialization, JSON Schema generation, and settings management. Built on a Rust core (pydantic-core), v2 is 5-50x faster than v1 and is the foundation of FastAPI.
from pydantic import BaseModel, Field, field_validator, model_validator
from pydantic import ConfigDict, EmailStr
from datetime import datetime
class Address(BaseModel):
street: str
city: str
country: str = "US"
zip_code: str = Field(pattern=r"^\d{5}(-\d{4})?$")
class UserCreate(BaseModel):
model_config = ConfigDict(str_strip_whitespace=True)
name: str = Field(min_length=1, max_length=100)
email: EmailStr
age: int = Field(ge=0, le=150)
address: Address
tags: list[str] = Field(default_factory=list, max_length=10)
@field_validator("name")
@classmethod
def name_must_be_title_case(cls, v: str) -> str:
return v.title()
@model_validator(mode="after")
def check_age_for_minors(self) -> "UserCreate":
if self.age < 18 and "minor" not in self.tags:
self.tags.append("minor")
return self
# Automatic validation + serialization
user = UserCreate(
name=" alice smith ",
email="alice@example.com",
age=25,
address={"street": "123 Main St", "city": "NYC", "zip_code": "10001"},
)
print(user.model_dump_json(indent=2)) # JSON serialization
print(user.model_json_schema()) # JSON Schema generation
Choose dataclasses for simple internal data structures and Pydantic for external boundaries where validation matters: API request/response bodies, configuration files, database records, and event schemas.
3. Decorators Deep Dive
Decorators are Python's most powerful metaprogramming tool. They modify functions or classes at definition time using the @decorator syntax. Beyond simple wrappers, Python supports parameterized decorators, class decorators, decorator stacking, and functools.wraps for preserving metadata.
Parameterized Decorators
A decorator is simply a callable that takes a function and returns a function. Parameterized decorators add an outer function that returns the actual decorator. Class decorators take a class and return a modified class, enabling patterns like singleton, registration, and auto-serialization.
import functools
import time
from typing import Callable, TypeVar, ParamSpec
P = ParamSpec("P")
R = TypeVar("R")
# Simple decorator with functools.wraps
def timer(func: Callable[P, R]) -> Callable[P, R]:
@functools.wraps(func)
def wrapper(*args: P.args, **kwargs: P.kwargs) -> R:
start = time.perf_counter()
result = func(*args, **kwargs)
elapsed = time.perf_counter() - start
print(f"{func.__name__} took {elapsed:.4f}s")
return result
return wrapper
# Parameterized decorator (decorator factory)
def retry(max_attempts: int = 3, delay: float = 1.0):
"""Retry a function on failure with exponential backoff."""
def decorator(func: Callable[P, R]) -> Callable[P, R]:
@functools.wraps(func)
def wrapper(*args: P.args, **kwargs: P.kwargs) -> R:
last_exception: Exception | None = None
for attempt in range(max_attempts):
try:
return func(*args, **kwargs)
except Exception as e:
last_exception = e
if attempt < max_attempts - 1: # Do not sleep after the final attempt
wait = delay * (2 ** attempt)
print(f"Attempt {attempt + 1} failed, retrying in {wait}s")
time.sleep(wait)
raise last_exception # type: ignore
return wrapper
return decorator
@retry(max_attempts=5, delay=0.5)
@timer
def fetch_data(url: str) -> dict:
"""Fetch data from API with retry logic."""
import urllib.request
import json
with urllib.request.urlopen(url) as resp:
return json.loads(resp.read())
Class Decorators
Always use functools.wraps on wrapper functions to preserve the original function's __name__, __doc__, and __module__. Without it, debugging and documentation tools show the wrapper's metadata instead.
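A quick sketch of the difference (the decorator and function names below are illustrative):

```python
import functools

def plain(func):
    def wrapper(*args, **kwargs):  # No functools.wraps
        return func(*args, **kwargs)
    return wrapper

def wrapped(func):
    @functools.wraps(func)  # Copies __name__, __doc__, __module__, etc.
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@plain
def greet_a():
    """Say hello."""

@wrapped
def greet_b():
    """Say hello."""

print(greet_a.__name__)  # wrapper — metadata lost
print(greet_b.__name__)  # greet_b — metadata preserved
print(greet_b.__doc__)   # Say hello.
```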
# Class decorator: auto-register all subclasses
_registry: dict[str, type] = {}
def register(cls: type) -> type:
"""Register a class in the global registry."""
_registry[cls.__name__] = cls
return cls
@register
class JSONParser:
def parse(self, data: str) -> dict:
import json
return json.loads(data)
@register
class XMLParser:
def parse(self, data: str) -> dict:
# XML parsing logic
return {}
print(_registry) # {"JSONParser": <class>, "XMLParser": <class>}
# Class decorator: singleton pattern
def singleton(cls: type) -> type:
instances: dict[type, object] = {}
@functools.wraps(cls, updated=())
def get_instance(*args, **kwargs):
if cls not in instances:
instances[cls] = cls(*args, **kwargs)
return instances[cls]
return get_instance # type: ignore
@singleton
class DatabaseConnection:
def __init__(self, url: str) -> None:
self.url = url
print(f"Connecting to {url}")
db1 = DatabaseConnection("postgres://localhost/mydb")
db2 = DatabaseConnection("postgres://localhost/mydb")
print(db1 is db2) # True — same instance
4. Context Managers & Generators
Context managers handle resource lifecycle with guaranteed cleanup via the with statement. They implement __enter__ and __exit__ (or __aenter__/__aexit__ for async). The contextlib module provides shortcuts like @contextmanager, suppress(), and ExitStack for composing multiple managers.
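Of the contextlib helpers mentioned, suppress() is the quickest win: it replaces a try/except that deliberately ignores an expected exception. A minimal sketch:

```python
from contextlib import suppress
import os

# Equivalent to: try: os.remove(...) / except FileNotFoundError: pass
with suppress(FileNotFoundError):
    os.remove("definitely-missing.tmp")

# Only the listed exception types are swallowed; anything else propagates
survived = False
with suppress(KeyError):
    {"a": 1}["b"]  # KeyError is suppressed here
survived = True
print(survived)  # True
```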
Context Manager Patterns
from contextlib import contextmanager, asynccontextmanager, ExitStack
from typing import Generator, AsyncGenerator
import time
# Class-based context manager
class Timer:
def __init__(self, label: str) -> None:
self.label = label
self.elapsed: float = 0.0
def __enter__(self) -> "Timer":
self.start = time.perf_counter()
return self
def __exit__(self, exc_type, exc_val, exc_tb) -> bool:
self.elapsed = time.perf_counter() - self.start
print(f"{self.label}: {self.elapsed:.4f}s")
return False # Do not suppress exceptions
with Timer("data processing") as t:
data = [i ** 2 for i in range(1_000_000)]
# Generator-based context manager (simpler)
@contextmanager
def temp_directory() -> Generator[str, None, None]:
import tempfile, shutil
path = tempfile.mkdtemp()
try:
yield path # Provide the resource
finally:
shutil.rmtree(path) # Guaranteed cleanup
with temp_directory() as tmpdir:
print(f"Working in {tmpdir}")
# ExitStack — compose multiple context managers dynamically
def process_files(paths: list[str]) -> list[str]:
with ExitStack() as stack:
files = [stack.enter_context(open(p)) for p in paths]
return [f.read() for f in files]
Generators & yield from
Generators produce values lazily with yield, enabling memory-efficient iteration over large datasets. Generator expressions provide inline syntax. The yield from syntax (PEP 380) delegates to sub-generators, enabling clean recursive generation and coroutine composition.
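The generator expressions mentioned above give the same laziness inline, without a def — a small sketch:

```python
# Generator expression — lazy, constant memory, inline syntax
squares = (i * i for i in range(1_000_000))
print(next(squares))  # 0
print(next(squares))  # 1

# Feed one straight into an aggregation — no intermediate list is built
total = sum(i * i for i in range(10))
print(total)  # 285
```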
from typing import Generator, Iterator
# Generator for memory-efficient processing
def read_large_file(path: str, chunk_size: int = 8192) -> Generator[str, None, None]:
"""Read a large file in chunks without loading it all."""
with open(path) as f:
while chunk := f.read(chunk_size):
yield chunk
# yield from — delegate to sub-generator
def flatten(nested: list) -> Generator:
"""Recursively flatten nested lists."""
for item in nested:
if isinstance(item, list):
yield from flatten(item) # Delegate recursion
else:
yield item
data = [1, [2, 3], [4, [5, 6]], 7]
print(list(flatten(data))) # [1, 2, 3, 4, 5, 6, 7]
# Generator pipeline — compose data transformations
def lines(path: str) -> Iterator[str]:
with open(path) as f:
yield from f
def strip(it: Iterator[str]) -> Iterator[str]:
for line in it:
yield line.strip()
def non_empty(it: Iterator[str]) -> Iterator[str]:
for line in it:
if line:
yield line
# Compose: file -> strip -> non_empty -> process
# pipeline = non_empty(strip(lines("data.txt")))
# for line in pipeline:
# process(line)
Async Context Managers
Async context managers combine async/await with resource management, essential for database connections, HTTP sessions, and file I/O in async code.
import asyncio
@asynccontextmanager
async def db_transaction(conn) -> AsyncGenerator:
"""Async context manager for database transactions."""
tx = await conn.begin()
try:
yield tx
await tx.commit()
except Exception:
await tx.rollback()
raise
# Usage:
# async with db_transaction(conn) as tx:
# await tx.execute("INSERT INTO users ...")
5. Async/Await Patterns
asyncio is Python's built-in framework for concurrent I/O-bound operations. A single thread handles thousands of connections using cooperative multitasking. The async/await syntax makes asynchronous code nearly as readable as synchronous code.
asyncio.gather & Semaphores
asyncio.gather runs multiple coroutines concurrently and collects results. Semaphores limit concurrent access to shared resources. Task groups (Python 3.11+, PEP 654) provide structured concurrency with automatic cancellation on failure.
import asyncio
import httpx
async def fetch_url(client: httpx.AsyncClient, url: str) -> dict:
"""Fetch a single URL."""
response = await client.get(url)
return {"url": url, "status": response.status_code}
async def fetch_all(urls: list[str]) -> list[dict]:
"""Fetch multiple URLs concurrently."""
async with httpx.AsyncClient() as client:
tasks = [fetch_url(client, url) for url in urls]
results = await asyncio.gather(*tasks, return_exceptions=True)
return [r for r in results if isinstance(r, dict)]
# Semaphore — limit concurrent connections
async def fetch_with_limit(urls: list[str], max_concurrent: int = 10):
semaphore = asyncio.Semaphore(max_concurrent)
async def limited_fetch(client: httpx.AsyncClient, url: str):
async with semaphore: # At most max_concurrent at once
return await fetch_url(client, url)
async with httpx.AsyncClient() as client:
tasks = [limited_fetch(client, url) for url in urls]
return await asyncio.gather(*tasks)
Task Groups (Python 3.11+)
For production async code, use asyncio.TaskGroup for structured concurrency, aiohttp or httpx for HTTP requests, asyncpg for PostgreSQL, and motor for MongoDB. Always handle cancellation properly and use asyncio.shield for critical operations.
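asyncio.shield, mentioned above, protects a critical operation from outer cancellation — a minimal sketch (save_state and handler are illustrative names):

```python
import asyncio

results: list[str] = []

async def save_state() -> None:
    await asyncio.sleep(0.05)            # Simulate a critical write
    results.append("saved")

async def handler() -> None:
    # shield() keeps the inner task alive even if handler() is cancelled
    await asyncio.shield(save_state())

async def main() -> None:
    task = asyncio.create_task(handler())
    await asyncio.sleep(0.01)
    task.cancel()                        # handler() is cancelled...
    await asyncio.sleep(0.1)             # ...but save_state() still completes
    print(results)                       # ['saved']

asyncio.run(main())
```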
# TaskGroup — structured concurrency (Python 3.11+)
async def process_batch(items: list[str]) -> list[str]:
async with asyncio.TaskGroup() as tg:
tasks = [tg.create_task(process_item(item)) for item in items]
# All tasks completed successfully here
# If ANY task raises, ALL are cancelled automatically
return [task.result() for task in tasks]
async def process_item(item: str) -> str:
await asyncio.sleep(0.1) # Simulate I/O
return f"processed: {item}"
# Producer-consumer with asyncio.Queue
async def producer(queue: asyncio.Queue[str], items: list[str]):
for item in items:
await queue.put(item)
await queue.put("") # Sentinel to signal done
async def consumer(queue: asyncio.Queue[str], name: str):
while True:
item = await queue.get()
if item == "": # Sentinel
await queue.put("") # Pass sentinel to other consumers
break
print(f"{name} processing: {item}")
await asyncio.sleep(0.1)
queue.task_done()
async def main():
queue: asyncio.Queue[str] = asyncio.Queue(maxsize=100)
items = [f"item-{i}" for i in range(50)]
async with asyncio.TaskGroup() as tg:
tg.create_task(producer(queue, items))
for i in range(3): # 3 consumers
tg.create_task(consumer(queue, f"worker-{i}"))
6. Metaclasses & Descriptors
Metaclasses are the classes of classes. When you define a class, Python uses its metaclass (usually type) to create it. Custom metaclasses intercept class creation to add validation, register classes, modify attributes, or enforce coding patterns.
Custom Metaclass
# Metaclass that validates class attributes
class ValidatedMeta(type):
def __new__(mcs, name: str, bases: tuple, namespace: dict):
# Ensure all public methods have docstrings
for attr_name, attr_value in namespace.items():
if callable(attr_value) and not attr_name.startswith("_"):
if not attr_value.__doc__:
raise TypeError(
f"{name}.{attr_name} must have a docstring"
)
return super().__new__(mcs, name, bases, namespace)
class APIHandler(metaclass=ValidatedMeta):
def get(self, request):
"""Handle GET request."""
pass
def post(self, request):
"""Handle POST request."""
pass
# This would raise TypeError:
# class BadHandler(metaclass=ValidatedMeta):
# def get(self, request): # No docstring!
# pass
Descriptors & __set_name__
Descriptors implement __get__, __set__, and __delete__ to control attribute access on instances. They power Python's property, staticmethod, classmethod, and ORM fields. The __set_name__ hook (PEP 487) lets descriptors know their attribute name automatically.
# Descriptor for validated attributes
class Validated:
def __init__(self, *, min_val: float = float("-inf"), max_val: float = float("inf")):
self.min_val = min_val
self.max_val = max_val
def __set_name__(self, owner: type, name: str) -> None:
"""Called automatically when class is created."""
self.public_name = name
self.private_name = f"_{name}"
def __get__(self, obj, objtype=None):
if obj is None:
return self
return getattr(obj, self.private_name, None)
def __set__(self, obj, value: float) -> None:
if not isinstance(value, (int, float)):
raise TypeError(f"{self.public_name} must be a number")
if not (self.min_val <= value <= self.max_val):
raise ValueError(
f"{self.public_name} must be between "
f"{self.min_val} and {self.max_val}"
)
setattr(obj, self.private_name, value)
class Product:
price = Validated(min_val=0, max_val=10_000)
quantity = Validated(min_val=0, max_val=1_000_000)
def __init__(self, name: str, price: float, quantity: int):
self.name = name
self.price = price # Triggers Validated.__set__
self.quantity = quantity # Triggers Validated.__set__
p = Product("Widget", 29.99, 100) # OK
# Product("Bad", -5, 10) # ValueError!
# __init_subclass__ — simpler alternative to metaclasses
class PluginBase:
_plugins: dict[str, type] = {}
def __init_subclass__(cls, *, plugin_name: str = "", **kwargs):
super().__init_subclass__(**kwargs)
name = plugin_name or cls.__name__.lower()
PluginBase._plugins[name] = cls
class JSONPlugin(PluginBase, plugin_name="json"):
pass
class YAMLPlugin(PluginBase, plugin_name="yaml"):
pass
print(PluginBase._plugins) # {"json": <class>, "yaml": <class>}
7. Pattern Matching
Structural pattern matching (PEP 634, Python 3.10+) brings match/case statements that destructure and match data by shape. Unlike switch statements in other languages, Python's match works with sequences, mappings, classes, and nested structures.
Guard clauses (if conditions) add runtime checks to pattern cases. Capture patterns bind matched values to names. Or patterns (|) match multiple alternatives. Wildcard (_) matches anything without binding.
Structural Pattern Matching Examples
from dataclasses import dataclass
# Matching sequences and mappings
def process_command(command: list[str]) -> str:
match command:
case ["quit" | "exit"]:
return "Goodbye!"
case ["hello", name]:
return f"Hello, {name}!"
case ["add", *numbers] if all(n.isdigit() for n in numbers):
total = sum(int(n) for n in numbers)
return f"Sum: {total}"
case ["set", key, value]:
return f"Setting {key} = {value}"
case _:
return "Unknown command"
print(process_command(["hello", "Alice"])) # Hello, Alice!
print(process_command(["add", "1", "2", "3"])) # Sum: 6
# Matching class instances
@dataclass
class Point:
x: float
y: float
@dataclass
class Circle:
center: Point
radius: float
@dataclass
class Rectangle:
top_left: Point
width: float
height: float
def describe_shape(shape) -> str:
match shape:
case Circle(center=Point(x=0, y=0), radius=r):
return f"Circle at origin with radius {r}"
case Circle(center=Point(x=x, y=y), radius=r) if r > 100:
return f"Large circle at ({x}, {y})"
case Rectangle(width=w, height=h) if w == h:
return f"Square with side {w}"
case Rectangle(width=w, height=h):
return f"Rectangle {w}x{h}"
case _:
return "Unknown shape"
Matching Mappings & API Responses
Pattern matching excels at parsing command structures, handling API responses with different shapes, processing AST nodes, and implementing state machines.
# Matching dict-like structures (API responses)
def handle_response(response: dict) -> str:
match response:
case {"status": "ok", "data": {"users": [first, *rest]}}:
return f"Found {1 + len(rest)} users, first: {first}"
case {"status": "ok", "data": data}:
return f"Success with data: {data}"
case {"status": "error", "code": code, "message": msg}:
return f"Error {code}: {msg}"
case {"status": "error", **rest}:
return f"Error with details: {rest}"
case _:
return "Unexpected response format"
print(handle_response({
"status": "ok",
"data": {"users": ["Alice", "Bob", "Charlie"]}
})) # Found 3 users, first: Alice
print(handle_response({
"status": "error",
"code": 404,
"message": "Not found"
})) # Error 404: Not found
8. Memory Management
__slots__ restricts instance attributes to a fixed set, replacing the per-instance __dict__ with a more compact representation. This reduces memory usage by 40-60% and speeds up attribute access. It is essential for classes with millions of instances.
__slots__ & Memory Optimization
import sys
import weakref
import gc
# Without __slots__: each instance has a __dict__
class PointRegular:
def __init__(self, x: float, y: float):
self.x = x
self.y = y
# With __slots__: fixed attribute storage, no __dict__
class PointSlots:
__slots__ = ("x", "y")
def __init__(self, x: float, y: float):
self.x = x
self.y = y
# Memory comparison
regular = PointRegular(1.0, 2.0)
slotted = PointSlots(1.0, 2.0)
print(f"Regular: {sys.getsizeof(regular)} + {sys.getsizeof(regular.__dict__)} bytes")
print(f"Slotted: {sys.getsizeof(slotted)} bytes (no __dict__)")
# Regular: ~56 + ~104 = 160 bytes
# Slotted: ~56 bytes (60% less memory!)
# With 1 million instances:
# Regular: ~160 MB
# Slotted: ~56 MB
Weakref & Garbage Collection
weakref creates references that do not prevent garbage collection, crucial for caches, observer patterns, and avoiding circular reference leaks. The gc module provides control over the garbage collector including debugging reference cycles.
# weakref — references that do not prevent GC
class ExpensiveObject:
def __init__(self, name: str):
self.name = name
def __del__(self):
print(f"Deleting {self.name}")
# WeakValueDictionary for caching
cache: weakref.WeakValueDictionary[str, ExpensiveObject] = (
weakref.WeakValueDictionary()
)
obj = ExpensiveObject("data-1")
cache["data-1"] = obj
print("data-1" in cache) # True
del obj # No more strong references
gc.collect() # Force garbage collection
print("data-1" in cache) # False — obj was collected
# Memory profiling with tracemalloc
import tracemalloc
tracemalloc.start()
# ... your code here ...
data = [dict(x=i, y=i**2) for i in range(100_000)]
snapshot = tracemalloc.take_snapshot()
top_stats = snapshot.statistics("lineno")
for stat in top_stats[:5]:
print(stat)
For memory profiling, use tracemalloc (built-in), memory_profiler, or objgraph. Track allocations, find memory leaks, and optimize data structures to reduce your application's memory footprint.
9. Concurrency: Threading vs Multiprocessing vs Asyncio
Python offers three concurrency models, each suited to different workloads. Threading uses OS threads sharing the same memory space but is limited by the GIL for CPU-bound work. Multiprocessing uses separate processes with full CPU parallelism but higher memory overhead. Asyncio uses a single-threaded event loop for I/O-bound concurrency.
| Feature | threading | multiprocessing | asyncio |
|---|---|---|---|
| Best For | I/O-bound (legacy) | CPU-bound | I/O-bound (modern) |
| GIL | Limited by GIL | No GIL (separate processes) | Single-threaded (N/A) |
| Memory | Shared | Separate (high overhead) | Shared (low overhead) |
| Scalability | ~dozens of threads | ~CPU core count | ~thousands of tasks |
| Debugging | Hard (race conditions) | Medium | Easier (single-threaded) |
Concurrency Pattern Comparison
For I/O-bound tasks (HTTP requests, database queries, file operations), use asyncio. For CPU-bound tasks (data processing, image manipulation, ML training), use multiprocessing. Threading works for I/O-bound tasks when async is not an option, or when integrating with C libraries that release the GIL.
import asyncio
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor
import time
# I/O-bound: asyncio wins
async def async_fetch_all(urls: list[str]) -> list:
async def fetch(url: str):
await asyncio.sleep(0.1) # Simulate network I/O
return url
return await asyncio.gather(*[fetch(u) for u in urls])
# I/O-bound: threading alternative
def threaded_fetch_all(urls: list[str]) -> list:
def fetch(url: str) -> str:
time.sleep(0.1) # Simulate network I/O
return url
with ThreadPoolExecutor(max_workers=20) as executor:
return list(executor.map(fetch, urls))
# CPU-bound: multiprocessing wins
def cpu_task(n: int) -> int:
"""CPU-intensive computation."""
return sum(i * i for i in range(n))
def parallel_compute(numbers: list[int]) -> list[int]:
with ProcessPoolExecutor() as executor:
return list(executor.map(cpu_task, numbers))
# Benchmark results (1000 URLs / 10 CPU tasks):
# asyncio: ~0.1s (1000 concurrent coroutines)
# threading (20): ~5.0s (20 threads, batched)
# multiprocessing: ~1.5s (CPU-bound, 8 cores)
Python 3.13 introduced a free-threaded build (PEP 703) that removes the GIL, enabling true thread parallelism. This is still experimental but represents the future of Python concurrency.
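To check which build you are running, CPython 3.13+ exposes a private helper — a sketch that degrades gracefully on older versions (note `sys._is_gil_enabled` is an internal API and may change):

```python
import sys

# sys._is_gil_enabled() exists on Python 3.13+; absent on older builds
check = getattr(sys, "_is_gil_enabled", None)
if check is None:
    print("Standard build (GIL always enabled)")
else:
    print(f"GIL enabled: {check()}")
```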
10. Testing with pytest
pytest is the de facto standard for Python testing. It provides a simpler syntax than unittest, powerful fixtures for setup/teardown, parametrize for data-driven tests, and a rich plugin ecosystem.
Fixtures & Parametrize
Fixtures manage test dependencies and state. They support scopes (function, class, module, session) and automatic cleanup with yield. The conftest.py file shares fixtures across test modules without imports.
import pytest
from unittest.mock import AsyncMock, patch, MagicMock
# conftest.py — shared fixtures
@pytest.fixture
def sample_user() -> dict:
return {"name": "Alice", "email": "alice@test.com", "age": 30}
@pytest.fixture
def db_session():
"""Create a test database session with rollback."""
session = create_test_session()
yield session # Provide to test
session.rollback() # Cleanup after test
session.close()
@pytest.fixture(scope="module")
def api_client():
"""Shared HTTP client for the entire test module."""
client = TestClient(app)
yield client
client.close()
# Parametrize — generate multiple test cases
@pytest.mark.parametrize("input_val, expected", [
("hello", "HELLO"),
("World", "WORLD"),
("", ""),
("123abc", "123ABC"),
("already UPPER", "ALREADY UPPER"),
])
def test_uppercase(input_val: str, expected: str):
assert input_val.upper() == expected
# Parametrize with IDs for clear test names
@pytest.mark.parametrize("a, b, expected", [
pytest.param(1, 2, 3, id="positive"),
pytest.param(-1, 1, 0, id="negative-positive"),
pytest.param(0, 0, 0, id="zeros"),
])
def test_add(a: int, b: int, expected: int):
assert a + b == expected
Mocking & Async Testing
Use unittest.mock or pytest-mock for isolating units under test. Parametrize generates multiple test cases from data. Markers (@pytest.mark) categorize tests for selective execution.
# Mocking external services
class UserService:
def __init__(self, api_client):
self.api_client = api_client
async def get_user(self, user_id: int) -> dict:
response = await self.api_client.get(f"/users/{user_id}")
return response.json()
def test_get_user_with_mock():
mock_client = MagicMock()
mock_client.get.return_value.json.return_value = {
"id": 1, "name": "Alice"
}
service = UserService(mock_client)
# ... test logic
# Async test (pytest-asyncio)
@pytest.mark.asyncio
async def test_async_fetch():
mock_client = AsyncMock()
mock_client.get.return_value.json.return_value = {"status": "ok"}
service = UserService(mock_client)
result = await service.get_user(1)
assert result == {"status": "ok"}
mock_client.get.assert_called_once_with("/users/1")
# Patching module-level dependencies
@patch("myapp.services.requests.get")
def test_external_api(mock_get):
mock_get.return_value.status_code = 200
mock_get.return_value.json.return_value = {"data": [1, 2, 3]}
# ... test your function that calls requests.get
# Custom markers for test categorization
@pytest.mark.slow
def test_large_dataset_processing():
"""Run with: pytest -m slow"""
pass
@pytest.mark.integration
def test_database_connection():
"""Run with: pytest -m integration"""
pass
11. Python Packaging
Modern Python packaging centers on pyproject.toml (PEP 621), replacing setup.py and setup.cfg. It defines project metadata, dependencies, build system, and tool configuration in one file.
pyproject.toml
# pyproject.toml — modern Python project configuration
[project]
name = "my-awesome-lib"
version = "1.0.0"
description = "A high-performance data processing library"
readme = "README.md"
license = {text = "MIT"}
requires-python = ">=3.11"
authors = [{name = "Alice", email = "alice@example.com"}]
keywords = ["data", "processing", "etl"]
classifiers = [
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
"Programming Language :: Python :: 3.13",
]
dependencies = [
"httpx>=0.27",
"pydantic>=2.0",
"rich>=13.0",
]
[project.optional-dependencies]
dev = ["pytest>=8.0", "ruff>=0.8", "mypy>=1.13"]
docs = ["mkdocs-material>=9.0"]
[project.scripts]
my-cli = "my_lib.cli:main"
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
[tool.ruff]
line-length = 88
target-version = "py311"
[tool.ruff.lint]
select = ["E", "F", "I", "UP", "B", "SIM"]
[tool.mypy]
python_version = "3.11"
strict = true
[tool.pytest.ini_options]
asyncio_mode = "auto"
testpaths = ["tests"]
uv: Modern Python Package Management
Build backends include setuptools (the long-standing default), Hatch (modern), Poetry (dependency management + publishing), and PDM. uv (from Astral, makers of ruff) is the fastest package installer and resolver, 10-100x faster than pip.
# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh
# Create a new project
uv init my-project
cd my-project
# Add dependencies (resolves and installs in seconds)
uv add httpx pydantic rich
uv add --dev pytest ruff mypy
# Run scripts (auto-creates venv if needed)
uv run python main.py
uv run pytest
uv run ruff check .
# Pin Python version
uv python pin 3.12
# Lock dependencies for reproducibility
uv lock
# Build and publish
uv build
uv publish
# Run a one-off script with inline dependencies
uv run --with requests --with rich script.py
# Speed comparison (adding 10 packages):
# pip: 45.2s
# poetry: 32.1s
# uv: 0.4s (100x faster!)
For new projects in 2026, use uv for dependency management: uv init to create a project, uv add for dependencies, uv run to execute scripts, and uv publish to upload to PyPI. It handles virtual environments automatically.
12. Performance Optimization
Start optimization with profiling: cProfile for function-level timing, line_profiler for line-by-line analysis, and py-spy for sampling-based profiling of production code. Never optimize without profiling first.
Profiling & Memoization
import cProfile
import functools
import time
# cProfile — function-level profiling
def profile_me():
data = [i ** 2 for i in range(1_000_000)]
sorted_data = sorted(data, reverse=True)
return sum(sorted_data[:100])
cProfile.run("profile_me()", sort="cumulative")
# Output:
# ncalls tottime percall cumtime percall filename:lineno(function)
# 1 0.000 0.000 0.412 0.412 <string>:1(<module>)
# 1 0.231 0.231 0.412 0.412 script.py:5(profile_me)
# 1 0.181 0.181 0.181 0.181 {built-in method builtins.sorted}
# functools.lru_cache — memoization
@functools.lru_cache(maxsize=128)
def fibonacci(n: int) -> int:
if n < 2:
return n
return fibonacci(n - 1) + fibonacci(n - 2)
# Without cache: fibonacci(35) takes ~5 seconds
# With cache: fibonacci(35) takes ~0.00001 seconds
print(fibonacci(100)) # Instant!
print(fibonacci.cache_info()) # Hits, misses, size
# functools.cache — unlimited cache (Python 3.9+)
@functools.cache
def expensive_computation(x: int, y: int) -> float:
time.sleep(1) # Simulate expensive work
    return (x ** y) / (x + y)
NumPy Vectorization vs Python Loops
Memoization with functools.lru_cache and functools.cache eliminates redundant computation. NumPy vectorization replaces Python loops with C-level operations for 10-100x speedups on numerical data.
import numpy as np
import time
# Python loop: slow
def python_distance(x1, y1, x2, y2):
"""Calculate distances using pure Python."""
distances = []
for i in range(len(x1)):
d = ((x1[i] - x2[i])**2 + (y1[i] - y2[i])**2) ** 0.5
distances.append(d)
return distances
# NumPy vectorized: fast
def numpy_distance(x1, y1, x2, y2):
"""Calculate distances using NumPy vectorization."""
return np.sqrt((x1 - x2)**2 + (y1 - y2)**2)
# Benchmark with 1 million points
n = 1_000_000
x1 = np.random.rand(n)
y1 = np.random.rand(n)
x2 = np.random.rand(n)
y2 = np.random.rand(n)
start = time.perf_counter()
python_distance(list(x1), list(y1), list(x2), list(y2))
python_time = time.perf_counter() - start
start = time.perf_counter()
numpy_distance(x1, y1, x2, y2)
numpy_time = time.perf_counter() - start
print(f"Python: {python_time:.3f}s")
print(f"NumPy: {numpy_time:.3f}s")
print(f"Speedup: {python_time / numpy_time:.0f}x")
# Python: 1.234s
# NumPy: 0.012s
# Speedup: ~100x
Cython & Compiled Acceleration
For maximum performance, Cython compiles Python to C with optional static typing. Alternatives include Numba (JIT for numerical code), mypyc (compiles type-annotated Python), and PyO3 (write Python extensions in Rust).
# math_ops.pyx — Cython source file
# cython: language_level=3
def primes_cython(int limit):
    """Find primes up to limit using the Sieve of Eratosthenes."""
cdef int i, j
cdef list sieve = [True] * (limit + 1)
cdef list result = []
for i in range(2, limit + 1):
if sieve[i]:
result.append(i)
for j in range(i * i, limit + 1, i):
sieve[j] = False
return result
# setup.py for Cython
# from setuptools import setup
# from Cython.Build import cythonize
# setup(ext_modules=cythonize("math_ops.pyx"))
# Alternative: Numba JIT (no Cython setup needed)
# from numba import njit
# @njit
# def fast_sum(arr):
# total = 0.0
# for val in arr:
# total += val
#     return total
13. Design Patterns in Python
Classic design patterns look different in Python because of first-class functions, duck typing, and dynamic features. Many Gang of Four patterns that require complex class hierarchies in Java become simple functions or decorators in Python.
Singleton & Factory Patterns
# Singleton — using module-level instance (Pythonic way)
# config.py
class _Config:
def __init__(self):
self._settings: dict = {}
def get(self, key: str, default=None):
return self._settings.get(key, default)
def set(self, key: str, value) -> None:
self._settings[key] = value
config = _Config() # Module-level singleton
# Usage: from config import config
# Factory — using a registry dictionary
from typing import Protocol
class Serializer(Protocol):
def serialize(self, data: dict) -> str: ...
def deserialize(self, raw: str) -> dict: ...
class JSONSerializer:
def serialize(self, data: dict) -> str:
import json
return json.dumps(data)
def deserialize(self, raw: str) -> dict:
import json
return json.loads(raw)
class YAMLSerializer:
def serialize(self, data: dict) -> str:
import yaml
return yaml.dump(data)
def deserialize(self, raw: str) -> dict:
import yaml
return yaml.safe_load(raw)
_serializers: dict[str, type[Serializer]] = {
"json": JSONSerializer,
"yaml": YAMLSerializer,
}
def get_serializer(format: str) -> Serializer:
"""Factory function to create serializers."""
cls = _serializers.get(format)
if cls is None:
raise ValueError(f"Unknown format: {format}")
    return cls()
Observer & Strategy Patterns
The Singleton pattern uses module-level instances or __new__. The Factory pattern leverages dictionaries of callables. The Observer pattern uses weak references so the emitter never keeps dead listeners alive. The Strategy pattern passes functions directly. Python's dynamic nature makes many patterns lighter than their Java counterparts.
import inspect
import weakref
from typing import Callable
# Observer pattern with weakref
class EventEmitter:
    def __init__(self):
        self._listeners: dict[str, list[weakref.ref]] = {}
    def on(self, event: str, callback: Callable) -> None:
        # Bound methods are collected immediately under weakref.ref;
        # WeakMethod keeps them alive as long as their instance.
        ref = (weakref.WeakMethod(callback) if inspect.ismethod(callback)
               else weakref.ref(callback))
        self._listeners.setdefault(event, []).append(ref)
def emit(self, event: str, *args, **kwargs) -> None:
if event not in self._listeners:
return
alive = []
for ref in self._listeners[event]:
callback = ref()
if callback is not None:
callback(*args, **kwargs)
alive.append(ref)
self._listeners[event] = alive # Prune dead refs
# Strategy pattern — functions as strategies
SortStrategy = Callable[[list], list]
def bubble_sort(data: list) -> list:
arr = data.copy()
for i in range(len(arr)):
for j in range(len(arr) - i - 1):
if arr[j] > arr[j + 1]:
arr[j], arr[j + 1] = arr[j + 1], arr[j]
return arr
def quick_sort(data: list) -> list:
if len(data) <= 1:
return data
pivot = data[len(data) // 2]
left = [x for x in data if x < pivot]
middle = [x for x in data if x == pivot]
right = [x for x in data if x > pivot]
return quick_sort(left) + middle + quick_sort(right)
class DataProcessor:
def __init__(self, strategy: SortStrategy = sorted):
self.sort = strategy # Inject strategy
def process(self, data: list) -> list:
return self.sort(data)
processor = DataProcessor(strategy=quick_sort)
print(processor.process([3, 1, 4, 1, 5, 9]))  # [1, 1, 3, 4, 5, 9]
Understanding when to apply these patterns and when Python's built-in features suffice is key to writing idiomatic, maintainable code.
Conclusion
Advanced Python features transform the language from a scripting tool into a full-scale software engineering platform. Type hints catch bugs before runtime. Async/await handles massive concurrency. Pattern matching makes complex logic declarative. And modern tooling like uv and ruff makes the developer experience fast and reliable.
Start with the features most relevant to your current project. Type hints and dataclasses provide immediate value for any codebase. Async/await and testing patterns are essential for web services. Metaclasses and descriptors become important as you build frameworks and libraries. Performance optimization and design patterns round out your toolkit for building production-grade Python applications.
FAQ
What is the difference between TypeVar and Protocol in Python?
TypeVar creates generic type variables that preserve specific types through function calls (like T in generics). Protocol defines structural subtyping interfaces where any class with matching methods satisfies the protocol without inheriting from it. Use TypeVar for generic containers and functions; use Protocol for defining expected behavior interfaces.
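A minimal stdlib-only sketch of the distinction (the `first`, `Closeable`, and `FileLike` names here are illustrative, not from any library):

```python
from typing import Protocol, TypeVar

T = TypeVar("T")

def first(items: list[T]) -> T:
    """TypeVar: the element type flows through -- first([1, 2]) is an int."""
    return items[0]

class Closeable(Protocol):
    """Protocol: anything with a matching close() satisfies this, no inheritance."""
    def close(self) -> None: ...

class FileLike:  # note: does NOT inherit from Closeable
    def close(self) -> None:
        print("closed")

def shutdown(resource: Closeable) -> None:
    resource.close()

print(first([1, 2, 3]))  # 1 -- the checker infers int, not object
shutdown(FileLike())     # FileLike matches Closeable structurally
```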
When should I use dataclasses vs Pydantic?
Use dataclasses for simple internal data structures where validation is not needed. Use Pydantic for external data boundaries like API inputs, configuration files, and database records where runtime validation, serialization, and JSON Schema generation are required. Pydantic v2 is also significantly faster due to its Rust core.
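A small illustration of that boundary: dataclasses trust their inputs, while Pydantic rejects bad ones. The Pydantic half is shown commented out since it is a third-party dependency:

```python
from dataclasses import dataclass

@dataclass
class Point:
    x: int
    y: int

# dataclasses do NOT validate at runtime -- annotations are hints only
p = Point(x="not-a-number", y=2)  # accepted silently
print(p.x)  # 'not-a-number'

# Pydantic (pip install pydantic) would reject the same input:
# from pydantic import BaseModel
# class PointModel(BaseModel):
#     x: int
#     y: int
# PointModel(x="not-a-number", y=2)  # raises ValidationError
```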
How does asyncio compare to threading in Python?
asyncio uses a single-threaded event loop for cooperative concurrency, ideal for I/O-bound tasks with thousands of connections. Threading uses OS threads but is limited by the GIL for CPU-bound work. asyncio generally has lower overhead and is easier to reason about, but requires async-compatible libraries.
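To see the single-threaded concurrency in action, here is a toy benchmark where three simulated I/O waits overlap instead of running sequentially (`fetch` is a stand-in for a real network call):

```python
import asyncio
import time

async def fetch(name: str, delay: float) -> str:
    # Simulated I/O wait; a real task would await a socket or HTTP response
    await asyncio.sleep(delay)
    return name

async def main() -> list[str]:
    # Three 0.1s "requests" run concurrently on one thread
    return await asyncio.gather(fetch("a", 0.1), fetch("b", 0.1), fetch("c", 0.1))

start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start
print(results)        # ['a', 'b', 'c']
print(elapsed < 0.3)  # True: ~0.1s total, not 0.3s sequential
```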
What is structural pattern matching in Python?
Structural pattern matching (match/case, Python 3.10+) destructures and matches data by shape. Unlike simple switch statements, it handles sequences, mappings, class instances, and nested structures. Guard clauses add conditional logic within cases. It excels at parsing complex data structures.
How do __slots__ improve Python performance?
__slots__ replaces the per-instance __dict__ dictionary with a fixed-size array of attribute slots. This reduces memory usage by 40-60% per instance and speeds up attribute access. It is critical for classes with millions of instances like data processing records or game entities.
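A quick demonstration of the mechanism (exact byte counts vary by Python version, so this only shows the structural difference):

```python
class Plain:
    def __init__(self, x: int, y: int):
        self.x, self.y = x, y

class Slotted:
    __slots__ = ("x", "y")  # fixed attribute slots, no per-instance dict
    def __init__(self, x: int, y: int):
        self.x, self.y = x, y

p, s = Plain(1, 2), Slotted(1, 2)

# Slotted instances carry no __dict__ at all
print(hasattr(p, "__dict__"))  # True
print(hasattr(s, "__dict__"))  # False

# The trade-off: attributes outside __slots__ are rejected
try:
    s.z = 3
except AttributeError:
    print("cannot add attributes outside __slots__")
```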
What is the recommended Python packaging tool in 2026?
uv (from Astral) is the recommended tool for new projects. It is 10-100x faster than pip, handles virtual environments automatically, supports pyproject.toml natively, and provides commands for project creation, dependency management, and publishing. Poetry and PDM remain popular alternatives.
When should I use metaclasses vs __init_subclass__?
__init_subclass__ (PEP 487) is simpler and preferred for most subclass customization needs like validation and registration. Use metaclasses only when you need to intercept class creation itself, modify the class namespace before creation, or control the class hierarchy. Most real-world code never needs custom metaclasses.
How do I choose between Cython, Numba, and PyO3 for performance?
Use Cython for gradual optimization of existing Python code with optional type annotations. Use Numba for JIT-compiling numerical functions with NumPy arrays (no code changes needed). Use PyO3 for writing high-performance Python extensions in Rust with full control. For most cases, start with profiling and algorithmic improvements before reaching for compiled solutions.