Physics-inspired event detection — 6 methods, from pure signal processing to vision-language models
Modern video streams generate terabytes of mostly unremarkable footage. A 24-hour surveillance feed might contain 3 minutes of meaningful events. A 90-minute dashcam recording might have 5 critical moments. Watching it all is impractical. Having AI watch every frame is expensive. EDF explores a middle path: physics-inspired signal processing that cheaply identifies where the interesting stuff is.
Identify braking events, near-collisions, lane changes in long recordings without watching hours of uneventful driving.
Flag anomalous motion patterns in hours of static footage, reducing review time to just the segments worth watching.
Automatically clip highlights — goals, fouls, key plays — without manual annotation or frame-by-frame scrubbing.
EDF is a research framework and demonstration. The detection methods are real implementations — 73 passing tests on genuine algorithms. This is not a production system; it is an exploration of how physics-inspired signal processing can reduce the cost of AI-based video understanding.
EDF treats event detection as a physics problem. Each video window has an "energy" — a weighted sum of five signal features. High energy means something interesting is happening.
E(W) = α₁·φ_motion + α₂·φ_interaction + α₃·φ_scene + α₄·φ_uncertainty + α₅·φ_spectral

Where:

- φ_motion = ‖v‖² + ‖dv/dt‖² (optical flow velocity + acceleration)
- φ_interaction = Var(edge_density) (Canny edge variance across frames)
- φ_scene = ‖hist_end - hist_start‖ (color histogram L2 shift)
- φ_uncertainty = E[pixel_variance] (per-frame information content)
- φ_spectral = P_high / P_low (FFT energy ratio of flow magnitude)
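The energy score is just a dot product of the five features with their weights. A minimal sketch, assuming hypothetical weight values (EDF's actual defaults may differ):

```python
import numpy as np

# α₁..α₅ — illustrative weights, not EDF's real defaults
ALPHAS = np.array([0.3, 0.2, 0.2, 0.15, 0.15])

def window_energy(features: np.ndarray, alphas: np.ndarray = ALPHAS) -> float:
    """Weighted sum of the five per-window signal features.

    features: [phi_motion, phi_interaction, phi_scene,
               phi_uncertainty, phi_spectral]
    """
    return float(np.dot(alphas, features))

# A braking window (strong motion, high-frequency spectral signature)
# scores higher than a calm cruising window.
calm = window_energy(np.array([0.1, 0.05, 0.02, 0.1, 0.2]))
brake = window_energy(np.array([0.9, 0.4, 0.3, 0.5, 0.8]))
assert brake > calm
```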
Hierarchical Filtering
Rather than compute all five features for every window, EDF filters in three passes: cheap features run first, and expensive features run only on windows that already look promising.
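The cascade idea can be sketched in a few lines. The pass assignments and thresholds below are assumptions for illustration, not EDF's actual configuration:

```python
# Each pass is (scoring_fn, threshold); only survivors reach the next,
# more expensive pass.
def cascade_filter(windows, passes):
    survivors = list(windows)
    for score, threshold in passes:
        survivors = [w for w in survivors if score(w) >= threshold]
    return survivors

# Example: a cheap motion check runs first, the costly spectral check last.
windows = [{"motion": 0.1, "spectral": 0.9},
           {"motion": 0.8, "spectral": 0.2},
           {"motion": 0.7, "spectral": 0.9}]
passes = [(lambda w: w["motion"], 0.5),     # pass 1: cheap
          (lambda w: w["spectral"], 0.5)]   # pass 2: expensive
survivors = cascade_filter(windows, passes)
assert survivors == [{"motion": 0.7, "spectral": 0.9}]
```

Most windows fall out at the cheap pass, so the expensive feature is computed for only a small fraction of the video.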
A sudden braking event creates a sharp, high-frequency motion signature in optical flow. A car cruising at constant speed creates low-frequency, smooth motion. The FFT separates these by decomposing the flow magnitude into frequency bands — P_high / P_low tells us whether a window's motion is sudden or gradual.
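The spectral ratio can be computed directly from a 1-D flow-magnitude signal. A minimal sketch; the 2 Hz cutoff between "gradual" and "sudden" motion is an assumption, not EDF's actual parameter:

```python
import numpy as np

def spectral_ratio(flow_mag: np.ndarray, fps: float, cutoff_hz: float = 2.0) -> float:
    """P_high / P_low of the flow-magnitude power spectrum."""
    spectrum = np.abs(np.fft.rfft(flow_mag - flow_mag.mean())) ** 2
    freqs = np.fft.rfftfreq(len(flow_mag), d=1.0 / fps)
    p_low = spectrum[freqs < cutoff_hz].sum()
    p_high = spectrum[freqs >= cutoff_hz].sum()
    return p_high / (p_low + 1e-12)  # epsilon avoids division by zero

fps = 30.0
t = np.arange(0, 2, 1 / fps)
cruise = np.sin(2 * np.pi * 0.5 * t)               # slow, smooth motion
brake = cruise + 0.8 * np.sin(2 * np.pi * 8 * t)   # sudden high-freq burst
assert spectral_ratio(brake, fps) > spectral_ratio(cruise, fps)
```

The cruising signal concentrates its power below the cutoff, so its ratio is near zero; the braking burst pushes power into the high band and the ratio jumps.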
Pre-computed results showing how EDF scores a simulated dashcam recording with 7 event types: sudden braking, lane change, near collision, speed bump, traffic merge, pedestrian crossing, and emergency vehicle.
Detection Results
| Rank | Timestamp | Score | Duration | Label |
|---|---|---|---|---|
EDF implements six distinct approaches to event detection. They differ in compute cost, dependency on machine learning models, and what kinds of events they detect best.
| Method | Approach | Compute | Needs ML? | Best For |
|---|---|---|---|---|
| Hierarchical Energy | Physics cascade | Low | No | General, fast scans |
| Geometric Outlier | PCA + k-NN manifold | Low | No | Smooth, regular footage |
| Pure Optimization | Submodular maximization | Low | No | Batch offline processing |
| CLIP Embedding | Zero-shot VLM | Medium (GPU) | Pre-trained CLIP | Semantic / named events |
| Dense VLM | GPT-4V oracle | Very High ($) | GPT-4V API | Ground truth labeling |
| Attention Temporal | Self-attention entropy | Low | No | Surprising / rare frames |
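To make the "Geometric Outlier" row concrete: the idea of projecting per-window features onto a low-dimensional manifold and scoring each window by its distance to its nearest neighbors can be sketched as below. This is an illustrative reimplementation, not EDF's code; the component count and k are assumptions:

```python
import numpy as np

def knn_outlier_scores(X: np.ndarray, k: int = 3, n_components: int = 2) -> np.ndarray:
    """Score windows by mean distance to k nearest neighbors after PCA."""
    Xc = X - X.mean(axis=0)
    # PCA via SVD: keep the top principal components
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = Xc @ Vt[:n_components].T
    # Pairwise distances in the reduced space
    D = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
    D.sort(axis=1)
    return D[:, 1:k + 1].mean(axis=1)  # column 0 is self-distance; skip it

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))  # 50 windows, 5 features each
X[7] += 10.0                  # one anomalous window
scores = knn_outlier_scores(X)
assert scores.argmax() == 7   # the anomaly gets the highest score
```

On smooth, regular footage most windows sit close together on the manifold, so genuinely unusual windows stand out with large neighbor distances, and no trained model is needed.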
EDF installs directly from GitHub. No compiled extensions — pure Python with OpenCV for video I/O.
```bash
# Install from GitHub
pip install git+https://github.com/alawein/event-discovery-framework.git
```

```python
# Quick start
from event_discovery.methods import HierarchicalEnergyMethod, EnergyConfig

config = EnergyConfig(top_k=10)
method = HierarchicalEnergyMethod(config)
events = method.process_video("video.mp4")
for event in events:
    print(f"{event.start_time:.1f}s - {event.end_time:.1f}s (score: {event.score:.3f})")
```