Open Source Research Framework

Find What Matters in Long Video

Physics-inspired event detection — 6 methods, from pure signal processing to vision-language models

6 Methods · 73 Tests Passing · MIT License

Why This Problem Is Hard

Modern video streams generate terabytes of mostly unremarkable footage. A 24-hour surveillance feed might contain 3 minutes of meaningful events. A 90-minute dashcam recording might have 5 critical moments. Watching it all is impractical, and running an AI model on every frame is expensive. EDF explores a middle path: physics-inspired signal processing that cheaply identifies which segments deserve a closer look.

🚗

Dashcam Analysis

Identify braking events, near-collisions, and lane changes in long recordings without watching hours of uneventful driving.

📹

Surveillance

Flag anomalous motion patterns in hours of static footage, reducing review time to just the segments worth watching.

🏃

Sports Analysis

Automatically clip highlights — goals, fouls, key plays — without manual annotation or frame-by-frame scrubbing.

What This Is

EDF is a research framework and demonstration. The detection methods are real implementations — 73 passing tests on genuine algorithms. This is not a production system; it is an exploration of how physics-inspired signal processing can reduce the cost of AI-based video understanding.

The Energy Functional

EDF treats event detection as a physics problem. Each video window has an "energy" — a weighted sum of five signal features. High energy means something interesting is happening.

E(W) = α₁·φ_motion + α₂·φ_interaction + α₃·φ_scene + α₄·φ_uncertainty + α₅·φ_spectral

Where:
  φ_motion      = ‖v‖² + ‖dv/dt‖²        (optical flow velocity + acceleration)
  φ_interaction = Var(edge_density)        (Canny edge variance across frames)
  φ_scene       = ‖hist_end - hist_start‖  (color histogram L2 shift)
  φ_uncertainty = E[pixel_variance]        (per-frame information content)
  φ_spectral    = P_high / P_low           (FFT energy ratio of flow magnitude)
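
As a rough illustration of the formula above, the sketch below computes two of the cheaper features with NumPy and combines all five into E(W). It is not EDF's implementation: the function names, the grayscale histogram (the text describes a color histogram), the bin count, and the unit weights are assumptions made for the example, and φ_motion, φ_interaction, and φ_spectral are passed in as precomputed numbers.

```python
import numpy as np

# Feature order follows the formula: motion, interaction, scene, uncertainty, spectral.
FEATURE_NAMES = ("motion", "interaction", "scene", "uncertainty", "spectral")


def phi_scene(first_frame, last_frame, bins=32):
    """L2 distance between normalized intensity histograms of two frames.

    Assumption: a single-channel histogram stands in for the color histogram.
    """
    h0, _ = np.histogram(first_frame, bins=bins, range=(0, 256))
    h1, _ = np.histogram(last_frame, bins=bins, range=(0, 256))
    h0 = h0 / max(h0.sum(), 1)
    h1 = h1 / max(h1.sum(), 1)
    return float(np.linalg.norm(h1 - h0))


def phi_uncertainty(frames):
    """Mean over frames of the per-frame pixel variance; frames has shape (T, H, W)."""
    frames = np.asarray(frames, dtype=float)
    return float(frames.reshape(frames.shape[0], -1).var(axis=1).mean())


def window_energy(phi, alpha):
    """Weighted sum E(W) = sum_i alpha_i * phi_i over the five features."""
    return float(sum(alpha[name] * phi[name] for name in FEATURE_NAMES))


# Usage on random frames, with placeholder values for the features not computed here.
rng = np.random.default_rng(0)
frames = rng.integers(0, 256, size=(8, 24, 32)).astype(float)
phi = {
    "motion": 0.5,        # placeholder: would come from optical flow
    "interaction": 0.1,   # placeholder: would come from Canny edge density
    "scene": phi_scene(frames[0], frames[-1]),
    "uncertainty": phi_uncertainty(frames),
    "spectral": 1.2,      # placeholder: FFT ratio of flow magnitude
}
alpha = dict.fromkeys(FEATURE_NAMES, 1.0)  # hypothetical unit weights
print(f"E(W) = {window_energy(phi, alpha):.3f}")
```

In practice the features live on very different scales (pixel variance can be in the thousands while a histogram distance is at most √2), so the weights α would need to absorb a normalization step.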

Hierarchical Filtering

Rather than compute all 5 features for every window, EDF filters in 3 passes. Cheap features run first; expensive features only run on windows that already look promising.

Level 0: All Windows
Motion + Scene
Compute φ_motion and φ_scene only. Threshold τ₀ = μ + 2σ removes low-energy windows at little cost.
~50% survive

Level 1: Candidates
+ Interaction + Spectral
Add edge variance and FFT ratio. Threshold τ₁ = μ + 1.5σ tightens the set.
~20% survive

Level 2: Candidates
+ Uncertainty
Add per-frame information content. Threshold τ₂ = μ + σ gives a small final candidate set.
~5% survive

Final
Greedy Diverse Selection → Top-k Events
Select k diverse, high-energy events using a greedy submodular maximization step to avoid clustering near a single peak.
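
The cascade and the final selection step can be sketched as follows. This is a minimal illustration under stated assumptions, not EDF's code: `hierarchical_filter` and `greedy_diverse_top_k` are hypothetical names, μ and σ are assumed to be recomputed over the windows still alive at each level, and the diversity objective (score minus an exponential temporal-overlap penalty, which is one simple submodular choice) is picked for the example.

```python
import numpy as np


def hierarchical_filter(n_windows, level_terms, ks=(2.0, 1.5, 1.0)):
    """Return (survivor indices after each level, accumulated energy).

    level_terms holds one callable per level. Each receives the indices of the
    windows still alive and returns the energy contributed by the features
    added at that level, so costly features are only evaluated on survivors.
    """
    alive = np.arange(n_windows)
    energy = np.zeros(n_windows)
    survivors = []
    for add_terms, k in zip(level_terms, ks):
        if alive.size:
            energy[alive] += add_terms(alive)
            e = energy[alive]
            tau = e.mean() + k * e.std()  # assumption: stats over current survivors
            alive = alive[e >= tau]
        survivors.append(alive)
    return survivors, energy


def greedy_diverse_top_k(times, scores, k, lam=1.0, scale_s=5.0):
    """Greedily pick up to k windows by score minus a temporal redundancy penalty.

    The marginal gain of window i is score_i - lam * sum_j exp(-|t_i - t_j| / scale_s)
    over already chosen j, so windows close to an earlier pick are discounted.
    """
    times = np.asarray(times, dtype=float)
    scores = np.asarray(scores, dtype=float)
    penalty = np.zeros(scores.size)
    taken = np.zeros(scores.size, dtype=bool)
    chosen = []
    for _ in range(min(k, scores.size)):
        gain = scores - lam * penalty
        gain[taken] = -np.inf
        best = int(np.argmax(gain))
        chosen.append(best)
        taken[best] = True
        penalty += np.exp(-np.abs(times - times[best]) / scale_s)
    return chosen


# Toy usage: 100 windows, three active, one of which also scores on the last feature.
base = np.zeros(100)
base[[10, 50, 90]] = 10.0
bonus = np.zeros(100)
bonus[50] = 5.0
levels = [lambda i: base[i], lambda i: np.zeros(i.size), lambda i: bonus[i]]
survivors, energy = hierarchical_filter(100, levels)
picked = greedy_diverse_top_k(times=[0, 1, 2, 50], scores=[1.0, 0.9, 0.8, 0.5], k=2)
print([s.tolist() for s in survivors], picked)
```

One design point the sketch makes visible: a μ + kσ threshold computed over a very small surviving set can reject every window, so a real pipeline would likely need a floor on the number of candidates passed to the next level.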

Spectral Decomposition

A sudden braking event creates a sharp, high-frequency motion signature in optical flow. A car cruising at constant speed creates low-frequency, smooth motion. The FFT separates these by decomposing the flow magnitude into frequency bands — P_high / P_low tells us whether a window's motion is sudden or gradual.
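
A minimal sketch of the P_high / P_low ratio described above, using NumPy's real FFT on a per-frame flow-magnitude series. The function name, the 30 fps frame rate, and the 2 Hz cutoff between the low and high bands are assumptions for illustration; the source does not state which band edges EDF uses.

```python
import numpy as np


def spectral_ratio(flow_magnitude, fps=30.0, cutoff_hz=2.0, eps=1e-12):
    """Ratio of high-band to low-band spectral power of a flow-magnitude series."""
    x = np.asarray(flow_magnitude, dtype=float)
    x = x - x.mean()                                  # remove the DC component
    power = np.abs(np.fft.rfft(x)) ** 2               # one-sided power spectrum
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fps)      # bin frequencies in Hz
    p_low = power[(freqs > 0) & (freqs < cutoff_hz)].sum()
    p_high = power[freqs >= cutoff_hz].sum()
    return float(p_high / (p_low + eps))              # eps guards against division by zero


# Two synthetic 64-frame windows: a slow drift and a single abrupt spike.
t = np.arange(64)
smooth = 1.0 + 0.5 * np.sin(2 * np.pi * t / 64)       # one slow cycle per window
sudden = np.zeros(64)
sudden[32] = 1.0                                      # one-frame jump in flow magnitude

print(f"smooth: {spectral_ratio(smooth):.3g}, sudden: {spectral_ratio(sudden):.3g}")
```

The slow drift puts nearly all of its power below the cutoff, so its ratio is close to zero, while the spike spreads power evenly across frequencies and yields a ratio well above one.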

Try It

This is a live demonstration of the EDF algorithms. Upload any short video clip (MP4, AVI, or MOV; up to 150 MB and 8 minutes) to see event detection in action.

Synthetic Dashcam (2 min)

Pre-computed results showing how EDF scores a simulated dashcam recording with 7 event types: sudden braking, lane change, near collision, speed bump, traffic merge, pedestrian crossing, and emergency vehicle.

All 6 Methods

EDF implements six distinct approaches to event detection. They differ in compute cost, dependency on machine learning models, and what kinds of events they detect best.

Method               Approach                  Compute         Needs ML?          Best For
Hierarchical Energy  Physics cascade           Low             No                 General, fast scans
Geometric Outlier    PCA + k-NN manifold       Low             No                 Smooth, regular footage
Pure Optimization    Submodular maximization   Low             No                 Batch offline processing
CLIP Embedding       Zero-shot VLM             Medium (GPU)    Pre-trained CLIP   Semantic / named events
Dense VLM            GPT-4V oracle             Very High ($)   GPT-4V API         Ground truth labeling
Attention Temporal   Self-attention entropy    Low             No                 Surprising / rare frames

Install

EDF installs directly from GitHub. No compiled extensions — pure Python with OpenCV for video I/O.

# Install from GitHub (shell)
pip install git+https://github.com/alawein/event-discovery-framework.git

# Quick start (Python)
from event_discovery.methods import HierarchicalEnergyMethod, EnergyConfig

config = EnergyConfig(top_k=10)
method = HierarchicalEnergyMethod(config)
events = method.process_video("video.mp4")

for event in events:
    print(f"{event.start_time:.1f}s - {event.end_time:.1f}s (score: {event.score:.3f})")