Show HN: DeepDream for Video with Temporal Consistency
deepdream-video-pytorch
This is a fork of neural-dream, a PyTorch implementation of DeepDream. This fork introduces optical flow estimation and occlusion masking to apply DeepDream to videos with temporal consistency.
Features
Standard DeepDream: The original single-image implementation.
Video DeepDream: New CLI (video_dream.py) that uses RAFT Optical Flow to warp previous dream frames into the current frame, ensuring smooth transitions and object tracking.
Occlusion Masking: Automatically detects when objects move in front of one another to prevent "ghosting" artifacts (a sketch of both mechanisms follows this list).
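The warping and masking follow the usual optical-flow recipe for temporally consistent video stylization. As a minimal sketch of how the pieces can fit together with torchvision's pretrained RAFT (the helper names here are illustrative, not this repo's API):

```python
import torch
import torch.nn.functional as F
from torchvision.models.optical_flow import raft_large, Raft_Large_Weights

# Pretrained RAFT. It expects (1, 3, H, W) tensors scaled to [-1, 1],
# with H and W divisible by 8.
raft = raft_large(weights=Raft_Large_Weights.DEFAULT).eval()

@torch.no_grad()
def estimate_flow(src, dst):
    # RAFT returns a list of iterative refinements; the last is the best.
    # flow[:, 0] holds x displacements, flow[:, 1] y displacements.
    return raft(src, dst)[-1]

def warp(image, flow):
    # Backward-warp `image` along `flow`: output(x) = image(x + flow(x)).
    _, _, h, w = image.shape
    ys, xs = torch.meshgrid(torch.arange(h, device=image.device),
                            torch.arange(w, device=image.device),
                            indexing="ij")
    grid = torch.stack((xs, ys)).float() + flow[0]
    grid[0] = 2.0 * grid[0] / (w - 1) - 1.0  # normalize x to [-1, 1]
    grid[1] = 2.0 * grid[1] / (h - 1) - 1.0  # normalize y to [-1, 1]
    return F.grid_sample(image, grid.permute(1, 2, 0).unsqueeze(0),
                         align_corners=True)

def occlusion_mask(flow_fwd, flow_bwd, tol=1.0):
    # Forward/backward consistency check: for visible pixels the round trip
    # cancels out; where it does not, the pixel is likely occluded and must
    # not inherit the previous dream (this is what prevents ghosting).
    roundtrip = flow_fwd + warp(flow_bwd, flow_fwd)
    return (roundtrip.norm(dim=1, keepdim=True) < tol).float()
```

With `flow_fwd = estimate_flow(cur, prev)`, `warp(prev_dream, flow_fwd)` pulls the previous dreamed frame into the current frame's geometry.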
Demo
With temporal consistency
mallard_demo.mp4
highway_demo.mp4
With frames processed independently
mallard_independent_demo.mp4
highway_independent_demo.mp4
Inputs
mallard.mp4
highway.mp4
Setup
Dependencies
This project requires the following key packages:
PyTorch
torchvision
OpenCV
NumPy
Pillow
Install Dependencies:
pip install -r requirements.txt
Download Models:
Run the download script to fetch the standard Inception/GoogLeNet models:
python models/download_models.py
To download all compatible models:
python models/download_models.py -models all-caffe-googlenet
Usage
1. Video DeepDream
To dream on a video, use the video_dream.py script. This wrapper accepts the video-specific arguments below plus any argument accepted by the standard image dreamer (e.g., layers, octaves, iterations).
Basic Video Command:
python video_dream.py -content_video input.mp4 -output_video output.mp4 -num_iterations 1
Note: For video processing, we recommend using -num_iterations 1. The temporal consistency from optical flow means each frame builds on the previous dream, so fewer iterations per frame are needed compared to single images.
Video-Specific Arguments:
| Argument | Default | Description |
| --- | --- | --- |
| `-content_video` | `input.mp4` | Path to the source video file. |
| `-output_video` | `output.mp4` | Path where the final video will be saved. |
| `-blend` | `0.5` | Mix ratio (0.0 - 1.0) between the raw video frame and the warped previous dream. Higher values (closer to 1.0) use more of the raw frame; lower values (closer to 0.0) preserve more of the previous hallucinations. See the sketch after this table. |
| `-independent` | `False` | Flag: if set, disables temporal consistency (optical flow); every frame is dreamed independently, which causes flickering. |
| `-update_interval` | `5` | Updates the output video file on disk every N frames, so you can preview progress while the run is in flight. |
| `-temp_dir` | `temp` | Directory to store extracted frames, flow data, and masks during processing. |
| `-keep_temp` | `False` | Flag: if set, the temporary directory is not deleted after processing finishes. |
| `-verbose` | `False` | Flag: enables detailed logs (prints DeepDream iteration logs for every frame). |
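One plausible reading of how `-blend` combines with the occlusion mask when seeding each frame, reusing the `warp` and `occlusion_mask` helpers sketched earlier (an assumption about the behavior, not the repo's exact code):

```python
def seed_frame(raw, prev_dream, flow_fwd, flow_bwd, blend=0.5):
    # Warp the previous frame's dream into this frame's geometry, then mix
    # it with the raw frame. Occluded pixels fall back to the raw frame.
    warped = warp(prev_dream, flow_fwd)
    mask = occlusion_mask(flow_fwd, flow_bwd)
    mixed = blend * raw + (1.0 - blend) * warped
    return mask * mixed + (1.0 - mask) * raw
```

The seed then goes through a single DeepDream pass, which is why `-num_iterations 1` is enough: hallucinations persist and deepen across frames instead of being regenerated from scratch.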
2. Standard DeepDream Arguments
All of the following arguments come from the single-frame implementation, and you can mix and match any of them with the video-specific arguments above. Refer to neural-dream for more information on single-frame parameters.
Example combining video and standard args:
python video_dream.py -content_video test.mp4 -dream_layers inception_4d -num_iterations 1 -octave_scale 0.7 -image_size 512
For single image processing only:
python neural_dream.py -content_image input.jpg -dream_layers inception_4c -num_iterations 10
Note: Paths to images should not contain the ~ character; use relative or absolute paths.
Standard Options
-image_size: Maximum side length (in pixels) of the generated image. Default is 512.
-gpu: Zero-indexed ID of the GPU to use; for CPU mode set -gpu to c; for MPS mode (Apple Silicon) set -gpu to mps.
Optimization Options
-dream_weight: How much to weight DeepDream. Default is 1e3.
-tv_weight: Weight of total-variation (TV) regularization; helps smooth the image. Default 0.
-l2_weight: Weight of latent state regularization. Default 0.
-num_iterations: Number of iterations. Default is 10. For video, use 1 (temporal consistency reduces the need for multiple iterations per frame).
-init: Initialization method: image (content image) or random (noise). Default image.
-jitter: Apply jitter to image. Default 32.
-layer_sigma: Apply Gaussian blur to the image. Default 0 (disabled).
-optimizer: lbfgs or adam. Default adam.
-learning_rate: Learning rate (step size). Default 1.5.
-normalize_weights: Divide dream weights by the number of channels.
-loss_mode: Loss mode: bce, mse, mean, norm, or l2. Default l2 (see the step sketch after this list).
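For intuition, here is a minimal sketch of one gradient-ascent step under the default l2 loss mode, with optional TV regularization. It assumes `model` returns the activations of the chosen dream layer; this is illustrative only, not the repo's optimizer code (which supports adam and lbfgs):

```python
import torch

def dream_step(img, model, lr=1.5, tv_weight=0.0):
    # One gradient-ascent step: maximize the norm of the target layer's
    # activations (the default l2 loss mode), optionally penalizing
    # total variation so the result stays smooth.
    img = img.detach().requires_grad_(True)
    act = model(img)              # activations of the chosen dream layer
    loss = act.norm()             # dream objective
    if tv_weight > 0:
        tv = ((img[..., 1:, :] - img[..., :-1, :]).abs().mean()
              + (img[..., :, 1:] - img[..., :, :-1]).abs().mean())
        loss = loss - tv_weight * tv  # subtracted because we ascend
    loss.backward()
    with torch.no_grad():
        # Normalized gradient step, a common DeepDream trick.
        return img + lr * img.grad / (img.grad.abs().mean() + 1e-8)
```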
Output Options
-output_image: Name of the output image. Default out.png.
-output_start_num: Number to start output image names at. Default 1.
-print_iter: Print progress every N iterations.
-save_iter: Save image every N iterations.
Layer & Channel Options
-dream_layers: Comma-separated list of layer names to use.
-channels: Comma-separated list of channels to use.
-channel_mode: Selection mode: all, strong, avg, weak, or ignore (see the sketch after this list).
-channel_capture: once or octave_iter.
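A rough sketch of what strong/weak channel selection might look like (an assumption about the behavior, not the repo's code): rank channels by mean activation and keep the top or bottom k.

```python
def pick_channels(act, k=5, mode="strong"):
    # Rank the channels of a (1, C, H, W) activation map by mean response
    # and keep the top k ("strong") or bottom k ("weak").
    scores = act.mean(dim=(0, 2, 3))
    order = scores.argsort(descending=(mode == "strong"))
    return order[:k].tolist()
```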
Octave Options
-num_octaves: Number of octaves per iteration. Default 4.
-octave_scale: Resize factor applied between octaves (see the sketch after this list). Default 0.6.
-octave_iter: Iterations (steps) per octave. Default 50.
-octave_mode: normal, advanced, manual_max, manual_min, or manual.
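These options drive the standard multi-scale DeepDream loop: the image is dreamed at a series of scales, carrying hallucinated detail upward. A minimal sketch of that recipe (here `dream_step` is any single-step dream function, e.g. a `functools.partial` of the sketch above; this mirrors the usual algorithm rather than this repo's exact code):

```python
import torch
import torch.nn.functional as F

def dream_octaves(img, dream_step, num_octaves=4, octave_scale=0.6,
                  octave_iter=50):
    h, w = img.shape[-2:]
    # Scales from smallest to full size; octave_scale=0.6 shrinks each level.
    sizes = [(int(h * octave_scale ** i), int(w * octave_scale ** i))
             for i in reversed(range(num_octaves))]
    detail = torch.zeros(1, img.shape[1], *sizes[0], device=img.device)
    dreamed = img
    for size in sizes:
        base = F.interpolate(img, size=size, mode="bilinear",
                             align_corners=False)
        detail = F.interpolate(detail, size=size, mode="bilinear",
                               align_corners=False)
        dreamed = base + detail           # re-inject detail from smaller scales
        for _ in range(octave_iter):
            dreamed = dream_step(dreamed)
        detail = dreamed - base           # keep only the hallucinated part
    return dreamed
```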
Laplacian Pyramid Options
-lap_scale: Number of layers in the Laplacian pyramid. Default 0 (disabled).
-sigma: Strength of the Gaussian blur used in the pyramid. Default 1.
Zoom & Tile Options
-zoom: Amount to zoom in.
-zoom_mode: percentage or pixel.
-tile_size: Desired tile size. Default 0 (disabled).
-overlap_percent: Percentage of overlap between adjacent tiles (see the sketch after this list). Default 50.
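Tiling splits a large image into overlapping patches that are dreamed one at a time, trading speed for a much smaller memory footprint. A rough sketch of how overlapping tile spans can be computed along one axis (assumed behavior, not the exact implementation):

```python
def tile_spans(length, tile_size, overlap_percent=50):
    # Start/end pairs of overlapping tiles along one image axis.
    stride = max(1, int(tile_size * (1 - overlap_percent / 100)))
    starts = list(range(0, max(length - tile_size, 0) + 1, stride))
    if starts[-1] + tile_size < length:
        starts.append(length - tile_size)  # keep the final tile in-bounds
    return [(s, min(s + tile_size, length)) for s in starts]

# tile_spans(512, 256) -> [(0, 256), (128, 384), (256, 512)]
```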
Other Options
-original_colors: Set to 1 to keep content image colors.
-model_file: Path to .pth file. Default is VGG-19.
-model_type: caffe, pytorch, keras, or auto.
-backend: nn, cudnn, openmp, or mkl.
-cudnn_autotune: Use built-in cuDNN autotuner (slower start, faster run).
Frequently Asked Questions
Problem: The program runs out of memory (OOM)
Solution:
Reduce -image_size (e.g., to 512 or 256).
If using GPU, use -backend cudnn.
For video: Reduce the input video resolution before processing.
Problem: Video processing is very slow
Solution:
Video DeepDreaming is computationally expensive: the full DeepDream process runs once per frame, plus optical-flow estimation between consecutive frames.
Use -num_iterations 1 (recommended for video; temporal consistency means fewer iterations are needed).
Reduce -octave_iter (e.g., to 10 or 20).
Use a smaller -image_size.
Memory Usage
By default, neural-dream uses the nn backend.
Use cuDNN: -backend cudnn (GPU only, reduces memory).
Reduce size: -image_size 256 (substantially reduces memory usage).
With default settings, standard execution uses ~1.3 GB GPU memory.
Multi-GPU Scaling
You can use multiple devices with -gpu and -multidevice_strategy.
Example: -gpu 0,1,2,3 -multidevice_strategy 3,6,12 splits layers across 4 GPUs. See ProGamerGov/neural-dream for details.