Intelligent Scissors: Interactive Image Segmentation Tool
Overview
Implemented an Intelligent Scissors (Live-Wire) interactive image segmentation tool in Java, as a course project at CMU (Spring 2025), with teammates Tingjun Huang and Guoheng Ma. The tool allows a user to click seed points on an image and have the system automatically trace optimal boundaries between objects in real time.
The implementation models the image as an 8-neighborhood weighted pixel graph with edge costs derived from Sobel gradient magnitudes, then computes live boundary paths via Dijkstra and A* shortest-path search from the last seed to the current cursor position.
Results
Processing time (measured on Intel Core i9-12900H, 32 GB RAM, Windows 11, vs. Adobe Photoshop 2020 manual polygon-lasso):
| Test Image | Photoshop | Intelligent Scissors | Speedup |
|---|---|---|---|
| Image 1 (Mickey Mouse cartoon, low-contrast boundary, 661×494 px) | 551 s | 106 s | 5.2× |
| Image 2 (two parrots, high-contrast textured, 707×480 px) | 279 s | 25 s | 11.2× |
Segmentation accuracy:
| Test Image | IoU | Dice Coefficient |
|---|---|---|
| Image 1 | 0.88 | 0.94 |
| Image 2 | 0.97 | 0.99 |
Both results exceed the 0.85 Dice threshold commonly considered “high-quality” in interactive segmentation literature.
Technical Details
Image Processing and Cost Function:
- Input image converted to grayscale (ITU-R BT.709 weighted average).
- Sobel operators compute horizontal and vertical gradients G_x, G_y; gradient magnitude G(x,y) = sqrt(G_x² + G_y²).
- Pixel cost: C(x,y) = 1 / (1 + G(x,y)) — strong edges yield low cost, making them preferred by the pathfinder.
- Border regions optionally enhanced by boosting edge strength at image boundaries to ensure complete closed paths.
Graph and Shortest-Path Search:
- 8-neighborhood graph: each pixel node connected to 8 neighbors; edge weight = (C(i,j) + C(k,l)) / 2 × d, where d = 1 for orthogonal, √2 for diagonal.
- Dijkstra computes shortest paths from the active seed to all pixels; results cached as a per-pixel cost map + predecessor map — path retrieval on cursor move is O(1) (no re-computation until seed changes).
- A* used in ROI-constrained mode with Euclidean-distance heuristic (α = 0.005), focusing search within a dynamic window around the start/end points for faster interactive response.
Cursor Snap:
Within radius r, the cursor position is attracted to the pixel maximizing:
S(x’, y’) = E(x’, y’) / (1 + α · d((x,y), (x’,y’)))
where E is precomputed edge strength and α = 0.3 is a decay factor. This snaps the active endpoint to nearby strong edges, reducing required precision and stabilizing boundary tracing along prominent features.
Path Cooling (auto-seeding):
When the live path remains stable across N consecutive frames (measured by Euclidean distance between uniformly sampled points), and a candidate point c satisfies minimum spacing (||c − s_last|| > D_min) and minimum edge strength (E(c) > E_thresh), a new seed is automatically inserted. This progressively locks completed path segments without explicit clicks, reducing total click count.
Performance Optimizations:
- Precomputed cost matrix cached after edge detection — no recomputation during pathfinding.
- Dynamic window restriction: graph construction limited to a bounding region around start/end points; window size adapts to the distance between them.
IndexMinPQpriority queue for efficient A* open-set management.- Boolean mask matrix for O(1) pixel state queries in segmentation and path cooling.
Challenges
Real-time Dijkstra at megapixel scale: A naïve re-run on every cursor move (O(N log N), N ≈ 300K for a 661×494 image) was too slow for interactive use. The seed-caching architecture — precomputing the full Dijkstra result once per seed and recovering paths via predecessor-map traversal — reduced per-frame work from O(N log N) to O(path length).
Path instability in low-gradient regions: In flat-texture regions, small cursor perturbations caused large path jumps. Path Cooling’s stability detection (consecutive-frame similarity check) added path inertia, while the sharpening kernel parameter tuning (value 5) improved edge detectability in low-contrast areas at the cost of some visual noise.
Closed-path formation: The algorithm naturally traces from a seed toward the cursor; closing the path requires the endpoint to meet the starting seed. Implemented arc-length parameterization and a minimum-seed-distance threshold to prevent the path from “collapsing” prematurely near the starting seed.
Reflection and Insights
Two lessons dominated this project: caching architecture matters more than algorithmic choice, and UX features drive perceived performance more than raw compute.
The 5.2×–11.2× speedup over Photoshop came almost entirely from the Dijkstra seed-cache — a single structural insight eliminating redundant computation. Meanwhile, Cursor Snap and Path Cooling — entirely orthogonal to the shortest-path algorithm — were responsible for the majority of improvement in actual segmentation speed during user testing. These two auxiliary features reduced user effort (click count, repositioning) far more than algorithmic refinements to the path cost function did.
This generalizes to interactive system design broadly: the component the user interacts with most frequently (cursor movement, click placement) has disproportionate leverage on end-to-end task time.
Team and Role
Course project at CMU with Tingjun Huang (12112619) and Guoheng Ma (12211611). My contributions: graph construction and Dijkstra/A* implementation, cost matrix caching architecture, Path Cooling algorithm design and implementation, and performance benchmarking.
Stack
Java 17, IntelliJ IDEA, Swing (GUI), Princeton algorithm library (EdgeWeightedDigraph, DijkstraSP, IndexMinPQ)
Intelligent Scissors: Interactive Image Segmentation Tool
https://liferli.com/2025/05/31/projects/intelligent-scissors/