Pose Estimation in Action: Visualizing the Golf Swing Frame by Frame ⛳️🤖
In my last post, I introduced SwingSense—a golf swing diagnostic tool powered by AI and computer vision, built as both a technical challenge and a personal love letter to movement, mastery, and meaningful machine learning.
This is where we start building with intention.
In this post, I’ll walk you through how I used MediaPipe and OpenCV to process a swing video, detect human pose landmarks, and visualize the beauty of motion—one frame at a time.
🎯 Where We’re Headed
Here’s what we’re unlocking today:
- Loading a real golf swing video
- Applying pose estimation to detect key joints
- Overlaying the skeletal movement onto video frames
- Preparing to extract joint movement data for ML modeling
We're laying the foundation for understanding a swing not just as motion—but as data.
🎥 Step 1: Loading the Swing Video
I started with a slow-mo swing clip from YouTube, dropped it into my working directory under:
data/raw/sample_swing.mp4
To load it up with OpenCV:
import cv2

cap = cv2.VideoCapture('data/raw/sample_swing.mp4')

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break  # end of video (or a read error)

    cv2.imshow('Raw Frame', frame)
    if cv2.waitKey(10) & 0xFF == ord('q'):
        break  # press 'q' to quit early

cap.release()
cv2.destroyAllWindows()
Simple, direct, and a great sanity check that your video pipeline is working.
🧍🏾‍♀️ Step 2: Pose Estimation with MediaPipe
Next, I brought in MediaPipe’s Pose model to identify 33 body landmarks—from head to toe.
import mediapipe as mp

mp_pose = mp.solutions.pose
pose = mp_pose.Pose()

# Inside the frame loop: MediaPipe expects RGB, but OpenCV reads frames as BGR
results = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
Once the landmarks were detected, I used MediaPipe’s drawing tools to overlay the skeletal structure onto each frame:
mp_drawing = mp.solutions.drawing_utils

if results.pose_landmarks:
    # Draw all 33 landmarks plus the connections between them
    mp_drawing.draw_landmarks(
        frame,
        results.pose_landmarks,
        mp_pose.POSE_CONNECTIONS)
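If the default overlay is hard to see against your footage, MediaPipe's drawing utils also accept optional drawing specs. A small, optional styling tweak, not something the pipeline requires:

mp_drawing.draw_landmarks(
    frame,
    results.pose_landmarks,
    mp_pose.POSE_CONNECTIONS,
    landmark_drawing_spec=mp_drawing.DrawingSpec(
        color=(0, 255, 0), thickness=2, circle_radius=2))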
It’s genuinely cool seeing the body mapped out mid-swing—limbs aligning, torso rotating, power coiled in motion.
👀 Step 3: Real-Time Motion Visuals
Displaying that annotated frame was easy with OpenCV:
cv2.imshow('Swing Pose Frame', frame)
But what it revealed wasn’t just working code—it was insight.
I could see the motion patterns emerge: shoulder tilt, hand lag, torso compression. The feedback engine we’ll build later? This is where it begins.
✨ Visualization = validation.
Before we start calculating metrics, we need to trust the model's eye.
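For reference, here's how the snippets from Steps 1 through 3 fit together in a single loop. Treat it as a minimal sketch of the pipeline described above:

import cv2
import mediapipe as mp

mp_pose = mp.solutions.pose
mp_drawing = mp.solutions.drawing_utils

cap = cv2.VideoCapture('data/raw/sample_swing.mp4')

# Using Pose as a context manager frees the model's resources when done
with mp_pose.Pose() as pose:
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break

        # Detect landmarks (MediaPipe wants RGB) and draw them on the BGR frame
        results = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.pose_landmarks:
            mp_drawing.draw_landmarks(
                frame, results.pose_landmarks, mp_pose.POSE_CONNECTIONS)

        cv2.imshow('Swing Pose Frame', frame)
        if cv2.waitKey(10) & 0xFF == ord('q'):
            break

cap.release()
cv2.destroyAllWindows()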
🔄 Step 4: Prepping for Data Extraction
Next, I’ll be extracting the landmark coordinates from each frame—x, y, z, and visibility—for all 33 points. This is where we transition from visuals to structured, analyzable data.
We’ll format this into:
- A Pandas DataFrame
- One row per frame, with a set of columns (x, y, z, visibility) per landmark
- Ready to calculate angles, offsets, tempo patterns, and movement phases
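To make that concrete, here's a rough sketch of the extraction step. It assumes you've collected MediaPipe's per-frame results into a list while looping over the video (the all_frame_results name is just for illustration), and the column naming is one possible convention, not a final design:

import pandas as pd

rows = []
# all_frame_results: hypothetical list of MediaPipe results, one per frame
for frame_idx, results in enumerate(all_frame_results):
    if not results.pose_landmarks:
        continue  # skip frames where no pose was detected
    row = {'frame': frame_idx}
    for i, lm in enumerate(results.pose_landmarks.landmark):
        # Four columns per landmark: x, y, z, visibility
        row[f'lm{i}_x'] = lm.x
        row[f'lm{i}_y'] = lm.y
        row[f'lm{i}_z'] = lm.z
        row[f'lm{i}_vis'] = lm.visibility
    rows.append(row)

df = pd.DataFrame(rows)  # one row per frame, ready for angle and tempo math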
🧠 Why This Matters
Golf swings aren’t just art—they’re math, balance, timing, biomechanics.
By mapping joints, we capture the hidden structure of performance.
And by starting with pose estimation, we’re already creating a foundation where:
- Movement becomes measurable
- Feedback becomes automated
- And improvement becomes data-driven
📍 Up Next in Part 3:
- Export pose landmark data for each frame
- Calculate joint angles like spine tilt and wrist lag (a quick preview follows this list)
- Define a few “good vs improvable” swing metrics
- Start building our training data for the feedback model
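As a small preview of the joint-angle work, here's one common way to measure the angle at a joint from three landmark positions. The joint_angle helper and the shoulder/elbow/wrist example are illustrative, not the final metric definitions:

import numpy as np

def joint_angle(a, b, c):
    """Angle at point b, in degrees, formed by the segments b->a and b->c."""
    a, b, c = np.array(a), np.array(b), np.array(c)
    ba, bc = a - b, c - b
    cosine = np.dot(ba, bc) / (np.linalg.norm(ba) * np.linalg.norm(bc))
    return np.degrees(np.arccos(np.clip(cosine, -1.0, 1.0)))

# e.g. elbow angle from (x, y) coords: joint_angle(shoulder, elbow, wrist)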
💬 Your Turn
Would you use pose estimation outside sports—like for dance, yoga, rehab, or even gesture control?
If you’re working on something similar, tag me—I’d love to learn from you, collaborate, or share notes.
SwingSense is just getting started.
But already, this journey feels like more than a project. It feels like insight in motion.
Until next time,
Keep swinging with intention.