Building a Comprehensive Film Review Pipeline: From CSV to Professional Reports

This project is a data pipeline and static site generator for weekly and season-long player evaluations from CSV film logs. It transforms raw game film data into professional-grade reports, dashboards, and analytics for football coaching staff.
The Problem I'm Solving
Most football coaching staffs face significant challenges when analyzing player performance:
- Manual data entry from game film is time-consuming and error-prone
- Inconsistent scoring across different evaluators and games
- Limited analytics beyond basic statistics
- No standardized reporting format for player development tracking
- Difficulty comparing players across different positions and roles
I needed a solution that would automate the entire process from data collection to report generation.
The Technology Stack
Backend: Python, Pandas, ReportLab, Pillow, OpenPyXL
Frontend: HTML, CSS, JavaScript
Deployment: GitHub Pages, GitHub Actions
Analytics: Google Analytics 4
Data Storage: CSV files, Static HTML
Architecture Overview
The pipeline processes weekly CSV files containing player performance data through several stages:
CSV Input → Data Preparation → Film Grading → Report Generation → Static Site
Data Flow
- CSV Preparation - Raw game film data is cleaned and standardized
- Film Grading - Automated scoring based on custom rubrics
- Report Generation - Individual player PDFs and summary reports
- Dashboard Creation - Interactive HTML dashboards for analysis
- Static Site Generation - Complete website with all outputs
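As a rough illustration of the first stage, the sketch below cleans a raw film log using only the standard library. The column names (`player`, `snaps`, `rec_yards`) and the `prepare_rows` helper are assumptions made for this example; the project's real schema and preparation code are not shown in this post.

```python
import csv
import io

# Hypothetical numeric columns; the real film log schema is not shown in this post.
NUMERIC_COLUMNS = ("snaps", "rec_yards")


def prepare_rows(raw_csv_text):
    """Return cleaned rows: normalized headers, trimmed values, ints for numeric columns."""
    reader = csv.DictReader(io.StringIO(raw_csv_text))
    prepared = []
    for row in reader:
        # Lowercase and trim header names; trim stray whitespace from values.
        clean = {
            key.strip().lower(): (value or "").strip()
            for key, value in row.items()
            if isinstance(key, str) and isinstance(value, (str, type(None)))
        }
        if not clean.get("player"):
            continue  # skip rows with no player name
        clean["player"] = clean["player"].title()
        for column in NUMERIC_COLUMNS:
            # Treat empty cells as zero so later stages can do arithmetic.
            clean[column] = int(clean.get(column) or 0)
        prepared.append(clean)
    return prepared


raw = "Player, Snaps ,rec_yards\n john smith ,30,45\n,,\nALEX JONES,12,\n"
rows = prepare_rows(raw)
for row in rows:
    print(row)
```

Treating empty numeric cells as zero is a choice made for this sketch; a stricter version might reject such rows instead.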
Custom Scoring System
The pipeline implements a sophisticated scoring rubric that goes beyond basic statistics:
Positive Plays
- Touchdown: +15 points
- Relentless Effort: +5 points
- Elite Route: +7 points
- Good Route: +2 points
- Catch/Rush yardage: +0.5 per yard
- Broken Tackles: +1.0 per tackle
- Good Block: +2 points
- Pancake Block: +10 points
- First Down: +5 points
- Spectacular Catch: +10 points
Negative Plays
- Missed Assignment: -10 points
- Dropped Pass: -15 points
- Bad Route: -2 points
- Loaf: -2 points
- Not Full Speed: -3 points
- Whiffed Block: -1 point
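One way to apply the rubric above is a lookup table of flat point values plus two per-unit rates. This is a minimal sketch, not the project's actual grading code: the tag spellings (such as `first_down`) and the `score_play` function are hypothetical, and how the real system treats negative yardage is not stated in this post.

```python
# Flat point values copied from the rubric above; tag spellings are assumptions.
FLAT_POINTS = {
    "touchdown": 15,
    "relentless_effort": 5,
    "elite_route": 7,
    "good_route": 2,
    "good_block": 2,
    "pancake_block": 10,
    "first_down": 5,
    "spectacular_catch": 10,
    "missed_assignment": -10,
    "dropped_pass": -15,
    "bad_route": -2,
    "loaf": -2,
    "not_full_speed": -3,
    "whiffed_block": -1,
}
POINTS_PER_YARD = 0.5
POINTS_PER_BROKEN_TACKLE = 1.0


def score_play(tags, yards=0, broken_tackles=0):
    """Sum flat points for each tag, then add yardage and broken-tackle points."""
    score = sum(FLAT_POINTS[tag] for tag in tags)  # unknown tags raise KeyError
    score += POINTS_PER_YARD * yards
    score += POINTS_PER_BROKEN_TACKLE * broken_tackles
    return score


# A 22-yard catch on a good route for a first down, with two broken tackles.
example = score_play(["first_down", "good_route"], yards=22, broken_tackles=2)
print(example)  # 5 + 2 + 11.0 + 2.0 = 20.0
```

Raising `KeyError` on an unknown tag is deliberate here: a misspelled tag in the film log fails loudly instead of silently scoring zero.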
Advanced Metrics Calculation
The system automatically calculates rate metrics and per-30-snap metrics, so players with different snap counts can be compared:
# Key metrics automatically calculated
Catch Rate = catches / (catches + drops)
Drop Rate = drops / (catches + drops)
Yards per Target = (rec_yards + rush_yards) / targets
TDs per 30 = (touchdowns / snaps) * 30
Targets per 30 = (targets / snaps) * 30
Key Plays per 30 = (key_plays / snaps) * 30
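The formulas above translate directly into a small function. The sketch below is illustrative only: the `advanced_metrics` name and its argument names are assumptions, and returning 0.0 when a denominator is zero (for example, a player with no targets) is a choice made here, not something the post specifies.

```python
def safe_div(numerator, denominator):
    """Divide, returning 0.0 instead of raising when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def per_30(count, snaps):
    """Scale a count to a 30-snap rate; algebraically equal to (count / snaps) * 30."""
    return safe_div(count * 30, snaps)


def advanced_metrics(catches, drops, targets, snaps,
                     rec_yards, rush_yards, touchdowns, key_plays):
    """Compute the six metrics listed above for one player."""
    catchable = catches + drops
    return {
        "catch_rate": safe_div(catches, catchable),
        "drop_rate": safe_div(drops, catchable),
        "yards_per_target": safe_div(rec_yards + rush_yards, targets),
        "tds_per_30": per_30(touchdowns, snaps),
        "targets_per_30": per_30(targets, snaps),
        "key_plays_per_30": per_30(key_plays, snaps),
    }


metrics = advanced_metrics(catches=6, drops=2, targets=10, snaps=40,
                           rec_yards=80, rush_yards=10, touchdowns=1, key_plays=4)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these sample inputs the catch rate is 6 / 8 = 0.75 and the touchdown rate is (1 / 40) * 30 = 0.75 per 30 snaps.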
Implementation Highlights
Automated Report Generation
The system generates multiple output formats:
- Individual Player Reports - Detailed text analysis for each player
- Weekly Summary PDFs - Coaching staff overview with key insights
- Group Film PDFs - Aggregated play-by-play details
- HTML Dashboards - Interactive player comparisons
- Snapshot Tables - Quick weekly overviews
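To give a feel for the simplest of these outputs, here is a sketch of rendering a snapshot table as static HTML with the standard library. The `snapshot_table` function and the `player` and `score` columns are hypothetical; the project's real templates and its ReportLab-based PDF code are not shown in this post.

```python
from html import escape


def snapshot_table(rows, columns):
    """Render a list of row dicts as a plain HTML table, escaping every value."""
    header_cells = "".join(f"<th>{escape(column)}</th>" for column in columns)
    body_rows = []
    for row in rows:
        values = [str(row.get(column, "")) for column in columns]
        cells = "".join(f"<td>{escape(value)}</td>" for value in values)
        body_rows.append(f"<tr>{cells}</tr>")
    body = "".join(body_rows)
    return f"<table><thead><tr>{header_cells}</tr></thead><tbody>{body}</tbody></table>"


weekly_rows = [
    {"player": "A & B", "score": 20.0},
    {"player": "C", "score": -3},
]
table_html = snapshot_table(weekly_rows, ["player", "score"])
print(table_html)
```

Escaping each value matters because names and notes in a film log can contain characters such as `&` or `<` that would otherwise break the page.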
Batch Processing Workflow
# Process entire season with one command
for d in out/Wk*; do
  wk="${d##*/Wk}"
  prep=$(ls "$d"/Wk*_*_prepared.csv 2>/dev/null | head -n1 || true)
  details=$(ls "$d"/results_*.csv 2>/dev/null | grep -v summary | head -n1 || true)
  [ -f "$prep" ] && [ -f "$details" ] || continue
  # Grade and generate reports
  ./venv/bin/python film_grade.py "$prep" --out_dir "$d" --out "$(basename "$details")"
  # Create PDFs and dashboards
  ./venv/bin/python tools/make_pdfs.py \
    --reports_dir "$d/reports" \
    --out_dir "$d/pdfs" \
    --summary_csv "${details%.csv}_summary.csv" \
    --details_csv "$details" \
    --title "Week $wk Summary"
done
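The file discovery part of this loop can also be expressed in Python, which makes it easier to unit test. This is a sketch under assumptions: `find_week_inputs` is a hypothetical helper, not part of the project, and it checks for "summary" in the file name only, where the shell version's `grep -v summary` matches against the whole path.

```python
from pathlib import Path


def find_week_inputs(out_root):
    """Return (week, prepared_csv, details_csv) for each Wk* folder that has both files."""
    found = []
    for week_dir in sorted(Path(out_root).glob("Wk*")):
        if not week_dir.is_dir():
            continue
        week = week_dir.name[len("Wk"):]  # same idea as ${d##*/Wk}
        prepared = sorted(week_dir.glob("Wk*_*_prepared.csv"))
        details = sorted(
            path for path in week_dir.glob("results_*.csv")
            if "summary" not in path.name
        )
        # Skip weeks that are missing either input, like the `|| continue` guard.
        if prepared and details:
            found.append((week, prepared[0], details[0]))
    return found


if __name__ == "__main__":
    for week, prepared, details in find_week_inputs("out"):
        print(f"Week {week}: {prepared.name} -> {details.name}")
```

Each returned tuple carries what the loop body needs to call `film_grade.py` and `tools/make_pdfs.py` with the same arguments as the shell version.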
CI/CD Integration
The project uses GitHub Actions for automated deployment:
name: Deploy to GitHub Pages
on:
  push:
    branches: [ main ]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Setup Python
        uses: actions/setup-python@v2
        with:
          python-version: '3.9'
      - name: Install dependencies
        run: |
          pip install -r requirements.txt
      - name: Build site
        run: |
          # Run build scripts
      - name: Deploy to GitHub Pages
        uses: peaceiris/actions-gh-pages@v3
Technical Challenges Solved
Data Normalization
Creating fair comparisons between players with different snap counts and roles required normalizing for playing time, as the per-30-snap metrics above do, and accounting for position-specific metrics.
Report Generation
Generating professional-quality PDFs with dynamic content, charts, and formatting while maintaining performance across large datasets.
Mobile Optimization
Ensuring all HTML outputs work seamlessly across devices with responsive design and touch-friendly interfaces.
Batch Processing
Handling large datasets efficiently while maintaining data integrity and providing progress feedback during long-running operations.
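One common pattern for this kind of batch run is to report progress after each item and collect failures instead of stopping at the first one. The sketch below shows that pattern in general terms; `run_batch` and `demo_process` are hypothetical names, and the post does not say how the real pipeline reports progress or handles errors.

```python
def run_batch(items, process, report=print):
    """Run process(item) for each item, reporting progress and collecting failures."""
    failures = []
    total = len(items)
    for index, item in enumerate(items, start=1):
        try:
            process(item)
            status = "ok"
        except Exception as exc:  # keep going so one bad week does not stop the season
            failures.append((item, exc))
            status = f"failed: {exc}"
        report(f"[{index}/{total}] {item}: {status}")
    return failures


def demo_process(week):
    """Stand-in for grading one week; fails for Wk02 to show error handling."""
    if week == "Wk02":
        raise ValueError("missing prepared CSV")


failed = run_batch(["Wk01", "Wk02", "Wk03"], demo_process)
print(f"{len(failed)} week(s) need attention")
```

Returning the failures lets the caller decide whether to retry, skip, or abort, which keeps the outputs of successfully processed weeks intact.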
Analytics & Tracking
The system includes comprehensive analytics:
- Google Analytics 4 integration for usage tracking
- PDF download tracking to monitor report usage
- Navigation click tracking for dashboard interactions
- Custom event tracking for coaching staff engagement
Project Impact
This pipeline has transformed how the coaching staff analyzes player performance:
- Time Savings: Reduced manual film analysis from hours to minutes
- Consistency: Standardized scoring across all players and games
- Insights: Data-driven player development decisions
- Scalability: Handles entire seasons with minimal effort
Lessons Learned
Building this pipeline taught me valuable lessons about:
- Data Pipeline Design: The importance of modular, testable components
- User Experience: Making complex data accessible to non-technical users
- Performance Optimization: Balancing feature richness with processing speed
- Documentation: Clear instructions are crucial for adoption
What's Next
This project demonstrates how thoughtful automation can transform manual processes into efficient, scalable systems that provide real value to end users. The pipeline continues to evolve with new features and improvements based on user feedback and changing requirements.
Human Reflections
This film review pipeline represents one of my most comprehensive data engineering projects. What started as a simple script to help with wide receiver evaluations evolved into a full-featured system that the coaching staff can now use weekly. After watching the film and taking notes for years, I realized I could develop a better way to grade player performance and share that information with them quickly and permanently.
The most rewarding aspect was seeing how the automated reports changed the coaching staff's workflow. Instead of spending hours manually calculating statistics and creating reports, they can now focus on the actual analysis and player development. No more trying to remember which play number to show in the film session, or hunting for the play where that 22-yard catch was made, or "that one play where I broke 2 tackles". With these reports it's all documented: the data is readily available and easy to digest and use.
The GitHub Actions integration was particularly satisfying, as it eliminated the manual deployment process entirely.
What surprised me most was how the system continues to evolve beyond its original scope. What began as a simple grading tool became a comprehensive analytics platform that provides insights the coaching staff never had before.
More about this project here
Building AI with AI Series
Part 2 of 2