Building AI with AI

Introducing My Multi-Tenant Video Processing Platform: The Vision and Architecture

Welcome to the first entry in my "Building AI with AI" series, where I'll document the process of creating a multi-tenant video processing platform with the assistance of AI tools.

The Problem I'm Solving

Organizations today are creating more video content than ever before: training materials, webinars, customer interviews, internal communications, and more. These videos contain valuable information, but extracting insights from hours of footage remains challenging:

  • Manual transcription and analysis are time-consuming and expensive
  • Generic AI solutions lack domain-specific knowledge for specialized use cases
  • Building custom solutions requires significant ML expertise and infrastructure

My video processing platform aims to solve these challenges by providing a scalable, multi-tenant system that allows organizations to:

  1. Automatically process and transcribe videos
  2. Train custom models on their specific video content
  3. Extract insights and enable powerful search functionality
  4. Scale resources based on their needs

The Meta Approach: Building AI with AI

What makes this project particularly interesting is the meta-concept at its core: using AI to build an AI system. Rather than assembling a traditional development team, I'm partnering with several AI assistants:

  • Claude (Anthropic): My architect and systems designer
  • ChatGPT (OpenAI): My primary coding partner
  • Grok (xAI): My creative consultant for unconventional approaches
  • Cursor: My AI-powered code editor

Together, these tools form a comprehensive development environment that represents what I believe is the future of software creation. Throughout this series, I'll be transparent about which parts were created with AI assistance and share my unfiltered thoughts in the "Human Reflections" sections.

The Technology Stack

After consulting with my AI architects, I've settled on a technology stack that balances performance, scalability, and development efficiency:

Infrastructure: EC2, Docker, S3, PostgreSQL
Backend: Python, Flask, FFmpeg, Whisper API
Machine Learning: HuggingFace Transformers, PyTorch
Frontend: React for the admin console

Architecture Overview

The system employs a multi-tenant architecture with Docker containers providing isolation between tenants. Here's a high-level overview of the components:

Client Layer

  • Admin Console (React)
  • API Clients
  • Mobile App (future)

API Layer

  • Flask API with JWT Authentication
  • Endpoints for Tenant, User, Video, Model, and Payment management
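
To make the JWT piece concrete, here is a minimal standard-library sketch of how a token could carry the tenant identity on every request. The `tenant_id` claim name, the `SECRET` value, and the function names are my assumptions for illustration, not the platform's actual code; a real Flask service would more likely rely on an established JWT library than on hand-rolled signing.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"change-me"  # hypothetical signing key; load it from configuration in practice


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(user_id, tenant_id, ttl_seconds=3600, now=None):
    """Create an HS256-signed token that names the user and their tenant."""
    now = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": user_id, "tenant_id": tenant_id, "exp": now + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(SECRET, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_token(token, now=None):
    """Return the payload if the signature and expiry check out, else raise ValueError."""
    now = int(time.time()) if now is None else now
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = parts
    expected = hmac.new(
        SECRET, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    # Constant-time comparison avoids leaking signature bytes through timing.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    payload = json.loads(_b64url_decode(payload_b64))
    if payload["exp"] <= now:
        raise ValueError("token expired")
    return payload
```

An endpoint would call `verify_token` on the incoming bearer token and scope every query to the returned `tenant_id`, so one tenant can never address another tenant's videos or models.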

Processing Layer

  • Tenant Container Manager for resource allocation
  • Video Processing Pipeline (download, audio extraction, transcription)
  • Model Training Pipeline
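
As a rough sketch of what the Tenant Container Manager might do when it allocates resources, the snippet below builds a `docker run` command with per-tier CPU and memory limits. The tier names, the limit values, the `tenant-<id>` naming scheme, and the `TENANT_ID` environment variable are all assumptions I made up for illustration; only the Docker flags themselves are standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierLimits:
    cpus: float   # value passed to --cpus
    memory: str   # value passed to --memory, e.g. "4g"


# Hypothetical tiers and limits; the real values depend on pricing and workload.
TIER_LIMITS = {
    "free": TierLimits(cpus=0.5, memory="512m"),
    "pro": TierLimits(cpus=2.0, memory="4g"),
}


def docker_run_args(tenant_id, tier, image):
    """Build the argument list for starting one isolated tenant container."""
    limits = TIER_LIMITS[tier]  # raises KeyError for an unknown tier
    return [
        "docker", "run", "--detach", "--rm",
        "--name", f"tenant-{tenant_id}",
        "--cpus", str(limits.cpus),
        "--memory", limits.memory,
        "--env", f"TENANT_ID={tenant_id}",
        image,
    ]
```

Returning the arguments as a list (rather than a shell string) means the command can be handed to `subprocess.run` without shell quoting concerns, and it is easy to inspect in tests.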

Data Layer

  • PostgreSQL for metadata
  • S3 for video, audio, transcription, and model storage
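
One simple way to keep tenants separated in S3 is a per-tenant key prefix. The sketch below shows one possible layout; the `tenants/<tenant_id>/<kind>/<filename>` scheme and the function name are assumptions on my part, not a settled design.

```python
# The four artifact types named in the data layer above.
ARTIFACT_KINDS = ("video", "audio", "transcription", "model")


def tenant_key(tenant_id, kind, filename):
    """Build an S3 object key that keeps each tenant's artifacts under its own prefix."""
    if kind not in ARTIFACT_KINDS:
        raise ValueError(f"unknown artifact kind: {kind!r}")
    for part in (tenant_id, filename):
        # Reject anything that could escape the tenant's prefix.
        if not part or "/" in part or part in (".", ".."):
            raise ValueError(f"invalid key component: {part!r}")
    return f"tenants/{tenant_id}/{kind}/{filename}"
```

With a layout like this, access policies and cleanup jobs can target a single prefix such as `tenants/acme/` instead of filtering individual objects.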

External Integrations

  • Whisper API for transcription
  • Stripe for payments

Core Processing Workflows

The platform has two main processing workflows:

Video Processing Flow:

  1. User uploads video through admin console
  2. System dispatches processing job to tenant container
  3. Container extracts audio using FFmpeg
  4. Audio sent to Whisper API for transcription
  5. Results stored in S3 and database
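
The steps above can be sketched in a few lines. This is a simplified illustration, not the platform's pipeline: `transcribe` and `store` are hypothetical callables standing in for the Whisper API call and the S3/database writes, and the mono 16 kHz WAV settings are a common choice for speech audio that I am assuming here.

```python
import subprocess
from pathlib import Path


def ffmpeg_extract_audio_cmd(video_path, audio_path):
    """Build the FFmpeg command that pulls the audio track out of a video file."""
    return [
        "ffmpeg", "-y",      # overwrite the output file if it already exists
        "-i", video_path,
        "-vn",               # drop the video stream
        "-ac", "1",          # downmix to mono
        "-ar", "16000",      # resample to 16 kHz
        audio_path,
    ]


def process_video(video_path, transcribe, store, run=subprocess.run):
    """Extract audio, transcribe it, and persist the results.

    transcribe(audio_path) -> str and store(audio_path, transcript) are
    placeholders for the transcription call and the storage step.
    """
    audio_path = str(Path(video_path).with_suffix(".wav"))
    run(ffmpeg_extract_audio_cmd(video_path, audio_path), check=True)
    transcript = transcribe(audio_path)
    store(audio_path, transcript)
    return transcript
```

Passing `run`, `transcribe`, and `store` in as parameters keeps the flow testable without FFmpeg or network access; in the tenant container they would be the real subprocess call, API client, and storage layer.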

Model Training Flow:

  1. User initiates model training via admin console
  2. System queues training job based on tenant tier
  3. Tenant container creates Q&A dataset from transcriptions
  4. System fine-tunes base model (e.g., google/flan-t5-base)
  5. Trained model stored in tenant-specific S3 location
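
Step 2, queueing by tenant tier, could look something like the priority queue below. The tier names and their ordering are hypothetical, and a production system would likely use a persistent job queue rather than an in-memory heap; this only illustrates the ordering logic.

```python
import heapq
import itertools

# Hypothetical tiers; a lower number is served first.
TIER_PRIORITY = {"enterprise": 0, "pro": 1, "free": 2}


class TrainingQueue:
    """In-memory queue that serves higher tiers first, FIFO within a tier."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker that preserves submission order

    def submit(self, tenant_id, tier, base_model="google/flan-t5-base"):
        job = {"tenant_id": tenant_id, "tier": tier, "base_model": base_model}
        heapq.heappush(self._heap, (TIER_PRIORITY[tier], next(self._counter), job))
        return job

    def next_job(self):
        """Pop the next job to train, or return None when the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]
```

The counter matters: without it, two jobs in the same tier would fall back to comparing the job dictionaries, which Python cannot order.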

Development Philosophy

I'm committed to "building in public" through this blog series. This approach offers several benefits:

  • Transparency: Sharing the actual development process, including challenges
  • Community Input: Gathering feedback from readers as I build
  • Learning Resource: Creating a case study in AI-assisted development
  • Exploration: Testing the boundaries of what's possible with AI tools

Human Reflections

When I first conceived this project, I wondered if it was too meta: using AI to build AI-powered systems and documenting the process. But the more I thought about it, the more I realized that this represents something larger and more impactful about where software development is heading.

AI agents working alongside engineers, and in some cases replacing them, are already a reality in the tech industry; if you don't believe me, just search for how many tech companies are cutting jobs and moving those resources to AI. In this project, my role feels similar to working with a well-equipped engineer who handles many of the coding tasks that would typically consume most of my time. This shift lets me consider the bigger picture and ask questions that would normally come up later in the development lifecycle. Being able to crank out POCs while I iterate toward a viable product has been a nice change of pace.

The conversations with the AI tools have been surprising. Trying different approaches with ChatGPT and Claude has pushed me to first get clear on what I am aiming to do, and then find the most effective way to get it done. Bouncing ideas off the AI systems has produced good early results, often surfacing considerations I hadn't thought about.

I have run into some limitations: these AI tools sometimes offer solutions that do not work or are over-engineered. This is where my experience plays a key role in keeping the project on track and functionally lean. As I move forward, it will be interesting to see where this partnership with AI goes. Can AI tools assist in building a production-ready, scalable system? What parts will still require significant human intervention? I'll be addressing these questions honestly throughout this series.

What's Next

In my next post, I'll dive into the infrastructure setup process, showing how I used Docker for tenant isolation and how Claude helped me design a secure multi-tenant system. I'll share actual prompts, responses, and code snippets to give you an inside look at the development process.

I invite you to follow along on this journey. Whether you're skeptical about AI's capabilities or enthusiastic about its potential, I hope this series provides valuable insights into the present and future of AI-assisted development.

Have you experimented with AI tools in your development workflow? What aspects of this project are you most curious about? Let me know in the comments below!
