Case Study
2024
Archived

I Built an AI That Writes My Blog Posts

A voice-to-blog pipeline using Whisper, GPT-4, and DALL-E. Because typing is overrated.

Next.js
OpenAI
Whisper
DALL-E
Supabase
TipTap

The Problem

I Have Thoughts. I Hate Typing Them.

Here's the thing about blogging: I actually have plenty of things I want to write about. The problem is the writing part. Sitting down, opening a blank document, staring at the cursor blinking judgmentally at me — it's the worst part of having a personal website.

But talking? Talking I can do. I ramble to myself in the car all the time. What if I could just... record those rambles and have AI turn them into actual content? It sounded like either a brilliant idea or a very lazy one. Maybe both.

The goal was simple: build a system where I could record a 2-minute voice memo, hit a button, and have a fully formatted blog post with a custom header image appear on my site. No typing, no image hunting, no excuses for not updating my blog.

Spoiler alert: it actually worked. Then I forgot to use it for 90 days and Supabase paused my database. But hey, the technology was sound.

The Stack

What Powered This Thing

3
AI APIs Used
<60s
Seconds to Publish
90
Days Before I Forgot About It

The Pipeline

Voice → Text → Blog → Image → Done

The architecture was surprisingly straightforward — at least in concept. Each piece of the puzzle handled one specific job, and they all fed into each other like a content assembly line.

Voice Memo → Whisper API → GPT-4 → DALL-E → Supabase → Published!

How It All Works

  1. Record voice memo in the browser using MediaRecorder API
  2. Upload audio blob to Next.js API route
  3. Send to OpenAI Whisper for transcription
  4. Pass transcript to GPT-4 with blog-writing prompt
  5. Extract title and generate DALL-E image prompt
  6. Create header image with DALL-E 3
  7. Save everything to Supabase with timestamps
  8. Display on blog with TipTap rich text rendering
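
Steps 1 and 2 above can be sketched roughly like this. It is a minimal sketch, not the project's actual code: `recordMemo`, `uploadMemo`, the `/api/posts` route, and the `audio` field name are all hypothetical, and the recorder class and fetch function are injectable only so the sketch can run outside a browser.

```javascript
// Hypothetical helper: record from a MediaStream and resolve with one audio Blob.
// RecorderClass defaults to the browser's MediaRecorder; it is a parameter only
// so the sketch can be exercised with a stand-in outside the browser.
function recordMemo(stream, { maxMs = 120000, RecorderClass = globalThis.MediaRecorder } = {}) {
  return new Promise((resolve, reject) => {
    const recorder = new RecorderClass(stream);
    const chunks = [];
    let timer = null;

    // MediaRecorder hands over the audio in one or more chunks.
    recorder.ondataavailable = (event) => {
      if (event.data && event.data.size > 0) chunks.push(event.data);
    };
    recorder.onerror = (event) => {
      clearTimeout(timer);
      reject(event.error || new Error("Recording failed"));
    };
    recorder.onstop = () => {
      clearTimeout(timer);
      resolve(new Blob(chunks, { type: recorder.mimeType || "audio/webm" }));
    };

    recorder.start();
    // Hard stop at maxMs (2 minutes by default) so a forgotten recording ends.
    timer = setTimeout(() => {
      if (recorder.state === "recording") recorder.stop();
    }, maxMs);
  });
}

// Hypothetical upload step: POST the blob to an API route as multipart form data.
// The "/api/posts" path and the "audio" field name are assumptions.
async function uploadMemo(blob, { url = "/api/posts", fetchImpl = globalThis.fetch } = {}) {
  const form = new FormData();
  form.append("audio", blob, "memo.webm");
  const response = await fetchImpl(url, { method: "POST", body: form });
  if (!response.ok) throw new Error(`Upload failed with status ${response.status}`);
  return response.json();
}
```

In a browser the stream would come from `navigator.mediaDevices.getUserMedia({ audio: true })`. A real UI would also wire a stop button to the recorder rather than rely on the timeout alone.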

Interactive Demo

See It In Action

Since the actual service is currently hibernating (RIP free tier), here's a simulation of what the pipeline looked like when it was running. The real version was slightly slower because, you know, actual AI processing takes time.

Content Pipeline Demo

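// The magic behind the curtain
// (simplified: whisper, gpt4, and dalle stand in for thin wrappers
// around the OpenAI API, and supabase is an initialized client)
const createPost = async (audioBlob) => {
  // 1. Transcribe with Whisper
  const transcript = await whisper.transcribe(audioBlob);

  // 2. Expand with GPT-4 into a { title, body } object
  const content = await gpt4.complete({
    prompt: `Write a blog post based on: ${transcript}`,
    style: "conversational, engaging"
  });

  // 3. Generate header image from the post title
  const image = await dalle.generate(content.title);

  // 4. Save to database. supabase-js returns failures in `error`
  //    instead of throwing, so check it explicitly.
  const { error } = await supabase.from('posts').insert({
    title: content.title,
    content: content.body,
    header_image_url: image.url
  });
  if (error) throw error;
};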
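
The snippet above assumes the model's reply already has a `title` and a `body`. One way to get there, assuming the prompt asks for a JSON object with exactly those two keys, is to parse the reply defensively. `parsePost` is a hypothetical helper for illustration, not part of any SDK and not necessarily how the original project did it.

```javascript
// Hypothetical helper: turn a raw model reply into { title, body }.
// Assumes the prompt requested a JSON object with "title" and "body" keys.
function parsePost(raw) {
  // Models sometimes wrap JSON in a Markdown code fence; strip it if present.
  const cleaned = raw
    .trim()
    .replace(/^`{3}(?:json)?\s*/i, "")
    .replace(/\s*`{3}$/, "");

  let parsed;
  try {
    parsed = JSON.parse(cleaned);
  } catch (err) {
    throw new Error("Model reply was not valid JSON");
  }

  const { title, body } = parsed ?? {};
  if (typeof title !== "string" || title.trim() === "") {
    throw new Error("Model reply is missing a title");
  }
  if (typeof body !== "string" || body.trim() === "") {
    throw new Error("Model reply is missing a body");
  }
  return { title: title.trim(), body: body.trim() };
}
```

Failing loudly here keeps a malformed reply from being saved as a blank post.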

Features

What It Could Do

Voice-to-Post

Record a quick voice memo and watch it transform into a polished blog post.

AI Writing Assistant

GPT-4 expands your rough thoughts into engaging, well-structured content.

Auto-Generated Images

DALL-E creates unique header images based on your post content.

Rich Text Editor

TipTap-powered editor for manual tweaks and formatting.

One-Click Publish

From voice memo to live post in under a minute.

Supabase Backend

Real-time database with authentication and storage.

Challenges

Things That Went Wrong

Building with AI is fun until you realize these models are simultaneously brilliant and completely unhinged. Here's what I learned the hard way.

Voice Quality Issues

Background noise and mumbling made transcriptions hilariously wrong. 'AI blogging' became 'eye flogging' more than once.

Solution

Added audio preprocessing and prompt engineering to help GPT-4 interpret context clues even with imperfect transcriptions.
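
The source does not say what the preprocessing or the prompt looked like, so this is only one plausible reading: ask the browser for cleaner audio at capture time, and tell the writing model that the transcript may contain misheard words. Both function names and the glossary idea are assumptions.

```javascript
// Capture-side cleanup: standard MediaTrackConstraints that ask the browser
// for echo cancellation and noise suppression (browsers may ignore them).
function memoAudioConstraints() {
  return {
    audio: {
      echoCancellation: true,
      noiseSuppression: true,
      autoGainControl: true,
      channelCount: 1, // mono is enough for speech
    },
  };
}

// Transcript-side hint for the writing model. The wording is an assumption,
// not the prompt the project actually used.
function describeTranscript(transcript, glossary = []) {
  const lines = [
    "The text below is an automatic transcription of a voice memo.",
    "It may contain misheard words. Use the surrounding context to infer the intended meaning.",
  ];
  if (glossary.length > 0) {
    lines.push(`Terms the speaker is likely to use: ${glossary.join(", ")}.`);
  }
  lines.push("", "Transcript:", transcript.trim());
  return lines.join("\n");
}
```

The constraints object would be passed to `navigator.mediaDevices.getUserMedia(memoAudioConstraints())`. The transcription endpoint also accepts an optional `prompt` string, which could carry the same glossary.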

Content Hallucinations

GPT-4 would sometimes add 'facts' I never mentioned. Apparently I'm an expert in quantum computing now?

Solution

Refined prompts to be more constrained, telling the model to only expand on what was actually said, not invent new claims.
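
A constrained prompt along those lines might be built like this. The instruction wording is an assumption rather than the project's actual prompt, and `buildBlogMessages` is a hypothetical helper that returns the role/content message list used by chat-style completion APIs.

```javascript
// Builds a chat-style message list (role/content pairs) for the writing step.
// The instruction wording is an assumption, not the project's actual prompt.
function buildBlogMessages(transcript) {
  const system = [
    "You turn voice-memo transcripts into blog posts.",
    "Expand only on points the speaker actually made.",
    "Do not add facts, statistics, credentials, or anecdotes that are not in the transcript.",
    "If something is unclear, keep it general rather than guessing.",
    "Keep the tone conversational and engaging.",
  ].join(" ");

  return [
    { role: "system", content: system },
    { role: "user", content: `Transcript:\n${transcript.trim()}` },
  ];
}
```

Putting the rules in the system message and the transcript in the user message keeps the two from blending together.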

DALL-E Consistency

Header images ranged from 'actually pretty good' to 'nightmare fuel.' The AI really struggled with hands.

Solution

Wrote better image prompts and added style constraints. Still avoid any prompts involving human fingers.
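
As a rough illustration of style constraints, the image prompt can be assembled from the post title plus a fixed style and an avoid list. The default style string, the avoid list, and both function names are assumptions. The request object follows the documented DALL-E 3 parameters (one image per request, landscape size `1792x1024`).

```javascript
// Assumed house style; the project's real style constraints are not documented.
const DEFAULT_STYLE = "flat vector illustration, muted color palette, no text";

// Hypothetical helper: combine title, style, and an avoid list into one prompt.
function buildImagePrompt(title, { style = DEFAULT_STYLE, avoid = ["people", "hands", "faces"] } = {}) {
  const parts = [
    `Blog header image for a post titled "${title.trim()}".`,
    `Style: ${style}.`,
  ];
  if (avoid.length > 0) parts.push(`Do not include ${avoid.join(", ")}.`);
  return parts.join(" ");
}

// Parameters for an image generation request.
function buildImageRequest(title, options) {
  return {
    model: "dall-e-3",
    prompt: buildImagePrompt(title, options),
    n: 1, // DALL-E 3 generates one image per request
    size: "1792x1024", // landscape size supported by DALL-E 3
  };
}
```

Keeping the style string fixed across posts is what makes the headers look like they belong to the same blog.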

Database Costs

Supabase's free tier pauses databases after 90 days of inactivity. Guess who forgot to use their own blog?

Solution

Learned this the hard way. The project now lives on as a portfolio piece rather than an active service. RIP, little blog.

Lessons Learned

Was It Worth It?

Absolutely. Even if the project is currently in database purgatory, I learned a ton about building with AI APIs, handling audio in the browser, and the importance of actually using the things you build.

The core idea is sound: AI can genuinely help bridge the gap between "I have thoughts" and "those thoughts are published." The technology is there. The execution worked. The only thing that failed was my commitment to actually recording voice memos.

What I'd Do Differently

  • Add a mobile app for easier voice recording on the go
  • Set up reminder notifications to actually use the thing
  • Use a database that doesn't hibernate (or pay for the premium tier)
  • Build in a "draft" mode for posts that need human editing first
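
The "draft" mode idea could be as small as a status field on each post. This is a sketch of one possible shape, not something the project implemented: the `status` and `published_at` columns and both helper names are hypothetical.

```javascript
// Hypothetical row shape: a `status` column plus a nullable `published_at`.
// New posts default to drafts so a human can review them first.
function toPostRow({ title, body, imageUrl }, { publish = false, now = new Date() } = {}) {
  return {
    title,
    content: body,
    header_image_url: imageUrl,
    status: publish ? "published" : "draft",
    published_at: publish ? now.toISOString() : null,
  };
}

// Only published rows should reach the public blog page.
function visiblePosts(rows) {
  return rows.filter((row) => row.status === "published");
}
```

With this shape, the pipeline saves every generated post as a draft, and publishing becomes a separate, deliberate step.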

If you want to build something similar, the stack is solid: Next.js for the frontend and API routes, OpenAI for the AI magic, and Supabase for the backend. Just remember to actually use it once in a while. Or pay for the premium tier. Or both.

Brian Stever
