Academic Image Alt-Text Optimization System

Overview

This system automatically generates SEO-optimized alt-text for the images on your academic website, using AI analysis of the site's content and academic context.

🎯 Purpose

Replace the poor or missing alt-text on your 281 images with SEO-optimized academic descriptions that will:

  • Boost Google Image Search visibility
  • Improve accessibility for screen readers
  • Enhance academic discoverability
  • Target physics/academic keywords effectively

📁 Files Created

Core Scripts

  • optimize_image_alt_text.py - Master orchestration script
  • generate_alt_text.py - Individual image alt-text generation
  • batch_generate_alt_text.py - Batch processing for multiple images
  • apply_alt_text.py - Apply generated alt-text to site files

Analysis & Data

  • image_catalogue_detailed.py - Creates comprehensive image analysis
  • image_catalogue.txt - Simple list of all 281 images
  • image_catalogue_detailed.json - Full analysis with classifications
  • image_catalogue_summary.txt - Human-readable summary

Generated Files (after running)

  • alt_text_results.txt - Generated alt-text results
  • alt_text_prompt_*.txt - Individual prompts for debugging

🚀 Quick Start

Option 1: Master Script

# Generate alt-text for high-priority images (limit 10 for testing)
python3 optimize_image_alt_text.py --priority HIGH --limit 10

# For production - process all high-priority images
python3 optimize_image_alt_text.py --priority HIGH

Option 2: Step-by-Step Control

# 1. Create image catalogue
python3 image_catalogue_detailed.py

# 2. Generate alt-text for specific priority
python3 batch_generate_alt_text.py --priority HIGH --limit 5

# 3. Apply to site files
python3 apply_alt_text.py

Option 3: Single Image Testing

# Test with one image
python3 generate_alt_text.py assets/home/dr-will-barker-theoretical-physicist-cambridge.jpg

🎛️ Command Options

Priority Levels

  • CRITICAL: Profile photos (1 image)
  • HIGH: Teaching photos + outreach (116 images)
  • MEDIUM: Research projects + backgrounds (137 images)
  • LOW: Icons + logos (27 images)

Useful Flags

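  • --priority LEVEL: Process one priority level (CRITICAL, HIGH, MEDIUM, or LOW)
  • --all-priorities: Process every priority level
  • --dry-run: See what would happen without making changes
  • --limit N: Process only N images (for testing)
  • --generate-only: Generate alt-text but don’t update files
  • --skip-catalogue: Use existing catalogue (faster)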
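
As a rough illustration of how these flags might be declared, the sketch below uses argparse. The build_parser function is hypothetical; only the flag names and priority levels come from this README, and the real scripts may define them differently.

```python
import argparse

PRIORITIES = ["CRITICAL", "HIGH", "MEDIUM", "LOW"]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags used in this README."""
    parser = argparse.ArgumentParser(description="Alt-text optimization (sketch)")
    parser.add_argument("--priority", choices=PRIORITIES,
                        help="process a single priority level")
    parser.add_argument("--all-priorities", action="store_true",
                        help="process every priority level")
    parser.add_argument("--limit", type=int, default=None, metavar="N",
                        help="process only N images")
    parser.add_argument("--dry-run", action="store_true",
                        help="report what would change without writing files")
    parser.add_argument("--generate-only", action="store_true",
                        help="generate alt-text but do not update site files")
    parser.add_argument("--skip-catalogue", action="store_true",
                        help="reuse the existing catalogue")
    return parser


# Parse an explicit argument list instead of sys.argv, for demonstration.
args = build_parser().parse_args(["--priority", "HIGH", "--limit", "10", "--dry-run"])
print(args.priority, args.limit, args.dry_run)
```

Using choices for --priority means a misspelled level is rejected before any image is processed.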

📊 Expected Results

Before Optimization

<img src="Undergraduate1.jpg" alt="">

After Optimization

<img src="physics-teaching-blackboard-lecture-cambridge-01.jpg" 
     alt="Dr. Barker teaching quantum mechanics at Cambridge showing wave function calculations on blackboard">

SEO Impact

  • Keywords: Dr. Will Barker, theoretical physicist, Cambridge, quantum mechanics
  • Context: Academic teaching, research, professional
  • Length: 80-125 characters (optimal for screen readers)
  • Accessibility: Descriptive, meaningful for users with disabilities

🔧 Technical Details

How It Works

  1. Catalogue: Scans all 281 images, classifies by priority/category
  2. Context: Uses code2prompt to gather all site markdown content
  3. AI Generation: Feeds context + image info to Gemini 2.5 Pro
  4. Application: Updates actual markdown/HTML files with new alt-text
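
Step 4 could look roughly like the sketch below, which rewrites the alt-text for one image wherever it is referenced in Markdown or HTML. The function name apply_alt_text and the regex-based approach are assumptions for illustration; the actual apply_alt_text.py may work differently, and a real HTML parser would be more robust (this sketch only handles double-quoted attributes).

```python
import re

IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)
ALT_ATTR = re.compile(r'\balt\s*=\s*"[^"]*"', re.IGNORECASE)


def apply_alt_text(text: str, src: str, alt: str) -> str:
    """Set the alt-text of every Markdown or HTML reference to `src`."""
    html_alt = alt.replace("&", "&amp;").replace('"', "&quot;")
    md_alt = alt.replace("[", r"\[").replace("]", r"\]")

    # Markdown form: ![old alt](src) or ![old alt](src "title")
    md_image = re.compile(r"!\[[^\]]*\]\(" + re.escape(src) + r'(\s+"[^"]*")?\)')

    def fix_markdown(match: re.Match) -> str:
        title = match.group(1) or ""
        return f"![{md_alt}]({src}{title})"

    def fix_html(match: re.Match) -> str:
        tag = match.group(0)
        if f'src="{src}"' not in tag:
            return tag  # a different image: leave untouched
        new_attr = f'alt="{html_alt}"'
        if ALT_ATTR.search(tag):
            return ALT_ATTR.sub(lambda _: new_attr, tag, count=1)
        # No alt attribute yet: insert one right after "<img"
        return tag[:4] + " " + new_attr + tag[4:]

    text = md_image.sub(fix_markdown, text)
    return IMG_TAG.sub(fix_html, text)
```

For example, apply_alt_text('![](a.jpg)', "a.jpg", "new") yields '![new](a.jpg)', while references to other images pass through unchanged.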

Image Classification

  • Teaching Photos: Undergraduate/outreach photo reels
  • Teaching Materials: Course diagrams, equations, blackboards
  • Graduate Research: Research project documentation
  • Professional: Profile photos, conference presentations
  • Technical: Backgrounds, logos, icons
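
One simple way to assign these categories is keyword matching on the image path, sketched below. The keywords, the rule order, and the "Unclassified" fallback are all illustrative assumptions; the real rules live in image_catalogue_detailed.py and may use different signals.

```python
# (keyword, category, priority) rules; the first match wins.
# All keywords here are illustrative assumptions, not the catalogue's real rules.
RULES = [
    ("profile", "Professional", "CRITICAL"),
    ("undergraduate", "Teaching Photos", "HIGH"),
    ("outreach", "Teaching Photos", "HIGH"),
    ("research", "Graduate Research", "MEDIUM"),
    ("background", "Technical", "MEDIUM"),
    ("icon", "Technical", "LOW"),
    ("logo", "Technical", "LOW"),
]


def classify(path: str) -> tuple[str, str]:
    """Return (category, priority) for an image path."""
    lowered = path.lower()
    for keyword, category, priority in RULES:
        if keyword in lowered:
            return category, priority
    # Assumed fallback for paths that match no rule.
    return "Unclassified", "LOW"


print(classify("assets/teaching/Undergraduate1.jpg"))
```

Keeping the rules in a single ordered list makes it easy to see which rule wins when a path matches more than one keyword.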

Safety Features

  • Dry run mode: Test without changes
  • Backup prompts: All prompts saved for debugging
  • Results logging: Track what was generated
  • Selective processing: Choose priority levels and limits
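
A dry-run guard is typically a single check around the write step. The sketch below shows the idea; write_changes is a hypothetical helper, not a function taken from the scripts.

```python
from pathlib import Path


def write_changes(path: Path, new_text: str, dry_run: bool = True) -> bool:
    """Return True if `path` needs updating; write only when dry_run is False."""
    old_text = path.read_text(encoding="utf-8")
    if old_text == new_text:
        return False  # nothing to do
    if dry_run:
        print(f"[dry-run] would update {path}")
    else:
        path.write_text(new_text, encoding="utf-8")
    return True
```

Defaulting dry_run to True means a caller has to opt in explicitly before any file is modified.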

📈 Priority Recommendations

Phase 1: Critical Impact (Start Here)

# Profile photo + top teaching images
python3 optimize_image_alt_text.py --priority CRITICAL
python3 optimize_image_alt_text.py --priority HIGH --limit 20

Phase 2: Teaching Documentation

# All teaching materials
python3 optimize_image_alt_text.py --priority HIGH

Phase 3: Research Content

# Graduate research projects
python3 optimize_image_alt_text.py --priority MEDIUM

Phase 4: Comprehensive

# Everything else
python3 optimize_image_alt_text.py --all-priorities

🐛 Troubleshooting

Common Issues

  • No MCP access: The scripts fall back to placeholder text. Replace the placeholder step with actual Gemini calls
  • code2prompt missing: Install with npm install -g code2prompt
  • Permission errors: Check file permissions on image directories
  • Large batch timeouts: Use --limit to process smaller batches
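
To rule out a missing dependency before a long run, a quick PATH check can help. This is a minimal sketch using the standard library; find_missing_tools is a hypothetical helper and is not part of the scripts.

```python
import shutil


def find_missing_tools(tools: list[str]) -> list[str]:
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = find_missing_tools(["code2prompt"])
if missing:
    print("Missing from PATH:", ", ".join(missing))
```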

Debug Commands

# Check catalogue generation
python3 image_catalogue_detailed.py

# Test single image
python3 generate_alt_text.py assets/home/dr-will-barker-theoretical-physicist-cambridge.jpg

# Dry run batch
python3 batch_generate_alt_text.py --dry-run --limit 3

📋 Quality Assurance

Generated Alt-Text Should Include:

✅ “Dr. Will Barker” or “Will Barker”
✅ “theoretical physicist”
✅ “Cambridge” or “Cambridge University”
✅ Specific physics concepts (quantum mechanics, cosmology, etc.)
✅ Academic context (teaching, research, outreach)
✅ 80-125 characters length
✅ Descriptive, accessible language

Avoid:

❌ Generic descriptions (“Photo of person”)
❌ Starting with “Image of” or “Picture of”
❌ Too short (<50 chars) or too long (>150 chars); aim for 80-125
❌ Technical jargon without context
❌ Purely visual descriptions without meaning
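
The mechanical parts of this checklist can be verified automatically. The sketch below is a hypothetical helper (check_alt_text is not one of the scripts listed in this README), and the problem codes it returns are made up for illustration. Whether the text names a specific physics concept or reads as meaningful still needs a human reviewer.

```python
def check_alt_text(alt: str) -> list[str]:
    """Return a list of checklist problems; an empty list means all checks pass."""
    problems = []
    lowered = alt.lower()

    # Required phrases ("Will Barker" also covers "Dr. Will Barker").
    if "will barker" not in lowered:
        problems.append("missing-name")
    if "theoretical physicist" not in lowered:
        problems.append("missing-role")
    if "cambridge" not in lowered:
        problems.append("missing-cambridge")

    # Discouraged openings.
    if lowered.startswith(("image of", "picture of")):
        problems.append("generic-opening")

    # Length: hard limits of 50 and 150, with a target of 80-125.
    length = len(alt)
    if length < 50:
        problems.append("too-short")
    elif length > 150:
        problems.append("too-long")
    elif not 80 <= length <= 125:
        problems.append("outside-target-length")

    return problems


sample = ("Theoretical physicist Dr. Will Barker teaching quantum mechanics "
          "at Cambridge with wave functions on a blackboard")
print(check_alt_text(sample))
```

A check like this could run over alt_text_results.txt before the apply step so that weak descriptions are caught early.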

🚀 Expected SEO Impact

After implementation and Google reindexing (2-4 weeks):

Google Image Search Improvements

  • Current: 0 images appear for “will barker physics”
  • Expected: 20-50+ images appear for relevant academic searches
  • Target queries: “theoretical physicist cambridge”, “physics teaching”, “quantum mechanics lecturer”

Academic Visibility

  • Enhanced discoverability for physics researchers
  • Better representation in academic image searches
  • Improved accessibility compliance
  • Professional academic presence in visual search

📞 Support

For issues or questions about this optimization system:

  1. Check the generated debug files (alt_text_prompt_*.txt)
  2. Run with --dry-run to test without changes
  3. Process small batches first (--limit 5)
  4. Review image_catalogue_summary.txt for priorities

Status: Ready for deployment
Last Updated: September 2025
Estimated Completion Time: 2-4 hours for all 281 images
Expected Impact: Significant improvement in Google Image Search visibility