SEO & AEO Audit Tools
Automated tools for analyzing and tracking documentation quality metrics for search engine optimization (SEO) and answer engine optimization (AEO).
Tools Overview
1. seo_aeo_analyzer.py
Main analysis tool that scans documentation files and scores them against SEO and AEO metrics.
Features:
Analyzes all Markdown files in tutorial/, how-to/, explanation/, and reference/ directories
Scores 14 different metrics on a 1-5 scale
Generates detailed CSV output with per-page scores
Provides summary statistics and identifies critical issues
Usage:
# Basic usage (analyzes docs/ directory)
python tools/seo_aeo_analyzer.py
# Specify custom docs directory and output file
python tools/seo_aeo_analyzer.py --docs-dir ../other-docs --output audit-2025-12-10.csv
# Suppress summary output
python tools/seo_aeo_analyzer.py --quiet
Output:
CSV file with detailed metrics per page
Console summary with averages, top/bottom performers, and critical issues
2. compare_audits.py
Comparison tool for tracking changes between two audit runs over time.
Features:
Compares two audit CSV files
Identifies improvements and regressions
Tracks metric-by-metric changes
Generates markdown comparison report
Usage:
# Compare two audits and print to console
python tools/compare_audits.py baseline.csv current.csv
# Generate markdown report file
python tools/compare_audits.py audit-2025-12-10.csv audit-2025-12-17.csv --output weekly-report.md
Output:
Markdown report showing:
Overall score changes
Top improvements and regressions
Metric-by-metric comparison
New/removed pages
Recommendations
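As a rough illustration of the comparison described above, the following sketch loads two audit CSVs and groups pages into improved, regressed, new, and removed. It is not the code in compare_audits.py: the column names `file` and `overall_score` and both function names are assumptions made for this example, so adjust them to match the real CSV header.

```python
import csv
import io


def load_scores(csv_file, key="file", score="overall_score"):
    """Return {page: score} from an audit CSV (column names are assumed)."""
    return {row[key]: float(row[score]) for row in csv.DictReader(csv_file)}


def compare_scores(baseline, current):
    """Group pages by how their score changed between two audits."""
    shared = baseline.keys() & current.keys()
    deltas = {page: round(current[page] - baseline[page], 2) for page in shared}
    return {
        "improved": sorted(p for p, d in deltas.items() if d > 0),
        "regressed": sorted(p for p, d in deltas.items() if d < 0),
        "new": sorted(current.keys() - baseline.keys()),
        "removed": sorted(baseline.keys() - current.keys()),
    }


# In-memory stand-ins for two audit files, using the assumed header.
baseline = load_scores(io.StringIO("file,overall_score\na.md,3.0\nb.md,4.0\nc.md,2.5\n"))
current = load_scores(io.StringIO("file,overall_score\na.md,3.5\nb.md,3.8\nd.md,4.1\n"))
result = compare_scores(baseline, current)
print(result)
```

With real files, pass an open file handle (for example `open("baseline.csv", newline="", encoding="utf-8")`) instead of the `io.StringIO` objects.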
Metrics Explained
SEO Metrics (Search Engine Optimization)
Title Tag Quality (1-5)
Checks for meta description presence and length (50-160 chars)
Evaluates keyword inclusion
Content Depth (1-5)
Based on word count
5: 800+ words
4: 500-799 words
3: 300-499 words
2: 150-299 words
1: <150 words
Heading Structure (1-5)
Proper H1-H6 hierarchy
Single H1 per page
Descriptive headings
Internal Links (1-5)
Counts MyST refs, term references, and relative links
5: 9+ links
4: 6-8 links
3: 3-5 links
2: 1-2 links
1: 0 links
Meta Description (1-5)
Same as Title Tag Quality
Compelling description for search results
URL Quality (1-5)
Descriptive filename
Hyphenated words
Reasonable length (<60 chars ideal)
Freshness (1-5)
Mentions Ubuntu versions
Specifically mentions recent releases (24.04, Noble)
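The threshold tables and heading rules above can be sketched in a few lines of Python. This is an illustration only, not the analyzer's actual code: the function names, the regular expressions, and the choice to treat any Markdown link without a URL scheme as "relative" are assumptions for this example.

```python
import re
from typing import List, Sequence, Tuple

# (minimum value, score) pairs copied from the tables above, highest first.
CONTENT_DEPTH_THRESHOLDS = [(800, 5), (500, 4), (300, 3), (150, 2)]
INTERNAL_LINK_THRESHOLDS = [(9, 5), (6, 4), (3, 3), (1, 2)]

# Assumed patterns: {ref}`...` and {term}`...` roles, and non-image Markdown links.
MYST_ROLE_RE = re.compile(r"\{(?:ref|term)\}`[^`]+`")
MD_LINK_RE = re.compile(r"(?<!!)\[[^\]]*\]\(([^)\s]+)\)")
URL_SCHEME_RE = re.compile(r"[a-zA-Z][a-zA-Z0-9+.-]*:")
HEADING_RE = re.compile(r"^(#{1,6})\s+\S")
FENCE = "`" * 3  # code fence marker, built this way to keep the example readable


def score_from_thresholds(value: int, thresholds: Sequence[Tuple[int, int]]) -> int:
    """Return the score of the first threshold the value meets, else 1."""
    for minimum, score in thresholds:
        if value >= minimum:
            return score
    return 1


def count_internal_links(text: str) -> int:
    """Count MyST ref/term roles plus Markdown links that have no URL scheme."""
    roles = len(MYST_ROLE_RE.findall(text))
    relative = sum(
        1 for target in MD_LINK_RE.findall(text)
        if not URL_SCHEME_RE.match(target)
    )
    return roles + relative


def heading_issues(text: str) -> List[str]:
    """Report a missing or repeated H1 and any skipped heading levels."""
    levels = []
    in_fence = False
    for line in text.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence  # ignore '#' comments inside code blocks
            continue
        match = None if in_fence else HEADING_RE.match(line)
        if match:
            levels.append(len(match.group(1)))
    issues = []
    if levels.count(1) != 1:
        issues.append(f"expected exactly one H1, found {levels.count(1)}")
    for previous, current in zip(levels, levels[1:]):
        if current > previous + 1:
            issues.append(f"heading level jumps from H{previous} to H{current}")
    return issues


page = "# Title\n\nSee {ref}`install-guide` and [setup](../how-to/setup.md).\n\n### Details\n"
print(score_from_thresholds(len(page.split()), CONTENT_DEPTH_THRESHOLDS))
print(score_from_thresholds(count_internal_links(page), INTERNAL_LINK_THRESHOLDS))
print(heading_issues(page))
```

Keeping the thresholds in data rather than in chained `if` statements makes it easier to tune them later without touching the scoring logic.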
AEO Metrics (Answer Engine Optimization)
Direct Answer Quality (1-5)
First paragraph clarity
Direct answer to implied question
Length and completeness
Structured Content (1-5)
Use of lists (bullet, numbered)
Code blocks
Clear sections
Semantic Clarity (1-5)
MyST semantic markup ({term}, {manpage}, etc.)
Acronym expansions
Clear definitions
Code Examples (1-5)
Number of code blocks
5: 5+ examples
4: 3-4 examples
3: 2 examples
2: 1 example
1: 0 examples
Prerequisites (1-5)
Presence of prerequisites section
Clarity of requirements
More important for how-to guides
Step Format (1-5)
Numbered procedural steps
Most relevant for how-to guides
Clear sequence
Version Specificity (1-5)
Ubuntu version mentions
Version-specific callouts
Currency (24.04 LTS references)
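To make the Code Examples and Version Specificity checks more concrete, here is a minimal sketch of how code blocks and Ubuntu version mentions might be detected. The function names and the version pattern (two digits, a dot, then 04 or 10) are assumptions for this example. The fence count also includes MyST directives written with backtick fences, which the real analyzer may handle differently.

```python
import re
from typing import List

# (minimum number of code blocks, score) pairs from the Code Examples table.
CODE_EXAMPLE_THRESHOLDS = [(5, 5), (3, 4), (2, 3), (1, 2)]

# Assumed pattern for Ubuntu release numbers such as 22.04 or 23.10.
UBUNTU_VERSION_RE = re.compile(r"\b\d{2}\.(?:04|10)\b")
FENCE = "`" * 3  # code fence marker


def count_code_blocks(markdown_text: str) -> int:
    """Count fenced blocks by pairing opening and closing fence lines."""
    fence_lines = sum(
        1 for line in markdown_text.splitlines()
        if line.lstrip().startswith(FENCE)
    )
    return fence_lines // 2


def code_example_score(block_count: int) -> int:
    """Map a code block count to the 1-5 score in the table above."""
    for minimum, score in CODE_EXAMPLE_THRESHOLDS:
        if block_count >= minimum:
            return score
    return 1


def ubuntu_versions_mentioned(text: str) -> List[str]:
    """Return the distinct Ubuntu release numbers found in the text."""
    return sorted(set(UBUNTU_VERSION_RE.findall(text)))
```

A page that mentions only older releases would still return a non-empty list here, so a currency check would additionally need to look for 24.04 in the result.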
Workflow for Periodic Audits
Initial Baseline Audit
# Create the directory that will hold audit results
cd /path/to/ubuntu-server-documentation
mkdir -p audits
# Run the initial audit and store it as the baseline for future comparisons
python tools/seo_aeo_analyzer.py --output audits/baseline-2025-12-10.csv
Regular Re-audits (Weekly/Monthly)
# Run new audit
python tools/seo_aeo_analyzer.py --output audits/audit-$(date +%Y-%m-%d).csv
# Compare with baseline
python tools/compare_audits.py \
audits/baseline-2025-12-10.csv \
audits/audit-$(date +%Y-%m-%d).csv \
--output audits/report-$(date +%Y-%m-%d).md
# Review the report
cat audits/report-$(date +%Y-%m-%d).md
Automated Scheduled Audits
Add to cron or CI/CD pipeline:
# Example cron entry (every Monday at 9 AM); in a crontab, % must be escaped as \%
0 9 * * 1 cd /path/to/ubuntu-server-documentation && python tools/seo_aeo_analyzer.py --output audits/audit-$(date +\%Y-\%m-\%d).csv
GitHub Actions Example:
name: Weekly SEO/AEO Audit
on:
  schedule:
    - cron: '0 9 * * 1'  # Every Monday at 9 AM UTC
  workflow_dispatch:
jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - name: Run SEO/AEO Audit
        run: |
          python tools/seo_aeo_analyzer.py --output audit-$(date +%Y-%m-%d).csv
      - name: Upload Audit Results
        uses: actions/upload-artifact@v4
        with:
          name: seo-aeo-audit
          path: audit-*.csv
Integration with AI Agents
These tools generate structured CSV data that can be easily consumed by AI agents for deeper analysis. See agent-prompt-template.md for a reusable prompt that guides AI analysis of the audit results.
Customization
Adding New Metrics
To add a new metric to seo_aeo_analyzer.py:
Add scoring logic in the analyze_page() method
Add the score to either the seo_metrics or aeo_metrics list
Update the PageMetrics dataclass with the new field
Update the metrics dict in print_summary() for reporting
Adjusting Scoring Thresholds
Edit the scoring functions in seo_aeo_analyzer.py. For example, to make word count requirements stricter:
# Original
if word_count >= 800:
    depth_score = 5
# Stricter
if word_count >= 1200:
    depth_score = 5
Filtering Analyzed Files
Modify the find_content_files() method to change which files are analyzed:
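def find_content_files(self) -> List[Path]:
    # Add more directories
    target_dirs = ["tutorial", "how-to", "explanation", "reference", "guides"]
    content_files = []
    for dir_name in target_dirs:
        # Assumption: the analyzer keeps the docs root in self.docs_dir
        dir_path = self.docs_dir / dir_name
        # Skip certain patterns
        for md_file in dir_path.rglob("*.md"):
            if md_file.name != "index.md" and "deprecated" not in str(md_file):
                content_files.append(md_file)
    return content_files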
Troubleshooting
“No files found”
Check that --docs-dir points to the correct documentation directory
Ensure target directories (tutorial/, how-to/, etc.) exist
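A quick way to confirm both points is to list which of the expected directories are missing under the path you pass to --docs-dir. This is a minimal sketch. The helper name `missing_target_dirs` is hypothetical, and the directory names are taken from the feature list at the top of this README.

```python
from pathlib import Path
from typing import List

# Directories the analyzer is described as scanning.
TARGET_DIRS = ("tutorial", "how-to", "explanation", "reference")


def missing_target_dirs(docs_dir: str) -> List[str]:
    """Return the expected subdirectories that do not exist under docs_dir."""
    root = Path(docs_dir)
    return [name for name in TARGET_DIRS if not (root / name).is_dir()]
```

For example, `missing_target_dirs("docs")` returns an empty list when all four directories are present, and all four names when the path itself is wrong.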
“Error analyzing file”
Check file encoding (should be UTF-8)
Verify Markdown syntax is valid
Check for unusual characters in frontmatter
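To narrow down encoding problems, a short script can report which Markdown files fail to decode as UTF-8. This is a sketch under the assumption that the analyzer reads `*.md` files recursively. The function name `find_non_utf8_files` is hypothetical and not part of the tools.

```python
from pathlib import Path
from typing import List


def find_non_utf8_files(docs_dir: str) -> List[Path]:
    """Return Markdown files under docs_dir that are not valid UTF-8."""
    bad_files = []
    for md_file in sorted(Path(docs_dir).rglob("*.md")):
        try:
            md_file.read_bytes().decode("utf-8")
        except UnicodeDecodeError:
            bad_files.append(md_file)
    return bad_files
```

Any file it reports can then be re-saved as UTF-8 in an editor before running the audit again.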
“Module not found”
Ensure you’re running Python 3.7+
All required modules are in Python standard library (no pip install needed)
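If you are unsure which interpreter is being used, the version requirement can be checked directly. This is a small sketch. `python_is_supported` is a hypothetical helper, and the minimum of 3.7 is taken from the note above.

```python
import sys

MINIMUM_PYTHON = (3, 7)


def python_is_supported(version_info=None) -> bool:
    """Return True if the given (or running) Python meets the minimum version."""
    current = sys.version_info if version_info is None else version_info
    return tuple(current[:2]) >= MINIMUM_PYTHON
```

Calling `python_is_supported()` with no arguments checks the interpreter that runs the script, which is useful when `python` and `python3` point to different installations.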
Best Practices
Regular Audits: Run audits at least monthly to track progress
Baseline Comparison: Always compare against baseline to see trends
Focus on Trends: Individual page scores matter less than overall trends
Prioritize Issues: Address high-impact, low-effort issues first (like internal linking)
Document Changes: Keep audit reports in version control to track improvements
Automate: Use CI/CD to run audits automatically on pull requests
File Locations
ubuntu-server-documentation/
├── tools/
│   ├── seo_aeo_analyzer.py       # Main analysis script
│   ├── compare_audits.py         # Comparison tool
│   ├── README.md                 # This file
│   └── agent-prompt-template.md  # AI agent prompt
├── audits/                       # Store audit results here
│   ├── baseline-2025-12-10.csv
│   ├── audit-2025-12-17.csv
│   └── report-2025-12-17.md
└── docs/                         # Documentation to analyze
    ├── tutorial/
    ├── how-to/
    ├── explanation/
    └── reference/
Contributing
To contribute improvements to these tools:
Test changes on a subset of files first
Ensure backward compatibility with existing CSV format
Update this README with any new features
Add examples for new functionality
License
These tools are part of the Ubuntu Server Documentation project and follow the same license.