Clean up spelling exceptions listΒΆ
This prompt should assist with cleaning up your spelling exception list. This is likely to need more than one round of prompting.
- Created with:
LLM: Claude Sonnet 4.5
Agent: GitHub Copilot
# Prompt: Clean Up `.custom_wordlist.txt` Spelling Exception List
You are tasked with cleaning up the spelling exception list in `/home/sally/src/ubuntu-server-documentation/docs/.custom_wordlist.txt`. This file contains words that should be exempted from spell-checking in a Sphinx-based Ubuntu Server documentation project that uses **US English** (`en-US`).
## Your Task
Perform the following operations **in a single pass** using the `multi_replace_string_in_file` tool or equivalent batch operation. Read the entire file once, process all changes in memory, then write the complete cleaned result back in one operation.
### Processing Steps (in order):
1. **Sort alphabetically** (case-insensitive, A-Z)
2. **Remove exact duplicates** (case-sensitive - keep only first occurrence)
3. **Remove GB English spellings** that differ from US English (e.g., "colour" vs "color", "organise" vs "organize") - these don't need exceptions since the project uses US English
4. **Remove words that are valid in standard US English dictionary** (common words like "adapter", "backend", "config" that don't need special exceptions)
5. **Remove invalid/erroneous spellings** (typos, malformed words, garbage entries)
### Critical Exclusions - DO NOT REMOVE:
- **Acronyms** (e.g., ABI, ACL, API, DNS, LDAP, UUID, VM)
- **Product names** (e.g., Apache, MySQL, Nginx, OpenStack, QEMU, Ubuntu)
- **Software/package names** (e.g., systemd, rsync, nginx, apparmor, netplan)
- **Technical jargon** specific to Linux/Ubuntu/networking/systems (e.g., multipath, syslog, journald, bootloader, netboot, cgroup)
- **Domain-specific terms** (e.g., DPDK, LXD, Netplan, Subiquity, LVM, DRBD)
- **File/command names** (e.g., sshd, dhcpd, multipathd, systemctl)
- **Technical compound words** (e.g., autoinstall, multicast, passthrough, filesystem)
- **Person names** (e.g., Danika, ArrayBolt, Stephane)
- **Organization names** (e.g., Canonical, Launchpad, GitHub)
### Examples of What TO REMOVE:
- Common English words: "adapter", "backend", "config", "lookup", "runtime", "utils"
- GB spellings if US is correct: (check for -ise/-ize, -our/-or, -re/-er patterns)
- Clearly wrong: random letter combinations, obvious typos
- Words already correct in US English dictionaries
### Output Format:
The cleaned file should:
- Be sorted alphabetically (case-insensitive)
- Have no duplicates
- Contain only technical terms, acronyms, and product names that genuinely need spell-check exceptions
- End with exactly one blank line (as noted in the file header comment)
- Preserve the header comment: `# Leave a blank line at the end of this file to support concatenation`
### Implementation:
1. Read the entire `.custom_wordlist.txt` file
2. Parse into a list, excluding the comment line
3. Apply all five transformations in sequence
4. Sort the final result
5. Use `replace_string_in_file` to replace the entire content (from first word to last word) with the cleaned, sorted list
6. Verify the blank line at end is preserved
**Execute this task now.** Provide a brief summary of what was removed and the final count.