Automating Blog Tagging with AI

Tagging is a chore

Adding tags to blog posts is surprisingly tedious.

  • You have to consider consistency with existing tags.
  • When creating new tags, you want to maintain a consistent level of granularity (genre? theme? specific technology name?).
  • You need to decide how labels should be handled in different languages.
  • As the number of articles grows, reconciling tags with historical ones becomes increasingly difficult.

"Classification based on rules" is an area where AI excels. I created a script to automate tagging using the Gemini API and further integrated it into my GitHub Actions pipeline for complete automation.

tags.ts: The AI Tagging Script

First, I built a script that runs locally. It works simply: it passes the body of a Japanese article and the list of existing tags to Gemini, asking it to propose and apply appropriate tags. However, letting AI create tags freely leads to chaos, so I've established a few rules.
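The script itself isn't reproduced here, but the prompt-assembly step might look like the following sketch. The Tag shape and buildTaggingPrompt are hypothetical names of mine; the rules baked into the prompt are the ones described in this post.

```typescript
// Hypothetical shape of an entry in the existing tag list (illustrative).
interface Tag {
  id: string;                                // e.g. "nextjs"
  label: string;                             // display label, e.g. "ブログ"
  granularity: "large" | "medium" | "small"; // see the hierarchy below
}

// Assemble the prompt from the article body and the existing tag list,
// encoding the hard rule (2-10 tags) and soft targets described in this post.
function buildTaggingPrompt(articleBody: string, existingTags: Tag[]): string {
  const tagList = existingTags
    .map((t) => `- ${t.id} [${t.granularity}] (${t.label})`)
    .join("\n");
  return [
    "You are tagging a Japanese blog post.",
    "Rules:",
    "- Use at least 2 and at most 10 tags (hard rule).",
    "- Aim for at least one tag per granularity level (soft target).",
    "- Prefer existing tags; do not create concepts that overlap with them.",
    "",
    "Existing tags:",
    tagList,
    "",
    "Article:",
    articleBody,
  ].join("\n");
}
```

The resulting string would then be sent to the Gemini API along with a request for structured output.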

Granularity Rules for Tags

I've set up three levels of granularity for tags and provide this hierarchy to the AI.

  Granularity   Meaning                  Examples
  large         Genre / category         blog, tech, entertainment
  medium        Field / theme            ai, web-development, anime
  small         Specific tech / product  nextjs, vercel, supabase

The goal is to have at least 2 and at most 10 tags per article, with at least one tag per granularity level as a soft target (making this a hard rule would lead to tag stuffing in short articles).
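The count rule and the per-level soft target can be checked mechanically. Here is a minimal sketch of that check; the function and type names are mine, not from the actual script.

```typescript
type Granularity = "large" | "medium" | "small";

interface TagSetCheck {
  hardOk: boolean;             // 2-10 tags: a violation triggers a retry
  softOk: boolean;             // one tag per level: logged, not enforced
  missingLevels: Granularity[];
}

// Validate a proposed tag set against the hard count rule and the
// soft target of covering every granularity level.
function checkTagSet(levels: Granularity[]): TagSetCheck {
  const present = new Set(levels);
  const missingLevels = (["large", "medium", "small"] as Granularity[]).filter(
    (g) => !present.has(g),
  );
  return {
    hardOk: levels.length >= 2 && levels.length <= 10,
    softOk: missingLevels.length === 0,
    missingLevels,
  };
}
```

Keeping the soft target out of the hard pass/fail path is what lets a two-paragraph post get away with two tags.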

Reconciliation (Preventing Duplication)

Allowing AI to create tags freely often results in overlaps like "ai" and "machine-learning." I explicitly state in the prompt, "Do not create concepts that overlap with existing tags," and only have new tags generated when the topic is truly new.
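Semantic overlap has to be handled in the prompt, but exact-id collisions can be caught in code before anything is written. A minimal sketch, assuming a simple lowercase normalization (the normalization rule is my assumption, not the script's):

```typescript
// Normalize a tag id so "Machine-Learning" and "machine-learning" collide.
function normalizeId(id: string): string {
  return id.trim().toLowerCase();
}

// Keep only proposals whose id does not already exist in the tag list.
function filterNewTags(proposed: string[], existing: string[]): string[] {
  const known = new Set(existing.map(normalizeId));
  return proposed.filter((id) => !known.has(normalizeId(id)));
}
```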

Japanese Label Formatting

I follow the conventions actually used in the Japanese tech community, and I do not force katakana transliterations for terms commonly written in English (O = preferred, △ = acceptable, X = avoid):

  • java → "Java" (O), "ジャバ" (X)
  • embedding → "Embedding" (O), "エンベディング" (△)
  • blog → "ブログ" (O)

If I manually edit the existing tag list tags.json, the AI respects it as the source of truth for subsequent runs. The design ensures human judgment always takes priority.
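One way to keep tags.json authoritative is to merge AI proposals so that an existing entry is never overwritten: the AI may add, never edit. A sketch under that assumption (the Tag shape is my guess at the file's structure):

```typescript
interface Tag {
  id: string;
  label: string;                             // human-curated display label
  granularity: "large" | "medium" | "small";
}

// Merge AI-proposed tags into the human-maintained list.
// Existing entries win unconditionally, so manual edits survive every run.
function mergeTags(existing: Tag[], proposed: Tag[]): Tag[] {
  const known = new Set(existing.map((t) => t.id));
  const additions = proposed.filter((t) => !known.has(t.id));
  return [...existing, ...additions];
}
```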

Mitigating LLM Output Variance

Since LLM output varies slightly, I’ve implemented validation—checking tag counts and the existence of tag IDs—with a retry mechanism if validation fails. The trick is to separate rules into "hard rules" and "soft targets" to avoid being overly restrictive.
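The retry loop can be as simple as re-asking up to N times until the hard rules pass. A sketch with a pluggable generator (in the real script, `generate` would be the Gemini API call; all names here are illustrative):

```typescript
interface TagResult {
  tagIds: string[];
}

// Hard rules only: 2-10 tags, and every id must either exist already
// or be among the explicitly declared new tags. Soft targets are not
// checked here, so short articles can still pass.
function isValid(
  result: TagResult,
  knownIds: Set<string>,
  newIds: Set<string>,
): boolean {
  const n = result.tagIds.length;
  if (n < 2 || n > 10) return false;
  return result.tagIds.every((id) => knownIds.has(id) || newIds.has(id));
}

// Re-run generation until validation passes or attempts run out.
async function tagWithRetry(
  generate: () => Promise<TagResult>,
  knownIds: Set<string>,
  newIds: Set<string>,
  maxAttempts = 3,
): Promise<TagResult> {
  for (let i = 0; i < maxAttempts; i++) {
    const result = await generate();
    if (isValid(result, knownIds, newIds)) return result;
  }
  throw new Error(`validation failed after ${maxAttempts} attempts`);
}
```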

Model Selection

Due to free-tier constraints, I use different models depending on the task. For tagging, I use a lightweight model (gemini-3.1-flash-lite-preview) with an RPD (requests per day) limit of 500, which suits high-volume execution. Separately, I perform a manual quality review of the entire tag hierarchy (checking for duplicates, granularity errors, and unnatural labels). That requires a broader perspective, so I use a smarter model (gemini-2.5-flash) for it.
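The split can live in a tiny config. A sketch (the model ids are the ones mentioned above; the shape and names are mine):

```typescript
// Free-tier trade-off: a high-RPD lightweight model for per-article
// tagging, a stronger model for the occasional whole-hierarchy review.
const MODELS = {
  tagging: "gemini-3.1-flash-lite-preview", // RPD 500, runs on every push
  review: "gemini-2.5-flash",               // manual, whole-tag-list audit
} as const;

function modelFor(task: keyof typeof MODELS): string {
  return MODELS[task];
}
```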

Effectiveness of Prompt Tuning

Let's compare the tagging results based on different prompts for the same article (Mechanisms for Automatic Translation Quality Verification).

Simple Prompt

When instructed simply with, "Please add appropriate tags to this article":

ai, llm, translation, backtranslation, embedding,
vectorsearch, pgvector, postgresql, nlp, automation

It’s full of problems:

  • Overlap: "ai," "llm," and "nlp" have overlapping concepts.
  • Overlap: "pgvector" and "postgresql" are the same, just at different levels of granularity.
  • Ignoring Existing Tags: "backtranslation" was created separately from the existing "translation" tag.
  • Inconsistent Granularity: Genre tags and specific name tags are listed without distinction.
  • Hitting the Limit: Used all 10 available slots.

Prompt with Rules

When passing the existing tag list, granularity hierarchy, and reconciliation rules:

[medium] ai, translation
[small]  postgresql, embedding, gemini, supabase

It was narrowed down to 6 tags with no duplicates. Existing tags were prioritized, and new tags were only proposed when absolutely necessary.

The reason for the difference is that "classify freely" and "classify within this rule system" are fundamentally different tasks for an AI. The latter's constraints produce better, more stable results.

Integration with GitHub Actions

With tags.ts working locally, the next step was integrating it into the GitHub Actions translation pipeline.

Design: Tagging → Translation → Combined Commit

In this blog, there was already a mechanism where pushing an article triggers an automatic translation (details on the translation mechanism here). I added tagging as a pre-step.

Push to posts/ja/
  → Detect slug of changed article
  → AI Tagging (tags.ts apply)       ← Added
  → Translation (translate.ts translate)
  → Select Best Translation (translate.ts best)
  → git commit & push all together

The key is combining the tag changes and translation changes into a single commit: git add posts/ja/ (updated frontmatter tags), posts/en/ (the generated English translation), and content/tags.json (any new tags), then commit them all at once. Grouping them keeps the commit history uncluttered.
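A sketch of the combined-commit step, assuming the workflow shells out to git and the file layout described above (the per-slug paths and commit message are illustrative):

```typescript
// Build the git commands that bundle tagging + translation output
// into one commit, so the history stays one commit per article.
function buildCommitCommands(slug: string): string[] {
  return [
    `git add posts/ja/${slug}.md`, // frontmatter tags updated by tags.ts
    `git add posts/en/${slug}.md`, // English translation from translate.ts
    "git add content/tags.json",   // any newly created tags
    `git commit -m "chore: tag and translate ${slug}"`,
    "git push",
  ];
}
```

In the workflow these would be executed in order by a run step (or via child_process from a Node script).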

This Article is a Test

This article was pushed with tags: [] (no tags). If the pipeline is working correctly, the AI should read this article, auto-generate appropriate tags, and write them into the frontmatter.

If this article has tags, the automated tagging pipeline is functioning correctly.

Summary

  • Created an AI tagging script (tags.ts) using the Gemini API.
  • Implemented granularity rules, reconciliation, Japanese label generation, and validation with retries.
  • Integrated it into the GitHub Actions translation pipeline, consolidating everything into a single commit.