Adding Search to My Blog with Pagefind

I Needed Search

Now that I've been adding more posts, I find myself wondering, "Where did I write about that?" more often. Even though tags help filter things down, it's still nice to have a quick keyword search.

However, since this blog runs on a completely free stack, I wanted to avoid adding paid services just for search. And I didn't want to add more server-side processing. Ideally, I wanted something that runs entirely at build time.

Checking Out the Options

For adding search to a static personal blog, the options usually boil down to these:

| Tool | Method | Japanese Support | Cost | Features |
|---|---|---|---|---|
| Pagefind | Build-time index generation | Native support (CJK tokenizer) | Free | Lightweight (~6KB), multi-language |
| Orama | Browser / Edge full-text search | Via plugin | Free | Feature-rich, but custom index building |
| Fuse.js | Client-side fuzzy search | Yes | Free | Heavy full JSON load with more articles |
| Flexsearch | Client-side full-text search | Yes | Free | Same scalability issues as Fuse.js |
| Algolia | SaaS (DocSearch free tier) | Supported | Conditionally free | High performance, but external dependency |

Why I Chose Pagefind

Pagefind is an open-source tool developed by CloudCannon. It automatically generates search indexes from static site build artifacts (HTML).

The deciding factors were:

  • Free! — It's MIT-licensed OSS, so no extra hosting costs.
  • Works with Japanese — It has a built-in CJK (Chinese, Japanese, Korean) tokenizer, so Japanese search works with no extra setup.
  • Build-time Only — It generates the index as files when deploying to Vercel, with no API calls during search.
  • Great for Multi-language Sites — My blog separates content into /ja/ and /en/, so it fits perfectly with the existing directory structure.
  • Fast — The initial payload is small, and the search index is split and lazy-loaded only when actually needed for a search.

Fuse.js and Flexsearch aren't bad options, but sending all the data to the client becomes a bottleneck as the number of articles grows. Algolia is functionally excellent, but managing the free tier conditions is a hassle, and it's simply too heavy for a personal blog's tech stack.

Setting Up with Next.js + Vercel

Installation

npm install pagefind

Build Script

I modified the build command in package.json.

{
  "scripts": {
    "build": "next build && pagefind --site .next/server/app --output-path public/_pagefind"
  }
}

After next build finishes, Pagefind scans .next/server/app (where the built HTML is output), generates the index, and writes it to public/_pagefind/. Vercel runs this command as part of its build, so after deployment the index is served as static files, with /_pagefind/pagefind.js as the entry point.

The core of the indexer is written in Rust and is extremely fast. For a personal blog of this size, indexing finishes in a few seconds, so it adds practically nothing to the build. Before adopting it, I worried that long indexing times would eat into Vercel's free build allowance, but that worry turned out to be unfounded.

Since public/_pagefind/ is regenerated with each build, I added it to .gitignore.

Controlling What Gets Indexed

If you don't specify anything, the entire <body> gets indexed, meaning text from sidebars and headers would show up in search results. To limit the indexing to just the article content, I added the data-pagefind-body attribute to the article's main <div>.

<div className="prose prose-invert max-w-none" data-pagefind-body>
  {parse(post.contentHtml, parserOptions)}
</div>

With this in place, Pagefind indexes only the elements that carry the data-pagefind-body attribute, and pages that have no such element are left out of the index entirely. When I actually built it, the index went from 82 files down to just 16 pages (article bodies only).

The Search Component

Pagefind comes with a default search UI, but I wanted to match my blog's theme, so I wrote my own component:

  1. Clicking the magnifying glass icon opens the search panel.
  2. On the first open, /_pagefind/pagefind.js is dynamically imported and loaded.
  3. Each keypress calls pagefind.search() and displays the results in a list.
  4. Pressing Escape closes the panel, and the / key toggles it.

// Dynamically load /_pagefind/pagefind.js at runtime
// It doesn't exist at build time, so pass the path via a variable
// to avoid TypeScript's static analysis errors.
const path = "/_pagefind/pagefind.js";
const pf = await import(/* webpackIgnore: true */ path);
await pf.init();

Since Pagefind's JS is generated after the build, TypeScript's static analysis throws a 'module not found' error. I'm bypassing this by putting the path into a variable and passing it to import().
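
Steps 3 and 4 of the list above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the component this blog actually ships: PagefindLike, SearchHit, searchPosts, and nextPanelState are hypothetical names, and the interface covers only the fields used here (search() returning results whose data() resolves to url, excerpt, and meta.title). Ignoring "/" while the user is typing in an input is also an assumption.

```typescript
// Minimal shape of the parts of the Pagefind module used in this sketch.
// (An assumption for illustration, not the library's full typings.)
interface PagefindResultData {
  url: string;
  excerpt: string;
  meta: { title?: string };
}

interface PagefindLike {
  search(query: string): Promise<{
    results: { id: string; data(): Promise<PagefindResultData> }[];
  }>;
}

// What the result list needs in order to render one row (hypothetical type).
interface SearchHit {
  url: string;
  title: string;
  excerpt: string;
}

// Step 3: run a query and load the data for the first `limit` results only.
// Each result's content is fetched lazily through data(), so capping the
// number of results keeps the per-keypress work small.
async function searchPosts(
  pf: PagefindLike,
  query: string,
  limit = 5
): Promise<SearchHit[]> {
  const trimmed = query.trim();
  if (trimmed === "") return []; // nothing typed yet, nothing to show

  const { results } = await pf.search(trimmed);
  const loaded = await Promise.all(
    results.slice(0, limit).map((r) => r.data())
  );

  return loaded.map((d) => ({
    url: d.url,
    title: d.meta.title ?? d.url, // fall back to the URL if no title was indexed
    excerpt: d.excerpt,
  }));
}

// Step 4: compute the panel's next open/closed state from a key press.
// Escape always closes; "/" toggles unless the user is typing in an input.
function nextPanelState(
  open: boolean,
  key: string,
  typingInInput: boolean
): boolean {
  if (key === "Escape") return false;
  if (key === "/" && !typingInInput) return !open;
  return open;
}
```

In a React component, searchPosts would be called from the input's change handler with the module returned by the dynamic import, and nextPanelState from a keydown listener on the document. Keeping both free of DOM access makes them easy to test without a browser.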

Local Development Caveats

Since Pagefind builds its index from built HTML, search won't work with next dev (the development server). To test search in your local environment, you need to:

npm run build   # next build + pagefind to generate the index
npm start       # Start the production build on localhost:3000

My workflow is to use next dev for everything except search, and then run build → start just to check the search functionality.
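
When /_pagefind/pagefind.js is missing (for example under next dev before any build has run), the dynamic import rejects. One way to keep the search component from crashing in that case is to catch the failure and fall back to a stub that returns no results. The sketch below is an assumption about how this could be handled, not what this blog's component does; SearchModule, loadSearchModule, and the importer parameter are hypothetical names.

```typescript
// Only the single call this sketch needs from the search module.
// (Hypothetical minimal type, not Pagefind's own typings.)
interface SearchModule {
  search(query: string): Promise<{ results: unknown[] }>;
}

// Stand-in used when the real module cannot be loaded: every query
// resolves to an empty result list.
const emptySearch: SearchModule = {
  search: async () => ({ results: [] }),
};

// Try to load the search module through the supplied importer.
// If loading fails, return the stub and flag search as unavailable so the
// UI can show a hint such as "run npm run build to enable search".
async function loadSearchModule(
  importer: () => Promise<SearchModule>
): Promise<{ module: SearchModule; available: boolean }> {
  try {
    return { module: await importer(), available: true };
  } catch {
    return { module: emptySearch, available: false };
  }
}
```

In the component, the importer would wrap the same dynamic import shown earlier, i.e. loadSearchModule(() => import(/* webpackIgnore: true */ path)). Passing the importer in as a function keeps the fallback logic testable without a real index.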

Using It with Japanese

Common Japanese search terms such as 翻訳 (translation), 対応 (response), and 対策 (countermeasure) work without issues. However, I noticed that some katakana words produce unexpected results. Digging into the cause led me down a rabbit hole into how Pagefind handles Japanese internally, so I wrote it up in a separate post.

Wrapping Up

  • Pagefind is the simplest option for adding search to static sites.
  • Works by just adding one line to your build command.
  • Japanese is supported (though there are some structural accuracy limitations).
  • JS is ~6KB + lazy loading, so virtually no performance impact.
  • Works great with Vercel, no extra costs.

Bonus: CloudCannon, the Company Behind Pagefind

CloudCannon, the company that created Pagefind, mainly sells a Git-based CMS of the same name. It's highly compatible with Next.js, and its powerful WYSIWYG editing features are particularly well-regarded. That product is a paid service, though, and even the Standard plan at $49/month (USD) wasn't quite the right price point for a personal blog.

So I'm glad the OSS version of Pagefind is free.