365Dz
Dazy Dayz Diary // A Notebook of Hazy Days

% grep -r "tokenizer" /posts/

Tokenizer

  • 2026-04-13 pagefind-japanese-tokenizer

    Pagefind Japanese Search Deep Dive: Tokenizers and IME Fixes

    Why searching for "サイバー" (cyber) in Pagefind returns "サイドバー" (sidebar): digging into the tokenizer mismatch between the index side (lindera/IPAdic) and the query side (Intl.Segmenter), plus fixes for IME input issues.

1 post found

Tags

4DX, AI, Amazon Prime, Automation, Back-translation, Blog, Blumhouse, Claude, CMS, Crime, Drama, EIKEN, Embedding, Entertainment, Front Matter CMS, Gemini, Ghost in the Shell, GitHub Actions, GitHub App, Heading anchors, IME, IndexNow, Intl.Segmenter, Jamstack, Keystatic, Lindera, Localization, Markdoc, Markdown, Movie, Mystery, News, Next.js, Pagefind, PostgreSQL, Project Hail Mary, Prompt Engineering, Recommendation, remark/rehype, Scarpetta, Sci-Fi, Search, Spoilers, Supabase, Tachikoma, Tech, Tokenizer, Tokuryū, Translation, Vercel
Dazy Dayz

Daily Life and a Translation Lab

© 2026 Dazy Dayz