
Remove diacritics (Umlauts, Accents, Special characters) in JavaScript

So I recently found myself generating permalinks in JavaScript again, which can be both fun and painful. It seems to be less painful if you just take anything that’s not [a-zA-Z0-9] and replace it with a hyphen (-).

However, this starts looking ugly rather quickly if you’re from Germany or France, for instance, where umlauts and accents are very common. Something really nice like
J'ai montré les éléphants à ma sœur
becomes something really ugly like
j-ai-montr-les-l-phants-ma-s-ur.
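To make the problem concrete, here is a minimal sketch of that naive approach. The function name `naiveSlug` is made up for illustration, and collapsing runs of non-matching characters into a single hyphen is an assumption on my part; it is what reproduces the output shown above.

```javascript
// Naive permalink generation: lowercase the text, then replace every run of
// characters outside [a-z0-9] with a single hyphen.
// `naiveSlug` is a hypothetical name used only for this illustration.
function naiveSlug(text) {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // accented letters count as "not allowed"
    .replace(/^-+|-+$/g, '');    // trim hyphens at both ends
}

console.log(naiveSlug("J'ai montré les éléphants à ma sœur"));
// → j-ai-montr-les-l-phants-ma-s-ur
```

The accented letters are simply dropped along with the spaces, which is why whole syllables vanish from the result.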

So as Holger pointed out, I needed a diacritics table, which I found here. After some modifications for the German language (e.g. ä -> ae, ß -> ss), I came up with this.

It’s still heavily based on what lehel built, so thank him, not me. I just wanted to put my improved version here, so I don’t forget it.
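As a rough illustration of the idea (this is neither lehel's table nor my modified version, just a minimal sketch), the snippet below combines a small explicit table for the German and French special cases with Unicode normalization for everything else. Note that `String.prototype.normalize` is a swapped-in technique that replaces the large hand-written table, and the names `SPECIAL_CASES`, `removeDiacritics` and `slugify` are assumptions made up for this example.

```javascript
// Hypothetical sketch: explicit replacements for characters that should not
// simply lose their accent (e.g. ä -> ae, ß -> ss). This small table is an
// assumption for illustration, not the full table referred to in the post.
const SPECIAL_CASES = {
  'ä': 'ae', 'ö': 'oe', 'ü': 'ue',
  'Ä': 'Ae', 'Ö': 'Oe', 'Ü': 'Ue',
  'ß': 'ss',
  'œ': 'oe', 'Œ': 'Oe',
  'æ': 'ae', 'Æ': 'Ae',
};

function removeDiacritics(text) {
  return text
    .normalize('NFC')                    // compose first so 'a' + U+0308 becomes 'ä'
    .replace(/[äöüÄÖÜßœŒæÆ]/g, (ch) => SPECIAL_CASES[ch])
    .normalize('NFD')                    // split remaining letters from their accents
    .replace(/[\u0300-\u036f]/g, '');    // drop the combining marks
}

function slugify(text) {
  return removeDiacritics(text)
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '');
}

console.log(slugify("J'ai montré les éléphants à ma sœur"));
// → j-ai-montre-les-elephants-a-ma-soeur
console.log(slugify('Größe für Mädchen'));
// → groesse-fuer-maedchen
```

The explicit table runs before the generic accent stripping on purpose: otherwise ä would be reduced to a plain a instead of ae. Characters that are neither in the table nor decomposable (ø or ł, for example) would still end up as hyphens, which is where a fuller table earns its keep.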

Update: I have created a Gist for this over at GitHub so we can continue to update it there…

2 Comments

  • Sep 22nd 2011, 16:09
    by Silvan T. Golega

    Thanks for sharing. This could be very useful indeed. Would be interesting to know what languages are covered to make it easier to extend it.

  • Oct 13th 2011, 12:10
    by Jan Schulz-Hofen

    Unfortunately, I don’t know. German and French seem to be covered.
