Archive: 2006-06-20

  • Fixing invalid UTF-8 in Ruby, revisited

    When working with UTF-8-encoded text from an untrusted source such as a web form, it’s a good idea to fix any invalid byte sequences as the very first step, so that later processing steps that depend on valid input don’t break.

    More …
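
As a rough illustration of the idea in the teaser above, here is a minimal sketch of cleaning untrusted input before anything else touches it. It assumes Ruby 2.1 or later, where `String#scrub` is available; that method postdates this 2006 post, so this is not the approach the linked article describes. The helper name `fix_utf8` and the sample input are hypothetical.

```ruby
# Hypothetical helper (not from the original article): returns a copy of
# `input` in which every invalid UTF-8 byte sequence has been replaced.
def fix_utf8(input, replacement = "\uFFFD")
  # Work on a copy tagged as UTF-8 so the caller's string is left untouched.
  text = input.dup.force_encoding(Encoding::UTF_8)
  # String#scrub (Ruby 2.1+) swaps each invalid sequence for `replacement`.
  text.scrub(replacement)
end

# Sample input: the lone byte \xE9 (Latin-1 "e acute") is not valid UTF-8.
raw   = "caf\xE9 ok".b
clean = fix_utf8(raw)
```

Replacing bad bytes with U+FFFD keeps a visible trace of where the input was damaged; passing an empty string as `replacement` drops the bytes instead.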