Programming-Idioms

Idiom #231 Test if bytes are a valid UTF-8 string

Set b to true if the byte sequence s is a well-formed UTF-8 encoding (every byte belongs to a validly encoded code point), false otherwise.

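using System.Text;

// Strict decoder: no BOM on encode, throw on invalid byte sequences.
var encoding = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false, throwOnInvalidBytes: true);
bool b;
try
{
    // Walks the whole byte array without allocating the decoded string.
    encoding.GetCharCount(s);
    b = true;
}
catch (DecoderFallbackException)
{
    // Thrown at the first malformed sequence.
    b = false;
}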

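By default, .NET encodings use replacement fallback, silently substituting a replacement character for invalid bytes. Passing true as the second argument (throwOnInvalidBytes) of the UTF8Encoding constructor selects exception fallback instead, so invalid input raises a DecoderFallbackException.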
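
As a minimal sketch, the check above can be wrapped in a reusable method and tried on one well-formed and one malformed input. The names Utf8Check and IsValidUtf8 and the sample byte arrays are illustrative assumptions, not part of the original snippet.

```csharp
using System;
using System.Text;

static class Utf8Check
{
    // Hypothetical helper: returns true when s decodes as strict UTF-8.
    public static bool IsValidUtf8(byte[] s)
    {
        var encoding = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false, throwOnInvalidBytes: true);
        try
        {
            encoding.GetCharCount(s);
            return true;
        }
        catch (DecoderFallbackException)
        {
            return false;
        }
    }

    static void Main()
    {
        byte[] valid = { 0x68, 0xC3, 0xA9 };  // "h" followed by U+00E9 encoded as C3 A9
        byte[] invalid = { 0xC3, 0x28 };      // lead byte C3 followed by a non-continuation byte

        Console.WriteLine(IsValidUtf8(valid));    // True
        Console.WriteLine(IsValidUtf8(invalid));  // False
    }
}
```

A null argument is not treated as invalid UTF-8 here: GetCharCount throws ArgumentNullException, which this sketch deliberately leaves uncaught.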
