Programming-Idioms

Set b to true if the byte sequence s is a well-formed UTF-8 encoding of Unicode code points, and to false otherwise.
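To make "valid" concrete, here is a minimal Python sketch that runs a strict decode over a few sample inputs. The helper name is_valid_utf8 and the sample table are illustrative assumptions, not part of the idiom itself.

```python
def is_valid_utf8(s: bytes) -> bool:
    """Hypothetical helper: True if s decodes as strict UTF-8."""
    try:
        s.decode("utf-8")  # error handling defaults to 'strict'
        return True
    except UnicodeDecodeError:
        return False


# Illustrative inputs (assumed for this sketch).
samples = {
    "ascii": b"hello",                  # valid
    "two-byte": "é".encode("utf-8"),    # valid: C3 A9
    "empty": b"",                       # valid: nothing to reject
    "truncated": b"\xc3",               # invalid: lead byte without continuation
    "overlong": b"\xc0\xaf",            # invalid: overlong encoding of '/'
    "surrogate": b"\xed\xa0\x80",       # invalid: encodes U+D800
    "stray continuation": b"\x80",      # invalid: continuation byte alone
}

for name, data in samples.items():
    print(name, is_valid_utf8(data))
```

Truncated sequences, overlong forms, and encoded surrogates are the cases where a lenient decoder and a strict one are most likely to disagree.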

Other implementations
C#
using System.Text;
var encoding = new UTF8Encoding(false, true);
bool b;
try
{
    encoding.GetCharCount(s);
    b = true;
}
catch (DecoderFallbackException)
{
    b = false;
}
Go
import "unicode/utf8"
b := utf8.Valid(s)
Groovy
import java.nio.ByteBuffer
import java.nio.charset.CharacterCodingException

import static java.nio.charset.StandardCharsets.UTF_8
final decoder = UTF_8.newDecoder()
final buffer = ByteBuffer.wrap(s)
try {
    decoder.decode(buffer)
    b = true
} catch (CharacterCodingException e) {
    b = false
}
Pascal
uses LazUtf8;
b := FindInvalidUTF8Codepoint(PChar(s), Length(s)) = -1;
Perl
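use Encode ();
# utf8::is_utf8 only reports Perl's internal string flag, so decode strictly instead.
# LEAVE_SRC keeps $s unmodified; FB_CROAK makes malformed input die inside the eval.
$b = eval { Encode::decode('UTF-8', $s, Encode::FB_CROAK | Encode::LEAVE_SRC); 1 } ? 1 : 0;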
Python
try:
    s.decode('utf8')
    b = True
except UnicodeError:
    b = False
Ruby
# dup avoids changing the encoding of s itself (and works on frozen strings)
b = s.dup.force_encoding("UTF-8").valid_encoding?
Rust
let b = std::str::from_utf8(&s).is_ok();
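The implementations above all delegate to a library decoder. As a rough sketch of what such a check involves, the Python function below walks the bytes by hand, following the well-formed byte ranges from the Unicode Standard (Table 3-7). The name utf8_valid is an assumption for illustration; in practice the library call is the better choice.

```python
def utf8_valid(s: bytes) -> bool:
    """Hand-written well-formedness check (illustrative sketch, not a library API)."""
    i, n = 0, len(s)
    while i < n:
        b0 = s[i]
        if b0 < 0x80:                      # ASCII: single byte
            i += 1
            continue
        # need = number of continuation bytes; lo..hi = allowed range
        # for the FIRST continuation byte (rules out overlongs, surrogates,
        # and code points above U+10FFFF).
        if 0xC2 <= b0 <= 0xDF:
            need, lo, hi = 1, 0x80, 0xBF
        elif b0 == 0xE0:
            need, lo, hi = 2, 0xA0, 0xBF   # excludes overlong 3-byte forms
        elif b0 == 0xED:
            need, lo, hi = 2, 0x80, 0x9F   # excludes surrogates U+D800..U+DFFF
        elif 0xE1 <= b0 <= 0xEF:
            need, lo, hi = 2, 0x80, 0xBF
        elif b0 == 0xF0:
            need, lo, hi = 3, 0x90, 0xBF   # excludes overlong 4-byte forms
        elif 0xF1 <= b0 <= 0xF3:
            need, lo, hi = 3, 0x80, 0xBF
        elif b0 == 0xF4:
            need, lo, hi = 3, 0x80, 0x8F   # caps at U+10FFFF
        else:
            return False                   # 0x80..0xC1 and 0xF5..0xFF never lead
        if i + need >= n:                  # sequence is truncated
            return False
        if not lo <= s[i + 1] <= hi:
            return False
        for j in range(i + 2, i + need + 1):
            if not 0x80 <= s[j] <= 0xBF:   # remaining continuation bytes
                return False
        i += need + 1
    return True


b = utf8_valid("naïve 🙂".encode("utf-8"))
```

The special cases for lead bytes E0, ED, F0, and F4 are what separate a strict validator from one that only checks the 10xxxxxx continuation pattern.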