Programming-Idioms

  • Pascal
  • VB
  • Lua

Idiom #169 String length

Assign to the integer n the number of characters of the string s.
Make sure that multibyte characters are properly handled.
n can be different from the number of bytes of s.

local n = utf8.len(s)

The utf8 library was introduced in Lua 5.3. utf8.len counts UTF-8 code points, not bytes.
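As a quick illustration of how n can differ from the number of bytes, here is a minimal sketch assuming Lua 5.3 or later (for utf8.len and the \u{...} escape). The sample strings are made up for the example.

```lua
-- Assumption: Lua 5.3+, where the standard utf8 library exists.
local s = "h\u{E9}llo"      -- "héllo"; U+00E9 takes 2 bytes in UTF-8
local n = utf8.len(s)       -- number of code points
local bytes = #s            -- number of bytes
print(n, bytes)             -- 5       6

-- A base letter plus a combining accent is two code points,
-- even though it is displayed as a single character.
local combined = "e\u{301}" -- "e" followed by U+0301 (combining acute accent)
print(utf8.len(combined))   -- 2
```

Note that utf8.len counts code points, so strings written with combining characters give a larger n than the number of characters a reader sees.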
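local utf8={}
-- Return the byte length of the UTF-8 sequence whose lead byte is at index
-- (0 when index is past the end of the string).
function utf8.bytes(str,index)
 local byte=string.byte(str,index)
 local ret
 if     byte==nil then ret=0
 elseif byte<128  then ret=1 -- ASCII
 elseif byte<192  then ret=1 -- stray continuation byte, counted on its own
 elseif byte<224  then ret=2
 elseif byte<240  then ret=3
 else                  ret=4
 end
 return ret
end
-- Count the UTF-8 sequences (code points) in str.
function utf8.len(str)
 local count=0
 local fini=#str+1
 local index=1
 -- "<" rather than "~=" so a truncated final sequence cannot loop forever
 while index<fini do
  count=count+1
  index=index+utf8.bytes(str,index)
 end
 return count
end
local n=utf8.len(s)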

Use this for Lua versions before 5.3, which do not have the utf8 library.
uses sysutils;
n := s.length;

Works correctly if s is of type UnicodeString and the string does not use combining characters. UnicodeString is UTF-16, so characters outside the Basic Multilingual Plane count as two.
n := length(s);

For single-byte encodings or WideString, using plain Pascal.
uses LazUtf8;
n := Utf8Length(s);

For UTF-8 encoded strings, as used in Lazarus.
Dim n As Integer = s.Length

n : Integer := s'Length;
