nonempty, it points to a dynamically allocated block of memory that contains the string value. The eight bytes before
the location contain a 32-bit length indicator and a 32-bit reference count. This memory is allocated on the heap, but
its management is entirely automatic and requires no user code.
Because long-string variables are pointers, two or more of them can reference the same value without consuming
additional memory. The compiler exploits this to conserve resources and execute assignments faster. Whenever a
long-string variable is destroyed or assigned a new value, the reference count of the old string (the variable's previous
value) is decremented and the reference count of the new value (if there is one) is incremented; if the reference
count of a string reaches zero, its memory is deallocated. This process is called reference-counting. When indexing
is used to change the value of a single character in a string, a copy of the string is made if, and only if, its reference
count is greater than one. This is called copy-on-write semantics.
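The sharing and copy-on-write behavior described above can be observed by comparing the underlying pointers of two variables. The following is a minimal sketch, not part of the original text; the program name and variable names are illustrative. It calls UniqueString first on the assumption that a string literal is not reference-counted in the usual way, so the demonstration starts from a heap-allocated value.

```delphi
program CopyOnWriteSketch;
{$APPTYPE CONSOLE}

var
  A, B: AnsiString;
begin
  A := 'hello';
  UniqueString(A);   // make sure A owns a heap block with reference count 1

  B := A;            // no character data is copied; both variables share one block
  Writeln(Pointer(A) = Pointer(B));   // TRUE

  B[1] := 'J';       // the block is shared, so B receives a private copy first
  Writeln(Pointer(A) = Pointer(B));   // FALSE
  Writeln(A);        // hello
  Writeln(B);        // Jello
end.
```

Casting a long string to Pointer only exposes the address of its character data; it does not change the reference count.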
WideString
The WideString type represents a dynamically allocated string of 16-bit Unicode characters. In most respects it is
similar to AnsiString. On Win32, WideString is compatible with the COM BSTR type.
Note: Under Win32, WideString values are not reference-counted.
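One practical consequence of the note above is that assigning one WideString to another copies the character data instead of sharing it. A minimal sketch, assuming a Win32 target; the program and variable names are illustrative:

```delphi
program WideStringCopySketch;
{$APPTYPE CONSOLE}

var
  W1, W2: WideString;
begin
  W1 := 'hello';
  W2 := W1;   // on Win32 a second block is allocated and the characters are copied
  Writeln(Pointer(W1) = Pointer(W2));   // FALSE on Win32: two separate blocks
  Writeln(W1 = W2);                     // TRUE: the contents are equal
end.
```

Because every assignment copies, passing WideString values by const or var reference where possible avoids unnecessary allocations.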
The Win32 platform supports single-byte and multibyte character sets as well as Unicode. With a single-byte
character set (SBCS), each byte in a string represents one character.
In a multibyte character set (MBCS), some characters are represented by one byte and others by more than one
byte. The first byte of a multibyte character is called the lead byte. In general, the lower 128 characters of a multibyte
character set map to the 7-bit ASCII characters, and any byte whose ordinal value is greater than 127 is the lead
byte of a multibyte character. The null value (#0) is always a single-byte character. Multibyte character sets,
especially double-byte character sets (DBCS), are widely used for Asian languages.
In the Unicode character set, as encoded in UCS-2, each character is represented by two bytes. Thus a Unicode string is a sequence not
of individual bytes but of two-byte words. Unicode characters and strings are also called wide characters and wide
character strings. The first 256 Unicode characters map to the ANSI character set. The Windows operating system
supports Unicode (UCS-2).
The Delphi language supports single-byte and multibyte characters and strings through the Char, PChar, AnsiChar,
PAnsiChar, and AnsiString types. Indexing of multibyte strings is not reliable, since S[i] represents the ith byte (not
necessarily the ith character) in S. However, the standard string-handling functions have multibyte-enabled
counterparts that also implement locale-specific ordering for characters. (Names of multibyte functions usually start
with Ansi-. For example, the multibyte version of StrPos is AnsiStrPos.) Multibyte character support is
operating-system dependent and based on the current locale.
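Since S[i] addresses bytes, counting characters in a multibyte string means skipping trail bytes. The sketch below uses the ByteType function and the mbTrailByte constant from SysUtils; the helper name MbcsCharCount is hypothetical, and the code assumes a Delphi version in which string is an AnsiString interpreted in the current system locale.

```delphi
program MbcsCountSketch;
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: counts characters rather than bytes by ignoring
// every byte that ByteType classifies as the trail byte of a multibyte character.
function MbcsCharCount(const S: string): Integer;
var
  I: Integer;
begin
  Result := 0;
  for I := 1 to Length(S) do
    if ByteType(S, I) <> mbTrailByte then
      Inc(Result);
end;

begin
  // Pure ASCII text contains no trail bytes, so the count equals Length.
  Writeln(MbcsCharCount('Delphi'));   // 6
end.
```

For text that contains double-byte characters, the result depends on the active locale, which is why no fixed output is shown for that case.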
Delphi supports Unicode characters and strings through the WideChar, PWideChar, and WideString types.
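As a small illustration of the wide types, the following sketch prints the size of a WideChar and the storage taken by the characters of a WideString. The program and variable names are illustrative only.

```delphi
program WideCharSizeSketch;
{$APPTYPE CONSOLE}

var
  W: WideString;
  P: PWideChar;
begin
  W := 'Delphi';
  P := PWideChar(W);                       // pointer to the first wide character
  Writeln(SizeOf(WideChar));               // 2
  Writeln(Length(W));                      // 6 (characters, not bytes)
  Writeln(Length(W) * SizeOf(WideChar));   // 12 (bytes of character data)
  Writeln(P[0] = W[1]);                    // TRUE: PWideChar is zero-based, WideString is one-based
end.
```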