D a t a s t r u c t u r e s a n d a L g o r I t h m s Annotated Reference with Examples

Yüklə 1,04 Mb.

Pdf görüntüsü

səhifə	21/27
tarix	24.06.2022
ölçüsü	1,04 Mb.
	#62232

1 ... 17 18 19 20 21 22 23 24 ... 27

shu kerak

IsPalindrome algorithm in operation on the string “Was it Eliot’s toilet I saw?”
If you remove all punctuation, and white space from the aforementioned string
you will find that it is a valid palindrome.

CHAPTER 11. STRINGS
81
Figure 11.1: lef t and right pointers marching in towards one another
1) algorithm IsPalindrome(value)
2)
Pre:
value 6= ∅
3)
Post: value is determined to be a palindrome or not
4)
word ← value.Strip().ToUpperCase()
5)
lef t ← 0
6)
right ← word.Length −1
7)
while word[lef t] = word[right] and lef t < right
8)
lef t ← lef t + 1
9)
right ← right − 1
10)
end while
11)
return word[lef t] = word[right]
12) end IsPalindrome
In the IsPalindrome algorithm we call a method by the name of Strip. This
algorithm discards punctuation in the string, including white space. As a result
word contains a heavily compacted representation of the original string, each
character of which is in its uppercase representation.
Palindromes discard white space, punctuation, and case making these changes
allows us to design a simple algorithm while making our algorithm fairly robust
with respect to the palindromes it will detect.
11.3
Counting the number of words in a string
Counting the number of words in a string can seem pretty trivial at first, however
there are a few cases that we need to be aware of:
1.
tracking when we are in a string
2.
updating the word count at the correct place
3.
skipping white space that delimits the words
As an example consider the string “Ben ate hay” Clearly this string contains
three words, each of which distinguished via white space. All of the previously
listed points can be managed by using three variables:
1.
index
2.
wordCount
3.
inW ord

CHAPTER 11. STRINGS
82
Figure 11.2: String with three words
Figure 11.3: String with varying number of white space delimiting the words
Of the previously listed index keeps track of the current index we are at in
the string, wordCount is an integer that keeps track of the number of words we
have encountered, and finally inW ord is a Boolean flag that denotes whether
or not at the present time we are within a word. If we are not currently hitting
white space we are in a word, the opposite is true if at the present index we are
hitting white space.
What denotes a word? In our algorithm each word is separated by one or
more occurrences of white space. We don’t take into account any particular
splitting symbols you may use, e.g. in .NET String.Split
1
can take a char (or
array of characters) that determines a delimiter to use to split the characters
within the string into chunks of strings, resulting in an array of sub-strings.
In Figure 11.2 we present a string indexed as an array. Typically the pattern
is the same for most words, delimited by a single occurrence of white space.
Figure 11.3 shows the same string, with the same number of words but with
varying white space splitting them.
1
http://msdn.microsoft.com/en-us/library/system.string.split.aspx

CHAPTER 11. STRINGS
83
1) algorithm WordCount(value)
2)
Pre:
value 6= ∅
3)
Post: the number of words contained within value is determined
4)
inW ord ← true
5)
wordCount ← 0
6)
index ← 0
7)
// skip initial white space
8)
while value[index] = whitespace and index < value.Length −1
9)
index ← index + 1
10)
end while
11)
// was the string just whitespace?
12)
if index = value.Length and value[index] = whitespace
13)
return 0
14)
end if
15)
while index < value.Length
16)
if value[index] = whitespace
17)
// skip all whitespace
18)
while value[index] = whitespace and index < value.Length −1
19)
index ← index + 1
20)
end while
21)
inW ord ← f alse
22)
wordCount ← wordCount + 1
23)
else
24)
inW ord ← true
25)
end if
26)
index ← index + 1
27)
end while
28)
// last word may have not been followed by whitespace
29)
if inW ord
30)
wordCount ← wordCount + 1
31)
end if
32)
return wordCount
33) end WordCount
11.4
Determining the number of repeated words
within a string
With the help of an unordered set, and an algorithm that can split the words
within a string using a specified delimiter this algorithm is straightforward to
implement. If we split all the words using a single occurrence of white space
as our delimiter we get all the words within the string back as elements of
an array. Then if we iterate through these words adding them to a set which
contains only unique strings we can attain the number of unique words from the
string. All that is left to do is subtract the unique word count from the total
number of stings contained in the array returned from the split operation. The
split operation that we refer to is the same as that mentioned in §11.3.

CHAPTER 11. STRINGS
84
Figure 11.4: a) Undesired uniques set; b) desired uniques set
1) algorithm RepeatedWordCount(value)
2)
Pre:
value 6= ∅
3)
Post: the number of repeated words in value is returned
4)
words ← value.Split(’ ’)
5)
uniques ← Set
6)
foreach word in words
7)
uniques.Add(word.Strip())
8)
end foreach
9)
return words.Length −uniques.Count
10) end RepeatedWordCount
You will notice in the RepeatedWordCount algorithm that we use the Strip
method we referred to earlier in

Yüklə 1,04 Mb.

Dostları ilə paylaş:

1 ... 17 18 19 20 21 22 23 24 ... 27