PHP and MySQL Web Development

Matching and Replacing Substrings with String Functions
Finding Strings in Strings: strstr(), strchr(), strrchr(), stristr()
To find a string within another string you can use any of the functions strstr(),
strchr(), strrchr(), or stristr().
The function strstr() is the most generic, and can be used to find a string or
character match within a longer string. Note that in PHP, the strchr() function is exactly
the same as strstr(), although its name implies that it is used to find a character in a
string, similar to the C version of this function. In PHP, either of these functions can be
used to find a string inside a string, including finding a string containing only a single
character.
The prototype for strstr() is as follows:
string strstr(string haystack, string needle);
You pass the function a haystack to be searched and a needle to be found. If an exact
match of the needle is found, the function returns the haystack from the
needle onward, otherwise it returns false. If the needle occurs more than once, the
returned string will start from the first occurrence of needle.
For example, in the Smart Form application, we can decide where to send the email
as follows:
$toaddress = ''; // the default value
// Change the $toaddress if the criteria are met
if (strstr($feedback, 'shop'))
  $toaddress = '';
else if (strstr($feedback, 'delivery'))
  $toaddress = '';
else if (strstr($feedback, 'bill'))
  $toaddress = '';
This code checks for certain keywords in the feedback and sends the mail to the
appropriate person. If, for example, the customer feedback reads “I still haven’t received
delivery of my last order,” the string “delivery” will be detected and the feedback will be
sent to the address assigned in the delivery branch.
There are two variants on strstr(). The first variant is stristr(), which is nearly
identical but is not case sensitive. This will be useful for this application as the customer
might type 'delivery', 'Delivery', or 'DELIVERY'.
The second variant is strrchr(), which is again nearly identical, but will return the
haystack from the last occurrence of the needle onward. Note that strrchr() uses only the
first character of the needle, so it finds the last occurrence of a character, not of a string.
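As a minimal sketch of how the three functions differ, the following compares their return values on the same input. The feedback text is made up for illustration and is not taken from the Smart Form application.

```php
<?php
// Hypothetical feedback text, used only for illustration.
$feedback = 'Delivery was late. I want a refund for delivery charges.';

// strstr(): case sensitive, returns the haystack from the FIRST match onward.
// The capitalized 'Delivery' at the start is skipped.
$first = strstr($feedback, 'delivery');    // 'delivery charges.'

// stristr(): same idea, but case is ignored, so the leading 'Delivery' matches
// and the whole string is returned.
$anyCase = stristr($feedback, 'delivery');

// strrchr(): returns the haystack from the LAST occurrence of a single
// character onward. Only the first character of the needle is used.
$last = strrchr($feedback, ' ');           // ' charges.'

// All three return false when there is no match.
$missing = strstr($feedback, 'invoice');   // false

var_dump($first, $anyCase, $last, $missing);
```

Because a successful match returns a non-empty string, these functions can be used directly in an if condition, as in the Smart Form code above.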
Finding the Position of a Substring: strpos(), strrpos()
The functions strpos() and strrpos() operate in a similar fashion to strstr(),
except, instead of returning a substring, they return the numerical position of a needle
within a haystack.
Chapter 4 String Manipulation and Regular Expressions
The strpos() function has the following prototype:
int strpos(string haystack, string needle, int [offset] );
The integer returned represents the position of the first occurrence of the needle within
the haystack. The first character is in position 0 as usual.
For example, the following code will echo the value 4 to the browser:
$test = 'Hello world';
echo strpos($test, 'o');
In this case, we have only passed in a single character as the needle, but it can be a string
of any length.
The optional offset parameter is used to specify a point within the haystack to start
searching. For example,
echo strpos($test, 'o', 5);
This code will echo the value 7 to the browser because PHP has started looking for the
character o at position 5, and therefore does not see the one at position 4.
The strrpos() function is almost identical, but will return the position of the last
occurrence of the needle in the haystack. In PHP 4, unlike strpos(), it works only with a
single-character needle: if you pass it a longer string as a needle, it uses only the first
character of the string to match. As of PHP 5, the whole needle is matched, as with strpos().
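The following sketch contrasts strpos() and strrpos() on the same test string. The comment on the multi-character needle assumes PHP 5 or later, where the whole needle is matched.

```php
<?php
$test = 'Hello world';

// First and last occurrence of a single character.
$firstO = strpos($test, 'o');    // 4
$lastO  = strrpos($test, 'o');   // 7

// Multi-character needle. PHP 5 and later match the whole needle 'lo',
// which occurs only once, at position 3. PHP 4 would have looked for the
// last 'l' alone and reported position 9 instead.
$lastLo = strrpos($test, 'lo');  // 3 (PHP 5+)

echo $firstO, ' ', $lastO, ' ', $lastLo, "\n";
```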
In any of these cases, if the needle is not in the string, strpos() or strrpos() will
return false. This can be problematic because, in a weakly typed language such as PHP,
false compares equal to 0 under the == operator, and 0 is also the position of the first
character in a string.
You can avoid this problem by using the === operator to test return values:
$result = strpos($test, 'H');
if ($result === false)
  echo 'Not found';
else
  echo 'Found at position '.$result;
Note that this works only in PHP 4 and later, where the === operator is available. In
earlier versions you can test for false by testing the return value to see if it is a
string (that is, false).
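To make the difference between the two comparisons concrete, here is a small sketch. The helper name describePosition() is hypothetical, and the type declarations assume PHP 7 or later.

```php
<?php
// describePosition() is a hypothetical helper, not part of PHP.
function describePosition(string $haystack, string $needle): string
{
    $result = strpos($haystack, $needle);

    // Strict comparison: only a real boolean false means "not found".
    if ($result === false) {
        return 'Not found';
    }
    return 'Found at position ' . $result;
}

echo describePosition('Hello world', 'H'), "\n"; // Found at position 0
echo describePosition('Hello world', 'z'), "\n"; // Not found

// The pitfall: with ==, a match at position 0 is indistinguishable from false.
$looseLooksMissing = (strpos('Hello world', 'H') == false); // true
var_dump($looseLooksMissing);
```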
Replacing Substrings: str_replace(), substr_replace()
Find-and-replace functionality can be extremely useful with strings. We have used find
and replace in the past for personalizing documents generated by PHP, for example by
replacing <<name>> with a person’s name and <<address>> with their address. You can
also use it for censoring particular terms, such as in a discussion forum application, or
even in the Smart Form application.
Again, you can use string functions or regular expression functions for this purpose.
The most commonly used string function for replacement is str_replace(). It has
the following prototype:
mixed str_replace(mixed needle, mixed new_needle, mixed haystack);
This function will replace all the instances of needle in haystack with new_needle and
return the new version of the haystack.
Note
As of PHP 4.0.5 you can pass all parameters as arrays and the function will work remarkably intelligently.
You can pass an array of words to be replaced, an array of words to replace them with (respectively), and an
array of strings to apply these rules to. The function will then return an array of revised strings.
For example, because people can use the Smart Form to complain, they might use some
colorful words. As programmers, we can prevent Bob’s various departments from being
abused in that way:
$feedback = str_replace($offcolor, '%!@*', $feedback);
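The sketch below shows the array forms of str_replace() in use. The contents of $offcolor are not shown in the text, so the word list here is a harmless placeholder, and the name and address values are likewise invented for illustration.

```php
<?php
// Placeholder word list; the real contents of $offcolor are an assumption.
$offcolor = array('darn', 'heck');
$feedback = 'This darn form is a heck of a mess.';

// Array needle, single replacement: every listed word is replaced by the
// same string.
$clean = str_replace($offcolor, '%!@*', $feedback);
echo $clean, "\n"; // This %!@* form is a %!@* of a mess.

// Array needle, array replacement: items are paired by position, which
// suits the <<name>> and <<address>> personalization described earlier.
$template = 'Dear <<name>>, we will ship to <<address>>.';
$letter = str_replace(
    array('<<name>>', '<<address>>'),
    array('Ann', '12 High Street'),
    $template
);
echo $letter, "\n"; // Dear Ann, we will ship to 12 High Street.
```

Note that str_replace() is case sensitive and matches substrings, so a listed word is also replaced when it appears inside a longer word.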
The function substr_replace() is used to find and replace a particular substring of a
string based on its position. It has the following prototype:
string substr_replace(string string, string replacement, int start, int [length] );
This function will replace part of the string string with the string replacement. Which
part is replaced depends upon the values of the start and optional length parameters.
The start value represents an offset into the string where replacement should begin.
If it is 0 or positive, it is an offset from the beginning of the string; if it is negative, it is
an offset from the end of the string. For example, this line of code will replace the last
character in $test with "X":
$test = substr_replace($test, 'X', -1);
The length value is optional and represents the point at which PHP will stop replacing.
If you don’t supply this value, the string will be replaced from start to the end of the
string.
If length is zero, the replacement string will actually be inserted into the string
without overwriting the existing string.
A positive length represents the number of characters that you want replaced with
the new string.
A negative length represents the point at which you’d like to stop replacing
characters, counted from the end of the string.
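The four cases for start and length can be sketched as follows. The input strings are arbitrary examples chosen to make each result easy to read.

```php
<?php
$test = 'Hello world';

// No length: replace from start to the end of the string.
$lastChar = substr_replace($test, 'X', -1);          // 'Hello worlX'

// Positive length: replace exactly that many characters.
$firstChar = substr_replace($test, 'J', 0, 1);       // 'Jello world'

// Zero length: insert at the offset without overwriting anything.
$inserted = substr_replace($test, 'big ', 6, 0);     // 'Hello big world'

// Negative length: stop replacing that many characters before the end,
// so the final '!' is preserved here.
$keepEnd = substr_replace('Hello world!', 'there', 6, -1); // 'Hello there!'

echo $lastChar, "\n", $firstChar, "\n", $inserted, "\n", $keepEnd, "\n";
```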
Introduction to Regular Expressions
PHP supports two styles of regular expression syntax: POSIX and Perl. In PHP 4, the POSIX
style of regular expression is compiled into PHP by default, but you can use the Perl style by
compiling in the PCRE (Perl-compatible regular expression) library. (In later versions the
situation is reversed: the POSIX functions were deprecated in PHP 5.3 and removed in PHP 7,
and PCRE is the standard engine.) We’ll cover the simpler POSIX style, but if you’re already
a Perl programmer, or want to learn more about PCRE, read the online manual.
Note
POSIX regular expressions are easier to learn, but they are not binary-safe, and the Perl-compatible functions are often faster.
So far, all the pattern matching we’ve done has used the string functions. We have been
limited to exact match, or to exact substring match. If you want to do more complex
pattern matching, you should use regular expressions. Regular expressions are difficult to
grasp at first but can be extremely useful.
The Basics
A regular expression is a way of describing a pattern in a piece of text. The exact (or
literal) matches we’ve done so far are a form of regular expression. For example, earlier we
were searching for regular expression terms like "shop" and "delivery".
Matching regular expressions in PHP is more like a strstr() match than an equal
comparison because you are matching a string somewhere within another string. (It can
be anywhere within that string unless you specify otherwise.) For example, the string
"shop" matches the regular expression "shop". It also matches the regular expressions
"h", "ho", and so on.
We can use special characters to indicate a meta-meaning in addition to matching
characters exactly.
For example, with special characters you can indicate that a pattern must occur at the
start or end of a string, that part of a pattern can be repeated, or that characters in a
pattern must be of a particular type. You can also match on literal occurrences of special
characters. We’ll look at each of these.
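The substring-style matching described above can be sketched as follows. This example uses preg_match() with / delimiters rather than the POSIX functions, on the assumption that the code should run on current PHP versions, where the POSIX functions are no longer available. For literal patterns like these, the two styles behave the same way.

```php
<?php
$subject = 'shop';

// Each literal pattern matches if it occurs anywhere inside the subject.
$patterns = array('shop', 'h', 'ho', 'x');
$results = array();
foreach ($patterns as $pattern) {
    // preg_match() returns 1 on a match and 0 otherwise.
    $results[$pattern] = (preg_match('/' . $pattern . '/', $subject) === 1);
}
var_dump($results); // shop, h and ho match; x does not

// A special character changes where a match may occur: ^ anchors the
// pattern to the start of the string, so 'ho' no longer matches 'shop'.
$anchored = (preg_match('/^ho/', $subject) === 1); // false
var_dump($anchored);
```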
Character Sets and Classes
Using character sets immediately gives regular expressions more power than exact
matching expressions. Character sets can be used to match any character of a particular
type; they’re really a kind of wildcard.
First of all, you can use the . character as a wildcard for any other single character
except a new line (\n). For example, the regular expression
.at
matches the strings 'cat', 'sat', and 'mat', among others.
This kind of wildcard matching is often used for filename matching in operating
systems.
With regular expressions, however, you can be more specific about the type of
character you would like to match, and you can actually specify a set that a character must
belong to. In the previous example, the regular expression matches 'cat' and 'mat', but
also matches '#at'. If you want to limit this to a character between a and z, you can
specify it as follows:
[a-z]
Anything enclosed in the special square brace characters [ and ] is a character class—a
set of characters to which a matched character must belong. Note that the expression in
the square brackets matches only a single character.
You can list a set; for example
[aeiou]
means any vowel.
You can also describe a range, as we just did using the special hyphen character, or a
set of ranges:
[a-zA-Z]
This set of ranges stands for any alphabetic character in upper- or lowercase.
You can also use sets to specify that a character cannot be a member of a set. For
example,
[^a-z]
matches any character that is not between a and z. The caret symbol means not when it is
placed inside the square brackets. It has another meaning when used outside square
brackets, which we’ll look at in a minute.
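The wildcard, set, range, and negated-set patterns above can be tried out with a short sketch. It uses preg_match() with / delimiters, on the assumption that the code should run on current PHP versions. The helper matches() is hypothetical, and the type declarations assume PHP 7 or later.

```php
<?php
// matches() is a hypothetical convenience wrapper around preg_match().
function matches(string $pattern, string $subject): bool
{
    return preg_match($pattern, $subject) === 1;
}

$dotHash   = matches('/.at/', '#at');      // true: . accepts any character
$rangeHash = matches('/[a-z]at/', '#at');  // false: # is not between a and z
$rangeCat  = matches('/[a-z]at/', 'cat');  // true
$vowel     = matches('/[aeiou]/', 'rhythm'); // false: no vowel anywhere
$ranges    = matches('/[a-zA-Z]/', '42 B');  // true: B is alphabetic
$negated   = matches('/[^a-z]/', 'abc1');  // true: 1 is outside a to z
$negNone   = matches('/[^a-z]/', 'abc');   // false: every character is a to z

var_dump($dotHash, $rangeHash, $rangeCat, $vowel, $ranges, $negated, $negNone);
```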
In addition to listing out sets and ranges, a number of predefined character classes can
be used in a regular expression. These are shown in Table 4.3.
Table 4.3 Character Classes for Use in POSIX Style Regular Expressions
Class          Matches
[[:alnum:]]    Alphanumeric characters
[[:alpha:]]    Alphabetic characters
[[:lower:]]    Lowercase letters
[[:upper:]]    Uppercase letters
[[:digit:]]    Decimal digits
[[:xdigit:]]   Hexadecimal digits
[[:punct:]]    Punctuation
[[:blank:]]    Tabs and spaces
[[:space:]]    Whitespace characters
[[:cntrl:]]    Control characters
[[:print:]]    All printable characters
[[:graph:]]    All printable characters except for space
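A few of the classes in Table 4.3 can be exercised with the sketch below. It relies on the fact that the Perl-compatible functions also accept these POSIX class names inside square brackets, and it uses preg_match() on the assumption that the code should run on current PHP versions. The ^, $, and + characters (start of string, end of string, one or more repetitions) are used here only to test whole strings.

```php
<?php
// Whole-string tests: every character must belong to the class.
$allDigits = (preg_match('/^[[:digit:]]+$/', '2003') === 1);   // true
$allAlpha  = (preg_match('/^[[:alpha:]]+$/', 'PHP4') === 1);   // false: 4 is a digit
$allAlnum  = (preg_match('/^[[:alnum:]]+$/', 'PHP4') === 1);   // true
$allHex    = (preg_match('/^[[:xdigit:]]+$/', 'ff00') === 1);  // true

// Anywhere-in-string tests: one matching character is enough.
$hasSpace  = (preg_match('/[[:space:]]/', 'two words') === 1); // true
$hasPunct  = (preg_match('/[[:punct:]]/', 'Hello!') === 1);    // true

var_dump($allDigits, $allAlpha, $allAlnum, $allHex, $hasSpace, $hasPunct);
```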