Preg match examples. PHP regular expressions: what they are, examples and testing. Basic regular expressions

Regular expressions in Perl are used most often in the search and replace operators s/// and m//, together with the binding operators =~ and !~. As a rule, all these operators accept similar options, such as i, m, s and x (the same letters that appear in the (?ismx) construct).

These options are usually written after the closing delimiter, for example "/x". They can also be set inside a pattern using the newer (?...) construct.

Regular expressions (patterns) are the same as those of the regexp routines in Unix. The expressions and syntax are borrowed from Henry Spencer's freely redistributable V8 routines, where they are described in detail.

Patterns use the following metacharacters (characters that stand for groups of other characters), often called the egrep standard:

Metacharacters can be followed by modifiers (quantifiers): "*", "+", "?" and the brace forms {n}, {n,} and {n,m}.

In all other cases, curly braces are treated as ordinary characters. Thus "*" is equivalent to {0,}, "+" to {1,} and "?" to {0,1}. n and m must be less than 65536.
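
The equivalences above can be checked directly. The following is a minimal sketch in PHP, whose preg_* functions accept Perl-compatible syntax; the sample strings and variable names are chosen purely for illustration.

```php
<?php
// Each pair of patterns should accept exactly the same strings.
$pairs = [
    ['/^ab*c$/', '/^ab{0,}c$/'],   // "*" is {0,}
    ['/^ab+c$/', '/^ab{1,}c$/'],   // "+" is {1,}
    ['/^ab?c$/', '/^ab{0,1}c$/'],  // "?" is {0,1}
];
$samples = ['ac', 'abc', 'abbc', 'abbbc'];

$allEqual = true;
foreach ($pairs as [$short, $long]) {
    foreach ($samples as $s) {
        if (preg_match($short, $s) !== preg_match($long, $s)) {
            $allEqual = false;
        }
    }
}
echo $allEqual ? "equivalent\n" : "different\n";

// A brace that does not form a valid quantifier is an ordinary character.
$literalBrace = preg_match('/^a{x}$/', 'a{x}');
```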

By default, quantifiers are greedy: the match is extended as many times as possible while still allowing the rest of the pattern to match. To "reduce their appetite", add "?" after the quantifier. This does not change what the quantifier means; it only makes it match as few times as possible. Thus "*?", "+?", "??" and "{n,m}?" are the minimal counterparts of "*", "+", "?" and "{n,m}".
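
The difference between greedy and minimal matching is easiest to see on a string with two candidate endings. A small sketch in PHP (the sample string is made up for illustration):

```php
<?php
$html = '<b>one</b> and <b>two</b>';

// Greedy: ".*" runs to the last "</b>" in the string.
preg_match('/<b>(.*)<\/b>/', $html, $greedy);

// Minimal: ".*?" stops at the first "</b>" that allows a match.
preg_match('/<b>(.*?)<\/b>/', $html, $lazy);

echo $greedy[1], "\n"; // one</b> and <b>two
echo $lazy[1], "\n";   // one
```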

Patterns are processed like double-quoted strings, so backslash escape sequences can be used in them:

\t - tab character
\n - new line
\r - carriage return
\f - form feed
\v - vertical tab
\a - alarm (bell)
\e - escape
\033 - character in octal notation
\x1A - character in hexadecimal notation
\c[ - control character
\l - lowercase the next character
\u - uppercase the next character
\L - lowercase all characters up to \E
\U - uppercase all characters up to \E
\E - end of case modification
\Q - quote (disable) metacharacters up to \E
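
Some of these escapes can be tried from PHP as well. The sketch below assumes PHP's PCRE engine and covers only \t, \x and \Q...\E; the case-changing escapes (\l, \u, \L, \U) belong to Perl string interpolation and should not be expected to work in preg patterns. preg_quote() is shown as the PHP-side alternative to \Q...\E.

```php
<?php
$input = '1+1=2';

// Unquoted, "+" is a quantifier: "one or more 1s, then 1=2".
$unquoted = preg_match('/^1+1=2$/', $input);

// \Q...\E switches metacharacters off for the enclosed text.
$quoted = preg_match('/^\Q1+1=2\E$/', $input);

// preg_quote() achieves the same by inserting backslashes.
$escaped = preg_match('/^' . preg_quote($input, '/') . '$/', $input);

// \t and \x41 are understood by the regex engine itself.
$tab = preg_match('/^a\tb$/', "a\tb");
$hex = preg_match('/^\x41$/', 'A');
```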

Additionally, the following metacharacters have been added to Perl:

Note that each of these matches a single character. To match a sequence, add a quantifier: for example, \w+ matches a whole word and \d+ a whole number.
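
A short PHP sketch of the single-character rule, assuming the usual Perl classes \d (digit), \w (word character) and \s (whitespace); the sample string is invented:

```php
<?php
$s = 'Order 66 shipped';

preg_match('/\d/', $s, $oneDigit);    // a single digit: "6"
preg_match('/\d+/', $s, $number);     // a run of digits: "66"
preg_match('/\w+/', $s, $firstWord);  // a run of word characters: "Order"
$words = preg_split('/\s+/', $s);     // split on runs of whitespace
```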

In addition, there are imaginary (zero-width) metacharacters. They do not match any character, only a position where the type of character changes. Such as:

A word boundary (\b) is an imaginary point between a \w character and a \W character. Within a character class, "\b" represents the backspace character. The metacharacters \A and \Z are similar to "^" and "$", but whereas "^" and "$" match at the start and end of every line of a multiline string (with the m option), \A and \Z match only at the start and end of the entire string.
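
These positions behave the same way in PHP's Perl-compatible engine. A minimal sketch (sample strings are illustrative):

```php
<?php
$text = "first line\nsecond line";

// With the "m" option, ^ matches at the start of every line.
$caretCount = preg_match_all('/^\w+/m', $text, $m1);  // "first", "second"

// \A matches only at the very start of the string, even with "m".
$aCount = preg_match_all('/\A\w+/m', $text, $m2);     // "first"

// \b is a zero-width boundary between \w and \W.
$whole = preg_match('/\bcat\b/', 'a cat sat');     // "cat" is a whole word
$inside = preg_match('/\bcat\b/', 'concatenate');  // "cat" is inside a word

// Inside a character class, \b is the backspace character (0x08).
$backspace = preg_match('/[\b]/', "\x08");
```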

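If grouping (parentheses) is used within the pattern, the substring matched by a group is referred to as "\digit" inside the pattern. After the pattern, within the same expression or block, these groups are available as "$digit". In addition, there are further variables: $+ (the last matched group), $& (the entire match), $` (the text before the match) and $' (the text after the match).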

Example:

$s = "One 1 two 2 and three 3";

if ($s =~ /(\d+)\D+(\d+)/) {
    print "$1\n";  # Result "1"
    print "$2\n";  # "2"
    print "$+\n";  # "2"
    print "$&\n";  # "1 two 2"
    print "$`\n";  # "One "
    print "$'\n";  # " and three 3"
}

Perl version 5 contains additional pattern constructs:

(?=pattern) - "looking" ahead:

Example:

$s = "1+2-3*4";
if ($s =~ /(\d)(?=-)/) {    # Find the digit followed by "-"
    print "$1\n";           # Result "2"
} else {
    print "search error\n";
}

(?!pattern) - "looking" ahead by negation:

Example:

$s = "1+2-3*4";
if ($s =~ /(\d)(?!\+)/) {   # Find a digit that is not followed by "+"
    print "$1\n";           # Result "2"
} else {
    print "search error\n";
}

(?ismx) - "internal" (inline) modifiers. They are convenient when a modifier has to be specified inside the pattern itself rather than after the closing delimiter.
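
A sketch of inline modifiers in PHP. The scoped form (?i:...) is not mentioned in the text above; it is included here as a closely related construct that Perl 5 and PCRE both accept.

```php
<?php
// (?i) switches on case-insensitive matching from that point on.
$inline = preg_match('/(?i)perl/', 'I like PERL');

// The same pattern without the modifier is case-sensitive.
$plain = preg_match('/perl/', 'I like PERL');

// Scoped form: only the group is case-insensitive.
$scopedHit = preg_match('/(?i:perl) 5/', 'PERL 5');
$scopedMiss = preg_match('/(?i:v)ersion/', 'VERSION'); // "ersion" stays case-sensitive

// (?x) allows whitespace and comments inside the pattern.
$extended = preg_match('/(?x) \d+  # digits
                              -    # a literal hyphen
                              \d+  # more digits
                       /', '10-20');
```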

Regular expression (regex) rules:

  • Any character represents itself unless it is a metacharacter. To cancel the effect of a metacharacter, put "\" in front of it.
  • A string of characters matches that same string of characters.
  • A set of possible characters (a class) is written in square brackets "[]": one of the characters listed in the brackets can appear at that position. If the first character in the brackets is "^", then none of the listed characters can appear at that position. Within a class, "-" denotes a range of characters. For example, [a-z] is one lowercase Latin letter, [0-9] is one digit, and so on.
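
The rules above can be tried out with a few one-line checks. A minimal PHP sketch with made-up inputs:

```php
<?php
// [aeiou] matches one character from the set.
$vowel = preg_match('/^[aeiou]$/', 'e');

// [^aeiou] matches one character that is NOT in the set.
$notVowel = preg_match('/^[^aeiou]$/', 'e');

// Ranges: a lowercase letter followed by a digit.
$range = preg_match('/^[a-z][0-9]$/', 'x7');

// An unescaped "." matches any character; "\." matches only a literal dot.
$anyChar = preg_match('/^a.c$/', 'abc');
$literalDot = preg_match('/^a\.c$/', 'abc');
```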

    Regular expressions are a very useful tool for developers. They let you validate, search and modify text.

    This article contains some very useful expressions that you often need to work with.

    When you first encounter regular expressions, they may seem difficult to understand and use. In fact, everything is simpler than it seems. Before we look at complex examples, let's look at the basics:

    Functions for working with regular expressions in PHP

    Domain checking

    Checking for the correct domain name.

    $url = "http://example.com/";
    if (preg_match('/^(http|https|ftp):\/\/([A-Z0-9][A-Z0-9_-]*(?:\.[A-Z0-9][A-Z0-9_-]*)+):?(\d+)?\/?/i', $url)) {
        echo "Ok.";
    } else {
        echo "Wrong url.";
    }

    Highlighting words in text

    A very useful regular expression for highlighting a word in text. Handy for search results.

    $text = "Sample sentence, regex has become popular in web programming. Now we learn regex. According to wikipedia, Regular expressions (abbreviated as regex or regexp, with plural forms regexes, regexps, or regexen) are written in a formal language that can be interpreted by a regular expression processor";
    // Wrap every whole-word occurrence of "regex" in a highlighting tag.
    // The <strong> wrapper is one possible choice of markup.
    $text = preg_replace('/\b(regex)\b/i', '<strong>$1</strong>', $text);
    echo $text;

    Search results highlighting in WordPress

    As already mentioned, the previous example is very useful for highlighting search results. Let's apply it to WordPress. Open the file search.php and find the call to the_title(). Replace it with the following:

    echo $title;

    Now, before this line, insert the code:
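
A minimal sketch of what such highlighting code could look like, written as a standalone function so it can be tried outside WordPress. The name highlight_terms() is hypothetical; in a theme, the title would presumably come from get_the_title() and the query from get_search_query(). The CSS class matches the style rule used in this section.

```php
<?php
/**
 * Wrap every occurrence of the search terms in <strong class="search-excerpt">.
 * highlight_terms() is a hypothetical helper, not a WordPress function.
 */
function highlight_terms(string $title, string $query): string
{
    // Split the query into words and drop empty pieces.
    $keys = preg_split('/\s+/', trim($query), -1, PREG_SPLIT_NO_EMPTY);
    if (!$keys) {
        return $title;
    }
    // Quote each term so characters like "+" or "." are matched literally.
    $alternatives = implode('|', array_map(
        function ($k) { return preg_quote($k, '/'); },
        $keys
    ));
    return preg_replace(
        '/(' . $alternatives . ')/iu',
        '<strong class="search-excerpt">$1</strong>',
        $title
    );
}

// In search.php this might be used as (assumed WordPress calls):
// $title = highlight_terms(get_the_title(), get_search_query());
echo highlight_terms('Learning regex in PHP', 'regex php'), "\n";
```

Quoting each term with preg_quote() keeps a search for something like "c++" from being interpreted as a pattern.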

    Open the file style.css and add the following line to it:

    strong.search-excerpt { background: yellow; }

    Get all images from an HTML document

    If you ever need to find all the images on an HTML page, you will find the following code useful. With it, you can easily build an image downloader.

    // $data holds the HTML source of the page
    $images = array();
    preg_match_all('/(img|src)=("|\')[^"\'>]+/i', $data, $media);
    unset($data);
    // Strip the attribute name and opening quote, leaving only the URL
    $data = preg_replace('/(img|src)("|\'|="|=\')(.*)/i', '$3', $media[0]);
    foreach ($data as $url) {
        $info = pathinfo($url);
        if (isset($info["extension"])) {
            if (($info["extension"] == "jpg") ||
                ($info["extension"] == "jpeg") ||
                ($info["extension"] == "gif") ||
                ($info["extension"] == "png")) {
                array_push($images, $url);
            }
        }
    }

    Removing duplicate words (case independent)

    $text = preg_replace('/\b(\w+\s)\1/i', '$1', $text);

    Removing duplicate punctuation marks

    Similar to the previous one, but collapses repeated punctuation marks (here, periods).

    $text = preg_replace('/\.+/', '.', $text);
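
Both clean-up steps can be combined on one string. In the sketch below, the character class [.!?,;:] is an assumption that extends the period-only pattern to other punctuation marks; the sample text is invented.

```php
<?php
$text = 'This is is a test... Really!!!';

// Collapse a word that is immediately repeated (case-insensitive).
$text = preg_replace('/\b(\w+\s)\1/i', '$1', $text);

// Collapse runs of the same punctuation mark into a single one.
$text = preg_replace('/([.!?,;:])\1+/', '$1', $text);

echo $text, "\n"; // This is a test. Really!
```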

    Finding an XML/HTML tag

    A simple function that takes two arguments: the tag to be found and a string containing XML or HTML.

    function get_tag($tag, $xml) {
        $tag = preg_quote($tag);
        // Capture everything between <tag ...> and </tag>
        preg_match_all('{<' . $tag . '[^>]*>(.*?)</' . $tag . '>}', $xml, $matches, PREG_PATTERN_ORDER);
        return $matches;
    }

    Finding an XML/HTML tag with a specific attribute value

    The function is similar to the previous one, but it also lets you specify the value of a tag attribute.

    function get_tag($attr, $value, $xml, $tag = null) {
        if (is_null($tag)) {
            $tag = '\w+';
        } else {
            $tag = preg_quote($tag, '/');
        }
        $attr = preg_quote($attr, '/');
        $value = preg_quote($value, '/');
        // \2 is the opening quote of the attribute value, \1 is the tag name
        $tag_regex = "/<(" . $tag . ")[^>]*$attr\s*=\s*" .
                     "(['\"])$value\\2[^>]*>(.*?)<\/\\1>/";
        preg_match_all($tag_regex, $xml, $matches, PREG_PATTERN_ORDER);
        return $matches;
    }

    Finding Hex Color Codes

    This expression lets you find or validate hexadecimal color codes.

    $string = "#555555";
    if (preg_match('/^#(?:(?:[a-f\d]{3}){1,2})$/i', $string)) {
        echo "example 6 successful.";
    }

    Finding the page title

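    This code finds and displays the text between the <title> and </title> tags of an HTML page.

    $fp = fopen("http://www.catswhocode.com/blog", "r");
    $page = "";
    while (!feof($fp)) {
        $page .= fgets($fp, 4096);
    }
    fclose($fp);
    // Case-insensitive match; eregi() is not available in PHP 7 and later
    if (preg_match('/<title>(.*)<\/title>/is', $page, $regs)) {
        echo $regs[1];
    }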
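
The matching step can also be separated from the download so that it works on any HTML string. A minimal sketch; extract_title() is a hypothetical helper name, and the sample markup is invented.

```php
<?php
/**
 * Return the text of the first <title> element, or null if there is none.
 * extract_title() is a hypothetical helper used only for illustration.
 */
function extract_title(string $html): ?string
{
    // "i" ignores the case of the tag, "s" lets "." span line breaks,
    // and ".*?" stops at the first closing tag.
    if (preg_match('/<title[^>]*>(.*?)<\/title>/is', $html, $m)) {
        return trim($m[1]);
    }
    return null;
}

$html = "<html><head><TITLE> My page\n</TITLE></head><body></body></html>";
echo extract_title($html), "\n";
```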

    Parsing Apache logs

    Many sites run on the Apache web server. If your site also runs on Apache, the following regular expressions may be useful.

    // Logs: Apache web server
    // Successful accesses to html files. Useful for counting page impressions.
    '^((?#client IP or domain name)\S+)\s+((?#basic authentication)\S+\s+\S+)\s+\[((?#date and time)[^]]+)\]\s+"(?:GET|POST|HEAD) ((?#file)/[^ ?"]+?\.html?)\??((?#parameters)[^ ?"]+)? HTTP/[0-9.]+"\s+(?#status code)200\s+((?#bytes transferred)[-0-9]+)\s+"((?#referrer)[^"]*)"\s+"((?#user agent)[^"]*)"$'

    // Logs: Apache web server
    // 404 errors
    '^((?#client IP or domain name)\S+)\s+((?#basic authentication)\S+\s+\S+)\s+\[((?#date and time)[^]]+)\]\s+"(?:GET|POST|HEAD) ((?#file)[^ ?"]+)\??((?#parameters)[^ ?"]+)? HTTP/[0-9.]+"\s+(?#status code)404\s+((?#bytes transferred)[-0-9]+)\s+"((?#referrer)[^"]*)"\s+"((?#user agent)[^"]*)"$'
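
To show how such a pattern might be applied, here is a simplified sketch that parses one line in Apache's "combined" log format using named groups instead of (?#...) comments. The function name parse_log_line(), the group names and the sample log line are all illustrative assumptions, and the pattern accepts any status code rather than only 200 or 404.

```php
<?php
/**
 * Parse one line in Apache "combined" log format.
 * parse_log_line() and the group names are illustrative choices.
 */
function parse_log_line(string $line): ?array
{
    $pattern = '/^(?P<client>\S+)\s+(?P<auth>\S+\s+\S+)\s+\[(?P<time>[^\]]+)\]\s+'
             . '"(?P<method>GET|POST|HEAD) (?P<file>[^ ?"]+)(?:\?(?P<params>[^ "]*))? HTTP\/[0-9.]+"\s+'
             . '(?P<status>\d{3})\s+(?P<bytes>[-0-9]+)\s+'
             . '"(?P<referrer>[^"]*)"\s+"(?P<agent>[^"]*)"$/';
    if (preg_match($pattern, $line, $m) !== 1) {
        return null;
    }
    // Keep only the named groups, dropping the numeric duplicates.
    return array_filter($m, 'is_string', ARRAY_FILTER_USE_KEY);
}

$line = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
      . '"GET /index.html?lang=en HTTP/1.0" 200 2326 '
      . '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"';

$hit = parse_log_line($line);
echo $hit['client'], ' ', $hit['file'], ' ', $hit['status'], "\n";
```

Filtering on $hit['status'] afterwards gives the same split into successful requests and 404 errors as the two separate patterns.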

    Replacing double quotes with curly ones

    $text = preg_replace('/\B"\b([^"\x84\x93\x94\r\n]+)\b"\B/', '“$1”', $text);

    Checking password complexity

    This regular expression checks a string point by point: the string must be at least 6 characters long and consist only of letters, digits, underscores and hyphens, and it must contain at least one uppercase letter, one lowercase letter and one digit.

    '\A(?=[-_a-zA-Z0-9]*?[A-Z])(?=[-_a-zA-Z0-9]*?[a-z])(?=[-_a-zA-Z0-9]*?[0-9])[-_a-zA-Z0-9]{6,}\z'

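The password rules described above can be wrapped in a small function. This is a sketch under the assumption that each rule is enforced by its own lookahead; is_strong_password() is a hypothetical name.

```php
<?php
/**
 * Check a password against the rules described in the text.
 * is_strong_password() is a hypothetical wrapper used for illustration.
 */
function is_strong_password(string $password): bool
{
    $pattern = '/\A(?=[-_a-zA-Z0-9]*?[A-Z])'   // at least one uppercase letter
             . '(?=[-_a-zA-Z0-9]*?[a-z])'      // at least one lowercase letter
             . '(?=[-_a-zA-Z0-9]*?[0-9])'      // at least one digit
             . '[-_a-zA-Z0-9]{6,}\z/';         // 6 or more allowed characters, nothing else
    return preg_match($pattern, $password) === 1;
}

var_dump(is_strong_password('Passw0rd'));  // bool(true)
var_dump(is_strong_password('password'));  // bool(false)
```

Each lookahead inspects the string from the start without consuming it, so the three requirements are independent of the order in which the characters appear.
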
    WordPress: Getting Post Images Using Regular Expression

    If you use WordPress, you might find it useful to have a feature that will get all the images from a post and display them. To use this code, copy it into your theme files.


