Converting characters to uppercase in PHP: capitalizing first letters
PHP provides a wide range of built-in functions. In some cases, however, missing functionality has to be added by hand. This is especially noticeable when working with string conversion functions and different encodings.
For example, ucfirst is a function that converts the first character of a string to uppercase. It would seem that there should be no problems, but when working with Cyrillic, such a conversion does not occur. This also applies to the ucwords function - it converts the first character of each word in the string to uppercase.
There are no problems with English alphabet characters:
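<?php
$str = "test string";
// converts the first character of the string to uppercase
echo ucfirst($str);
echo "<br>";
// converts to uppercase the first character of each word in the string
echo ucwords($str);
?>
Test string
Test String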
But there are problems with Cyrillic:
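<?php
$str = "тестовая строка";
// converts the first character of the string to uppercase
echo ucfirst($str);
echo "<br>";
// converts to uppercase the first character of each word in the string
echo ucwords($str);
?>
тестовая строка
тестовая строка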
PHP has a number of typical cases where functions work poorly with Cyrillic or do not work with it at all. Some functions with the mb_ prefix solve these problems. For example, mb_strtolower converts a string to lowercase. Unlike strtolower(), it decides whether a character is a letter based on the properties of the Unicode character.
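A minimal sketch of that difference, assuming the mbstring extension is loaded and the source file is saved as UTF-8. The sample string is only an illustration. What strtolower() does with the same bytes depends on the PHP version and locale, so no output is shown for it.

```php
<?php
// Sample UTF-8 string with Cyrillic letters in mixed case (illustrative only).
$str = "ПРИВЕТ Мир";

// mb_strtolower() decides what is a letter from Unicode properties,
// so the Cyrillic characters are lowercased.
$lower = mb_strtolower($str, "UTF-8");
echo $lower, "\n"; // привет мир

// strtolower() works byte by byte; its result for non-ASCII bytes
// depends on the PHP version and locale.
$bytewise = strtolower($str);
echo $bytewise, "\n";
```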
To solve the problem, let's define a function mb_ucfirst(string str [, string encoding]) that will process Unicode characters.
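One possible way to write such a function is sketched below. It assumes the mbstring extension is available. The body is an illustration rather than the original author's code. Because PHP 8.4 ships its own mb_ucfirst(), the definition is wrapped in a function_exists() check so it is only used on older versions.

```php
<?php
// Define mb_ucfirst() only when the runtime does not already provide it.
if (!function_exists('mb_ucfirst')) {
    function mb_ucfirst(string $str, ?string $encoding = null): string
    {
        // Fall back to the internal encoding when none is given.
        $encoding = $encoding ?? mb_internal_encoding();
        // Split into the first character and the remainder, counted in characters, not bytes.
        $first = mb_substr($str, 0, 1, $encoding);
        $rest  = mb_substr($str, 1, null, $encoding);
        return mb_strtoupper($first, $encoding) . $rest;
    }
}

$result = mb_ucfirst("тестовая строка", "UTF-8");
echo $result, "\n"; // Тестовая строка
```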
To convert the first character of each word in a string to uppercase, just use mb_convert_case in MB_CASE_TITLE mode.
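A short sketch of this call, assuming the mbstring extension and UTF-8 input. Note one difference from ucwords: MB_CASE_TITLE also lowercases the remaining letters of each word.

```php
<?php
// Capitalize the first letter of each word in a UTF-8 Cyrillic string.
$title = mb_convert_case("тестовая строка", MB_CASE_TITLE, "UTF-8");
echo $title, "\n"; // Тестовая Строка

// Title case also lowercases the rest of each word.
$mixed = mb_convert_case("hELLO wORLD", MB_CASE_TITLE, "UTF-8");
echo $mixed, "\n"; // Hello World
```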
strtolower
Converts the characters of a string to lowercase.
Syntax:
string strtolower(string str);
Converts a string to lowercase. Returns the converted string.
It should be noted that if the locale is set incorrectly, the function will produce, to put it mildly, strange results when working with Cyrillic letters.
$str = "HeLLo World";
$str = strtolower($str);
echo $str; // prints hello world
strtoupper
Converts the given string to upper case.
Syntax:
string strtoupper(string str);
Converts a string to uppercase. Returns the converted string. This function works well with English letters, but it may give unexpected results with Russian letters.
$str = "Hello World";
$str = strtoupper($str);
echo $str; // prints HELLO WORLD
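For Russian text, the multibyte counterpart can be used instead. A minimal sketch, assuming the mbstring extension and a UTF-8 string (the sample text is illustrative):

```php
<?php
// mb_strtoupper() uppercases Cyrillic letters in a UTF-8 string.
$upper = mb_strtoupper("Привет, мир", "UTF-8");
echo $upper, "\n"; // ПРИВЕТ, МИР
```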
ucfirst
Converts the first character of a string to uppercase.
Syntax:
string ucfirst(string str);
Returns a string whose first character is capitalized.
$str = "hello world";
$str = ucfirst($str);
echo $str; // prints Hello world
ucwords
Converts the first character of each word in a string to uppercase.
Syntax:
string ucwords(string str);
Returns a string with the first character of each word in the string capitalized.
A word here is a run of characters immediately preceded by a whitespace character: space, newline, form feed, carriage return, horizontal tab, or vertical tab.
Cyrillic characters may not be converted correctly.
$str = "hello world";
$str = ucwords($str);
echo $str; // prints Hello World
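The whitespace rule can be seen on a string that contains a hyphen. The sketch below also uses the optional second argument of ucwords, which lists the delimiter characters and is available in current PHP versions. The sample string is illustrative.

```php
<?php
// With the default delimiters, only whitespace starts a new word,
// so the letter after the hyphen stays lowercase.
$default = ucwords("hello world-wide web");
echo $default, "\n"; // Hello World-wide Web

// Passing " -" treats both the space and the hyphen as delimiters.
$custom = ucwords("hello world-wide web", " -");
echo $custom, "\n"; // Hello World-Wide Web
```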
11 years ago
Note that mb_strtolower() is very SLOW. If you have a database connection, you may want to use it to convert your strings to lower case instead. Even latin1/9 (iso-8859-1/15) and other encodings are possible.
Have a look at my simple benchmark:
<?php
$text = "Lörem ipßüm dölör ßit ämet, cönßectetüer ädipißcing elit. Sed ligülä. Präeßent jüßtö tellüß, grävidä eü, tempüß ä, mättiß nön, örci. Näm qüiß lörem. Näm äliqüet elit ßed elit. Phäßellüß venenätiß jüßtö eget enim. Dönec nißl. Pröin mättiß venenätiß jüßtö. Sed äliqüäm pörtä örci. Cräß elit nißl, cönvälliß qüiß, tincidünt ät, vehicülä äccümßän, ödiö. Sed möleßtie. Etiäm mölliß feügiät elit. Veßtibülüm änte ipßüm primiß in fäücibüß örci lüctüß et ültriceß pößüere cübiliä Cüräe; Mäecenäß nön nüllä.";
// mb_strtolower()
$timeMB = microtime(true);
for ($i = 0; $i < 30000; $i++) {
    $lower = mb_strtolower("$text/no-cache-$i");
}
$timeMB = microtime(true) - $timeMB;

// MySQL lower()
// The original note used the legacy mysql_* functions, which were removed in PHP 7.
// $db is assumed to be an already open mysqli connection.
$timeSQL = microtime(true);
mysqli_query($db, "set names latin1");
for ($i = 0; $i < 30000; $i++) {
    $r = mysqli_fetch_row(mysqli_query($db, "select lower('$text/no-cache-$i')"));
    $lower = $r[0];
}
$timeSQL = microtime(true) - $timeSQL;

echo "mb: " . sprintf("%.5f", $timeMB) . " sec.\n";
echo "sql: " . sprintf("%.5f", $timeSQL) . " sec.\n";

// Result on my notebook:
// mb: 11.50642 sec.
// sql: 5.44143 sec.
?>
7 years ago
Please note that when used with UTF-8, mb_strtolower will only convert those upper case characters to lower case that are marked with the Unicode property "Upper case letter" ("Lu"). However, there are also characters such as "Letter numbers" (Unicode property "Nl") that have lower case and upper case variants as well. These characters will not be converted by mb_strtolower!
Example:
The Roman numerals Ⅰ, Ⅱ, Ⅲ, ..., Ⅿ (Unicode code points 8544 through 8559) also exist in their respective lower case variants ⅰ, ⅱ, ⅲ, ..., ⅿ (Unicode code points 8560 through 8575) and should, in my opinion, also be converted by mb_strtolower, but they are not!
Big internet companies (like Google) do match both variants as semantically equal (since the representations differ only in case).
Since I could not find any proper solution on the internet for mapping all UTF-8 strings to their lowercase counterparts in PHP, I offer the following hard-coded extended mb_strtolower function for UTF-8 strings:
The function wraps the existing function mb_strtolower() and additionally replaces uppercase UTF-8 characters for which there is a lowercase representation. Since I was not able to find a proper Unicode uppercase and lowercase character table on the internet, I checked the first million UTF-8 characters against Google Search and the Keyword Tool and identified the following 78 characters as uppercase characters that are not replaced by mb_strtolower but do have a UTF-8 lowercase counterpart.
<?php
// The numbers in the inline comments are the characters' Unicode code points (CP).
function strtolower_utf8_extended($utf8_string)
{
$additional_replacements = array
("Dž" => "dž" // 453 -> 454
, "Lj" => "lj" // 456 -> 457
, "Nj" => "nj" // 459 -> 460
, "Dz" => "dz" // 498 -> 499
, "Ϸ" => "ϸ" // 1015 -> 1016
, "Ϲ" => "ϲ" // 1017 -> 1010
, "Ϻ" => "ϻ" // 1018 -> 1019
, "ᾈ" => "ᾀ" // 8072 -> 8064
, "ᾉ" => "ᾁ" // 8073 -> 8065
, "ᾊ" => "ᾂ" // 8074 -> 8066
, "ᾋ" => "ᾃ" // 8075 -> 8067
, "ᾌ" => "ᾄ" // 8076 -> 8068
, "ᾍ" => "ᾅ" // 8077 -> 8069
, "ᾎ" => "ᾆ" // 8078 -> 8070
, "ᾏ" => "ᾇ" // 8079 -> 8071
, "ᾘ" => "ᾐ" // 8088 -> 8080
, "ᾙ" => "ᾑ" // 8089 -> 8081
, "ᾚ" => "ᾒ" // 8090 -> 8082
, "ᾛ" => "ᾓ" // 8091 -> 8083
, "ᾜ" => "ᾔ" // 8092 -> 8084
, "ᾝ" => "ᾕ" // 8093 -> 8085
, "ᾞ" => "ᾖ" // 8094 -> 8086
, "ᾟ" => "ᾗ" // 8095 -> 8087
, "ᾨ" => "ᾠ" // 8104 -> 8096
, "ᾩ" => "ᾡ" // 8105 -> 8097
, "ᾪ" => "ᾢ" // 8106 -> 8098
, "ᾫ" => "ᾣ" // 8107 -> 8099
, "ᾬ" => "ᾤ" // 8108 -> 8100
, "ᾭ" => "ᾥ" // 8109 -> 8101
, "ᾮ" => "ᾦ" // 8110 -> 8102
, "ᾯ" => "ᾧ" // 8111 -> 8103
, "ᾼ" => "ᾳ" // 8124 -> 8115
, "ῌ" => "ῃ" // 8140 -> 8131
, "ῼ" => "ῳ" // 8188 -> 8179
, "Ⅰ" => "ⅰ" // 8544 -> 8560
, "Ⅱ" => "ⅱ" // 8545 -> 8561
, "Ⅲ" => "ⅲ" // 8546 -> 8562
, "Ⅳ" => "ⅳ" // 8547 -> 8563
, "Ⅴ" => "ⅴ" // 8548 -> 8564
, "Ⅵ" => "ⅵ" // 8549 -> 8565
, "Ⅶ" => "ⅶ" // 8550 -> 8566
, "Ⅷ" => "ⅷ" // 8551 -> 8567
, "Ⅸ" => "ⅸ" // 8552 -> 8568
, "Ⅹ" => "ⅹ" // 8553 -> 8569
, "Ⅺ" => "ⅺ" // 8554 -> 8570
, "Ⅻ" => "ⅻ" // 8555 -> 8571
, "Ⅼ" => "ⅼ" // 8556 -> 8572
, "Ⅽ" => "ⅽ" // 8557 -> 8573
, "Ⅾ" => "ⅾ" // 8558 -> 8574
, "Ⅿ" => "ⅿ" // 8559 -> 8575
, "Ⓐ" => "ⓐ" // 9398 -> 9424
, "Ⓑ" => "ⓑ" // 9399 -> 9425
, "Ⓒ" => "ⓒ" // 9400 -> 9426
, "Ⓓ" => "ⓓ" // 9401 -> 9427
, "Ⓔ" => "ⓔ" // 9402 -> 9428
, "Ⓕ" => "ⓕ" // 9403 -> 9429
, "Ⓖ" => "ⓖ" // 9404 -> 9430
, "Ⓗ" => "ⓗ" // 9405 -> 9431
, "Ⓘ" => "ⓘ" // 9406 -> 9432
, "Ⓙ" => "ⓙ" // 9407 -> 9433
, "Ⓚ" => "ⓚ" // 9408 -> 9434
, "Ⓛ" => "ⓛ" // 9409 -> 9435
, "Ⓜ" => "ⓜ" // 9410 -> 9436
, "Ⓝ" => "ⓝ" // 9411 -> 9437
, "Ⓞ" => "ⓞ" // 9412 -> 9438
, "Ⓟ" => "ⓟ" // 9413 -> 9439
, "Ⓠ" => "ⓠ" // 9414 -> 9440
, "Ⓡ" => "ⓡ" // 9415 -> 9441
, "Ⓢ" => "ⓢ" // 9416 -> 9442
, "Ⓣ" => "ⓣ" // 9417 -> 9443
, "Ⓤ" => "ⓤ" // 9418 -> 9444
, "Ⓥ" => "ⓥ" // 9419 -> 9445
, "Ⓦ" => "ⓦ" // 9420 -> 9446
, "Ⓧ" => "ⓧ" // 9421 -> 9447
, "Ⓨ" => "ⓨ" // 9422 -> 9448
, "Ⓩ" => "ⓩ" // 9423 -> 9449
, "𐐦" => "𐑎" // 66598 -> 66638
, "𐐧" => "𐑏" // 66599 -> 66639
);
$utf8_string = mb_strtolower($utf8_string, "UTF-8");
$utf8_string = strtr($utf8_string, $additional_replacements);
return $utf8_string;
} //strtolower_utf8_extended()