urlencode and urldecode

Since a space is not allowed in a url, the following html is syntactically invalid:

<a href="http://myprogrammingnotes.com/test.php?p=var1 + var2">click me</a>

However, if you open that html in Firefox, you will not get a warning message. If you click the link, the browser navigates to the website without a problem. In fact, Firefox replaces the spaces in the url with “%20” and goes to that new url. But if you move the mouse pointer over the link, you still see the original url (myprogrammingnotes.com/test.php?p=var1 + var2) in the information bar at the bottom of the browser.

If you manually change the spaces to %20 in the html code, you still see “myprogrammingnotes.com/test.php?p=var1 + var2” in the bottom bar of Firefox. This implies that Firefox decodes the url in some way before showing it in the bottom bar. Firefox decodes not only %20 but also other percent-encoded characters for display. For example, if you change “+” to “%2b” in the html, you still see the same text in the bar (“myprogrammingnotes.com/test.php?p=var1 + var2”). But that does not mean Firefox will percent-encode “+” in the url before navigating to it. In this example, Firefox percent-encodes only the space characters.

In practice, you do not need to percent-encode the special characters in a url by hand. You can use the php urlencode function to do the work.

php urlencode percent-encodes all non-alphanumeric characters except -_. and encodes a space as “+” (not “%20”). So the following code

$url="http://myprogrammingnotes.com/test.php?p=".urlencode("var1 + var2");
echo "<a href='$url'>click me</a>";

will generate this url:

http://myprogrammingnotes.com/test.php?p=var1+%2B+var2
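
To check the encoding rules above without a browser, here is a minimal sketch you can run from the command line. The variable names are only for illustration, and rawurlencode is not mentioned in the text above; it is included only to contrast the “+” form with the “%20” form.

```php
<?php
// Sample value taken from the example above.
$value = "var1 + var2";

// urlencode: each space becomes "+", the literal "+" becomes "%2B".
$encoded = urlencode($value);
echo $encoded, "\n";      // var1+%2B+var2

// rawurlencode: each space becomes "%20" instead of "+".
$rawEncoded = rawurlencode($value);
echo $rawEncoded, "\n";   // var1%20%2B%20var2

// Letters, digits, "-", "_" and "." are left untouched by urlencode.
$safe = urlencode("a-b_c.d9");
echo $safe, "\n";         // a-b_c.d9
```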

Now if you move the mouse pointer over the link, what will you see in the bottom bar? Is it the same as what you saw before? Not quite. You will see “http://myprogrammingnotes.com/test.php?p=var1+++var2”. Firefox only decodes the percent-encoded characters before showing the url, not other characters such as “+”. Of course, Firefox won’t decode or encode the “+” or “%2B” before using the url to navigate.

Now let’s go back to the first html example. What do you think the following code in test.php will produce?

echo $_GET['p'];

Is it “var1%20+%20var2” or “var1 + var2”? Neither, in fact. Before your script runs, php parses the query string and decodes all percent-encoded characters as well as the special character “+”. So each “%20” is decoded to a space, and the “+” is decoded to a space too, which leaves “var1   var2” with three spaces between var1 and var2. This is the same decoding that the php urldecode function performs.
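
The decoding step can be reproduced without a web server. The sketch below assumes that parse_str, which parses a string as if it were a url query string, is a fair stand-in for what php does when it fills $_GET; the $query and $params names are made up for the example.

```php
<?php
// The query string the browser sends for the first html example,
// after it has replaced the two spaces with %20.
$query = "p=var1%20+%20var2";

// parse_str parses the string as if it were the query string of a url
// and stores the decoded parameters in $params.
parse_str($query, $params);

// Both %20 sequences and the "+" have become spaces.
var_dump($params['p']);   // string(11) "var1   var2"
```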

Php beginners often make this mistake: they use urlencode to encode the value of a url parameter, then use urldecode to try to get the original value of the parameter back. Specifically, they write the following code

$url="http://myprogrammingnotes.com/test.php?p=".urlencode("var1 + var2");
echo "<a href='$url'>click me</a>";

and this test.php

echo urldecode($_GET['p']);

Do they get the original value of the parameter p (“var1 + var2”)? No! What they get is “var1   var2”. Note that there are 3 spaces between var1 and var2. That is because php has already done the first decoding to form $_GET['p'], which is “var1 + var2”. urldecode then does a second decoding, which replaces the “+” with a space and forms the final result “var1   var2”.
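
The double decoding can be seen in a single script. This is only a sketch: parse_str is assumed to stand in for the decoding php performs before test.php runs, and the variable names are illustrative.

```php
<?php
$original = "var1 + var2";
$encoded  = urlencode($original);   // var1+%2B+var2

// First decoding: stands in for what php does when it builds $_GET.
parse_str("p=" . $encoded, $get);

// Second decoding: the beginner's mistake.
$twice = urldecode($get['p']);

var_dump($get['p'] === $original);  // bool(true): one decoding is enough
var_dump($twice);                   // string(11) "var1   var2"
```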

This mistake is a reminder not to reach for urldecode too often. Reading $_GET['p'] is enough to get the value of the parameter. So when should you use urldecode? Note that php only does the url decoding when it builds the $_GET and $_POST arrays. Variables in the $_SERVER array, such as $_SERVER['REQUEST_URI'], are not url-decoded, which means you may need to decode them yourself with the urldecode function.
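
As an illustration of a case where urldecode is needed, the sketch below pulls the parameter out of a request uri by hand. The $requestUri string is a hypothetical value of $_SERVER['REQUEST_URI'] for the link generated earlier, hard-coded so that the script also runs from the command line.

```php
<?php
// Hypothetical value of $_SERVER['REQUEST_URI']; it is still encoded.
$requestUri = "/test.php?p=var1+%2B+var2";

// Extract the raw query string; parse_url does not decode it.
$rawQuery = parse_url($requestUri, PHP_URL_QUERY);
echo $rawQuery, "\n";   // p=var1+%2B+var2

// Strip the "p=" prefix and decode the value ourselves, exactly once.
$rawValue = substr($rawQuery, strlen("p="));
$decoded  = urldecode($rawValue);
echo $decoded, "\n";    // var1 + var2
```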
