Getting up to speed with SVN on Windows

I’ve been saying it for a couple of years now: “I’ve got to get all my projects under source control!” We all know it’s the right thing to do. The smart thing. But for some reason it never got done, or it got partially done only to fall apart later because of the half measures taken. Well, last night I finally decided to sit down and kick myself into doing it. Surprisingly, it was much easier than I feared.

I’ve had a WAMP stack on my laptop for quite some time, for all of my personal projects, and for work projects if I need to be offline from the office. My personal favorite is XAMPP, because of the ability to drop it on a USB stick if I need to and go.

I’ve also had SVN installed for quite some time. I use the package with the Apache 2.2 bindings located at Tigris.org (they also have an Apache 2.0 version available).

It was as simple as going into the file c:\xampp\apache\conf\httpd.conf and adding a couple of lines to get things up.

LoadModule dav_module modules/mod_dav.so
LoadModule dav_svn_module "C:/Program Files/svn/bin/mod_dav_svn.so"

<Location /repos>
	DAV svn
	SVNParentPath C:/svn_repository
</Location>

Basically, this loads the WebDAV module (mod_dav) and the SVN extension to it (mod_dav_svn). Then it defines the location /repos as the root of my Subversion repositories. The SVNParentPath directive tells SVN to look inside C:\svn_repository for a directory that matches the name of the repository you are trying to access.

Example:
If I create a repository named foo at C:\svn_repository\foo, I can access it by visiting http://localhost/repos/foo
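To make the mapping concrete, here is a small PHP sketch of the lookup rule that SVNParentPath implies. It only illustrates the rule described above and is not how mod_dav_svn is actually implemented. The function name repositoryPathFor() and its parameters are made up for this example.

```php
<?php
// Illustration only: mimics the URL-to-directory rule described above.
// repositoryPathFor() is a hypothetical helper, not part of Subversion or Apache.
function repositoryPathFor(
    string $urlPath,
    string $location = '/repos',
    string $parentPath = 'C:/svn_repository'
): ?string {
    $prefix = rtrim($location, '/') . '/';

    // The request must live under the configured location.
    if (strncmp($urlPath, $prefix, strlen($prefix)) !== 0) {
        return null;
    }

    // The first path segment after the location is the repository name.
    $rest = (string) substr($urlPath, strlen($prefix));
    $name = explode('/', $rest)[0];
    if ($name === '') {
        return null;
    }

    return rtrim($parentPath, '/') . '/' . $name;
}
```

With the defaults, repositoryPathFor('/repos/foo') gives C:/svn_repository/foo, and anything deeper in the URL (such as /repos/foo/trunk) still resolves to the same repository directory.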

Once this was up and running, I installed TortoiseSVN (http://tortoisesvn.tigris.org/). This makes it simple to create repositories and manage working directories by integrating the SVN functions with the Windows Explorer context menu.

Finally, I use Eclipse to do my development, so I installed Subclipse to add SVN integration to Eclipse.

Now came the most tedious portion. I had SVN installed, the server configured, and the management tools in place. Now it’s time to import some projects.

Going to the C:\svn_repository directory, I create a new folder {project-name} for the project I am going to import. I right-click on the folder, open the TortoiseSVN sub-menu and click on Create Repository Here. I select Native File System and click on OK. Congrats, your repository is waiting. You can confirm this by visiting http://localhost/repos/{project-name}. You should see revision 0 and an empty directory.

I use the recommended repository structure from the book Version Control with Subversion. I create a temporary directory, and within it I create a directory named for the repository. Within that directory I create the tags, branches and trunk directories and copy the project files into the trunk directory. I back out to the temporary directory and right-click on the project directory. From the context menu I select the TortoiseSVN sub-menu and click on Import. You will be asked which repository you would like to import to. Enter http://localhost/repos/{project-name}, where project-name is the name of the folder in which you created the repository above. Enter an initial comment to describe what you are doing, and maybe what the project is. Click on OK, then sit back and watch as your project is imported.
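Since I end up repeating the directory setup for every project, it could be scripted. Below is a minimal PHP sketch that creates the trunk, branches and tags layout described above. The function name prepareImportLayout() and its arguments are my own invention for this example. Copying the project files into trunk and running the import are still left to you.

```php
<?php
// Sketch: lay out a project for import using the trunk/branches/tags structure.
// prepareImportLayout() and its arguments are hypothetical names for this example.
function prepareImportLayout(string $tempDir, string $projectName): string
{
    $projectDir = rtrim($tempDir, '/\\') . DIRECTORY_SEPARATOR . $projectName;

    foreach (['trunk', 'branches', 'tags'] as $sub) {
        $path = $projectDir . DIRECTORY_SEPARATOR . $sub;
        // mkdir() with the recursive flag also creates the parent directories.
        if (!is_dir($path) && !mkdir($path, 0777, true)) {
            throw new RuntimeException("Could not create $path");
        }
    }

    // This is the directory you right-click on and import.
    return $projectDir;
}
```

After copying your files into the trunk directory, import the returned directory with TortoiseSVN as described above, or from the command line with svn import.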

Congratulations, you are now under revision control!

PREG_REPLACE_EVAL

This afternoon, while trying to come up with a solution for a problem on a client’s site, I had one of those “AHA!” moments. I’ve been working with PCRE (Perl Compatible Regular Expressions) just about as long as I have been using PHP, but I never really looked into the docs until today. I discovered the ‘e’ expression modifier for use with preg_replace.

The problem is fairly simple. I cannot always trust users to enter names and headlines with consistent capitalization. So normally I trust the simple text handling functions in PHP and do something like:

echo ucwords(strtolower($name));

Now for most cases this will work like a charm, until you come across a hyphenated last name, like:

Kathy Jones-Smith

You end up with:

Kathy Jones-smith

Not real pretty, and when Mrs. Jones-Smith sees her name on the site like that, she can get a little irate (Do you blame her?). So I started looking into ways to resolve this. I stumbled across the example on the preg_replace documentation page that shows the use of the ‘e’ modifier to capitalize all HTML tags on a page. The light popped on and I replaced my code above with:

echo preg_replace("/(^|\s|-)([a-z])/e","'\\1'.strtoupper('\\2')",$name);

To explain it, the expression looks for any lowercase letter immediately following one of: the beginning of the string, a whitespace character, or a hyphen. It captures the preceding character and the lowercase letter into two numbered capture groups. It substitutes them into the replacement string and then eval()’s that string as PHP code, allowing the strtoupper() function to do its work.
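One caveat: the ‘e’ modifier was deprecated in PHP 5.5 and removed in PHP 7.0, so the line above will not run on current PHP. The same idea can be expressed with preg_replace_callback(), as in the sketch below. The function name capitalizeName() is just a name picked for this example. Lowercasing the input first is my addition, carried over from the ucwords(strtolower()) version, so that all-caps input is handled too.

```php
<?php
// Same pattern as above, using a callback instead of the 'e' modifier.
// capitalizeName() is a name chosen for this example.
function capitalizeName(string $name): string
{
    return preg_replace_callback(
        '/(^|\s|-)([a-z])/',
        function (array $m): string {
            // $m[1] is the preceding boundary (empty at the start of the string),
            // $m[2] is the lowercase letter to capitalize.
            return $m[1] . strtoupper($m[2]);
        },
        strtolower($name) // assumption: normalize case first
    );
}

echo capitalizeName('kathy jones-smith');
```

Because the callback receives the captured groups as plain strings, nothing is ever evaluated as code, which also removes the injection risk that came with eval()’ing a replacement built from user input.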

Regular expressions for the win yet again.

Bad Information

I get really frustrated when I see bad information given out. The topic doesn’t really matter: if I know it’s false, I hate to hear it perpetuated. Today has been no different, except that it has really torqued me because the subject is information security.

I read in another blog that the best way to prevent POST requests to your site from originating elsewhere is to review the value stored in $_SERVER['HTTP_REFERER'], and make sure that it matches the domain of the site itself before processing the POST request. Yes folks, it’s that simple. Or is it?

The last time I checked the PHP Language Documentation, I recall it saying that this value was not to be trusted. In fact, it still says just that. Here is the relevant text from the PHP web-site:

The address of the page (if any) which referred the user agent to the current page. This is set by the user agent. Not all user agents will set this, and some provide the ability to modify HTTP_REFERER as a feature. In short, it cannot really be trusted.

Hmm… so it seems that this is a bad idea all around. All I would need to do as an attacker would be to crank out a script using cURL, set the CURLOPT_REFERER option to my target’s web-site, and BANG, I’m golden, happily cranking away at POST requests to his contact book form and filling it with viagra spam.

Well, if this is a bad idea, what is a good way to prevent this?

Well, my first thought is to take advantage of PHP sessions to do this.

In the source form file, initiate a session and store a secret value in the session data store. In the processing script, check this value for validity, clear it out (to prevent simply hijacking the session to send repeated posts), and then process the input only if it was valid. Furthermore, I would add abuse checking to prevent repeated attempts at submitting the form. A simple counter variable, again stored in the session data store, or an array of time-stamps would do, with a threshold check that allows no more than X submissions in Y seconds for a given session before the IP address is added to a block list and denied access to the form processing logic altogether.
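Here is a rough sketch of how the token and the threshold check might look in PHP. Everything named here (the three functions and the form_token and post_times session keys) is made up for this example, and the IP block list is left out. The functions take the session array by reference so they can be tried without a real session. In a page you would call session_start() and pass $_SESSION.

```php
<?php
// Sketch of a one-time form token plus a simple rate limit, kept in the session.
// All function names and the 'form_token' / 'post_times' keys are hypothetical.

// Call when rendering the form; put the returned value in a hidden field.
function issueFormToken(array &$session): string
{
    $session['form_token'] = bin2hex(random_bytes(16));
    return $session['form_token'];
}

// Call in the processing script; the token is cleared so it only works once.
function consumeFormToken(array &$session, string $submitted): bool
{
    $expected = $session['form_token'] ?? '';
    unset($session['form_token']);
    return $expected !== '' && hash_equals($expected, $submitted);
}

// Allow at most $max submissions per $window seconds for this session.
function underRateLimit(array &$session, int $max, int $window, ?int $now = null): bool
{
    $now = $now ?? time();

    // Keep only the time-stamps that still fall inside the window.
    $recent = array_filter(
        $session['post_times'] ?? [],
        function (int $t) use ($now, $window): bool {
            return $t > $now - $window;
        }
    );

    $recent[] = $now;
    $session['post_times'] = array_values($recent);

    return count($recent) <= $max;
}
```

In the processing script you would then only handle the POST when both underRateLimit($_SESSION, $x, $y) and consumeFormToken($_SESSION, $_POST['token'] ?? '') return true, where the hidden field name token is again just an assumption.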

Now this won’t prevent a determined attacker, nor will it help you if someone just decides to use their pet bot-net to flood the site with POST requests simply to create a Denial of Service situation.  But it should put a crimp in the operations of your basic comment spammer who simply wants to sell the world on the benefits of cheap, questionable-quality, pharmaceuticals.

Is there such a thing as too much?

Recent (okay not so recent) developments in the area of web development have led a lot of designers to hop on the latest and greatest and slam WAY too much 2.0 into their sites.  Sadly, I have to admit that I am guilty as well.  Some points to remember as you are AJAX-ing and flashing your site to death:

1)  It’s only a good thing if it’s useful to the user and his or her end experience.  If you are just adding flashy elements because you can, stop.  Go back to some solid HTML or dynamically generated content with your favorite coding language.

2)  Search engines are NOT, repeat NOT, going to be indexing your super cool AJAX or, unless you’ve coded them properly, your Flash elements.  Rather than try to explain the specifics around this, know that Google is your friend.  Spend some time researching.

That being said, there are times when adding some Flash elements or some nice AJAX to a page is certainly called for, but be careful to pay attention to the two points above.  I personally love a good AJAX login, voting system, or modal box/light box element shoved in when called for.

Expect the unexpected

I have seen many examples lately of “newbie” help posts where the code given, though technically correct, does not suit a wide range of situations. The most recent example of this I found on DZone’s PHP Zone. I read this post, and couldn’t help but comment on the limited view embraced by the original poster. Yes, checking for port 443 will indeed work to determine whether the incoming request is SSL encrypted, but only provided your server is using the standard ports. I work with a situation where, when we have a client site with an installed SSL certificate, we place our beta server on a non-standard port with the same domain name as the live site. This allows us to ensure that there are no issues with the certificate without having to purchase, or bill our clients for, an additional certificate. For this situation we use PHP’s built-in support for detecting HTTPS communication.

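// $secure_port must be set to the HTTPS port of the target server
if (empty($_SERVER['HTTPS']) || strtolower($_SERVER['HTTPS']) === 'off') {
    // HTTP_HOST may already carry the plain-HTTP port, so strip it first
    $host = preg_replace('/:\d+$/', '', $_SERVER['HTTP_HOST']);
    // redirect here to the proper hostname, port number and page
    header("Location: https://{$host}:{$secure_port}{$_SERVER['REQUEST_URI']}");
    exit();
}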

This code will redirect your non-HTTPS traffic to HTTPS even when you are using non-standard ports. You will need to supply the $secure_port variable to ensure that the redirect goes to the proper target.

Zend Framework

I’ve been working with the Zend Framework finally for the last week or so. Doing some tutorials to get the basics, and now starting to build a website based on it. I’m pretty impressed with what I have seen so far.  I plan on writing a series of posts around my experiences.

Welcome friends and foes alike

Recently, I stumbled across Oliver Steele’s site and found his link to The Programmer’s Food Pyramid. Looking it over, I recognized the importance of most of the items there. Reading code, and reading about code of course. Writing code, how obvious. Revising code, okay, I had always lumped that one into the reading code and writing code blocks, but I could see how it could be considered a separate activity. Then, up there at the top, the one that made me think for a minute.

Writing about code.

I had never thought about that one before. But in seeing it there, it makes complete sense. It’s something I should have realized much earlier. It’s something I have always done when reading code. When I find a particularly dense piece of code, something that is far from being intuitive, how did I work it out, and understand it? I would go through it, line-by-line, instruction-by-instruction, and write out what it was doing, in plain English. I was writing about code, while reading it. But it was never a consistent thing, a tool only reserved for special occasions.

This blog is my attempt to change all that. If writing about code once in a great while has helped me in the past, what about writing about it far more often? Once or twice a week? Find some piece of code and analyze it. Tear it down, put it back together, and explain how I think it could be improved in the process. I have to think, it can’t hurt.

Though the blog title contains PHP, it is only one of the languages I plan on writing about here. I’m the type who is always trying something new, so I plan on using this to write about everything I try. So expect that you might see some Java, C, C++, C#, Groovy, Ruby, and who knows what else.

Maybe my musings here will eventually help me become a better programmer, but even better, it might help someone else become a better programmer as well. Feel free to comment, criticize and debate. It can’t hurt, and it might just help.

Musings From the World of PHP