Tips, Tricks, Tools & Techniques

for Internet Business, Life, the Universe and Everything




Category: Security

Database backups

9 September, 2007 (01:22) | Security, Tools, WordPress | By: Nick Dalton

Databases cannot be backed up as regular files: the database continuously writes to its files, so you are very likely to back up an incomplete or corrupt set of files. Instead you need to use a dedicated database backup program to create the backup. After the backup program has done its job you can copy, move and archive the backup files just like any ordinary file.
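For MySQL the workhorse is the mysqldump command, which writes a consistent copy of the database out as a plain SQL file that you can then treat like any other file. Here is a minimal sketch in Python that wraps it; the database name, credentials and output path are placeholders, not anything from this post:

# A minimal sketch, not a turnkey script: dump a MySQL database to a
# compressed SQL file with mysqldump instead of copying the raw database
# files. Database name, credentials and output path are placeholders.
import datetime
import gzip
import subprocess

DB_NAME = "wordpress"
OUTFILE = f"/backups/{DB_NAME}-{datetime.date.today():%Y%m%d}.sql.gz"

# --single-transaction gives a consistent snapshot of InnoDB tables
# without locking the database while the dump runs.
dump = subprocess.run(
    ["mysqldump", "--single-transaction",
     "--user=backupuser", "--password=changeme", DB_NAME],
    capture_output=True, check=True,
)

with gzip.open(OUTFILE, "wb") as f:
    f.write(dump.stdout)
print(f"Wrote {OUTFILE}")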

Most web server control panels come with a database administration program. Popular options are phpMyAdmin for MySQL databases and phpPgAdmin for Postgres. Here are step-by-step backup instructions for some of the more popular control panels. And equally important: restore instructions.

The drawback with these database administration programs is that you have to perform the backup manually. If you’re like most people you will put off a manual chore like this until it’s too late. Therefore the goal should always be to automate your backups. If you’re running a WordPress blog on your web site you should definitely install the excellent WordPress Database Backup plugin. This plugin used to be distributed with WordPress 2.0 but was later mysteriously dropped from the standard distribution. If you have other programs on your web site that store information in a MySQL database then you need a full backup script like AutoMySQLBackup. Note that this latter solution requires some Linux shell knowledge to set up.

A great feature of both the WordPress Database Backup plugin and AutoMySQLBackup is the ability to email the backup files to yourself every day. I recommend that you set up a new Google Gmail account, which comes with over 2 GB of storage, to receive your backup files. Unless you’re a very prolific blogger, 2 GB should last you for quite a while. Then once a month or so you can log in to the email account and delete old backups.
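If you prefer to roll your own along the same lines, here is a minimal sketch that emails a backup file to a Gmail address using Python’s standard library. The addresses, password and file path are placeholders, and Gmail expects an app-specific password over TLS:

# A minimal sketch: email a backup file to yourself. Addresses, password and
# file path are placeholders; Gmail requires an app-specific password and TLS.
import smtplib
from email.message import EmailMessage
from pathlib import Path

backup_file = Path("/backups/wordpress-20070909.sql.gz")

msg = EmailMessage()
msg["Subject"] = f"Database backup {backup_file.name}"
msg["From"] = "backups@example.com"
msg["To"] = "my.backups@gmail.com"
msg.set_content("Nightly database backup attached.")
msg.add_attachment(backup_file.read_bytes(),
                   maintype="application", subtype="gzip",
                   filename=backup_file.name)

# Send over Gmail's SMTP server using STARTTLS.
with smtplib.SMTP("smtp.gmail.com", 587) as smtp:
    smtp.starttls()
    smtp.login("my.backups@gmail.com", "app-password-here")
    smtp.send_message(msg)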

Just as with regular backups, it is critical that you test your restore procedure occasionally. For this you can install MySQL or Postgres locally on your PC and restore the data to it. Backup formats typically include integrity checks (a gzipped dump, for example, won’t even decompress if it’s corrupt), so if the file restores without errors it’s very likely that the restore was successful, without you having to verify the contents of all the data.
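As a rough sketch of what an automated test restore could look like against a local MySQL install (the dump path, database name and credentials are placeholders):

# A minimal sketch: restore a dump into a scratch database on a local MySQL
# install and list the restored tables. Paths and credentials are placeholders.
import gzip
import subprocess

DUMP = "/backups/wordpress-20070909.sql.gz"
TEST_DB = "restore_test"

# Create an empty scratch database to restore into.
subprocess.run(["mysql", "-u", "root",
                "-e", f"CREATE DATABASE IF NOT EXISTS {TEST_DB}"], check=True)

# Decompress the dump and feed it to the mysql client.
with gzip.open(DUMP, "rb") as f:
    sql = f.read()
subprocess.run(["mysql", "-u", "root", TEST_DB], input=sql, check=True)

# A quick sanity check: the restored database should contain tables.
tables = subprocess.run(["mysql", "-u", "root", "-e", "SHOW TABLES", TEST_DB],
                        capture_output=True, text=True, check=True)
print(tables.stdout)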

Now go back something up!

Backup your web site

4 September, 2007 (14:57) | Security | By: Nick Dalton

If something were to happen to your web server right now, how long would it take you to restore all files and all functionality to your site? Maybe your web hosting plan includes daily backups, so why worry? There are many reasons to worry, and in this case a little proactive worry is a good thing.

When was the last time you actually tested the backup and restore function offered by your web host? Assuming that this checks out ok, there are other reasons why you may need to quickly restore your web site to a different location: a billing dispute with your current web host, spam complaints against your server, copyright infringements on your site. (If you are guilty of sending spam and using copyrighted material without permission, you certainly deserve to lose your site. But even allegations against an innocent party may cause you to lose control of your site during the investigation.) The hosting company may also simply go out of business. The ways you can lose your web site are many and unpredictable.

There are several parts to a complete backup of your web site:

1. All the files on the web server

This one is the easiest. Just use your favorite ftp program and copy all the files in the web server root directory (httpdocs or public_html depending on your control panel) and all subdirectories to your PC. Do it right now.
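If you would rather script this than click through an FTP client every time, here is a minimal sketch using Python’s ftplib; the host, credentials and directory names are placeholders:

# A minimal sketch: recursively download a web server's document root over FTP.
# Host, credentials and directory names are placeholders.
import os
from ftplib import FTP, error_perm

def mirror(ftp, remote_dir, local_dir):
    os.makedirs(local_dir, exist_ok=True)
    ftp.cwd(remote_dir)
    for name in ftp.nlst():
        if name in (".", ".."):
            continue
        local_path = os.path.join(local_dir, name)
        try:
            # If we can change into it, it's a directory: recurse into it.
            ftp.cwd(name)
            ftp.cwd("..")
            mirror(ftp, f"{remote_dir}/{name}", local_path)
            ftp.cwd(remote_dir)
        except error_perm:
            # Otherwise treat it as a regular file and download it.
            with open(local_path, "wb") as f:
                ftp.retrbinary(f"RETR {name}", f.write)

ftp = FTP("ftp.example.com")
ftp.login("username", "password")
mirror(ftp, "/httpdocs", "./site-backup")
ftp.quit()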

2. Any software you have installed

If you have installed WordPress, a forum, a shopping cart or any other software that did not come with your hosting plan, make sure you have a backup copy of that software and the accompanying installation instructions. Don’t rely on the software vendor to be able to provide you with the exact old version of their software that you have installed. (When you need to quickly restore your web site is not a good time to upgrade software components. You have enough things to worry about as it is.)

It is also better to have the original installation files and reinstall them from scratch, rather than to rely on finding all the necessary installed files scattered on the web server file system.

3. Configurations

Don’t forget to keep notes of all configuration changes you make to your web site. This includes DNS settings, Apache configuration changes, and anything you setup in your control panel.

It is difficult to automate backups of configurations since you typically don’t have ready access to where the settings are saved. I just keep a text file on my PC where I jot down running notes as I make configuration changes to my sites: one text file per web server.

4. Databases

Backing up databases can be tricky. I’ll cover this topic in a subsequent post.

Restore

Any backup system is worthless if you cannot restore your data. (Actually it’s worse than worthless: you’ve wasted time backing up data and you’ve been lulled into a false sense of security.) The time to test if you can restore your data is not after disaster strikes. Make sure that you test your procedures before you need to rely on them.

A simple (but not complete) check is to restore all the files to a different subdirectory. Create a new “restore” directory and try to restore all files there. Then compare the contents of the files with the originals. You probably cannot “run” your web site from this restore directory but at least you have verified that your backup does contain valid data and that you are able to restore it.
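A quick way to do that comparison is Python’s filecmp module. This sketch (the two directory paths are placeholders) walks both trees and reports any files that differ or are missing from the restore:

# A minimal sketch: recursively compare the original web root with the
# restored copy and report anything that differs. Paths are placeholders.
# Note: dircmp does a shallow comparison (size and timestamp) by default.
import filecmp

def report(cmp):
    for name in cmp.diff_files:
        print(f"DIFFERS:    {cmp.left}/{name}")
    for name in cmp.left_only:
        print(f"MISSING:    {cmp.left}/{name}")
    for name in cmp.funny_files:
        print(f"UNREADABLE: {cmp.left}/{name}")
    for sub in cmp.subdirs.values():
        report(sub)

report(filecmp.dircmp("/var/www/httpdocs", "/var/www/restore"))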

A more complete test would involve getting a separate web server to restore your files to. This can be a cheap shared hosting plan as long as it has the same software installed as your regular web server. If you are able to restore all your data and create a functional web site on this new web server, then you are in good shape. Repeat this exercise on a regular basis. If you have outsourced your web master tasks then this is a good exercise to ensure that the outsourced resource fully understands your site and your systems.

Keeping this second web site online and regularly updated using the procedures above makes it a disaster recovery site. Many large corporations have such a site available and ready to switch over to should disaster strike the main site. All you have to do is change the DNS settings to point to this site instead of the original site and you’re back in business while you work on restoring the main site. Even if the disaster recovery site is not quite up-to-date it is still probably better than having no site at all.

What are some tools and techniques that you use to backup your web server?

Happeneur Call

29 August, 2007 (16:06) | Security | By: Nick Dalton

I had a great teleconference today with Mike Jay and his Happeneur coaching students. The topic was web site security. It was a very interactive call with great questions from the participants. Before we were done we had covered topics from backups and protecting sensitive data to redundant systems.

I managed to squeeze in a sentence or two about my Digital Security Report. But this was a no-pitch, no-fluff, pure-information type of call. Just the way I like it.

If you want me to do a similar call with your customers let me know. My availability is limited, but this is something I enjoy doing so I do my best to accommodate requests.

Browser toolbars reveal more than you think

27 August, 2007 (06:34) | Search Engines, Security | By: Nick Dalton

All the major search engines provide toolbars that you can download and install in your browser. Each toolbar has some nifty features that are commonly not found in browsers, which makes them compelling enough to download and install. One feature common to all toolbars is the ability to search the web using the search engine that made the toolbar. This is of course the reason for the toolbar’s existence: to funnel more searches to the search engine.

Another common “feature” of search engine toolbars is to report home about each web page that you visit. Even though you can in most cases turn off this feature, the toolbar offers some compelling extra benefit so that most users keep it enabled. (Or they are just unaware of the “call home” feature.)

If we for the moment disregard the privacy aspects of reporting every web page that you visit, there is another implication that most web site owners are not aware of: the web pages reported by toolbars are fed into the search engine’s web crawler. (I don’t have proof that this is the case for all toolbars, but I know it’s true in at least one case. And that’s enough to cause trouble for web masters.)

What’s the problem with that, you say? One example could be that you’re working on a new web site that is not quite ready to be public yet. And you haven’t bothered to password protect it during the development. Who is going to guess your new domain name anyway? As you’re busy developing your site, the toolbar sends the URL of every page – finished or not – to the search engine.

Another, perhaps more serious, example is the thank you page of web sites that sell digital products. When you – or any one of your customers – goes to the thank you page, the toolbar reports the URL to the search engine. If you don’t have any additional protection on the thank you page it will be included in the search engine index. Then when a potential customer uses that search engine it’s possible that your thank you page shows up in the search results. And it’s very likely that the person searching was looking to buy your product. But now, with direct access to the thank you page, the potential customer can download your product for free. You just lost a sale.

If you have good web analytics it may be possible to see these direct accesses and calculate how much money you’re losing. But it’s also very likely that the search engine has cached your page, and possibly even the product download itself. In that case you will never even know that your product was downloaded without payment.

My Digital Security Report has advice on how to protect your digital products from overzealous search engine toolbars.

Can anyone view your WordPress plugins?

20 August, 2007 (06:24) | Security | By: Nick Dalton

If you are running WordPress go to www.yourdomain.com/wp-content/plugins. If you see a directory listing of all your installed plugins you may want to follow the steps described by Shoemoney here.
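If you run several blogs, a minimal sketch like the following can check each one for you; the domain names are placeholders and the test simply looks for the “Index of” text that an auto-generated Apache directory listing contains:

# A minimal sketch: check whether the wp-content/plugins directory of each
# blog returns a directory listing. Domains are placeholders; the check looks
# for the "Index of" text an Apache auto-generated listing contains.
from urllib.request import urlopen
from urllib.error import HTTPError, URLError

BLOGS = ["www.example.com", "blog.example.org"]

for domain in BLOGS:
    url = f"http://{domain}/wp-content/plugins/"
    try:
        page = urlopen(url, timeout=10).read().decode("utf-8", "replace")
        exposed = "Index of" in page
    except (HTTPError, URLError):
        exposed = False   # a 403/404 means no listing is being served
    print(f"{url}: {'EXPOSED' if exposed else 'ok'}")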

This is not a major security hole and you are not alone in exposing your plugins. Google has indexed over 500,000 plugin directory listing pages.

It appears that this will be fixed in the 2.3 release of WordPress.

robots.txt

13 August, 2007 (22:02) | Search Engines, Security | By: Nick Dalton

Back in the days around 3 B.G. (Before Google), AltaVista was the new search engine on the block. In an effort to show off the power of their minicomputers, the AltaVista team at Digital decided to crawl and index the entire web. This was at the time a new concept. Many web masters didn’t relish the idea of a “robot” program accessing every page on their web site as this would add more load to their web servers and increase their bandwidth costs. So in 1996 the Robots Exclusion Standard was created to address these web master concerns.

Using a simple text file called robots.txt you can instruct web crawlers (a.k.a. robots) to stay out of certain directories. Here is a very simple robots.txt which disallows all robots (User-agents) access to the /images directory.

User-agent: *
Disallow: /images

By disallowing /images you are also implicitly disallowing all subdirectories under /images, such as /images/logos and any files beginning with /images such as /images.html.

Curiously there was no “Allow” directive in the first draft of the standard. It was added later, but it’s not guaranteed to be supported by all robots. So anything that is not specifically disallowed should be considered fair game for web crawlers.

To disallow access to your entire web site use a robots.txt like this:

User-agent: *
Disallow: /

If User-agent is * then the following lines apply to all search engine robots. By specifying the signature of a web crawler as the User-agent you can give specific instructions to that robot.

User-agent: Googlebot
Disallow: /google-secrets

Since the original spec was published several search engines have extended the protocol. One popular extension is to allow wildcards.

User-agent: Slurp
Disallow: /*.gif$

This prevents Yahoo! (whose web crawler is called Slurp) from indexing any files on your site that end with “.gif”. Keep in mind that wildcard matches are not supported by all search engines so you have to preface these lines with the appropriate User-agent line.

You can combine several of the above techniques in one robots.txt file. Here’s a theoretical example.

User-agent: *
Disallow: /bar
User-agent: Googlebot
Allow: /foo
Disallow: /bar
Disallow: /*.gif$
Disallow: /

This would result in the following access results for a few URLs:

URL                          Googlebot   Other robots
example.com/foo.html         Allowed     Allowed
example.com/food.html        Allowed     Allowed
example.com/foo/             Allowed     Allowed
example.com/foo/index.html   Allowed     Allowed
example.com/foo.gif          Allowed     Allowed
example.com/fu.html          Blocked     Allowed
example.com/bar.html         Blocked     Blocked
example.com/bar/index.html   Blocked     Blocked
example.com/img.gif          Blocked     Allowed

Computer programs are pretty good at following instructions like these. But for a human brain it can quickly get overwhelming, so I highly encourage you to keep it simple. One of the longer robots.txt files I’ve encountered is from www.seobook.com – it’s over 300 lines long. The site owner Aaron Wall is the author of the excellent SEO Book; he knows what he’s doing.

For us mortals there is a robots.txt analysis tool in Google’s webmaster tools. Highly recommended. Another good resource for more information on the Robots Exclusion Standard is www.robotstxt.org.
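You can also sanity-check your live robots.txt from your own machine with Python’s urllib.robotparser, with the caveat that it implements the basic standard and may not honor non-standard extensions such as wildcards; the domain and paths below are placeholders:

# A minimal sketch: ask Python's robots.txt parser which URLs a given robot
# may fetch. It implements the basic standard only and may not honor wildcard
# extensions such as "Disallow: /*.gif$". Domain and paths are placeholders.
from urllib.robotparser import RobotFileParser

rp = RobotFileParser("http://www.example.com/robots.txt")
rp.read()   # fetch and parse the live robots.txt

for path in ["/foo.html", "/bar.html", "/images/logo.gif"]:
    for agent in ["Googlebot", "SomeOtherBot"]:
        verdict = "Allowed" if rp.can_fetch(agent, path) else "Blocked"
        print(f"{agent:<14} {path:<20} {verdict}")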

Today when companies are spending a lot of money to be included in search engine listings, the idea of excluding your content may seem quaint. But from a security perspective there are many valid reasons for limiting what a search engine indexes on your site. See my Digital Security Report for more information.

Update to WordPress 2.2.2

6 August, 2007 (21:55) | Security, WordPress | By: Nick Dalton

If you are using WordPress 2.2.1 you should immediately get the 2.2.2 security update.

The discovered bug is a Cross-Site Scripting vulnerability. See http://trac.wordpress.org/ticket/4689 for more details.

The WordPress developers assigned this bug a priority of “highest omg bbq” :-)

How secure is your web site?

30 July, 2007 (10:47) | Security | By: Nick Dalton

Even if your web site does not hold any national security documents, you should take the security of your web site seriously. This is especially important if you are selling products on your web site.

A typical setup is that you have one or more sales pages for your product and when a prospect clicks on an order link they are redirected to PayPal, 2CheckOut or some other payment processing service. This setup is good for several reasons, the most important being the fact that you avoid having to deal with credit card numbers and other sensitive customer information. So far in 2007 there have been published reports of more than 89 million identity records exposed from data breaches. See the Identity Theft Resource Center for some really scary reading. Leaving data theft worries to companies who specialize in handling financial information is a great strategy for most small businesses.

But that does not leave you totally in the clear. If you are selling a digital product that the customer can download immediately after the purchase, you need to ensure that the product is protected. There are many ways that web site owners inadvertently leave their valuable products unprotected – making them available for free to anyone who knows where to look.

Here are the 3 most common errors:

1. Easy to guess filenames.

If the title of your e-book is “AdWords Secrets”, then don’t name the file AdWordsSecrets.pdf. It is just too easy to guess that the URL for downloading your e-book might be www.example.com/AdWordsSecrets.pdf

At least add a version number or a date into the filename, e.g. AdWordsSecrets_v42.pdf or AdWordsSecrets_20070707.pdf. This will make it much more difficult to guess the filename and the URL.
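Going one step further than a date or version number (my suggestion, not from the original post), you can add a short random token that makes the filename practically impossible to guess:

# A minimal sketch: build a download filename from the product title, the
# date and a random token so the URL cannot be guessed from the title alone.
import secrets
from datetime import date

title = "AdWordsSecrets"
token = secrets.token_hex(4)            # 8 random hex characters
filename = f"{title}_{date.today():%Y%m%d}_{token}.pdf"
print(filename)                         # e.g. AdWordsSecrets_20070909_9f3a1c2b.pdf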

2. Search engines indexing the download page or the product itself.

Today’s search engines are extremely efficient at spidering content on the web, and keeping your web pages secret from them is becoming increasingly difficult. Even if you don’t have any public links to your secret product download page, there are several ways that a search engine can find out about the page and index it. Once it’s indexed, anyone who uses that search engine may see your product download page in the search results, and they can download your product for free.

You should regularly check what each search engine knows about your web site. In most major search engines you can use the site: operator, e.g. site:example.com, to get a listing of all the pages on your web site that have been indexed.

3. Improperly configured robots.txt

robots.txt is a text file that you can place on your web server to guide search engines to what content they are allowed to index and what is off limits. While this may prevent most search engines from indexing your secret web pages, it opens up another vulnerability: any curious web surfer is able to view your robots.txt file. If the file explicitly forbids search engines from looking in the /downloads or /report directories, then it’s very likely that’s where the secret files are stored. With this knowledge the web surfer can more easily find your product and download it for free.

You need to strike the right balance between protecting certain files and directories in robots.txt while not revealing too much about the structure of your web site.

Selling digital products online is a great business. Make sure that you get paid for the products that you have painstakingly created by following the guidelines above and applying common sense.

More details on how to protect your digital products can be found in my latest report: The Digital Security Report.

jvAlert News recommends the Digital Security Report

24 July, 2007 (23:02) | Security | By: Nick Dalton

Micheal Savoie at jvAlert News has a nice plug for my Digital Security Report.

It’s a long post, so you need to scroll down to the “Security Expert Exposes Holes In Your Website!” headline.