Hardening our websites’ security

I’ve led technical teams for over ten years in web or communication agencies, and in that time I’ve had the chance to help put many sites online. The last thing you want is for a website you’ve worked on to become a hacking target.

So let’s be very clear about one thing: it will be. Do not imagine the internet to be a harmless happy place, because it is not. You’ll find every type of person online, from truly good to truly evil, just like in the real world. There is a 100% chance that the site you’ve put online will be a target at some point.

Let’s not dramatize too much though. You’ll be a target some day, but not necessarily the target of a huge, personal, and direct attack. You may be scanned by a black hat SEO bot, or one of your forms might be picked up by another bot that will hit it for a while. Some bot will try to gain access to your admin interface, while another one might want to slow you down with a flood of requests. These examples are all automated attacks, wandering the internet to find a new site to focus on. If your site is not ready to cope, they can damage it.

The reason I’m writing this article is not to produce another list of security best practices. Instead, I want to share the attacks that sites I was responsible for fell victim to, and what we changed to make them more robust.

Improving the admin area protection

Your site’s admin area is an important place to protect because anyone who gains access to it can perform operations on your site: content creation, user manipulation, template changes and so on. So it’s crucial to keep unwanted visitors out.

Attacks we’ve been victims of in the admin area

The usual attack vector is to try the most common admin URLs (for example /admin, /wp-admin (hello WordPress), /administration, /login and so on), then, once a page is found, to hit the login form with login / password combinations until one succeeds. We have already spotted unknown administrators in our admin areas, who had created black hat SEO content.

This attack can be split into two parts: on one side the attacker tries to locate the admin login form, and on the other side they hit the form until they find a valid login / password combination.

Denying access to the admin area

The first thing we can do is hide the login page so that the admin URLs can be reached neither by an internet user nor by a bot. Here are two techniques we’ve applied which worked for us. The first one is to put your admin area pages behind .htpasswd protection, and the second one is to only allow certain explicit IP addresses to access the admin pages.

.htpasswd protection

As its name implies, this protection is installed by creating an .htpasswd file that holds login / password combinations, and referencing it from your server configuration (an .htaccess file on Apache). When a visitor tries to visit a protected page, the browser prompts for a login and password that must be filled in correctly to view the page behind. If the credentials are incorrect, the browser prompts again, and a visitor who gives up is left on the server’s 401 Unauthorized error page.
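
To make the mechanism concrete, here is a minimal Python sketch of what the server does with the credentials the browser sends back (HTTP Basic authentication). It is an illustration under assumptions, not how Apache is implemented: the CREDENTIALS store, the "editor" account and the SHA-256 hashing are hypothetical, and real .htpasswd files use other hash formats.

```python
import base64
import hashlib
import hmac

# Hypothetical credential store: username -> SHA-256 hex digest of the password.
# Real .htpasswd files use other formats (bcrypt, APR1-MD5); this is only illustrative.
CREDENTIALS = {"editor": hashlib.sha256(b"correct horse").hexdigest()}


def check_basic_auth(authorization_header):
    """Return the HTTP status a server would send for this Authorization header."""
    if not authorization_header or not authorization_header.startswith("Basic "):
        return 401  # no credentials yet: the browser shows its login prompt
    try:
        decoded = base64.b64decode(authorization_header[len("Basic "):]).decode("utf-8")
        username, password = decoded.split(":", 1)
    except (ValueError, UnicodeDecodeError):
        return 401  # malformed header
    expected = CREDENTIALS.get(username, "")
    supplied = hashlib.sha256(password.encode("utf-8")).hexdigest()
    # compare_digest avoids leaking information through comparison timing
    if hmac.compare_digest(expected, supplied):
        return 200
    return 401
```

Basic authentication only encodes the credentials, it does not encrypt them, so it should be served over HTTPS.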

Advantage

Quick and simple to implement. It does not need any further configuration once in place. It essentially behaves as a second layer of authentication.

Disadvantage

Essentially, what we’ve done here is add an authentication in front of another authentication. There are now two closed doors to force before reaching your admin pages, but the .htpasswd door is still public and can still be attacked, and a successful attack leads to the admin login again. Still, that’s better than nothing and should already stop some low-level bots from attacking you.

IP whitelist protection

In other words, here you deny access to your admin URLs to everyone except certain specifically authorized IP addresses. This is done by adding a few lines to your .htaccess file.
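
The deny-by-default logic behind such a rule can be sketched in a few lines of Python. This is only an illustration of the decision, assuming the application can see the client IP; the addresses below are documentation-range placeholders, not real agency or VPN addresses.

```python
import ipaddress

# Hypothetical allowlist: documentation-range addresses standing in for an
# office's fixed IP and a VPN exit range.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("203.0.113.10/32"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_admin_access_allowed(client_ip):
    """Deny by default; allow only addresses inside an explicitly listed network."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # unparseable address: treat as denied
    return any(address in network for network in ALLOWED_NETWORKS)
```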

Advantage

This protection will isolate your admin interface from bots and unwanted visitors, and that’s a huge security gain.

Disadvantage

Every IP is denied by default, so you need to manually specify which IPs can reach the back office. This raises an IP management issue because your IP can change: at work, at home, or on the go, you won’t have the same IP address. The recommended solution is to use a VPN, and to authorize the VPN’s IP address, or an address you can reach via the VPN.

Here’s the policy we enforce at Wonderful: for each website we build, the admin area is protected by a whitelist. It can be seen as a harsh decision, but we’ve chosen to do this not because we read it in a blog post like this one, but because we found out the hard way that we were being hacked. The Wonderful agency IP address is fixed, and is whitelisted. When we are on the go or working remotely, we use a VPN to connect to the Wonderful agency network, which then grants us remote access to the back office via the agency’s IP.

For our clients we apply the same technique: we explain to them that, for good security reasons, we must enforce a strict IP whitelist policy. With some explanation of the risks involved and the reality that hacking attempts are a real thing even for us, this policy is something they both understand and appreciate. We authorize our clients’ fixed office IP addresses, and encourage them to adopt a VPN policy for remote working. This is good practice beyond the scope of website security anyway, especially with the advent of remote working.

We are also able to temporarily authorize a home IP if the use of a VPN is not possible for them at some point, but that’s considered an exception.

Variant : country whitelist

On the web, there’s no such thing as protection through distance like in the physical world. Living in France protects me physically from ill-intentioned people who live, say, 100 kilometers away. On the internet, my site can be targeted by anyone anywhere on the planet. When an IP whitelist is not an option, I’ve seen site owners allow back office access (sometimes even access to the public site) only to IP addresses from one or a few explicitly listed countries. A very locally focused brand, for example, may have no interest whatsoever in an online presence abroad, and might get more trouble than benefit from being reachable from anywhere in the world.

Preventing repeated login attempts

If you’ve hidden your admin login page, an attacker should not be able to access your login form. But let’s have a look at this subject nonetheless in case you could not hide your admin urls, or in case you have a public facing login form.

There are several techniques to prevent repeated login attempts: limiting the number of attempts a user can make, limiting the number of attempts an IP can make, creating a user blacklist, replacing explicit login error messages with generic ones…

Limiting the number of connection attempts

The aim here is to count unsuccessful login attempts, to reset the counter on a successful login, and to ban a user if the counter reaches a certain threshold.

Advantage

The website protects itself automatically from unwanted intruders. It can ban unwanted IPs or usernames either temporarily or permanently.

Disadvantage

An attacker aware of this protection mechanism could turn it against you by generating wrong password attempts on valid usernames, which in turn would ban those valid users. You can mitigate this by preferring a temporary ban, for example 24 hours, over a permanent one. This is long enough to discourage an automated attacker, who would only be able to try, for example, 5 login / password combinations per day. And if a valid user is blocked by this, they will regain access the next day (or call you before that).
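
The counting, reset and temporary ban described in this section can be sketched as follows. This is a minimal in-memory Python illustration under assumptions: the LoginThrottle class and its parameter names are hypothetical, the key can be a username or an IP address, and a real site would persist the counters rather than keep them in a dictionary.

```python
import time


class LoginThrottle:
    """Count failed logins per key (username or IP) and ban temporarily."""

    def __init__(self, max_attempts=5, ban_seconds=24 * 3600, clock=time.time):
        self.max_attempts = max_attempts
        self.ban_seconds = ban_seconds
        self.clock = clock          # injectable so the behaviour can be tested
        self.failures = {}          # key -> consecutive failed attempts
        self.banned_until = {}      # key -> timestamp when the ban expires

    def is_banned(self, key):
        expires = self.banned_until.get(key)
        if expires is None:
            return False
        if self.clock() >= expires:
            del self.banned_until[key]  # the temporary ban is over
            return False
        return True

    def record_failure(self, key):
        self.failures[key] = self.failures.get(key, 0) + 1
        if self.failures[key] >= self.max_attempts:
            self.banned_until[key] = self.clock() + self.ban_seconds
            self.failures[key] = 0

    def record_success(self, key):
        # A successful login resets the counter
        self.failures.pop(key, None)
```

The login handler would call is_banned before checking the password, then record_failure or record_success depending on the outcome.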

Using a blacklist

Using a blacklist is a way to prevent usernames or IP addresses from logging in to your site altogether. You can start with a manual blacklist, especially with common usernames like admin, administrator, webmaster…, which you should not use as valid usernames yourself.

You can then automate its growth. For example, the connection limit technique can be used in combination with a blacklist: after a certain number of failed login attempts, you automatically put the rogue username or IP address in the blacklist.
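
As a rough Python sketch of this combination, assuming an in-memory store: the blacklist starts with the common usernames mentioned above and adds an IP address once it exceeds a failure limit. The LoginBlacklist class and the limit of 5 are hypothetical choices for the example.

```python
class LoginBlacklist:
    """Manual username blacklist that also grows automatically with rogue IPs."""

    def __init__(self, seed_usernames=("admin", "administrator", "webmaster"),
                 max_failures=5):
        self.usernames = {name.lower() for name in seed_usernames}
        self.ip_addresses = set()
        self.max_failures = max_failures
        self._failures = {}  # ip -> failed login attempts so far

    def is_blocked(self, username, ip_address):
        return username.lower() in self.usernames or ip_address in self.ip_addresses

    def record_failure(self, ip_address):
        self._failures[ip_address] = self._failures.get(ip_address, 0) + 1
        if self._failures[ip_address] >= self.max_failures:
            # Past the limit: the IP joins the blacklist
            self.ip_addresses.add(ip_address)
```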

Advantage

The blacklist grows over time, and can also be shared between websites or web services.

Disadvantage

It works well against manual or poorly implemented attacks, but be aware that login attacks are usually a bit more sophisticated than that, with the username or IP changing automatically with every request. This remark also applies to every technique in this section, by the way.

Improving forms protection

Forms are extremely widely used on websites, as they allow users to interact with you through your website, be it contact forms, newsletter forms and so on. As an interaction medium, forms are also a target of choice for hackers, which is why it’s essential to harden their processing too. Over the past 4 years, at least one website under my responsibility has been targeted by the following attempts, and they were not necessarily famous or trendy websites.

Attacks we’ve spotted on forms

Mostly, an attacker uses a form to send data to your form processing page, either by filling in and submitting the form, or by forging a request directly to the processing page without using the form.

The aim of the attack might be to persist malicious data into your database or to delete some. It might also be to flood your service, either by saving a lot of data into your database or by making your site work so hard that it slows down massively, up to complete unavailability. It might also be to damage your reputation when the form’s task is to send an email to someone: if this form is attacked and generates too many emails, your server might end up on a spam blacklist and no longer deliver anything.

We’ve had to deal with WordPress comment spam in the past, as well as newsletter and contact forms being targeted en masse.

Overall, the actions to consider revolve around checking how the data was posted, and checking the posted data itself.

Make sure the posted data comes from the form supposed to send it

It’s important because without this, an attacker can write a script with a loop (a finite one if you’re lucky), and tell it to send requests with an array of data directly to your form processing page. When the script runs, you might get 1000 POST requests on your form in a second. As you can expect, a user is nowhere near this fast. If you have logs on your site, you can detect attempts like this based on the request frequency, all the more so if your site averages 100 visits per day.

Here are two techniques to prevent this.

The first one consists of checking the request’s origin on the form processing page. If the request’s referrer (the Referer HTTP header) does not point to your website, or not to the page where the form is supposed to be, you may be looking at a bad request.
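
Here is a minimal Python sketch of such an origin check, assuming the processing page has access to the Referer header value. The function name and the example host are hypothetical. Keep in mind that this header can be omitted or forged by the client, so the result is a hint rather than a proof.

```python
from urllib.parse import urlsplit


def is_expected_referer(referer, expected_host, expected_path=None):
    """Return True when the Referer points to our own site (and page, if given)."""
    if not referer:
        return False
    parts = urlsplit(referer)
    if parts.hostname != expected_host:
        return False
    return expected_path is None or parts.path == expected_path
```

For example, is_expected_referer(header_value, "www.example.com", "/contact") would only accept submissions that claim to come from the contact page.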

The second one consists of generating a unique random value on the form, and checking this value on the form processing page. If this value is absent or does not match a valid one, it’s a clue that the data might not come from the page, and you can abort the form processing (plus maybe take a ban action). In WordPress, you can use nonces for this, while in Symfony, you can use CSRF tokens.
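
The token idea can be sketched in Python as follows. This is not the WordPress or Symfony implementation, only an illustration under assumptions: the token is a random value signed with a server-side secret and tied to a session identifier, and SECRET_KEY, issue_form_token and is_valid_form_token are hypothetical names. This simple version does not expire tokens or make them single-use.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret; a real site would load it from its configuration.
SECRET_KEY = secrets.token_bytes(32)


def _sign(session_id, nonce):
    message = f"{session_id}:{nonce}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def issue_form_token(session_id):
    """Generate the random value to embed in a hidden field of the form."""
    nonce = secrets.token_hex(16)
    return f"{nonce}.{_sign(session_id, nonce)}"


def is_valid_form_token(session_id, token):
    """Check, on the processing page, that the token was issued for this session."""
    if not token or "." not in token:
        return False
    nonce, signature = token.split(".", 1)
    expected = _sign(session_id, nonce)
    # Constant-time comparison on bytes
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```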

Make sure the posted data uses the correct HTTP method

The aim here is to choose which HTTP method best describes the form processing. Is it better to use a GET, POST, PUT or DELETE method for this form? Once you’ve chosen, make sure your form processing page is only accessible through this method, and deny the others. It won’t protect you from someone who has studied your form specifically, but it will deny any bot that blindly uses, for example, GET or POST on any form it finds if yours expects another method.
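
In code, this boils down to a small guard in front of the processing logic. The Python sketch below assumes a hypothetical route table (ALLOWED_METHODS and the /contact/submit path are made up for the example) and returns the status code the server would answer with.

```python
# Hypothetical route table: processing path -> the only methods it accepts.
ALLOWED_METHODS = {"/contact/submit": {"POST"}}


def method_status(path, method):
    """Return 200 when the method is accepted, 405 when denied, 404 if unknown."""
    allowed = ALLOWED_METHODS.get(path)
    if allowed is None:
        return 404
    if method.upper() not in allowed:
        return 405  # Method Not Allowed
    return 200
```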

Make sure the data has not been posted by a bot

Here are two techniques we’ve tried that start before the form submission to detect robots. The most commonly known and used technique is the captcha. Captchas are very effective, but also a bit big and ugly. Sometimes a captcha is hard to fit on a specific form, for design reasons for example, so we’ve also tried another technique called the honeypot.

To summarize the honeypot technique, you add a field to the form that is hidden from your users in some way. Because users can’t see the field, they should not fill it in, and we expect this field to be empty on the form processing page. Bots targeting the form code, however, would see a field to fill and automatically put a value in it. Once the form is submitted, if the field is not empty, you know it’s a bot and can abort the form processing. David Walsh wrote about the technique in his spam prevention blog post if you are interested in the details.
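
On the processing side, the honeypot check is tiny. Here is a Python sketch, assuming the submitted data arrives as a dictionary and that the hidden field is named "website" (a hypothetical name; any plausible-looking field name that your real form does not use will do).

```python
# Hypothetical name of the field that is hidden from human visitors.
HONEYPOT_FIELD = "website"


def looks_like_bot(form_data):
    """A human leaves the hidden field empty; a bot that fills every field does not."""
    return bool(form_data.get(HONEYPOT_FIELD, "").strip())
```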

The next step is to add a bot detection routine once the form data reaches the processing page. The first thing we verify there is the result of the honeypot of course, but we’ve added more checks with experience, such as:

  • Nonce or CSRF checks, as explained above
  • Email pattern detection
  • IP blacklist or whitelist
  • Spam plugin feedback
  • Content blacklist (profanity, spam keywords…)
  • Content analysis (too many misspellings, toxicity, lorem ipsum)
  • And so on, depending on your use case

The idea there is to capitalise on a detection routine that can be improved over time and shared between sites.
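
One way to structure such a routine is as a list of small, independent checks that each return a reason or nothing. The Python sketch below covers only three of the checks listed above, and everything in it is an assumption for illustration: the field names, the deliberately loose email pattern and the keyword list.

```python
import re

# Deliberately loose pattern: something@something.tld
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
# Illustrative keywords only; a real list is built from the spam you receive.
SPAM_KEYWORDS = ("casino", "lorem ipsum")


def check_honeypot(data):
    return "honeypot filled" if data.get("website", "").strip() else None


def check_email(data):
    return None if EMAIL_PATTERN.match(data.get("email", "")) else "malformed email"


def check_content(data):
    message = data.get("message", "").lower()
    hits = [word for word in SPAM_KEYWORDS if word in message]
    return "spam keywords: " + ", ".join(hits) if hits else None


# New checks are added to this list over time and shared between sites.
CHECKS = [check_honeypot, check_email, check_content]


def detect_bot(data, checks=CHECKS):
    """Run every check and return the reasons the submission looks automated."""
    return [reason for reason in (check(data) for check in checks) if reason]
```

An empty list means no check objected; a non-empty list can be logged to see which checks catch the most spam.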

Escaping form data correctly when writing or reading

This matter is really code related. It consists of not trusting any value coming from a form, and running it through the appropriate escaping functions before storing it in a database and before displaying it on a page. It’s very important to do so because without this, an attacker can do two things: inject malicious code into your pages (known as XSS), or alter database queries (known as SQL injection). Those two attacks can be dramatic, but they are well avoided by escaping form values correctly.
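
Here is a short Python illustration of both sides, using only the standard library. It is a sketch under assumptions: the comments table and the two function names are hypothetical. On the database side it uses query placeholders, which let the driver handle the values instead of escaping them by hand; on the display side it uses html.escape.

```python
import html
import sqlite3


def save_comment(connection, author, body):
    # Placeholders keep user input out of the SQL text, so it cannot alter the query
    connection.execute(
        "INSERT INTO comments (author, body) VALUES (?, ?)", (author, body)
    )


def render_comment(author, body):
    # html.escape turns <, >, & and quotes into entities before display
    return f"<p><strong>{html.escape(author)}</strong>: {html.escape(body)}</p>"
```

With this in place, a submitted script tag is stored as plain text and displayed as text, rather than being executed by the browser.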

For French readers, this article on website security by @jesuisundev explains the different exploits in detail.

Limiting bots impact

We’ve seen in the admin section of this article how to prevent bots from making requests to the site admin, but they can still harm the front office. We’ve spotted some bot activity in our logs; here’s what bots seem to do apart from trying to hit the admin or forms, and how we can limit their impact.

By trapping bad crawlers

Bad crawlers are robots that try to crawl your entire site, even the URLs they should not, as defined in your robots.txt file. Good bots usually respect the robots.txt rules, so to trap bad crawlers we can use the blackhole technique. This technique is explained in depth by Jeff Starr in his blackhole article, but it can be summarized as follows. Put a link on your website to a page that is specifically forbidden to bots by a robots.txt rule. When a bot ignores this rule and crawls the hidden page nonetheless, the hidden page, which is in fact a trap, puts the bot’s IP address into a blacklist and thus prevents it from accessing your site. If you have a look at the blacklist over time, you’ll see it growing inexorably.
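
The trap logic can be sketched in Python as follows. This is not Jeff Starr’s implementation, only a minimal in-memory illustration of the summary above; the /blackhole/ path and the Blackhole class are hypothetical, and the path is assumed to be disallowed in robots.txt.

```python
# Hypothetical trap path, assumed to be forbidden to bots in robots.txt:
#   User-agent: *
#   Disallow: /blackhole/
TRAP_PATH = "/blackhole/"


class Blackhole:
    def __init__(self):
        self.blacklist = set()  # IP addresses that ignored robots.txt

    def handle(self, ip_address, path):
        """Return the status code to answer with for this request."""
        if ip_address in self.blacklist:
            return 403
        if path.startswith(TRAP_PATH):
            # Only a client that ignores robots.txt ends up here
            self.blacklist.add(ip_address)
            return 403
        return 200
```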

By acting on a bot list (be it a blacklist or whitelist)

I’ve seen many times in analytics data (Google Analytics for example), traces of bots. If you want to look into your own analytics, there are many articles on the web about how to spot bot traffic in GA, and chances are that by following one of them, you’ll stumble upon bot traffic in your own stats.

The technique here is to build a whitelist or blacklist of bot IPs or user agents, then to make sure your site uses this list to allow or deny bot traffic. If you’re looking for an example of a user agent blacklist, you’ll find two in this article from @carlosesal about removing spam data from Google Analytics. For the purpose of this article, though, I encourage you to block bot traffic before it ends up in your analytics and leaves you with no choice but to filter it out afterward.
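
As a minimal Python sketch of the blacklist variant, assuming the site can read the client IP and the User-Agent header: the fragments and the address below are placeholders, not real bot signatures, and a real list would be built from your own logs or from a published list.

```python
# Hypothetical entries; real lists are built from your own logs.
BLOCKED_AGENT_FRAGMENTS = ("badbot", "spamcrawler")
BLOCKED_IPS = {"203.0.113.66"}  # documentation-range placeholder


def is_blocked_bot(ip_address, user_agent):
    """Deny when the IP is listed or the user agent contains a listed fragment."""
    if ip_address in BLOCKED_IPS:
        return True
    agent = (user_agent or "").lower()
    return any(fragment in agent for fragment in BLOCKED_AGENT_FRAGMENTS)
```

User agents are trivially spoofed, so this mostly filters out bots that do not bother to hide.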

A few additional points

Here are three other miscellaneous points that we had to deal with in the past, along with the solution that worked on the problem.

Maintenance is important

Having a good maintenance procedure is key for a website’s integrity for several reasons.

The first reason is that bugs are discovered in live codebases all the time, and some of those bugs turn out to be vulnerabilities that can be exploited. That’s especially true if your websites rely on shared dependencies, like an open source library or a CMS, which is a given for most websites these days. I once witnessed an exploit on one of our websites due to a lack of WordPress maintenance: the TimThumb exploit, for those who remember. So when a vulnerability is discovered, the best way to protect your site is to update the dependency to a version where the problem has been fixed. But you cannot control what truly changed in this dependency: it might be a very focused bug fix, or a new major version. That implies you need to test parts of your website to make sure it still works like before the update (with one less security hole), and that’s why having a proper maintenance routine is key.

Performing maintenance is also a good time to take a look at logs in general to see if everything looks fine. It’s a good time to be on the lookout for bot or spider traces related to all the topics of this article, and to see whether your efforts are paying off.

A note on security headers

Security headers are special types of HTTP headers oriented towards security policies. Here’s the list of them as explained by the site securityheaders.com. The following lines are quoted from their website:

  • HTTP Strict Transport Security is an excellent feature to support on your site and strengthens your implementation of TLS by getting the User Agent to enforce the use of HTTPS. Recommended value “Strict-Transport-Security: max-age=31536000; includeSubDomains”.
  • Content Security Policy is an effective measure to protect your site from XSS attacks. By whitelisting sources of approved content, you can prevent the browser from loading malicious assets.
  • X-Frame-Options tells the browser whether you want to allow your site to be framed or not. By preventing a browser from framing your site you can defend against attacks like clickjacking. Recommended value “X-Frame-Options: SAMEORIGIN”.
  • X-Content-Type-Options stops a browser from trying to MIME-sniff the content type and forces it to stick with the declared content-type. The only valid value for this header is “X-Content-Type-Options: nosniff”.
  • Referrer Policy is a new header that allows a site to control how much information the browser includes with navigations away from a document and should be set by all sites.
  • Permissions Policy is a new header that allows a site to control which features and APIs can be used in the browser.

I’m not a security headers expert at all; I’m just mentioning the fact that we had to implement them for one of our customers as part of their own security policy for their web apps and websites. So we took the opportunity to enforce them at server level to increase everyone’s security in one shot.
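
If you cannot set the headers at server level, they can also be added by the application. The Python sketch below shows the idea with a plain dictionary of response headers. The first three values are the ones recommended in the quoted list; the last three are illustrative assumptions that need to be adapted to each site, since the quoted list gives no recommended value for them.

```python
SECURITY_HEADERS = {
    # Values recommended in the list quoted above
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
    "X-Frame-Options": "SAMEORIGIN",
    "X-Content-Type-Options": "nosniff",
    # Illustrative assumptions only: adapt these policies to your own site
    "Content-Security-Policy": "default-src 'self'",
    "Referrer-Policy": "strict-origin-when-cross-origin",
    "Permissions-Policy": "geolocation=(), camera=(), microphone=()",
}


def with_security_headers(response_headers):
    """Return a copy of the response headers with the security headers added."""
    merged = dict(SECURITY_HEADERS)
    merged.update(response_headers)  # headers already set by the application win
    return merged
```

A strict Content-Security-Policy can break pages that load third-party scripts or styles, so it is worth testing before deploying it everywhere.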

Hotlinking prevention

This is another subject we had to deal with once on a client’s website. We saw higher than normal traffic on specific images on their site, higher than page views for example. Hotlinking, where another site embeds your images directly from your server, can be a bandwidth drain, so it’s a good thing to include a few hotlink prevention lines in your .htaccess file.
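
The decision those hotlink prevention lines make can be sketched in Python as follows, assuming access to the requested path and the Referer header. The function name, the extension list and the choice to let requests without a Referer through are assumptions for the example.

```python
from urllib.parse import urlsplit

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".webp")


def is_hotlink(path, referer, own_hosts):
    """True when an image is requested from a page hosted on someone else's site."""
    if not path.lower().endswith(IMAGE_EXTENSIONS):
        return False
    if not referer:
        return False  # direct visits and privacy tools send no Referer
    return urlsplit(referer).hostname not in own_hosts
```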

Time for action

This list covers many security challenges I’ve had to face as an agency lead, each at least once over the last 10 years. It is far from an exhaustive security list; it is a pragmatic one, learned the hard way. A security list is never complete anyway: there’s always something more you can do, something more that can be hacked.

Audit your website

If you’re interested in auditing your own websites against the points listed above, I’ve prepared a little bundle for you that you can download at the bottom of this article. The bundle contains a security checklist, as well as a few tools that can help you get started. But if you’d like to audit your site beyond the scope of this article, I encourage you to use a security scanner tool for a more thorough scan.

Adding some logging and monitoring is also a good way to audit your website over the long haul. By looking at logs, you’ll find traces of things you’d like to correct (bad bots being an example). The next step is to implement some automatic monitoring that detects certain symptoms and triggers alerts for you, instead of discovering things later on.
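
As an example of a symptom that is easy to detect automatically, here is a Python sketch that flags clients sending an abnormal number of requests in a short time window. It assumes the logs have already been parsed into (timestamp, IP) pairs; the function name, the one-second window and the threshold of 20 requests are arbitrary choices for the example.

```python
from collections import Counter


def find_noisy_clients(requests, window_seconds=1, threshold=20):
    """Return the IPs that sent more than `threshold` requests in one time window.

    `requests` is an iterable of (timestamp_in_seconds, ip_address) pairs.
    """
    # Group requests per IP and per fixed window of `window_seconds`
    buckets = Counter((ip, int(ts // window_seconds)) for ts, ip in requests)
    return sorted({ip for (ip, _), count in buckets.items() if count > threshold})
```

The returned list could feed an alert, or one of the blacklists discussed earlier.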

Add some protection

Many security measures can be set up rapidly with your .htaccess file. I’ve added some .htaccess snippets in the security bundle you can download at the bottom of this article. For things that are a bit more complex to implement, you’ll also find in the download a list of articles to read, and code snippets to look at.

Beyond modifying the .htaccess file, implementing correct security protections is quite hard. It can be hard to implement smartly enough (I mean smarter than the attacker), it can be hard to cover enough parts of your application, and it can be hard to maintain.

That’s why at Wonderful we’ve made the choice to use a third party security solution to protect us, called SecuPress, and to add our own code to extend this protection where it stops. SecuPress is a WordPress security plugin that provides expert protection to WordPress websites. It works by scanning your site against a checklist, and gives you a grade based on the result. Then it helps you improve this grade either automatically (by injecting code into your .htaccess file for example), or manually by telling you how. That way, on one hand we benefit from a pro tool that is excellent in terms of vulnerability scanning and defence, and is updated often. And on the other hand we complete it with our own code to maximise the coverage. Form protection, for example, is usually more of a code based matter.

Conclusion

  • You’ll be a target some day. You might have already been.
  • Take the matter seriously or you could be seriously harmed. This is not meant to be a scary prophecy, but factual advice based on the number of attacks I’ve seen us fall victim to in the past compared with the number of sites my team produces.
  • Some simple actions can already be very effective; they are a good base to start from.
  • For more complex actions it’s better to get some specialist help.
  • Maintain and audit your site regularly, then harden the spotted points of failure.
  • Industrialise the security response to strengthen your whole fleet of websites easily (code once, deploy everywhere).

I’m not an IT forensics expert, and the given list is far from being an ultimate and complete one, but I hope this advice will nonetheless help you make your sites safer and less prone to attacks.

My field of expertise is more about production team optimization, with the help of processes, team management and relevant tooling. If you’re interested in this subject, I encourage you to follow me via the form below. You’ll receive updates on my work, as well as advice, tips, and tools on the matter. I’m even preparing an online course to help you get the very best out of your development team.

Download your security bundle

Walk away with a security checklist, code samples, references and resources to help you harden your websites’ security.