Tag Archives: apache

why I chose the mit license

14 Jan

I’ve been slowly working on moving all of my projects and tutorials to one site, mainly because they’re treated like blog posts instead of projects. This of course led me to start looking at which license to release everything under.

After a few days poring over everything on the web I decided to choose the mit license. Why? The mit license fits how I want to release my code. It releases my code openly to anyone, only asking that the copyright notice in my code remain.

This will probably lead you to ask: why not gpl, apache, or bsd?

why I didn’t choose gpl

the gpl license [http://en.wikipedia.org/wiki/GNU_General_Public_License] is restrictive IMO, due to the fact that it forces anyone who distributes something built on what I create to release it under the gpl as well. This is not to say that the gpl is wrong, just that it’s restrictive to end users / developers in a way I choose not to be. If someone takes code that I am openly releasing, I don’t want to limit them in any way other than to leave a portion giving me credit for my portion of code, that’s it.

If I was working on some huge next best thing (think wordpress) and planned to release it, then I would use a gpl license. Why? Something like wordpress could easily be exploited commercially. If I take the time to build something like that I would want anyone who decides to change it or build off of it to release it themselves, and to make sure it’s free under the same license. The biggest difference here is intent. If you were to create the next swfobject.js (which uses an mit license), why restrict your users? But if you’re creating the next wordpress you don’t want to restrict users, just ensure that your work continues to be free.

why i didn’t choose bsd

the original 4-clause bsd license is similar to the mit license, but with an advertising clause (the newer 2- and 3-clause versions drop it). So to keep anyone from running into legal issues (who knows what counts as advertising in the future), why restrict end users?

why I didn’t choose apache

the apache license is the default license for projects at google code (you can select others), so it’s gotta be good. But it feels slightly more restrictive. That said: I’m not releasing anything that uses patents. If you are, then an apache license might be for you if you’d like to not restrict users.

in a nutshell

I think the best quote I came across was from eagain.wordpress.com:

if you want software to be free GPL is better than BSD. If you want use of software to be free BSD is better.

I think mit/bsd licenses are better for developers when you’re creating random things that might help someone (think swfobject). gpl is better if you’re developing an application to help everyone (think wordpress).

sid

*update* – found another good article:

http://fosswire.com/2007/04/06/the-differences-between-the-gpl-lgpl-and-the-bsd/

couchdb: Now how to use it

9 Dec

A few months ago I decided to try and build a web application in a way that supported how the application actually worked. The biggest hurdle was trying to keep 3nf and still retain the original goal. Needless to say that was a major exercise in futility, the big idea was lost and we ended up just doing it the normal way.

I’ve been looking at things like hypertable, but have no need for it. Enter couchdb.

Instead of your typical db structure, data is stored as a document (or object for us oop minded). Documentation is still growing, but the project was adopted by apache, and regardless of how you feel about apache that should be a good sign for the project.

After a few days of research my main question is how best do you implement it? I’d love to use this for my gray network project, but I’m thinking I may still be in the discovery phase with couchdb when it launches :(

Anyways, what’s your take on how best to implement couchdb? I’m still researching security, performance, etc, but I think couchdb is definitely in my future :)

project: benchsid – automate benchmarks with siege, ab, and httperf using shell scripts

7 Dec

While benchmarking performance for deathy death match I realized something: Benchmarking sucks!

That said, and me being the lazy person I am, I asked myself how can I make this easier? My answer: benchsid, two shell scripts that make benchmarking easier.

disclaimer

This is a 1,2,3 trick pony – use it as such

These scripts are offered as is and I am not responsible for anything you do, create, break, etc.

Both scripts are released under the do whatever the hell you want, but don’t blame me license. This of course isn’t a real license, but you get the point.

Make benchmarks easier, if only a bit

I’ve tried as much as possible to keep your commands to a minimum. If running as root you can run a complete benchmark, save both the results and your benchmarked server’s top + free output, and reboot with 3 commands. See, lazy right? :)

monitor.sh – run this script on the machine you’re benchmarking
bench.sh – run this script on the machine you’re benchmarking from

you do not need to run these on separate machines, but I recommend it

How do I use…

monitor:

 ./monitor.sh
 ./monitor.sh help

bench:

 ./bench.sh
 ./bench.sh help

If you’re still having issues post a comment

monitor.sh

Monitor should be run on the machine you’re benchmarking. It automates the process of recording free (free -m) and top, and if running as root will reboot the server when you tell it to stop monitoring a benchmark.

Sounds good so far right?

monitor.sh process overview

start process

clear out any benchmarks that will be overwritten
save output from free to a file (free -m)
save output from top to a file
start a top process to monitor to a file while we benchmark
confirm that we are up and running

stop process

stop top
save output of free to file
save output of top to file
reboot server

monitor.sh: umm why?

monitor.sh saves the information about the server before, during and after your benchmark has been run. By running top during the benchmark process (10 second intervals), we can get a real time overview of both server load and the loads created by applications

bench.sh

bench.sh is the benchmarking portion. bench.sh can benchmark your server using one of three programs: ab, siege, or httperf. I considered adding an option to do all three, but honestly that’s more poor man’s dos than benchmark.

bench.sh first will perform a warm-up cycle of 2000 requests to your server using your desired (or default) concurrency. After that it will perform 3 benchmark cycles of your desired number of requests and concurrency.

All cycle results (including warm up) are saved to the filesystem.

download

download benchsid: http://gregsidberry.com/httpdocs/wp-content/uploads/2008/12/benchsid.tar.gz

known issues

siege output – siege outputs a couple of variables to screen that aren’t included, or properly recorded, in the created file. I’ve tried output redirection, script, etc, but haven’t been able to find a fix that keeps everything in one script

input validation – there is no input validation, so please don’t put these files in a web public location.

funyuns – I’m out of funyuns, please send more

enjoy
Sid

Building something scalable: language / frameworks aka use ror or php

13 Nov

When I first started this experiment I planned on using a custom php framework. Recently I realized that it’s kinda pointless to attempt to do something like this and at the same time lock myself into something that may not be the best solution….

enter google, research, testing, and little sleep. Result: codeigniter was the best choice. Huh? Here’s why.

snakes not on the plane

I should say from jump: If I knew and had the time to learn python / django it would have been the winner, sadly I do not.

ruby, rails, hype

I’m a fan of ror, and of course when I decided to take what I’ve created so far and migrate it into something usable I of course thought of ruby on rails. Sadly I saw more than a few hurdles.

First off ror doesn’t play well with others, meaning you shouldn’t run rails in a shared environment. The whole concept behind this experiment is a small start-up with a shoe string budget and 2-3 websites / apps. Yes it’s possible to host multiple rails apps on one server / vps, but it’s not recommended. There is of course passenger, but that leads into the next point. php +1

Ror is a resource hog, there I said it. Rails is more resource intensive than php. Of course the answer is to optimize and scale out, but remember I’m trying to keep the monthly hosting budget under 100-120 bucks (yes seriously shoe string). So php was the choice here.

So far php seems like the best choice for what I’m trying to do, but I needed more than just a few issues I could work around. You can’t work around speed and optimizing. ruby is faster than php via command line, but ror is not amazingly faster when used via web. When you add an opcode cache into the picture ror gets its butt kicked hands down. Of course this is comparing a language to a framework, enter codeigniter.

Codeigniter still outperforms ror with an opcode cache. That is just one optimization and php shows drastic improvement.

I wasn’t ready to abandon ror yet, simply put: why use an imitator if you can have the real deal. Codeigniter is a great framework, but it makes more sense to actually use rails right?

In the end the answer is no. Rails has a higher level of maintainability out the box, but do less code, easier maintenance, and of course the ror cool factor outweigh slower speed, higher cost to deploy, and fewer production optimizations? No.

MVC while not as strict in codeigniter, is there. That combined with OOP will make using a php framework easier to maintain, not ror easy, but easy enough for production imho.

framework, shamework

So we now know why php won over ruby, but why pick codeigniter? I originally started out writing a custom framework, which is always fun to do from time to time to push yourself, but in the end you see the downside and benefits of doing so. Since this entire idea revolves around a small start-up we also need to take development time into consideration.

A custom framework takes a lot of trial and error, coding, and recoding, and still more coding. Using a php framework I can reduce the time it takes to get up and running, while still building what I want and need. Yes there’s some overhead compared to a custom framework, but in the end a start-up isn’t yahoo. You should be back at the planning table long before you reach yahoo numbers. That’s not to say you shouldn’t plan for massive growth (hence scalable), just that traffic / users / data on yahoo’s scale is far beyond my scope of experience and even I know it’s another elephant to eat.

When comparing php frameworks, I was originally looking at cakephp, but after some research found it too slow and with too much overhead. There are other frameworks, but only codeigniter and cakephp met my needs. 2-1=1 (or less than one mr. V), so codeigniter was the winner.

One thing I should note is how poorly most of the php frameworks I looked at perform out the box. Yes rails (and merb) are better out the box. Luckily a few simple optimizations speed things up.

all your base are belong to theory

So far everything has been pretty much all theory, research, and testing. No worries, the meat and potatoes are coming shortly.

the next few posts will cover setting everything up (centos5): memcached / memcachedb, nginx, apache, varnish, mysql, s3 + rsync, varnish -> s3 relations, etc.

If you feel like following along head over to http://linode.com and setup a few machines, or setup a team in vmware.

Worth Reading

Here are a few articles that I came across while researching, some are more on topic than others, but all have some value.

PHP vs Java vs Ruby:
http://www.cmswire.com/cms/industry-news/php-vs-java-vs-ruby-000887.php

Ruby vs PHP performance (cli)
http://izumi.plan99.net/blog/index.php/2008/01/17/ruby-vs-php-performance/

Ruby on Rails Fans
http://shiflett.org/blog/2006/feb/ruby-on-rails-fans

The performance test of 6 leading frameworks
http://www.alrond.com/en/2007/jan/25/performance-test-of-6-leading-frameworks/

Time to bite the bullet

21 Oct

I currently have 3 vpses. One on Mediatemple and two at linode. Thing is I need at least 3 servers at linode to properly setup the environment for my scaling experiment / future gs setup….

So as of tonight I’m biting the bullet and now own 4 linodes. If all goes well I’ll toss 2-3 more linodes into the mix for geo targeted content delivery. And yes I’ll try and document everything. Also worth noting I’m trying out ubuntu again, this time ubuntu hardy – server only. We’ll see how this works out.

in case anyone is wondering here’s a brief overview of the setup (simple version)

server 1+2:
apache +php (you can config apache to run almost as fast as lighttpd, so why not)
varnish
memcache
nginx

server 3: db – private network access only

server 4: everything that’s on mt for now

lates, sid

Linode Rocks!

15 Oct

I haven’t had much time to post, or do anything fun in the past few months, but I’m currently working on a nginx, php, apache, varnish, memcachedb, mysql setup distributed between two servers. Why does this matter? Because I decided to upgrade my linode so I could slice up my resources into a nice little mini scalable network. Only problem is that’s not how linodes work…

Linode’s support team (Jim) had me upgraded 10 minutes after my upgrade request, then had me downgraded even quicker.

Yeah that’s pretty good support. Better still is their control panel. I have yet to use their forums, and I’ve spoken to support only once before today.

If you’re looking for a good vps with intel processors, private network support, an awesome dns manager, and top support use these guys.

I still have my mediatemple [ http://mediatemple.net ] server, but for less than I’m paying I now have two servers with better stats and no bloat with linode [ http://linode.com ].

Check em out : http://linode.com

Building Something Scalable: Caching

5 Oct

I’ve been seriously getting my kicks with scalability for a number of months, so why not start an ongoing series where I talk about what I’ve learned / found?

So welcome to the first: Building Something Scalable – An ongoing experiment. This post covers caching. I’ll cover delivery in the next post.

Keep in mind language wise I’m using php, but the general advice should be sound, regardless of language. If you disagree with something or have a better way, feel free to comment.

So now that I’ve ranted off 4 topics, maybe I should expand on them a bit.

Use Caching

Caching is a good thing, but use caching is a pretty vague statement, so let’s expand.

Caching isn’t just a one ring to rule them all type of solution. It’s actually a mixture of a number of different solutions that work together to boost your site / application’s overall performance.

Database caching

I consider database caching a 2 part solution. You have the mysql query cache, but I also like to have a server side query cache as well. Why? I tend to use oop and having a server side query cache allows me to cut some overhead both application wise, and by preventing me from having to connect / query mysql.

The big issue with server side query caches is stale queries. The mysql query cache prevents stale caches automatically, but with a server side query cache we’ll need to set a TTL (time to live). I tend to go with something really low like 5-10 seconds.

5-10 seconds may seem pointless, but it allows higher traffic applications to handle a number of requests with fewer queries to mysql. This takes some of the load from mysql, so your database is performing under less load than it would have without the server side query cache.

There is plenty of information on the mysql query cache online, so fire up google and start researching. For your server side cache here are a few things to keep in mind

  1. keep your cache in a secure location. If you’re using a file cache this means outside of your web directory
  2. hashing is a quick and painless way to uniquely id your queries. md5('select * from table1') will always return the same hash for the same query text.
  3. prevent cache filename collisions.
  4. do a lightweight encoding on cache files. base64_encode / base64_decode are quick and easy to use. They’re not secure, but it’s a good idea to add some basic obfuscation
  5. keep your TTL low. Your query cache should try to stay as fresh as possible.
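
Here’s a rough sketch of how those five points could fit together in a file based query cache. The function names, the json storage format, and the $runQuery callback (a stand-in for your real mysql call) are all my own assumptions for the example, not a finished library.

```php
<?php
// Hypothetical helper names; $runQuery stands in for your real mysql call.

// Build the cache filename from an md5 hash of the query text.
function queryCachePath($cacheDir, $sql) {
    return rtrim($cacheDir, '/') . '/q_' . md5($sql) . '.cache';
}

// Return cached rows if the file is younger than $ttl seconds, else run the query.
function cachedQuery($cacheDir, $sql, $ttl, $runQuery) {
    $path = queryCachePath($cacheDir, $sql);
    clearstatcache(true, $path); // make sure we see the file's current mtime

    if (is_file($path) && (time() - filemtime($path)) < $ttl) {
        $raw = base64_decode(file_get_contents($path), true);
        $rows = ($raw !== false) ? json_decode($raw, true) : null;
        if (is_array($rows)) {
            return $rows; // cache hit
        }
    }

    $rows = $runQuery($sql); // cache miss: hit the database
    // Write to a temp file and rename so readers never see a half-written cache.
    $tmp = $path . '.' . uniqid('', true) . '.tmp';
    file_put_contents($tmp, base64_encode(json_encode($rows)));
    rename($tmp, $path);

    return $rows;
}

// Usage: the second call inside the TTL never touches the "database".
$cacheDir = sys_get_temp_dir() . '/query_cache_demo_' . uniqid();
mkdir($cacheDir, 0700, true); // in real use: a directory outside the web root
$calls = 0;
$fakeDb = function ($sql) use (&$calls) {
    $calls++;
    return array(array('id' => 1, 'name' => 'sid'));
};
$first  = cachedQuery($cacheDir, 'select * from table1', 10, $fakeDb);
$second = cachedQuery($cacheDir, 'select * from table1', 10, $fakeDb);
```

Hashing the full query text means two different queries only collide if md5 does, and the temp file + rename step keeps a busy site from reading a cache file that’s still being written.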

Opcode Cache

Php is compiled / run at runtime (when you request a page / script). Opcode caches store the compiled code so that your code doesn’t have to be compiled for every request. Opcode caches can increase your code’s performance by up to 90%, but then again, any increase helps the overall performance of your site / application.

There are a number of opcode caches available for php. I prefer xcache.

Static content cache

Static content, unlike dynamic content, is, well, static. You’re probably wondering: Why cache something that’s already static? Simple: performance. Static content though is cached / served differently than dynamic content. I know this is touching more on delivery, but it’s still worth mentioning.

Static content is often served through a CDN (content delivery network) or a web cache. A CDN and web cache act similarly, except that a cdn has a number of servers setup in various locations.

Content Delivery Network
A CDN acts just as its name says: It delivers your content via its network of servers. The CDN selects a server closest to the location of the user, and serves your content from that location. What’s the benefit? Faster delivery of your content. Is it worth it? That’s a question only you can answer. Do some research, compare the solutions, check your budget – and you’ll have your answer.

Web Cache
A web cache or reverse proxy, simply put, delivers your content faster. I’m not too well versed in the science of it all, but here’s a basic break down of what I do know:

Web caching software like varnish (in the past squid was the standard) handles serving static content better than apache, and with a smaller footprint. The web cache creates a cache of your content when requested and then delivers your content from its memory / disk cache.

The most obvious benefit from all this? Reduced server load. Apache is a resource hog (there I said it), but that will be covered in a future post in this series. By moving static content delivery to software created just for this task you’re freeing resources and of course getting content to users quicker.

Output Cache
So far we’ve looked at a number of ways to increase the speed of dynamic and static content, but there’s still one major item left out: Output Caching.

As your scripts / application generates pages, you can cache them to be served for future requests. Output caches in general can be as basic or complex as you need them. A few things to keep in mind.

  1. stale content, your cache should have a TTL (time to live) that prevents it from serving stale content
  2. filename collisions – your naming scheme should prevent filename collisions
  3. store your cache outside of your web folder
  4. logged in users / vs non logged in users – come up with a solution that deals with this.
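
A minimal sketch of an output cache that touches all four points. The names (outputCachePath, cachedPage) and the guest / member split are assumptions for the example. Pages that differ per user would need the user id in the key, or should skip the cache entirely.

```php
<?php
// Hypothetical helpers; the cache directory should live outside the web folder.

// Key the cache on the page plus login state so members and guests never share a copy.
function outputCachePath($cacheDir, $pageId, $loggedIn) {
    $key = md5($pageId . '|' . ($loggedIn ? 'member' : 'guest'));
    return rtrim($cacheDir, '/') . '/page_' . $key . '.html';
}

// Serve the cached page while it is fresh; otherwise render, store, and return it.
function cachedPage($cacheDir, $pageId, $loggedIn, $ttl, $render) {
    $path = outputCachePath($cacheDir, $pageId, $loggedIn);
    clearstatcache(true, $path);
    if (is_file($path) && (time() - filemtime($path)) < $ttl) {
        return file_get_contents($path);
    }
    ob_start();              // capture everything the page echoes
    $render();
    $html = ob_get_clean();
    file_put_contents($path, $html, LOCK_EX);
    return $html;
}

// Usage: two guest requests share one render, the member request gets its own.
$dir = sys_get_temp_dir() . '/output_cache_demo_' . uniqid();
mkdir($dir, 0700, true);
$renders = 0;
$render = function () use (&$renders) {
    $renders++;
    echo '<h1>hello</h1>';
};
$a = cachedPage($dir, '/home', false, 30, $render);
$b = cachedPage($dir, '/home', false, 30, $render);
$c = cachedPage($dir, '/home', true, 30, $render);
```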

Variable / Object Caches

Your code has objects and variables, often some of these objects are database intensive. An object / Variable cache is a way to store your objects and variables. The thing to keep in mind with these types of caches is speed.

It makes no sense to cache something like $var=1+1;. You can run that command quicker than you would access it from the cache. A good example of something to cache would be a class object that runs a number of queries on the database, but accesses content that doesn’t change as often. By caching this object you can prevent a few database queries (or cache calls). Another is a class object that generates a number of child class objects.

I could go on and on about this subject but let’s get to the point. If your application is running on only one server use a file cache. If your application uses more than one server look into memcached / memcachedb.

That’s it.
Hopefully that was short and sweet, the next post will cover delivery.

Greg – Out

php tip: securing .inc include files

17 Aug

just a quick tip for anyone using .inc files via apache.

add the following to your apache configuration to prevent viewing of .inc files via the web. This will not prevent php from including the files locally

<FilesMatch "\.inc$">
Deny From All
</FilesMatch>

There ya go, now feel free to use .inc files as much as you’d like. Also an fyi – I recommend using .php instead of .inc, security wise a few configuration changes will make both extensions about the same. Mainly it’s for developers. Some developer tools treat .inc differently than .php. So to keep it easier for the developer .php is recommended, but not required.

elsid out

Bye, Bye ubuntu

22 Jun

So I’m now pretty much fed up with Ubuntu. It’s a great os, but I’m having nothing but issues with it.

First were the wireless issues. For some reason ubuntu dropped the connection every 20min-2hours. I eventually got that fixed.

Next were the repo issues. If you pull a file from the main repo it should work, not be full of bugs. I like building from source, but that’s seriously time consuming. On top of which finding where the bugs are is a pain in the ass.

Then came the php5 issues, and now the latest issue with apache. I can have 5 different machines run the same conf, and the only ones having issues are the ubuntu server and my ubuntu desktop.

So now I’m back in vista, till I have time to install and setup openSuse. I’m also having to wipe my new server (again) and setting it up on centos. I was worried about going with ubuntu in the first place, but hey – what can ya do.

Sid

php security in a nutshell

9 May

I have a friend I’m teaching foundation security to. This post is for him, but also as a protest to some of the materials I’ve found when looking for reference material for him.

Security at its simplest form is common sense. Ask yourself: how can I make sure I get exactly what I want? How do I make sure I only give what I want? One article mentions xss attacks, and only says prevent them. Why? That’s the question a lot of people have when starting: why? So why not teach them how to do it first?

How to avoid sql injection / xss, and other misc attacks.

As mentioned this is part rant, part helpful. I’ll explain the following tips and why / how you do it.

  • always use require_once, or include_once. why? it keeps someone from getting your files stuck in loops.
  • clean everything that calls, enters, looks at your db.
  • typecast whenever you expect a certain type of variable.
  • control access and check permissions
  • use your own sessions
  • track everything in some form
  • setup php correctly
  • hide whats not to be seen / accessed

first off let me say I’m by no means a security god. Actually I’m not even an advanced user. Sad as it is maybe to say: I’ve never used pear. that said, the majority of attacks / exploits can be easily avoided. Why? because the majority of attacks on the web don’t come from hackers they come from script kiddies. We can be lax with our own stuff ( like this blog ), but any application you build for a client should at least have the basics.

Enough ranting now to the meat and tators…. I’ll keep everything short and sweet. fyi – this is pretty much a brain dump, so prob not in “good form”.

why do we use the _once functions?

if you have a file that loads another file, say index.php?get=/calender.php

what happens if someone changes get to /index.php? yeap you’re stuck in a loop, unless you use require_once / include_once

simple huh?
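
The _once functions stop the loop, but it’s worth going one step further and never building a file path from the get value at all. Here’s a small sketch, assuming a made-up page map and a helper I’m calling resolvePage: the request only picks a key, and anything that isn’t on the list falls back to a default page.

```php
<?php
// Hypothetical page map; the keys are the only values ?get= may take.
function resolvePage($requested, $pages, $default) {
    // Look the value up instead of building a path from it.
    return isset($pages[$requested]) ? $pages[$requested] : $default;
}

$pages = array(
    'calendar' => 'pages/calendar.php',
    'news'     => 'pages/news.php',
);

$get  = isset($_GET['get']) ? (string) $_GET['get'] : 'news';
$file = resolvePage($get, $pages, 'pages/news.php');
// require_once $file; // still use the _once form as a second line of defence
```

With this in place index.php?get=/index.php just loads the default page, because /index.php was never on the list.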

State changes

Your first question is prob, what the hell is a state change? a state change is simply any change, anytime you change something, whether in the db, a file, an upload: it should always use post. why? Post can be hacked yes, but it’s harder to hack post.

imagine we’re using an online game
ex: update=1&user=87897&add_money=8.

so any user who can add will know: hey i can change add_money to 100 and gain 100 points. On top of that any user can now see all your get vars. Why does that matter? The less they know about your vars, the harder it is for a kiddie / developer to exploit them.

why else? It makes it easier to validate changes. Why? Honestly I don’t even remember why right now, but hopefully you won’t hold that against me
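
A quick sketch of what “state changes only via post” could look like in code. The helper names are made up for the example, and post on its own isn’t a security feature: the server still has to validate whatever arrives.

```php
<?php
// Hypothetical guard: only POST requests may change state.
function isStateChangeRequest($method) {
    return strtoupper((string) $method) === 'POST';
}

// Read a state-changing field from the posted data only, never from the query string.
function postedInt($post, $name, $default = 0) {
    return isset($post[$name]) ? (int) $post[$name] : $default;
}

$method = isset($_SERVER['REQUEST_METHOD']) ? $_SERVER['REQUEST_METHOD'] : 'GET';
if (isStateChangeRequest($method)) {
    $amount = postedInt($_POST, 'add_money');
    // validate $amount against what the server knows before touching the db
}
```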

all users are evil

I know kinda overzealous, but you need to have this mindset, why? users will accidentally mess up your system every chance they get. And script kiddies love telling you how l33t they are if they do something as simple as figure out how to make a game page display a different page.

as for making a game page display a different page, honestly: who cares ( yes that was me venting). But in order to prevent accidents, or worse, kiddie hacks, control everything! I’m not saying make your app so restrictive that users hate it, but – actually an example would fit best.

today the team made a new flash widget that requires user data.
ex1: pass user data as flash vars, then use loadvars to pass update to server

or more secure
ex2: use loadvars to receive user data from server, and then pass update to server with loadvars.

in this example we could have honestly used either way as the update script validates all data and any real changes are driven off the database, not the state change, but you get the point. It’s one less thing to worry about if a user decides to try and change the vars passed, and also one less file to update if we change something.

Users will enter strings when numbers should be entered, upload swf’s when you only want images – you get the point. And the point is validate, and whenever possible take the control from the frontend and move it to the backend.

Be as lazy as possible

I say I am a smart lazy person. building 30 different files takes more time than building one file, and using includes, or a template structure. Pretty lazy huh? but also easier to update and more secure. The more files you have that each have their own independent / copy + pasted code, the more opportunities you have for a slip up. Make one file, and let it handle the logic. You’ll have more freetime, and get more sleep. Or maybe you’ll just spend that time working on more projects. See being lazy is a good thing, but only if done correctly.

we can take this a step further and say why even ftp into the server, it’s so time consuming. why not just build a backend that not only manages your files, but controls access to them – Thats more of a teaser than anything, but try it out, you’ll like the results.

typecast whenever possible

imagine we’re using an online game
ex: update=1&user=87897&add_money=8

ok so what if someone changes add_money to a delete statement, or attempts some form of sql injection. what’s the simplest way to defeat it? $money=(integer) $_GET['add_money'];

Yeap one simple change is all it took to defeat the sql injection. why? Typecasting is basically a way of forcing something to be something. huh? if i want a value to always be an integer, i use (integer). If I want a double i use (double), string (string).

Yeah it’s that simple. the only issue i’ve run into is that you can’t use typecasting in defining function / method params. huh?

ex: function foo((integer)$f=0)

that will cause an error, but you can do

ex: function foo($f=0){

$f=(integer)$f;
}

Make sense? of course I can’t force something to (mocha frap with extra mocha) $coffee, but thats life. good now on to more, or learn more about typecasting
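
To see what the cast actually does to hostile input, here’s a small sketch. The attack string and the addMoney function are invented for the example. I’m using the short (int) / (float) spellings, which do the same thing as (integer) / (double).

```php
<?php
// What a cast does to hostile input: php keeps the leading number and drops the rest.
$attack = '8; DELETE FROM users';
$money  = (int) $attack;        // leading digits survive, the sql does not
$none   = (int) 'DELETE';       // no leading digits at all
$price  = (float) '19.99abc';   // same idea for decimals

// Casting inside the function body, as in the foo() example above.
function addMoney($balance, $amount) {
    $balance = (int) $balance;
    $amount  = (int) $amount;
    return $balance + $amount;
}

$total = addMoney('100', '8 or 1=1');
```

The cast guarantees the type, not the range, so you still want a sanity check (is 100 points in one request even allowed?) on the server side.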

Validate, validate, validate

using typecasting is great for numbers, but theres other ways to validate your data. the best and most powerful being regular expressions

ex: preg_replace('/[^a-z0-9]/i','',$value);

The above regex strips any non alphanumeric characters from value. spend some time getting comfortable with regex as it’s an extremely powerful and useful feature. Not just for validating data, regex has many other uses as well.
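
There are two ways to use a regex here: strip the bad characters, or reject the whole value unless it matches. A short sketch of both follows. The function names and the username rule (3-20 letters, digits or underscores) are assumptions I picked for the example.

```php
<?php
// Strip: remove anything that is not a letter or digit.
function alnumOnly($value) {
    return preg_replace('/[^a-z0-9]/i', '', (string) $value);
}

// Validate: accept the value only if the whole string matches the pattern.
// The D modifier stops $ from matching before a trailing newline.
function isValidUsername($value) {
    return preg_match('/^[a-z0-9_]{3,20}$/iD', (string) $value) === 1;
}

$clean = alnumOnly("sid's <b>blog</b>!");
$ok    = isValidUsername('el_sid');
```

Stripping is handy for things like search terms, while rejecting is usually the safer choice for anything that ends up as an id, filename or login.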

Be a neat freak, or cleaning your sql

By now you understand sql injection, if not

ok so now we all understand it. basically it’s a cool way of saying, someone’s trying to make my query do bad things, but saying it like that would make me sound like a user, so we say sql injection and confuse the heck out of clients :p

we just saw how to prevent one form of sql attack. now lets see how we can handle preventing them at the query level.

Whenever data is sent to your db it should always be cleaned. Me I like to make sure both the table, columns are cleaned using a function that makes sure tables / column names follow a standard, and a cleaning function for actual data. Why? When developing an app from scratch you normally have freedom over how tables, columns are named. I prefer to keep all tables and columns lower case, and only allow _ as a special char (non alpha numeric character). what does it look like?

ex: //convert name to proper db format
function dbProperObjectName($objectName){
//if you want to use caps in table / column names then comment this line out
$objectName=strtolower($objectName);
//swap every character in the blocked ranges for _ (a plain replacement doesn't need the /e modifier)
return @preg_replace("/([\\x00-\\x2d\\x3a-\\x40\\x5b-\\x60\\x7b-\\xff{$this->mSystemDatabase['restricted_chars']}\\x2f])/", '_', $objectName);
}

You can ignore the {$this->mSystemDatabase['restricted_chars']}, that’s some carry over from the db class. If you don’t understand what the heck that says I’ll explain. first I’m changing $objectName to all lowercase, if it’s not already. then we’re using a regular expression (regex) to clean our string of anything that is not a letter or number and replacing it with _. why does this matter? because if for any reason our table name contains a sql injection, when run it will return nothing. why? because if $objectName was SELECT * FROM HOME, it will now be select___from_home. which will return nothing because select___from_home isn’t a valid table. See and you thought cleaning wasn’t fun.
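
If you want to try the idea without the db class, here’s a stripped down sketch. The names properObjectName and isKnownTable are made up for the example. It’s a bit stricter than the method above (it also replaces dots), and it adds a check against a list of tables you know exist.

```php
<?php
// Standalone version of the cleaner, without the class-specific extras.
function properObjectName($objectName) {
    $objectName = strtolower((string) $objectName);
    // Anything that is not a-z or 0-9 becomes an underscore.
    return preg_replace('/[^a-z0-9]/', '_', $objectName);
}

// Belt and braces: only accept names that are on your own list of tables.
function isKnownTable($name, $tables) {
    return in_array($name, $tables, true);
}

$table = properObjectName('SELECT * FROM HOME');
$known = isKnownTable($table, array('users', 'user_scores'));
```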

Ok you do windows, but what else?

As much fun as cleaning a table name maybe, we really need to make sure our data is safe. why? ummm because I say so. There are many reasons, ranging from controlling content, preventing xss, sql injection. But I like to think you’ll do it because users are evil :)

ex: //strip bad things from a string you plan to use in a query
function dbFriendlyValue($value=false,$fixNewlines=true,$allowedTags=''){
//$allowedTags: pass your list of allowed tags here, e.g. '<b><i>'
//if no value then just return 0, we don't use empty() here because it treats 0 as empty
if($value===false) return 0;

//convert to string for checking, this is fine for text / numeric values
$value=(string)$value;

//strip slashes if magic quotes enabled (the function is gone in php 8, so check that it exists first)
if( function_exists('get_magic_quotes_gpc') && @get_magic_quotes_gpc() ) $value = stripslashes( $value );

//clear white space
$value=trim($value);

//fix \r\n
$value=str_replace("\r\n", "\n", $value);

//clear tags (except allowed) or just use html entities
$value=(!empty($allowedTags)) ? strip_tags($value,$allowedTags) : htmlentities($value, ENT_QUOTES);

//change newlines to <br>
if($fixNewlines) $value=nl2br($value);

//clear any bad sql we might find untested regex
$value=@preg_replace('/(insert(\s?)into|\).(\s?)values.(\s?)\(|DELETE.(\s?)FROM|CREATE.(\s?)[datbsetl]{5,8}|alter.(\s?)[datbsetlcoumn]{5,8}|drop.(\s?)[datbsetlcoumn]{5,8}|update.\s?(.*?).\s?set|alter.(\s?)[datbsetlcoumn]{5,8})/i','',$value);

//add slashes (mysql_real_escape_string was removed in php 7 and needs an open connection, so fall back to addslashes)
$escaped=function_exists('mysql_real_escape_string') ? @mysql_real_escape_string($value) : false;
$value=($escaped) ? $escaped : addslashes($value);

return $value;
}

woah what the hell was that? it was me doing the windows and the oven. let’s break it down

when calling the function we pass the value, whether to fix newlines ( default : true ), and the string containing allowed tags, if any.
next we make sure we have a value to clean; if not we return 0, just in case the function is being used to create a sql statement. we check for magic quotes because if this value came from a submitted variable and magic quotes is on, php will have added slashes. if it’s on, we strip those slashes so we can continue.

i’ll skip trim and str_replace, now we’re at strip tags. php is pretty good at stripping tags, but if you want another option use htmlentities( $value, ENT_QUOTES)
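here is a small sketch of the difference between the two options. the sample input is made up, and both calls are plain php built-ins.

```php
<?php
// made-up input with one tag we want to keep and one we don't
$input = "<b>hi</b> <script>alert('x')</script>";

// option 1: strip every tag except <b>; the text inside the removed tags stays
$kept = strip_tags($input, '<b>');

// option 2: keep everything, but turn <, >, & and both quote types into entities
$escaped = htmlentities($input, ENT_QUOTES);

echo $kept, "\n";    // <b>hi</b> alert('x')
echo $escaped, "\n"; // no raw < or > left, quotes become &#039;
```

one thing to keep in mind: strip_tags leaves the attributes on allowed tags alone, so an allowed <b onclick="..."> still gets through. if that worries you, htmlentities is the safer default.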

and now on to our regex. this is untested ( sorry still building the class ), but it points you in the right direction. the regex searches the value for any sql statements and strips them. lastly we add slashes to our value to make it sql / db safe.

woah – we’ve covered a lot. almost done

setup php right

TURN OFF REGISTER GLOBALS! yes that’s all caps for a reason. Also disable magic quotes and change the headers sent by apache to hide version / software information. Can’t turn off register globals? try this function:

function clearRegisteredGlobals(){
    //superglobals are available in every scope, so no global statement is needed

    //check if register globals is on – register globals check taken from drupal installed patch : http://drupal.org/files/issues/register_globals_check-D6_3.patch
    //get php ini setting (ini_get returns false on php 5.4+, where register_globals no longer exists)
    $register_globals = trim((string)ini_get('register_globals'));
    //check ini value
    if(!empty($register_globals) && strtolower($register_globals) != 'off'){
        //ok now lets clear the variables set with register globals

        //make array of superglobals
        $registered=$_REQUEST;
        $registered=(!empty($_POST)) ? array_merge($registered,$_POST) : $registered;
        $registered=(!empty($_GET)) ? array_merge($registered,$_GET) : $registered;
        $registered=(!empty($_SESSION)) ? array_merge($registered,$_SESSION) : $registered;
        $registered=(!empty($_COOKIE)) ? array_merge($registered,$_COOKIE) : $registered;

        foreach($registered as $var=>$void){
            //unset is a language construct, so it can't take the @ operator
            unset($GLOBALS[$var]);
        }
    }
}
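
if you do have access to the config files, the settings mentioned above look roughly like this. this is a sketch, assuming you can edit php.ini and the main apache config; where those files live depends on your host.

```ini
; php.ini (older php only: register_globals and magic_quotes_gpc were removed in php 5.4)
register_globals = Off
magic_quotes_gpc = Off
; stop php adding an X-Powered-By header with its version
expose_php = Off
```

```apache
# httpd.conf: only report "Apache" in the Server header, no version or module list
ServerTokens Prod
# no version footer on server-generated pages (errors, listings)
ServerSignature Off
```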

Hide everything

hide everything – that simple. if a folder, file, etc shouldn’t be seen, hide it. How? well if you use the .inc file extension like me, configure apache to handle .inc files with php. another option, or added protection: use htaccess to prevent access to .inc files. this will not affect your scripts, just web browsing.

In addition to hiding your inc files, don’t allow access to directories that aren’t needed to view your site. so your images directory should allow access, but your lib, class, or inc folder shouldn’t.

Important files (db config) should be in a directory outside of your hosting directory, but if you name it .inc, or .php and follow these directions you should be ok.

Lastly – turn off directory browsing.
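
the htaccess side of this can be sketched in a few lines. this assumes apache 2.4 and that your host lets .htaccess set these directives (AllowOverride); the AddType line only applies if php runs as mod_php.

```apache
# block direct browser requests for .inc files; php include / require still works
<FilesMatch "\.inc$">
    Require all denied
</FilesMatch>

# or, instead of blocking, have php handle .inc files so the source is never shown (mod_php only)
# AddType application/x-httpd-php .inc

# turn off directory browsing
Options -Indexes
```

on apache 2.2 the deny part is written as Order allow,deny followed by Deny from all instead of Require all denied.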

Control access

every part of your site should have an access level. So areas like your home page and other public areas would be a level 0. areas like a user’s settings page would be a 1 (making sure only that user can access it of course), and your admin area – that’s another story. Your admin area is the heart / backbone / investor’s dream of your site. That said, protect it! all users in your admin should have an access level, and different parts of admin should have different access requirements.

ex: moderator – logs in, sees the flagged post area, does not see links to other areas, can not access other areas. manager – logs in, sees users, can add or remove users, but can not access critical site areas, and can not add a user >= his level. admin – can do almost everything. and lastly root – your root account can be named anything, but only allow one account full control over the system.

so quick review: users should only be able to see and access areas within their permissions scope, and users should never be able to add users or give permissions >= their own.
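
those two rules fit in a couple of small functions. this is only a sketch: the level numbers and function names are made up for the example, and a real system would load levels from your user table.

```php
<?php
// hypothetical levels, only the ordering matters
const LEVEL_PUBLIC    = 0;
const LEVEL_USER      = 1;
const LEVEL_MODERATOR = 2;
const LEVEL_MANAGER   = 3;
const LEVEL_ADMIN     = 4;
const LEVEL_ROOT      = 5;

// rule 1: you can only see / access an area at or below your own level
function canAccess(int $userLevel, int $requiredLevel): bool {
    return $userLevel >= $requiredLevel;
}

// rule 2: you can only create users or grant levels strictly below your own
function canGrantLevel(int $actorLevel, int $newLevel): bool {
    return $newLevel < $actorLevel;
}

var_dump(canAccess(LEVEL_MODERATOR, LEVEL_MANAGER));   // bool(false)
var_dump(canGrantLevel(LEVEL_MANAGER, LEVEL_MANAGER)); // bool(false)
var_dump(canGrantLevel(LEVEL_MANAGER, LEVEL_USER));    // bool(true)
```

run the check on every request, not just when you build the menu. hiding a link doesn't stop someone typing the url.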

Lastly, track everything your admin users do. You can go as far as adding an approval system for changes or tying your backend to svn to undo / redo changes. it’s pretty much up to you and the project / budget.

using sessions

sessions are like raymond, everybody loves them. But if you’re depending fully on php sessions you should make some changes. There should only be 1-2 cookie and session variables sent ( you can also send the session id with get ), everything else should be handled internally in your application. That means session / user validation, tracking, and variables.

misc

  • using isset only tells you if a var is set, not if it contains a value. use empty when you need to know it actually holds something (just remember empty treats 0 and "0" as empty too).
  • instead of adding more columns to your db for certain options, you can build a flag system. this allows you to add new options, without always adding a new column.
  • encrypt sensitive data (ssn’s, sin’s, cc data, phone taps, next week’s lotto numbers)
  • aes is your safest bet if you’re using encryption
  • if you’re using encryption, you need to spec out an information access process and a permissions system (more than just roles)
  • kiss: the simpler it is for the end user, the less likely they are to break it or figure out how to exploit it.
  • Variable names shouldn’t match table column names
  • separate code from design. not so much security as being smart and lazy – saves a lot of work in the future
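
two of the points above are easier to see in code: isset vs empty, and the flag system. this is a sketch with made-up names; the flag constants and helper functions are not from any library, and the idea is simply one integer column with one bit per option.

```php
<?php
// isset vs empty on a made-up form array
$form = ['name' => '', 'age' => 0];
$nameIsSet   = isset($form['name']); // true: the key exists
$nameIsEmpty = empty($form['name']); // true: but it holds nothing
$ageIsEmpty  = empty($form['age']);  // true: empty() also treats 0 as empty

// flag system: each option gets its own bit (hypothetical option names)
const FLAG_NEWSLETTER   = 1; // binary 001
const FLAG_PUBLIC_EMAIL = 2; // binary 010
const FLAG_BETA_TESTER  = 4; // binary 100

function setFlag(int $flags, int $flag): int   { return $flags | $flag; }
function clearFlag(int $flags, int $flag): int { return $flags & ~$flag; }
function hasFlag(int $flags, int $flag): bool  { return ($flags & $flag) === $flag; }

// turn on two options and store the single integer in one column
$options = setFlag(0, FLAG_NEWSLETTER);
$options = setFlag($options, FLAG_BETA_TESTER);

echo $options; // 5
```

adding a new option later just means defining the next power of two (8, 16, ...), no new column needed.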

Read the php manual, you’ll find lots of good advice / functions in the classes.

Are we done?

Yes, hopefully someone gets something out of this, and I kept my promise of short and sweet. a quick google is all you’ll need to learn more about a subject. so right click -> search google

Got a question, feedback, or recommendation? leave a comment

Cheers Sid / Greg