Oct 15th, 2010

Linux Collection of Handy Scripts and One Liners - Volume 1.5 (Feedback Edition)

Following reader feedback, please see below for an updated version of Volume 1.

Ever wanted / needed HTTPD or another service to run with a raised thread priority?

Well, you have a couple of options: add lines to the /etc/init.d script to change the nice level on startup, or, if you only need this on a temporary basis without restarting the service but need every process to have a raised priority, use a bash script.

A much cleaner script here, again thanks to Matthew Ife.

#!/bin/bash
# Renice every running httpd process to the highest priority (negative values need root).
pgrep httpd | while read -r pid; do renice -n -20 -p "$pid"; done

You can renice between -20 (highest priority) and +19 (lowest). Depending on your requirements you can run this script from a cron job to raise or lower priorities; change httpd to whatever service you want to reprioritise.
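If you would rather not edit the one-liner each time, the same idea can be wrapped in a function. This is only a sketch: renice_service is a hypothetical name, and it assumes pgrep and renice behave as they do on a typical Linux box.

```shell
# Hypothetical helper: renice every process whose name exactly matches $1.
# Usage: renice_service <process name> <nice value>
renice_service() {
    name=$1
    prio=$2
    # Accept only one or two digit integers, optionally negative.
    case $prio in
        -[0-9]|-[0-9][0-9]|[0-9]|[0-9][0-9]) ;;
        *) echo "priority must be an integer" >&2; return 2 ;;
    esac
    # Nice values run from -20 (highest priority) to 19 (lowest).
    if [ "$prio" -lt -20 ] || [ "$prio" -gt 19 ]; then
        echo "priority must be between -20 and 19" >&2
        return 2
    fi
    pgrep -x "$name" | while read -r pid; do
        renice -n "$prio" -p "$pid"
    done
}
```

Call it as renice_service httpd -20 from a root shell; an out of range value is rejected before any process is touched.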

Ever needed to check files were being accessed / written to?

For this one you’re going to need the inotify-tools package, specifically the inotifywait binary.

inotifywait -m --timefmt "[%a %b %d %H:%M:%S %Y]" --format "%T [%e] %f" -r /folder/to/watch

An example use is checking that caching is working correctly and that cache files are being served in place of processing PHP files: simply change “/folder/to/watch” to your cache folder and refresh a few pages.

All being well you’ll get an output similar to the following:

(root@132 BUZZ1) # /usr/local/bin/inotifywait -m --timefmt "[%a %b %d %H:%M:%S %Y]" --format "%T [%e] %f" -r /path/to/saiweb/wp-content/cache/supercache/*
Setting up watches.  Beware: since -r was given, this may take a while!
Watches established.
[Thu Jul 15 20:59:37 2010] [OPEN] index.html
[Thu Jul 15 20:59:37 2010] [CLOSE_NOWRITE,CLOSE] index.html
[Thu Jul 15 21:00:08 2010] [OPEN,ISDIR] 
[Thu Jul 15 21:00:08 2010] [OPEN,ISDIR] security
[Thu Jul 15 21:00:08 2010] [OPEN,ISDIR] 
[Thu Jul 15 21:00:08 2010] [OPEN,ISDIR] vsftpd-chrooting-without-the-headache-allowing-shared-directories
[Thu Jul 15 21:00:08 2010] [OPEN,ISDIR] 
[Thu Jul 15 21:00:08 2010] [CLOSE_NOWRITE,CLOSE,ISDIR] vsftpd-chrooting-without-the-headache-allowing-shared-directories
[Thu Jul 15 21:00:08 2010] [CLOSE_NOWRITE,CLOSE,ISDIR] 
[Thu Jul 15 21:00:08 2010] [OPEN,ISDIR] the-zen-of-secured-shared-hosting-part-1
[Thu Jul 15 21:00:08 2010] [OPEN,ISDIR] 
[Thu Jul 15 21:00:08 2010] [CLOSE_NOWRITE,CLOSE,ISDIR] the-zen-of-secured-shared-hosting-part-1
[Thu Jul 15 21:00:08 2010] [CLOSE_NOWRITE,CLOSE,ISDIR] 
[Thu Jul 15 21:00:08 2010] [OPEN,ISDIR] php-security-considerations
[Thu Jul 15 21:00:08 2010] [OPEN,ISDIR] 
[Thu Jul 15 21:00:08 2010] [CLOSE_NOWRITE,CLOSE,ISDIR] php-security-considerations
[Thu Jul 15 21:00:08 2010] [CLOSE_NOWRITE,CLOSE,ISDIR] 
[Thu Jul 15 21:00:08 2010] [OPEN,ISDIR] antivirus-xp-2008-removal
[Thu Jul 15 21:00:08 2010] [OPEN,ISDIR] 
[Thu Jul 15 21:00:08 2010] [CLOSE_NOWRITE,CLOSE,ISDIR] antivirus-xp-2008-removal
[Thu Jul 15 21:00:08 2010] [CLOSE_NOWRITE,CLOSE,ISDIR] 
[Thu Jul 15 21:00:08 2010] [OPEN,ISDIR] suphplookupexception
[Thu Jul 15 21:00:08 2010] [OPEN,ISDIR] 
[Thu Jul 15 21:00:08 2010] [CLOSE_NOWRITE,CLOSE,ISDIR] suphplookupexception
[Thu Jul 15 21:00:08 2010] [CLOSE_NOWRITE,CLOSE,ISDIR] 
[Thu Jul 15 21:00:08 2010] [OPEN,ISDIR] honeypotting-for-viruses-statement-of-fees-200809
[Thu Jul 15 21:00:08 2010] [OPEN,ISDIR] 
[Thu Jul 15 21:00:08 2010] [CLOSE_NOWRITE,CLOSE,ISDIR] honeypotting-for-viruses-statement-of-fees-200809
[Thu Jul 15 21:00:08 2010] [CLOSE_NOWRITE,CLOSE,ISDIR] 
[Thu Jul 15 21:00:08 2010] [CLOSE_NOWRITE,CLOSE,ISDIR] security
[Thu Jul 15 21:00:08 2010] [CLOSE_NOWRITE,CLOSE,ISDIR]

Alternatively you can use the following approach contributed by Matthew Ife:

auditctl -w /some/path -p w

The watch stays in place until you remove it, reload the audit rules, or reboot (it is not written to the persistent rules file), and the relevant log entries appear in /var/log/audit/audit.log. It logs far more useful information than inotifywait, and on systems that already ship auditd it does not require you to install additional packages.

As the inotifywait output above shows, the rewrite rules are sending users to the cached files/folders; in that example I used my wp-supercache folder.

Ever needed to quickly get the memory usage of all threads for a service?

You have two options for this: a single line

 ps -Ao rsz,comm,pid | grep <process name>

or a bash function you can place in your ~/.bashrc

function appmem(){
    if [ -z "$1" ]; then
        echo "appmem <string to filter>"
        echo "e.g. appmem httpd"
    else
        # rsz is the resident set size in KB; the filter is quoted so spaces and globs are safe
        ps -Ao rsz,comm,pid | grep -- "$1"
    fi
}

You can then call this (after logging back in again, or running source ~/.bashrc, to load the function) using

appmem <filter>

Replacing <filter> with httpd, for instance, will give you output similar to the following:

8032 httpd            6207
33080 httpd           13828
 8552 httpd           14095
28952 httpd           14102
 8540 httpd           14103
30848 httpd           16741
31296 httpd           16832
30452 httpd           18439
31044 httpd           19996
30968 httpd           23287
30356 httpd           23300
25636 httpd           24553
29712 httpd           24771
25588 httpd           24777
31632 httpd           24778
25608 httpd           24796
29716 httpd           24812
28152 httpd           24813
31684 httpd           31291

This shows resident memory in kilobytes, command, and process ID. You can see here that my httpd processes currently use between roughly 8 MB and 33 MB each (due to my optimizations; I highly recommend you read parts 1-3).
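To get a single figure rather than a list, the same ps output can be totalled with awk. A minimal sketch, assuming the three column layout shown above; sum_rss is a hypothetical name, and it matches the command column exactly rather than as a substring.

```shell
# Hypothetical helper: total the RSS column for one command name.
# Reads "rsz comm pid" lines on stdin, as printed by: ps -Ao rsz,comm,pid
sum_rss() {
    awk -v name="$1" '
        $2 == name { total += $1; count++ }
        END { printf "%d processes, %d KB total\n", count, total }
    '
}

ps -Ao rsz,comm,pid | sum_rss httpd
```

Bear in mind that resident memory includes pages shared between the processes, so the total overstates what the service really costs you.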

Dump mysql data and compress on the fly

mysqldump -h <host> -u <user> -p <dbname> | bzip2 -c7 > /path/to/dump.sql.bz2

Self-explanatory, that one: it pipes the output from mysqldump through bzip2 (which usually compresses better than gzip, at the cost of speed) and writes it out to a file. If you really need a gzipped file, just replace bzip2 with gzip in the line above.

Ever needed a selection of passwords generated?

Using a slightly modified line originally provided by Matthew Ife,

function pwgen(){
        # Read random bytes, keep only the allowed characters, and trim to 15.
        dd if=/dev/urandom bs=2048 count=1 2>/dev/null | tr -cd 'a-zA-Z0-9+@!$()' | cut -b1-15
}

Plant this in your ~/.bashrc for a callable function that generates a secure 15 character password each time you call it, handy when you’re fed up of 1337’ifying everything.

example output:

)S9esjccl?MMiC1

If you want a runtime variable length you could change the last stage to cut -b1-$1 and then call pwgen 15, for example.
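Putting the variable length idea together with a count gives something like the following. It is a sketch only: genpw is a hypothetical name (chosen so it does not shadow the function above), and the defaults of 15 characters and one password are assumptions.

```shell
# Hypothetical helper: print <count> random passwords of <length> characters.
# Usage: genpw [length] [count]
genpw() {
    len=${1:-15}
    count=${2:-1}
    i=0
    while [ "$i" -lt "$count" ]; do
        # 2048 random bytes leave plenty of characters after filtering.
        dd if=/dev/urandom bs=2048 count=1 2>/dev/null \
            | LC_ALL=C tr -cd 'a-zA-Z0-9+@!$()' \
            | cut -b1-"$len"
        i=$((i + 1))
    done
}

genpw 15 3
```

LC_ALL=C keeps tr working byte by byte, which avoids complaints about invalid multibyte sequences in some locales.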

Check MySQL MyISAM fragmentation

use information_schema;
SELECT CONCAT(TABLE_SCHEMA,'.',TABLE_NAME) AS TABLE_NAME,
       ENGINE,
       (DATA_LENGTH/1024/1024) AS DATA_LENGTH,
       (INDEX_LENGTH/1024/1024) AS INDEX_LENGTH,
       ((DATA_LENGTH + INDEX_LENGTH)/1024/1024) AS TOTAL_LENGTH,
       TABLE_ROWS,
       UPDATE_TIME,
       ((INDEX_LENGTH/(DATA_LENGTH + INDEX_LENGTH))*100) AS INDEX_PER,
       ((DATA_LENGTH/(DATA_LENGTH + INDEX_LENGTH))*100) AS DATA_PER,
       (DATA_FREE/DATA_LENGTH) AS FRAG_RATIO
FROM TABLES
WHERE ENGINE IS NOT NULL
  AND DATA_LENGTH >= (1024*1024)
  AND (DATA_FREE/DATA_LENGTH) >= 0.02
ORDER BY FRAG_RATIO DESC;

This gives you a very quick overview of the make-up of your tables (sizes in MB) and their fragmentation (data free vs data length). Note that the query lists every engine; add AND ENGINE = 'MyISAM' to the WHERE clause to restrict it to MyISAM tables.

Oct 12th, 2010

hosting, linux, technology

Make Your Webapp Shine With Varnish - Part 1

Part 1, what is varnish?

The varnish cache project is one you really need to get familiar with if you manage any high volume websites. It can mean the difference between a self-destructing web app that buckles under its own load, and an apparently seamless web app serving thousands of concurrent connections with relative ease.

How does it work?

Varnish acts as a proxy server: when a user sends a GET request, varnish looks in its internal store for a cached version, and if it cannot find one it passes the request to the “back end”, in this case an apache server. Varnish then caches the response for subsequent requests.

Now you may ask yourself, why do you need this? It boils down to what you are trying to achieve with your web application. If your application is heavily reliant on dynamic content and regularly gets some 400 concurrent users, for example, let’s assume the following:

400 concurrent unique users
Average page render time is 0.85s

The Math

Based on this, if you were to place varnish in front of your application with a 60 second ttl (time to live, the length of time varnish will hold an object in cache):

Varnish ttl 60 seconds
400/0.85 = 470.59/second
28235.29/minute
Factor of reduction to “back end”: x28235.29

So in the example above, assuming every request is for the same cacheable page, simply caching that page for as little as 60 seconds reduces the requests/minute reaching the back end from 28235.29 to 1. Even reducing the cache time to 10 seconds in this example would give a x4705.88 reduction.
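The arithmetic above can be reproduced with a short shell function. This is only a sketch: backend_reduction is a hypothetical name, and it assumes every request is for the same cacheable page, so one back end fetch per ttl window serves everyone.

```shell
# Hypothetical helper: requests absorbed by one cached object per ttl window.
# Usage: backend_reduction <concurrent users> <render time in s> <ttl in s>
backend_reduction() {
    LC_ALL=C awk -v users="$1" -v render="$2" -v ttl="$3" 'BEGIN {
        per_second = users / render          # requests per second hitting the site
        printf "%.2f\n", per_second * ttl    # requests served per back end fetch
    }'
}

backend_reduction 400 0.85 60   # 60 second ttl
backend_reduction 400 0.85 10   # 10 second ttl
```

A real site serves many different URLs, so the reduction you see in practice depends on how the traffic is spread across them.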

Why is this reduction a good thing? Well, time on CPU for one. Varnish, when configured correctly, is very, very fast, and even with an out of the box configuration it’s still going to be much faster than your dynamic web application.

Summary

So here ends a brief introduction to varnish and why you really want to start using it. In the following parts we will cover:

Configuration overview

brief overview of each sub section based on the 2.1 syntax

Advanced configuration

Load balancing
Failover handling
Raising cache hitrate
Pros and cons of each setup
Benchmarks

Sep 30th, 2010

linux

Ssh X11 Forwarding Who Needs Vnc?

This is one of those things I find my jaw dropping at, whilst punching myself for not knowing about it sooner. It’s true: as much as I live in the CLI and ssh to do my job, I find I sometimes require a VNC connection (e.g. for the plethora of system-config-* stuff in RH).

Now, however, there is an alternative (so long as your client machine has X11 installed):

ssh -X -l <user> <server ip>

That’s it, simple as that. Now use a CLI command to launch your normal GUI tool, e.g.

kate ~/.bashrc

And the window will appear on the machine you are working from. Now don’t think the GUI is running from your machine, it’s not!

Your machine is now acting as a thin client, simply interacting over SSH, with the GUI tool running on the server itself!

And therein lies the awesomeness, especially if, like me, you run OSX whilst managing *nix servers.

Sep 22nd, 2010

linux

Iptables Insanity

Namely a bug to do with iptables rate limiting,

iptables -I INPUT 2 -p tcp --dport http -m state --state NEW -m recent --update --seconds 60 --hitcount 20 -j LOG --log-level=7

works!

iptables -I INPUT 2 -p tcp --dport http -m state --state NEW -m recent --update --seconds 60 --hitcount 60 -j LOG --log-level=7
iptables: Unknown error 18446744073709551615

-j REJECT also produces the same.

Simply increasing the “hitcount” causes this error. The only workaround I have come up with is decreasing the --seconds arg to yield the same hits/sec over a shorter window; still bloody annoying!
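A likely explanation, rather than a bug as such: the recent match only remembers a fixed number of packets per address, set by the module parameter ip_pkt_list_tot, which defaults to 20, and a --hitcount above that limit is rejected. The sketch below reads the current limit. recent_hit_limit and its optional directory argument are hypothetical, and the module name (xt_recent on newer kernels, ipt_recent on older ones) is an assumption about your system.

```shell
# Hypothetical helper: print the largest --hitcount the recent match accepts.
# Usage: recent_hit_limit [sysfs module directory, defaults to /sys/module]
recent_hit_limit() {
    sysmod=${1:-/sys/module}
    for mod in xt_recent ipt_recent; do
        param="$sysmod/$mod/parameters/ip_pkt_list_tot"
        if [ -r "$param" ]; then
            cat "$param"
            return 0
        fi
    done
    echo "recent module not loaded" >&2
    return 1
}
```

If the limit turns out to be the cause, reloading the module with a larger value (for example modprobe xt_recent ip_pkt_list_tot=60, after removing any rules that use it) should let the second rule load; treat that as a starting point and check it against your kernel's documentation.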

Aug 26th, 2010

hosting, php

PHP & Caching an in Depth Review.

Sounds simple enough, right?

Use a cache to serve pages faster. Well yes, that is true, but people often do not realize the fundamentals of caching, and how, if not done properly, it can be detrimental to performance.

The first thing you need to realize is that by caching, your content is no longer dynamic … (short pause while we wait for the outrage in the back to die down).

The whole point of your cache is that it will be used instead of processing all your code. Why is this beneficial?

You have to remember that PHP is an interpreted language, meaning it takes the following I/O flow:

Apache -> mod_php -> Script -> Interpreter -> Bytecode -> Execution -> Output Buffer

Now there are two types of caching to consider. The first is complete output caching, which also yields the best performance. The second is opcode caching, which caches the bytecode generated by the interpreter, thus removing that step from the chain of execution.

With me so far? Ok take a deep breath because here we go …

Output caching

This option often yields the best performance, but at the cost of removing the dynamic element from your web app. The trade-off can be summed up in a single line: what good is dynamic content if you can serve only 5% of your audience at a given time?

Another name for this is “the slashdot effect”. There are many options for output caching, and you should ideally provide both gzipped and plain cache files to your end users. For instance, on this blog I use WP Super Cache, and can highly recommend it; as new content is posted the relevant caches are regenerated. If you are writing your own web app, check for the “Accept-Encoding: gzip” header being sent by the user’s browser.

For end user transparency, couple this with some mod_rewrite voodoo:

RewriteCond %{HTTP:Accept-Encoding} gzip
RewriteCond %{DOCUMENT_ROOT}/cache/%{HTTP_HOST}/%{REQUEST_FILENAME}.gz -f
RewriteRule ^(.*) "/cache/%{HTTP_HOST}/%{REQUEST_FILENAME}.gz" [L]

1: If gzip is supported
2: and the cache file exists
3: rewrite the request to the compressed cached file

Your “chain of execution” is now

Apache -> readfile

To serve non-gzipped content:

RewriteCond %{HTTP:Accept-Encoding} !gzip
RewriteCond %{DOCUMENT_ROOT}/cache/%{HTTP_HOST}/%{REQUEST_FILENAME} -f
RewriteRule ^(.*) "/cache/%{HTTP_HOST}/%{REQUEST_FILENAME}" [L]

Now, to clarify a point: you should not be caching images, css, js etc. this way; we’re only covering dynamic content here. The above are only examples to get you started, and you should write rules to exclude certain content specific to your needs.
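The gzip rules above expect a .gz file to sit alongside each plain cache file. One way to produce it is sketched below, assuming your cache writer has already saved the plain file; make_gz_copy is a hypothetical name.

```shell
# Hypothetical helper: write a gzip copy next to an existing cache file.
# Usage: make_gz_copy /path/to/cache/file
make_gz_copy() {
    src=$1
    # Compress to a temporary name first so a half-written .gz is never served.
    gzip -9 -c "$src" > "$src.gz.tmp" && mv "$src.gz.tmp" "$src.gz"
}
```

Writing to a temporary name and renaming matters here because the rewrite condition only tests that the file exists, not that it is complete.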

And before going off on any more of a tangent, here are some figures for you!

ab -c 100 -n 500 -g ./saiweb-nocache-nogzip.bpl https://blog.oneiroi.co.uk/

No caching
No Gzip

Server Hostname:        blog.oneiroi.co.uk
Server Port:            80

Document Path:          /
Document Length:        109086 bytes

Concurrency Level:      100
Time taken for tests:   123.304 seconds
Complete requests:      500
Failed requests:        0
Write errors:           0
Total transferred:      54831652 bytes
HTML transferred:       54692607 bytes
Requests per second:    4.06 [#/sec] (mean)
Time per request:       24660.828 [ms] (mean)
Time per request:       246.608 [ms] (mean, across all concurrent requests)
Transfer rate:          434.26 [Kbytes/sec] received

Connection Times (ms)
              min   mean [+/-sd]  median     max
Connect:       57    423   225.5     374    1837
Processing:  2331  20460 16701.2   17232  115192
Waiting:      270   1835  4155.8     576   38549
Total:       2656  20882 16648.1   17692  115421

Percentage of the requests served within a certain time (ms)
  50%  17692
  66%  20700
  75%  24063
  80%  25770
  90%  35157
  95%  53328
  98%  82957
  99% 101497
 100% 115421 (longest request)

As can be seen, as the number of requests grew the response time began to increase sharply and the overall performance of the site degraded. Bear in mind these benchmarks are being made over my home DSL for the time being.

ab -c 100 -n 500 -g ./saiweb-cached.bpl https://blog.oneiroi.co.uk/

Server Hostname:        blog.oneiroi.co.uk
Server Port:            80

Document Path:          /
Document Length:        109086 bytes

Concurrency Level:      100
Time taken for tests:   79.212 seconds
Complete requests:      500
Failed requests:        0
Write errors:           0
Total transferred:      54889292 bytes
HTML transferred:       54705058 bytes
Requests per second:    6.31 [#/sec] (mean)
Time per request:       15842.342 [ms] (mean)
Time per request:       158.423 [ms] (mean, across all concurrent requests)
Transfer rate:          676.70 [Kbytes/sec] received

Connection Times (ms)
              min   mean [+/-sd]  median     max
Connect:       56    314   112.5     322    1341
Processing:  2545  14721  5116.7   14296   36677
Waiting:      216   1283  2228.2     351   13776
Total:       2647  15035  5108.9   14624   36897

Percentage of the requests served within a certain time (ms)
  50%  14624
  66%  16675
  75%  18058
  80%  19093
  90%  21608
  95%  23489
  98%  27684
  99%  29972
 100%  36897 (longest request)

A much more consistent line here. However, as you can clearly see, the response times are roughly equal; this is due to my DSL connection. So let’s run these tests from somewhere with a little more bandwidth, say the webserver itself, using a loopback connection.

ab -c 100 -n 500 -g ./saiweb-cached.bpl https://blog.oneiroi.co.uk/

Server Hostname:        blog.oneiroi.co.uk
Server Port:            80

Document Path:          /
Document Length:        109086 bytes

Concurrency Level:      100
Time taken for tests:   0.262199 seconds
Complete requests:      500
Failed requests:        0
Write errors:           0
Total transferred:      54945406 bytes
HTML transferred:       54761172 bytes
Requests per second:    1906.95 [#/sec] (mean)
Time per request:       52.440 [ms] (mean)
Time per request:       0.524 [ms] (mean, across all concurrent requests)
Transfer rate:          204642.27 [Kbytes/sec] received

Connection Times (ms)
              min   mean [+/-sd]  median     max
Connect:        0      1     2.6       0       9
Processing:     4     45    10.3      49      58
Waiting:        1     38     9.9      41      50
Total:          9     47     9.5      50      64

Percentage of the requests served within a certain time (ms)
  50%     50
  66%     51
  75%     52
  80%     52
  90%     54
  95%     56
  98%     59
  99%     61
 100%     64 (longest request)

In this case the response times rise and then plateau, after which no further degradation occurs.

ab -c 100 -n 500 -g ./saiweb-nocache.bpl https://blog.oneiroi.co.uk/

Server Hostname:        blog.oneiroi.co.uk
Server Port:            80

Document Path:          /
Document Length:        109086 bytes

Concurrency Level:      100
Time taken for tests:   8.919565 seconds
Complete requests:      500
Failed requests:        0
Write errors:           0
Total transferred:      54680788 bytes
HTML transferred:       54543000 bytes
Requests per second:    56.06 [#/sec] (mean)
Time per request:       1783.913 [ms] (mean)
Time per request:       17.839 [ms] (mean, across all concurrent requests)
Transfer rate:          5986.73 [Kbytes/sec] received

Connection Times (ms)
              min   mean [+/-sd]  median     max
Connect:        0     14    30.7       0      85
Processing:   246   1556   714.3    1365    6735
Waiting:      241   1539   707.8    1360    6731
Total:        250   1571   708.0    1368    6735

Percentage of the requests served within a certain time (ms)
  50%   1368
  66%   1451
  75%   1550
  80%   1700
  90%   2658
  95%   3121
  98%   3491
  99%   3638
 100%   6735 (longest request)

Oh dear, oh dear, let’s cut to the hard facts, shall we?

We’ve gone from serving 1906.95 requests a second to 56.06:

a 97.1% decrease in throughput when removing caching
or roughly 34 times the throughput (a 3301.6% increase) when implementing caching

We’ve gone from a response time of ~50ms to ~2000ms:

a 40x (3900%) increase in response time when removing caching
or a 97.5% reduction in response time when caching is on
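Percentages like these are easy to get wrong by one whole multiple: a percentage increase is (new - old) / old, which is 100 points less than the plain ratio new / old written as a percentage. A small sketch for checking the figures from the ab runs above; pct_change is a hypothetical name.

```shell
# Hypothetical helper: percentage change going from <old> to <new>.
# Usage: pct_change <old> <new>
pct_change() {
    LC_ALL=C awk -v old="$1" -v new="$2" 'BEGIN {
        printf "%.1f\n", (new - old) / old * 100
    }'
}

pct_change 1906.95 56.06   # throughput change when caching is removed
pct_change 56.06 1906.95   # throughput change when caching is added
```

A negative result is a decrease, so the first call reports the drop in requests per second and the second the gain.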

Then there are the CPU and memory overheads to consider. In this case a more prolonged test is required to gather the relevant sar data. Now let me tell you that intentionally trying to get a test like this to run over a 10 minute period with caching on is a lot harder than it sounds; the tests were in fact completing far too quickly …

The problem I face is getting ab to run for long enough with caching on to produce useful results. I know for a fact that uncached the server will fail under the load, so I have no way at present of grabbing this data reliably.

What I can tell you is that this command: ab -c 300 -n 1000000 -g ./saiweb-cached.bpl https://blog.oneiroi.co.uk/

caused a load average of 2.96, 1.9, 0.93 with caching on, and got as high as 21 before I killed it with caching off.

Now I am going to bring this post to an end as it is getting quite long. I plan to cover the following in a 2nd part:

Opcode caching
CPU & Memory usage, Cached vs. Uncached