Author Archives: Carles Mateo

Stopping a BitTorrent DDoS attack

After the success of the article on stopping an XML-RPC attack against a WordPress site, and the thank-you messages I received (I actually helped a company whose site was being taken down every day and that asked me for help), it's time to explain how to stop a much nastier attack.

The first sign I saw was that the server was getting slower and slower, which should be nearly impossible: I had set up a very good server, and the application uses many good development techniques to avoid bottlenecks.

I looked at the server and saw around 3,000 connections in the SYN_SENT state. Apparently we were under a SYN flood attack.

[Image: high load on the web server during the attack]

netstat revealed more than 6,000 different IP addresses connecting to the server.
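The kind of check described above can be sketched as follows. This is only an illustration, assuming text in the column layout printed by `netstat -ant`; the function name `summarize_netstat` and the sample lines are made up for the example and are not data from the attack.

```python
from collections import Counter


def summarize_netstat(netstat_output):
    """Count TCP connections per state and collect the distinct remote IPs.

    Expects text in the layout printed by `netstat -ant`:
    Proto Recv-Q Send-Q Local-Address Foreign-Address State
    """
    states = Counter()
    remote_ips = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a TCP connection line
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        foreign_address, state = fields[4], fields[5]
        states[state] += 1
        if state != "LISTEN":
            # Drop the port: keep everything before the last ':'
            remote_ips.add(foreign_address.rsplit(":", 1)[0])
    return states, remote_ips


# Made-up sample lines, only to show the expected input format
sample = """\
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN
tcp        0      0 10.0.0.5:80             203.0.113.7:51234       SYN_RECV
tcp        0      0 10.0.0.5:80             203.0.113.7:51235       SYN_RECV
tcp        0      0 10.0.0.5:80             198.51.100.9:40000      ESTABLISHED
"""
states, remote_ips = summarize_netstat(sample)
print(states.most_common())
print(len(remote_ips), "distinct remote IPs")
```

Counting per state and per remote IP separately makes it easier to tell a few very busy clients from thousands of different sources.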

The server had only 30 GB of RAM, and it was filling up: more and more connections meant more and more Apache processes were needed to respond quickly to the real users, so it was clear that it was going to struggle.

I improved the Apache configuration so the server could handle many more connections with less memory consumption and overhead, added some enhancements for blocking SYN flood attacks, and restarted Apache.

This greatly reduced the impact of the attacks, but I knew things would only get worse. I was buying time without disrupting the functioning of the website.

Over the next hours the attacks increased to around 7,500 concurrent connections. The memory was reaching its limits, so I decided it was time to upgrade the instance. I doubled the memory and added many more cores, up to 36, by using one of the newest Amazon instance types, c4.8xlarge.

[Image: heavy load on the c4.8xlarge instance]

The good thing about the Cloud is that you pay only for the time you use the resources. So when the waters calm down again, I can reduce the size of the instance and save the company some hundreds.

I knew it was a matter of time. The server had stabilized at 40 GB used out of 60 GB, but I knew the pirates would keep trying to shut down the service.

Once the SYN flood was stopped and I was sure that the service was safe for a while, I checked the logs to see if I could detect a pattern among the attacks. I did.

[Image: access log showing the BitTorrent requests]

Most of the requests we were receiving were for a file called announce.php, which obviously does not exist on the server, so a 404 error was returned.

The user agent reported in many cases was BitTorrent, or a Torrent-compatible product, and the URL carried parameters such as a hash, uploaded, downloaded, left… So I realized that my server was somehow the target of a Torrent attack, in which it had been announced as a Torrent tracker.
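To illustrate the pattern, here is a minimal sketch that flags a request target as a tracker announce when its query string carries info_hash. The function name and the sample URLs are hypothetical; only the parameter names (info_hash, uploaded, downloaded, left) come from the log lines described above.

```python
from urllib.parse import parse_qs, urlsplit

# Parameters seen in the announce requests described above
TRACKER_PARAMS = ("info_hash", "uploaded", "downloaded", "left")


def tracker_params_in(request_target):
    """Return the tracker-style parameters present in a request target."""
    query = urlsplit(request_target).query
    params = parse_qs(query, keep_blank_values=True)
    return [name for name in TRACKER_PARAMS if name in params]


def looks_like_tracker_announce(request_target):
    """A request is treated as an announce when it carries info_hash."""
    return "info_hash" in tracker_params_in(request_target)


# Hypothetical request targets, with and without the .php extension
print(looks_like_tracker_announce(
    "/announce.php?info_hash=%12%34%56&uploaded=0&downloaded=0&left=1000"))
print(looks_like_tracker_announce("/announce?info_hash=%AB%CD&left=5"))
print(looks_like_tracker_announce("/products?id=3"))
```

Matching on the parameter rather than on the path catches the announce requests regardless of which file name the clients ask for.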

The .htaccess in frameworks like Laravel, Catalonia Framework… and CMSs like WordPress, Joomla, ezpublish… tries to read the requested file from the filesystem and, if it doesn’t exist, serves index.php. So as a first action I created a file /announce.php that simply did an exit();

Sample .htaccess from Laravel:

<IfModule mod_rewrite.c>
    <IfModule mod_negotiation.c>
        Options -MultiViews
    </IfModule>

    RewriteEngine On

    # Redirect Trailing Slashes...
    RewriteRule ^(.*)/$ /$1 [L,R=301]

    # Handle Front Controller...
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteRule ^ index.php [L]
</IfModule>

Sample code for announce.php would be like:

<?php
/**
 * Creator: Carles Mateo
 * Date: 2015-01-21 Time: 09:39
 */

// A cheap way to stop an attack based on requesting this file
http_response_code(406);
exit();

The 406 response code was an attempt to see whether the BitTorrent clients were sensitive to headers and would stop. But they didn’t.

With the simple addition of announce.php, with exit(), I reduced the load on the server from 90% to 40% in just one second.

The reason a not-found page was causing so much damage is that the server’s 404 error page is customized and offers alternative results (assuming the product you were looking for is no longer available). Before it is displayed, the whole framework is loaded and the routes are checked to see if the URL matches, so there is some processing to be done on the PHP side. It takes 100 ms to reply, which is not much, but it was an unnecessary waste of CPU: even though it is very optimized, every single not-found URL caused some processing. Since the attack had more than 7,000 different IPs hitting the server simultaneously, at a certain point it would become a problem and the server would start returning 500 errors to the customers.

The logs were also showing other patterns, for example:

announce?info_hash…

So, without the PHP extension. Those kinds of requests would not hit my wall file announce.php but would go through index.php (as .htaccess directs anything not found there).

I could change the .htaccess to send those requests to hell, but I wanted a more definitive solution, something that would prevent the server from wasting CPU and would allow the servers to resist an attack 1,000 times harder.

In the end, the common pattern was that the BitTorrent clients were sending, via GET, a parameter called info_hash, so I blocked all such requests based on that.

I wrote this small program and added it to index.php:

// Urgent patch by Carles to stop an attack based on Torrent
// http://blog.carlesmateo.com
if (isset($_GET['info_hash'])) {

    // In case you use CDN, proxy, or load balancer
    $s_ip_proxy = '';

    $s_ip_address = $_SERVER['REMOTE_ADDR'];


    // Warning: if you use a CDN, a proxy server or a load balancer, do not add its ip to the blacklist
    if ($s_ip_proxy == '' || $s_ip_address != $s_ip_proxy) {
        $s_date = date('Y-m-d');

        $s_ip_log_file = '/tmp/ip-to-blacklist-'.$s_date.'.log';
        file_put_contents($s_ip_log_file, $s_ip_address."\n", FILE_APPEND | LOCK_EX);
    }


    // 406 means 'Not Acceptable'
    http_response_code(406);
    exit();
}

 

Please note, this code can be added to any software like Zend Framework, Symfony, Catalonia Framework, Joomla, WordPress, Drupal, ezpublish, Magento… Just add those lines at the beginning of public/index.php, right before the framework starts working. Just be careful: after a core update you’ll have to reapply it.

After that I deleted the no-longer-needed announce.php.

What the program does is write the IP that sent a request matching the Torrent pattern (unless it is the proxy/CDN IP you defined) to a log file named, for example:

/tmp/ip-to-blacklist-2015-01-23.log

It also calls exit(), stopping the execution and saving many CPU cycles.

The idea of the date at the end of the file name is to blacklist the IPs for only 24 hours, as we will see later.

With this I reduced CPU usage to around 5-15%.

Then there is the other part of stopping the attack: a bash program that can be run from the command line or added to cron to be launched, depending on the volume of the attacks, every 5 minutes or every hour.

[Image: blocking traffic with iptables]

ip_blacklist.sh

#!/bin/bash
# Ip blacklister by Carles Mateo
s_DATE=$(date +%Y-%m-%d)
s_FILE=/tmp/ip-to-blacklist-$s_DATE.log
s_FILE_UNIQUE=/tmp/ip-to-blacklist-$s_DATE-unique.log
sort -u "$s_FILE" > "$s_FILE_UNIQUE"

echo "Counting the ip addresses to block in $s_FILE_UNIQUE"
wc -l < "$s_FILE_UNIQUE"

sleep 3
# We clear the iptables rules
iptables -F
iptables -X
iptables -t nat -F
iptables -t nat -X
iptables -t mangle -F
iptables -t mangle -X
iptables -P INPUT ACCEPT
iptables -P FORWARD ACCEPT
iptables -P OUTPUT ACCEPT

# To list the rules sudo iptables -L
# /sbin/iptables -L INPUT -v -n
# Enable ssh for all (you can add a Firewall at Cloud provider level or restrict the rule to your ip)
sudo iptables -A INPUT -p tcp --dport ssh -j ACCEPT

# Read the file line by line, one ip address per line
while read -r s_ip_address
do
    echo "Blocking traffic from $s_ip_address"
    sudo iptables -A INPUT -s "$s_ip_address" -p tcp --destination-port 80 -j DROP
    sudo iptables -A INPUT -s "$s_ip_address" -p tcp --destination-port 443 -j DROP
done < "$s_FILE_UNIQUE"

# Ensure Accept traffic on Port 80 (HTTP) and 443 (HTTPS)
sudo iptables -A INPUT -p tcp --dport 80 -j ACCEPT
sudo iptables -A INPUT -p tcp --dport 443 -j ACCEPT

# To block the rest
# sudo iptables -A INPUT -j DROP

# Use iptables-save and iptables-restore to make these changes permanent
# sudo sh -c "iptables-save > /etc/iptables.rules"
# then add this line to the interface section of /etc/network/interfaces:
#   pre-up iptables-restore < /etc/iptables.rules
# https://help.ubuntu.com/community/IptablesHowTo

This script takes the list of IP addresses, writes the unique ones to another file, and then loops over them, adding each one to iptables, the Linux firewall, to block its access to the web on port 80 (HTTP) and 443 (HTTPS, SSL). You can also block all the ports for those IPs if you want.
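As a complement, here is a hedged sketch of the same deduplication step with one extra precaution that the script above does not take: every line is checked to be a valid IPv4 address before a firewall rule is built for it. The function names are hypothetical, and the sketch only builds the iptables argument lists; it does not execute anything.

```python
import ipaddress


def unique_valid_ipv4(lines):
    """Return the distinct, syntactically valid IPv4 addresses, sorted."""
    found = set()
    for line in lines:
        candidate = line.strip()
        if not candidate:
            continue
        try:
            found.add(ipaddress.IPv4Address(candidate))
        except ValueError:
            # Not an IPv4 address: ignore it instead of handing it to iptables
            continue
    return [str(ip) for ip in sorted(found)]


def drop_rules(ip_addresses, ports=(80, 443)):
    """Build (but do not run) one iptables argument list per IP and port."""
    return [
        ["iptables", "-A", "INPUT", "-s", ip, "-p", "tcp",
         "--destination-port", str(port), "-j", "DROP"]
        for ip in ip_addresses
        for port in ports
    ]


# Made-up log content, including a duplicate and a line that is not an IP
log_lines = ["203.0.113.7\n", "198.51.100.9\n", "203.0.113.7\n", "not-an-ip\n", "\n"]
ips = unique_valid_ipv4(log_lines)
for rule in drop_rules(ips):
    print(" ".join(rule))
```

Validating first means a corrupted or truncated log line never reaches the firewall command.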

With this CPU use went to 0%.

Note: One of my colleagues, a wonderful SysAdmin at Ackstorm ISP, points out that some of you may prefer using REJECT instead of DROP. There is an interesting conversation on Server Fault about this.

After fixing the problem I searched the Internet for people reporting attacks like the one I suffered. The most interesting thing I found was this article: BotTorrent: Misusing BitTorrent to Launch DDoS Attacks, from University of California, Irvine. (local copy on this website BotTorrent)

Basically any site on the Internet can be attacked at a large scale, as every user downloading the Torrent will try to connect to the innocent server to report the progress of the download/upload. If this attack is performed with hundreds of files, it means hundreds of thousands of IPs connecting to the server… The server will run out of connections or memory, or its bandwidth will be saturated by the bad traffic.

I saw that the attackers were using highly downloaded porn files and apparently telling the Torrent network that our server was a Torrent tracker. So, corroborating my hypothesis, all the people downloading those Torrents were sending updates to our server, believing it was a tracker. A trick from the sad pirates.

Some people, business users, asked me who could be interested in harming others’ servers or disrupting others’ businesses without any immediate gain (like taking control of your servers to send spam).

I told them:

  • Competitors who hate you because you’re successful and want to disrupt your business (they pay the pirates to carry out attacks; I’ve helped companies that were taken down by those pirates)
  • Investors who may want to buy you at a cheaper price (after badly trolling you for a week or two)
  • Fake “security” companies that will “casually” offer their services when you most need them and charge a high bill
  • Pirates who want to extort you

So, bad people who, instead of using their talent to create, just destroy and do evil to others.

In other cases it could be the bad luck of having been assigned an IP that previously hosted a Torrent tracker. That doesn’t make much sense in the Cloud, as it is expensive, but it would if a server with that IP had been hacked and used as a tracker for a while.

Governments wanting to disrupt services (like Torrent) could also be the cause, by clumsily redirecting DNS to random IPs, or entertainment companies trying to shut down Torrent trackers could try to poison DNS to stop users from using BitTorrent.

 

See the definitive solution in the next article.

Performance of several languages

Notes on 2017-03-26 18:57 CEST – Unix time: 1490547518 :

  1. As several of you have noted, it would be much better to use a random value, for example one read from disk. This improvement will be made in the next benchmark. Good suggestion, thanks.
  2. Due to my lack of time, updating the article took longer than expected. I was in a long process with Google, and now I’m looking for a new job.
  3. I note that most people don’t read the article and comment about things that are clearly stated in it. Please read before posting; otherwise don’t be surprised if the comment is not published. I have to keep the blog clean of trash.
  4. I’ve left out a few comments because they were disrespectful. Mediocrity is present in society, so I simply avoid publishing comments that lack basic respect and good manners. If a comment makes a point, from an Engineering point of view, it is always published.

Thanks.

(This article was last updated on 2015-08-26 15:45 CEST – Unix time: 1440596711. See changelog at bottom)

One may think that Assembler is always the fastest, but is that true?

If I write code in 32-bit Assembler instead of 64-bit, so it can run on both 32- and 64-bit machines, will it be faster than code that a dynamic compiler optimizes at execution time to take advantage of my computer’s architecture?

What if a future JIT compiler is able to use all the cores to execute a program written as a single thread?

Are PHP, Python, or Ruby fast compared to C++? Does Facebook’s HipHop Virtual Machine really speed up PHP execution?

This article shows some results and shares my conclusions. It is a basis for discussion with my colleagues. It is not an end: we are always doing tests, looking for the edge, and looking at the root of things in detail. And things often change from one version to the next. This article does not show an absolute truth, but it sheds some light on interesting aspects.

It can only show the performance for the particular case used in the test, although generic core instructions have been selected. Many more tests are necessary, and some functions differ in performance. But this article is a necessary starting point for the discussion with my IT-extreme-lover friends and a necessary step for the upcoming tests.

It brings very important data for managers and decision makers, as choosing a language with adequate performance can save millions in hardware (especially when you use the Cloud and pay per hour of use) or thousands of hours in MapReduce processes.

Acknowledgements and thanks

Credit to the great Eduard Heredia for porting my C source code to:

  • Go
  • Ruby
  • Node.js

And for the nice discussions of the results, and on the optimizations and dynamic vs. static compilers.

Thanks to Juan Carlos Moreno, CTO of ECManaged Cloud Software for suggesting adding Python and Ruby to the languages tested when we discussed my initial results.

Thanks to Joel Molins for the interesting discussions on Java performance and garbage collection.

Thanks to Cliff Click for his wonderful article on Java vs C performance that I found when I wanted to confirm some of my results and findings.

I was inspired to do my own comparisons by the TechEmpower benchmarks comparing different frameworks. It is amazing to see the results of the tests, like how C++ can serialize JSON 1,057,793 times per second and raw PHP only 180,147 (17%).

For the impatient

I present the results of the tests, and the conclusions, for those who don’t want to read the details. For those who want to examine the code, the versions of every compiler, and more in-depth conclusions, this information is provided below.

Results

This image shows the results of the tests with every language and compiler.

All the tests are invoked from the command line. All the tests use only one core. No tests for the web or frameworks have been made; those are other scenarios worth an article of their own.

More seconds means a worse result. The worst is Bash, which I removed from the graph, as its bar was crazily high compared to the others.

* As discussed later, my initial Assembler code was outperformed by the C binary because the Assembler code that the compiler generated was better than mine.

Once I knew why (explained in detail later in this article) I was able to reduce it to the same time as the C version, as I understood the improvements made by the compiler.

[Image: bar chart of the execution time of the test for every language and compiler]

Table of times:

Seconds executing | Language | Compiler used | Version
6 s. | Java | Oracle Java | Java JDK 8
6 s. | Java | Oracle Java | Java JDK 7
6 s. | Java | Open JDK | OpenJDK 7
6 s. | Java | Open JDK | OpenJDK 6
7 s. | Go | Go | Go v.1.3.1 linux/amd64
7 s. | Go | Go | Go v.1.3.3 linux/amd64
8 s. | Lua | LuaJit | Luajit 2.0.2
10 s. | C++ | g++ | g++ (Ubuntu 4.8.2-19ubuntu1) 4.8.2
10 s. | C | gcc | gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2
10 s. (* first version was 13 s. and then was optimized) | Assembler | nasm | NASM version 2.10.09 compiled on Dec 29 2013
10 s. | Nodejs | nodejs | Nodejs v0.12.4
14 s. | Nodejs | nodejs | Nodejs v0.10.25
18 s. | Go | Go | go version xgcc (Ubuntu 4.9-20140406-0ubuntu1) 4.9.0 20140405 (experimental) [trunk revision 209157] linux/amd64
20 s. | Phantomjs | Phantomjs | phantomjs 1.9.0
21 s. | Phantomjs | Phantomjs | phantomjs 2.0.1-development
38 s. | PHP | Facebook HHVM | HipHop VM 3.4.0-dev (rel)
44 s. | Python | Pypy | Pypy 2.2.1 (Python 2.7.3 (2.2.1+dfsg-1, Nov 28 2013, 05:13:10))
52 s. | PHP | Facebook HHVM | HipHop VM 3.9.0-dev (rel)
52 s. | PHP | Facebook HHVM | HipHop VM 3.7.3 (rel)
128 s. | PHP | PHP | PHP 7.0.0alpha2 (cli) (built: Jul 3 2015 15:30:23)
278 s. | Lua | Lua | Lua 5.2.3
294 s. | Gambas3 | Gambas3 | 3.7.0
316 s. | PHP | PHP | PHP 5.5.9-1ubuntu4.3 (cli) (built: Jul 7 2014 16:36:58)
317 s. | PHP | PHP | PHP 5.6.10 (cli) (built: Jul 3 2015 16:13:11)
323 s. | PHP | PHP | PHP 5.4.42 (cli) (built: Jul 3 2015 16:24:16)
436 s. | Perl | Perl | Perl 5.18.2
523 s. | Ruby | Ruby | ruby 1.9.3p484 (2013-11-22 revision 43786) [x86_64-linux]
694 s. | Python | Python | Python 2.7.6
807 s. | Python | Python | Python 3.4.0
47630 s. | Bash | | GNU bash, version 4.3.11(1)-release (x86_64-pc-linux-gnu)

 

Conclusions and Lessons Learnt

  1. There are languages that execute faster than a native Assembler program, thanks to the JIT compiler and its ability to optimize the program at runtime for the architecture of the computer running it (even if there is a small initial penalty of around two seconds from the JIT while the program is being analysed, it is more than worth it in our example)
  2. Modern Java can be really fast in certain operations. It is the fastest in this test, thanks to JIT compiler technology and a very good implementation of it
  3. Oracle’s Java and OpenJDK show no difference in performance in this test
  4. Scripting languages really suck at performance. Python, Perl and Ruby are terribly slow. That costs a lot of money when you scale, as you need more servers in the Cloud
  5. JIT compilers for Python (Pypy) and for Lua (LuaJit) make them really fly. The difference is truly amazing
  6. The same language can offer very different performance from one version to another; for example, the Go from the official page is faster than the one from the Ubuntu packages, and Python 3.4 is much slower than Python 2.7 in this test
  7. Bash is the worst language for the loop and increment operations in the test, taking more than 13 hours to complete it
  8. From the command line, PHP is much faster than Python, Perl and Ruby
  9. Facebook’s HipHop Virtual Machine (HHVM) greatly improves PHP’s speed
  10. It looks like the future of compilers is JIT.
  11. Assembler is not always the fastest when executed. If you write a generic Assembler program intended to run on many platforms, you won’t use the most powerful instructions specific to an architecture, and so a JIT compiler can outperform your code. A static compiler can also outperform your code with very clever optimizations. The people who write compilers are really good. Unless you’re really brilliant with Assembler, C/C++ code will probably beat the performance of your code. Even if you’re fantastic with Assembler, a JIT compiler may notice that some executions can be avoided (like code that is not really used) and apply magnificent runtime optimizations (for example, a near JMP is much less costly than a far JMP Assembler instruction; avoiding dead code could result in a far JMP being executed as a near JMP, saving many cycles per loop)
  12. Optimization really needs people dedicated just to optimizing and to checking the speed of newly added code on the target platforms
  13. Node.js was a big surprise. It performed really well. It is promising. The new version performs even faster
  14. Go is promising. Similar to C, but performance is much better thanks to deciding at runtime whether the architecture of the computer is 32 or 64 bit, a very quick compilation at launch time, and compiling to very good assembler (that uses the 64-bit instructions efficiently, for example)
  15. Gambas 3 performed surprisingly fast. Better than PHP 5
  16. You should be careful when using the C/C++ optimization flags -O3 (and -O2), as sometimes the result doesn’t work well (bugs) or as you may expect, for example by completely removing blocks of code (like loops) if the compiler believes they have no use
  17. Perl performance really changes depending on the style of for loop used. (See the Perl section below)
  18. Modern CPUs change their frequency to save energy. To run the tests it is strongly recommended to use a dedicated machine, disabling the CPU governor and setting a fixed frequency for all the cores, booting a text-only live system, without background services, without mounting disks, with no swap and no network
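To put some of these conclusions in numbers, here is a small sketch that takes a subset of the rows from the table of times and prints each one relative to the fastest result, plus the duration in hours. The dictionary labels are shortened by me; the seconds are the ones reported in the table.

```python
# Seconds taken from the table of times (a subset of the rows)
TIMES = {
    "Java (JDK 8)": 6,
    "C (gcc 4.8.2)": 10,
    "HHVM 3.4.0-dev": 38,
    "Pypy 2.2.1": 44,
    "PHP 5.5.9": 316,
    "Python 2.7.6": 694,
    "Bash 4.3.11": 47630,
}

fastest = min(TIMES.values())
for name, seconds in sorted(TIMES.items(), key=lambda item: item[1]):
    ratio = seconds / fastest
    hours = seconds / 3600
    print(f"{name:16} {seconds:>6} s  {ratio:8.1f}x  {hours:6.2f} h")
```

With these figures, Pypy is roughly 16 times faster than Python 2.7.6, HHVM 3.4.0-dev roughly 8 times faster than PHP 5.5.9, and the Bash run corresponds to a little over 13 hours.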

(Please read the article completely before commenting)

Explanations in detail

Obviously a statically compiled language binary should be faster than an interpreted language.

C or C++ are much faster than PHP. And good machine code is much faster, of course.

But there are also other languages that are not compiled to a binary and still execute really fast.

For example, good Web Java Application Servers generate compiled code after the first request. Then they really fly.

For the web, C#, or .NET in general, does the same: the IIS Application Server creates a native DLL after the first call to the script. After this, as it is compiled, the page is really fast.

With statically compiled C you can generate binary code for a particular processor, but then it won’t work on other processors. So normally we either write code that works on all processors, at the cost of not using the full performance of each CPU, or we provide a set of different binaries for the different architectures. A set of directives doing one thing or another depending on the platform detected is also possible, but it is a hard, long and tedious job with a lot of special cases to handle. There is another approach, dynamic compilation, where certain things are decided at run time and optimized for the computer running the program by the JIT (Just-in-time) compiler.

Java, with its JIT, is able to optimize for the CPU that is running the code, with awesome results. It is able to optimize loops and mathematical operations and outperform C/C++ and Assembler code in some cases (like in our tests), or come really close in others. It sounds crazy, but nowadays the JIT is able to learn from blocks of code that are executed several times and optimize them with several strategies, speeding things up incredibly and outperforming code written in Assembler. Demonstrations with code are provided later.

A new generation has grown up knowing only how to program for the Web. Many of them never saw Assembler, and never or barely programmed in C++.

None of my senior friends would assert that one technology is better than another without investigating thoroughly first. We are serious. There is so much to take into account, and always so much to learn, that one has to be sure nothing is missing before affirming such things categorically. If you want to be taken seriously, you have to take many things into account.

Environment for the tests

Hardware and OS

Intel(R) Core(TM) i7-4770S CPU @ 3.10GHz with 32 GB RAM and SSD Disk.

Ubuntu Desktop 14.04 LTS 64 bit

Software base and compilers

PHP versions

Shipped with my Ubuntu distribution:

php -v
PHP 5.5.9-1ubuntu4.3 (cli) (built: Jul  7 2014 16:36:58)
Copyright (c) 1997-2014 The PHP Group
Zend Engine v2.5.0, Copyright (c) 1998-2014 Zend Technologies
with Zend OPcache v7.0.3, Copyright (c) 1999-2014, by Zend Technologies

Compiled from sources:

PHP 5.6.10 (cli) (built: Jul  3 2015 16:13:11)
Copyright (c) 1997-2015 The PHP Group
Zend Engine v2.6.0, Copyright (c) 1998-2015 Zend Technologies
PHP 5.4.42 (cli) (built: Jul  3 2015 16:24:16)
Copyright (c) 1997-2014 The PHP Group
Zend Engine v2.4.0, Copyright (c) 1998-2014 Zend Technologies

 

Java 8 version

java -showversion
java version "1.8.0_05"
Java(TM) SE Runtime Environment (build 1.8.0_05-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.5-b02, mixed mode)

C++ version

g++ -v
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.8/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 4.8.2-19ubuntu1' --with-bugurl=file:///usr/share/doc/gcc-4.8/README.Bugs --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.8 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.8 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --disable-libmudflap --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-4.8-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-4.8-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-4.8-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1)

Gambas 3

gbr3 --version
3.7.0

Go (downloaded from google)

go version
go version go1.3.1 linux/amd64

Go (Ubuntu packages)

go version
go version xgcc (Ubuntu 4.9-20140406-0ubuntu1) 4.9.0 20140405 (experimental) [trunk revision 209157] linux/amd64

Nasm

nasm -v
NASM version 2.10.09 compiled on Dec 29 2013

Lua

lua -v
Lua 5.2.3  Copyright (C) 1994-2013 Lua.org, PUC-Rio

Luajit

luajit -v
LuaJIT 2.0.2 -- Copyright (C) 2005-2013 Mike Pall. http://luajit.org/

Nodejs

Installed with apt-get install nodejs:

nodejs --version
v0.10.25

Installed by compiling the sources:

node --version
v0.12.4

Phantomjs

Installed with apt-get install phantomjs:

phantomjs --version
1.9.0

Compiled from sources:

/path/phantomjs --version
2.0.1-development

Python 2.7

python --version
Python 2.7.6

Python 3

python3 --version
Python 3.4.0

Perl

perl -version
This is perl 5, version 18, subversion 2 (v5.18.2) built for x86_64-linux-gnu-thread-multi
(with 41 registered patches, see perl -V for more detail)

Bash

bash --version
GNU bash, version 4.3.11(1)-release (x86_64-pc-linux-gnu)
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Test: Time required for nested loops

This is the first sample. It is an easy one.

The main idea is to generate a set of nested loops, with a simple counter inside.

When the counter reaches 51 it is set to 0.

This is done for:

  1. Preventing overflow of the integer if growing without control
  2. Preventing the compiler from optimizing the code away (clever compilers like Java’s or gcc with the -O3 optimization flag, if they see that the variable is never used, will conclude that the whole block is unnecessary and simply never execute it)
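A short sketch of the arithmetic behind the test, assuming the loop bounds used in the listings of this article (10, 32000 and 32000). The total number of iterations is far above what a signed 32-bit integer can hold (2,147,483,647), which is why the counter has to be reset. Because the counter goes 0, 1, …, 50 and then back to 0, its final value is the total number of iterations modulo 51. The function names are mine; the small simulation only cross-checks the closed form with reduced bounds.

```python
def simulate(outer, middle, inner):
    """Run the nested loops literally (only practical for small bounds)."""
    counter = 0
    for _ in range(outer):
        for _ in range(middle):
            for _ in range(inner):
                counter += 1
                if counter > 50:
                    counter = 0
    return counter


def closed_form(outer, middle, inner):
    """The counter has a period of 51 increments: 1..50, then back to 0."""
    return (outer * middle * inner) % 51


total_iterations = 10 * 32000 * 32000
print(total_iterations)                   # 10240000000
print(total_iterations > 2**31 - 1)       # True
print(closed_form(10, 32000, 32000))      # 37

# Cross-check the closed form against the literal loops with reduced bounds
assert simulate(3, 40, 50) == closed_form(3, 40, 50)
```

Every implementation of the test should therefore print 37 as the final counter, which is a quick way to check that a port does the same work.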

Doing only loops, the increment of a variable and an if gives us basic structures of the language that are easily translated to Assembler. We also want to avoid system calls.

This is the base for the metrics on my Cloud Analysis of Performance cmips.net project.

Here I present the times for each language; later I analyze the details and the code.

Take into account that this code executes in only one thread / core.
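As an illustration of how a wall-clock time for a command-line test can be taken, here is a minimal sketch. It is not the procedure used for the results in this article; the helper name `time_command` is hypothetical, and the example times a trivial child Python process as a stand-in for a real test binary.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv once and return (exit_code, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(argv, stdout=subprocess.DEVNULL)
    elapsed = time.perf_counter() - start
    return completed.returncode, elapsed


# Stand-in command: a child Python interpreter that does nothing
exit_code, seconds = time_command([sys.executable, "-c", "pass"])
print(f"exit code {exit_code}, {seconds:.3f} s")
```

Measuring from outside the process includes the start-up cost of the interpreter or virtual machine, which matters when comparing JIT-based runtimes with native binaries.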

C++

The C++ version takes 10 seconds.

Code for the C++:

/*
* File:   main.cpp
* Author: Carles Mateo
*
* Created on August 27, 2014, 1:53 PM
*/

#include <cstdlib>
#include <iostream>
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <ctime>

using namespace std;

typedef unsigned long long timestamp_t;

static timestamp_t get_timestamp()
{
    struct timeval now;
    gettimeofday (&now, NULL);
    return  now.tv_usec + (timestamp_t)now.tv_sec * 1000000;
}

int main(int argc, char** argv) {

    timestamp_t t0 = get_timestamp();

    // current date/time based on current system
    time_t now = time(0);

    // convert now to string form
    char* dt_now = ctime(&now);

    printf("Starting at %s\n", dt_now);

    int i_loop1 = 0;
    int i_loop2 = 0;
    int i_loop3 = 0;
    // Counter incremented in the innermost loop
    int i_counter = 0;

    for (i_loop1 = 0; i_loop1 < 10; i_loop1++) {
        for (i_loop2 = 0; i_loop2 < 32000; i_loop2++) {
            for (i_loop3 = 0; i_loop3 < 32000; i_loop3++) {
                i_counter++;

                if (i_counter > 50) {
                    i_counter = 0;
                }
            }
            // If you want to test how the compiler optimizes that, remove the comment
            //i_counter = 0;
         }
     }

    // This is another trick to avoid compiler's optimization. To use the var somewhere
    printf("Counter: %i\n", i_counter);

    timestamp_t t1 = get_timestamp();
    double secs = (t1 - t0) / 1000000.0L;
    time_t now_end = time(0);

    // convert now to string form
    char* dt_now_end = ctime(&now_end);

    printf("End time: %s\n", dt_now_end);

    return 0;
}

[Image: the C++ nested loops test in NetBeans, 10 seconds]

You can try to remove the part of code that makes the checks:

                /* if (i_counter > 50) {
                    i_counter = 0;
                }*/

And the use of the var, later:

    //printf("Counter: %i\n", i_counter);

Note: If you also add an i_counter = 0; at the beginning of the loop to make sure that the counter doesn’t overflow, the C or C++ compiler will notice that the result is never used and will eliminate the code from the program, resulting in an execution time of 0.0 seconds.

Java

The code in Java:

package cpu;

/**
 *
 * @author carles.mateo
 */
public class Cpu {

    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) {
        
        int i_loop1 = 0;
        //int i_loop_main = 0;
        int i_loop2 = 0;
        int i_loop3 = 0;
        int i_counter = 0;
        
        String s_version = System.getProperty("java.version");
        
        System.out.println("Java Version: " + s_version);

        System.out.println("Starting cpu.java...");
        
        for (i_loop1 = 0; i_loop1 < 10; i_loop1++) {            
                for (i_loop2 = 0; i_loop2 < 32000; i_loop2++) {
                    for (i_loop3 = 0; i_loop3 < 32000; i_loop3++) {
                        i_counter++;
                        
                        if (i_counter > 50) { 
                            i_counter = 0;
                        }
                    }
                }
        }
        
        System.out.println(i_counter);
        System.out.println("End");
    }
    
}

It is really interesting how Java, with its JIT, outperforms C++ and Assembler.

It takes only 6 seconds.

[Image: NetBeans IDE running the Java test with OpenJDK 1.6 in 6 seconds]

Go

The case of Go is interesting because I saw a big difference between the Go shipped with Ubuntu and the Go I downloaded from http://golang.org/dl/. I downloaded 1.3.1 and 1.3.3, which offered the same performance: 7 seconds.

[Image: Go 1.3.3 linux/amd64 performance]

Source code for nested_loops.go:

package main

import ("fmt"
        "time")

func main() {
   fmt.Printf("Starting: %s", time.Now().Local())
   var i_counter = 0;
   for i_loop1 := 0; i_loop1 < 10; i_loop1++ {
       for i_loop2 := 0; i_loop2 < 32000; i_loop2++ {
           for i_loop3 := 0; i_loop3 < 32000; i_loop3++ {
               i_counter++;
               if i_counter > 50 {
                   i_counter = 0;
               }
           }
       }
    }

   fmt.Printf("\nCounter: %#v", i_counter)
   fmt.Printf("\nEnd: %s\n", time.Now().Local())
}

Assembler

Here is the Assembler code for Linux, with SASM, that I created initially (an optimized version is below).

%include "io.inc"

section .text
global CMAIN
CMAIN:
    ;mov rbp, rsp; for correct debugging
    ; Set to 0, the faster way
    xor     esi, esi

DO_LOOP1:
    mov ecx, 10
LOOP1:
    mov ebx, ecx
    jmp DO_LOOP2
LOOP1_CONTINUE:
    mov ecx, ebx
    
    loop LOOP1
    jmp QUIT

DO_LOOP2:
    mov ecx, 32000
LOOP2:
    mov eax, ecx
    ;call DO_LOOP3
    jmp DO_LOOP3
LOOP2_CONTINUE:
    mov ecx, eax
        
    loop LOOP2
    jmp LOOP1_CONTINUE

DO_LOOP3:
    ; Set to 32000 loops    
    MOV ecx, 32000 
LOOP3:
    inc     esi
    cmp     esi, 50
    jg      COUNTER_TO_0
LOOP3_CONTINUE:

    loop LOOP3
    ;ret
    jmp LOOP2_CONTINUE
    
COUNTER_TO_0:
    ; Set to 0
    xor     esi, esi
    
    jmp LOOP3_CONTINUE
    
;    jmp QUIT

QUIT:
    xor eax, eax
    ret

It took 13 seconds to complete.

One interesting explanation of why binary C or C++ code is faster than my Assembler is that the C compiler generates better Assembler/binary code in the end. For example, the use of JMP is expensive in terms of CPU cycles, and the compiler can apply other optimizations and tricks that I’m not aware of, like using faster registers, while in my code I use ebx, ecx, esi, etc. (for example, imagine that using cx is cheaper than using ecx or rcx and I’m not aware of it, but the guys that created the GNU C compiler are).

blog-carlesmateo-com-sasm-assembler-linux-64-bits-code-12-13-seconds

To be sure of what’s going on, in LOOP3 I replaced the JG and the JMP of the code with groups of 50 INC ESI instructions, one after the other, and the time was reduced to 1 second.

(In C the time was also reduced, even a bit more, when doing the same.)

To know what’s the translation of the C code into Assembler when compiled, you can do:

objdump --disassemble nested_loops

Look for the section main and you’ll get something like:

0000000000400470 <main>:
400470:    bf 0a 00 00 00           mov    $0xa,%edi
400475:    31 c9                    xor    %ecx,%ecx
400477:    be 00 7d 00 00           mov    $0x7d00,%esi
40047c:    0f 1f 40 00              nopl   0x0(%rax)
400480:    b8 00 7d 00 00           mov    $0x7d00,%eax
400485:    0f 1f 00                 nopl   (%rax)
400488:    83 c2 01                 add    $0x1,%edx
40048b:    83 fa 33                 cmp    $0x33,%edx
40048e:    0f 4d d1                 cmovge %ecx,%edx
400491:    83 e8 01                 sub    $0x1,%eax
400494:    75 f2                    jne    400488 <main+0x18>
400496:    83 ee 01                 sub    $0x1,%esi
400499:    75 e5                    jne    400480 <main+0x10>
40049b:    83 ef 01                 sub    $0x1,%edi
40049e:    75 d7                    jne    400477 <main+0x7>
4004a0:    48 83 ec 08              sub    $0x8,%rsp
4004a4:    be 34 06 40 00           mov    $0x400634,%esi
4004a9:    bf 01 00 00 00           mov    $0x1,%edi
4004ae:    31 c0                    xor    %eax,%eax
4004b0:    e8 ab ff ff ff           callq  400460 <__printf_chk@plt>
4004b5:    31 c0                    xor    %eax,%eax
4004b7:    48 83 c4 08              add    $0x8,%rsp
4004bb:    c3                       retq

Note: this is in AT&T syntax, not Intel syntax. That means that add $0x1,%edx adds 1 to the EDX register (source, destination).

As you can see, the C compiler has created a very different Assembler version from the one I wrote.
For example, at 400470 it uses the EDI register to store 10, to control the outer loop.
It uses ESI to store 32000 (hexadecimal 0x7D00) for the second loop.
And EAX for the inner loop, at 400480.
It uses EDX for the counter, and compares it to 51 (hexadecimal 0x33) at 40048B.
At 40048E it uses CMOVGE (Move if Greater or Equal), an instruction that was introduced with the P6 family processors, to move the contents of ECX to EDX if EDX was (in the CMP) greater than or equal to 51, that is, greater than 50. As a XOR ECX, ECX was performed at 400475, ECX contains 0.
And it cleverly uses SUB and JNE (JNE means Jump if Not Equal; it jumps if ZF = 0 and is equivalent to JNZ, Jump if Not Zero).
A jump takes between 4 and 16 clocks, and a short jump must land within -128 to +127 bytes of the next instruction. As you can see, a jump is very costly.
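
To make the walkthrough above easier to follow, here is a minimal Python sketch that mimics the compiled inner loop register by register and checks it against the counter logic used in the benchmark sources. The function names are mine, and I am assuming that the counter starts at 0, as in the source code (the listing does not show where EDX is initialized). It says nothing about speed, it only shows that both forms compute the same value.

```python
def counter_with_branch(iterations):
    """Counter logic as written in the benchmark sources (if counter > 50)."""
    counter = 0
    for _ in range(iterations):
        counter += 1
        if counter > 50:
            counter = 0
    return counter


def counter_with_cmov(iterations):
    """Same logic, shaped like the compiled inner loop (add, cmp, cmovge)."""
    ecx = 0              # xor %ecx,%ecx : a zero kept ready in a register
    edx = 0              # the counter (assumed to start at 0)
    eax = iterations     # loop register, counts down to zero
    while eax != 0:      # jne : repeat while the SUB result is not zero
        edx += 1                            # add $0x1,%edx
        edx = ecx if edx >= 0x33 else edx   # cmp $0x33,%edx ; cmovge %ecx,%edx
        eax -= 1                            # sub $0x1,%eax
    return edx


# Both versions agree; the conditional move replaces the jump to COUNTER_TO_0.
print(counter_with_branch(1000), counter_with_cmov(1000))
```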

It looks like the biggest improvement comes from the use of CMOVGE, which saves the two jumps that my original Assembler code was performing.
Those two jumps, multiplied by 32000 x 32000 x 10 iterations, add up to a lot of CPU clocks.

So, with this in mind, as this Assembler code takes 10 seconds, I updated the graph from 13 seconds to 10 seconds.
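
To put numbers on that, here is a small Python sketch of the arithmetic. It takes the figure of two jumps per iteration at face value, so treat the result as an upper bound: in my Assembler code the JG is only taken, and the JMP back only executed, when the counter is reset. The last line relies on the counter going back to 0 every 51 increments, which gives a closed-form value that every implementation should print at the end.

```python
OUTER, MIDDLE, INNER = 10, 32000, 32000

# Total number of times the innermost body runs.
total_iterations = OUTER * MIDDLE * INNER

# Upper bound on the jumps avoided, assuming two jumps per iteration.
jumps_upper_bound = 2 * total_iterations

# The counter cycles 1, 2, ..., 50, 0, so after n increments it holds n mod 51.
expected_counter = total_iterations % 51

print(total_iterations)    # 10240000000
print(jumps_upper_bound)   # 20480000000
print(expected_counter)    # 37
```

Note that the total does not fit in a signed 32 bit integer, which is why the counter is reset instead of being left to grow.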

Lua

This is the initial code:

local i_counter = 0

local i_time_start = os.clock()

for i_loop1=0,9 do
    for i_loop2=0,31999 do
        for i_loop3=0,31999 do
            i_counter = i_counter + 1
            if i_counter > 50 then
                i_counter = 0
            end
        end
    end
end

local i_time_end = os.clock()
print(string.format("Counter: %i\n", i_counter))
print(string.format("Total seconds: %.2f\n", i_time_end - i_time_start))

In the case of Lua, theoretically one could take great advantage of the use of local inside a loop, so I tried the benchmark with these modifications to the loop:

for i_loop1=0,9 do
    for i_loop2=0,31999 do
        local l_i_counter = i_counter
        for i_loop3=0,31999 do
             l_i_counter = l_i_counter + 1
             if l_i_counter > 50 then
                 l_i_counter = 0
             end
        end
        i_counter = l_i_counter
    end
end

I ran it with LuaJIT and saw no improvement in performance.

Node.js

var s_date_time = new Date();
console.log('Starting: ' + s_date_time);

var i_counter = 0;

for (var $i_loop1 = 0; $i_loop1 < 10; $i_loop1++) {
   for (var $i_loop2 = 0; $i_loop2 < 32000; $i_loop2++) {
       for (var $i_loop3 = 0; $i_loop3 < 32000; $i_loop3++) {
           i_counter++;
           if (i_counter > 50) {
               i_counter = 0;
           }
       }
   } 
}

var s_date_time_end = new Date();

console.log('Counter: ' + i_counter + '\n');

console.log('End: ' + s_date_time_end + '\n');

Execute with:

nodejs nested_loops.js

Phantomjs

The same code as for nodejs, adding at the end:

phantom.exit(0);

Phantomjs performs the same in both versions, 1.9.0 and 2.0.1-development (compiled from sources).

PHP

The interesting thing about PHP is that you can write your own extensions in C, so you can have the ease of use of PHP and create functions in C that are really fast, and invoke them from PHP.

<?php

$s_date_time = date('Y-m-d H:i:s');

echo 'Starting: '.$s_date_time."\n";

$i_counter = 0;

for ($i_loop1 = 0; $i_loop1 < 10; $i_loop1++) {
   for ($i_loop2 = 0; $i_loop2 < 32000; $i_loop2++) {
       for ($i_loop3 = 0; $i_loop3 < 32000; $i_loop3++) {
           $i_counter++;
           if ($i_counter > 50) {
               $i_counter = 0;
           }
       }
   } 
}

$s_date_time_end = date('Y-m-d H:i:s');

echo 'End: '.$s_date_time_end."\n";

Facebook’s HipHop Virtual Machine (HHVM) is a very powerful alternative that is JIT powered.

Downloading the code and compiling it is easy:

git clone https://github.com/facebook/hhvm.git
cd hhvm
rm -r third-party
git submodule update --init --recursive
./configure
make

Or just grab precompiled packages from https://github.com/facebook/hhvm/wiki/Prebuilt%20Packages%20for%20HHVM

Python

from datetime import datetime
import time

print ("Starting at: " + str(datetime.now()))
f_unixtime_start = time.time()

i_counter = 0

# Outer loop: 10 iterations; each inner loop goes from 0 to 31999
for i_loop1 in range(0, 10):
    for i_loop2 in range(0,32000):
         for i_loop3 in range(0,32000):
             i_counter += 1
             if ( i_counter > 50 ) :
                 i_counter = 0

print ("Ending at: " + str(datetime.now()))
f_unixtime_end = time.time()

# time.time() returns a float: subtract first, then truncate to whole seconds
i_seconds = int(f_unixtime_end - f_unixtime_start)
s_seconds = str(i_seconds)

print ("Total seconds:" + s_seconds)

Ruby

#!/usr/bin/ruby -w

time1 = Time.new

puts "Starting : " + time1.inspect

i_counter = 0;

for i_loop1 in 0..9
    for i_loop2 in 0..31999
        for i_loop3 in 0..31999
            i_counter = i_counter + 1
            if i_counter > 50
                i_counter = 0
            end
        end
    end
end

time1 = Time.new

puts "End : " + time1.inspect

Perl

The case of Perl was a very interesting one.

This is the current code:

#!/usr/bin/env perl

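# $s_datetime was never set in the original listing; fill it before printing
$s_datetime=localtime();
print "$s_datetime Starting calculations...\n";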
$i_counter=0;

$i_unixtime_start=time();

for my $i_loop1 (0 .. 9) {
    for my $i_loop2 (0 .. 31999) {
        for my $i_loop3 (0 .. 31999) {
            $i_counter++;
            if ($i_counter > 50) {
                $i_counter = 0;
            }
        }
    }
}

$i_unixtime_end=time();

$i_seconds=$i_unixtime_end-$i_unixtime_start;

print "Counter: $i_counter\n";
print "Total seconds: $i_seconds";

But before that I had created a slightly different one, with C-style for loops:

#!/usr/bin/env perl

$i_counter=0;

$i_unixtime_start=time();

for (my $i_loop1=0; $i_loop1 < 10; $i_loop1++) {
    for (my $i_loop2=0; $i_loop2 < 32000; $i_loop2++) {
        for (my $i_loop3=0; $i_loop3 < 32000; $i_loop3++) {
            $i_counter++;
            if ($i_counter > 50) {
                $i_counter = 0;
            }
        }
    }
}

$i_unixtime_end=time();

$i_seconds=$i_unixtime_end-$i_unixtime_start;

print "Total seconds: $i_seconds";

I repeated this test, with the same version of Perl, following the comment of a reader (thanks mpapec) who said:

In this particular case perl style loops are about 45% faster than original code (v5.20)

And indeed, surprisingly, the time went from 796 seconds to 436 seconds.

So the graphics are updated to reflect the result of 436 seconds.

Bash

#!/bin/bash
echo "Bash version ${BASH_VERSION}..."
date

let "s_time_start=$(date +%s)"
let "i_counter=0"

for i_loop1 in {0..9}
do
     echo "."
     date
     for i_loop2 in {0..31999}
     do
         for i_loop3 in {0..31999}
         do
             ((i_counter++))
             # -gt compares numerically; > inside [[ ]] compares strings
             if [[ $i_counter -gt 50 ]]
             then
                 let "i_counter=0"
             fi
         done
#((var+=1))
#((var=var+1))
#((var++))
#let "var=var+1"
#let "var+=1"
#let "var++"
     done
done

let "s_time_end=$(date +%s)"

let "s_seconds = s_time_end - s_time_start"
echo "Total seconds: $s_seconds"

# Just in case it overflows
date

Gambas 3

Gambas is a language and an IDE to create GUI applications for Linux.
It is very similar to Visual Basic, but better, and it is not a clone.

I created a command line application and it performed better than PHP. An excellent job has been done with the compiler.

blog-carlesmateo-com-gbr3-gambas-performance

Note: in the screenshot the first test ran a few seconds longer than the second. This was because I deliberately put the machine under some load and I/O during the tests. The valid value for the test, confirmed with more iterations, is the second one, obtained under the same conditions (no load) as the previous tests.

' Gambas module file MMain.module

Public Sub Main()

    ' @author Carles Mateo http://blog.carlesmateo.com
    
    Dim i_loop1 As Integer
    Dim i_loop2 As Integer
    Dim i_loop3 As Integer
    Dim i_counter As Integer
    Dim s_version As String
    
    i_loop1 = 0
    i_loop2 = 0
    i_loop3 = 0
    i_counter = 0
    
    s_version = System.Version
    
    Print "Performance Test by Carles Mateo blog.carlesmateo.com"    
    Print "Gambas Version: " & s_version

    Print "Starting..." & Now()
    
    For i_loop1 = 0 To 9
        For i_loop2 = 0 To 31999
            For i_loop3 = 0 To 31999
                i_counter = i_counter + 1
                
                If (i_counter > 50) Then
                    i_counter = 0
                Endif
            Next
        Next
    Next
    
    Print i_counter
    Print "End " & Now()

End

Changelog

2015-08-26 15:45

Thanks to the comment of a reader (thanks Daniel) pointing out a mistake. The phrase in question was in the conclusions, point 14, and was inaccurate. The original phrase said “go is promising. Similar to C, but performance is much better thanks to the use of JIT“. The allusion to JIT is incorrect and has been replaced by this: “thanks to deciding at runtime if the architecture of the computer is 32 or 64 bit, a very quick compilation at launch time, and it compiling to very good assembler (that uses the 64 bit instructions efficiently, for example)”

2015-07-17 17:46

Benchmarked Facebook HHVM 3.9 (dev., the release date is August 3 2015) and HHVM 3.7.3; they take 52 seconds.

Re-benchmarked Facebook HHVM 3.4: before it took 72 seconds, now it takes 38 seconds. I checked the screen captures from 2014 to rule out a human error. It looks like a turbo frequency issue on the test computer, with the CPU governor making it work below the optimal speed, or a CPU-hungry/IO process that was triggered during the tests without me detecting it. I’m thinking about forcing a fixed CPU speed for all the cores for the tests, like 2.4 GHz, and booting a text-only live system without disk access and network to prevent Ubuntu from launching processes in the background.

2015-07-05 13:16

Added performance of Phantomjs 1.9.0 installed via apt-get install phantomjs in Ubuntu, and Phantomjs 2.0.1-development.

Added performance of nodejs 0.12.04 (compiled).

Added bash to the graphic. Its performance is so bad that I had to edit the graphic to make it fit (color pink) in order to prevent breaking the scale.

2015-07-03 18:32

Added benchmarks for PHP 7 alpha 2, PHP 5.6.10 and PHP 5.4.42.

2015-07-03 15:13
Thanks to the contribution of a reader (thanks mpapec!) I tried Perl-style for loops, going from 796 seconds to 436 seconds.
(I used the same Perl version: Perl 5.18.2)
Updated test value for Perl.
Added new graphics showing the updated value.

Thanks to the contribution of a reader (thanks junk0xc0de!) added some additional warnings and explanations about the dangers of using -O3 (and -O2) in C/C++.

Updated the Lua code to print i_counter and do the if i_counter > 50
This makes it take a bit longer, a few tenths of a second, going from 7.8 to 8.2 seconds.
Updated graphics.

Stopping and investigating a WordPress xmlrpc.php attack

One of my Servers was heavily attacked for several days. I describe here the steps I took to stop it.

The attack consisted of several connections per second to the Server, to the path /xmlrpc.php.

This is the WordPress file that handles, among other things, the pingback: the notification sent when someone links to you.

My Server is a small Amazon instance, an m1.small with only one core and 1.6 GB RAM, magnetic disks, and a modest score of 203 CMIPS (my slow laptop scores 460 CMIPS).

Those massive connections caused the server to use more and more RAM, and as the xmlrpc requests were taking many seconds to reply, more and more Apache processes were spawned. That led to more memory consumption, until all the available RAM was used and the Server started using swap, with a heavy performance impact, until all the memory was exhausted and the mysql processes stopped.

I noticed that I was under attack after MySql shut down. I checked the CloudWatch statistics from Amazon AWS and it was clear that I was receiving an abnormal number of requests. The I/O was really high too.

These statistics cover from three days ago to today; look at the spikes when the attack was hitting hard and how relaxed the Server is now (flat line).

blog-carlesmateo-com-statistics-use-last-3-days

First I decided to simply rename the xmlrpc.php file as a quick solution to stop the attack, but the number of http connections kept growing, and then I saw very suspicious queries to the database.

blog-carlesmateo-suspicious-queries-2014-08-30-00-11-59

Those queries, in addition to what I had seen in Apache’s error log, suggested to me that maybe the Server had been hacked through a WordPress/plugin bug and that now they were trying to hide from the database’s logs (especially the DELETE FROM wp_useronline WHERE user_ip = the Ip of the attacker).

[Tue Aug 26 11:47:08 2014] [error] [client 94.102.49.179] Error in WordPress Database Lost connection to MySQL server during query a la consulta SELECT option_value FROM wp_options WHERE option_name = 'uninstall_plugins' LIMIT 1 feta per include('wp-load.php'), require_once('wp-config.php'), require_once('wp-settings.php'), include_once('/plugins/captcha/captcha.php'), register_uninstall_hook, get_option
[Tue Aug 26 11:47:09 2014] [error] [client 94.102.49.179] Error in WordPress Database Lost connection to MySQL server during query a la consulta SELECT option_value FROM wp_options WHERE option_name = 'uninstall_plugins' LIMIT 1 feta per include('wp-load.php'), require_once('wp-config.php'), require_once('wp-settings.php'), include_once('/plugins/captcha/captcha.php'), register_uninstall_hook, get_option
[Tue Aug 26 11:47:10 2014] [error] [client 94.102.49.179] Error in WordPress Database Lost connection to MySQL server during query a la consulta SELECT option_value FROM wp_options WHERE option_name = 'widget_wppp' LIMIT 1 feta per include('wp-load.php'), require_once('wp-config.php'), require_once('wp-settings.php'), do_action('plugins_loaded'), call_user_func_array, wppp_check_upgrade, get_option

The error log was very ugly.

The access log was not reassuring, as it showed many requests like these:

94.102.49.179 - - [26/Aug/2014:10:34:58 +0000] "POST /xmlrpc.php HTTP/1.0" 200 598 "-" "Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0)"
94.102.49.179 - - [26/Aug/2014:10:34:59 +0000] "POST /xmlrpc.php HTTP/1.0" 200 598 "-" "Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0)"
127.0.0.1 - - [26/Aug/2014:10:35:09 +0000] "OPTIONS * HTTP/1.0" 200 126 "-" "Apache/2.2.22 (Ubuntu) (internal dummy connection)"
94.102.49.179 - - [26/Aug/2014:10:34:59 +0000] "POST /xmlrpc.php HTTP/1.0" 200 598 "-" "Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0)"
94.102.49.179 - - [26/Aug/2014:10:34:59 +0000] "POST /xmlrpc.php HTTP/1.0" 200 598 "-" "Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0)"
94.102.49.179 - - [26/Aug/2014:10:35:00 +0000] "POST /xmlrpc.php HTTP/1.0" 200 598 "-" "Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0)"
94.102.49.179 - - [26/Aug/2014:10:34:59 +0000] "POST /xmlrpc.php HTTP/1.0" 200 598 "-" "Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0)"
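
With thousands of lines like these, a quick way to see who is hitting what is to count requests per Ip, method and path. This is a minimal Python sketch, assuming the Apache combined log format shown above; the regular expression and the function name are mine, and the sample lines are copied from the log.

```python
import re
from collections import Counter

# Matches the start of an Apache combined-format line (an assumption based on
# the entries shown above): ip, identity, user, [time], "request", status, size.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)


def count_requests(lines):
    """Count requests per (ip, method, path); lines that do not match are skipped."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match:
            key = (match.group("ip"), match.group("method"), match.group("path"))
            counts[key] += 1
    return counts


SAMPLE = [
    '94.102.49.179 - - [26/Aug/2014:10:34:58 +0000] "POST /xmlrpc.php HTTP/1.0" 200 598 "-" "Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0)"',
    '94.102.49.179 - - [26/Aug/2014:10:34:59 +0000] "POST /xmlrpc.php HTTP/1.0" 200 598 "-" "Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0)"',
    '127.0.0.1 - - [26/Aug/2014:10:35:09 +0000] "OPTIONS * HTTP/1.0" 200 126 "-" "Apache/2.2.22 (Ubuntu) (internal dummy connection)"',
]

counts = count_requests(SAMPLE)
for (ip, method, path), total in counts.most_common():
    print(ip, method, path, total)
```

To use it on a real log, pass an open file instead of SAMPLE, as a file object yields one line at a time.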

It was difficult to determine if the Server was receiving SQL injections, so I wanted to be sure.

Note: The connection from 127.0.0.1 with OPTIONS is created by Apache when it spawns another Apache process.

As I had super fresh backups in another Server I was not afraid of the attack dropping the database.

I was also a bit suspicious because the /readme.html file mentioned that the version of WordPress was 3.6. In other installations it correctly says that the version is 3.9.2, and this file is updated by the auto-update. I was thinking about a possible very sophisticated trojan attack able to modify wp-includes/version.php and set a fake $wp_version = ‘3.9.2’;
Later I realized that this blog had WordPress in Catalan, my native language, and discovered that the guys that do the translations forgot to update this file (in new installations it comes not updated, and so shows 3.6). I have alerted them.

In fact, later I did a diff of all the files of my WordPress installation against the official WordPress 3.9.2-ca, and then a diff between WordPress 3.9.2-ca and WordPress 3.9.2 (English, the default), and found no differences. My Server was Ok. But at this point, at the beginning of the investigation, I didn’t know that yet.

With the info I had (queries, times, attack, readme saying v. 3.6…) I weighed the possibility of being in front of something, and I decided that I had a unique opportunity to discover how they inject that Sql, or to discover if my Server was compromised and how. The bad point is that it was the same Amazon Server where this blog resides, and I wanted the attack to continue so I could get more information, so for two days I was recording logs and doing some investigations. Sorry if you visited my blog and the database was down, or the Server was extremely slow. I needed that info. It was worth it.

First I changed the Apache config so that the massive connections impacted the Server a bit less and I could work on it while the attack was going on.

I informed my group of Senior friends about what was going on, and two SysAdmins gave me some good suggestions on other logs to watch and on how to stop the attack; later a Developer joined me to look at the logs and pointed out possible solutions. Basically all of them suggested blocking the incoming connections with iptables and doing things like reinstalling WordPress, disabling xmlrpc.php in .htaccess, changing passwords or moving wp-admin/ to another place, but the point is that I wanted to understand exactly what was going on and how.

I checked the logs, certificates, etc… and no one other than me was accessing the Server. I also double-checked Amazon’s Firewall to be sure that no unnecessary ports were left open. Everything was Ok.

I took a look at the Apache logs for the site and all the attacks were coming from the same Ip:

94.102.49.179

It is an Ip from a dedicated Servers company called ecatel.net. I reported the abuse to the abuse address indicated in the ripe.net database for the range.

I found that many people have complaints about this provider, and reports of them ignoring requests to stop the spam sent from their servers, so I decided that after my tests I would block their entire network from accessing my sites.

All the entries in the access.log for that Ip were requests to /xmlrpc.php. It was the only path requested by the attacker, so apparently that Ip did nothing more.

I added some logging to the WordPress xmlrpc.php file:

if ($_SERVER['REMOTE_ADDR'] == '94.102.49.179') {
    error_log('XML POST: '.serialize($_POST));
    error_log('XML GET: '.serialize($_GET));
    error_log('XML REQUEST: '.serialize($_REQUEST));
    error_log('XML SERVER: '.serialize($_SERVER));
    error_log('XML FILES: '.serialize($_FILES));
    error_log('XML ENV: '.serialize($_ENV));
    error_log('XML RAW: '.$HTTP_RAW_POST_DATA);
    error_log('XML ALL_HEADERS: '.serialize(getallheaders()));
}

This was the result; it is always the same:

[Fri Aug 29 19:02:54 2014] [error] [client 94.102.49.179] XML POST: a:0:{}
[Fri Aug 29 19:02:54 2014] [error] [client 94.102.49.179] XML GET: a:0:{}
[Fri Aug 29 19:02:54 2014] [error] [client 94.102.49.179] XML REQUEST: a:0:{}
[Fri Aug 29 19:02:54 2014] [error] [client 94.102.49.179] XML SERVER: a:24:{s:9:"HTTP_HOST";s:24:"barcelona.afterstart.com";s:12:"CONTENT_TYPE";s:8:"text/xml";s:14:"CONTENT_LENGTH";s:3:"287";s:15:"HTTP_USER_AGENT";s:50:"Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0)";s:15:"HTTP_CONNECTION";s:5:"close";s:4:"PATH";s:28:"/usr/local/bin:/usr/bin:/bin";s:16:"SERVER_SIGNATURE";s:85:"<address>Apache/2.2.22 (Ubuntu) Server at barcelona.afterstart.com Port 80</address>\n";s:15:"SERVER_SOFTWARE";s:22:"Apache/2.2.22 (Ubuntu)";s:11:"SERVER_NAME";s:24:"barcelona.afterstart.com";s:11:"SERVER_ADDR";s:14:"[this-is-removed]";s:11:"SERVER_PORT";s:2:"80";s:11:"REMOTE_ADDR";s:13:"94.102.49.179";s:13:"DOCUMENT_ROOT";s:29:"/var/www/barcelona.afterstart.com";s:12:"SERVER_ADMIN";s:19:"webmaster@localhost";s:15:"SCRIPT_FILENAME";s:40:"/var/www/barcelona.afterstart.com/xmlrpc.php";s:11:"REMOTE_PORT";s:5:"40225";s:17:"GATEWAY_INTERFACE";s:7:"CGI/1.1";s:15:"SERVER_PROTOCOL";s:8:"HTTP/1.0";s:14:"REQUEST_METHOD";s:4:"POST";s:12:"QUERY_STRING";s:0:"";s:11:"REQUEST_URI";s:11:"/xmlrpc.php";s:11:"SCRIPT_NAME";s:11:"/xmlrpc.php";s:8:"PHP_SELF";s:11:"/xmlrpc.php";s:12:"REQUEST_TIME";i:1409338974;}
[Fri Aug 29 19:02:54 2014] [error] [client 94.102.49.179] XML FILES: a:0:{}
[Fri Aug 29 19:02:54 2014] [error] [client 94.102.49.179] XML ENV: a:0:{}
[Fri Aug 29 19:02:54 2014] [error] [client 94.102.49.179] XML RAW: <?xmlversion="1.0"?><methodCall><methodName>pingback.ping</methodName><params><param><value><string>http://seretil.me/</string></value></param><param><value><string>http://barcelona.afterstart.com/2013/09/27/afterstart-barcelona-2013-09-26/</string></value></param></params></methodCall>
[Fri Aug 29 19:02:54 2014] [error] [client 94.102.49.179] XML ALL_HEADERS: a:5:{s:4:"Host";s:24:"barcelona.afterstart.com";s:12:"Content-type";s:8:"text/xml";s:14:"Content-length";s:3:"287";s:10:"User-agent";s:50:"Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0)";s:10:"Connection";s:5:"close";}

So nothing in $_POST, nothing in $_GET, nothing in $_REQUEST, nothing unusual in $_SERVER, no files submitted, but a text/xml body was posted (logged from $HTTP_RAW_POST_DATA):

<?xmlversion="1.0"?><methodCall><methodName>pingback.ping</methodName><params><param><value><string>http://seretil.me/</string></value></param><param><value><string>http://barcelona.afterstart.com/2013/09/27/afterstart-barcelona-2013-09-26/</string></value></param></params></methodCall>

Here it is in a more readable format:

blog-carlesmateo-com-xml-xmlrpc-request

So basically they were trying to register a link to seretil dot me.

I tried it, and this page, hosted on CloudFlare, is not working.
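
The same payload can also be decoded programmatically. This is a minimal Python sketch using the standard xmlrpc.client module; it assumes the declaration was sent as a normal `<?xml version="1.0"?>` (in the log line it appears without the space), and the variable names are mine.

```python
import xmlrpc.client

# Payload as logged, with the XML declaration written in its usual form
# (an assumption: the log shows it without the space).
RAW_POST = (
    '<?xml version="1.0"?><methodCall><methodName>pingback.ping</methodName>'
    '<params><param><value><string>http://seretil.me/</string></value></param>'
    '<param><value><string>http://barcelona.afterstart.com/2013/09/27/'
    'afterstart-barcelona-2013-09-26/</string></value></param></params>'
    '</methodCall>'
)

# loads() returns the parameters as a tuple, plus the method name.
params, method_name = xmlrpc.client.loads(RAW_POST)
source_url, target_url = params

print("Method:", method_name)
print("Link to register (source):", source_url)
print("Post being pinged (target):", target_url)
```

In a pingback.ping call the first parameter is the page that claims to link to you and the second is the post on your own site.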

accessing-seretil-withoud-id

The problem is that responding to this spam xmlrpc request took the Server around 16 seconds. And I was receiving several every second.

I granted access only to my Ip on port 80 in the Firewall, restarted Apache, restarted MySql and submitted the same malicious request to the Server, and it still took 16 seconds in all my tests:

cat http_post.txt | nc barcelona.afterstart.com 80
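
The contents of http_post.txt are not listed here, so the following Python sketch is only a guess at how such a file could be produced, based on the headers that appear in the logs above. The function name and the output file handling are assumptions; only use something like this against a Server you own.

```python
def build_pingback_request(host, source_url, target_url):
    """Build a raw HTTP/1.0 POST to /xmlrpc.php carrying a pingback.ping call."""
    body = (
        '<?xml version="1.0"?><methodCall><methodName>pingback.ping</methodName>'
        '<params><param><value><string>' + source_url + '</string></value></param>'
        '<param><value><string>' + target_url + '</string></value></param>'
        '</params></methodCall>'
    )
    body_bytes = body.encode("utf-8")
    headers = [
        "POST /xmlrpc.php HTTP/1.0",
        "Host: " + host,
        "Content-Type: text/xml",
        # Content-Length must match the body exactly or the server may wait or truncate.
        "Content-Length: " + str(len(body_bytes)),
        "Connection: close",
    ]
    return ("\r\n".join(headers) + "\r\n\r\n").encode("ascii") + body_bytes


request = build_pingback_request(
    "barcelona.afterstart.com",
    "http://seretil.me/",
    "http://barcelona.afterstart.com/2013/09/27/afterstart-barcelona-2013-09-26/",
)

# Write the request to a file that can then be piped to nc as shown above.
with open("http_post.txt", "wb") as handle:
    handle.write(request)
```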

blog-carlesmateo-com-response-from-the-server-to-xmlrpc-attack

I checked and confirmed that the logs from the attacker showed the same Content-Length and http code.

Others tried xmlrpc requests as well, but did it once or twice and left.

The problem was that this robot had been sending many requests per second for days, and was still doing so.

Maybe the idea was to knock down my Server, but I doubted it, as the address selected is the blog of a Social Event for Senior Internet Talents that I organize: afterstart.com. It is of no special interest; I do not see a political, hateful or other motivation to attack the blog of this project.

Ok, at this point it was clear that the Ip address belonged to a robot, probably running on an infected or hacked Server, that was trying to publish a Spam link to a site (that was down). I still had to clarify those strange queries in the logs.

I reviewed the WPUsersOnline plugin and saw that the strange (and inefficient) queries belonged to it.

blog-carlesmateo-com-grep-r-delete-from-wp-useronline-2014-08-30-21-11-21-cut

The thing was that after I renamed xmlrpc.php the spam robot was still posting to that file. According to the WordPress .htaccess file, any request for a file that is not found on the filesystem is redirected to index.php.

So what was happening is that all the massive requests sent to xmlrpc.php were being handled by index.php, which then showed a page-not-found error message, but the WPUsersOnline plugin was deleting those connections. And it was doing it many times, also overloading the Database.

I was also able to reproduce the behaviour myself, by firewalling the WebServer from all Ips other than mine and sending the same post many times per second.

I checked against a friend’s blog, but on his Server xmlrpc.php responds in 1.5 seconds. My friend’s Server is a Digital Ocean Virtual Server with 2 cores and SSD disks. My magnetic disks on Amazon only deliver around 40 MB/second. I have to check in detail why my friend’s Server responds so much faster.

I checked the integrity of my databases, just in case, and they were perfect. Nothing strange with the collations, and the only errors in /var/log/mysql/error.log were due to MySql crashing when the Server ran out of memory.

Rechecked on my Server: now it takes 12 seconds.

I disabled 80% of the plugins but the times were the same. The statistics show how things changed (see the spikes to the left, before I definitively patched the Server to block requests from that spam robot’s Ip).

I checked against another WordPress that I have on the same Server and it only takes 1.5 seconds to reply. So I decided to continue investigating why this WordPress took so long to reply.

blog-carlesmateo-com-statistics-use-last-24-hours

As I said before, I checked that the files of my WordPress installation were the same as the original distribution, and they were. Having ruled out modified files, the problem had to be in the database.

Even though MySql told me that all the tables were OK when I checked them, having seen that WPUsersOnline deletes all the records older than 5 minutes, I guessed that this could lead to fragmentation, so I decided to run OPTIMIZE TABLE on all the tables of the database of the failing WordPress. With InnoDB this basically recreates the tables and the indexes.

Then I tried the call via RPC and my Server replied in three seconds. Much better.

Looking with htop, when I call xmlrpc.php the CPU usage is between 50% and 100%.

I checked the logs and the robot was gone. It left, or the provider finally blocked the Server. I don’t know.

Everything became clear: it was nothing more than a series of coincidences. After deactivating the plugin the DELETE queries disappeared, even under heavy load on the Server.

It only remained to clarify why, when I send an xmlrpc call to this blog, it replies in 1.5 seconds, while a request to barcelona.afterstart.com takes 3 seconds.

I activated the query log in mysql. To do that, edit /etc/mysql/my.cnf and uncomment:

general_log_file        = /var/log/mysql/mysql.log
general_log             = 1

Then I checked the queries, and in the case of my blog it performs far fewer queries, as I was requesting a pingback to a url that did not exist, and WordPress does this query:

SELECT   wp_posts.* FROM wp_posts  WHERE 1=1  AND ( ( YEAR( post_date ) = 2013 AND MONTH( post_date ) = 9 AND DAYOFMONTH( post_date ) = 27 ) ) AND wp_posts.post_name = 'afterstart-barcelona-2013-09-26-meet' AND wp_posts.post_type = 'post'  ORDER BY wp_posts.post_date DESC

As the url afterstart-barcelona-2013-09-26-meet with the dates indicated does not exist in my other blog, the execution ends there and does not perform the rest of the queries, which in the case of the Afterstart blog were:

40 Query     SELECT post_id, meta_key, meta_value FROM wp_postmeta WHERE post_id IN (81) ORDER BY meta_id ASC
40 Query     SELECT ID, post_name, post_parent, post_type
FROM wp_posts
WHERE post_name IN ('http%3a','','seretil-me')
AND post_type IN ('page','attachment')
40 Query     SELECT   wp_posts.* FROM wp_posts  WHERE 1=1  AND (wp_posts.ID = '0') AND wp_posts.post_type = 'page'  ORDER BY wp_posts.post_date DESC
40 Query     SELECT * FROM wp_comments WHERE comment_post_ID = 81 AND comment_author_url = 'http://seretil.me/'

To confirm my theory I tried the request to my blog with a valid url, and it took 3-4 seconds, the same as Afterstart’s blog. Finally I double-checked with my friend’s blog and it was slower than before: I got between 1.5 and 6 seconds, with a lot of 2-second responses (he has PHP 5.5 and OpCache, which improves things a bit, but the problem is in the queries to the database).

Honestly, the guys creating WordPress should cache these queries instead of performing 20 live queries, which are always the same, before returning the error message. Using Cache Lite or Stash, or creating an in-memory table to use as a cache, or of course allowing the use of Memcached, would eradicate the DoS component of this kind of attack, as the xmlrpc pingback feature hits the database with a lot of queries only to end up not allowing the publishing.

While I was finishing those tests (remember that the attacker’s Ip was gone) another attacker from the same network tried, but I had patched the Server to ignore it:

94.102.52.157 - - [31/Aug/2014:02:06:16 +0000] "POST /xmlrpc.php HTTP/1.0" 200 189 "-" "Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0)"

This one was trying to get a link published to a domain called socksland dot net, a domain registered in Russia whose page is not working.

As I had all the information I wanted, I finally blocked the provider’s network from accessing my Server ever again.

Unfortunately Amazon’s Firewall does not allow blocking a certain Ip or range.
So you can block at Iptables level, in the .htaccess file, or in the code.
I do not recommend blocking at code level: sadly WordPress has many files accessible from outside, so you would have to add your code at the beginning of all of them, and when there is a WordPress version update you’ll lose all your customizations.
But I do recommend patching your code to block certain Ip’s if you use a CDN, because the POST will reach your Server from the CDN’s Ip’s, and you can’t block those. You have to look at the X-Forwarded-For header, which lists the Client’s Ip and the Ip’s of the proxies the request has passed through.

I designed a program that is able to patch any PHP project to check for blacklisted Ip’s (even through a proxy) with minimal performance impact. It works with WordPress, drupal, joomla, ezpublish and Frameworks like Zend, Symfony, Catalonia… and I patched my code to block those unwanted robots’ requests.

A solution that will probably work for you is to disable the pingback functionality; there are several plugins that do that. Disabling xmlrpc completely is not recommended, as WordPress uses it for several things (JetPack, mobile, validation…)
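
The following is not that program, only a minimal sketch of the idea, assuming PHP 7 or later. The function names client_ip_chain() and is_blacklisted() and the contents of the blacklist are hypothetical. Keep in mind that X-Forwarded-For is sent by the client side and can be forged, so it is only useful to block, never to grant access.

```php
<?php
// Hypothetical helper: returns every Ip involved in the request,
// i.e. the addresses listed in X-Forwarded-For plus the direct peer.
function client_ip_chain(array $st_server): array
{
    $st_ips = [];
    if (!empty($st_server['HTTP_X_FORWARDED_FOR'])) {
        // The header format is "client, proxy1, proxy2"
        foreach (explode(',', $st_server['HTTP_X_FORWARDED_FOR']) as $s_ip) {
            $s_ip = trim($s_ip);
            // Ignore anything that is not a valid Ip address
            if (filter_var($s_ip, FILTER_VALIDATE_IP) !== false) {
                $st_ips[] = $s_ip;
            }
        }
    }
    if (!empty($st_server['REMOTE_ADDR'])) {
        $st_ips[] = $st_server['REMOTE_ADDR'];
    }
    return $st_ips;
}

// True if any address in the chain is in the blacklist.
function is_blacklisted(array $st_server, array $st_blacklist): bool
{
    return count(array_intersect(client_ip_chain($st_server), $st_blacklist)) > 0;
}

// Example blacklist: the attacker seen in the log above
$st_blacklist = ['94.102.52.157'];

if (is_blacklisted($_SERVER, $st_blacklist)) {
    header('HTTP/1.1 403 Forbidden');
    exit();
}
```

This sketch only matches exact addresses; blocking a whole network would need a range comparison instead of array_intersect().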

The same effect as adding the plugin that disables the xmlrpc pingback can be achieved by editing the functions.php from your Theme and adding:

add_filter( 'xmlrpc_methods', 'remove_xmlrpc_pingback_ping' );
function remove_xmlrpc_pingback_ping( $methods ) {
    unset( $methods['pingback.ping'] );
    
    return $methods;
}

Update: 2016-02-24 14:40 CEST
I also got a heavy dictionary attack against wp-login.php. Despite having a Captcha plugin, which makes it hard to hack, it was generating some load on the system.
What I did was rename wp-login.php to another name, like wp-login-carles.php, and leave in wp-login.php a simple exit();

<?php
exit();

Improving performance in PHP

This year I was invited to speak at the PHP Conference at Berlin 2014.

It was really nice, but I had to decline as I was working hard in a Start up and didn’t have the time required to prepare the nice talk I wanted and that people deserve.

However, now that I have time, I decided to write an article about what I would have spoken about at the conference.

I will cover improving performance in a single server, and Scaling out multi-Server architecture, focusing on the needs of growing and Start up projects. Many of those techniques can be used to improve performance with other languages, not just with PHP.

Many of my friends are very good at Developing, but know nothing about Architecture and Scaling. I hope this brings the two worlds, Development and Operations, together into a DevOps bridge.

Improving performance on a single server

Hosting

Choose a good hosting. And if you can afford it choose a dedicated server.

Shared hostings are really bad. Some of them kill your http and mysql instances if you reach a certain (really low) CPU use, while others share the same hardware between 100+ users, serving your pages sloooooow. Others cap the number of queries that your MySql will handle per hour at such a ridiculously low amount that even Drupal or WordPress are unable to complete a request in development.

Other ISPs (Internet Service Providers) have poor Internet bandwidth, and so your web will load slowly for users.

Some companies invest hundreds of thousands in developing a web, and then spend 20 € a year on the hosting. Less than the cost of a dinner.

You can get a decent dedicated server from 50 to 99 €/month and you will celebrate this decision every day.

Take into account that virtualization wastes between 20% and 30% of the CPU power. And if there are several virtual machines the loss will be greater, because you lose the benefits of the CPU caching for optimizing parallel instructions execution and prediction. Also, if the hypervisor host allows allocating more RAM than physically available and at some point it swaps, the performance of all the VM’s will be much worse.

If you have a VM and it swaps, in most providers the swap goes over the network so there is an additional bottleneck and performance penalty.

To compare the performance of dedicated servers and instances from different Cloud Providers you can take a look at my project cmips.net

Improve your Server

If your Server has little RAM, add more. And if your project is running slow and you can afford a better Server, get it.

Using SSD disks will greatly improve the performance of I/O operations and of swap operations. (But please, do backups and keep them in another place.)

If you use a CMS like ezpublish with http_cache enabled, you will probably prefer a Server with faster cores rather than a Server with one or more CPU’s with plenty of slower cores that take longer to render the page to the http cache.

That may seem obvious, but companies often invest 320 hours in optimizing the code by 2%, at a cost of let’s say 50 €/h * 320 hours = 16,000 €, while hiring a better Server would have brought an improvement of between 20% and 1000% at an additional cost of only 50 €/month, or at a cost of 100 € for increasing the RAM.

The point here is that the hardware is cheap, while the time of the Engineers is expensive. And good Engineers are really hard to find.

And as a CEO or PO you probably prefer to use the talent to guarantee a good time to market for your project, or to add more features, rather than spending this time refactoring.

Even with the most optimal code in the universe, if your project is successful at a certain point you’ll have to scale, which means adding more Servers. Saving a Server now at the cost of slowing down the business makes no sense.

Upgrade your PHP version

Many projects still use PHP 5.3 and 5.4.

The latest versions of PHP bring more and more performance. If you use an old version of PHP you can have a Quick Win by just upgrading to the latest PHP version.

Use OpCache (or other cache accelerator)

OpCache ships with PHP 5.5 by default now, so it is the recommended option. It is meant to replace APC.

To activate OpCache edit php.ini and add:

Linux/Unix:

zend_extension=/path/to/opcache.so

Windows:

zend_extension=C:\path\to\php_opcache.dll

It will greatly improve your PHP performance.

Ensure that OpCache in Production has the optimal config for Production, which will be different from the Development Environment.

Note: If you plan to use it with XDebug in Development environments, load OpCache before XDebug.

Disable Profiling and xdebug in Production

In Production disable the profiling, xdebug, and if you use a Framework ensure the Development/Debug features are disabled in Production.

Ensure your logs are not full of warnings

Check that Production logs are not full of warnings.

I’ve seen systems where 200 warnings were written to the logs every second, the same ones all the time, and that obviously was slowing down the system.

Typical warnings like this can be easily fixed:

Message: date() [function.date]: It is not safe to rely on the system’s timezone settings. You are required to use the date.timezone setting or the date_default_timezone_set() function. In case you used any of those methods and you are still getting this warning, you most likely misspelled the timezone identifier. We selected ‘UTC’ for ‘8.0/no DST’ instead

Profile in Development

To detect where your slow code is, profile it in Development to see where most of the CPU time is spent.

Check the slow-queries if you use MySql.

Cache html to disk

Imagine you have a sort of craigslist and you are displaying all the categories, and the number of new messages, in the landing page. To do that you are performing many queries to the database, SELECT COUNTs, etc… every time a user visits your page. That will certainly overload your database with only a few concurrent visitors.

Instead of querying the Database all the time, do cache the generated page for a while.

This can be achieved by checking if the cache html file exists, and checking the TTL, and generating a new page if needed.

A simple sample would be:

<?php
    // Cache pages for 5 minutes
    $i_cache_TTL = 300;

    $b_generate_cache = false;

    
    $s_cache_file = '/tmp/index.cache.html';


    if (file_exists($s_cache_file)) {
        // Get the last modification time (integer Unix timestamp)
        $i_file_timestamp = filemtime($s_cache_file);
        $i_time_now = time();
        if ($i_time_now > ($i_file_timestamp + $i_cache_TTL)) {
            $b_generate_cache = true;
        } else {
            // Up to date, get from the disk
            $o_fh = fopen($s_cache_file, "rb");
            $s_html = stream_get_contents($o_fh);
            fclose($o_fh);
            
            // If the file was empty something went wrong (disk full?), so don't use it
            if (strlen($s_html) == 0) {
                $b_generate_cache = true;
            } else {
                // Print the page and exit
                echo $s_html;
                exit();
            }
        }
    } else {
        $b_generate_cache = true;
    }

    ob_start();

    // Render your page normally here
    // ....

    $s_html = ob_get_clean();

    if ($b_generate_cache == true) {
        // Create the file with fresh contents
        $o_fp = fopen($s_cache_file, 'w');
        if (fwrite($o_fp, $s_html) === false) {
            // Error. Impossible to write to disk
            // throw new Exception('CacheCantWrite');
        }
        fclose($o_fp);
    }

    // Send the page to the browser
    echo $s_html;

This sample is simple, and works for many cases, but presents problems.

Imagine for example that the page takes 5 seconds to be generated with a single request, and you have high traffic in that page, let’s say 500 requests per second.

What will happen when the cache expires is that the first user will trigger the cache generation, and so will the second, and the third… So all the requests arriving during those 5 seconds (500 requests per second * 5 seconds = 2,500) will be hitting the database to generate the cache. But if creating the page for one request takes 5 seconds, doing this 2,500 times at once will not take 5 seconds… Your system will enter a vicious circle where the first queries have not ended after minutes, and more and more queries are being added to the queue until:

a) Apache runs out of children/processes, per configuration

b) Mysql runs out of connections, per configuration

c) Linux runs out of memory, and processes crash or are killed

Not to mention the users or the API clients waiting forever for the http request to complete, and other processes reading a partial file (size bigger than 0 but incomplete).

Different strategies can be used to prevent that, like:

a) using semaphores to lock access to the cache generation (only one process at a time)

b) using a .lock file to indicate that the file is being generated, so the next requests are served from the existing cache until the generation process ends; the new content is written to a buffer file like acachefile.buffer (to prevent incomplete content being read) and, when it is complete, the buffer is renamed to the final name and the .lock is removed

c) using memcached, or similar, to keep an index in memory of what pages are being generated now, and why not, keeping the cached files there instead of a filesystem

d) using crons to generate the cache files, so they run hourly and you ensure only one process generates the cache files
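
Strategy b) can be sketched as follows. This is only a minimal sketch, assuming PHP 7 or later and a Linux filesystem; the function name refresh_cache() and the $render callable are hypothetical. Instead of checking whether the .lock file exists, it takes an flock() on it, a variation that has the advantage that the lock is released automatically if the process dies in the middle of the render.

```php
<?php
// Regenerates a cache file so that only one process renders at a time
// and readers never see a half-written file.
// $render is any callable returning the page HTML (hypothetical).
function refresh_cache(string $s_cache_file, callable $render): bool
{
    $o_lock = fopen($s_cache_file . '.lock', 'c');
    if ($o_lock === false) {
        return false;
    }

    // Non-blocking exclusive lock: if another process holds it,
    // that process is already regenerating, so we give up.
    if (!flock($o_lock, LOCK_EX | LOCK_NB)) {
        fclose($o_lock);
        return false;
    }

    $b_written = false;
    $s_html = $render();
    if (strlen($s_html) > 0) {
        // Write to a buffer file first, never to the final file
        $s_buffer_file = $s_cache_file . '.buffer';
        if (file_put_contents($s_buffer_file, $s_html) === strlen($s_html)) {
            // Within one filesystem rename() replaces the file atomically
            $b_written = rename($s_buffer_file, $s_cache_file);
        }
    }

    flock($o_lock, LOCK_UN);
    fclose($o_lock);

    return $b_written;
}
```

A request that gets false back simply keeps serving the old cache file, so during an expiry only one process hits the database.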

If you use crons, a cheap way to generate the .html content is to have the cron curl/wget your webpage. I don’t recommend this as it has some problems: if that web request fails for any reason, you’ll have cached an error instead of content.

I prefer preparing my projects to be able to render the content whether invoked from HTTP/S or from the command line. But if you use curl because it is cheap and easy and time to market is important for your project, then be sure that your backend code writes a Status OK in the HTML that the cron can check to ensure that the content has been properly generated. (Some crons only check the http status, like 200, but if your database or an xml gateway you use fails you will likely still get a 200 and won’t detect that you’re caching pages with “error I can’t connect to the database” instead.)
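
A minimal sketch of that check, assuming PHP 7 or later with the curl extension. The marker text, the function names and the URL and path in the usage comment are all hypothetical; the only point is that the page is cached when both the http status and the marker are right.

```php
<?php
// Hypothetical marker that the backend prints at the end of a
// correctly rendered page, inside an HTML comment.
const STATUS_MARKER = '<!-- STATUS OK -->';

// Returns true only if the fetched page is safe to cache.
function is_cacheable(int $i_http_status, string $s_html): bool
{
    return $i_http_status === 200
        && strpos($s_html, STATUS_MARKER) !== false;
}

// Fetches a page with curl and returns array(http status, body).
function fetch_page(string $s_url): array
{
    $o_ch = curl_init($s_url);
    curl_setopt($o_ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($o_ch, CURLOPT_TIMEOUT, 30);
    $m_body = curl_exec($o_ch);
    $i_status = (int) curl_getinfo($o_ch, CURLINFO_HTTP_CODE);

    return [$i_status, $m_body === false ? '' : $m_body];
}

// Usage from the cron (URL and path are examples):
// list($i_status, $s_html) = fetch_page('http://localhost/index.php');
// if (is_cacheable($i_status, $s_html)) {
//     file_put_contents('/tmp/index.cache.html', $s_html);
// }
```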

Many Frameworks have their own cache implementation that prevent corruption that could come by several processes writing to the same file at the same time, or from PHP dying in the middle of the render.

You can see a more complex MVC implementation, with Views, from my Framework Catalonia here:

[Image: code capture from the Catalonia Framework]

By serving .html files instead of executing PHP with logic and performing queries to the database you will be able to serve hundreds of thousands of requests per day with a single machine, and really fast, which is also important for SEO.

I’ve done this in several Start ups with wonderful results, and my Framework Catalonia also incorporates this functionality in a way that is very easy to use.

Note: This is only one of the techniques to save the load of the Database Servers. Many more come later.

Cache languages to disk

If you have a multi-language application, or if the Database is the place where Marketing edits the Strings (sections, pages, campaigns…), there is no need to query it all the time.

Simply provide a tool to “generate language files”.

Your language files can be Javascript files loaded by the page, or generated PHP files.

For example, the file common_footer_en.php could be generated by reading from the Database and look like this:

<?php
/* Autogenerated English translations file common_footer_en.php
   on 2014-08-10 02:22 from the database */
$st_translations['seconds']                = 'seconds';
$st_translations['Time']                   = 'Time';
$st_translations['Vars used']              = 'Vars used in these templates';
$st_translations['Total Var replacements'] = 'Total replaced';
$st_translations['Exec time']              = 'Execution time';
$st_translations['Cached controller']      = 'Cached controller';

So the PHP file is generated when someone at your organization updates the languages, and your code includes it normally, like any other PHP file.
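
A generator for such a file can be very small. This is a minimal sketch, assuming PHP 7 or later; the function name build_language_file() is hypothetical, and the array passed to it stands in for the rows you would read from the Database.

```php
<?php
// Builds the source code of a translations file from an array of
// key => text pairs (in a real project, fetched from the Database).
function build_language_file(array $st_texts, string $s_file_name): string
{
    $s_code  = "<?php\n";
    $s_code .= "/* Autogenerated translations file " . $s_file_name . "\n";
    $s_code .= "   on " . date('Y-m-d H:i') . " from the database */\n";
    foreach ($st_texts as $s_key => $s_text) {
        // var_export() escapes quotes and backslashes for us
        $s_code .= '$st_translations[' . var_export((string) $s_key, true) . '] = '
                 . var_export((string) $s_text, true) . ";\n";
    }

    return $s_code;
}

$s_code = build_language_file(
    ['seconds' => 'seconds', 'Exec time' => 'Execution time'],
    'common_footer_en.php'
);

// Example destination; a real tool would write into the project folder
file_put_contents(sys_get_temp_dir() . '/common_footer_en.php', $s_code);
```

Using var_export() instead of concatenating the texts between quotes means that a translation containing an apostrophe cannot break the generated file.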

Use the Crons

You can set cron jobs to do many operations, like map reduce, counting in the database, or actually deleting the data that the user selected to delete.

Imagine that you have a classifieds portal, and you want to display the number of ads in each category. You can have a table NUM_ANNOUNCES to store the number of ads, and update it hourly. Then your database will only do the counting once per hour, and your application will read the number from the table NUM_ANNOUNCES.

The Cron can also be used to expire old ads. That way you avoid making a user wait for that clean up process during an http request to PHP.

A cron file can be invoked by:

php -f cron.php

Or by:

./cron.php

if you give it execution permission with chmod +x and set the first line of cron.php to:

#!/usr/bin/env php

Or you can use a trick: emulate an http request from bash by invoking a url with curl or wget. For added security, set the .htaccess so the folder for the cron tasks can only be executed from localhost.

This last trick has the drawback that the call has the same problems as any http request: restarting Apache will kill the process, the connection can be closed by a timeout (e.g. if the process takes more seconds than the max. execution time), etc…

Use Ramdisk for PHP files

With Linux it is very easy to set up a RamDisk.

You can set up a RamDisk and rsync all your web .PHP files to it at system boot time and when deploying changes, and configure Apache to use the Ramdisk folder for the website.

That way for every request to the web, PHP files will be served from RAM directly, saving the slow disk access. Even with OpCache active, it is a great improvement.

In these times where one Gigabyte of memory is really cheap, there is a huge difference between reading files from disk and getting them from memory. (Reading and writing to RAM is many, many times faster than magnetic disks, and many times faster than SSD disks.)

Also .js, .css, images… can be served from a Ram disk folder, depending on how big your web is.

Ramdisk for /tmp

If your project does operations on disk, like resizing images, compressing files, reading/writing large CSV files, etcetera you can greatly improve the performance by setting the /tmp folder to a Ramdisk.

If your PHP project receives file uploads they will also benefit (a bit) from storing the temporary files in RAM instead of on the disk.

Use Cache Lite

Cache Lite is a PEAR package that allows you to keep data in a local cache on the Web Server.

You can cache .html pages, or you can cache Queries and their result.

Example from http://pear.php.net/manual/en/package.caching.cache-lite.cache-lite.save.php:

<?php
require_once "Cache/Lite.php";

$options = array(
    'cacheDir' => '/tmp/',
    'lifeTime' => 7200,
    'pearErrorMode' => CACHE_LITE_ERROR_DIE
);
$cache = new Cache_Lite($options);

if ($data = $cache->get('id_of_the_page')) {

    // Cache hit !
    // Content is in $data
    echo $data;

} else { 
    
    // No valid cache found (you have to make and save the page)
    $data = '<html><head><title>test</title></head><body><p>this is a test</p></body></html>';
    echo $data;
    $cache->save($data);
    
}

It is nice that Cache Lite handles the TTL and keeps the info stored in different sub-directories in order to keep a decent performance. (As you may know, many files in the same directory slow down access a lot.)

Use HHVM (HipHop Virtual Machine) from Facebook

Facebook Engineers are always trying to optimize what is run on the Servers.

Faster code means fewer machines. Even a 1% improvement in CPU use means a lot fewer Servers. Fewer Servers to maintain, less money wasted, less space in the Data Centers…

So they created HHVM, the HipHop Virtual Machine, which is able to run PHP code much, much faster than PHP, and is compatible with most of the Frameworks and Open Source projects.

They also created the Hack language that is an improved PHP, with type hinting.

So you can use HHVM to make your code run faster with the same Server and without investing a single penny.

Use C extensions

You can create and use your own C extensions.

C extensions will bring really fast execution. Just to get the idea:
I built a PHP extension to compare the performance of calculating the Bernoulli number with PHP and with the .so extension created in C.
On my Core i7 the times were:
PHP:
Computed in 13.872583150864 s
PHP calling the C compiled extension:
Computed in 0.038495063781738 s

That’s 360.37 times faster using the C extension. Not bad.

Use Zephir

Zephir is an Open Source language, very similar to PHP, that makes it easy to create and maintain extensions for PHP.

Use Phalcon

Phalcon is a Web MVC Framework implemented as a C extension, so it offers high performance.

The view syntax is very, very similar to Twig.

Tutorial – Creating a Poll application in 15 minutes with Phalcon from Phalcon Framework on Vimeo.

 

Check if you’re using the correct Engine for MySql

Many Developers create the tables and never worry about that, and many are using MyIsam by default. It was the default Engine prior to MySql 5.5.

While MyIsam can bring good performance in some certain cases, my recommendation is to use InnoDb.

Normally you’ll gain performance with MyIsam if you have a table where you only write or only read, but in all the other cases InnoDb is expected to be much more performant and safe.

MyIsam tables also get corrupted from time to time and need manual fixing, and writes to disk are not as reliable as with InnoDb.

As MyIsam uses table-locking for updates and deletes to any existing row, it is easy to see that if you’re in a web environment with multiple users, blocking the table -so the other operations have to wait- will make things slow.

If you have to use Joins you will clearly benefit from using InnoDb as well.

Use the MEMORY (in-memory) Engine from MySql

MySql has a very powerful in-memory Engine, called MEMORY.

The MEMORY Engine stores things in RAM and loses the data when MySql is restarted.

However it is very fast and very easy to use.

Imagine that you have a travel application that constantly looks up which country the city specified by the user belongs to. A Quick Win would be to INSERT all this data into a MEMORY table when MySql is started, and make just one change in your code: to use that table.

Really easy. Quick improvement.

Use curl asynchronously

If your PHP has to communicate with other systems using curl, you can do the http/s call, and instead of waiting for a response let your PHP do more things in the meantime, and then check the results.

You can also run multiple curl calls in parallel, and so avoid doing them one by one in serial.

Here you have a sample.
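
As a minimal sketch of the parallel case, assuming PHP 7 or later with the curl extension, the curl_multi functions can drive several requests at once. The function name fetch_all_parallel() is hypothetical, and error handling is reduced to the minimum: a failed request simply yields an empty or null body.

```php
<?php
// Fetches several URLs in parallel. Returns the bodies indexed by the
// same keys as $st_urls (null or '' when nothing was received).
function fetch_all_parallel(array $st_urls, int $i_timeout = 10): array
{
    if ($st_urls === []) {
        return [];
    }

    $o_multi = curl_multi_init();
    $st_handles = [];
    foreach ($st_urls as $m_key => $s_url) {
        $o_ch = curl_init($s_url);
        curl_setopt($o_ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($o_ch, CURLOPT_TIMEOUT, $i_timeout);
        curl_multi_add_handle($o_multi, $o_ch);
        $st_handles[$m_key] = $o_ch;
    }

    // Drive all the transfers. Inside this loop your PHP could also
    // do other work between two calls to curl_multi_exec().
    do {
        $i_status = curl_multi_exec($o_multi, $i_active);
        if ($i_active) {
            // Wait for network activity instead of busy-looping
            curl_multi_select($o_multi, 1.0);
        }
    } while ($i_active && $i_status === CURLM_OK);

    $st_results = [];
    foreach ($st_handles as $m_key => $o_ch) {
        $st_results[$m_key] = curl_multi_getcontent($o_ch);
        curl_multi_remove_handle($o_multi, $o_ch);
    }
    curl_multi_close($o_multi);

    return $st_results;
}
```

With this approach the total time is roughly that of the slowest request, instead of the sum of all of them.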

Serialize

Suppose that you have a query that returns 1000 results, and you add them one by one to an array.

You will probably have a substantial gain if you keep a single row in the database, with the array serialized.

So an array like:

$st_places = Array('Barcelona', 'Dublin', 'Edinburgh', 'San Francisco', 'London', 'Berlin', 'Andorra la Vella', 'Prats de Lluçanès');

Would be serialized to a string like:

a:8:{i:0;s:9:"Barcelona";i:1;s:6:"Dublin";i:2;s:9:"Edinburgh";i:3;s:13:"San Francisco";i:4;s:6:"London";i:5;s:6:"Berlin";i:6;s:16:"Andorra la Vella";i:7;s:19:"Prats de Lluçanès";}

This can be easily stored as String and unserialized later back to an array.
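
The round trip, as a minimal sketch (the storing and reading of the row are left out):

```php
<?php
$st_places = array('Barcelona', 'Dublin', 'Edinburgh', 'San Francisco',
                   'London', 'Berlin', 'Andorra la Vella', 'Prats de Lluçanès');

// One string that can be stored in a single row
$s_serialized = serialize($st_places);

// Later, after reading the row back from the database:
$st_restored = unserialize($s_serialized);

// The restored array is identical to the original one
var_dump($st_restored === $st_places);
```

Only unserialize() strings that your own application stored; it is not safe to use on data that comes from the user.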

Note: On the Internet we have a lot of encodings and languages (Hebrew, Japanese…). Be careful with encodings when serializing, using JSon or XML, storing in databases without UTF support, etc…

Use Memcached to store common things

Memcached is an in-memory NoSql database that can run in a cluster.

The idea is to keep things there in order to offload the database. And as everything is in RAM it runs really fast.

You can use Memcached to cache Queries and their results also.

For example:

You have the query SELECT * FROM translations WHERE section='MAIN'.

Then you check whether that String exists as a key in Memcached. If it exists, you fetch the results (which are serialized) and avoid the query. If it doesn’t exist, you run the query against the database as usual, serialize the array and store it in Memcached with a TTL (Time to Live), using the Query (String) as the key. For security you may prefer to hash the query with MD5 or SHA-1 and use the hash as the key instead of using the plain query.

When the TTL is reached the data expires, and so the next query reinserts the contents.

Be careful, I’ve seen projects that were caching private data from users without isolating the key properly, so users were getting the info of other users.

For example, if the key used was ‘Name’ and the value ‘Carles Mateo’, obviously the next user that fetches the key ‘Name’ would get my name and not theirs.

If you store private data of users in Memcached, it is a nice idea to prepend the owner of that info to the hash. E.g. using key: 10701577-FFADCEDBCCDFFFA10C

Where ‘10701577’ would be the user_id of the owner of the info, and ‘FFADCEDBCCDFFFA10C’ a hash of the query.
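
The whole flow can be sketched like this. It is a minimal sketch, assuming PHP 7.1 or later with the Memcached extension; the function names build_cache_key() and cached_query() and the $run_query callable are hypothetical, and the hash here is a SHA-1, so longer than the one in the example key above.

```php
<?php
// Builds a cache key from a query. Private data gets the user_id of
// its owner as a prefix, so users never share entries.
function build_cache_key(string $s_query, ?int $i_user_id = null): string
{
    $s_hash = sha1($s_query);

    return $i_user_id === null ? $s_hash : $i_user_id . '-' . $s_hash;
}

// Tries Memcached first; on a miss runs $run_query (a hypothetical
// callable that returns the rows as an array) and stores the result.
function cached_query(Memcached $o_cache, string $s_query, callable $run_query,
                      int $i_ttl = 300, ?int $i_user_id = null): array
{
    $s_key = build_cache_key($s_query, $i_user_id);

    $m_rows = $o_cache->get($s_key);
    if ($o_cache->getResultCode() === Memcached::RES_SUCCESS) {
        // Cache hit: the database is not queried
        return $m_rows;
    }

    $st_rows = $run_query($s_query);
    // The Memcached extension serializes arrays by itself
    $o_cache->set($s_key, $st_rows, $i_ttl);

    return $st_rows;
}
```

Checking getResultCode() instead of comparing the value with false lets you cache results that are legitimately empty.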

Earlier I suggested that you can keep a table with the count of ads in a classifieds portal. This number can be stored in Memcached instead.

You can also store common things, like translations, or cities like in the example before, or exchange rates for a currency exchange website…

The most common way to store things there is serialized or Json encoded.

Be aware of the memory limits of Memcached and monitor the cache hit ratio, to avoid constantly inserting data and losing it because it is rarely used and Memcached has little memory.

You can also use Redis.

Use jQuery for Production (small file) and minified files for js

Use the Production jQuery library in Production; I mean, do not use the bigger Development jQuery file in Production.

There are products that eliminate all the unnecessary spaces in .js and .css files, so they are served much faster. This process is called minifying.

It is important to know that in many emerging markets in the world, like Brazil, they have slow DSL lines. Many are 512 Kbit/second, and there are even modem connections!

Activate compression in the Server

If you send large text files, or JSON, you’ll benefit from activating the compression at the Server.

It consumes some CPU, but many times it brings an important improvement in speed serving the pages to the users.

Use a CDN

You can use a Content Delivery Network to offload your Servers from sending plain texts, html, images, videos, js, css…

You can delegate this to the CDN; they have very speedy Internet lines and Servers, so your Servers can concentrate on doing only BackEnd operations.

The most well known are Akamai and Amazon CloudFront.

Please pay attention to the documentation. A common mistake is to send Cache Headers to the CDN servers without realizing that they will use these headers to set the cache TTL and ignore their web configuration parameters. (For example s-maxage, like: Cache-Control: public, s-maxage=600)

This sample header:

HTTP/1.1 200 OK
Server: nginx
Date: Wed, 20 Aug 2014 10:50:21 GMT
Content-Type: text/html; charset=UTF-8
Connection: close
Vary: Accept-Encoding
Cache-Control: max-age=0, public, s-maxage=10800
Vary: X-User-Hash,Accept-Encoding
X-Location-Id: 2
X-Content-Digest: ezlocation/2/end5139244ced4b25606ef0a39235982b1662d01cc
Content-Length: 68250
Age: 3

You can take a look at the headers of any website by telneting to port 80 and doing the request manually, or more easily by using lynx:

lynx -mime_header http://blog.carlesmateo.com | less

Do you need a Framework?

If you’re processing only BackEnd requests, like in the video games industry, serving API’s, RESTful, etc… you probably don’t need a Framework.

Frameworks are generic and use many more resources than you really need for a fast reply.

Many times using a heavy Framework costs several times more than using plain PHP.

Save database connections until really needed

Many Frameworks create a connection to the Database Server by default. But certain parts of your application code do not need to connect to the database.

For example, validating the data from a form. If there are missing fields, the PHP will not operate with the Database, just return an error via JSon or refreshing the page, informing that the required field is missing.

If a user who is not logged in requests the dashboard page, there is no need to open a connection to the database (unless you want to write the access attempt to an error log in the database).

In fact, opening connections by default makes it easier for attackers to do DoS attacks.

With a Singleton pattern you can easily implement a Db class that handles this transparently for you.
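
A minimal sketch of such a class, assuming PHP 7.1 or later. The class name LazyDb, the DSN and the credentials are hypothetical. The class receives a factory callable and only calls it the first time a connection is really requested.

```php
<?php
// Singleton that opens the database connection only on first use.
final class LazyDb
{
    private static $o_instance = null;
    private $factory;
    private $o_connection = null;

    private function __construct(callable $factory)
    {
        $this->factory = $factory;
    }

    // Registers how to connect, without connecting yet.
    public static function init(callable $factory): void
    {
        self::$o_instance = new self($factory);
    }

    public static function get(): self
    {
        if (self::$o_instance === null) {
            throw new RuntimeException('LazyDb::init() must be called first');
        }

        return self::$o_instance;
    }

    public function isConnected(): bool
    {
        return $this->o_connection !== null;
    }

    // Connects the first time it is called, then reuses the connection.
    public function connection()
    {
        if ($this->o_connection === null) {
            $this->o_connection = ($this->factory)();
        }

        return $this->o_connection;
    }
}

// In the bootstrap of the application (example DSN and credentials):
LazyDb::init(function () {
    return new PDO('mysql:host=localhost;dbname=app', 'user', 'secret');
});

// Nothing has connected so far. Only the code paths that call
// LazyDb::get()->connection() will open the connection.
```

A request that fails the form validation never calls connection(), so it never costs a connection on the Database Server.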

Scaling out / Multi Server Environment

Memcached session

When you have several Web Servers you’ll need something more flexible than the default PHP handler (that stores to a file in the Web Server).

The most common is to store the Session, serialized, in a Memcached Cluster.

Use Cassandra

Apache Cassandra is a NoSql database that allows you to Scale out very easily.

The main advantage is that it scales linearly. If you have 4 nodes and add 4 more, your performance will be doubled. It has no single point of failure, is resilient to node failures, replicates the data among the nodes, splits the load over the nodes automatically and supports distributed datacenter architectures.

To know more about NoSql and Cassandra, read my article: Upgrade your scalability with NoSql. And to start developing with Cassandra in PHP, python or Java read my contributed article: Begin developing with Cassandra.

Use MySql primary and secondaries

An easy way to split the load is to have a MySql primary Server that handles the writes, and MySql secondary Servers handling the reads.

Every write sent to the primary is replicated to the secondaries. Then your application reads from the secondaries.

You have to tell your code to send the writes to the primary Server, and the reads to the secondaries. You can have a Load Balancer so your code always asks the Load Balancer for the reads and it makes the connection to the least used server.
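
When there is no Load Balancer, the choice can be made in the code. This is a minimal sketch, assuming PHP 7 or later; the host names and the function names are hypothetical, and the detection of read queries is deliberately naive.

```php
<?php
// Hypothetical host names for one primary and two secondaries
$st_db_hosts = array(
    'primary'     => 'mysql-primary.example.internal',
    'secondaries' => array('mysql-sec1.example.internal',
                           'mysql-sec2.example.internal'),
);

// Returns true if the statement only reads data.
function is_read_query(string $s_sql): bool
{
    return preg_match('/^\s*(SELECT|SHOW|DESCRIBE|EXPLAIN)\b/i', $s_sql) === 1;
}

// Reads go to a random secondary, everything else to the primary.
function host_for_query(string $s_sql, array $st_hosts): string
{
    if (is_read_query($s_sql) && count($st_hosts['secondaries']) > 0) {
        $i_index = array_rand($st_hosts['secondaries']);

        return $st_hosts['secondaries'][$i_index];
    }

    return $st_hosts['primary'];
}
```

As the replication is asynchronous by default, a read that must see a row that has just been written should still be sent to the primary.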

Do Database sharding

Sharding the data consists of splitting the data according to some criteria.

For example, imagine we have 8 MySql Servers, named mysql0 to mysql7. If we want to insert or read data for user 1714, the Server is chosen by dividing the user_id, 1714, by the number of Servers and taking the remainder (MOD).

So 1714 % 8 gives 2. This means that the MySql Server to use is mysql2.

For the user_id 16: 16 % 8 gives 0, so we would use mysql0. And so on.
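
In code the choice is a one-liner. A minimal sketch, assuming PHP 7 or later; the function name shard_for_user() is hypothetical:

```php
<?php
// Returns the name of the MySql Server that holds the data of a user.
function shard_for_user(int $i_user_id, int $i_num_servers = 8): string
{
    return 'mysql' . ($i_user_id % $i_num_servers);
}

echo shard_for_user(1714), "\n"; // mysql2
echo shard_for_user(16), "\n";   // mysql0
```

Note that changing the number of Servers changes the result for most users, so adding shards later means moving data.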

You can shard according to the email, or other fields as well. And you can have the same primary and secondaries setup for each shard as well.

When doing sharding in MySql you cannot do joins to data in other Servers. (But you can replicate all the data from the several shards in one big server in house, in your offices, and so query it and join if you need that for marketing purposes.)

I always use my own sharding, but there is a very nice product from CodeFutures called dbshards. It handles the traffic transparently. I used it at a video games Start up with very satisfying results.

Use Cassandra async queries

Cassandra supports asynchronous queries. That means you can send the query to the Server and, instead of waiting, do other jobs, and check for the result later, when it is finished.

Consider using Hadoop + HBASE

A Cluster alternative to Cassandra.

Use a Load Balancer

You can put a Load Balancer or a Reverse Proxy in front of your Web Servers. The Load Balancer knows the state of the Web Servers, so it will remove a Web Server from the Array if it stops responding and everything will continue being served to the users transparently.

There are many ways to do Load Balancing: Round Robin, based on the load on the Web Servers, on the number of connections to each Web Server, by cookie…

Using a Cookie based Load Balancer is a very easy way to split the load for WordPress and Drupal Servers.

Imagine you have 10 Web Servers. In the .htaccess of each one you set a rule to set a Cookie like:

SERVER_ID=WEB01

That was in the case of the first Web Server.

Second Server would have in the .htaccess to set a Cookie like:

SERVER_ID=WEB02

Etcetera.

When a user connects to the Load Balancer for the first time, it sends the user to one of the 10 Web Servers. Then the Web Server sends its cookie to the browser of the Client. E.g. WEB07

After that, the Load Balancer will send the next requests from that client to the Server that set the Cookie, so in this example WEB07.
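
The routing decision can be sketched like this. It is a simplified model, assuming the Load Balancer sees the request cookies as a dictionary; the route function is hypothetical and only the SERVER_ID cookie name comes from the example above.

```python
import random

# WEB01 .. WEB10, the ten Web Servers of the example
SERVERS = ["WEB%02d" % i_server for i_server in range(1, 11)]


def route(cookies, rng=random):
    """Return the Web Server that should handle a request."""
    server_id = cookies.get("SERVER_ID")
    if server_id in SERVERS:
        # Returning client: keep it on the Server that set the Cookie
        return server_id
    # First visit, or unknown value in the Cookie: pick any Server
    return rng.choice(SERVERS)


print(route({"SERVER_ID": "WEB07"}))
```

A Cookie with an unknown value is treated as a first visit, so a Server that was removed from the Array does not leave its users stuck.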

The nice thing about this way of splitting the traffic is that you don’t have to change your code, nor handle the Sessions differently.

If you use two Load Balancers you can have a heartbeat process in them and a Virtual Ip, so in case your main Load Balancer becomes unresponsive the Virtual Ip will be mapped to the second Load Balancer in milliseconds. That provides HA.

Use http accelerators

Nginx, varnish, squid… to serve static content and offload the PHP Web Servers.

Auto-Scale in the Cloud

If you use the Cloud you can easily set Auto-Scaling for different parts of your core.

A quick win is to Scale the Web Servers.

As in the Cloud you pay per hour of computer use, you will benefit from cost reduction if you stop the servers when you don’t need them, and add more Servers when more users are coming to your sites.

Video game companies are a good example of peak hours with heavy use and valleys with few users, although as users come from all over the planet the effect is more and more diluted.

Some cool tools for Auto-Scaling are: ECManaged, RightScale, Amazon CloudWatch.

Use Google Cloud

Actually, the ability of the Google Cloud to Scale is great and without precedent.

Unlike other Clouds, which are based on instances, Google Cloud offers a platform that will spawn your code across as many servers as needed, transparently to you. It’s a black box.

Schedule operations with RabbitMQ

Or other Queue Manager.

The idea is to send the jobs to the Queue Manager so that the PHP can continue working; the jobs will be performed asynchronously and will notify when they end.
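
The pattern can be sketched with the Python standard library. Here an in-process queue and a worker thread stand in for RabbitMQ and its consumers, so the function and job names are only illustrative.

```python
import queue
import threading


def worker(jobs, results):
    """Consume jobs until the None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:
            jobs.task_done()
            break
        # The slow work would happen here, away from the web request
        results.append("done: " + job)
        jobs.task_done()


def run_jobs(job_names):
    jobs = queue.Queue()
    results = []
    thread = threading.Thread(target=worker, args=(jobs, results))
    thread.start()
    for s_name in job_names:
        jobs.put(s_name)  # Returns immediately, the producer keeps working
    jobs.put(None)        # Tell the worker there is nothing else to do
    jobs.join()
    thread.join()
    return results


print(run_jobs(["send_email", "resize_image"]))
```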

RabbitMQ is also cool because it can work in a cluster and with HA.

Use GlusterFs for NAS

GlusterFs (and other products) allow you to have a Distributed File System that splits the load and the data across the Servers, and resists node failures.

If you need a shared folder for the users’ uploads, for example for the profile pictures, a nice option is to keep the PHP and general files locally in the Servers and the shared folder in GlusterFs.

Avoid NFS for PHP files and config files

As said before, try to keep the PHP files in a RAM disk or in the local disk (Linux caches files well, and so does OpCache), and try not to write code that reads files from disk to determine the config setup.

I remember a Start up incubator that had a very nice Server, but the PHP files were read from a mounted NFS folder.

That meant that on every request, the Server had to go over the network to fetch the files.

Sadly for the project’s performance, the PHP was reading a file called ENVIRONMENT that contained “PROD” or “DEVEL”. And this was done on every single request.

Even worse, I discovered that the switch connecting the Web Server and the NFS Server was a cheap 10 Mbit one. So all the traffic was going at 10 Mbit/s. Nice bottleneck.
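
The read on every request described above can be avoided by reading the file once and keeping the value in memory. A minimal sketch, assuming a long-running process; the get_environment name is made up, and in PHP the equivalent would be a constant in an OpCache-cached config file.

```python
import functools


@functools.lru_cache(maxsize=1)
def get_environment(s_path="ENVIRONMENT"):
    """Read the environment name from disk once, then serve it from memory."""
    with open(s_path, "r", encoding="utf-8") as o_handle:
        return o_handle.read().strip()
```

The first call touches the disk (or the NFS mount); every later call with the same path returns the cached value.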

Improve your network architecture

You can use 10 GbE (10 Gigabit Ethernet) to connect the Servers. The Web Servers to the Databases, Memcached Cluster, Load Balancers, Storage, etc…

You will need 10 GbE cards and 10 GbE switches supporting bonding.

Use bonding to aggregate 10 + 10 and so get 20 Gigabit.

You can also use Fibre Channel, for example 10 Gb, and aggregate the links, like 10 + 10 so 20 Gbit for the connection between the Servers and the Storage.

The performance improvements that your infrastructure will experience are amazing.

Begin developing with Cassandra

This article is contributed to Luna Cloud blog by Carles Mateo.

We architects, developers and start ups are facing new challenges.

We now have to create applications that scale, and scale at a world-wide level.

That puts big and exciting challenges on the table.

To allow that increasing level of scaling, we designed tools, techniques and tricks, but fortunately there are now great products born to scale out and to deal with these problems: NoSql databases like Cassandra, MongoDb, Riak, Hadoop’s Hbase, Couchbase or CouchDb; in-memory NoSql like Memcached or Redis; big data solutions like Hadoop; distributed file systems like Hadoop’s HDFS, GlusterFs, Lustre, etc…

In this article I will cover the first steps to develop with Cassandra, from the Developer’s point of view.

At first glance you may be interested in Cassandra because:

  • It is a Database with no single point of failure
  • All the Database Servers work in Peer to Peer over Tcp/Ip
  • It is fault-tolerant. You can set a replication factor, and the data will be sharded and replicated over different servers, making it resilient to node failures
  • The Cassandra Cluster splits and balances the work across the Cluster automatically
  • You can scale by just adding more nodes to the Cluster; that’s scaling horizontally, and it’s linear. If you double the number of servers, you double the performance
  • You can have cool configurations like multi-datacenter and multi-rack and have the replication done automatically
  • You can have several small, cheap, commodity servers with big SATA disks, with better results than one very big, very expensive, unable-to-scale-more server with expensive SSD or SAS disks
  • It has the CQL language (Cassandra Query Language), which is close to SQL
  • It can send queries in async mode (the CPU can do other things while waiting for the query to return the results)
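
To picture how data can be sharded and replicated across nodes, here is a toy token ring. It is a deliberately simplified sketch, not Cassandra’s actual partitioner or replication strategy; the Ring class, the node names and the MD5-based token are all assumptions made for the illustration.

```python
import bisect
import hashlib


def token(s_value):
    """Map any string to a big integer position on the ring."""
    return int(hashlib.md5(s_value.encode("utf-8")).hexdigest(), 16)


class Ring:
    def __init__(self, nodes, replication_factor):
        self.replication_factor = min(replication_factor, len(nodes))
        # Each node owns one position on the ring, sorted by token
        self.entries = sorted((token(s_node), s_node) for s_node in nodes)
        self.tokens = [i_token for i_token, _ in self.entries]

    def replicas(self, s_key):
        """Return the nodes that store the key: the owner plus the next ones."""
        i_start = bisect.bisect_right(self.tokens, token(s_key)) % len(self.entries)
        return [self.entries[(i_start + i) % len(self.entries)][1]
                for i in range(self.replication_factor)]


o_ring = Ring(["node1", "node2", "node3", "node4"], replication_factor=3)
print(o_ring.replicas("user:1714"))
```

With a replication factor of 3, every key lives on three different nodes, so losing one node does not lose the data.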

Cassandra is based on the key/value philosophy, but with columns. It supports multiple columns. That’s cool, as theoretically it supports 2 GB per column (in practice it is not recommended to store values that big, especially in multi-user environments).

I will not lie to you: it is another paradigm, and it comes with a lot of knowledge to acquire, but it is necessary and a price worth paying to be able to scale at the levels required nowadays.

Cassandra only offers native drivers for: Java, .NET, C++ and Python 2.7. The rest of the solutions are contributed; sadly, most of them are outdated and unmaintained.

You can find all the drivers here:

http://planetcassandra.org/client-drivers-tools/

To develop with PHP

Cassandra has no official PHP driver, but there are some contributed solutions.

I created several solutions myself: CQLSÍ uses cqlsh to perform queries and interfaces without needing Thrift; Cassandra Universal Driver is a Web Gateway that I wrote in Python that allows you to query Cassandra from any language; and recently I contributed to a PHP driver that speaks the Cassandra binary protocol (v1) directly using Tcp/Ip sockets.

That’s the best solution for me for now, as it is the fastest and it needs neither a third party library nor Thrift.

You can git clone it from:

https://github.com/uri2x/php-cassandra

Here we go with some samples:

Create a keyspace

KeySpace is the equivalent to a database in MySQL.

<?php

require_once 'Cassandra/Cassandra.php';

$o_cassandra = new Cassandra();

$s_server_host     = '127.0.0.1';    // Localhost
$i_server_port     = 9042; 
$s_server_username = '';  // We don't use username
$s_server_password = '';  // We don't use password
$s_server_keyspace = '';  // We don't have created it yet

$o_cassandra->connect($s_server_host, $s_server_username, $s_server_password, $s_server_keyspace, $i_server_port);

// Create a Keyspace with Replication factor 1, that's for a single server
$s_cql = "CREATE KEYSPACE cassandra_tests WITH REPLICATION = { 'class': 'SimpleStrategy', 'replication_factor': 1 };";

$st_results = $o_cassandra->query($s_cql);

We can run it from web or from command line by using:

php -f cassandra_create.php

Create a table

<?php

require_once 'Cassandra/Cassandra.php';

$o_cassandra = new Cassandra();

$s_server_host     = '127.0.0.1';    // Localhost
$i_server_port     = 9042; 
$s_server_username = '';  // We don't use username
$s_server_password = '';  // We don't use password
$s_server_keyspace = 'cassandra_tests';

$o_cassandra->connect($s_server_host, $s_server_username, $s_server_password, $s_server_keyspace, $i_server_port);

$s_cql = "CREATE TABLE carles_test_table (s_thekey text, s_column1 text, s_column2 text,PRIMARY KEY (s_thekey));";

$st_results = $o_cassandra->query($s_cql);

If we don’t plan to insert UTF-8 strings, we can use the ASCII type instead of TEXT (in CQL, VARCHAR is just an alias of TEXT, so both hold UTF-8 strings).

Do an insert

In this sample we create an Array of 100 elements, we serialize it, and then we store it.

<?php

require_once 'Cassandra/Cassandra.php';

// Note this code uses the MT notation http://blog.carlesmateo.com/maria-teresa-notation-for-php/
$i_start_time = microtime(true);

$o_cassandra = new Cassandra();

$s_server_host     = '127.0.0.1';    // Localhost
$i_server_port     = 9042; 
$s_server_username = '';  // We don't have username
$s_server_password = '';  // We don't have password
$s_server_keyspace = 'cassandra_tests';  

$o_cassandra->connect($s_server_host, $s_server_username, $s_server_password, $s_server_keyspace, $i_server_port);

$s_time = strval(time()).strval(rand(0,9999));
$s_date_time = date('Y-m-d H:i:s');

// An array to hold the emails
$st_data_emails = Array();

for ($i_bucle=0; $i_bucle<100; $i_bucle++) {
    // Add a new email
    $st_data_emails[] = Array('datetime'  => $s_date_time,
                              'id_email'  => $s_time);

}

// Serialize the Array
$s_data_emails = serialize($st_data_emails);

$s_cql = "INSERT INTO carles_test_table (s_thekey, s_column1, s_column2)
VALUES ('first_sample', '$s_data_emails', 'Some other data');";

$st_results = $o_cassandra->query($s_cql);

$o_cassandra->close();

print_r($st_results);

$i_finish_time = microtime(true);
$i_execution_time = $i_finish_time-$i_start_time;

echo 'Execution time: '.$i_execution_time."\n";
echo "\n";

This insert took 0.0091850757598877 seconds, executed from the CLI (command line).

If the INSERT works well you’ll have a [result] => ‘success’ in the resulting array.

cassandra-php-insert-result-success

Do some inserts

Here we do 9000 inserts.

<?php

require_once 'Cassandra/Cassandra.php';

// Note this code uses the MT notation http://blog.carlesmateo.com/maria-teresa-notation-for-php/
$i_start_time = microtime(true);

$o_cassandra = new Cassandra();

$s_server_host     = '127.0.0.1';    // Localhost
$i_server_port     = 9042; 
$s_server_username = '';  // We don't have username
$s_server_password = '';  // We don't have password
$s_server_keyspace = 'cassandra_tests';  

$o_cassandra->connect($s_server_host, $s_server_username, $s_server_password, $s_server_keyspace, $i_server_port);

$s_date_time = date('Y-m-d H:i:s');

for ($i_bucle=0; $i_bucle<9000; $i_bucle++) {
    // Add a sample text, let's use time for example
    $s_time = strval(time());

    $s_cql = "INSERT INTO carles_test_table (s_thekey, s_column1, s_column2)
VALUES ('$i_bucle', '$s_time', 'http://blog.carlesmateo.com');";

    // Launch the query
    $st_results = $o_cassandra->query($s_cql);

}

$o_cassandra->close();

$i_finish_time = microtime(true);
$i_execution_time = $i_finish_time-$i_start_time;

echo 'Execution time: '.$i_execution_time."\n";
echo "\n";

Those 9,000 INSERTs take 6.49 seconds in my test virtual machine, executed from the CLI (command line).

cqlsh-loaded-9000-rows-select-limit-10

Do a Select

<?php

require_once 'Cassandra/Cassandra.php';

// Note this code uses the MT notation http://blog.carlesmateo.com/maria-teresa-notation-for-php/
$i_start_time = microtime(true);

$o_cassandra = new Cassandra();

$s_server_host     = '127.0.0.1';    // Localhost
$i_server_port     = 9042; 
$s_server_username = '';  // We don't have username
$s_server_password = '';  // We don't have password
$s_server_keyspace = 'cassandra_tests';  

$o_cassandra->connect($s_server_host, $s_server_username, $s_server_password, $s_server_keyspace, $i_server_port);


$s_cql = "SELECT * FROM carles_test_table LIMIT 10;";

// Launch the query
$st_results = $o_cassandra->query($s_cql);
echo 'Printing 10 rows:'."\n";

print_r($st_results);

$o_cassandra->close();

$i_finish_time = microtime(true);
$i_execution_time = $i_finish_time-$i_start_time;

echo 'Execution time: '.$i_execution_time."\n";
echo "\n";

Printing 10 rows passing the query with LIMIT:

$s_cql = "SELECT * FROM carles_test_table LIMIT 10;";

and echoing them as an array with print_r takes 0.01090407371521 seconds (the cost of printing is high).

cassandra-php-select-limit-10

If you don’t print the rows, it takes only 0.00714111328125 seconds.
Selecting 9,000 rows, if you don’t print them, takes 0.18086194992065 seconds.

Java

The official driver for Java works very well.

The only initial difficulties may be building the required libraries with Maven and dealing with the different Cassandra native data types.

To make that journey easy, I describe what you have to do to generate the libraries and provide you with a Db class of mine that saves you from dealing with data types and returns a simple ArrayList with the field names and all the data as Strings.

Datastax provides the pom.xml for Maven, so you can build your jar files. Then you can copy those jar files to the Libraries folder of any project you want to use Cassandra with.

cmateo-cassandra-java-dependencies

My Db class:

/*
 * By Carles Mateo blog.carlesmateo.com
 * You can use this code freely, or modify it.
 */

package server;

import java.util.ArrayList;
import java.util.List;
import com.datastax.driver.core.*;

/**
 * @author carles_mateo
 */
public class Db {

    public String[] s_cassandra_hosts = null;
    public String s_database = "cchat";
    
    public Cluster o_cluster = null;
    public Session o_session = null;
    
    Db() {
        // The Constructor
        this.s_cassandra_hosts = new String[10];
        
        String s_cassandra_server = "127.0.0.1";
        
        this.s_cassandra_hosts[0] = s_cassandra_server;
        
        this.o_cluster = Cluster.builder()
                                        .addContactPoints(s_cassandra_hosts[0]) // More than 1 separated by commas
                                        .build();
        this.o_session = this.o_cluster.connect(s_database);  // This is the KeySpace

    }
    
    public static String escapeApostrophes(String s_cql) {
        String s_cql_replaced = s_cql.replaceAll("'", "''");
        
        return s_cql_replaced;
    }
    
    public void close() {
        // Releases the Session and the Cluster. Java has no destructors, so call it explicitly
        this.o_session.close();
        this.o_cluster.close();
    }
    
    public ArrayList query(String s_cql) {
        
        ResultSet rows = null;
        
        rows = this.o_session.execute(s_cql);
        
        ArrayList st_results = new ArrayList();
        List<String> st_column_names = new ArrayList<String>();
        List<String> st_column_types = new ArrayList<String>();

        ColumnDefinitions o_cdef = rows.getColumnDefinitions();

        int i_num_columns = o_cdef.size();
        for (int i_columns = 0; i_columns < i_num_columns; i_columns++) {
            st_column_names.add(o_cdef.getName(i_columns));
            st_column_types.add(o_cdef.getType(i_columns).toString());                
        }                
        
        st_results.add(st_column_names);
        
        for (Row o_row : rows) {
            
            List<String> st_data = new ArrayList<String>();
            for (int i_column=0; i_column<i_num_columns; i_column++) {
                if (st_column_types.get(i_column).equals("varchar") || st_column_types.get(i_column).equals("text")) {
                    st_data.add(o_row.getString(i_column));
                } else if (st_column_types.get(i_column).equals("timeuuid")) {
                    st_data.add(o_row.getUUID(i_column).toString());
                } else if (st_column_types.get(i_column).equals("int")) {
                    st_data.add(String.valueOf(o_row.getInt(i_column)));
                }
                // TODO: Implement other data types
                
            }
            st_results.add(st_data);
           
        }
        
        return st_results;
        
    }
    
    public static String getFieldFromRow(ArrayList st_results, int i_row, String s_fieldname) {
        
        List<String> st_column_names = (List)st_results.get(0);
        
        boolean b_column_found = false;
        
        int i_column_pos = 0;
        
        for (String s_column_name : st_column_names) {
            if (s_column_name.equals(s_fieldname)) {
                b_column_found = true;
                break;
            }
            i_column_pos++;
        }
        
        if (b_column_found == false) {
            return null;
        }
        
        int i_num_columns = st_results.size();
        
        List<String> st_data = (List)st_results.get(i_row);
        
        String s_data = st_data.get(i_column_pos);
        
        return s_data;
    }
    
}

 

Python 2.7

There is currently no driver for Python 3. I asked Datastax and they told me that they are working on a new driver for Python 3.

To work with Datastax’s Python 2.7 driver:

1) Download the driver from http://planetcassandra.org/client-drivers-tools/ or git clone from https://github.com/datastax/python-driver

2) Install the dependencies for the Datastax’s driver

Install python-pip (Installer)

sudo apt-get install python-pip

Install python development tools

sudo apt-get install python-dev

This is required for some of the libraries used by original Cassandra driver.

Install Cassandra driver required libraries

sudo pip install futures
sudo pip install blist
sudo pip install metrics
sudo pip install scales

Query Cassandra from Python

The problem is the same as with Java, the different data types are hard to deal with.
So I created a function convert_to_string that converts known data types to String, and so later we will only deal with Strings.

In this sample, the results of the query are rendered in xml or in html.

#!/usr/bin/env python
# -*- coding: UTF-8 -*-
# Use with Python 2.7+

__author__ = 'Carles Mateo'
__blog__ = 'http://blog.carlesmateo.com'

import sys

from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

s_row_separator = u"||*||"
s_end_of_row = u"//*//"
s_data = u""

b_error = 0
i_error_code = 0
s_html_output = u""
b_use_keyspace = 1 # By default use keyspace
b_use_user_and_password = 1 # Not implemented yet

def return_success(i_counter, s_data, s_format = 'html'):
    i_error_code = 0
    s_error_description = 'Data returned Ok'

    return_response(i_error_code, s_error_description, i_counter, s_data, s_format)
    return

def return_error(i_error_code, s_error_description, s_format = 'html'):
    i_counter = 0
    s_data = ''

    return_response(i_error_code, s_error_description, i_counter, s_data, s_format)
    return

def return_response(i_error_code, s_error_description, i_counter, s_data, s_format = 'html'):

    if s_format == 'xml':
        print ("Content-Type: text/xml")
        print ("")
        s_html_output = u"<?xml version='1.0' encoding='utf-8' standalone='yes'?>"
        s_html_output = s_html_output + '<response>' \
                                        '<status>' \
                                        '<error_code>' + str(i_error_code) + '</error_code>' \
                                        '<error_description>' + '<![CDATA[' + s_error_description + ']]>' + '</error_description>' \
                                        '<rows_returned>' + str(i_counter) + '</rows_returned>' \
                                        '</status>' \
                                        '<data>' + s_data + '</data>' \
                                        '</response>'
    else:
        print("Content-Type: text/html; charset=utf-8")
        print("")
        s_html_output = str(i_error_code)
        s_html_output = s_html_output + '\n' + s_error_description + '\n'
        s_html_output = s_html_output + str(i_counter) + '\n'
        s_html_output = s_html_output + s_data + '\n'

    print(s_html_output.encode('utf-8'))
    sys.exit()
    return

def convert_to_string(s_input):
    # Convert other data types to string

    s_output = s_input

    try:
        if s_input is not None:

            if isinstance(s_input, unicode):
                # string unicode, do nothing
                return s_output

            if isinstance(s_input, (int, float, bool, set, list, tuple, dict)):
                # Convert to string
                s_output = str(s_input)
                return s_output

            # This is another type, try to convert
            s_output = str(s_input)
            return s_output

        else:
            # is none
            s_output = ""
            return s_output

    except Exception as e:
        # Were unable to convert to str, will return as empty string
        s_output = ""

    return s_output

def convert_to_utf8(s_input):
    return s_input.encode('utf-8')

# ********************
# Start of the program
# ********************

s_format = 'xml'  # how you want this sample program to output

s_cql = 'SELECT * FROM test_table;'
s_cluster = '127.0.0.1'
s_port = "9042" # default port
i_port = int(s_port)

b_use_keyspace = 1
s_keyspace = 'cassandra_tests'
if s_keyspace == '':
    b_use_keyspace = 0

s_user = ''
s_password = ''
if s_user == '' or s_password == '':
    b_use_user_and_password = 0

try:
    cluster = Cluster([s_cluster], i_port)
    session = cluster.connect()
except Exception as e:
    return_error(200, 'Cannot connect to cluster ' + s_cluster + ' on port ' + s_port + '.' + e.message, s_format)

if (b_use_keyspace == 1):
    try:
        session.set_keyspace(s_keyspace)
    except:
        return_error(210, 'Keyspace ' + s_keyspace + ' does not exist', s_format)

try:
    o_results = session.execute_async(s_cql)
except Exception as e:
    return_error(300, 'Error executing query. ' + e.message, s_format)

try:
    rows = o_results.result()
except Exception as e:
    return_error(310, 'Query returned result error. ' + e.message, s_format)

# Query returned values
i_counter = 0
try:
    if rows is not None:
        for row in rows:
            i_counter = i_counter + 1

            if i_counter == 1 and s_format == 'html':
                # first row is row titles
                for key, value in vars(row).iteritems():
                    s_data = s_data + key + s_row_separator

                s_data = s_data + s_end_of_row

            if s_format == 'xml':
                s_data = s_data + ''

            for key, value in vars(row).iteritems():
                # Convert to string numbers or other types
                s_value = convert_to_string(value)
                if s_format == 'xml':
                    s_data = s_data + '<' + key + '>' + '<![CDATA[' + s_value + ']]>' + '</' + key + '>'
                else:
                    s_data = s_data + s_value
                    s_data = s_data + s_row_separator


            if s_format == 'xml':
                s_data = s_data + ''
            else:
                s_data = s_data + s_end_of_row

except Exception as e:
    # No iterable data
    return_success(i_counter, s_data, s_format)

# Just print the data
return_success(i_counter, s_data, s_format)

cassandra-lunacloud-sample-py

If you did not create the keyspace as in the previous samples, change those lines to:

s_cql = 'CREATE KEYSPACE cassandra_tests WITH REPLICATION = { \'class\': \'SimpleStrategy\', \'replication_factor\': 1 };'
s_cluster = '127.0.0.1'
s_port = "9042" # default port
i_port = int(s_port)

b_use_keyspace = 1
s_keyspace = ''

Run the program to create the Keyspace and you’ll get:

carles@ninja8:~/Desktop/codi/python/test$ ./lunacloud-create.py 
Content-Type: text/xml

<error_code>0<error_description>

Then you can create the table simply by setting:

s_cql = 'CREATE TABLE test_table (s_thekey text, s_column1 text, s_column2 text,PRIMARY KEY (s_thekey));'
s_cluster = '127.0.0.1'
s_port = "9042" # default port
i_port = int(s_port)

b_use_keyspace = 1
s_keyspace = 'cassandra_tests'

IDE PyCharm Community Edition

Cassandra Universal Driver

As mentioned above, if you use a Tcp/Ip-enabled language that is very new, or very old like ASP or ColdFusion, or you work from the Unix command line, and you want to use it with Cassandra, you can use my solution http://www.cassandradriver.com/.

cassandradriver-v1-1-xml-sample

It is basically a Web Gateway able to speak XML, JSON or CSV alike. It relies on the official Datastax Python driver.

It is not as fast as a native driver, but it works pretty well and allows you to split your architecture in interesting ways, like intermediate layers that tighten security even more. For example, Web Servers may query the gateway, which will restrict some permissions, instead of having direct access to the Cassandra Cluster. The gateway can also be used to perform real-time map-reduce operations on the data returned by the Cassandra nodes, freeing the Web Servers from that task and saving CPU.

Tip: If you use Cassandra for Development only, you can limit the amount of memory used by editing the file /etc/cassandra/cassandra-env.sh and hardcoding:

    # limit the memory for development environment
    # --------------------------------------------
    system_memory_in_mb="512"
    system_cpu_cores="1"
    # --------------------------------------------

Just before the line:

# set max heap size based on the following

That way Cassandra will believe your system memory is 512 MB and reserve only 256 MB for its use.
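
The 256 MB figure follows from the rule stated in the comment of cassandra-env.sh in the versions I know, max(min(1/2 ram, 1024MB), min(1/4 ram, 8GB)). A small sketch of that arithmetic, assuming your version of the script still uses the same rule (check the comment in your own file):

```python
def max_heap_mb(system_memory_in_mb):
    """Heap size according to max(min(1/2 ram, 1024MB), min(1/4 ram, 8GB))."""
    i_half = min(system_memory_in_mb // 2, 1024)
    i_quarter = min(system_memory_in_mb // 4, 8192)
    return max(i_half, i_quarter)


print(max_heap_mb(512))  # the development setup of the tip above
```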

Things I hate from PHP

I love PHP: it is fast to develop with, it has many useful built-in features, it can be extended with C modules, and Arrays are wonderful and save a lot of time on data-type conversion. But there are certain problems that you should know and be aware of.

There are wonderful posts that mention a lot of issues with PHP; in this article I mention only the stuff I have not seen around.

I recommend reading this wonderful post about the bad design of PHP:

PHP a fractal of bad design

this-is-php-very-bad

‘string’ == 0 is (often) true

This is something everyone has fallen for at some time. And I still see a lot of code on GitHub, and in my new Teams when I join a project, that falls into that problem.

PHP is “clever” at transforming the type of the data to compare it. This allows you to produce code much faster (try to parse floats in Java or C++ in web projects) but it also leads to problems sometimes.

So if your developers write code like this, which gets the income sent from a web form:

$s_income = $_POST['income'];
if ($s_income == 0) {
    // The guy is poor, save to the evil CRM database as no interesting person...
    // ...
}

That will wrongly detect a 0 if someone enters ‘millions’ or ‘$100000’ in the textbox.

The solution would be to check if $s_income is empty and, if it is not empty but intval($s_income) is 0, ask the user to enter the value again using only numeric characters.

Another example. Imagine that you have a program that reads a CSV file that has addresses. For example, these eight lines:

Facebook,Hacker Way,1,Menlo Park,94025,CA
Amazon,2nd Avenue,,1516,WA
Microsoft Corporation,One Microsoft Way,0,,WA
Apple,Infinite Loop, 1,Cupertino,95014,CA
Twitter,Market Street,  1355,,,CA
Netflix,Winchester Circle,one hundred,Los Gatos,95032,CA
Cmips,,0-1,Palo Alto,,CA
Fake Address,Nowhere,Building 5,Silicon Valley,,CA

A CSV may contain errors, as most of the time the data was typed manually at some point or entered by users via the web.

So your code reads it, puts each field as a string in an array, and you can use it.

Let’s assume that our buggy code looks for a 0 in the number, and then performs some action like setting a boolean to FALSE in the database or whatever.

Something like:

<?php
$i_row = 0;
if (($o_handle = fopen("addresses.csv", "r")) !== FALSE) {
    while (($st_data = fgetcsv($o_handle, 1000, ",")) !== false) {
        $i_row++;
        $i_num_fields = count($st_data);

        $s_company_name = $st_data[0];
        $s_street = $st_data[1];
        $s_number = $st_data[2];

        if ($s_number == 0) {
            // The address has no number
            // Do something real...
            echo 'Row: '.$i_row.' '.$s_company_name.' The address has no number! read ('.$s_number.')'."\n";
        } else {
            echo 'Row: '.$i_row.' '.$s_company_name.' number '.$s_number.' found!'."\n";
        }
    }
    fclose($o_handle);
}

And this is the result:

blog-carlesmateo-com-things-i-hate-from-php-sample-equal-0
Look at the results:

Expression Result by PHP
 ‘1’ == 0  FALSE
 ‘’ == 0  (empty string)  TRUE
 ‘ ‘ == 0  (space)  TRUE
 ‘ 1’ == 0  (space and 1)  FALSE
 ‘  1355’ == 0  (space space and 1355)  FALSE
 ‘one hundred’ == 0  TRUE
 ‘0-1’ == 0  TRUE
 ‘Building 5’ == 0  TRUE

If you do ‘one string’ == 0 it returns TRUE, but the mechanics of why that happens are curious, capricious and quasi-random.

Normally the mechanics are: PHP sees that it has to compare a string to a number and converts the string to a number via intval, so ‘  1355’ == 0 is FALSE because intval(‘  1355’) returns 1355. Please note that ‘  1355’ has two leading spaces.

OK. That explains everything, but it is still dangerous because ‘Building 5’ == 1 returns FALSE while ‘Building 5’ == 0 returns TRUE, so most Junior developers (and many self-called Seniors) will use == 0 instead of comparing against the empty string with ===, as in $s_number === ‘’.

This is funny, but it gets funnier when we add another line to the CSV:

Fake Address outside US,Somewhere,.1,Andorra,AD100,Andorra

Here we introduced dot one, ‘.1’, and when we run the program PHP detects it as a number, so it is not doing intval(‘.1’) but floatval(‘.1’), which returns 0.1

blog-carlesmateo-com-equal-dot-one

I introduced a postal code from Andorra because they start with ‘AD’, hence ‘AD100’ in the example.

This demonstrates that our program would have detected the postal codes from the US as numbers, but would have failed when used with data from other countries, as ‘AD100’ == 0 is TRUE.

So always use === to check the type as well, and do the intval.

In this sample:

if ( !empty($s_postal_code) && intval($s_postal_code) === 0) {

// Wops! The postal code is there but is not a number

}

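Checking that the input data matches the requirements would also have uncovered future weird cases, like postal codes with letters. The check itself must not rely on a loose comparison (intval($s_postal_code) != $s_postal_code is FALSE for ‘AD100’, because 0 == ‘AD100’ is TRUE), so test the characters directly. Sample:

if (!ctype_digit($s_postal_code)) {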

// Wops! The code is not only numeric

}
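
For comparison, the same strict validation idea sketched in Python, the language used for the other added sketches on this page. The parse_strict_int helper is hypothetical; the point is that the string is checked character by character before any conversion takes place.

```python
def parse_strict_int(s_text):
    """Return the integer value of s_text, or None if it is not made only of digits."""
    # isascii() rules out exotic Unicode digits that isdigit() would accept
    if s_text.isascii() and s_text.isdigit():
        return int(s_text)
    return None


for s_sample in ["94025", "AD100", "", " 1355", ".1"]:
    print(repr(s_sample), parse_strict_int(s_sample))
```

Everything that is not a plain run of digits (letters, spaces, signs, dots, the empty string) is rejected instead of being silently converted.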

Many professional people have explained the craziness of that magic conversion and ‘string’ == 0, so I will not spend more time on it.

Other crazy results

EXPRESSION RESULT BY PHP
 ‘1’ == ‘ 1’ (space and 1)  TRUE
 ‘1’ == ‘          1’ (several spaces and 1)  TRUE
 ‘1’ == ‘+1’  (plus sign and 1)  TRUE
 ‘-1’ == ‘   -1’  (spaces and -1)  TRUE
 ‘-1’ == ‘                                        -1.00’  (spaces -1 dot 00)  TRUE
 ‘-1’ == ‘                                        -1.000000000000001’  FALSE
 ‘-1’ == ‘                                        -1.000000000000000000000000000000000000000000000001’  TRUE
 ‘1.0’ == ‘1.000000000000000’  TRUE
 ‘1e10’ == “10000000000”  TRUE
 ‘1e1’ == “0x0A”  TRUE
 -1 == ‘                                            -1’ (integer -1 equal to string with spaces and -1)  TRUE

So if you register in a PHP system with the username 12345 you will be able to log in later by entering [space][space][space][space][space]+12345.00000000, or if you pick a username like 10000000000 you’ll be able to log in just by entering 1e10 (which could be very bad if there is another user in the system with the username 1e10).
So always use === to check values.

The “amazing” world of ++

Try to apply ++ to a string, and to a string that contains decimal symbols…

carlesmateo-com-i-hate-from-php-randomnessity

The ‘horrible’, the floats

As the PHP documentation page says:

Warning

Never cast an unknown fraction to integer, as this can sometimes lead to unexpected results.

carlesmateo-com-from-php-net

So yes…

carlesmateo-com-php-with-floats

 

It is shocking that var_export and var_dump show different values, but more shocking is to get this:

php > $i_valor_float = 81.60;
php > echo intval($i_valor_float * 100)."\n";

And you get 8159

carlesmateo-com-php-float-losing-cents

If you are an e-commerce or a bank, you will not be happy about losing cents.

In fact, the TPV (point of sale) Visa payment code for Sermepa is as buggy as this:

//-////////////////////////////////////////////
//desc: Set the amount of the purchase
//param: importe: total amount of the purchase to pay
//return: Returns the amount once modified
public function importe( $importe = 0 )
{
    $importe = $this->parseFloat( $importe );
    
    // Sermepa tells us: for Euros the last two positions are considered decimals.
    $importe = intval( $importe*100 );

You will need a workaround, such as using strval instead of intval:

php > $i_value_float = 81.60;
php > echo intval($i_value_float * 100)."\n";
8159
php > echo (9000 - intval($i_value_float * 100))."\n";
841
php > echo strval($i_value_float * 100)."\n";
8160
php > echo (9000 - strval($i_value_float * 100))."\n";
840

So in Sermepa’s code you can do:

    $importe = strval($importe*100);

Or:

    $importe = intval(strval($importe*100));

carlesmateo-com-php-float-intval-lossing-strval-working
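Another option, sketched here as an alternative to the strval workaround and not as what Sermepa ships, is to round the product before casting it to an integer. The function name `euros_to_cents` is an assumption for this example.

```php
<?php
// Hypothetical helper: convert an amount in euros to integer cents.
// round() moves the product to the nearest whole number before the cast,
// so 8159.999999999999 becomes 8160 instead of being truncated to 8159.
function euros_to_cents($f_amount)
{
    return (int) round($f_amount * 100);
}

echo intval(81.60 * 100) . "\n";    // 8159 (truncated)
echo euros_to_cents(81.60) . "\n";  // 8160
```

Keeping amounts as integer cents from the moment they enter the system avoids the multiplication altogether.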

Floats offer bizarre scenarios like this one (code copied from a contribution in the PHP.net help):
$x = 8 - 6.4; // which is equal to 1.6
$y = 1.6;
var_dump($x == $y); // bool(false)

Funnier still is to do:
echo $x-$y;

You get:
-4.4408920985006E-16

More information on PHP float here: http://php.net/manual/en/language.types.float.php
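A common way to live with this is to compare floats with a tolerance instead of ==. The sketch below uses a hypothetical helper named `floats_equal`. The default tolerance of 1e-9 is an arbitrary choice for the example and should be picked according to the magnitudes you work with.

```php
<?php
// Hypothetical helper: two floats are "equal" when they differ by less
// than a tolerance. 1e-9 is an arbitrary default chosen for this example.
function floats_equal($f_a, $f_b, $f_tolerance = 1e-9)
{
    return abs($f_a - $f_b) < $f_tolerance;
}

$x = 8 - 6.4;
$y = 1.6;
var_dump($x == $y);              // bool(false)
var_dump(floats_equal($x, $y));  // bool(true)
```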

Array keys being overwritten – part 1

As you know, array keys in PHP can be integers or strings.

As described in PHP documentation about arrays:

The key can either be an integer or a string. The value can be of any type.

Additionally the following key casts will occur:

  • Strings containing valid integers will be cast to the integer type. E.g. the key “8” will actually be stored under 8. On the other hand “08” will not be cast, as it isn’t a valid decimal integer.
  • Floats are also cast to integers, which means that the fractional part will be truncated. E.g. the key 8.7 will actually be stored under 8.
  • Bools are cast to integers, too, i.e. the key true will actually be stored under 1 and the key false under 0.
  • Null will be cast to the empty string, i.e. the key null will actually be stored under “”.
  • Arrays and objects can not be used as keys. Doing so will result in a warning: Illegal offset type.

If multiple elements in the array declaration use the same key, only the last one will be used as all others are overwritten.

But take a look at the following code:

<?php

$st_array = Array('+1' => 'This is key +1',
                  '*1' => 'This is key *1',
                  '-1' => 'This is key -1');

var_dump($st_array);

Look at the dump:

blog-carlesmateo-com-why-i-hate-php-array-keys

So the key of type string ‘-1’ has been converted to the integer key -1.

But, funnily, ‘+1’ has been kept as the string ‘+1’. If you do:

echo intval('+1');

You get integer 1. So exactly which function is used to check whether a key can be cast to integer remains a mystery.
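To see which keys get cast, a small sketch like the following can help. The helper name `stored_key_type` is made up for this example. It stores a single element under the given string and reports the type of the key PHP actually kept.

```php
<?php
// Hypothetical helper: returns the type PHP really stores for a key
// that was written as a string.
function stored_key_type($s_key)
{
    $st_array = array($s_key => true);
    $st_keys  = array_keys($st_array);
    return gettype($st_keys[0]);
}

foreach (array('8', '-1', '+1', '08', '*1', '1.5') as $s_key) {
    echo "'" . $s_key . "' => " . stored_key_type($s_key) . "\n";
}
// '8' and '-1' are reported as integer, the other four as string
```

This matches the rule quoted from the documentation: only strings that look exactly like a decimal integer are cast, so a plus sign, a leading zero or a decimal point is enough to keep the key as a string.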

Array keys being overwritten – part 2

Taking the CSV sample file from before and using the Postal Codes, one may notice another problem.

When we load the postal code ‘94025’ it is added to the array as integer key, not string key.

This is a problem if we do an array_merge, because:

Values in the input array with numeric keys will be renumbered with incrementing keys starting from zero in the result array.

Creating an array keyed by postal code strings and merging it with another array of postal codes will cause the loss of the keys. That is very bad.

There are many more problems.

In Barcelona we have many postal codes starting with 0, like ‘08014’, so those will remain as string keys. As said, in Andorra there are codes like ‘AD100’.

So if we have a program like this:

<?php

$st_array = Array( '08014' => 'Postal code from Barcelona, Catalonia',
                   '94025' => 'Postal code from Menlo Park, California',
                   'AD100' => 'Postal code from Andorra');

ksort($st_array);

var_dump($st_array);

We will get:

blog-carlesmateo-com--things-i-hate-from-php-array-sort

So we got the string keys sorted first, and then the numeric keys sorted. That is bad, as ‘08014’ would be expected to come before ‘94025’, but as the latter was cast to integer and the former kept as a string key, we got a bizarre result in our postal code sorting program.

That can be very annoying in many cases. Imagine you read a fixed-length CSV file, or you read from a Webservice, XML or a database the age of people stored with 2 digits, so ’10’ means ten years old, ’99’ ninety-nine, etc… You will have the guys from 0 to 9 years old (00 to 09) kept as strings, and the others as numbers. Now do an array_merge with another array and the mess is served.

I recommend adding a character in front of your keys to ensure that the cast does not happen, so use ‘C94025’ and ‘C08014’ and ‘CAD100’… Sort algorithms will work, and you will avoid the problems derived from merging and sorting keys cast to numeric.
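A minimal sketch of that recommendation follows. The helper name `prefix_keys` and the extra New York entry are assumptions added for the example.

```php
<?php
// Hypothetical helper: put a letter in front of every key so that PHP
// keeps all of them as strings.
function prefix_keys(array $st_input, $s_prefix = 'C')
{
    $st_output = array();
    foreach ($st_input as $m_key => $m_value) {
        $st_output[$s_prefix . $m_key] = $m_value;
    }
    return $st_output;
}

$st_codes = prefix_keys(array('94025' => 'Menlo Park, California',
                              '08014' => 'Barcelona, Catalonia',
                              'AD100' => 'Andorra'));
$st_more  = prefix_keys(array('10001' => 'New York'));

$st_all = array_merge($st_codes, $st_more);  // string keys survive the merge
ksort($st_all, SORT_STRING);

print_r(array_keys($st_all));  // C08014, C10001, C94025, CAD100
```

The prefix has to be added before the data is used as array keys anywhere else, and stripped again with substr when the original code is needed.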

Bizarre behaviour on Boolean execution

Some Sysadmins are used to constructions like this in their bash scripts and Python:

true && do_something(); // This evaluates the first part always and then executes do_something();

file_exists('/tmp/whatever') || touch('/tmp/whatever'); // If file_exists returns false, the second part is executed and the file is created

Although I never found this way of executing documented in PHP, it works, and you have surely seen samples like this:

$result = mysql_query('SELECT foo FROM bar', $o_db) or die('Query failed: ' . mysql_error($o_db));

 

When working as a Contractor I found a project using it. It was a habit of the CEO, who came from the Systems branch. He had more bad habits, like not documenting, programming with vim, and refactoring the code of my Team during the weekends with bash replace commands (which caused conflicting variable names, methods, etc…). (* I know it is weird that a CEO changes the code but, believe me, companies do a lot of crazy things)

He told me that he thought that fewer lines of code meant a more clever developer, and so he was always refactoring clear code into this Boolean-like execution.

Using that style is bad. It doesn’t allow proper error handling, nor a clear flow of execution, nor raising exceptions.

I showed him that this does not work the way he thought, and found samples that crash the thing:

Trying to throw an exception breaks (on the PHP 5.5 used here, where throw is a statement and cannot be part of an expression):

true && throw new Exception('Hello');

If using echo, it breaks:

php-error-unexpected

So it’s really bad PHP design.
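For comparison, here is a sketch of the file_exists/touch one-liner from above written with an explicit flow. The function name `ensure_file_exists` is hypothetical. It takes more lines, but the failure case is handled and an exception can be raised.

```php
<?php
// Hypothetical helper: create the file when it is missing, fail loudly otherwise.
// Returns true when the file had to be created, false when it was already there.
function ensure_file_exists($s_path)
{
    if (file_exists($s_path)) {
        return false;
    }
    // @ silences the PHP warning; the failure is reported as an exception instead
    if (!@touch($s_path)) {
        throw new RuntimeException('Could not create ' . $s_path);
    }
    return true;
}

$s_path = sys_get_temp_dir() . '/whatever_' . uniqid();
var_dump(ensure_file_exists($s_path));  // bool(true): created now
var_dump(ensure_file_exists($s_path));  // bool(false): already there
unlink($s_path);
```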

This behaviour is also funny:

2015-05-05-114624-blog-carlesmateo-com-php-5-5-9

So passing an undefined index by reference causes it to be created with null, without any warning.

I found that in the passing variables documentation, in a contributed comment from 10 years ago! It still happens with PHP 5.5.
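The screenshots are not reproduced as text, so here is a minimal sketch of the behaviour described. The function `do_nothing` and the array contents are made up for the example.

```php
<?php
// A function that takes its argument by reference and does nothing with it.
function do_nothing(&$m_value)
{
}

$st_data = array('a' => 1);
do_nothing($st_data['missing']);  // no notice, no warning

var_dump(array_key_exists('missing', $st_data));  // bool(true)
var_dump($st_data['missing']);                    // NULL
```

A later array_key_exists or count on that array will therefore see an element that nobody meant to add.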

More samples:

blog-carlesmateo-com-passing-array-not-existing-by-ref

 

Upgrade your Scalability with NoSql

CAP-theorem
We’re experiencing another digital divide.

The first one was between people who knew about IT and those who did not, but we’re living through another one, between IT guys who are unable to Scale and those who are able to Scale well.

A few years ago I was working all the time with Relational Databases, designing cool relational Schemas for amazing projects. I had worked for years with Oracle, Microsoft Sql Server, Informix, Dbase, Trees, Xml, and more recently with PostgreSql and MySql.

I was making a lot of improvements to MySql installations to allow Scaling and Scaling more, to bring more reliability, to improve performance, to allow more sessions… in short, to fit the needs of the businesses in a challenging world that demanded more and more ability to handle more and more users.

Master-Master, Master with secondaries for reads, clusters of memcached or redis used as cache, database sharding, IP failover, load balancers, additional indexes, InMemory engines, Ramdisks… everything that could help to cope with an increase in load volumes.

I used commercial products like Code Futures dbshards, and I created my own database sharding solution in order to split the data across several MySql servers, etc..

Artisanal setups and a lot of studying and testing, everything to Scale to the needs of the companies, to handle more and more traffic, more and more users…

And I was proud of my level, since I was able to succeed where few were able.

But now that is not needed anymore.

Basically the NoSql systems were born to deal with the current problems.

NoSql servers (keep in mind that the term comprises a lot of different solutions) were born to:

  • Work in cluster
  • Split the load among the cluster
  • Work in cheap commodity servers (or small cloud instances)
  • Resistance to failure: Allow the destruction of some nodes without data loss
  • Work with nodes at distant-location datacenters

There are many different NoSql products, like Cassandra, Hadoop, MongoDb, Riak, Neo4J, Redis…

And they do auto-sharding of the data, distribute the data across the network to fit the replication factor set, support load balancing, and in the case of Cassandra Scaling horizontally is as easy as adding more nodes to the Cassandra Cluster.

So yes, believe it. That’s why I write this article. So you can improve your projects and save tons of money.

Databases like Cassandra allow you to Scale as easily as adding new nodes. It is a peer to peer cluster with no single point of failure. All the nodes know the status of the other nodes and they distribute the load.

You can query the same server all the time, and it will split the load among the other servers.
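To make the idea of auto-sharding with a replication factor more concrete, here is a toy sketch. It is not how Cassandra is implemented (Cassandra uses a token ring with consistent hashing); the function `replica_nodes`, the node names and the use of crc32 are all assumptions for illustration only.

```php
<?php
// Toy partitioner, NOT Cassandra's algorithm: it hashes the key to pick a
// first node and keeps the remaining copies on the nodes that follow it.
function replica_nodes($s_key, array $st_nodes, $i_replication_factor)
{
    $i_total   = count($st_nodes);
    $i_first   = abs(crc32($s_key)) % $i_total;
    $i_copies  = min($i_replication_factor, $i_total);
    $st_result = array();
    for ($i = 0; $i < $i_copies; $i++) {
        $st_result[] = $st_nodes[($i_first + $i) % $i_total];
    }
    return $st_result;
}

$st_nodes = array('node1', 'node2', 'node3', 'node4', 'node5');
foreach (array('user:1', 'user:2', 'user:3') as $s_key) {
    echo $s_key . ' is stored on ' . implode(', ', replica_nodes($s_key, $st_nodes, 3)) . "\n";
}
```

With a replication factor of 3, any two nodes can be destroyed and every key still has a copy somewhere. A plain modulo like this one would move most keys when a node is added, which is one of the things that real systems avoid.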

NoSql systems like hadoop allow you to create a large filesystem in cluster, with files as big as the whole cluster, but the best quality of HDFS is that it balances the load and replicates the blocks of data among different servers, so if you lose nodes of the cluster and you have a high enough replication factor you will not lose data. I know companies in Barcelona with 500+ TB in HDFS and companies in the States with thousands of nodes.

So, unlike what most people believe, NoSql is not about how the information is stored in the database, that is, about being Schemaless. (* take a look at Graph NoSql databases for relations in NoSql)

NoSql does not have a Schema in the traditional sense of Relational Databases, but it has aggregation, columns, supercolumns, or documents depending on the solution, and the design has an impact on performance. The principal virtue of the NoSql systems, though, is that they were born to work in cluster, to distribute the load, to be resilient to errors and to Scale.

I’ve seen many Startups suffering from overloaded MySql databases, but none of this has to happen with NoSql systems like Cassandra or MongoDb.

Before, they scaled the MySql server vertically, adding more RAM, adding more CPU, getting better disks, until it was impossible to upgrade any further. And if sharding was not possible due to joins, the project was in serious trouble.

But with NoSql you can have, instead of one expensive, very powerful server, 5 really cheap servers, and it could be faster, cheaper, resilient to errors, and with better uptime. And if you want to Scale, simply add more cheap servers.

The most important part of this article has been said, so you can start to look at NoSql solutions.

As a bonus, I add a list of NoSql systems and the kind of Data Model that they have:

 

Database name | Type of data model | Extra info | Companies using it
Memcached | Key-Value | Storage is in Memory, so it is used mainly as cache | Companies I’ve worked for: ECManaged, privalia. Other well known companies: LiveJournal, Wikipedia, Flickr, Bebo, Twitter, Typepad, Yellowbot, Youtube, Digg, WordPress.com, Craigslist, Mixi
Redis | Key-Value | Works in cluster. Can be used in memory or persistent | Companies I’ve worked for: Atrapalo, ECManaged. Other well known companies: Twitter, Instagram, Github, Engine Yard, Craigslist, guardian.co.uk, blizzard, digg, flickr, stackoverflow, tweetdeck
Riak | Key-Value | Supports a REST API through HTTP and Protocol Buffers for basic PUT, GET, POST, and DELETE. MapReduce with native Javascript and Erlang. In multi-datacenter replication, one cluster acts as a “primary cluster”. | AT&T, AOL, Ask.com, Best Buy, Boeing, Bump, Braintree, Comcast, DataPipe, Gilt Group, UK National Health Services (NHS), OpenX, Rovio, Symantec, TBS, The Weather Channel, WorkDay, Voxer, Yahoo! Japan, Yandex
BerkeleyDB | Key-Value | |
LevelDB | Key-Value | |
Project Voldemort | Key-Value | | LinkedIn
Google BigTable | Key-Value | |
Amazon DynamoDB | Key-Value | DynamoDB from Amazon, run in their AWS Cloud solution. See info on wikipedia |
Cassandra | Column-Family | My favourite Db-alike. You can download my CQLSÍ wrapper for PHP :) | NetFlix, Spotify, Facebook used it until 2010, Instagram, Rackspace, Rockyou, Zoho, Soundcloud, Hailo, ComCast, Hulu
HBase | Column-Family | Provides BigTable-like, SQL alike, support on the Hadoop core |
Hypertable | Column-Family | |
Amazon SimpleDB | Column-Family | |
MongoDB | Document Databases | Written in C++, JSON-style documents, default stores to RAM until flush, high performance but dangerous for data integrity. Supports Map-Reduce |
CouchDB | Document Databases | |
OrientDb | Document Databases | |
RavenDB | Document Databases | |
Terrastore | Document Databases | (legacy) |
Infinite Graph | Graph Databases | |
HyperGraph DB | Graph Databases | |
FlockDB | Graph Databases | |
Neo4J | Graph Databases | |
OrientDB | Graph Databases | |

Bonus for PHP Developers: two very simple, lightweight key-value store components, useful for one-server PHP projects, are APC (datastore capability) and Cache Lite (part of PEAR).

I can’t fail to mention hadoop, a NoSql that does not match the Data Storage categories above, because it is a Framework for the distributed processing of large data sets across clusters: a monster, able to do many, many things and to distribute loads across its nodes. The most well-known components are HDFS, the distributed filesystem, and Map-Reduce, a simple to develop YARN-based system for parallel processing of large data sets across the clusters. All the big companies like Netflix, Amazon, Yahoo, etc… are using Hadoop. It is often used as a synonym for BigData.

Hadoop, and the many projects surrounding it, is a world in itself, but it is worth it, because it allows incredible possibilities to distribute loads and to Scale.

Hadoop has a single point of failure in the namenode, which stores the names of the HDFS files in RAM, but solutions like MapR have overcome this.

Don’t get me wrong. Relational databases are wonderful, very useful, support transactions, stored procedures, have been tested for years, focused on consistency, and are very reliable.

They simply don’t allow us to Scale according to our current needs, while NoSql opens a wonderful world of easy, nearly infinite, Scaling.

As you see Open Source is ruling the world. :)

Companies are still sleeping and not supporting NoSql. I’m particularly disappointed with Open Source CMSs that are still based on Relational Models and are very hard to Scale. Drupal, WordPress, Joomla… and e-Commerces like Magento, osCommerce… and plugins for the CMSs mentioned (ubercart, woocommerce, virtuemart…) need to be ported to NoSql urgently. (Although some partial support exists in some solutions, it is not fully supported)
That’s why I’ve started to create a very simple Open Source CMS based on NoSql, to help companies and bloggers that can’t Scale their sites any further.

 

Troubleshooting apps in Linux

Let’s say you are on a system and a program stops working.

You check the space on disk, check that no one has modified the config files, check things like dns, etc… everything seems normal and you don’t know what else to check.

It could be that the filesystem got corrupted after a power loss, for example, and that one or more files are corrupted, which would be hard to figure out.

To find out what’s going wrong you can use strace.

In the simplest case strace runs the specified command until it exits. It intercepts and records the system calls which are called by a process and the signals which are received by a process. The name of each system call, its arguments and its return value are printed on standard error or to the file specified with the -o option.

http://linux.die.net/man/1/strace

As you may know, programs make system calls to, and receive signals from, the Operating System/Kernel.

strace will show all those requests done by the program, and the signals received. That means that you will see the requests from the program to the kernel to open a file, for example a config file.
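Traces get long very quickly, so it helps to look only at the calls that failed. The sketch below assumes the trace has been saved to a file with the -o option mentioned in the man page and read into an array of lines. The function name `failed_syscalls` and the regular expression are assumptions for this example; they cover the simple one-line format that strace prints for failed calls and nothing else.

```php
<?php
// Hypothetical helper: list the system calls that returned -1 in strace output.
function failed_syscalls(array $st_lines)
{
    $st_failed = array();
    foreach ($st_lines as $s_line) {
        // Expected shape: name(args) = -1 ERRNAME (description)
        if (preg_match('/^(\w+)\((.*)\)\s+= -1 (E[A-Z0-9]+) \((.*)\)\s*$/', $s_line, $st_match)) {
            $st_failed[] = array('call'  => $st_match[1],
                                 'args'  => $st_match[2],
                                 'error' => $st_match[3]);
        }
    }
    return $st_failed;
}

$st_sample = array(
    'access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)',
    'open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3',
    'connect(3, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)',
);

foreach (failed_syscalls($st_sample) as $st_call) {
    echo $st_call['error'] . ' ' . $st_call['call'] . '(' . $st_call['args'] . ")\n";
}
```

For a real trace file the lines could come from file($s_path, FILE_IGNORE_NEW_LINES). Many ENOENT results are normal, as a program often probes for optional files, so the interesting ones are those on files that should be there.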

Executing:

strace /usr/bin/ssh

This is the sample output:

strace /usr/bin/ssh
execve("/usr/bin/ssh", ["/usr/bin/ssh"], [/* 61 vars */]) = 0
brk(0)                                  = 0x7fc71509c000
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fc713cb2000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=126104, ...}) = 0
mmap(NULL, 126104, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fc713c93000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/x86_64-linux-gnu/libselinux.so.1", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\240Z\0\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0644, st_size=134224, ...}) = 0
mmap(NULL, 2234088, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fc713870000
mprotect(0x7fc71388f000, 2097152, PROT_NONE) = 0
mmap(0x7fc713a8f000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1f000) = 0x7fc713a8f000
mmap(0x7fc713a91000, 1768, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fc713a91000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/x86_64-linux-gnu/libcrypto.so.1.0.0", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0@\361\5\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0644, st_size=1934816, ...}) = 0
mmap(NULL, 4045240, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fc713494000
mprotect(0x7fc713646000, 2097152, PROT_NONE) = 0
mmap(0x7fc713846000, 155648, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1b2000) = 0x7fc713846000
mmap(0x7fc71386c000, 14776, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fc71386c000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/x86_64-linux-gnu/libdl.so.2", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\320\16\0\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0644, st_size=14664, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fc713c92000
mmap(NULL, 2109736, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fc713290000
mprotect(0x7fc713293000, 2093056, PROT_NONE) = 0
mmap(0x7fc713492000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x2000) = 0x7fc713492000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/x86_64-linux-gnu/libz.so.1", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\0\36\0\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0644, st_size=100728, ...}) = 0
mmap(NULL, 2195784, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fc713077000
mprotect(0x7fc71308f000, 2093056, PROT_NONE) = 0
mmap(0x7fc71328e000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x17000) = 0x7fc71328e000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/x86_64-linux-gnu/libresolv.so.2", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\320:\0\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0644, st_size=97144, ...}) = 0
mmap(NULL, 2202280, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fc712e5d000
mprotect(0x7fc712e73000, 2097152, PROT_NONE) = 0
mmap(0x7fc713073000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x16000) = 0x7fc713073000
mmap(0x7fc713075000, 6824, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fc713075000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0p\234\0\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0644, st_size=252704, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fc713c91000
mmap(NULL, 2348608, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fc712c1f000
mprotect(0x7fc712c5a000, 2097152, PROT_NONE) = 0
mmap(0x7fc712e5a000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x3b000) = 0x7fc712e5a000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\360\36\2\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1853400, ...}) = 0
mmap(NULL, 3961912, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fc712857000
mprotect(0x7fc712a14000, 2097152, PROT_NONE) = 0
mmap(0x7fc712c14000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1bd000) = 0x7fc712c14000
mmap(0x7fc712c1a000, 17464, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fc712c1a000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/x86_64-linux-gnu/libpcre.so.3", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\220\31\0\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0644, st_size=256224, ...}) = 0
mmap(NULL, 2351392, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fc712618000
mprotect(0x7fc712655000, 2097152, PROT_NONE) = 0
mmap(0x7fc712855000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x3d000) = 0x7fc712855000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/x86_64-linux-gnu/libpthread.so.0", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\360l\0\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=135757, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fc713c90000
mmap(NULL, 2212936, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fc7123fb000
mprotect(0x7fc712412000, 2097152, PROT_NONE) = 0
mmap(0x7fc712612000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x17000) = 0x7fc712612000
mmap(0x7fc712614000, 13384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fc712614000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/usr/lib/x86_64-linux-gnu/libkrb5.so.3", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\260p\1\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0644, st_size=848672, ...}) = 0
mmap(NULL, 2944608, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fc71212c000
mprotect(0x7fc7121f1000, 2093056, PROT_NONE) = 0
mmap(0x7fc7123f0000, 45056, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0xc4000) = 0x7fc7123f0000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/usr/lib/x86_64-linux-gnu/libk5crypto.so.3", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\360;\0\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0644, st_size=158136, ...}) = 0
mmap(NULL, 2257008, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fc711f04000
mprotect(0x7fc711f2a000, 2093056, PROT_NONE) = 0
mmap(0x7fc712129000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x25000) = 0x7fc712129000
mmap(0x7fc71212b000, 112, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fc71212b000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/x86_64-linux-gnu/libcom_err.so.2", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\200\24\0\0\0\0\0\0"..., 832) = 832
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fc713c8f000
fstat(3, {st_mode=S_IFREG|0644, st_size=14592, ...}) = 0
mmap(NULL, 2109896, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fc711d00000
mprotect(0x7fc711d03000, 2093056, PROT_NONE) = 0
mmap(0x7fc711f02000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x2000) = 0x7fc711f02000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/usr/lib/x86_64-linux-gnu/libkrb5support.so.0", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0@ \0\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0644, st_size=31160, ...}) = 0
mmap(NULL, 2126632, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fc711af8000
mprotect(0x7fc711aff000, 2093056, PROT_NONE) = 0
mmap(0x7fc711cfe000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x6000) = 0x7fc711cfe000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/x86_64-linux-gnu/libkeyutils.so.1", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\360\17\0\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0644, st_size=14256, ...}) = 0
mmap(NULL, 2109456, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fc7118f4000
mprotect(0x7fc7118f6000, 2097152, PROT_NONE) = 0
mmap(0x7fc711af6000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x2000) = 0x7fc711af6000
close(3)                                = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fc713c8e000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fc713c8d000
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fc713c8b000
arch_prctl(ARCH_SET_FS, 0x7fc713c8b840) = 0
mprotect(0x7fc712c14000, 16384, PROT_READ) = 0
mprotect(0x7fc711af6000, 4096, PROT_READ) = 0
mprotect(0x7fc713492000, 4096, PROT_READ) = 0
mprotect(0x7fc711cfe000, 4096, PROT_READ) = 0
mprotect(0x7fc712612000, 4096, PROT_READ) = 0
mprotect(0x7fc711f02000, 4096, PROT_READ) = 0
mprotect(0x7fc712129000, 4096, PROT_READ) = 0
mprotect(0x7fc713073000, 4096, PROT_READ) = 0
mprotect(0x7fc7123f0000, 40960, PROT_READ) = 0
mprotect(0x7fc712855000, 4096, PROT_READ) = 0
mprotect(0x7fc712e5a000, 4096, PROT_READ) = 0
mprotect(0x7fc71328e000, 4096, PROT_READ) = 0
mprotect(0x7fc713846000, 110592, PROT_READ) = 0
mprotect(0x7fc713a8f000, 4096, PROT_READ) = 0
mprotect(0x7fc713f1f000, 8192, PROT_READ) = 0
mprotect(0x7fc713cb4000, 4096, PROT_READ) = 0
munmap(0x7fc713c93000, 126104)          = 0
set_tid_address(0x7fc713c8bb10)         = 13672
set_robust_list(0x7fc713c8bb20, 24)     = 0
futex(0x7fff5c43f09c, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, NULL, 7fc713c8b840) = -1 EAGAIN (Resource temporarily unavailable)
rt_sigaction(SIGRTMIN, {0x7fc7124017e0, [], SA_RESTORER|SA_SIGINFO, 0x7fc71240abb0}, NULL, 8) = 0
rt_sigaction(SIGRT_1, {0x7fc712401860, [], SA_RESTORER|SA_RESTART|SA_SIGINFO, 0x7fc71240abb0}, NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [RTMIN RT_1], NULL, 8) = 0
getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM64_INFINITY}) = 0
statfs("/sys/fs/selinux", 0x7fff5c43f090) = -1 ENOENT (No such file or directory)
statfs("/selinux", 0x7fff5c43f090)      = -1 ENOENT (No such file or directory)
brk(0)                                  = 0x7fc71509c000
brk(0x7fc7150bd000)                     = 0x7fc7150bd000
open("/proc/filesystems", O_RDONLY)     = 3
fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fc713cb1000
read(3, "nodev\tsysfs\nnodev\trootfs\nnodev\tb"..., 1024) = 328
read(3, "", 1024)                       = 0
close(3)                                = 0
munmap(0x7fc713cb1000, 4096)            = 0
open("/dev/null", O_RDWR)               = 3
close(3)                                = 0
openat(AT_FDCWD, "/proc/13672/fd", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
getdents(3, /* 6 entries */, 32768)     = 144
getdents(3, /* 0 entries */, 32768)     = 0
close(3)                                = 0
getuid()                                = 1000
geteuid()                               = 1000
setresuid(-1, 1000, -1)                 = 0
socket(PF_LOCAL, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
connect(3, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
close(3)                                = 0
socket(PF_LOCAL, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
connect(3, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
close(3)                                = 0
open("/etc/nsswitch.conf", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=513, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fc713cb1000
read(3, "# /etc/nsswitch.conf\n#\n# Example"..., 4096) = 513
read(3, "", 4096)                       = 0
close(3)                                = 0
munmap(0x7fc713cb1000, 4096)            = 0
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=126104, ...}) = 0
mmap(NULL, 126104, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fc713c93000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/x86_64-linux-gnu/libnss_compat.so.2", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\260\23\0\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0644, st_size=35728, ...}) = 0
mmap(NULL, 2131288, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fc7116eb000
mprotect(0x7fc7116f3000, 2093056, PROT_NONE) = 0
mmap(0x7fc7118f2000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x7000) = 0x7fc7118f2000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/x86_64-linux-gnu/libnsl.so.1", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0`A\0\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0644, st_size=97296, ...}) = 0
mmap(NULL, 2202360, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fc7114d1000
mprotect(0x7fc7114e8000, 2093056, PROT_NONE) = 0
mmap(0x7fc7116e7000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x16000) = 0x7fc7116e7000
mmap(0x7fc7116e9000, 6904, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fc7116e9000
close(3)                                = 0
mprotect(0x7fc7116e7000, 4096, PROT_READ) = 0
mprotect(0x7fc7118f2000, 4096, PROT_READ) = 0
munmap(0x7fc713c93000, 126104)          = 0
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=126104, ...}) = 0
mmap(NULL, 126104, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fc713c93000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/x86_64-linux-gnu/libnss_nis.so.2", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\240!\0\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0644, st_size=47760, ...}) = 0
mmap(NULL, 2143616, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fc7112c5000
mprotect(0x7fc7112d0000, 2093056, PROT_NONE) = 0
mmap(0x7fc7114cf000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0xa000) = 0x7fc7114cf000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/x86_64-linux-gnu/libnss_files.so.2", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0@\"\0\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0644, st_size=52160, ...}) = 0
mmap(NULL, 2148504, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fc7110b8000
mprotect(0x7fc7110c4000, 2093056, PROT_NONE) = 0
mmap(0x7fc7112c3000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0xb000) = 0x7fc7112c3000
close(3)                                = 0
mprotect(0x7fc7112c3000, 4096, PROT_READ) = 0
mprotect(0x7fc7114cf000, 4096, PROT_READ) = 0
munmap(0x7fc713c93000, 126104)          = 0
open("/etc/passwd", O_RDONLY|O_CLOEXEC) = 3
lseek(3, 0, SEEK_CUR)                   = 0
fstat(3, {st_mode=S_IFREG|0644, st_size=1823, ...}) = 0
mmap(NULL, 1823, PROT_READ, MAP_SHARED, 3, 0) = 0x7fc713cb1000
lseek(3, 1823, SEEK_SET)                = 1823
munmap(0x7fc713cb1000, 1823)            = 0
close(3)                                = 0
umask(022)                              = 022
write(2, "usage: ssh [-1246AaCfgKkMNnqsTtV"..., 466usage: ssh [-1246AaCfgKkMNnqsTtVvXxYy] [-b bind_address] [-c cipher_spec]
           [-D [bind_address:]port] [-e escape_char] [-F configfile]
           [-I pkcs11] [-i identity_file]
           [-L [bind_address:]port:host:hostport]
           [-l login_name] [-m mac_spec] [-O ctl_cmd] [-o option] [-p port]
           [-R [bind_address:]port:host:hostport] [-S ctl_path]
           [-W host:port] [-w local_tun[:remote_tun]]
           [user@]hostname [command]
) = 466
exit_group(255)                         = ?
+++ exited with 255 +++

You can also generate a log with that info:

strace -o test_log.txt /usr/bin/ssh

Let’s pay attention to the open files:

Here we can see which files were opened, the mode and the result.

So if your program fails to open a certain file, you will see it in the trace.

Also we can review the access:

cat test_log.txt | grep access --after-context=2
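The same filtering can be sketched in Python, for example to list only the calls that failed. This is a minimal sketch, assuming the log has the plain format produced by strace -o as shown above; the function name failed_calls and the regular expression are my own illustration, not part of strace.

```python
import re

# Matches lines such as:
# access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
FAILED_CALL = re.compile(
    r'^(?P<syscall>\w+)\("(?P<path>[^"]*)".*\)\s+= -1 (?P<errno>\w+)'
)

def failed_calls(lines, syscalls=("open", "access", "stat")):
    """Yield (syscall, path, errno) for file-related calls that returned -1."""
    for line in lines:
        match = FAILED_CALL.match(line)
        if match and match.group("syscall") in syscalls:
            yield match.group("syscall"), match.group("path"), match.group("errno")

# Three lines taken from the trace above, used here as sample input.
sample = [
    'connect(3, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)',
    'access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)',
    'open("/etc/passwd", O_RDONLY|O_CLOEXEC) = 3',
]

for syscall, path, errno in failed_calls(sample):
    print(f"{syscall} failed with {errno}: {path}")
# prints: access failed with ENOENT: /etc/ld.so.nohwcap
```

To run it against the real log, pass an open file instead of the sample list, for example failed_calls(open("test_log.txt")).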


You can trace only a certain set of system calls by passing the parameter -e trace=open,close,read,write,stat,chmod,unlink, or -e trace=network, -e trace=process, -e trace=memory, -e trace=ipc, -e trace=signal, etcetera.

You can also dump data with -e read=set or -e write=set, for a full hexadecimal and ASCII dump of all the data read from or written to the file descriptors listed in the specified set… or filter signals with -e signal=set (default signal=all), even by negation: -e signal=!SIGIO (or signal=!io)…

You can also trace library calls with ltrace. Both tools are built on the ptrace system call.

And see the open files with lsof.


You can use lsof to see the TCP connections:

lsof -iTCP:80

You can also find out which process owns a TCP/UDP connection:

netstat -tnp

Take a look at ss for advanced socket inspection.

Of course you will find very interesting info in the /proc pseudo-filesystem.

You can troubleshoot the environment for the process by doing:

strings /proc/1714/environ

Where 1714 is the process ID of whatever process you want to inspect.

/proc/[pid]/fd/ is a subdirectory containing one entry for each file opened by the process, named by its file descriptor, each entry being a symbolic link to the actual file.

/proc/[pid]/fdinfo/ will show information on the flags for the access mode of the open files, and /proc/[pid]/io contains input/output statistics for the process.
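The same information can be read from a script. What follows is a minimal sketch, assuming a Linux system with /proc mounted and enough permissions to read the entries of the target process; the function names open_files and io_stats are my own, chosen for illustration.

```python
import os

def open_files(pid="self"):
    """Map each file descriptor of the process to the target of its symlink."""
    fd_dir = f"/proc/{pid}/fd"
    result = {}
    for name in os.listdir(fd_dir):
        try:
            result[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return result

def io_stats(pid="self"):
    """Parse /proc/[pid]/io into a dict of counters, e.g. rchar and wchar."""
    stats = {}
    with open(f"/proc/{pid}/io") as f:
        for line in f:
            key, _, value = line.partition(":")
            stats[key.strip()] = int(value)
    return stats

if os.path.isdir("/proc/self/fd"):  # only meaningful on Linux
    for fd, target in sorted(open_files().items()):
        print(fd, "->", target)
    try:
        print(io_stats())
    except OSError:
        print("/proc/self/io is not available on this kernel")
```

Pass a process ID such as 1714 instead of the default "self" to inspect another process.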


The Cloud is for Scaling

The Cloud is for Startups, and for Scaling. Nothing more.

In the future it will be used by phone operators to re-dimension their infrastructure and bandwidth in real time according to demand, but nowadays the Cloud is for Startups.

Examine the prices in my post on cmips, and take a look also at the performance of the different CPUs. You will see that, according to CMIPS v.1.03, a desktop processor Intel i7-4770S, worth USD $300, performs better than an Amazon M2 High Memory Quadruple Extra Large and than a Rackspace First Generation 30 GB RAM 8 Cores.

Today the public cost of an Amazon M2 High Memory Quadruple Extra Large running for a month is USD $1,180.80, so USD $1.64 per hour, and the Rackspace First Generation 30 GB RAM 8 Cores with 1200 GB of disk costs USD $1,425.60, so USD $1.98 per hour running.
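The hourly figures can be checked with a couple of lines. This is only a sketch, and it assumes a month of 30 days (720 hours), which is the value that matches the prices quoted above; the function name hourly_cost is mine.

```python
HOURS_PER_MONTH = 30 * 24  # 720 hours, assuming a 30-day month

def hourly_cost(monthly_cost):
    """Return the cost per hour in USD, rounded to cents."""
    return round(monthly_cost / HOURS_PER_MONTH, 2)

print(hourly_cost(1180.80))  # Amazon M2 High Memory Quadruple Extra Large: 1.64
print(hourly_cost(1425.60))  # Rackspace First Generation 30 GB RAM: 1.98
```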

And that’s the key, the cost per hour.

Because the greatness, the majesty of the Cloud is that you pay per hour, you pay as you need, or as you go. No binding contracts. All on demand.

I had my company at a time when the hosting companies and the Data Centers were forcing customers to sign yearly contracts. What if a company only needs to host its Servers for three months? What if it has to close? No options. You take it or you leave it.

Even renting a dedicated server was for at least a month, and what if the latency was not good? What if the bandwidth of the provider was not enough?

Amazon burst into the market with strength. I really like that company because they grew the best eCommerce company for buying books: they built a system that really worked and was able to recommend very useful computer books, and the delivery and logistics were so good, as was the post-sales service. They simply started to rent out the same infrastructure they were using to serve their millions of customers, and it was a total success.

And for a while few people knew about Amazon’s deep technologies and functionalities, but later it became a fashion.

Now people are using Amazon, or whatever provider or service contains the word “Cloud”, because the Cloud is on everyone’s lips. Magazines and newspapers speak about the Cloud, so many, many companies use it simply because everyone is talking about it. And those ISPs that didn’t have a Cloud have invested heavily to create one, just because they didn’t want to be the ones without a Cloud, since everyone was asking for it and all the other ISPs were offering their “Clouds”.

Every company claims to have a “Cloud”, when the only thing many of them have is VMware servers, Xen servers, OpenStack… running the tenants or instances of the customers always on the same host servers. No real Cloud, no professional Cloud with abstraction layers done in a professional way like Amazon, only the traditional “shared hosting” under another name, sharing CPU, RAM and disk storage using virtual machines called instances.

So the Cloud fashion has become a confusing craziness where no one knows why they are in the Cloud, but they believe they have to be in it.

But do companies need the Cloud? Cloud instances?

It depends. The best would be to ask those companies: why did you choose the Cloud?

If you compare the costs, having an instance in the Cloud is much, much more expensive than having a dedicated server. And for that high cost you don’t get more performance.

Virtualization is always slower, and disk speed is always an issue with Cloud providers, where all the data travels via the network from the NAS disk cabinets to the host servers running the guest instances. Data cannot be on local disks, since every time you start an instance the resources like CPU and RAM are provisioned again, and your instance may run on totally different hardware. Only your data remains in the NAS (Network Attached Storage).

So unless you run your in-the-Cloud instance with a special provider that offers local disks, like DigitalOcean, which offers SSD but with monthly payment (so you pay the price of losing the hardware abstraction capability, because you’re attached to the CPU that has the disk connected, and you also lose the flexibility of paying per hour of use, as you go), you’ll face a bottleneck: the hard disk performance, since in reality all the data comes from the NAS where it is stored, through the local network.

So what are the motivations to use the Cloud? I will try to give some examples; outside of these it does not make much sense, I think. You can send me your happy-in-Cloud scenarios if you have found other good uses.

Example A) Saving initial costs, avoiding binding contracts and growing easily on your own

Imagine a Developer who starts his own project. Maybe it works, maybe not, but instead of having a monthly contract for a dedicated server, he starts with an Amazon Free Tier (better not, use a Small instance at least) and runs a website. If it does not work, he simply stops the instance and pays no more. If the project works and has more and more users, he can re-dimension the server with a click: just stop the instance, change the type of instance, and start it again with more RAM and more CPU power. Fast.

Hiring a dedicated server implies at least monthly contracts, USD $100 per month on average, and it is not easy to move to a bigger server: it is not fast, and it is expensive, as it requires the ISP’s tech guys to migrate the data from one Server to another.

Also the available bandwidth has to be taken into consideration. Bandwidth is expensive and Amazon can offer 150 Mbit to smaller machines. Not all the Internet Service Providers can offer that bandwidth, even with their most advanced packages.

If the project still grows, with a click, in seconds, 20 instances with a lot of bandwidth can be deployed and serving traffic to your customers very quickly.

You save the initial costs of buying Servers and the time to deal with hardware and bandwidth limitations, and you avoid contracts, but you pay an hourly rate that is a lot more expensive. So in the long run using Amazon is much, much more expensive and less powerful than having dedicated servers. That happened to Zynga, which was paying $63M annually to Amazon and decided to step back from Amazon to their own Data Centers again. (another fortune tech link)

The limited CPU power was also a deal breaker for many companies that needed really powerful CPUs and gigs of RAM for their Database Servers. Now this situation is much better with the introduction of the new Servers.

This developer can benefit from doing backups with a click, cloning, starting instances from an image, having more static IPs with a click, deploying built-in (from the Cloud provider) load balancers, using monitoring services like CloudWatch, creating Volumes and attaching them to the servers for additional space…

Example B) A Startup with a fluctuating number of users and hopes of growing

Imagine a Startup with a wonderful Facebook Application.

During 80% of the day it has few visits and may only need 3 Servers, but during 20% of the hours of the day, from 10:00 to 15:00, users connect like hell, so they need 20 servers to attend this traffic and workload, and maybe tomorrow they will need 30 servers.

With the Cloud they pay for 3 servers 24 hours per day, and for the other 17 servers they only pay for the hours they are on, that’s 5 hours per day. Doing that they save money and they have an unlimited * amount of power. (* There are limits in reality: you have to specially request authorisation to run more than the default maximum number of servers for the zone, which is normally 20 instances for Amazon. Also, it can theoretically happen that when you request new instances the Zone has no instances available.)
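To put numbers on it, here is a small sketch that counts the instance-hours per day in both cases. It only compares hours, using the figures of this example (3 servers always on, 20 at peak, 5 peak hours); it does not take into account that the hourly rate in the Cloud is higher than that of a dedicated server. The function name daily_instance_hours is mine.

```python
def daily_instance_hours(base_servers, peak_servers, peak_hours):
    """Instance-hours per day when the extra servers only run at peak time."""
    extra_servers = peak_servers - base_servers
    return base_servers * 24 + extra_servers * peak_hours

on_demand = daily_instance_hours(base_servers=3, peak_servers=20, peak_hours=5)
always_on = 20 * 24  # keeping the 20 servers running all day

print(on_demand)  # 3*24 + 17*5 = 157
print(always_on)  # 480
print(f"{1 - on_demand / always_on:.0%} fewer instance-hours per day")
```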

So, for a growing Startup, avoiding hiring 20 dedicated servers and instead running in the Cloud as many as they need, for just the time they need, Auto-Scaling up and down, and being able to use the servers NOW and pay the next month with a Visa card, all of that can make a difference.

If the servers chosen are not powerful enough, that is solved with a click, changing the instance type. So fast. A minute.

It’s only a matter of money.

Example C) e-Learning companies and online universities

e-Learning platforms also benefit from Auto-Scaling for the peak occupation hours.

The built-in functionality of the Cloud to clone instances is very useful to deploy new web servers, or new environments for students doing practice exercises, in the case of teaching Information Technology subjects, where the users need to practice against a real server (Linux or Windows).

Those servers can be created and destroyed, cloned from the main ready-to-go template. Servers can also be scheduled to stop and start at a certain hour, saving the money of the hours not needed.

Example D) Digital agencies, sports and other events

When there is a special event, like a motorcycle race, when a Football Team scores, when there is a TV spot announcing a product…

At those moments the traffic to the site can multiply, so more servers and more bandwidth have to be deployed instantly. That cannot be done with physical servers, with hardware, but it is very easy to provision instances from the Cloud.

Mass mailing email campaigns can also benefit from creating new Servers when needed.

Example E) Proximity and SEO

Cloud providers have Data Centers everywhere. If you want to have servers in Asia, in South America or in Europe, or static content to be delivered faster… the Cloud providers have plenty of Data Centers all over the world.

Example F) Game aficionados and friends sharing content

People who love cooperative games can find the bandwidth they hunger for at a moderate price. If they run their private server a few hours at night, from 22:00 to 01:00 for example, they will benefit from the great bandwidth of a big Cloud provider and pay only for 3 hours per day (the excess traffic is usually charged by most providers, but the price per additional GB is usually really, really competitive).

Friends sharing content over FTP can also benefit from these Cloud servers, but they will probably find it easier to use services like Dropbox.

Example G) Startup serving content

A Startup serving videos, images, or books can benefit not only from the great bandwidth of the big Cloud providers (this has been covered before), but also from a very cheap price per additional Gigabyte transferred.

A local ISP can’t offer 150 Mbit for an instance of USD $20, and USD $0.12 per additional GB transferred.

Many Cloud providers also allow unlimited incoming traffic from the Internet, and from Server to Server through private IPs.

Other cases

For other cases Dedicated Servers are much more powerful, faster and cheaper, at the price of being “static” in the sense of attached, not abstracted in layers. All the aspects of your Project have to be taken into account before deciding to step into or out of the Cloud.

In general terms I would say that the Cloud is for Scaling.

NAS and Gigabit

I’ve found this problem in several companies, and I’ve had to show their error to experienced SysAdmins, CTOs and CEOs and convince them that their approach was wrong. Many of them made heavy investments in NAS that they are really wasting, while getting very poor performance.

Normally rack servers have their local disks, but for professional solutions, like virtual machines, blade servers, and hundreds of servers, the local disks are not used.

NAS (Network Attached Storage) Servers are used instead.

These NAS Servers, when they are powerful (and expensive), offer very interesting features like hot backups, hot backups that do not slow down the system (the most advanced ones), hot disk replacement and hot increase of the total available space; the Enterprise solutions can replicate and copy data between different NAS in different countries, etc…

Smaller NAS are also used in configurations like Webserver farms, where all the nodes have to have the same information replicated: for example, when a user uploads a new profile image, it has to be available to all the webservers.

In these configurations servers save and retrieve the needed data from the NAS Servers through the LAN (Local Area Network).

The main error I have seen is that no one ever considers the pipe through which all the data is travelling, so most configurations are simply Gigabit, and so they are a bottleneck.

Imagine a Dell blade server enclosure, like the one in the image on the left.

This enclosure hosts 16 hot-pluggable servers, with up to two CPUs per blade. We also call those blade servers “pizzas” (like we called the rack servers before).

A common use is to use those servers to have Vmware, OpenStack, Xen or other virtualization software, so the servers run instances of customers. In this scenario the virtual disks (the hard disk of the virtual machines) are stored in the NAS Server.

So if a customer shuts down his virtual server and starts it later, the physical server where his virtual machine runs will be a different one, but the data (the disk of the virtual server) is stored in the NAS, and all the data is saved to and retrieved from the NAS.

The enclosure is connected to the NAS through a Gigabit connection, as 10 Gigabit connections are still too expensive and not yet supported in many servers.

Once we have explained that, imagine those 16 servers, each with 4 or 5 virtual machines, accessing their disks through a Gigabit connection.

If only one of these 80 virtual machines is accessing the disk, there will be no problem, but if more than one is accessing it, the Gigabit connection, that’s a maximum of 125 MB (Megabytes) per second, will be shared among all the virtual machines.

So imagine: 70 virtual machines are accessing the NAS to serve web pages, with not much traffic, OK, but the other 10 virtual machines are doing heavy data transmission: for example one is serving data through an FTP server, another is broadcasting video, another is copying heavy log files, and so on… Imagine that scenario.

The 125 MB per second is divided between the 80 servers, so those 10 servers using the disk extensively will monopolize the bandwidth, but even those 10 servers will have around 12.5 MB per second each, that is 100 Mbit each, and that is very slow.
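The conversion behind those numbers is simple. As a minimal sketch, using decimal units (1 Gbit = 1000 Mbit, 8 bits per byte) and ignoring for now the traffic of the 70 light virtual machines:

```python
def gbit_to_mb_per_s(gbit):
    """Convert a link speed in Gbit/s to MB/s (decimal units, 8 bits per byte)."""
    return gbit * 1000 / 8

link_mb_s = gbit_to_mb_per_s(1)  # theoretical maximum of a Gigabit link
per_heavy_vm = link_mb_s / 10    # shared by the 10 heavy virtual machines

print(link_mb_s)         # 125.0 MB/s
print(per_heavy_vm)      # 12.5 MB/s each
print(per_heavy_vm * 8)  # 100.0 Mbit/s each
```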

Imagine one of the virtual machines broadcasts video. To broadcast video, first it has to get it from the NAS (the chunks of data), so this node serving video will only be able to serve different videos to a few customers, as the network will not provide more than 12.5 MB per second under these circumstances.

This is a simplified scenario, as many other things have to be taken into account: SATA, SCSI and SAS disks do not provide sustained speeds, as the speed depends on locating the info, fragmentation, etc… It also has to be considered that the NAS uses the iSCSI protocol, a sort of SCSI commands sent over the Ethernet. And TCP/IP uses verifications in its protocol, and protocol headers. That is also an overhead. I’ve considered only traffic in one direction, the servers downloading from the NAS, assuming Gigabit full duplex, so Gigabit for sending and Gigabit for receiving.

So instead of 125 MB per second we have available around 100 MB per second with a Gigabit, or even less.

Also the virtualization servers try to handle the disk access a bit better, by keeping a cache in memory and not writing immediately to disk.

So you can’t do dd tests in virtual machines like you would do on any Linux with local disks, and if you do, go for big files, like 10 GB, with random data (not just zeros, as they have optimizations for that).
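As an illustration of that kind of test, here is a rough sketch in Python of a dd-like write test with random data. It is not a replacement for dd or for a proper benchmark tool: the function name write_throughput_mb_s and its parameters are my own, the demo size is kept tiny on purpose, and for a meaningful number you should write several GB so the caches cannot hide the real speed.

```python
import os
import tempfile
import time

def write_throughput_mb_s(directory, total_mb=64, chunk_mb=4):
    """Write random data to a temporary file and return the measured MB/s."""
    chunk_size = chunk_mb * 1024 * 1024
    target = total_mb * 1024 * 1024
    written = 0
    elapsed = 0.0
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            while written < target:
                # Random bytes, so the storage cannot optimize them like zeros.
                chunk = os.urandom(chunk_size)
                start = time.perf_counter()  # time only the write, not urandom
                f.write(chunk)
                elapsed += time.perf_counter() - start
                written += chunk_size
            start = time.perf_counter()
            f.flush()
            os.fsync(f.fileno())  # force the data out of the page cache
            elapsed += time.perf_counter() - start
    finally:
        os.remove(path)
    return (written / (1024 * 1024)) / elapsed

# Small demo run in the system temporary directory.
print(f"{write_throughput_mb_s(tempfile.gettempdir(), total_mb=16):.1f} MB/s")
```

Point the directory argument at the mounted volume you want to measure, and raise total_mb to something like 10240 for a 10 GB test.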

Let’s recalculate it now:

70 virtual machines using as low as 0.10 MB/second each, that’s 7 MB/second. That’s really optimistic, as most webservers running PHP read many big files to attend a simple request, and webservers serve a lot of big images.

10 virtual machines using extensively the NAS, so sharing 100 MB – 7 MB = 93 MB. That is 9.3 MB each.

So under these circumstances for a virtual machine trying to read from disk a file of 1 GB (1000 MB), this operation will take 107 seconds, so 1:47 minutes.
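The recalculation can be written down as a short sketch. All the inputs are the estimates of this article (100 MB per second usable, 70 light virtual machines at 0.10 MB per second, 10 heavy ones sharing the rest); the variable names are mine.

```python
usable_mb_s = 100.0      # Gigabit after protocol overhead (estimate from the text)
light_vms = 70
light_rate_mb_s = 0.10   # optimistic usage of each web virtual machine
heavy_vms = 10

light_total = light_vms * light_rate_mb_s               # about 7 MB/s
per_heavy_vm = (usable_mb_s - light_total) / heavy_vms  # about 9.3 MB/s
seconds_for_1gb = 1000 / per_heavy_vm                   # reading a 1000 MB file

print(f"{per_heavy_vm:.1f} MB/s per heavy virtual machine")
print(f"{seconds_for_1gb:.1f} seconds to read 1 GB")  # about 107.5, so around 1:47
```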

So with these considerations in mind, you can imagine that the performance of the virtual machines under those configurations is left to luck: the luck that none of the other guests on the servers is abusing the disk I/O.

I’ve explained it to you on a theoretical level. Sadly, reality is worse. A lot worse. Those 70 virtual machines with webservers will be so slow that they will leave your company very disappointed, and the other 10 will not be any happier.

One of the principal problems of Amazon EC2 has always been disk performance. A few months ago they released Provisioned IOPS, high performance disks that are more expensive, but faster.

It has to be recognized that at Amazon they are always improving.

They also have connections between your servers at 10 Gbit/second.

Returning to the Blades and the NAS, an easy improvement is to aggregate two Gigabit links, creating a connection of 2 Gbit. This helps a bit. It is not the solution, but it helps.

Probably different physical servers with few virtual machines each and a dedicated 1 Gbit connection (or 2 Gbit by 1+1 aggregated, if possible) to the NAS, using local disks as much as possible, would be much better (harder to maintain at big scale, but with much, much better performance).

But if you provide Infrastructure as a Service (IaaS), go with 2 x 10 Gbit Fibre aggregated, so 20 Gigabit, or better, aggregate 2 x 20 Gbit Fibre. It’s expensive, but crucial.

Now compare the 9.3 MB per second, or even the theoretical 125 MB per second of Gigabit, with the average real sequential read of 50 MB/second that a SATA disk can offer when connected locally, or nearly double that for modern SAS 15,000 rpm disks… (writing is always slower)

… and the 550 MB/s for reading and 550 MB/s for writing that some SSD disks offer when connected locally. (I own two OCZ SSD disks that perform at 550 MB/s.)

I’ve also seen better configurations for local disks, like a good disk controller with RAID 5 and SSD disks. With my dd tests I got more than 900 MB per second for writing!
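To make the comparison easier to see, this sketch computes how long it takes to move a file of 1 GB (1000 MB) at each of the speeds mentioned in this article. The labels and the function name seconds_to_transfer are mine, and the speeds are the rough figures quoted above, not new measurements.

```python
def seconds_to_transfer(size_mb, mb_per_s):
    """Time in seconds to move size_mb at a sustained rate of mb_per_s."""
    return size_mb / mb_per_s

throughputs_mb_s = {
    "heavy VM on shared Gigabit NAS": 9.3,
    "local SATA disk": 50,
    "Gigabit, theoretical": 125,
    "local SSD": 550,
    "RAID 5 with SSD disks (my dd write test)": 900,
}

for name, rate in throughputs_mb_s.items():
    print(f"{name}: {seconds_to_transfer(1000, rate):.1f} s")
```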

So if you are going to spend 30,000 € on your NAS with SATA disks (a really bad solution, as SATA is domestic technology not designed to work 24×7 and not even fast) or SAS disks, and 30,000 € more on your blade servers, think very carefully about what you need and what configuration you will use. Contact experts, but real experts, not supposed experts.

Otherwise you’ll waste your money and your customers will get very, very poor performance in these times when applications on the Internet demand more and more performance.