Category Archives: Cloud providers

My Sites are under Maintenance

2021-11-08 Update: There is a Postmortem analysis of what happened with Amazon here.

TL;DR: All my sites are undergoing maintenance.

The main reason was that I was getting unexpected API Exceptions on the AWS SDK for Python (boto3), so I connected to the AWS Console to get more information.

Then I saw a message indicating that they would stop EC2-Classic today, the 30th of October. (Please read the Update on the Postmortem analysis, as I misunderstood that banner message.)

I had already started migrating my Services: some I moved to other providers like Digital Ocean, others I had planned to keep in Amazon.

EOL (End of Life) was scheduled for August 2022, so when I saw the message from Amazon on the evening of the 29th, I decided to migrate my EC2-Classic public IPs and Compute to VPC. When I tried to deploy from an AMI, the Amazon APIs returned many internal errors, and as I figured out where their failures were, I was able to get instances launched without them being terminated immediately and without explanation. Still, I had many problems with the Internet Gateway, VPC NAT, etc… After hours fighting with their errors and with their console, which is more a bunch of pages to manage Infrastructure than a user/developer friendly Cloud Tool, I decided that I had had enough.

After 11 years using Amazon AWS, including a trip to Dublin to be hired as Manager for CloudWatch, and giving them the idea to add AutoScaling (I was told the project was too easy for me and that I would get bored in a year or two, so I was not hired), I decided to move my Services to Google Cloud and to Digital Ocean.

I’m very polite, and I saw that when I told one Manager that the User Interface was terrible he didn’t like it, but I have to speak up and say that tools for developers cannot be as cold as your evil girlfriend. They cannot be API-like, stand-alone pages to manage infinite parts of the Architecture. Web services for developers cannot be built in a cold SysAdmin style. If the infrastructure is hard to manage and internally you use APIs, build nice Wizards in JavaScript. I was leading a Team of Developers with infinitely fewer resources than Amazon or Google, and we wrote a Multi-Cloud product with nice, clever and easy to use Wizards that were infinitely better than those of the giant CSPs. We won a prize at European level at that time. But it was 2013.

I’ve migrated everything, moved all the data, statics, VMs… but I’m still completing the adjustments for certain services like Cassandra nodes and web sites, bootstrapping some of my sites based on my PHP Catalonia Framework, adding Firewall rules to GCP, making changes to the Ansible provisioning, deploying the Server scripts from IaC, Docker, etc…

I’ll be posting updates on Twitter.

News from the blog 2021-09-20

  • I’ve published a very simple game, Tic Tac Toe, that I created for my Python 3 Exercises for Beginners book.
  • I’ve raised back the price for my books to normal levels.
    I’ve kept the price at the minimum to help people who wanted to learn during covid-19. I consider that whoever wanted to learn has already done so.

I still have bundles with a somewhat reduced price, and I authorized the LeanPub platform to offer discounts of up to 50% at their discretion.

Bundle of four books in https://leanpub.com/b/python3-exercises-zfs-assemble-computer

  • I’ve been deleting AMIs, Snapshots, Volumes and backups from Amazon instances I’ll no longer use.

I’ve migrated to Docker some sites and WordPress sites and now I’m CSP (Cloud Service Provider) agnostic. I can deploy wherever I want.

We pay per GB of storage used, so my money will be put to better use.

As I said in my old article from 2013, The Cloud is for Scaling. For Startups and for Enterprises. It is too expensive for small and medium companies.

  • For those studying Python there is a Virtual Meetup about Data Analysis, in Spanish, on the 23rd of September

https://www.meetup.com/tech-barcelona/events/280791310/

More meetups:

https://www.meetup.com/tech-barcelona/

Migrating some Services from Amazon to Digital Ocean

Analyzing the needs

I start with a VM, to learn about the providers and the migration project as I go.

My VM has been running in Amazon AWS for years.

It has 3.5GB of RAM and 1 Core. However, it uses only 580MB of RAM. I’m paying around $85/month for this with Amazon.

I need to migrate:

  • DNS Server
  • Email
  • Web
  • Database

I don’t need the DNS Server anymore: each Domain provider includes DNS Service for free, so I no longer need to run my two DNS servers.

For the email I find myself in the same scenario: most providers offer 3 email accounts for your domain, and some aliases, for free.

I’ll run the Service as Docker in the new CSP, so I will make it work locally on my computer first; that way I can also move it easily in the future.

Note: exporting big images is not the way I have in mind to make backups.

I find a Digital Ocean droplet with 1GB of RAM, 1 core and SSD disks for $5; for $6 I can have an NVMe version, which is the one I choose.

Disk Space for the Statics

The first thing I do is to analyze the disk space needs of the service.

In this old AWS CentOS based image I have:

[root@ip-10-xxx-yyy-zzz ec2-user]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda1       79G   11G   69G  14% /
devtmpfs        1.8G   12K  1.8G   1% /dev
tmpfs           1.8G     0  1.8G   0% /dev/shm

OK, so if I keep everything I have, I need 11GB.

I have plenty of space on this server so I do a zip of all the contents of the blog:

cd /var/www/wordpress
zip -r /home/ec2-user/wp_siteZ.zip wp_siteZ

Database dump

I need a dump of the databases I want to migrate.

I check what databases are in this Server.

mysql -u root -p

mysql> show databases;

I do a dump of the databases that I want:

sudo mysqldump --password='XXXXXXXX' --databases wp_mysiteZ > wp_mysiteZ.sql

I get an error, meaning MySQL needs repair:

mysqldump: Got error: 145: Table './wp_mysiteZ/wp_visitor_maps_wo' is marked as crashed and should be repaired when using LOCK TABLES

So I launch a repair:

sudo mysqlcheck --password='XXXXXXXX' --repair --all-databases

After that, the dump works.

My dump takes 88MB, not much, but I compress it with gzip.

gzip wp_mysiteZ.sql

It takes only 15MB compressed.

Do not forget the parameter --databases even if only one database is exported, otherwise the CREATE DATABASE and USE `wp_mysiteZ`; statements will not be added to your dump.

I will need to take some data from the mysql database, referring to the user used for accessing the blog’s database.

I always keep the CREATE USER and GRANT statements; if you don’t, check the wp-config.php file. Note that the SQL syntax to create users and grant permissions may differ from one MySQL version to another.

I create a file named mysql.sql with this part and I compress it with gzip.

Checking PHP version

php -v
PHP 7.3.23 (cli) (built: Oct 21 2020 20:24:49) ( NTS )
Copyright (c) 1997-2018 The PHP Group
Zend Engine v3.3.23, Copyright (c) 1998-2018 Zend Technologies

WordPress is up to date, and PHP is not that old.

The new Ubuntu 20.04 LTS comes with PHP 7.4. It will work:

php -v
PHP 7.4.3 (cli) (built: Jul  5 2021 15:13:35) ( NTS )
Copyright (c) The PHP Group
Zend Engine v3.4.0, Copyright (c) Zend Technologies
    with Zend OPcache v7.4.3, Copyright (c), by Zend Technologies

The Dockerfile

FROM ubuntu:20.04

MAINTAINER Carles Mateo

ARG DEBIAN_FRONTEND=noninteractive

# RUN echo "nameserver 8.8.8.8" > /etc/resolv.conf

RUN echo "Europe/Dublin" | tee /etc/timezone

# Note: You should install everything in a single line concatenated with
#       && and finalizing with 
# apt autoremove && apt clean

#       In order to use the less space possible, as every command 
#       is a layer

RUN apt update && apt install -y apache2 ntpdate libapache2-mod-php7.4 mysql-server php7.4-mysql php-dev libmcrypt-dev php-pear git less zip vim mc && apt autoremove -y && apt clean

RUN a2enmod rewrite

RUN mkdir -p /www

# If you want to activate Debug
# RUN sed -i "s/display_errors = Off/display_errors = On/" /etc/php/7.4/apache2/php.ini 
# RUN sed -i "s/error_reporting = E_ALL & ~E_DEPRECATED & ~E_STRICT/error_reporting = E_ALL/" /etc/php/7.4/apache2/php.ini 
# RUN sed -i "s/display_startup_errors = Off/display_startup_errors = On/" /etc/php/7.4/apache2/php.ini 
# To Debug remember to change:
# config/{production.php|preproduction.php|devel.php|docker.php} 
# in order to avoid Error Reporting being set to 0.

ENV PATH_WP_MYSITEZ /var/www/wordpress/wp_mysitez/
ENV PATH_WORDPRESS_SITES /var/www/wordpress/

ENV APACHE_RUN_USER  www-data
ENV APACHE_RUN_GROUP www-data
ENV APACHE_LOG_DIR   /var/log/apache2
ENV APACHE_PID_FILE  /var/run/apache2/apache2.pid
ENV APACHE_RUN_DIR   /var/run/apache2
ENV APACHE_LOCK_DIR  /var/lock/apache2

RUN mkdir -p $APACHE_RUN_DIR
RUN mkdir -p $APACHE_LOCK_DIR
RUN mkdir -p $APACHE_LOG_DIR
RUN mkdir -p $PATH_WP_MYSITEZ

# Remove the default Server
RUN sed -i '/<Directory \/var\/www\/>/,/<\/Directory>/{/<\/Directory>/ s/.*/# var-www commented/; t; d}' /etc/apache2/apache2.conf 

RUN rm /etc/apache2/sites-enabled/000-default.conf

COPY wp_mysitez.conf /etc/apache2/sites-available/

RUN chown --recursive $APACHE_RUN_USER:$APACHE_RUN_GROUP $PATH_WP_MYSITEZ

RUN ln -s /etc/apache2/sites-available/wp_mysitez.conf /etc/apache2/sites-enabled/

# Please note: It would be better to git clone from another location and
# gunzip and delete temporary files in the same line, 
# to save space in the layer.
COPY *.sql.gz /tmp/

RUN gunzip /tmp/*.sql.gz; echo "Starting MySQL"; service mysql start && mysql -u root < /tmp/wp_mysitez.sql && mysql -u root < /tmp/mysql.sql; rm -f /tmp/*.sql; rm -f /tmp/*.gz
# After this root will have password assigned

COPY *.zip /tmp/

COPY services_up.sh $PATH_WORDPRESS_SITES

RUN echo "Unzipping..."; cd /var/www/wordpress/; unzip /tmp/*.zip; rm /tmp/*.zip

RUN chown --recursive $APACHE_RUN_USER:$APACHE_RUN_GROUP $PATH_WP_MYSITEZ

EXPOSE 80

CMD ["/var/www/wordpress/services_up.sh"]

Services up

For starting MySQL and Apache I rely on the services_up.sh script.

#!/bin/bash
echo "Starting MySql"
service mysql start

echo "Starting Apache"
service apache2 start
# /usr/sbin/apache2 -D FOREGROUND

while [ true ];
do
    ps ax | grep mysql | grep -v "grep "
    if [ $? -gt 0 ];
    then
        service mysql start
    fi
    sleep 10
done

As you can see, instead of launching apache2 in the FOREGROUND, what keeps my Container from exiting is a while [ true ]; loop that keeps checking whether MySQL is up, and restarts it otherwise.
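The same keep-alive idea can be sketched in Python. This is only an illustration, not the script used in the Container: the names watchdog, is_running and restart are hypothetical, and the checks are passed in as callables so the loop can be tried without a real MySQL.

```python
import time


def watchdog(is_running, restart, interval=10.0, max_loops=None):
    """Call restart() whenever is_running() returns False.

    max_loops=None loops forever, like while [ true ]; a number is
    only useful for trying the loop out. Returns the restart count.
    """
    restarts = 0
    loops = 0
    while max_loops is None or loops < max_loops:
        if not is_running():
            restart()
            restarts += 1
        loops += 1
        time.sleep(interval)
    return restarts


if __name__ == "__main__":
    # Simulated service: it is down on the second check only.
    states = iter([True, False, True])
    count = watchdog(lambda: next(states), lambda: print("restarting"),
                     interval=0, max_loops=3)
    print("restarts:", count)
```

In a real container the two callables would wrap whatever check and restart commands you trust; the loop itself stays the same.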

MySQL shutting down

Some of my sites receive DoS attacks. More than attempts to shut down my sites, they are spammers trying to publish comments advertising fake glasses, medicines for impotence, etc… Some also try to hack into the Server to gain control of it, with dictionary attacks or by trying to exploit vulnerabilities.

The downside of those attacks is that sometimes the Database is under pressure, and uses more and more memory until it crashes.

More memory alleviates the problem and buys time, but I decided not to invest more than $6 USD per month on this old site. I’m just keeping the contents alive, and even so this site still receives many visits. A restart of MySQL if it dies is enough for me.

As you have seen in my Dockerfile I only have one Docker Container that runs both Apache and MySQL. One of the advantages of doing it like that is that if MySQL dies, the container does not exit. However, I could have had two containers, each with its own script using the while [ true ]; loop.

When planning I decided to have just one single Container, all-in-one, as when I export the image for a Backup, I’ll be dealing only with a single image, not two.

Building and Running the Container

I created a Bash script named build_docker.sh that does the build for me, stopping and cleaning previous Containers:

#!/bin/bash

# Execute with sudo

s_DOCKER_IMAGE_NAME="wp_mysitez"

printf "Stopping old container %s\n" "${s_DOCKER_IMAGE_NAME}"
sudo docker stop "${s_DOCKER_IMAGE_NAME}"

printf "Removing old container %s\n" "${s_DOCKER_IMAGE_NAME}"
sudo docker rm "${s_DOCKER_IMAGE_NAME}"

printf "Creating Docker Image %s\n" "${s_DOCKER_IMAGE_NAME}"
# sudo docker build -t "${s_DOCKER_IMAGE_NAME}" . --no-cache
sudo docker build -t "${s_DOCKER_IMAGE_NAME}" .

i_EXIT_CODE=$?
# If the build failed, stop here and propagate its exit code
if [ $i_EXIT_CODE -ne 0 ]; then
    printf "Error. Exit code %s\n" "${i_EXIT_CODE}"
    exit $i_EXIT_CODE
fi

echo "Ready to run ${s_DOCKER_IMAGE_NAME} Docker Container"
echo "To run type: sudo docker run -d -p 80:80 --name ${s_DOCKER_IMAGE_NAME} ${s_DOCKER_IMAGE_NAME}"
echo "or just use run_in_docker.sh"
echo
echo "Debug running Docker:"
echo "docker exec -it ${s_DOCKER_IMAGE_NAME} /bin/bash"
echo

I give the image and the running Container the same name.

Running in Production

Once it works locally, I set the Firewall rules and deploy the Droplet (VM) with Digital Ocean, upload the files via SFTP, and then just run my script build_docker.sh

And assuming everything went well, I run it:

sudo docker run -d -p 80:80 --name wp_mysitez wp_mysitez

I check that the page works, and here we go.

Some improvements

This could also have been put in a private Git repository. You just have to take care not to store passwords in it (like the MySQL grants).

It may be interesting for you to disable directory browsing.

The build from the Git repository can be validated with Jenkins. Here you have an article about setting up Jenkins for yourself.

News from the blog 2020-09-21

  • I have benchmarked three different CPUs and two Compute optimized Amazon AWS instances with CMIPS 1.0.5 64bit. Each of the two Intel Xeon baremetals is equipped with 2 x Intel Xeon processors, and the third machine, a desktop computer, with a single Intel Core i7-7800X:

If you’re surprised by the number of cores reported by the Amazon instance m5d.24xlarge, and even more by the baremetal c5n.metal, you’re guessing right: this comes from the Servers having 4 CPUs in the Compute Optimized series.

CMIPS Score | Execution time (seconds) | Type of instance | Total cores | CPU model seen by Linux
58536 | 34.16 | Amazon AWS m5d.24xlarge | 96 | 4 x Intel(R) Xeon(R) Platinum 8175M CPU @ 2.50GHz
54169 | 36.92 | Amazon AWS c5n.metal | 72 | 4 x Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz
26329 | 75.96 | Baremetal | 48 | 2 x Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz
21732 | 92.02 | Baremetal | 40 | 2 x Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz
9810 | 203.87 | Desktop computer | 12 | Intel(R) Core(TM) i7-7800X CPU @ 3.50 GHz

  • I can recommend these courses on Linux Academy:

https://linuxacademy.com/cp/library/catalog/view/DevOpsCourses

I’m finishing the 24-hour-long Implementing a Full CI/CD Pipeline:

https://linuxacademy.com/cp/modules/view/id/218

  • When I can choose I use Linux, but in many companies I work with Windows workstations. I’ve published a list of useful Software I use in all my Windows workstations.
  • WFH I currently use two external monitors attached to the laptop. I planned to add a new one using a DisplayPort connected to the Dell USB-C dongle that also provides me Ethernet and one additional HDMI. I got the cable from Amazon but unfortunately something is not working. In order to make myself comfortable and see some of the graphs of the systems worldwide, as I have on the office’s displays, I created a small HTML page that joins several monitoring pages into one single web page using frames.
    This way I only have one page loaded in the browser, maximized, and this monitor is dedicated to those graphs of the stats of the Systems.
    Something very simple, but very useful. You can extend the number of columns and rows to have more graphs on the same screen.
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
   "http://www.w3.org/TR/html4/frameset.dtd">
<HTML>
<HEAD>
<TITLE>Casa Monitor</TITLE>
</HEAD>
<FRAMESET cols="50%,50%">
  <FRAMESET rows="50%,*">
      <FRAME src="http://players-all-games/">
      <FRAME src="http://monthly-graphs/">
  </FRAMESET>
  <FRAMESET rows="50%,*">
	  <FRAME src="http://grafana/databases/">
	  <FRAME src="http://kibana/clusters/">
  </FRAMESET>
</FRAMESET>
</HTML>

If you don’t have the space or the resources for more monitors you can use your ingenuity.

I have a cheap HDMI switch that allows me to do PinP (Picture in Picture) with one main source on the monitor, and two more using a fraction of their original space. It may allow you to spot variations in the graphs.

And if you have only a single monitor, you can use a Chrome extension that rotates tabs, which is also very useful.

Be careful if you use the reload features with software like Jira or Confluence. If they are slow normally, imagine if you make it worse by reloading every 30 seconds… I discourage you from using auto refresh on this kind of software.

My laptop and my Xbox One controller

This past week I connected the Xbox One X Controller to the Windows laptop for the first time. Normally I use the PC only for strategy games, but I wanted to play other games like Lost Planet 3, or Fall Guys, console style. I figured that it would be very easy and it was. You turn on the controller, press the connect button like you do to pair with the console, and in Windows choose to pair an Xbox One controller. That’s it.

  • I’ve also updated my Python 3 Combat Guide, to add the explanation, step by step, of how to refactor spaghetti code, make it resilient, add Unit Testing to it, and turn it into modern OOP. It is currently 255 DIN-A4 pages.
  • This is something I wanted to share with you for a while.
    One of the funniest things in my career is what I call:
    Squirrel Strikes Back

I named this after the first incident in which a provider said that the reason for a fiber failure was a squirrel chewing the cable.

I popularized this among my friends in Systems Administration and SRE, and when they suffer a Squirrel Attack incident they forward it to me, to my great joy.

I’m used to construction work, or gas, water, electricity and highway repair operations in the cities accidentally cutting fiber cables, to lightning, to truck accidents on the highway breaking the ground and cutting ducts, and issues like that. I’ve been seeing that for around 25 years.

So the first time I saw a provider referring to a squirrel cutting the cables it was pretty hilarious. :)

In my funny mental picture I could imagine a cable laid in the middle of the forest, over the trees, and a squirrel chewing it as if it tasted like peanuts. :) Or a shark cutting one of Google’s or Facebook’s intercontinental cables, laid without any protection. ;)

A sense of humor and good vibes are two of the most important things in life.

How to recover access to your Amazon AWS EC2 instance if you lose your Private Key for SSH

This article covers the desperate situation where you have launched one or more instances, instructed Amazon to use an SSH Key Pair for which only you have the Private Key, your instances have been running for months (for example, an eCommerce site), and then you lose your Private Key (.pem file), and with it the SSH access to your instances’ Data.

Actually I’ve seen this situation happening several times, in actual companies. Mainly Start ups. And I solved it for them.

Assuming that you didn’t have a secondary method to access, which is another combination of username/password or other user/KeyPairs, and so you completely lost the access to the Database, the Webservers, etc… I’m going to show you how to recover the data.

For this article I will consider a scenario where there is only one Instance, which contains everything for your eCommerce: Webserver, code, and Database… and has a simple config, with a single persistent drive.

Warning: be very careful, because if you use ephemeral drives, their contents will be lost if you power off the instance.

Method 1: Quicker, launching a new instance from the previous

Step 1: The first step you will take is to close the access from outside, using the Firewall, to avoid any new changes going to the disk. You can allow access to the instance only from your static IP in the office/home.

Step 2: You’ll wait for 5 minutes to allow any transaction going on to conclude, and pending writes to be flushed to disk.

Step 3: From the Amazon AWS Console, EC2, you’ll request a Snapshot. This step is for extra security. Taking a Snapshot of a live, mounted filesystem is not the best of ideas, especially for a Database, but we are facing a desperate situation, so we’re increasing the chances of getting out of it without Data loss. If everything goes well, in the end you will not need this snapshot.

Make sure you select No reboot.

Step 4: Be very careful if you have extra drives and ephemeral drives.

Step 5: Wait till the Snapshot completes.

Step 6: Then request a graceful poweroff. Amazon will try to poweroff the Server in a gentle way. This may take two minutes.

Step 7: When the instance is powered off, request a new Snapshot. This is the one we really want. The other was just to be safer. If you feel confident you can just uncheck No Reboot in Step 3 and take only one Snapshot.

Step 8: Wait till the Snapshot completes.

Step 9: Generate and upload the new key you will use to AWS Console, or ask Amazon to generate a key pair for you. You can do it while creating the new instance through the wizard.

Step 10: Launch a new instance, based on your snapshot AMI. This will generate a copy of your previous instance (using the Snapshot) for the new one. Select the new Key pair. Finish assigning the Security groups, the elastic ip…

Step 11: Start the new instance. You can select a different flavor, like a more powerful instance, if you prefer. (scale vertically)

Step 12: Test your access by logging in via SSH with the new key pair, from your static IP which has access in the Firewall.

ssh -i /home/carles/Desktop/Data/keys/carles-ecommerce.pem ubuntu@54.208.225.14

Step 13: Check that the web starts correctly, and check the Database logs to see if there is any corruption. There should not be any if the graceful shutdown went well.

Step 14: Reopen the access from the Firewall, so the world can connect to your instance.

Method 2: Slower, access the Data and rebuild whatever you need

The second method is exactly the same up to and including Step 6.

Step 7: After this, you will create a new instance based on your favorite OS, with a new pair of Keys.

Step 8: You’ll detach the Volume from the previous eCommerce instance (the one you lost access to).

Step 9: You’ll attach the Volume to the new instance.

Step 10: You’ll have access to the Data from the previous instance on the attached Volume. Type cat /proc/partitions or df -h to see the mountpoints available. You can then download or back up the Data, or install the Software again and import the Database…

Step 11: Check that everything works, and enable the access worldwide to the Web in the Firewall (Security Group Inbound Rules).

If you are confident enough, you can use this method to upgrade the OS or base Software of your instance, making it part of your maintenance window. For example, to get the latest version of Ubuntu or CentOS, MySQL, Python or PHP, etc…

The Ethernet standards group announces a new 800 GbE specification

Here is the link to the news: https://www.pcgamer.com/amp/the-ethernet-standards-group-developed-a-new-speed-so-fast-it-had-to-change-its-name/

This is great news for scaling performance in the Data Centers. For routers, switches…

And this makes me think about all the Architects that are using Memcached and Redis on separate Servers, in Networks of 1Gbps, and makes me want to share with you why that is often nonsense.

The idea of having Memcached or Redis is just to cache the queries and offload those queries from the Database.

But 1Gbps is equivalent to 125MB (Megabytes) per second.

Local RAM Memory in Servers can perform at 24GB (24,000 Megabytes) per second, and even more.

A PCIE NVMe drive at 3.5GB per second.

A local SSD drive without RAID 550 MB/s.

A SSD in the Cloud, varies a lot on the provider, number of drives, etc… but I’ve seen between 200 MB/s and 2.5GB/s aggregated in RAID.

In fact I have worked with Servers equipped with several IO Controllers, that were delivering 24GB/s of throughput writing or reading to HDD spinning drives.
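To put those figures side by side, here is a small sketch that converts a link speed to MB/s and estimates how long moving 1 GB would take at each rate. The rates are the ones quoted above, using decimal units (1 GB = 1,000 MB); the function names are made up for the example, and protocol overhead and latency are ignored, so real network numbers would be somewhat worse.

```python
def gbps_to_mb_per_s(gbps):
    """Convert gigabits per second to megabytes per second (8 bits per byte)."""
    return gbps * 1000 / 8


# Throughput figures quoted in the text, in MB/s.
RATES_MB_S = {
    "1 Gbps network": gbps_to_mb_per_s(1),
    "Local SSD, no RAID": 550,
    "PCIe NVMe drive": 3500,
    "Local RAM": 24000,
}


def seconds_to_move(megabytes, rate_mb_s):
    """Ideal transfer time, ignoring latency and protocol overhead."""
    return megabytes / rate_mb_s


if __name__ == "__main__":
    for name, rate in RATES_MB_S.items():
        secs = seconds_to_move(1000, rate)
        print(f"{name:20s} {rate:8.0f} MB/s  1 GB in {secs:.3f} s")
```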

If you’re in the Cloud, instead of having 2 Load Balancers, 100 Front Web servers, a cluster of 5 Redis with huge amounts of RAM, and 1 MySQL Master and 1 Slave, all communicating at 1Gbps, you’ll probably get better performance by having the 2 LBs and 11 Front Web servers with some more memory, each running a Redis instance on the same machine, and saving the money spent on that many small Front servers and on the 5 huge dedicated Redis.

The same applies if you’re using Docker or K8s.

Even if you just cache the queries to drive, speed will be better than sending everything through 1 Gbps.
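As a rough illustration of caching queries on the local drive, the sketch below stores each result in a file named after a hash of the query text. Everything in it is hypothetical (the DiskQueryCache class, the run_query callable, the JSON file format), and it has no expiry or invalidation, which a real cache would need.

```python
import hashlib
import json
import os
import tempfile


class DiskQueryCache:
    """Cache query results as JSON files on the local drive."""

    def __init__(self, cache_dir):
        self.cache_dir = cache_dir
        os.makedirs(cache_dir, exist_ok=True)

    def _path(self, query):
        # One file per distinct query text.
        digest = hashlib.sha256(query.encode("utf-8")).hexdigest()
        return os.path.join(self.cache_dir, digest + ".json")

    def get(self, query, run_query):
        """Return the cached rows, calling run_query(query) only on a miss."""
        path = self._path(query)
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as f:
                return json.load(f)
        rows = run_query(query)
        with open(path, "w", encoding="utf-8") as f:
            json.dump(rows, f)
        return rows


if __name__ == "__main__":
    cache = DiskQueryCache(tempfile.mkdtemp())

    def fake_db(query):
        # Stands in for the real MySQL call.
        return [["post", 1], ["post", 2]]

    print(cache.get("SELECT * FROM posts", fake_db))  # miss: runs the query
    print(cache.get("SELECT * FROM posts", fake_db))  # hit: read from disk
```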

This will matter for you if your site is really under heavy load. Most of the sites just query the MySQL Server using 1 Gbps lines, or 2 Gbps in bonding, and that’s enough.

Datacenters, D&R and coronavirus

I’ve been working for years within Data centers, with D&R strategies, and then in the middle of COVID-19, with huge demands for increased bandwidth and compute, some DCs decided not to let their customers’ Engineers in.

As somebody who had my own Startup and CSP, had infrastructure in DCs and customers’ servers in colocation, and has replaced Hw components at 1AM, replaced drives in broken RAIDs, and fixed systems so many times inside so many Datacenters across the world, I’m shocked about that.

I understand health reasons can be argued, but I still have Servers in Datacenters because we all believed they were the safest place, prepared for disaster and recovery, with security, 24×7… and now one realises that one cannot get in to fix or upgrade one’s own machines.
Please note that you can still use the remote hands from the DC, although many times this is not a good idea, and I’m not sure this will still be an available option when the lockdown in those countries becomes stricter.

I’m wondering if the current DC model has any future at all.

I think most of the D&R strategies from now will be in the cloud, in different regions, with different providers, so companies can resist providers or governments letting them down.

CTOP.py

For updated information visit the main page for CTOP.py

Current stable version is v.0.8.9 updated on 2022-07-03.

Current branch under development is v.0.8.10 updated on 2022-07-03.

Version 0.8.0 added compatibility with Python 2, for older Systems.

Find the source code in: https://gitlab.com/carles.mateo/ctop

Clone it with:

git clone https://gitlab.com/carles.mateo/ctop.git

ctop.py is an Open Source tool for Linux System Administration that I’ve written in Python3. It uses only the System (/proc), and no third party libraries, in order to get all the information required.
I use only these modules, so it’s ideal to run across the whole farm of Servers and Docker Containers:

  • os
  • sys
  • time
  • shutil (for getting the Terminal width and height)
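To give an idea of what reading from /proc with only the standard library looks like, here is a minimal sketch that parses /proc/meminfo. It is not taken from ctop.py; the function names are hypothetical, and it assumes the usual "Key:   value kB" layout of that file.

```python
import os


def parse_meminfo(text):
    """Parse the contents of /proc/meminfo into a dict of integer values.

    Most values are reported in kB; the unit is dropped here.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def read_meminfo(path="/proc/meminfo"):
    """Read and parse /proc/meminfo; returns {} where it does not exist."""
    if not os.path.exists(path):
        return {}
    with open(path, "r") as f:
        return parse_meminfo(f.read())


if __name__ == "__main__":
    mem = read_meminfo()
    if "MemTotal" in mem and "MemAvailable" in mem:
        used_kb = mem["MemTotal"] - mem["MemAvailable"]
        print("Used: %d MB of %d MB" % (used_kb // 1024, mem["MemTotal"] // 1024))
```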

The purpose of this tool is to help troubleshoot and identify problems with a single view, in a single tool that has all the typical indicators.

It provides in a single view information that is typically provided by many programs:

  • top, htop for the CPU usage, process list, memory usage
  • meminfo
  • cpuinfo
  • hostname
  • uptime
  • df to see the free space in / and the free inodes
  • iftop to see real-time bandwidth usage
  • ip addr list to see the main Ip for the interfaces
  • netstat or lsof to see the list of listening TCP Ports
  • uname -a to see the Kernel version

Other cool things it does:

  • Identifying if you’re inside an Amazon VM, Google GCP, OpenStack VMs, Virtual Box VMs, Docker Containers or lxc.
  • Compatible with Raspberry Pi (tested on 3 and 4, on Raspbian and Ubuntu 20.04LTS)
  • Uses colors, and marks warnings in yellow and errors in red: problems like little disk space remaining, or high CPU usage according to the available cores and CPUs.
  • Redraws the screen and adjusts to the size of the Terminal; a bigger terminal displays more information
  • It doesn’t use external libraries, and does not escape to shell. It reads everything from /proc /sys or /etc files.
  • Identifies the Linux distribution
  • Supports Plugins loaded on demand.
  • Shows the most repeated binaries, so you can identify DDoS attacks (like having 5,000 apache instances where you have normally 500 or many instances of Python)
  • Indicates if an interface has the cable connected or disconnected
  • Shows the Speed of the Network Connection (useful for Mellanox cards that can operate at 200Gbit/sec, 100, 50, 40, 25, 10…)
  • It displays the local time and the Linux Epoch Time, which is universal (very useful for logs and to detect when there was an issue, for example if your system restarted, your SSH Session would keep latest Epoch captured)
  • No root required
  • Displays recent errors like NFS timeouts or Memory Read Errors.
  • You can force the output to a given number of columns and rows, for data scraping.
  • You can specify the number of loops (1 for scraping, by default it is infinite)
  • You can specify the time between screen refreshes, for long-lived SSH sessions
  • You can specify to see the output in b/w or in color (default)
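The idea of counting the most repeated binaries can be sketched with a Counter over the process names found in /proc/<pid>/comm. This is an illustration only, not ctop.py’s actual code; the function names are hypothetical.

```python
import os
from collections import Counter


def list_process_names(proc_dir="/proc"):
    """Return the command name of every process visible under proc_dir."""
    names = []
    if not os.path.isdir(proc_dir):
        return names
    for entry in os.listdir(proc_dir):
        if not entry.isdigit():
            continue  # only numeric directories are PIDs
        try:
            with open(os.path.join(proc_dir, entry, "comm"), "r") as f:
                names.append(f.read().strip())
        except OSError:
            pass  # the process exited while we were reading
    return names


def most_repeated(names, top=5):
    """Return the `top` most frequent names as (name, count) pairs."""
    return Counter(names).most_common(top)


if __name__ == "__main__":
    for name, count in most_repeated(list_process_names()):
        print("%6d  %s" % (count, name))
```

A sudden jump in the count for one name, compared with what is normal for that Server, is the kind of signal the bullet above refers to.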

Plugins allow you to extend the functionality effortlessly, without having to learn all the code. I provide a Plugin sample for starting lights on a Raspberry Pi, depending on the CPU Load, and playing a message “The system is healthy” or “Warning. The CPU is at 80%”.

Limitations:

  • It only works on Linux, not on Mac or Windows. The idea is to help with Linux Server Administration and Troubleshooting, and Mac and Windows do not have /proc
  • The list of processes of the System is read every 30 seconds, to avoid adding much overhead on the System; other info is read every second
  • Versions before 0.8.0 do not run in Python 2.x and require Python 3 (tested on 3.5, 3.6, 3.7, 3.8, 3.9)

I decided to code name the version 0.7 as “Catalan Republic” to support the dreams and hopes and democratic requests of the Catalan people, to become an independent republic.

I created this tool as Open Source and if you want to help I need people to test under different versions of:

  • Atypical Linux distributions

If you are a Cloud Provider and want me to implement the detection of your VMs, so the tool knows that it is an instance of Amazon, Google, Azure, Cloudsigma, Digital Ocean… contact me through my LinkedIn.

Monitoring an Amazon Instance, take a look at the amount of traffic sent and received

Some of the features I’m working on are parsing the logs to check for errors, kernel panics, processes killed due to lack of memory, iSCSI disconnects and NFS errors, and checking the logs of MySQL and Oracle databases to locate errors.

Adding my Server as Docker, with PHP Catalonia Framework, explained

Update: 2021-07-23 Ubuntu 19.04 is no longer available, so I updated the article in order to work with Ubuntu 20.04 and with PHP 7.4 and all their dependencies.

The previous day I explained how I migrated my old Server (Amazon Instance) to a more powerful model, with more recent OS, WebServer, etc…

This was interesting from the point of view of dealing with elastic IPs, Amazon AWS Volumes, etc… but it was basically a manual process. I could have generated an immutable image to start from next time, but this is another discussion, especially because that Server Instance has different base Software, including a MySQL Database.

This time I want to explain, step by step, how to containerize my Server, so I can port it to different platforms and be independent of what the Server Operating System is. It will always work, as we define the Operating System for the Docker Container.

So we start to use IaC (Infrastructure as Code).

So first you need to install docker.

So basically if your laptop runs Ubuntu 18.04 LTS or 20.04 LTS you have to:

sudo apt install docker.io

Start and Automate Docker

The Docker service needs to be set up to run at startup. To do so, type in each command followed by enter:

sudo systemctl start docker
sudo systemctl enable docker

Create the Dockerfile

For doing this you can use any text editor, but as we are working with IaC why not use a Code Editor?

You can use the versatile PyCharm, which has modules for understanding Docker, and so you can use Version Control like git too.

This is the updated Dockerfile to work with Ubuntu 20.04 LTS

FROM ubuntu:20.04

# MAINTAINER is deprecated in recent Docker versions, a label is used instead
LABEL maintainer="Carles <carles@carlesmateo.com>"

ARG DEBIAN_FRONTEND=noninteractive

#RUN echo "nameserver 8.8.8.8" > /etc/resolv.conf

# Europe/Dublin is the tz database name for Ireland
RUN echo "Europe/Dublin" | tee /etc/timezone

# Note: You should install everything in a single line concatenated with
#       && and finalizing with
#       apt autoremove -y && apt clean
#       in order to use the least space possible, as every command is a layer
RUN apt update && apt install -y apache2 ntpdate libapache2-mod-php7.4 mysql-server php7.4-mysql php-dev libmcrypt-dev php-pear git && apt autoremove -y && apt clean

RUN a2enmod rewrite

RUN mkdir -p /www

# In order to activate Debug
# RUN sed -i "s/display_errors = Off/display_errors = On/" /etc/php/7.4/apache2/php.ini
# RUN sed -i "s/error_reporting = E_ALL & ~E_DEPRECATED & ~E_STRICT/error_reporting = E_ALL/" /etc/php/7.4/apache2/php.ini
# RUN sed -i "s/display_startup_errors = Off/display_startup_errors = On/" /etc/php/7.4/apache2/php.ini
# To Debug remember to change:
# config/{production.php|preproduction.php|devel.php|docker.php} 
# in order to avoid Error Reporting being set to 0.

ENV PATH_CATALONIA /www/www.cataloniaframework.com/
ENV PATH_CATALONIA_WWW /www/www.cataloniaframework.com/www/
ENV PATH_CATALONIA_CACHE /www/www.cataloniaframework.com/cache/

ENV APACHE_RUN_USER  www-data
ENV APACHE_RUN_GROUP www-data
ENV APACHE_LOG_DIR   /var/log/apache2
ENV APACHE_PID_FILE  /var/run/apache2/apache2.pid
ENV APACHE_RUN_DIR   /var/run/apache2
ENV APACHE_LOCK_DIR  /var/lock/apache2

RUN mkdir -p $APACHE_RUN_DIR
RUN mkdir -p $APACHE_LOCK_DIR
RUN mkdir -p $APACHE_LOG_DIR
RUN mkdir -p $PATH_CATALONIA
RUN mkdir -p $PATH_CATALONIA_WWW
RUN mkdir -p $PATH_CATALONIA_CACHE

# Remove the default Server
RUN sed -i '/<Directory \/var\/www\/>/,/<\/Directory>/{/<\/Directory>/ s/.*/# var-www commented/; t; d}' /etc/apache2/apache2.conf 

RUN rm /etc/apache2/sites-enabled/000-default.conf

COPY www.cataloniaframework.com.conf /etc/apache2/sites-available/

RUN chmod 777 $PATH_CATALONIA_CACHE
RUN chown --recursive $APACHE_RUN_USER:$APACHE_RUN_GROUP $PATH_CATALONIA_CACHE

RUN ln -s /etc/apache2/sites-available/www.cataloniaframework.com.conf /etc/apache2/sites-enabled/

# Note: You should clone locally and COPY to the Docker Image
#       Also you should add the .git directory to your .dockerignore file
#       I made this way to show you and for simplicity, having everything
#       in a single file
##RUN git clone https://github.com/cataloniaframework/cataloniaframework_v1_sample_website /www/www.cataloniaframework.com
##RUN git checkout tags/v.1.16-web-1.0
# In order to change profile to Production
# RUN sed -i "s/define('ENVIRONMENT', DOCKER)/define('ENVIRONMENT', PRODUCTION)/" /www/www.cataloniaframework.com/config/general.php
# The destination must end with / when the wildcard matches several files
COPY *.php /www/www.cataloniaframework.com/www/

# for debugging
#RUN apt-get install -y vim

RUN service apache2 restart

EXPOSE 80

CMD ["/usr/sbin/apache2", "-D", "FOREGROUND"]

The www.cataloniaframework.com.conf file

As you saw in the Dockerfile you have the line:

COPY www.cataloniaframework.com.conf /etc/apache2/sites-available/

This will copy the file www.cataloniaframework.com.conf, which must be in the same directory as the Dockerfile, to the /etc/apache2/sites-available/ folder in the container.

<VirtualHost *:80>
    ServerAdmin webmaster@cataloniaframework.com
    # Uncomment to use a DNS name in a multiple VirtualHost Environment
    #ServerName www.cataloniaframework.com
    #ServerAlias cataloniaframework.com
    DocumentRoot /www/www.cataloniaframework.com/www
    <Directory /www/www.cataloniaframework.com/www/>
            Options -Indexes +FollowSymLinks +MultiViews
            AllowOverride All
            Order allow,deny
            allow from all
            Require all granted
    </Directory>
    ErrorLog ${APACHE_LOG_DIR}/www-cataloniaframework-com-error.log
    # Possible values include: debug, info, notice, warn, error, crit,
    # alert, emerg.
    LogLevel warn
    CustomLog ${APACHE_LOG_DIR}/www-cataloniaframework-com-access.log combined
</VirtualHost>

Stopping, starting the docker Service and creating the Catalonia image

service docker stop && service docker start

To build the Docker Image we will do:

docker build -t catalonia . --no-cache

I use --no-cache so git is pulled and everything is reworked, not kept from cache.

Now we can run the Catalonia Docker, mapping port 80.

docker run -d -p 80:80 catalonia

If you want to check what’s going on inside the Docker, first list the running containers to find the name of yours:

docker ps

In this case the container was named distracted_wing, so we will do:

docker exec -i -t distracted_wing /bin/bash
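Docker gives each container a random name unless you pass --name, so yours will not be called distracted_wing. A minimal sketch to look the name up, assuming the image was tagged catalonia as in the build step above; the function name catalonia_container is my own invention for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the name of the first running container
# created from the given image (defaults to "catalonia").
catalonia_container() {
    local image="${1:-catalonia}"
    # ancestor= filters by the image the container was created from
    docker ps --filter "ancestor=${image}" --format '{{.Names}}' | head -n 1
}

# Usage (opens a shell inside the container):
#   docker exec -i -t "$(catalonia_container)" /bin/bash
```

If you prefer a fixed name, you can also pass --name to docker run and skip the lookup.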

Finally I would like to check that the web page works, and I’ll use my preferred browser. In this case I will use lynx, the text browser, because I don’t want Firefox to save things in the cache.

Upgrading the Blog after 5 years, AWS Amazon Web Services, under DoS and Spam attacks

A few days ago I was under a heavy DoS attack.

Nothing new, zombie computers, hackers, pirates, networks of computers… trying to abuse the system and to hack into it. Why? There could be many reasons, from storing pirate movies, to trying to use your Server for sending Spam, doing phishing or hosting Ransomware pages…

Most of those guys don’t know that it is almost impossible to Spam from Amazon. Only a few emails per hour can come out from the Server unless you explicitly request that limit to be raised and configure everything.

But I thought it was a great opportunity to force myself to update the Operating System, core tools, versions of PHP and MySql.

Forensics / Postmortem of the incident

The task was divided in several parts:

  • Understanding the origin of the attack
  • Blocking the offending Ip addresses or disabling XMLRPC
  • Making the VM boot again (problems with Amazon AWS)
    • I didn’t know why it was not booting.
  • Upgrading the OS

I disabled the access to the site while I was working, using the Amazon Web Services Firewall. Basically I restricted access to my Ip only. Example: 8.8.8.8/32

I changed 0.0.0.0/0, the world wide mask, to my_Ip/32

That way the logs were reflecting only what I was doing from my Ip.

Dealing with Snapshots and Volumes in AWS

Well the first thing was doing a Snapshot.

Afterwards, I tried to boot the original Blog Server (so I don’t stop offering service) but no way, the Server appeared to be dead.

So then I attached the Volume to a new Server with the same base OS, in order to extract (dump) the database. Later I would attach the same Volume to a new Server with the most recent OS and base Software.

Something that is a bit annoying is that the new Instances, the new generation instances, run only in VPC, not in Amazon EC2 Classic. But my static Ip addresses are created for Amazon EC2 Classic, so I could not use them in new generation instances.

I chose the option to see all the generations.

Upgrading the system base Software had its own challenges too.

Upgrading the OS / Base Software

My approach was to install an Ubuntu 18.04 LTS, and install the base Software clean, and add any modification I may need.

I wanted to have all the supported packages and a recent version of PHP 7 and the latest Software pieces like Apache or MySQL.

sudo apt update

sudo apt install apache2

sudo apt install mysql-server

sudo apt install php libapache2-mod-php php-mysql

Apache2

Config files that before were working stopped working as the new Apache version requires the files or symlinks under /etc/apache2/sites-enabled/ to end with .conf extension.
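A quick way to spot the affected files is to list whatever sits in sites-enabled without the extension. This is only a sketch: the function name sites_missing_conf is hypothetical, and the default path is the Ubuntu one mentioned above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the entries of an Apache sites-enabled directory
# whose names do not end in .conf, so they can be renamed or re-linked.
sites_missing_conf() {
    local dir="${1:-/etc/apache2/sites-enabled}"
    # Only look at the directory itself, not at subdirectories
    find "$dir" -mindepth 1 -maxdepth 1 ! -name '*.conf' | sort
}

# Usage:
#   sites_missing_conf
#   sites_missing_conf /path/to/another/sites-enabled
```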

Also some directives changed, so some websites will not be able to work properly.

Those projects using my Catalonia Framework were affected, although I have this very well documented to make it easy to work with both versions of Apache Http Server, so it was a very straightforward change.

From the previous version I had to change my www.cataloniaframework.com.conf file and enable:

<Directory /www/www.cataloniaframework.com>
    Options Indexes FollowSymLinks MultiViews
    AllowOverride All
    Order allow,deny
    allow from all
</Directory>

Then Open the ports for the Web Server (443 and 80).

sudo ufw allow in "Apache Full"

Then sudo service apache2 restart

Catalonia Framework Web Site, which is also created with Catalonia Framework itself once restored

MySQL

The problem was to use the most updated version of the Database. I could use one of the backups I keep, from last week, but I wanted fresher data.

I had the .db files and it should have been very straightforward to copy them to /var/lib/mysql/ … if they were the same version. But they weren’t. So I launched an instance with the same base Software as the old machine had, installed mysql-server, stopped it, copied the .db files, started it, and then I made a dump with mysqldump --all-databases > 2019-04-29-all-databases.sql

Note, I copied the .db files using the mythical mc, which is a clone of Norton Commander.

Then I stopped that instance and I detached that volume and attached it to the new Blog Instance.

I did a Backup of my original /var/lib/mysql/ files for the purpose of faster restoring if something went wrong.

I mounted it under /mnt/blog_old and did mysql -u root -p < /mnt/blog_old/home/ubuntu/2019-04-29-all-databases.sql

That worked well and I had restored the blog. But as I was watching /var/log/mysql/error.log I noticed some columns were not where they should be. That’s because I had inadvertently overwritten the mysql system database as well, which in MySQL 5.7 has a different structure than in MySQL 5.5. So I screwed up. As I had foreseen this possibility, I restored from the backup in seconds.

So basically then I edited my .sql files and removed all that was for the mysql database.
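I did that edit by hand, but it can also be scripted. A minimal sketch, assuming the dump still has the "-- Current Database:" comment lines that mysqldump --all-databases writes before each database; the function name strip_mysql_system_db and the output file name are hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy a "mysqldump --all-databases" dump from stdin to
# stdout, dropping the section that belongs to the `mysql` system database.
# Assumption: every database section starts with a comment line like
#   -- Current Database: `name`
strip_mysql_system_db() {
    awk '
        # A new section starts: skip it only if it is the mysql database
        /^-- Current Database: `/ { skip = ($0 == "-- Current Database: `mysql`") }
        !skip { print }
    '
}

# Usage:
#   strip_mysql_system_db < 2019-04-29-all-databases.sql > all-databases-no-mysql.sql
```

Since the users and grants live in the mysql database, they are not imported this way and have to be recreated.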

I started MySql, and ran the mysql import procedure again. It worked, but I had to recreate the users for all the Databases and Grant them permissions.

GRANT ALL PRIVILEGES ON db_mysqlproxycache.* TO 'wp_dbuser_mysqlproxy'@'localhost' IDENTIFIED BY 'XWy$&{yS@qlC|<¡!?;:-ç';

PHP7

Some modules in my blogs were returning errors in /var/log/apache2/mysite-error.log, so I checked and it was due to lack of support for the latest PHP versions, and so I patched the code manually or I just disabled the offending plugin.

WordPress

As seen checking the /var/log/apache2/blog.carlesmateo.com-error.log some URLs were not found by WordPress.

For example:

The requested URL /wordpress/wp-json/ was not found on this server

I had to activate mod_rewrite and then restart Apache.

a2enmod rewrite; service apache2 restart

Making the site more secure

Checking the logs of Apache, /var/log/apache2/blog.carlesmateo.com-access.log, I looked for Ip’s accessing Admin areas, I looked for 404 Errors pointing to attempts to exploit any unsafe WP Plugin, and I checked for POST requests as well.

I added to the Ubuntu Uncomplicated Firewall (UFW) the offending Ip’s and patched the xmlrpc.php file to exit always.
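The search for offending Ip’s can be sketched like this. It assumes the access log uses Apache’s "combined" format (client Ip in field 1, the quoted method in field 6, the URL in field 7); the function name xmlrpc_offenders is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: count POST requests to xmlrpc.php per client Ip in an
# Apache access log and print "count ip" lines, busiest first.
# Assumption: "combined" log format, so $1 is the Ip, $6 is "POST (with the
# opening quote) and $7 is the requested URL.
xmlrpc_offenders() {
    awk '$6 == "\"POST" && $7 ~ /xmlrpc\.php/ { hits[$1]++ }
         END { for (ip in hits) print hits[ip], ip }' "$1" | sort -rn
}

# Usage:
#   xmlrpc_offenders /var/log/apache2/blog.carlesmateo.com-access.log | head
```

Each Ip found can then be blocked with sudo ufw deny from followed by the Ip. As ufw applies the first rule that matches, a deny added after the "Apache Full" allow rule may need to be placed before it, for example with sudo ufw insert 1 deny from followed by the Ip.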