Category Archives: Operations

Adding a swapfile on the fly as a temporary solution for a Server with few memory

Here is an easy trick that you can use for adding swap temporarily to a Server, VMs or Workstations, if you are in an emergency.

In this case I had a cluster composed from two instances running out of memory.

I got an alert for one of the Servers, reporting that only had 7% of free memory.

Immediately I checked it, but checked also any other forming part of the cluster.

Another one appeared, had just only a bit more memory than the other, but was considered in Critical condition too.

The owner of the Service was contacted and asked if we can hold it until US Business hours. Those Servers were going to be replaced next day in US Business hours, and when possible it would be nice not to wake up the Team. It was day in Europe, but night in US.

I checked the status of the Server with those commands:

# df -h

There are 13GB of free space in /. More than enough to be safe as this service doesn’t use much.

# free -h
              total        used        free      shared  buff/cache   available
Mem:           5.7G        4.8G        139M        298M        738M        320M
Swap:            0B          0B          0B

I checked the memory, ok, there are only 139MB free in this node, but 738MB are buff/cache. Buff/Cache is memory used by Linux to optimize I/O as long as it is not needed by application. These 738 MB in buff/cache (or most of it) will be used if needed by the System. The field available corresponds to the memory that is available for starting new applications (not counting the swap if there was any), and basically is the free memory plus a fragment of the buff/cache. I’m sure we could use more than 320MB and there is a lot if buff/cache, but to play safe we play by the book.

With that in mind it seemed that it would hold perfectly to Business hours.

I checked top. It is interesting to mention the meaning of the Column RES, which is resident memory, in other words, the real amount of memory that the process is using.

I had a Java process using 4.57GB of RAM, but a look at how much Heap Memory was reserved and actually being used showed a Heap of 4GB (Memory reserved) and 1.5GB actually being used for real, from the Heap, only.

It was unlikely that elastic search would use all those 4GB, and seemed really unlikely that the instance will suffer from memory starvation with 2.5GB of 4GB of the Heap free, ~1GB of RAM in buffers/cache plus free, so looked good.

To be 100% sure I created a temporary swap space in a file on the SSD.

# fallocate -l 1G /swapfile-temp

# dd if=/dev/zero of=/swapfile-temp bs=1024 count=1048576 status=progress
1034236928 bytes (1.0 GB) copied, 4.020716 s, 257 MB/s
1048576+0 records in
1048576+0 records out
1073741824 bytes (1.1 GB) copied, 4.26152 s, 252 MB/s

If you ask me why I had to dd, I will tell you that I needed to. I checked with command blkid and filesystem was xfs. I believe that was the reason.

The speed writing to the file is fair enough for a swap.

# chmod 600 /swapfile-temp

# mkswap /swapfile-temp
Setting up swapspace version 1, size = 1048572 KiB
no label, UUID=5fb12c0c-8079-41dc-aa20-21477808619a

# swapon /swapfile-temp

I check that memory is good:

# free -h
              total        used        free      shared  buff/cache   available
Mem:           5.7G        4.8G        117M        298M        770M        329M
Swap:          1.0G          0B        1.0G

And finally I check that the Kernel parameter swappiness is not too aggressive:

# sysctl vm.swappiness
vm.swappiness = 30

Cool. 30 is a fair enough value.

Post-Mortem: The mystery of the duplicated Transactions into an e-Commerce

Me, with 4 more Senior BackEnd Engineers wrote the new e-Commerce for a multinational.

The old legacy Software evolved into a different code for every country, making it impossible to be maintained.

The new Software we created used inheritance to use the same base code for each country and overloaded only the specific different behavior of every country, like for the payment methods, for example Brazil supporting “parcelados” or Germany with specific payment players.

We rewrote the old procedural PHP BackEnd into modern PHP, with OOP and our own Framework but we had to keep the transactional code in existing MySQL Procedures, so the logic was split. There was a Front End Team consuming our JSONs. Basically all the Front End code was cached in Akamai and pages were rendered accordingly to the JSONs served from out BackEnd.

It was a huge success.

This e-Commerce site had Campaigns that started at a certain time, so the amount of traffic that would come at the same time would be challenging.

The project was working very well, and after some time the original Team was split into different projects in the company and a Team for maintenance and evolutives was hired.

At certain point they started to encounter duplicate transactions, and nobody was able to solve the mystery.

I’m specialized into fixing impossible problems. They used to send me to Impossible Missions, and I am famous for solving impossible problems easily.

So I started the task with a SRE approach.

The System had many components and layers. The problem could be in many places.

I had in my arsenal of tools, Software like mysqldebugger with which I found an unnoticed bug in decimals calculation in the past surprising everybody.

Previous Engineers involved believed the problem was in the Database side. They were having difficulties to identify the issue by the random nature of the repetitions.

Some times the order lines were duplicated, and other times were the payments, which means charging twice to the customer.

Redis Cluster could also play a part on this, as storing the session information and the basket.

But I had to follow the logic sequence of steps.

If transactions from customer were duplicated that mean that in first term those requests have arrived to the System. So that was a good point of start.

With a list of duplicated operations, I checked the Webservers logs.

That was a bit tricky as the Webserver was recording the Ip of the Load Balancer, not the ip of the customer. But we were tracking the sessionid so with that I could track and user request history. A good thing was also that we were using cookies to stick the user to the same Webserver node. That has pros and cons, but in this case I didn’t have to worry about the logs combined of all the Webservers, I could just identify a transaction in one node, and stick into that node’s log.

I was working with SSH and Bash, no log aggregators existing today were available at that time.

So when I started to catch web logs and grep a bit an smile was drawn into my face. :)

There were no transactions repeated by a bad behavior on MySQL Masters, or by BackEnd problems. Actually the HTTP requests were performed twice.

And the explanation to that was much more simple.

Many Windows and Mac User are used to double click in the Desktop to open programs, so when they started to use Internet, they did the same. They double clicked on the Submit button on the forms. Causing two JavaScript requests in parallel.

When I explained it they were really surprised, but then they started to worry about how they could fix that.

Well, there are many ways, like using an UUID in each request and do not accepting two concurrents, but I came with something that we could deploy super fast.

I explained how to change the JavaScript code so the buttons will have no default submit action, and they will trigger a JavaScript method instead, that will set a boolean to True, and also would disable the button so it can not be clicked anymore. Only if the variable was False the submit would be performed. It was almost impossible to get a double click as the JavaScript was so fast disabling the button, that the second click will not trigger anything. But even if that could be possible, only one request would be made, as the variable was set to True on the first click event.

That case was very funny for me, because it was not necessary to go crazy inspecting the different layers of the system. The problem was detected simply with HTTP logs. :)

People often forget to follow the logic steps while many problems are much more simple.

As a curious note, I still see people double clicking on links and buttons on the Web, and some Software not handling it. :)

News from the blog 2020-11-03

Nice articles recommended

This article talks about how at Riot Games they use Slack. Slack is really a powerful tool, and also makes the communication more human in companies with their approach and the funny icons and /giphy. I’m very serious when it comes to work but I recognize the friendly, warm, human and lovely touch these kind of animated icons bring to the conversations.

Remember that life of the SSD is different from spinning drives. I recommend to keep your backups on external spinning drives disconnected most of the time.

Operating at Scale – An Inside Look at Facebook’s Production Engineering Team

CMIPS

I’ve been working on testing performance of more configurations on Azure and GCP.

I’m also looking forward to test the AMD Ryzen™ 7 3700X, AM4, Zen 2, 8 Core, 16 Thread, 3.6GHz, 4.4GHz Turbo that is arriving to me this week.

CTOP.py v. 0.7.8 released

I closed the ticket #21 (Thank Jian!) so ensuring CTOP.py is compatible with Python 3.5 versions.

Feature requests and bugs are listed using gitlab: https://gitlab.com/carles.mateo/ctop/-/issues

My Python Combat Guide Book

I updated it the Nov-01, as I normally do, bringing more content.

I’ve been paid the royalties for he past two months and I reinvested everything (and more from my pocket) in Hardware for working with ZFS.

I was offered by an editorial in The States to publish Python Combat Guide and other of my books worldwide. I was thinking it for a while. It was very good money, translation to multiple languages and platforms and marketing and a lot of promotion, but I would had loss the rights and the Freedom I have now, like the possibility to offer discount coupons to who I want and to update the contents often. So to celebrate my decision for you, readers of the blog, during September, I provide a discounted price of $5 USD for the fist 100 sales instead of the $25 USD suggested price. Use the following link:

https://leanpub.com/pythoncombatguide/c/blog-carles-nov2020

ZFS progress

As part of my effort to contributing with nice Open Source products to the Community I have made some investments to keep contributing to:

  • OpenZFS
  • My old tool for managing ZFS and Network shares easily

I’m writing a new book about managing ZFS for Small Business too, so I show how to operate on this hardware, good points and downsides.

I’m assembling a new Pc with ZFS plenty of Disk Storage within a mix of:

  • SAS Enterprise grade SSD 2.5″
  • SATA 12Gb Enterprise grade SSD 2.5″
  • SATA SSD 2.5″
  • SATA HDD 2TB 2.5″
  • SATA HDD 2TB 3.5″

I’m a big fan of Intel, but this time I have chosen AMD. Concretely a AMD Ryzen 7 3700X AM4 8 Core / 16 Threads, 3.6 GHz to 4.4 GHz with Turbo. The reason I chose this CPU is because it only uses 65W but still has 8 Cores / 16 Threads.

Also I want to see the performance of this AMD Ryzen with CMIPS and another important reason is that AMD motherboards support PCI 4.0. I have bought a NVMe SSD Samsung 980 PRO PCI 4.0 (x4) able to read at 6,400 MB/s. I will use this AMD box for running VMs as well. Basically Virtual Box and Docker.

I’ve been surprised that for 169.99 GBP I can have a very good Asus Motherboard with a 2.5 Gb Ethernet: ASUS ROG STRIX B550-F GAMING, AMD B550, AM4, DDR4, PCIe 4.0, SATA3, Dual M.2, CrossFire, 2.5GbE, USB 3.2 Gen2 A+C, ATX.

In order to have an Asus motherboard with a 2.5 Gb Ethernet for Intel I had to jump to a 254 GBP motherboard and Intel is still PCI 3.0. Actually there are PCI 10Gb NICs at 80 GBP so at some point I’ll upgrade my home network from Gigabit to 10 Gb. That will come slowly, but if the new equipment I assemble has 2.5 Gb when I upgrade the main switches to 10 Gb, at least I’ll be able to communicate at 2.5 Gb without ant additional change.

Also memory at 3200, speed that the AMD motherboard can provide, is more than affordable.

This new server will have 64 GB of RAM (Corsair DDR4 Vengeance PC4-25600 (3200)), as I plan to run VMs and use Volumes mounted via iSCSI and locally as block devices to improve my Software. I’ve bought a new UPS to keep it running in case power goes down. That’s something that doesn’t happen often in my city in Ireland, honestly, but I never forget that this happens in Barcelona two or three times per year, and that a high tension spike can burn your motherboard, drives, or electronics like the TV or the fridge. I’ve bought as well a new KVM Switch, a HDMI 4K and USB too one, so I don’t have to have so many keyboards. My logitech M720 allowed me to use it with 3 computers, but still I want something more operational. The KVM I bought allow me to switch with a button or within a hotkey in the keyboard.

I bought a new Icy box fox handling 6 2.5 drives in just one bay of the tower, and a 850 Watt Corsair PSU that will be able to power the many drives I want at the same time.

More books coming

I started two new books:

Those can be purchased while I’m still working on them and get the updates that I’ll be publishing and keeping a communication with me about doubts or improvements.

Halloween Software Offers

I saw some Halloween offers and I purchased Software licenses for Software I use.

Backup Guard is one of the products I registered:

https://backup-guard.com/

I contribute a lot to Open Source, and many years ago before Open Source existed I was creating Freeware Software. But I think that good commercial Software deserves to be supported. Like everything in life, if they are doing a good work that is useful to me, why not giving them support?. It is also a way to make sure they will continue producing amazing Software. And in the other hand, myself, I create Software. Some times commercial Software, and I like to be paid, so I apply the same principle.

How to recover access to your Amazon AWS EC2 instance if you loss your Private Key for SSH

This article covers the desperate situation where you had generated one or more instances, instructed Amazon to use a SSH Key Pair certs where only you have the Private Key, your instances are running, for example, an eCommerce site, running for months, and then you loss your Private Key (.pem file), and with it the SSH access to your instances’ Data.

Actually I’ve seen this situation happening several times, in actual companies. Mainly Start ups. And I solved it for them.

Assuming that you didn’t have a secondary method to access, which is another combination of username/password or other user/KeyPairs, and so you completely lost the access to the Database, the Webservers, etc… I’m going to show you how to recover the data.

For this article I will consider an scenario where there is only one Instance, which contains everything for your eCommerce: Webserver, code, and Database… and is a simple config, with a single persistent drive.

Warning: be very careful as if you use ephemeral drives, contents will be lost is you power off the instance.

Method 1: Quicker, launching a new instance from the previous

Step1: The first step you will take is to close the access from outside, using the Firewall, to avoid any new changes going to the disk. You can allow access to the instance only from your static Ip in the office/home.

Step 2: You’ll wait for 5 minutes to allow any transaction going on to conclude, and pending writes to be flushed to disk.

Step 3: From Amazon AWS Console, EC2, you’ll request an Snapshot. That step is to try to get extra security. Taking an Snapshot from a live, mounted, filesystem, is not the best of ideas, specially of a Database, but we are facing a desperate situation so we’re increasing the numbers of leaving this situation without Data loss. This is just for extra security and if everything goes well at the end you will not need this snapshot.

Make sure you select No reboot.

Step 4: Be very careful if you have extra drives and ephemeral drives.

Step 5: Wait till the Snapshot completes.

Step 6: Then request a graceful poweroff. Amazon will try to poweroff the Server in a gentle way. This may take two minutes.

Step 7: When the instance is powered off, request a new Snapshot. This is the one we really want. The other was just to be more safe. If you feel confident you can just unclick No Reboot on the previous Step and do only one Snapshot.

Step 8: Wait till the Snapshot completes.

Step 9: Generate and upload the new key you will use to AWS Console, or ask Amazon to generate a key pair for you. You can do it while creating the new instance through the wizard.

Step 10: Launch a new instance, based on your snapshot AMI. This will generate a copy of your previous instance (using the Snapshot) for the new one. Select the new Key pair. Finish assigning the Security groups, the elastic ip…

Step 11: Start the new instance. You can select a different flavor, like a more powerful instance, if you prefer. (scale vertically)

Step 12: Test your access by login via SSH with the new pair keys and from your static Ip which has access in the Firewall.

ssh -i /home/carles/Desktop/Data/keys/carles-ecommerce.pem ubuntu@54.208.225.14

Step 13: Check that the web Starts correctly, check the Database logs to see if there is any corruption. Should not have any if graceful shutdown went well.

Step 14: Reopen the access from the Firewall, so the world can connect to your instance.

Method 2: Slower, access the Data and rebuild whatever you need

The second method is exactly the same until Step 6 included.

Step 7: After this, you will create a new instance based on your favorite OS, with a new pair of Keys.

Step 8: You’ll detach the Volume from the eCommerce previous instance (the one you lost access).

Step 9: You’ll attach the Volume to the new instance.

Step 10: You’ll have access to the Data from the previous instance in the new volume. type cat /proc/partitions or df -h to see the mountpoints available. You can then download or backup, or install the Software again and import the Database…

Step 11: Check that everything works, and enable the access worldwide to the Web in the Firewall (Security Group Inbound Rules).

If you are confident enough, you can use this method to upgrade the OS or base Software of your instance, making it part of your maintenance window. For example, to get the last version of Ubuntu or CentOS, MySQL, Python or PHP, etc…

LDAPGUI a LDAP GUI program in Python and Tkinter

I wanted to automate certain operations that we do very often, and so I decided to do a PoC of how handy will it be to create GUI applications that can automate tasks.

As locating information in several repositories of information (ldap, databases, websites, etc…) can be tedious I decided to create a small program that queries LDAP for the information I’m interested, in this case a Location. This small program can very easily escalated to launch the VPN, to query a Database after querying LDAP if no results are found, etc…

I share with you the basic application as you may find interesting to create GUI applications in Python, compatible with Windows, Linux and Mac.

I’m super Linux fan but this is important, as many multinationals still use Windows or Mac even for Engineers and SRE positions.

With the article I provide a Dockerfile and a docker-compose.yml file that will launch an OpenLDAP Docker Container preloaded with very basic information and a PHPLDAPMIN Container.

Installation of the dependencies

Ubuntu:

sudo apt-get install python3.8 python3-tk
pip install ldap3

Windows and Mac:

Install Python3.6 or greater and from command line.

For Mac install pip if you don’t have it.

pip install ldap3

Python Code

#!/bin/env python3

import tkinter as tk
from ldap3 import Server, Connection, MODIFY_ADD, MODIFY_REPLACE, ALL_ATTRIBUTES, ObjectDef, Reader
from ldap3.utils.dn import safe_rdn
from lib.fileutils import FileUtils

# sudo apt-get install python3.8 python3-tk
# pip install ldap3


class LDAPGUI:

    s_config_file = "config.cfg"

    s_ldap_server = "ldapslave01"
    s_connection = "uid=%USERNAME%, cn=users, cn=accounts, dc=demo1, dc=carlesmateo, dc=com"
    s_username = "carlesmateo"
    s_password = "Secret123"
    s_query = "location=%LOCATION%,dc=demo1,dc=carlesmateo,dc=com"

    i_window_width = 610
    i_window_height = 650
    i_frame_width = i_window_width-5
    i_frame_height = 30
    i_frame_results_height = 400

    # Graphical objects
    o_window = None
    o_entry_ldap = None
    o_entry_connection = None
    o_entry_results = None
    o_entry_username = None
    o_entry_location = None
    o_button_search = None

    def __init__(self, o_fileutils):
        self.o_fileutils = o_fileutils

    def replace_connection_with_username(self):
        s_connection_raw = self.o_entry_username.get()
        s_connection = self.s_connection.replace("%USERNAME%", s_connection_raw)
        return s_connection

    def replace_query_with_location(self):
        s_query_raw = self.o_entry_location.get()
        s_query = self.s_query.replace("%LOCATION%", s_query_raw)
        return s_query

    def clear_results(self):
        self.o_entry_results.delete("1.0", tk.END)

    def enable_button(self):
        self.o_button_search["state"] = tk.NORMAL

    def disable_button(self):
        self.o_button_search["state"] = tk.DISABLED

    def ldap_run_query(self):

        self.disable_button()
        self.clear_results()

        s_ldap_server = self.o_entry_ldap.get()
        self.o_entry_results.insert(tk.END, "Connecting to: " + s_ldap_server + "...\n")
        try:

            o_server = Server(s_ldap_server)
            o_conn = Connection(o_server,
                                self.replace_connection_with_username(),
                                self.s_password,
                                auto_bind=False)
            o_conn.bind()

            if isinstance(o_conn.last_error, str):
                s_last_error = o_conn.last_error
                self.o_entry_results.insert(tk.END, "Last error: " + s_last_error + "\n")

            if isinstance(o_conn.result, str):
                s_conn_result = o_conn.result
                self.o_entry_results.insert(tk.END, "Connection result: " + s_conn_result + "\n")

            s_query = self.replace_query_with_location()
            self.o_entry_results.insert(tk.END, "Performing Query: " + s_query + "\n")

            obj_location = ObjectDef('organizationalUnit', o_conn)
            r = Reader(o_conn, obj_location, s_query)
            r.search()
            self.o_entry_results.insert(tk.END, "Results: \n" + str(r) + "\n")
            #print(r)

            #print(0, o_conn.extend.standard.who_am_i())
            # https://github.com/cannatag/ldap3/blob/master/tut1.py
            # https://github.com/cannatag/ldap3/blob/master/tut2.py

        except:
            self.o_entry_results.insert(tk.END, "There has been a problem" + "\n")

            if isinstance(o_conn.last_error, str):
                s_last_error = o_conn.last_error
                self.o_entry_results.insert(tk.END, "Last error: " + s_last_error + "\n")

        try:
            self.o_entry_results.insert(tk.END, "Closing connection\n")
            o_conn.unbind()
        except:
            self.o_entry_results.insert(tk.END, "Problem closing connection" + "\n")

        self.enable_button()

    def create_frame_location(self, o_window, i_width, i_height):
        o_frame = tk.Frame(master=o_window, width=i_width, height=i_height)
        o_frame.pack()

        o_lbl_location = tk.Label(master=o_frame,
                                  text="Location:",
                                  width=10,
                                  height=1)
        o_lbl_location.place(x=0, y=0)

        o_entry = tk.Entry(master=o_frame, fg="yellow", bg="blue", width=50)
        o_entry.insert(tk.END, "")
        o_entry.place(x=100, y=0)

        return o_frame, o_entry

    def create_frame_results(self, o_window, i_width, i_height):
        o_frame = tk.Frame(master=o_window, width=i_width, height=i_height)
        o_frame.pack()

        o_entry = tk.Text(master=o_frame, fg="grey", bg="white", width=75, height=20)
        o_entry.insert(tk.END, "")
        o_entry.place(x=0, y=0)

        return o_frame, o_entry

    def create_frame_username(self, o_window, i_width, i_height, s_username):
        o_frame = tk.Frame(master=o_window, width=i_width, height=i_height)
        o_frame.pack()

        o_lbl_location = tk.Label(master=o_frame,
                                  text="Username:",
                                  width=10,
                                  height=1)
        o_lbl_location.place(x=0, y=0)

        o_entry_username = tk.Entry(master=o_frame, fg="yellow", bg="blue", width=50)
        o_entry_username.insert(tk.END, s_username)
        o_entry_username.place(x=100, y=0)

        return o_frame, o_entry_username

    def create_frame_ldapserver(self, o_window, i_width, i_height, s_server):
        o_frame = tk.Frame(master=o_window, width=i_width, height=i_height)
        o_frame.pack()

        o_lbl_ldapserver = tk.Label(master=o_frame,
                                    text="LDAP Server:",
                                    width=10,
                                    height=1)
        o_lbl_ldapserver.place(x=0, y=0)
        # We don't need to pack the label, as it is inside a Frame, packet
        # o_lbl_ldapserver.pack()

        o_entry = tk.Entry(master=o_frame, fg="yellow", bg="blue", width=50)
        o_entry.insert(tk.END, s_server)
        o_entry.place(x=100, y=0)

        return o_frame, o_entry

    def create_frame_connection(self, o_window, i_width, i_height, s_connection):
        o_frame = tk.Frame(master=o_window, width=i_width, height=i_height)
        o_frame.pack()

        o_lbl_connection = tk.Label(master=o_frame,
                                    text="Connection:",
                                    width=10,
                                    height=1)
        o_lbl_connection.place(x=0, y=0)
        # We don't need to pack the label, as it is inside a Frame, packet
        # o_lbl_ldapserver.pack()

        o_entry_connection = tk.Entry(master=o_frame, fg="yellow", bg="blue", width=50)
        o_entry_connection.insert(tk.END, s_connection)
        o_entry_connection.place(x=100, y=0)

        return o_frame, o_entry_connection

    def create_frame_query(self, o_window, i_width, i_height, s_query):
        o_frame = tk.Frame(master=o_window, width=i_width, height=i_height)
        o_frame.pack()

        o_lbl_query = tk.Label(master=o_frame,
                                    text="Connection:",
                                    width=10,
                                    height=1)
        o_lbl_query.place(x=0, y=0)

        o_entry_query = tk.Entry(master=o_frame, fg="yellow", bg="blue", width=50)
        o_entry_query.insert(tk.END, s_query)
        o_entry_query.place(x=100, y=0)

        return o_frame, o_entry_query

    def create_button(self):
        o_button = tk.Button(
            text="Search",
            width=25,
            height=1,
            bg="blue",
            fg="yellow",
            command=self.ldap_run_query
        )

        o_button.pack()

        return o_button

    def render_screen(self):
        o_window = tk.Tk()
        o_window.title("LDAPGUI by Carles Mateo")
        self.o_window = o_window
        self.o_window.geometry(str(self.i_window_width) + 'x' + str(self.i_window_height))

        o_frame_ldap, o_entry_ldap = self.create_frame_ldapserver(o_window=o_window,
                                                                  i_width=self.i_frame_width,
                                                                  i_height=self.i_frame_height,
                                                                  s_server=self.s_ldap_server)
        self.o_entry_ldap = o_entry_ldap
        o_frame_connection, o_entry_connection = self.create_frame_connection(o_window=o_window,
                                                                              i_width=self.i_frame_width,
                                                                              i_height=self.i_frame_height,
                                                                              s_connection=self.s_connection)
        self.o_entry_connection = o_entry_connection

        o_frame_user, o_entry_user = self.create_frame_username(o_window=o_window,
                                                                i_width=self.i_frame_width,
                                                                i_height=self.i_frame_height,
                                                                s_username=self.s_username)
        self.o_entry_username = o_entry_user

        o_frame_query, o_entry_query = self.create_frame_query(o_window=o_window,
                                                               i_width=self.i_frame_width,
                                                               i_height=self.i_frame_height,
                                                               s_query=self.s_query)
        self.o_entry_query = o_entry_query


        o_frame_location, o_entry_location = self.create_frame_location(o_window=o_window,
                                                                        i_width=self.i_frame_width,
                                                                        i_height=self.i_frame_height)

        self.o_entry_location = o_entry_location

        o_button_search = self.create_button()
        self.o_button_search = o_button_search

        o_frame_results, o_entry_results = self.create_frame_results(o_window=o_window,
                                                                     i_width=self.i_frame_width,
                                                                     i_height=self.i_frame_results_height)
        self.o_entry_results = o_entry_results

        o_window.mainloop()

    def load_config_values(self):
        b_success, d_s_config = self.o_fileutils.read_config_file_values(self.s_config_file)
        if b_success is True:
            if 'server' in d_s_config:
                self.s_ldap_server = d_s_config['server']
            if 'connection' in d_s_config:
                self.s_connection = d_s_config['connection']
            if 'username' in d_s_config:
                self.s_username = d_s_config['username']
            if 'password' in d_s_config:
                self.s_password = d_s_config['password']
            if 'query' in d_s_config:
                self.s_query = d_s_config['query']


def main():
    o_fileutils = FileUtils()

    o_ldapgui = LDAPGUI(o_fileutils=o_fileutils)
    o_ldapgui.load_config_values()

    o_ldapgui.render_screen()


if __name__ == "__main__":
    main()

Dockerfile

FROM osixia/openldap

# Note: Docker compose will generate the Docker Image.
# Run:  sudo docker-compose up -d

LABEL maintainer="Carles Mateo"

ENV LDAP_ORGANISATION="Carles Mateo Test Org" \
    LDAP_DOMAIN="carlesmateo.com"

COPY bootstrap.ldif /container/service/slapd/assets/config/bootstrap/ldif/50-bootstrap.ldif

docker-compose.yml

version: '3.3'
services:
  ldap_server:
      build:
        context: .
        dockerfile: Dockerfile
      environment:
        LDAP_ADMIN_PASSWORD: test1234
        LDAP_BASE_DN: dc=carlesmateo,dc=com
      ports:
        - 389:389
      volumes:
        - ldap_data:/var/lib/ldap
        - ldap_config:/etc/ldap/slapd.d
  ldap_server_admin:
      image: osixia/phpldapadmin:0.7.2
      ports:
        - 8090:80
      environment:
        PHPLDAPADMIN_LDAP_HOSTS: ldap_server
        PHPLDAPADMIN_HTTPS: 'false'
volumes:
  ldap_data:
  ldap_config:

config.cfg

# Config File for LDAPGUI by Carles Mateo
# https://blog.carlesmateo.com
#

# Configuration for working with the prepared Docker
server=127.0.0.1:389
connection=cn=%USERNAME%,dc=carlesmateo,dc=com
username=admin_gh
password=admin_gh_pass
query=cn=%LOCATION%,dc=carlesmateo,dc=com

# Carles other tests
#server=ldapslave.carlesmateo.com
#connection=uid=%USERNAME%, cn=users, cn=accounts, dc=demo1, dc=carlesmateo, dc=com
#username=carlesmateo
#password=Secret123
#query=location=%LOCATION%,dc=demo1,dc=carlesmateo,dc=com

bootstrap.ldif

In order to bootstrap the Data on the LDAP Server I use a bootstrap.ldif file copied into the Server.

Credits, I learned this trick in this page: https://medium.com/better-programming/ldap-docker-image-with-populated-users-3a5b4d090aa4

dn: cn=developer,dc=carlesmateo,dc=com
changetype: add
objectclass: Engineer
cn: developer
givenname: developer
sn: Developer
displayname: Developer User
mail: developer@carlesmateo.com
userpassword: developer_pass

dn: cn=maintainer,dc=carlesmateo,dc=com
changetype: add
objectclass: Engineer
cn: maintainer
givenname: maintainer
sn: Maintainer
displayname: Maintainer User
mail: maintainer@carlesmateo.com
userpassword: maintainer_pass

dn: cn=admin,dc=carlesmateo,dc=com
changetype: add
objectclass: Engineer
cn: admin
givenname: admin
sn: Admin
displayname: Admin
mail: admin@carlesmateo.com
userpassword: admin_pass

dn: ou=Groups,dc=carlesmateo,dc=com
changetype: add
objectclass: organizationalUnit
ou: Groups

dn: ou=Users,dc=carlesmateo,dc=com
changetype: add
objectclass: organizationalUnit
ou: Users

dn: cn=Admins,ou=Groups,dc=carlesmateo,dc=com
changetype: add
cn: Admins
objectclass: groupOfUniqueNames
uniqueMember: cn=admin,dc=carlesmateo,dc=com

dn: cn=Maintaners,ou=Groups,dc=carlesmateo,dc=com
changetype: add
cn: Maintaners
objectclass: groupOfUniqueNames
uniqueMember: cn=maintainer,dc=carlesmateo,dc=com
uniqueMember: cn=developer,dc=carlesmateo,dc=com

lib/fileutils.py

You can download this file from the lib folder in my project CTOP.py

More information about programming with tkinter:

https://realpython.com/python-gui-tkinter/#getting-user-input-with-entry-widgets

https://www.tutorialspoint.com/python/python_gui_programming.htm

https://effbot.org/tkinterbook/entry.htm

About LDAP:

https://ldap3.readthedocs.io/en/latest/

A script to backup your partition compressed and a game to learn about dd and pipes

This article is more an exercise, like a game, so you get to know certain things about Linux, and follow my mental process to uncover this. Is nothing mysterious for the Senior Engineers but Junior Sys Admins may enjoy this reading. :)

Ok, so the first thing is I wrote an script in order to completely backup my NVMe hard drive to a gziped file and then I will use this, as a motivation to go deep into investigations to understand.

Ok, so the first script would be like this:

#!/bin/bash
SOURCE_DRIVE="/dev/nvme0n1"
TARGET_PATH="/media/carles/Seagate\ Backup\ Plus\ Drive/BCK/"
TARGET_FILE="nvme.img"
sudo bash -c "dd if=${SOURCE_DRIVE} | gzip > ${TARGET_PATH}${TARGET_FILE}.gz"

So basically, we are going to restart the computer, boot with Linux Live USB Key, mount the Seagate Hard Drive, and run the script.

We are booting with a Live Linux Cd in order to have our partition unmounted and unmodified while we do the backup. This is in order to avoid corruption or data loss as a live Filesystem is getting modifications as we read it.

The problem with this first script is that it will generate a big gzip file.

By big I mean much more bigger than 2GB. Not all physical supports support files bigger than 2GB or 4GB, but even if they do, it’s a pain to transfer this over the Network, or in USB files, so we are going to do a slight modification.

#!/bin/bash
SOURCE_DRIVE="/dev/nvme0n1"
TARGET_PATH="/media/carles/Seagate\ Backup\ Plus\ Drive/BCK/"
TARGET_FILE="nvme.img"
sudo bash -c "dd if=${SOURCE_DRIVE} | gzip | split -b 1024MiB - ${TARGET_PATH}${TARGET_FILE}-split.gz_"

Ok, so we will use pipes and split in order to generate many files as big as 1GB.

If we ls we will get:

-rwxrwxrwx 1 carles carles 1073741824 May 24 14:57 nvme.img-split.gz_aa
-rwxrwxrwx 1 carles carles 1073741824 May 24 14:58 nvme.img-split.gz_ab
-rwxrwxrwx 1 carles carles 1073741824 May 24 14:59 nvme.img-split.gz_ac

Then one may say, Ok, this is working, but how I know the progress?.

For old versions of dd you can use pv which stands for Pipe Viewer and allows you to know the transference between processes using pipes.

For more recent versions of dd you can use status=progress.

So the script updated with status=progress is:

#!/bin/bash
SOURCE_DRIVE="/dev/nvme0n1"
TARGET_PATH="/media/carles/Seagate\ Backup\ Plus\ Drive/BCK/"
TARGET_FILE="nvme.img"
sudo bash -c "dd if=${SOURCE_DRIVE} status=progress | gzip | split -b 1024MiB - ${TARGET_PATH}${TARGET_FILE}-split.gz_"

You can also download the code from:

https://gitlab.com/carles.mateo/blog.carlesmateo.com-source-code/-/blob/master/backup_partition_in_files.sh

Then one may ask himself, wait, if pipes use STDOUT and STDIN and dd is displaying into the screen, then will our gz file get corrupted?.

I like when people question things, and investigate, so let’s answer this question.

If it was a young member of my Team I would ask:

  • Ok, try,it. Check the output file to see if is corrupted.

So they can do zcat or zless to inspect the file, see if it has errors, and to make sure:

gzip -v -t nvme.img.gz
nvme.img.gz:        OK

Ok, so what happened?, because we were seeing output in the screen.

Assuming the young Engineer does not know the answer I would had told:

  • Ok, so you know that if dd would print to STDOUT, then you won’t see it, cause it would be sent to the pipe, so there is something more you’re missing. Let’s check the source code of dd to see what status=progress does

And then look for “progress”.

Soon you’ll find things like everywhere:

  if (progress_time)
    fputc ('\r', stderr);

Ok, pay attention to where is the data written: stderr. So basically the answer is: dd status=progress does not corrupt STDOUT and prints into the screen because it uses STDERR.

Other funny ways to get the progress would be to use:

watch -n10 "ls -alh /BCK/ | grep nvme | wc --lines"
So you would see in real time what was the advance and finally 512GB where compressed to around 336GB in 336 files of 1 GB each (except the last one)

Another funny way would had been sending USR1 signal to the dd process:

Hope you enjoyed this little exercise about the importance of going deep, to the end, to understand what’s going on on the system. :)

Instead of gzip you can use bzip2 or pixz. pixz is very handy if you want to just compress a file, as it uses multiple processors in parallel for the tasks.

xz or lrzip are other compressors. lrzip aims to compress very large files, specially source code.

Troubleshooting a shell prompt irresponsible that locks/hangs intermittently

You do df -h or ls / and the terminal freezes and not even CTRL + C works, you have a lock.

Normally this is due to a lock of the system trying to perform an IO.

Could be a physical spinning disk failing, but the most probably nowadays is that you have a network mount point and it is timing out.

If you execute mount and you get a timeout, and when you finally see the list you see a NFS, iSCSI or another kind of Network mount (you will see an Ip Address), check for errors.

To do this in CentOS/RHEL you can do as root:

dmesg | grep -i "timed"

or depending on the System

cat /var/log/messages | grep -i "timed"

You’ll get something like this:

[root@compute01 carles]# dmesg -T | grep timed | head -n5
[Fri Mar 20 02:27:44 2020] nfs: server storage07 not responding, timed out
[Fri Mar 20 02:27:44 2020] nfs: server storage07 not responding, timed out
[Fri Mar 20 02:27:44 2020] nfs: server storage07 not responding, timed out
[Fri Mar 20 02:27:44 2020] nfs: server storage07 not responding, timed out
[Fri Mar 20 02:27:45 2020] nfs: server storage07 not responding, timed out

Please note I use dmesg -T in order to have human readable date instead of Unix Epoch.

You can count the errors today:

[root@compute01 carles]# dmesg -T | grep time | grep "Mon Apr 6" | wc --lines
3123

Blocking some offending Ip’s easily with Ubuntu ufw

Ok, so we know that there are several ip’s that have attempted to hack the blog.

We know they try different urls looking for a exploit, or they try to hack a password by brute force…

We are using Amazon EC2 and the old infrastructure, not a VPC Network, so we cannot block a specific Ip to our Web Server.

In an article from 2015 I explained How to Stop a BitTorrent based DDoS attack, and was using iptables for that.

In this example I will show how to use ufw to block tow specific Ip’s, execute as root or with sudo:

ufw insert 1 deny from 89.35.39.60 to any
ufw insert 2 deny from 85.204.246.240 to any
ufw allow OpenSSH
ufw allow 22/tcp
ufw allow "Apache Full"
ufw enable
ufw status numbered

You can do ufw status numbered to see the status of ufw and the rules order.

root@ip-111-111-111-111:/home/ubuntu# ufw status numbered
Status: active
To Action From
-- ------ ----

[ 1] Anywhere DENY IN 89.35.39.60
[ 2] Anywhere DENY IN 85.204.246.240
[ 3] Apache Full ALLOW IN Anywhere
[ 4] OpenSSH ALLOW IN Anywhere
[ 5] 22/tcp ALLOW IN Anywhere
[ 6] Apache Full (v6) ALLOW IN Anywhere (v6)
[ 7] OpenSSH (v6) ALLOW IN Anywhere (v6)
[ 8] 22/tcp (v6) ALLOW IN Anywhere (v6)
root@ip-111-111-111-111:/home/ubuntu#

If you need to delete a rule, use the number on the left and, just type:

sudo ufw delete 2

CTOP.py

Current version is v.0.7.7 updated on 2020-08-19 19:00 IST (Irish Standard Time).

Find the source code in: https://gitlab.com/carles.mateo/ctop

Clone it with:

git clone https://gitlab.com/carles.mateo/ctop.git

ctop.py is an Open Source tool for Linux System Administration that I’ve written in Python3. It uses only the System (/proc), and not third party libraries, in order to get all the information required.
I use only this modules, so it’s ideal to run in all the farm of Servers and Dockers:

  • os
  • sys
  • time
  • shutil (for getting the Terminal width and height)

The purpose of this tool is to help to troubleshot and to identify problems with a single view to a single tool that has all the typical indicators.

It provides in a single view information that is typically provided by many programs:

  • top, htop for the CPU usage, process list, memory usage
  • meminfo
  • cpuinfo
  • hostname
  • uptime
  • df to see the free space in / and the free inodes
  • iftop to see real-time bandwidth usage
  • ip addr list to see the main Ip for the interfaces
  • netstat or lsof to see the list of listening TCP Ports
  • uname -a to see the Kernel version

Other cool things it does is:

  • Identifying if you’re inside an Amazon VM, Virtual Box, Docker Containers or lxc
  • Uses colors, and marks in yellow the warnings and in red the errors, problems like few disk space reaming or high CPU usage according to the available cores and CPUs.
  • Redraws the screen and adjust to the size of the Terminal, bigger terminal displays more information
  • It doesn’t use external libraries, and does not escape to shell. It reads everything from /proc /sys or /etc files.
  • Identifies the Linux distribution
  • Shows the most repeated binaries, so you can identify DDoS attacks (like having 5,000 apache instances where you have normally 500 or many instances of Python)
  • Indicates if an interface has the cable connected or disconnected
  • Shows the Speed of the Network Connection (useful for Mellanox cards than can operate and 200Gbit/sec, 100, 50, 40, 25, 10…)
  • It displays the local time and the Linux Epoch Time, which is universal (very useful for logs and to detect when there was an issue, for example if your system restarted, your SSH Session would keep latest Epoch captured)
  • No root required
  • Displays recent errors like NFS Timed outs or Memory Read Errors.
  • You can enforce the output to be in a determined number of columns and rows, for data scrapping.
  • You can specify the number of loops (1 for scrapping, by default is infinite)
  • You can specify the time between screen refreshes, for long placed SSH sessions
  • You can specify to see the output in b/w or in color (default)

Limitations:

  • It only works for Linux, not for Mac or for Windows. Although the idea is to help with Server’s Linux Administration and Troubleshot and Mac and Windows do not have /proc
  • The list of process of the System is read every 30 seconds, to avoid adding much overhead on the System, other info every second

I decided to code name the version 0.7 as “Catalan Republic” to support the dreams and hopes and democratic requests of the Catalans people to become and independent republic.

I created this tool as Open Source and if you want to help I need people to test under different versions of:

  • Atypical Linux distributions

If you are a Cloud Provider and want me to implement the detection of your VMs, so the tool knows that is a instance of the Amazon, Google, Azure, Cloudsigma, Digital Ocean… contact me through my LinkedIn.

Monitoring an Amazon Instance, take a look at the amount of traffic sent and received

Some of the features I’m working on are parsing the logs checking for errors, kernel panics, processed killed due to lack of memory, iscsi disconnects, nfs errors, checking the logs of mysql and Oracle databases to locate errors

Dealing with Performance degradation on ZFS (DRAID) Rebuilds when migrating from a single processor to a multiprocessor platform

This is the history it happen to me some time ago, and so the commands I used to troubleshot. The purpose is to share knowledge in a interactive way. There are some hidden gems that you’ll acquire if you have the patience to go over all the document and read it all…

I had qualified Intel Xeon single processor platform to run my DRAID (ZFS Declustered RAID) project for my employer.

The platforms I qualified were:

1) single processor for Cold Storage (SAS Spinning drives): 4U60, newest models 4602

2) for multiprocessor: the 4U90 (90 Spinning drives) and Flash: All-Flash-Arrays.

The amounts of RAM I was using for my tests range for 64GB to 384GB.

Somebody in the company, at executive level, assembled an experimental config that was totally new for us and wanted to try by their own. It was the 4602 with multiprocessor and 32GB of RAM.

When they were unable to make it work at the expected speed, they required me to troubleshot and to make it work.

The 4602 single processor had two IOC (Input Output Controller, LSI Logic / Symbios Logic SAS3008 PCI-Express Fusion-MPT SAS-3 (rev 02) ), while the 4602 double processor had four IOC, so given that each of those IOC can perform at peaks of 6GB/s, with a maximum total of 24 GB/s, the performance when reading/writing from all the drives should be better.

But this Server was returning double times for Rebuilding, respect the single processor version, which didn’t make any sense.

I had to check everything. There was the commands I ran:

Check the upgrade of the CPU:

htop
lscpu

Changing the Zoning.

Those Servers use SAS drives dual ported, which means that two different computers can be connected to the same drive and operate at the same time. Is up to you to make sure you don’t introduce corruption. Those systems are used mainly for HA (High Availability).

Those Systems allow to be configured in different zoning modes. That’s the way on how each of the two servers (Controllers) see the disk. In one zoning each Controller sees only 30 drives, in another each IOC sees all the drives (for redundancy but performance constrained to 1 IOC Speed).

The config I set is each IOC will see 15 drives, so each one of the 4 IOC will have 6GB/s for 15 drives. Given that these spinning drives perform in the outtermost part of the cylinder at 265MB/s, that means that at maximum speed one IOC will be using 3.97 GB/s, will say 4GB/s. Plenty of bandwidth.

Note: Spinning drives have different performance depending on how close you’re to the cylinder. In the innermost part it goes under 145 MB/s, and if you read all of those drive sequentially with dd it will return an average speed of 145 MB/s.

With this command you can sive live how it performs and the average read speed in real time. Use skip to jump to that position (relative to bs) in the drive, so you can test directly the speed at the innermost close to the cylinder part of t.

dd if=/dev/sda of=/dev/null bs=1M status=progress

I saw that the zoning was not right one, so I set it correctly:

[root@4602Carles ~]# sg_map -i | grep NEWISYS
/dev/sg30  NEWISYS   NDS-4602-CS       0112
/dev/sg61  NEWISYS   NDS-4602-CS       0112
/dev/sg63  NEWISYS   NDS-4602-CS       0112
/dev/sg64  NEWISYS   NDS-4602-CS       0112
[root@4602Carles10 ~]# sg_senddiag /dev/sg30  --pf --raw=04,00,00,01,53
[root@4602Carles10 ~]# sleep 50
[root@4602Carles10 ~]# sg_senddiag /dev/sg30 --pf -r 04,00,00,01,43
[root@4602Carles10 ~]# sleep 50
[root@4602Carles10 ~]# reboot

The sleeps after rebooting the expanders are recommended. Rebooting the Operating System too, to avoid problems with some Software as the expanders changed live.

If you have ZFS pools or workloads stop them and export the pool before messing with the expanders.

In order to check to which drives is connected each IOC:

[root@4602Carles10 ~]# sg_map -i -x
/dev/sg0  0 0 0 0  0  /dev/sda  TOSHIBA   MG07SCA14TA       0101
/dev/sg1  0 0 1 0  0  /dev/sdb  TOSHIBA   MG07SCA14TA       0101
/dev/sg2  0 0 2 0  0  /dev/sdc  TOSHIBA   MG07SCA14TA       0101
/dev/sg3  0 0 3 0  0  /dev/sdd  TOSHIBA   MG07SCA14TA       0101
/dev/sg4  0 0 4 0  0  /dev/sde  TOSHIBA   MG07SCA14TA       0101
/dev/sg5  0 0 5 0  0  /dev/sdf  TOSHIBA   MG07SCA14TA       0101
/dev/sg6  0 0 6 0  0  /dev/sdg  TOSHIBA   MG07SCA14TA       0101
/dev/sg7  0 0 7 0  0  /dev/sdh  TOSHIBA   MG07SCA14TA       0101
/dev/sg8  1 0 8 0  0  /dev/sdi  TOSHIBA   MG07SCA14TA       0101
/dev/sg9  1 0 9 0  0  /dev/sdj  TOSHIBA   MG07SCA14TA       0101
/dev/sg10  1 0 10 0  0  /dev/sdk  TOSHIBA   MG07SCA14TA       0101
/dev/sg11  1 0 11 0  0  /dev/sdl  TOSHIBA   MG07SCA14TA       0101
[...]
/dev/sg16  4 0 16 0  0  /dev/sdq  TOSHIBA   MG07SCA14TA       0101
/dev/sg17  4 0 17 0  0  /dev/sdr  TOSHIBA   MG07SCA14TA       0101
[...]
/dev/sg30  0 0 30 0  13  NEWISYS   NDS-4602-CS       0112
[...]

Still after setting the right zone the Rebuilds were slow, the scan rate half of the obtained with a single processor.

I tested that the system was able to provide the expected performance by reading from all the drives at the same time. This is done with:

dd if=/dev/sda of=/dev/null bs=1M status=progress &
dd if=/dev/sdb of=/dev/null bs=1M status=progress &
dd if=/dev/sdc of=/dev/null bs=1M status=progress &
dd if=/dev/sdd of=/dev/null bs=1M status=progress &
[...]

I do this for all the drives at the same time and with iostat:

iostat -y 1 1

I check the status of the memory with:

slabtop
free
htop

I checked the memory and htop during a Rebuild. Memory was more than enough. However CPU usage was higher than expected.

The red bars in the image correspond to kernel processes, in this case is the DRAID Rebuild. I see that the load is higher than the usual with a single processor.

I capture all the parameters from ZFS with:

zfs get all

All this information is logged into my forensics document, so later can be checked by my Team or I can share with other Architects or other members of the company. I started this methodology after I knew how Google do their SRE forensics / postmortem documents. Also for myself is useful for the future to have a log of the commands I executed and a verbose output of the results.

I install the smp_utils

yum install smp_utils

Check things:

ls -al  /dev/bsg/
total 0drwxr-xr-x.  2 root root     3020 May 22 10:16 .
drwxr-xr-x. 20 root root     8680 May 22 10:16 ..
crw-------.  1 root root 248,  76 May 22 10:00 1:0:0:0
crw-------.  1 root root 248, 126 May 22 10:00 10:0:0:0
crw-------.  1 root root 248, 127 May 22 10:00 10:0:1:0
crw-------.  1 root root 248, 136 May 22 10:00 10:0:10:0
crw-------.  1 root root 248, 137 May 22 10:00 10:0:11:0
crw-------.  1 root root 248, 138 May 22 10:00 10:0:12:0
crw-------.  1 root root 248, 139 May 22 10:00 10:0:13:0
[...]
[root@4602Carles10 ~]# smp_discover /dev/bsg/expander-1:0
[...]
[root@4602Carles10 ~]# smp_discover /dev/bsg/expander-1:1

I check for errors in the expander that could justify the problems of performance:

for i in `seq 0 64`; do smp_rep_phy_err_log -p $i /dev/bsg/expander-1\:0 ; done
Report phy error log response:
  Expander change count: 567
  phy identifier: 0
  invalid dword count: 0
  running disparity error count: 0
  loss of dword synchronization count: 0
  phy reset problem count: 0
[...]
Report phy error log response:
  Expander change count: 567
  phy identifier: 52
  invalid dword count: 168
  running disparity error count: 172
  loss of dword synchronization count: 5
  phy reset problem count: 0
Report phy error log response:
  Expander change count: 567
  phy identifier: 53
  invalid dword count: 6
  running disparity error count: 6
  loss of dword synchronization count: 0
  phy reset problem count: 0
Report phy error log response:
  Expander change count: 567
  phy identifier: 54
  invalid dword count: 267
  running disparity error count: 270
  loss of dword synchronization count: 4
  phy reset problem count: 0
Report phy error log response:
  Expander change count: 567
  phy identifier: 55
  invalid dword count: 127
  running disparity error count: 131
  loss of dword synchronization count: 5
  phy reset problem count: 0
Report phy error log result: Phy vacant
Report phy error log result: Phy vacant
Report phy error log result: Phy vacant
Report phy error log result: Phy vacant
Report phy error log result: Phy vacant
Report phy error log result: Phy vacant
Report phy error log result: Phy vacant
Report phy error log result: Phy vacant
Report phy error log result: Phy vacant

There are some errors, and I check with the Hardware Team, which pass a battery of tests on the machine and say that the machine passes. They tell me that if the errors counted were in order of millions then it would be a problem, but having few of them is usual.

My colleagues previously reported that the memory was performing well, and the CPU too. They told me that the speed was exactly double respect a platform with one single CPU of the same kind.

Even if they told me that, I ran cmips tests to make sure.

git clone https://github.com/cmips/cmips_bin

It scored 16,000. The performance was Ok in general terms but the problem is that I didn’t have a baseline for that processor in single processor, so I cannot make sure that the memory bandwidth was Ok. The performance was less that an Amazon c3.8xlarge. The system I was testing is a two processor system, but each CPU is cheap, around USD $400.

Still my gut feeling was telling me that this double processor server should score more.

lscpu
[root@DRAID-1135-14TB-2CPU ~]# lscpu
 Architecture:          x86_64
 CPU op-mode(s):        32-bit, 64-bit
 Byte Order:            Little Endian
 CPU(s):                32
 On-line CPU(s) list:   0-31
 Thread(s) per core:    2
 Core(s) per socket:    8
 Socket(s):             2
 NUMA node(s):          2
 Vendor ID:             GenuineIntel
 CPU family:            6
 Model:                 79
 Model name:            Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz
 Stepping:              1
 CPU MHz:               2299.951
 CPU max MHz:           3000.0000
 CPU min MHz:           1200.0000
 BogoMIPS:              4199.73
 Virtualization:        VT-x
 L1d cache:             32K
 L1i cache:             32K
 L2 cache:              256K
 L3 cache:              20480K
 NUMA node0 CPU(s):     0-7,16-23
 NUMA node1 CPU(s):     8-15,24-31
 Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb cat_l3 cdp_l3 intel_ppin intel_pt ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdt_a rdseed adx smap xsaveopt cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts spec_ctrl intel_stibp

I check the memory configuration with:

dmidecode -t memory

I examined the results, I see that the processor can only operate the DDR4 ECC 2400 Memory at 2133 and… I see something!. This Controller before was a single processor with 2 Memory Sticks of 16GB each, dual rank.

I see that now I have the same number of sticks in that machine, but I have two CPU!. So 2 Memory sticks in total, for 2 CPU.

That’s no good. The memory must be in pairs and in the right slots to get the maximum performance.

1 memory module for 1 CPU doesn’t allow to have Dual Channel and probably is affecting the performance. Many Servers will not even boot if you add an odd number of memory sticks per CPU.

And many Servers can operate at full speed only if all the banks are filled.

I request to the Engineers in Silicon Valley to add 4 modules in the right slots. They did, and I repeated the tests and the performance was doubled then.

After some days I had some time with the machine, I repeated the test and I got a CMIPS Score of around 20,000.

Multiprocessor world is far more complicated than single processor. Some times things can work not as expected, and not be evident, for example cache pipeline can act diferent for a program working in multiprocessor and single processor. Or the QPI could be saturated.

After this I shared my forensics document with as many Engineers as I could, so they could learn how I did to troubleshot the problem, and what was the origin of it, and I asked them to do the same so we can track their steps and progress if something needs to be troubleshoot.

After proper intensive testing the Server was qualified. Lesson here is that changes cannot be commited quickly, need their time.