Category Archives: Reflections

cmemgzip: a Python tool to compress files in memory when there is no free space on the disk

Rationale

All the Operations Engineers and SREs who work with systems have at some point found a Server with the disk full of logs, needing to keep those logs and, at the same time, needing the system to keep running.

This is an uncomfortable situation.

I remember when I was being interviewed at Facebook, in Menlo Park, for an SDM (Software Development Manager) position in SRE, back in 2013-2014. They asked me about a situation where a Server had its disk full and they deleted a big Apache log file, but the space didn’t come back. They told me that nobody had ever been able to solve this.

I explained that what happened is that Apache still held the fd (file descriptor) and would keep trying to write to the end of that file, so even if they removed the huge log file with the rm command, they would not get any free space back from the system. I explained that the easiest solution was to stop Apache. They agreed and asked me how we could do the same without restarting the Webserver, and I said by manipulating the file descriptors under /proc. They told me that I was the first person to solve this.
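The behavior described above can be reproduced with a small Python sketch. It is only an illustration under assumptions: the function names are hypothetical, the /proc trick is Linux only, and the demo works on a temporary file owned by the same process instead of a real Apache log.

```python
import os
import tempfile


def truncate_open_file(pid, fd_number):
    """Empty the file behind an open descriptor of a process (Linux only).

    Opening /proc/<pid>/fd/<n> for writing works even when the file name
    has already been removed with rm.
    """
    with open(f"/proc/{pid}/fd/{fd_number}", "w"):
        pass  # opening in "w" mode truncates the file to 0 bytes


def demo(size=1024 * 1024):
    """Return the size seen through the descriptor after rm and after truncating."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"x" * size)
        os.unlink(path)                        # like rm: the name disappears...
        size_after_rm = os.fstat(fd).st_size   # ...but the data is still there
        if os.path.isdir("/proc/self/fd"):
            truncate_open_file(os.getpid(), fd)
        else:
            os.ftruncate(fd, 0)                # fallback where /proc is not available
        size_after_truncate = os.fstat(fd).st_size
    finally:
        os.close(fd)
    return size_after_rm, size_after_truncate
```

The space only comes back when the file is truncated through the descriptor or when the process that holds it closes it. Truncating the log of a running service is a delicate operation, so treat this as a demonstration and not as a procedure.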

How it works

Basically, cmemgzip will read a file, as binary, and will load it completely into Memory.

Then it will compress it, also in Memory. Then it will release the memory used to keep the original, validate write permissions on the folder, check that the compressed file is smaller than the original, delete the original and, using the space now available on disk, write the compressed and smaller version of the file in gzip format.
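The sequence above can be sketched in a few lines of Python. This is a minimal sketch using only the standard library, not the actual cmemgzip source: the function name and the error messages are assumptions made for the example.

```python
import gzip
import os


def compress_in_memory(path):
    """Replace `path` with `path`.gz without needing free disk space first."""
    with open(path, "rb") as source:
        original = source.read()                 # whole file loaded into memory
    original_size = len(original)
    compressed = gzip.compress(original, compresslevel=9)
    del original                                 # release the uncompressed copy

    folder = os.path.dirname(os.path.abspath(path))
    if not os.access(folder, os.W_OK):
        raise PermissionError(f"Cannot write to {folder}")
    if len(compressed) >= original_size:
        raise ValueError("Compressed data is not smaller than the original")

    os.remove(path)                              # this frees the space...
    gz_path = path + ".gz"
    with open(gz_path, "wb") as target:          # ...that the .gz now uses
        target.write(compressed)
    if os.path.getsize(gz_path) != len(compressed):
        raise OSError("Size on disk does not match the compressed size")
    return gz_path
```

Note that between deleting the original and finishing the write, the only copy of the data lives in memory, which is why the checks happen before the delete.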

Since version 0.3 you can specify the amount of memory to use for the blocks of data read from the file, so you can greatly limit the memory usage and compress files much bigger than the amount of memory available.
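One possible way to compress block by block while keeping a single gzip-compatible output is a streaming compressor. This is a sketch of that general technique with the standard zlib module, not necessarily how cmemgzip does it: the function name and the default block size are assumptions.

```python
import io
import zlib


def compress_in_blocks(source, block_size=1024 * 1024):
    """Compress a binary file object block by block and return the gzip bytes."""
    # wbits=31 asks zlib for a gzip header and trailer around the deflate stream
    compressor = zlib.compressobj(9, zlib.DEFLATED, 31)
    output = io.BytesIO()
    while True:
        block = source.read(block_size)          # at most block_size bytes per read
        if not block:
            break
        output.write(compressor.compress(block))
    output.write(compressor.flush())             # remaining data plus the gzip trailer
    return output.getvalue()
```

With this approach only one block of the original plus the compressed result are held in memory at the same time, and the result can still be read with gzip/gunzip.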

If for whatever reason the gz version cannot be written to disk, you’ll be asked for another path to write it to.

I mentioned File Descriptors before, and programs that may keep those files open.

So my advice here is: if you have to compress Apache logs or logs from a multi-threaded program, the disk is full, and several instances may be trying to write to the log file, stop the Apache service if you can, and then run cmemgzip. I want to add in the future the ability to auto-release open fds, but this is delicate and requires a lot of time to make sure it will be reliable in all circumstances and will obey the exact wishes of the SRE performing the operation, without unexpected side effects. It can be implemented with a new parameter, so the SysAdmin will know what they are requesting.

Get the source code

You can decompress it later with gzip/gunzip.

You can git clone the cmemgzip project from here:

https://gitlab.com/carles.mateo/cmemgzip

git clone https://gitlab.com/carles.mateo/cmemgzip

The README.md is very clear:

https://gitlab.com/carles.mateo/cmemgzip/-/blob/master/README.md

The program is written in Python 3, and I released it under the MIT License, so you can use it, and its Open Source code, with real Freedom.

Do you want to test on other platforms?

This is version 0.3.

I have only tested it in:

  • Ubuntu 20.04 LTS Linux for x64
  • Ubuntu 20.04 LTS 64 bits under Raspberry Pi 4 (ARM Processors)
  • Windows 10 Professional x64
  • Mac OS X
  • CentOS

It should work on all the platforms supporting Python, but if you want to contribute by testing on other platforms, like Windows 32 bit, Solaris or BSD, let me know.

Alternative solutions

You can create a ramdisk and compress the file to there. Then delete the original, move the compressed file from the ramdisk to the hard drive, and unload the ramdisk Kernel Module. However, we very often face this problem in Docker containers or in instances that don’t have the Kernel module installed. It is much easier to run cmemgzip.

Another strategy for the future is to have a folder based on ZFS with compression. Again, ZFS has to be installed on the system, and this is not the case with Docker containers.

cmemgzip is designed to work when there is no free space. If there is free space, you should use the gzip command.

In a real emergency, when you have neither enough RAM, nor disk space, nor the possibility to send the log files to another server to be compressed there, you could stop using the swap (swapoff), use fdisk to change the swap partition type to Linux, format it as ext4, mount it, and use the space to compress the files. After moving the compressed files to the original folder, use fdisk to change the partition type back to Swap, recreate the swap area (mkswap), and enable swap again (swapon).

Memory requirements

As you can imagine, the weak point of cmemgzip is that, if the file is completely loaded into memory and then compressed, the free memory required on the Server/Instance/VM is at least the size of the file plus the size of the compressed file. If you guessed that, you guessed right.
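As a rough way to reason about that requirement, the arithmetic can be written down as a tiny helper. This is only an estimate under assumptions: the compression ratio has to be guessed beforehand (it depends on the content of the file), and both function names are made up for the example.

```python
def estimated_memory_needed(file_size, compressed_ratio):
    """Bytes needed to hold the original and its compressed copy at the same time."""
    return file_size + int(file_size * compressed_ratio)


def fits_in_memory(file_size, compressed_ratio, available_memory):
    """True if the estimate does not exceed the available memory, in bytes."""
    return estimated_memory_needed(file_size, compressed_ratio) <= available_memory
```

For example, a file of 1,000 bytes that compresses to 25% of its size needs about 1,250 bytes by this estimate. Real usage will be somewhat higher because of the interpreter and the compressor itself.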

If there is not enough memory for loading the file in memory, the program is interrupted gracefully.

I decided to keep it simple, but this can be an option for the future.

So if your VM has 2GB of Available Memory, you will be able to use cmemgzip on uncompressed log files of around 1.7GB.

In version 0.3 I implemented the ability to load chunks of the original file and compress them into memory, so I would be able to use less memory. But then the compression is less efficient, and initial tests indicate that I’ll have to keep a separate file for each compressed chunk. So I will need to create an uncompress tool as well, whereas now the output is completely compatible with gzip/gunzip, zcat, the file extractor from Ubuntu, etc…

For a big Server with a logfile of 40TB, around 300GB of RAM should be sufficient (the Servers I use have 768 GB of RAM normally).

Honestly, nowadays we more frequently find ourselves with VMs or Instances in the Cloud with small drives (10 to 15GB) and enough Available RAM, rather than Servers with huge mount points. This kind of instance, which means scaling horizontally, makes it more difficult to have NFS Servers where we can move those logs, for security reasons.

So cmemgzip covers some specific cases very well, while it is not useful for all scenarios.

I think it’s safe to say it covers 95% of the scenarios I’ve found in the past 7 years.

cmemgzip will not help you if you run out of inodes.
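To tell a full disk apart from a filesystem that has run out of inodes, the inode counters can be checked from Python on Unix-like systems. A minimal sketch, where the function name is an assumption and the values depend on the filesystem (some report 0 for both):

```python
import os


def free_inodes(path="/"):
    """Return (free, total) inodes of the filesystem that contains `path`."""
    stats = os.statvfs(path)
    # f_favail: inodes available to unprivileged users, f_files: total inodes
    return stats.f_favail, stats.f_files
```

This is the same kind of information that df -i shows.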

Usage

Usage is very simple, and I kept the output very verbose: the nature of the work is Operations, and Engineers need to know what is going on.

I return error level/exit code 0 if everything goes well or 1 on errors.

./cmemgzip.py /home/carles/test_extract/SherlockHolmes.txt
 
 cmemgzip.py v.0.1

 Verifying access to: /home/carles/test_extract/SherlockHolmes.txt
 Size of file: /home/carles/test_extract/SherlockHolmes.txt is 553KB (567,291 bytes)
 Reading file: /home/carles/test_extract/SherlockHolmes.txt (567,291 bytes) to memory.
 567,291 bytes loaded.
 Compressing to Memory with maximum compression level…
 Size compressed: 204KB (209,733 bytes). 36.97% of the original file
 Attempting to create the gzip file empty to ensure write permissions
 Deleting the original file to get free space
 Writing compressed file /home/carles/test_extract/SherlockHolmes.txt.gz
 Verifying space written match size of compressed file in Memory
 Write verification completed.

You can also simulate, without actually deleting or writing to disk, just in order to know what will be the

Installation

There are no third party libraries to install. I only use the standard ones: os, sys, gzip

So clone it with git in your preferred folder and just create a symbolic link with your favorite name:

sudo ln --symbolic /home/carles/code/cmemgzip/cmemgzip.py /usr/bin/cmemgzip

I like to create the link without the .py extension.

This way you can invoke the program from anywhere by just typing: cmemgzip

News from the blog 2020-12-16

  • Happy Christmas!

This is the 3D tree that I bought, which is programmable in Python :)

  • If you’re into ZFS I recommend this video:

https://klarasystems.com/learning/webinars/best-practices-for-optimizing-zfs1/

It is a video from klarasystems about best practices for ZFS.

  • Amazing Apache Kafka resources can be found here:

https://developer.confluent.io/

https://developer.confluent.io/learn-kafka/

  • I decided to lower the price of my book to the minimum on LeanPub, $5 USD, while covid is going on, in order to help people with their lives.
    https://leanpub.com/u/carlesmateo

I read with surprise that Comcast is capping Internet use at 1.2TB per month, and that they will charge for any excess.

So… if I contract a Backup with Carbonite or BackBlaze or DropBox or another company and I back up my 10TB of files, Comcast will ruin me by charging for the excess…
Or if I work from home, or the family watches a lot of Netflix…
I can only think of their Cast Strategy of CastNumberOfClientsToBankruptcy.

A joke to indicate that I think they will lose clients.

Imagine that yesterday I downloaded two images of Ubuntu, 5 GB in total, installed Call of Duty on one computer, 180 GB, installed a few Xbox games, 400 GB, listened to Spotify, 10 GB, watched YouTube, 3 GB, and watched Netflix, 4 GB: that is 602 GB in one day.

Not counting the bandwidth WFH (Working from Home).

Not counting Windows Updates, TV updates, consoles updates, Android Updates, Ubuntu updates…

And this is done in the middle of the covid-19 pandemic, with so many people locked down at home, playing video games, watching movies, and desperately needing distractions.

<irony>Well done Comcast!</irony>

Post-Mortem: The mystery of the duplicated Transactions in an e-Commerce

Together with 4 other Senior BackEnd Engineers, I wrote the new e-Commerce for a multinational.

The old legacy Software evolved into a different code for every country, making it impossible to be maintained.

The new Software we created used inheritance to share the same base code across countries, and overrode only the behavior specific to each country, such as the payment methods, for example Brazil supporting “parcelados” or Germany with its specific payment players.

We rewrote the old procedural PHP BackEnd into modern PHP, with OOP and our own Framework, but we had to keep the transactional code in the existing MySQL Procedures, so the logic was split. There was a Front End Team consuming our JSONs. Basically, all the Front End code was cached in Akamai and pages were rendered according to the JSONs served from our BackEnd.

It was a huge success.

This e-Commerce site had Campaigns that started at a certain time, so the amount of traffic that would come at the same time would be challenging.

The project was working very well, and after some time the original Team was split into different projects in the company, and a Team for maintenance and new features was hired.

At a certain point they started to encounter duplicate transactions, and nobody was able to solve the mystery.

I specialize in fixing impossible problems. They used to send me on Impossible Missions, and I am famous for solving impossible problems easily.

So I started the task with an SRE approach.

The System had many components and layers. The problem could be in many places.

In my arsenal of tools I had Software like mysqldebugger, with which I had found an unnoticed bug in decimal calculations in the past, surprising everybody.

The Engineers previously involved believed the problem was on the Database side. They were having difficulty identifying the issue because of the random nature of the repetitions.

Sometimes the order lines were duplicated, and other times it was the payments, which meant charging the customer twice.

Redis Cluster could also play a part in this, as it stored the session information and the basket.

But I had to follow the logical sequence of steps.

If transactions from a customer were duplicated, that meant that, in the first place, those requests had arrived at the System. So that was a good starting point.

With a list of duplicated operations, I checked the Webservers’ logs.

That was a bit tricky, as the Webserver was recording the IP of the Load Balancer, not the IP of the customer. But we were tracking the sessionid, so with that I could track any user’s request history. Another good thing was that we were using cookies to stick the user to the same Webserver node. That has pros and cons, but in this case I didn’t have to worry about combining the logs of all the Webservers: I could just identify a transaction in one node and stick to that node’s log.

I was working with SSH and Bash; none of the log aggregators that exist today were available at that time.

So when I started to look through the web logs and grep a bit, a smile appeared on my face. :)

No transactions were being repeated because of bad behavior on the MySQL Masters, or because of BackEnd problems. The HTTP requests were actually being performed twice.
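The kind of check I did by hand with grep can be sketched in Python. The log format here is hypothetical (epoch seconds, session id and request separated by spaces); the real Webserver logs were different, and the function name and the 2 second window are assumptions for the example.

```python
def find_repeated_requests(lines, window_seconds=2.0):
    """Return (session_id, request, seconds_apart) for requests repeated within the window."""
    last_seen = {}
    repeated = []
    for line in lines:
        # Hypothetical format: "<epoch_seconds> <session_id> <METHOD> <path>"
        timestamp, session_id, request = line.split(" ", 2)
        timestamp = float(timestamp)
        key = (session_id, request.strip())
        previous = last_seen.get(key)
        if previous is not None and timestamp - previous <= window_seconds:
            repeated.append((session_id, request.strip(), timestamp - previous))
        last_seen[key] = timestamp
    return repeated
```

The same session sending the same request twice, a fraction of a second apart, is the signature of a double click rather than of a Database problem.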

And the explanation for that was much simpler.

Many Windows and Mac Users are used to double clicking on the Desktop to open programs, so when they started to use the Internet, they did the same. They double clicked on the Submit button of the forms, causing two JavaScript requests in parallel.

When I explained it they were really surprised, but then they started to worry about how they could fix that.

Well, there are many ways, like using a UUID in each request and not accepting two concurrent requests with the same UUID, but I came up with something that we could deploy super fast.
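The UUID alternative mentioned above could look like this on the server side. It is a sketch of the idea in Python, not the code of that project (which was PHP): the class and function names are hypothetical, and with several Webserver nodes the set of seen IDs would have to live in a shared store instead of in process memory.

```python
import threading
import uuid


class RequestDeduplicator:
    """Accept each request ID only once, even with concurrent submissions."""

    def __init__(self):
        self._seen = set()
        self._lock = threading.Lock()

    def accept(self, request_id):
        """Return True the first time a request ID is seen, False afterwards."""
        with self._lock:
            if request_id in self._seen:
                return False
            self._seen.add(request_id)
            return True


def new_request_id():
    """ID the page would generate once per form and send with every submit."""
    return str(uuid.uuid4())
```

A double click sends the same ID twice, so the second request is rejected before it reaches the transactional code.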

I explained how to change the JavaScript code so that the buttons would have no default submit action and would trigger a JavaScript method instead. That method would perform the submit only if a boolean variable was False, and would then set the boolean to True and disable the button so it could not be clicked anymore. It was almost impossible to get a double click through, as the JavaScript disabled the button so fast that the second click would not trigger anything. But even if that were possible, only one request would be made, as the variable was set to True on the first click event.

That case was very funny for me, because it was not necessary to go crazy inspecting the different layers of the system. The problem was detected simply with HTTP logs. :)

People often forget to follow the logical steps, when many problems are much simpler than they seem.

As a curious note, I still see people double clicking on links and buttons on the Web, and some Software not handling it. :)

Working abroad and the English complexes and insecurity of non-natives

I write this article thinking of all my friends who feel insecure about speaking English.

They worry about whether they are pronouncing correctly, or whether they are building their sentences in the correct grammatical order. That’s the school system’s fault, I think.

As Catalans we learn new languages easily. We speak Catalan natively, and Spanish, and at school we are taught French and English, and if things have not changed, we can choose between Latin and Greek. (I studied both)

But doing 1 or 2 hours per week of English doesn’t give you a good level of the language, and in fact, few people in Catalonia and Barcelona speak fluent English with a good accent.

I learnt English by myself, by reading programming manuals when I was 5 years old. I also learnt to play chess by watching others playing and when I won the first time I played, against a guy 5 years older than me, he could not believe it was my first match.

I was 10, I think.

When I started classes at school I realized that I already knew English.

Commands in Basic, like list, run, print, goto, had the same meaning as in the spoken human language.

I grew up and I saw that the translations of technical books into Spanish (no Catalan was available) were horrible. They were actually translating the commands, so since I was 15 y.o. I have only read manuals in English.

In several jobs, for multinationals, I had to talk with colleagues from different parts of the world, so I was speaking Portuguese, sometimes Italian or French, I could read a bit of German (I was head of Department in Volkswagen IT, gedas), and obviously English.

Still, talking with a subset of the language, basically related to Hardware and Software, is not the same as fully living abroad.

Starting with English is easy: you can use the present, will for the future and did for the past, and you can make it work. But then you get to the phrasal verbs, the irregular verbs, the different tenses… English is a context language and it is not a phonetic language: words that are written exactly the same way sound different, and words that sound the same are written differently. So it has a lot of exceptions.

But it is in this, in the exceptions, and in the fact that it is so widespread, where we can find the strength to grow without fear.

Catalan is spoken very differently if we are in Barcelona, Lleida, Girona, Tarragona or if we are in València, or Menorca, or Alguer or the country of Andorra.

The same happens with English. It is not only very different from England to the States, to Australia, to Scotland, to Ireland… it is also very different between Dublin and Cork, or between different parts of The States, like Texas and California.

Also, there are many people who speak it in Europe, in India… and all of them have different accents!

So in my experience everybody will understand you. Especially because English is fully understood by the context. Maybe they will need you to finish the sentence to understand, but they will.

There are also annoying differences that can make you think that you are making mistakes.

Like:

  • Data Center (American) vs Data Centre (England)
  • Color (American) vs Colour
  • Humor (American) vs Humour

Don’t be surprised if many native people find your accent exotic, and they love it.

That’s what happened to me many times.

Also, I think the school is terrible at teaching. They teach children all those rigid grammar expressions, when the living language is much more fluid and free.

For example, a person from Barcelona will be nervous asking a colleague:

  • Are you going to the cinema tonight?
  • Have you finally had gone to Disney World?

And he will be nervous thinking in real time if he is building the phrase right.

Then, after 2 years, he will realize that people say:

  • You go to cinema tonight?
  • Did you go to Disney World finally?

The latter are very close to the grammar we use in Catalan, and hence easy to express fluently.

I can share with you the process I follow to improve my English.

Since I was 15 y.o. I have been reading all the manuals in English.

I was watching some movies in English, at the beginning with subtitles in Spanish (no Catalan was available) and later with the subtitles in English.

Since 2013, when I was invited by Amazon to Dublin and by Facebook to Menlo Park (US), I started to watch all the movies and sitcoms in English.

At the beginning with subtitles in English with the idea to correlate pronunciation and writing. To get my ears used to it.

I went to conferences, and I saw some people who, after living for years in English-speaking countries, had an accent much more difficult to understand than mine. They were people with a reputation. And I understood that we IT guys are very lucky to be valued for what we know. For our brain.

After a lifetime of reading in English and 4 years of watching all the movies in English, my accent had improved, but when I arrived in Cork I had difficulties understanding some of the Irish. So I had to get used to the music, the cadence, of the way they talk, and to some words and expressions, and to the sense of humor.

I asked my Irish colleagues to correct me when I pronounced something wrong, and they were so nice as to do it. And they did it in a very polite way, for example if I said:

Is a new Engineer coming to the Team?

And I put the emphasis (the accent) on the i of Engineer, Kevin would repeat the word with the right pronunciation. So I had the chance to learn how it should sound.

And I would repeat to make sure I got it.

One thing I think is that one has to be thankful for the time and interest that others dedicate to you. We all have a limited time on the planet, so when somebody invests some time in teaching you or helping you to learn, they are giving you something that they will not get back. Even if you pay him/her, that time will still not go back to that person.

So I appreciate when people help me, and I don’t appreciate it less because I pay them.

Talking, listening, is the best real way to learn.

With 100% of my reading in English, 100% of movies watched in English, and nearly 100% of my talking and listening in English, my language skills reached the next level. So I can speak at conferences, and I can write books and technical documentation. And I still learn a lot of English every day: new words, or rich ways to express things, by reading the newspaper, for example. I really enjoy it.

But it is like swimming or riding a bicycle: you learn by doing.

I cancelled my Amazon Prime subscription

I was using Amazon a lot, sending parcels to the offices of my previous job, and now to the Blizzard offices, so I subscribed to Amazon Prime. With the COVID-19 virus we were sent to do Remote Work, and now with the lockdown I’m basically at home 99.99% of the time.

I did a test to see how delivery to home works during the pandemic.

I chose two different items and reviewed the order: they were going to be delivered separately, one day apart.

I chose two items that would fit in my mailbox, separately or together: a 3-meter USB3 male-to-female cable and a Blu-ray movie.

My surprise came when I went to the mailbox one day early and saw a note from An Post saying that they had passed by to deliver my parcel, and had not left it because it didn’t fit in the mailbox and they did not want to leave it in a common space. To my surprise, both Amazon parcels had been grouped and sent ahead of time, maybe in a bigger box. But the mailman did not ring my doorbell.

The note told me to collect my parcel in the middle of the city, during the lockdown. No way! I’m not going to risk my health, and especially that of the elderly, just to grab a cable and a movie.

I had the chance to request re-delivery from An Post, so I did. I filled in all the info, gave my phone number and email, indicated which door to ring, and two days later, as promised… a note from An Post!

They did not even ring my bell again.

I went to Amazon to cancel the order, but the process is only designed for when you have received the items.

Fuck it. I’m not going to order anything else from Amazon until COVID-19 passes.

I don’t know if the postman just avoids people for fear of contagion, or if the An Post process is awful and he didn’t get any information. But I’ll not buy anything, even if I cannot buy in other places because of the lockdown.

I was going to keep my Amazon Prime subscription, even knowing that I would not use it much during the lockdown, but it makes no sense. Also:

  • I use Netflix and my Raspberry Pi 4, I was not using Amazon Prime Video.
  • I use Spotify, I was not using Amazon Prime Music.
  • I like to read in paper, not in eBook, so I was not using the eReader options.

A nice way to lose a customer.

Datacenters, D&R and coronavirus

I’ve been working for years within Datacenters, with D&R strategies, and now, in the middle of COVID-19, with huge increases in the demand for bandwidth and compute, some DCs have decided not to allow their customers’ Engineers in.

As somebody who had my own Startup and CSP, had infrastructure in DCs and servers from customers in colocation, and has replaced Hw components at 1AM, replaced drives in broken RAIDs, and fixed systems so many times inside so many Datacenters across the world, I’m shocked by that.

I understand that health reasons can be argued, but I still have Servers in Datacenters because we all believed they were the safest place, prepared for disaster and recovery, with security, 24×7… and now one realises that one cannot enter to fix or upgrade one’s own machines.
Please note that you can still use the remote hands from the DC, although many times this is not a good idea, and I’m not sure this option will still be available when the lockdown in those countries becomes stricter.

I’m wondering if the current model of DCs has any future at all.

I think most of the D&R strategies from now on will be in the cloud, in different regions, with different providers, so companies can withstand providers or governments letting them down.

Remote working is here

So remote working is here.

After years in which many Engineers asked their companies to be able to Remote Work, with most of the answers being No, it now turns out that it is not only good for the company: it is the only way to ensure continuity of business, of many businesses.

One of my colleagues from Denmark, whose government has shut down the country by sending all the public servants home in order to prevent the spread of the coronavirus, told me:

“Yes, remote working is here, but it has taken the four horsemen of the apocalypse”

It is curious how Remote Working has arrived, not because it was the obvious thing to do, but due to external emergencies. And I’m glad that my company was prepared for business continuity.

I’ll be staying home, working remotely, in order to help stop the spread of the virus, especially among old people. I’m perfectly healthy, but that is exactly the case to watch: many people will not develop symptoms and will still be able to spread it to others.

So I have some technology-related plans to do at home, including a few improvements to the blog. What are your plans?

Update: 2020-03-13 23:16 UTC I’m thinking of all those businesses which are forced to close, and all the employees who will not get a salary, or will be fired, or will get a salary while the business owner maybe ends up bankrupt, as he is paying the salaries and no income is being generated.

Update: 2020-03-19 10:58 UTC Some of my friends, even in Human Resources/Recruiting, are starting to remote work for the first time. So here is some advice:

I would recommend getting an external monitor, at least 22″, so your neck is not forced into a bad position looking down and your eyes don’t suffer, good light (don’t work in the dark), a nespresso can be a good friend in the morning, and having your hands and arms aligned correctly so you don’t suffer from a bad position. Watch the position of your wrists: your arms should rest comfortably at the same level as the table, in something like an L, and your eyes should be aligned with the top of your monitor. Finally, I would recommend following a routine, as if you were going to work, so dress as you would. Don’t stay at home all day in pyjamas! ;)