Tag Archives: Java

Why I think in Python is not a good idea to raise exceptions inside your methods

Last update: 2022-05-18 10:48 Irish Time

Recently a colleague was asking me for advice on their design of error handling in a Python application.

They were catching an error and raising an Exception, inside the except part of a method, to be catch outside the method.

And at some point a simple logic got really messy and unnecessarily complicated. Also troubleshooting and debugging an error was painful because they were only getting a Custom Exception and not context.

I explained to my colleague that I believed that the person that created that Exception chain of catch came from Java background and why I think they choose that path, and why I think in Python it’s a bad idea.

In Java, functions and methods can only return one object.

I programmed a lot in Java in my career, and it was a pain having to create value objects, and having to create all kind of objects for the return. Is it a good thing that types are strongly verified by the language? Yes. It worked? Yes. It made me invest much more time than necessary? Also yes.

Having the possibility to return only one object makes it mandatory having a way to return when there was an error. Otherwise you would need to encapsulate an error code and error description fields in each object, which is contrary to the nature of the object.

For example, a class Persona. Doesn’t make any sense having an attribute inside the class Persona to register if an operation related to this object went wrong.

For example, if we are in a class Spaceship that has a method GetPersonaInCommand() and there is a problem in that method, doesn’t make any sense to return an empty Persona object with attributes idError, errorDescription. Probably the Constructor or Persona will require at least a name or Id to build the object…. so in this case, makes sense that the method raises an Exception so the calling code catches it and knows that something went wrong or when there is no data to return.

This will force to write Custom Exceptions, but it’s a solution.

Another solution is creating a generic response object which could be an Object with these attributes:

  • idError
  • errorDescription
  • an Object which is the response, in our example Persona or null

I created this kind of approach for my Cassandra libraries to easily work with Cassandra from Java and from PHP, and for Cassandra Universal Driver (a http/s gateway created in year 2014).

Why this in not necessary in Python

Python allows you to return multiple values, so I encourage you tor return a boolean for indicating the success of the operation, and the object/value you’re interested.

You can see it easily if you take a look to FileUtils class from my OpenSource libraries carleslibs.

The method get_file_size_in_bytes(self, s_file) for example:

    def get_file_size_in_bytes(self, s_file):

        b_success = False
        i_file_size = 0

        try:
            # This will help with Unit Testing by raisin IOError Exception
            self.test_helper()

            i_file_size = os.path.getsize(s_file)
            b_success = True
        except IOError:
            b_success = False

        return b_success, i_file_size

It will always return a boolean value to indicate success or failure of the operation and an integer for the size of the file.

The calling code will do something like this:

o_fileutils = FileUtils()
b_success, i_bytes = o_fileutils.get_file_size_in_bytes("profile.png")
if b_succes is False:
    print("Error! The file does not exist or cannot be accessed!")
    exit(1)

if i_bytes < 1024:
    print("The profile picture should be at least 1KB")
    exit(1)

print("Profile picture exists and is", i_bytes, " bytes in length!")

The fact that Python can return multiple variables makes super easy dealing with error handling without having to take the road of Custom Exceptions.

And it is Ok if you want to follow this path, but in my opinion, for most of the developers up to Senior levels, it only over complicates the logic of your code and the amount of try/excepts you have to have everywhere.

If you use PHP you can mix different types in an Array, so you can always return an Array with a boolean, or an i_id_error, and your object or data of whatever type it’s.

Getting back to my carleslibs Open Source package, it is super easy to Unit Test these methods.

In my opinion, this level of simplicity, brings only advantages. Including Software Development speed, which is good for the business.

I’m not advocating for not using Custom Exceptions or to not develop a Exceptions Raising strategy if you need it and you know what you’re doing. I’m just suggesting why I think most of the developments in Python do not really need this and only over complicates the development. There are situations where raising exceptions will be a perfectly useful or even the best approach, there are many scenarios, but I think that in most of cases, using raise inside except will only multiply the time of the development and slow down the speed of bringing new features to the business, over complicating Unit Test as well, and be a real pain for the Junior and Intermediate developers.

The Constructor

Obviously, as the Constructor doesn’t return any value, it is perfectly fine to raise an exception in there, or just to use try/except in the code that is instancing the objects.

News from the blog 2022-04-22

Media/Press

I was interviewed by Radio America Barcelona, in their studios in Barcelona.

RAB is a radio for the Catalan diaspora and expats.

The interview was broadcasted by Twitch and can be watched. It’s in Catalan language:

https://www.twitch.tv/videos/1448895585

You can follow them:

Sant Jordi discounted books (promo)

Tomorrow 23th of April is Sant Jordi (Saint George), the patron of Catalonia, and the Catalan traditional celebration consist in this day women gifting a book to the men they love and men gifting a rose to the women they love.

I created a voucher for all my Python books, so, since during 23th of April you can acquire the four books per $10 USD in total.

This voucher is limited to 100 sales.

https://leanpub.com/b/python3all/c/SANTJORDI2022

Python 3

I wrote an article with a Python 3 code that shows the length for file names in Linux ext3 and ext4 Filesystems and in ZFS Pools Filesystem.

I show basically how ASCii characters over 127 are encoded, reducing the maximum length of 255 bytes for the filename.

ZFS

I’ve updated an article explaining how to create a ZPool raidz (RAID 5 equivalent) from three loop devices based on local files.

Thanks to those that bought my ZFS book this month. :)

HTML and JavaScript

I wrote a super simple code to hide the <p> using jQuery and JavaScript.

Free books

https://books.goalkicker.com/

Books I bought

This month I bought these books.

Firewall

I continued to block any Russian or Belarus Ip Address that connects to the blog.

I also started to block entire ranges of Ip’s from Digital Ocean, as many attacks come from Servers in their infrastructure.

Despite blocking tens of thousands of Ip Addresses, the number of visitors keep growing.

My Health

Thanks to my strict discipline I managed to recover super well and I’m healthier than before and guided by the satisfied doctors we removed two daily medicines.

I started a new medicine that is for the final phase of my recuperation, which doctors expect to be completely. In fact I’m much more healthier than I was before going to the hospital.

Humor

Will AI take the world?
Sadly, true history. The responsibility to deliver is from all of us.

2021-12-22 News from the blog

Open Source

I released my Python Open Source libraries for rapid application development: carleslibs to v. 1.0.3.

I updated the page of my 2013 PHP Framework Catalonia Framework, linking to the blog. This is a very nice and lightweight PHP Framework that I created then and it worked so well that I had not the need to update it since 2014.

Donations

Blizzard offered to the employees the opportunity to donate in the name of Blizzard, up to USD $100. It has not been the first time, is a regular thing. Also in the past the company matched any donation we employees did, to help diversity will less resources to study in the university.

I chose to donate to Lions in Cork, Ireland.

When they lived, my grand parents were members and donors from Lion’s in Catalonia, so I trust them.

Security

Log4j Java vulnerability

A critical vulnerability named Log4Shell was found in Apache Log4j Java Open Source logging Library.

https://www.engadget.com/log4shell-vulnerability-log4j-155543990.html

The fix has been already released in v. 2.15.0 https://logging.apache.org/log4j/2.x/security.html

NSO zero-click iPhone exploit

An interesting read about NSO zero-click iPhone exploit: https://www.engadget.com/google-researchers-nso-zero-click-iphone-imessage-exploit-143213776.html

https://googleprojectzero.blogspot.com/2021/12/a-deep-dive-into-nso-zero-click.html

A base Dockerfile for my Jenkins deployments

Update: Second part of this article: Creating Jenkins configurations for your projects

So I share with you my base Jenkins Dockerfile, so you can spawn a new Jenkins for your projects.

The Dockerfile installs Ubuntu 20.04 LTS as base image and add the required packages to run jenkins but also Development and Testing tools to use inside the Container to run Unit Testing on your code, for example. So you don’t need external Servers, for instance.

You will need 3 files:

  • Dockerfile
  • docker_run_jenkins.sh
  • requirements.txt

The requirements.txt file contains your PIP3 dependencies. In my case I only have pytest version 4.6.9 which is the default installed with Ubuntu 20.04, however, this way, I enforce that this and not any posterior version will be installed.

File requirements.txt:

pytest==4.6.9

The file docker_run_jenkins.txt start Jenkins when the Container is run and it will wait until the initial Admin password is generated and then it will display it.

File docker_run_jenkins.sh:

#!/bin/bash

echo "Starting Jenkins..."

service jenkins start

echo "Configure jenkins in http://127.0.0.1:8080"

s_JENKINS_PASSWORD_FILE="/var/lib/jenkins/secrets/initialAdminPassword"

i_PASSWORD_PRINTED=0

while [ true ];
do
    sleep 1
    if [ $i_PASSWORD_PRINTED -eq 1 ];
    then
        # We are nice with multitasking
        sleep 60
        continue
    fi

    if [ ! -f "$s_JENKINS_PASSWORD_FILE" ];
    then
        echo "File $s_FILE_ORIGIN does not exist"
    else
        echo "Password for Admin is:"
        cat $s_JENKINS_PASSWORD_FILE
        i_PASSWORD_PRINTED=1
    fi
done

That file has the objective to show you the default admin password, but you don’t need to do that, you can just start a shell into the Container and check manually by yourself.

However I added it to make it easier for you.

And finally you have the Dockerfile:

FROM ubuntu:20.04

LABEL Author="Carles Mateo" \
      Email="jenkins@carlesmateo.com" \
      MAINTAINER="Carles Mateo"

# Build this file with:
# sudo docker build -f Dockerfile -t jenkins:base .
# Run detached:
# sudo docker run --name jenkins_base -d -p 8080:8080 jenkins:base
# Run seeing the password:
# sudo docker run --name jenkins_base -p 8080:8080 -i -t jenkins:base
# After you CTRL + C you will continue with:
# sudo docker start
# To debug:
# sudo docker run --name jenkins_base -p 8080:8080 -i -t jenkins:base /bin/bash

ARG DEBIAN_FRONTEND=noninteractive

ENV SERVICE jenkins

RUN set -ex

RUN echo "Creating directories and copying code" \
    && mkdir -p /opt/${SERVICE}

COPY requirements.txt \
    docker_run_jenkins.sh \
    /opt/${SERVICE}/

# Java with Ubuntu 20.04 LST is 11, which is compatible with Jenkins.
RUN apt update \
    && apt install -y default-jdk \
    && apt install -y wget curl gnupg2 \
    && apt install -y git \
    && apt install -y python3 python3.8-venv python3-pip \
    && apt install -y python3-dev libsasl2-dev libldap2-dev libssl-dev \
    && apt install -y python3-venv \
    && apt install -y python3-pytest \
    && apt install -y sshpass \
    && wget -qO - https://pkg.jenkins.io/debian-stable/jenkins.io.key | apt-key add - \
    && echo "deb http://pkg.jenkins.io/debian-stable binary/" > /etc/apt/sources.list.d/jenkins.list \
    && apt update \
    && apt -y install jenkins \
    && apt-get clean

RUN echo "Setting work directory and listening port"
WORKDIR /opt/${SERVICE}

RUN chmod +x docker_run_jenkins.sh

RUN pip3 install --upgrade pip \
    && pip3 install -r requirements.txt


EXPOSE 8080


ENTRYPOINT ["./docker_run_jenkins.sh"]

Build the Container

docker build -f Dockerfile -t jenkins:base .

Run the Container displaying the password

sudo docker run --name jenkins_base -p 8080:8080 -i -t jenkins:base

You need this password for starting the configuration process through the web.

Visit http://127.0.0.1:8080 to configure Jenkins.

Configure as usual

Resuming after CTRL + C

After you configured it, on the terminal, press CTRL + C.

And continue, detached, by running:

sudo docker start jenkins_base

The image is 1.2GB in size, and will allow you to run Python3, Virtual Environments, Unit Testing with pytest and has Java 11 (not all versions of Java are compatible with Jenkins), use sshpass to access other Servers via SSH with Username and Password…

News from the blog 2021-01-11

Happy New Year to all.

Is something very simple, but will help my student friends to validate Input from Keyboard without losing too many hours.

The Input Validation Classes I create in PHP for Privalia or in my PHP Catalonia Framework, are much, much, more powerful, allowing the validation of complete forms, rendering errors, etc… although they were created for Web, and not for Keyboard input.

It recursively goes to all the subdirectories looking for .py files, and then it counts the lines.

  • I updated the price of my books to be the minimum allowed by LeanPub, to $5 USD, and created a bundle of two of them for $7 USD.

So people can benefit from this during the lock down.

  • I’ve updated the Python Combat Guide book with a sample of using Paramiko Libraries for SSH, and increased the Object Oriented Programing and Unit Testing, sections. I also added some books to the Bibliography.
  • I’ve read the postmortem initial analysis from Slack’s incident. It’s really interesting.

I cannot share it, but I guess that at some point they will publish it on their blog:

https://slack.engineering/

  • As I’m giving more Python Classes I decided to write a book to teach to code in Python for non-programmers.

News from the blog 2020-12-16

  • Happy Christmas!

This is the 3D tree that I bought, which is programmable in Python :)

  • If you’re into ZFS I recommend this video:

https://klarasystems.com/learning/webinars/best-practices-for-optimizing-zfs1/

Is a video from klarasystems about best practices for ZFS.

  • Amazing Apache Kafka resources can be found here:

https://developer.confluent.io/

https://developer.confluent.io/learn-kafka/

  • I decided to lower the price of my book to the minimum in LeanPub $5 USD while covid is going on in order to help people with their lives.
    https://leanpub.com/u/carlesmateo

I read with surprise that Comcast is capping the Internet use to 1.2TB per month, and that they will be charging excess.

So… if I contract a Backup with Carbonite or BackBlaze or DropBox or another company and I backup my 10TB files, Comcast will ruin me charging excesses…
Or if I work from home, or the family watches a lot of Netflix…
I can only thinK on their Cast Strategy of CastNumberOfClientsToBankrupcy.

A joke to indicate that I think they will loss clients.

Imagine yesterday I downloaded two images of Ubuntu, being 5 GB, installed Call of Duty in one computer 180 GB, installed few Xbox games 400 GB, listened to Spotify 10 Gb, watched youtube 3 GB, watched Netflix 4 GB, so 602 GB in one day.

Not counting the bandwidth WFH (Working from Home).

Not counting Windows Updates, TV updates, consoles updates, Android Updates, Ubuntu updates…

And this is done in the middle of the covid-19 pandemic, with so many people lock down at home, playing video games, watching movies, and requiring desperately distractions.

<irony>Well done Comcast!</irony>

Java validation Classes for Keyboard

I share with you a very lightweight, stand alone, Java validation Class for Input Keyboard, that I created for a university project.

The first thing to mention is an advice about using the Scanner with nextInt(), nextDouble(), next()…

My recommendation is to avoid using these. Work only with Strings and convert to the right Datatype. So use only nextLine(). You will save hours of frustration.

Those are the methods that I’ve implemented:

  • int askQuestion(String question, int minValue, int maxValue)
    Ask a Question expecting to get a number, like “What year was you born: “, and set a minimum value and a maximum value. The method will not allow to continue until the user enters a valid integer between the ranges.
  • float askQuestion(String question, float minValue, float maxValue)
    Same idea with Floats.
  • double askQuestion(String question, double minValue, double maxValue)
    Same with Double, so for example question = “Enter the price: “.
  • String askQuestion(String question)
    Ask a question and get a String. Simple, no validation. Not even if it’s empty String.
  • askQuestionLength(String question, int minLength, int maxLength)
    Here I changed the name of the method. Same mechanics as askQuestion, but will validate that the String entered is between minimum length and maxlength. For example, to ask a password between 8 and 12 characters.
  • String askQuestion(String question, String[] allowedValues)
    This one accepts a question, and an Array of Strings with the values that are acceptable. Is case sensitive.
    So, using MT Notation, you call it with:

String[] a_s_acceptedValues = new String[4];
a_s_acceptedValues[0] = “Barcelona”;
a_s_acceptedValues[1] = “Cork”;
a_s_acceptedValues[2] = “Irvine”;
a_s_acceptedValues[3] = “Edinburgh”;
String s_answer = o_inputFromKeyboard.askQuestion(“What is your favorite city?”, a_s_acceptedValues);

Here is the code for the Class InputFromKeyboard.

package com.carlesmateo;

import java.util.Scanner;

public class InputFromKeyboard {

    private static Scanner keyboard = new Scanner(System.in);

    public static int askQuestion(String question, int minValue, int maxValue) {
        // Make sure we enter in the loop
        int value = minValue -1;

        while (value < minValue || value > maxValue) {
            try {
                String valueString = askQuestion(question);
                value = Integer.parseInt(valueString);

            } catch (Exception e) {
                System.out.println("Invalid number!");
            }
        }

        return value;
    }

    public static float askQuestion(String question, float minValue, float maxValue) {
        // Make sure we enter in the loop
        float value = minValue -1;

        while (value < minValue || value > maxValue) {
            try {
                String valueString = askQuestion(question);
                value = Float.parseFloat(valueString);

            } catch (Exception e) {
                System.out.println("Invalid number!");
            }
        }

        return value;
    }

    public static double askQuestion(String question, double minValue, double maxValue) {
        // Make sure we enter in the loop
        double value = minValue -1;

        while (value < minValue || value > maxValue) {
            try {
                String valueString = askQuestion(question);
                value = Double.parseDouble(valueString);

            } catch (Exception e) {
                System.out.println("Invalid number!");
            }
        }

        return value;
    }


    public static String askQuestion(String question) {
        // Make sure we enter in the loop
        String answer = "";

        while (answer.equals("")) {

            try {
                System.out.print(question);
                answer = keyboard.nextLine();

            } catch (Exception e) {
                System.out.println("Invalid value!");
                // System.out.println(e.getMessage());
            }
        }

        return answer;
    }

    public static String askQuestionLength(String question, int minLength, int maxLength) {
        // Make sure we enter in the loop
        String answer = "";

        while (answer.equals("")) {

            try {
                System.out.print(question);
                answer = keyboard.nextLine();

                int length = answer.length();
                if (length < minLength || length > maxLength) {
                    answer = "";
                    System.out.println("Answer should be between " + minLength + " and " + maxLength + " characters.");
                }

            } catch (Exception e) {
                System.out.println("Invalid value!");
                // System.out.println(e.getMessage());
            }
        }

        return answer;
    }

    public static String askQuestion(String question, String[] allowedValues) {
        // Make sure we enter in the loop
        String answer = "";

        while (answer.equals("")) {

            try {
                System.out.print(question);
                answer = keyboard.nextLine();

                boolean found = false;
                for (int counter=0; counter < allowedValues.length; counter++) {
                    if (answer.equals(allowedValues[counter])) {
                        found = true;
                    }
                }

                if (found == false) {
                    System.out.println("Please enter any of the allowed values:");
                    for (int counter=0; counter < allowedValues.length; counter++) {
                        System.out.println(allowedValues[counter]);
                    }
                    answer = "";
                }

            } catch (Exception e) {
                System.out.println("Invalid value!");
                // System.out.println(e.getMessage());
            }
        }

        return answer;
    }

}

Is a reduced subset of functionality respect what I created in my Web PHP Framework Catalonia Framework, or the Input Data Validation that my friend Joim created in Privalia and later I extended and maintained.

All the Frameworks have similar Input Data functionalities, some much more complex, supporting RegExp, or several rules, multiselection (typical for web), accepting only a subset of characters, but this is the typical 80% functionality that everybody needs when writing a console program.

Adding a swapfile on the fly as a temporary solution for a Server with few memory

Here is an easy trick that you can use for adding swap temporarily to a Server, VMs or Workstations, if you are in an emergency.

In this case I had a cluster composed from two instances running out of memory.

I got an alert for one of the Servers, reporting that only had 7% of free memory.

Immediately I checked it, but checked also any other forming part of the cluster.

Another one appeared, had just only a bit more memory than the other, but was considered in Critical condition too.

The owner of the Service was contacted and asked if we can hold it until US Business hours. Those Servers were going to be replaced next day in US Business hours, and when possible it would be nice not to wake up the Team. It was day in Europe, but night in US.

I checked the status of the Server with those commands:

# df -h

There are 13GB of free space in /. More than enough to be safe as this service doesn’t use much.

# free -h
              total        used        free      shared  buff/cache   available
Mem:           5.7G        4.8G        139M        298M        738M        320M
Swap:            0B          0B          0B

I checked the memory, ok, there are only 139MB free in this node, but 738MB are buff/cache. Buff/Cache is memory used by Linux to optimize I/O as long as it is not needed by application. These 738 MB in buff/cache (or most of it) will be used if needed by the System. The field available corresponds to the memory that is available for starting new applications (not counting the swap if there was any), and basically is the free memory plus a fragment of the buff/cache. I’m sure we could use more than 320MB and there is a lot if buff/cache, but to play safe we play by the book.

With that in mind it seemed that it would hold perfectly to Business hours.

I checked top. It is interesting to mention the meaning of the Column RES, which is resident memory, in other words, the real amount of memory that the process is using.

I had a Java process using 4.57GB of RAM, but a look at how much Heap Memory was reserved and actually being used showed a Heap of 4GB (Memory reserved) and 1.5GB actually being used for real, from the Heap, only.

It was unlikely that elastic search would use all those 4GB, and seemed really unlikely that the instance will suffer from memory starvation with 2.5GB of 4GB of the Heap free, ~1GB of RAM in buffers/cache plus free, so looked good.

To be 100% sure I created a temporary swap space in a file on the SSD.

(# means that I’m executing this as root, if you type literally with # in front, this will be a comment)

# fallocate -l 1G /swapfile-temp

# dd if=/dev/zero of=/swapfile-temp bs=1024 count=1048576 status=progress
1034236928 bytes (1.0 GB) copied, 4.020716 s, 257 MB/s
1048576+0 records in
1048576+0 records out
1073741824 bytes (1.1 GB) copied, 4.26152 s, 252 MB/s

If you ask me why I had to dd, I will tell you that I needed to. I checked with command blkid and filesystem was xfs. I believe that was the reason.

The speed writing to the file is fair enough for a swap.

# chmod 600 /swapfile-temp

# mkswap /swapfile-temp
Setting up swapspace version 1, size = 1048572 KiB
no label, UUID=5fb12c0c-8079-41dc-aa20-21477808619a

# swapon /swapfile-temp

I check that memory is good:

# free -h
              total        used        free      shared  buff/cache   available
Mem:           5.7G        4.8G        117M        298M        770M        329M
Swap:          1.0G          0B        1.0G

And finally I check that the Kernel parameter swappiness is not too aggressive:

# sysctl vm.swappiness
vm.swappiness = 30

Cool. 30 is a fair enough value.

2022-01-05 Update for my students that need to add additional 16GB of swap to their SSD drive:

sudo fallocate -l 16G /swapfile-temp
sudo dd if=/dev/zero of=/swapfile-temp bs=1024 count=16777216 status=progress
sudo chmod 600 /swapfile-temp
sudo mkswap /swapfile-temp
sudo swapon /swapfile-temp

Resources for Microservices and Business Domain Solutions for the Cloud Architect / Microservices Architect

First you have to understand that Python, Java and PHP are worlds completely different.

In Python you’ll probably use Flask, and listen to the port you want, inside Docker Container.

In PHP you’ll use a Frameworks like Laravel, or Symfony, or Catalonia Framework (my Framework) :) and a repo or many (as the idea is that the change in one microservice cannot break another it is recommended to have one git repo per Service) and split the requests with the API Gateway and Filters (so /billing/ goes to the right path in the right Server, is like rewriting URLs). You’ll rely in Software to split your microservices. Usually you’ll use Docker, but you have to add a Web Server and any other tools, as the source code is not packet with a Web Server and other Dependencies like it is in Java Spring Boot.

In Java you’ll use Spring Cloud and Spring Boot, and every Service will be auto-contained in its own JAR file, that includes Apache Tomcat and all other Dependencies and normally running inside a Docker. Tcp/Ip listening port will be set at start via command line, or through environment. You’ll have many git repositories, one per each Service.

Using many repos, one per Service, also allows to deploy only that repository and to have better security, with independent deployment tokens.

It is not unlikely that you’ll use one language for some of your Services and another for other, as well as a Database or another, as each Service is owner of their data.

In any case, you will be using CI/CD and your pipeline will be something like this:

  1. Pull the latest code for the Service from the git repository
  2. Compile the code (if needed)
  3. Run the Unit and Integration Tests
  4. Compile the service to an executable artifact (f.e. Java JAR with Tomcat server and other dependencies)
  5. Generate a Machine image with your JAR deployed (for Java. Look at Spotify Docker Plugin to Docker build from Maven), or with Apache, PHP, other dependencies, and the code. Normally will be a Docker image. This image will be immutable. You will probably use Dockerhub.
  6. Machine image will be started. Platform test are run.
  7. If platform tests pass, the service is promoted to the next environment (for example Dev -> Test -> PreProd -> Prod), the exact same machine is started in the next environment and platform tests are repeated.
  8. Before deploying to Production the new Service, I recommend running special Application Tests / Behavior-driven. By this I mean, to conduct tests that really test the functionality of everything, using a real browser and emulating the acts of a user (for example with BeHat, Cucumber or with JMeter).
    I recommend this specially because Microservices are end-points, independent of the implementation, but normally they are API that serve to a whole application. In an Application there are several components, often a change in the Front End can break the application. Imagine a change in Javascript Front End, that results in a call a bit different, for example, with an space before a name. Imagine that the Unit Tests for the Service do not test that, and that was not causing a problem in the old version of the Service and so it will crash when the new Service is deployed. Or another example, imagine that our Service for paying with Visa cards generates IDs for the Payment Gateway, and as a result of the new implementation the IDs generated are returned. With the mocked objects everything works, but when we deploy for real is when we are going to use the actual Bank Payment. This is also why is a good idea to have a PreProduction environment, with PreProduction versions of the actual Services we use (all banks or the GDS for flights/hotel reservation like Galileo or Amadeus have a Test, exactly like Production, Gateway)

If you work with Microsoft .NET, you’ll probably use Azure DevOps.

We IT Engineers, CTOs and Architects, serve the Business. We have to develop the most flexible approaches and enabling the business to release as fast as their need.

Take in count that Microservices is a tool, a pattern. We will use it to bring more flexibility and speed developing, resilience of the services, and speed and independence deploying. However this comes at a cost of complexity.

Microservices is more related to giving flexibility to the Business, and developing according to the Business Domains. Normally oriented to suite an API. If you have an API that is consumed by third party you will have things like independence of Services (if one is down the others will still function), gradual degradation, being able to scale the Services that have more load only, being able to deploy a new version of a Service which is independent of the rest of the Services, etc… the complexity in the technical solution comes from all this resilience, and flexibility.

If your Dev Team is up to 10 Developers or you are writing just a CRUD Web Application, a PoC, or you are an Startup with a critical Time to Market you probably you will not want to use Microservices approach. Is like killing flies with laser cannons. You can use typical Web services approach, do everything in one single Https request, have transactions, a single Database, etc…

But if your team is 100 Developer, like a big eCommerce, you’ll have multiple Teams between 5 and 10 Developers per Business Domain, and you need independence of each Service, having less interdependence. Each Service will own their own Data. That is normally around 5 to 7 tables. Each Service will serve a Business Domain. You’ll benefit from having different technologies for the different needs, however be careful to avoid having Teams with different knowledge that can have hardly rotation and difficult to continue projects when the only 2 or 3 Devs that know that technology leave. Typical benefit scenarios can be having MySql for the Billing Services, but having NoSQL Database for the image catalog, or to store logs of account activity. With Microservices, some services will be calling other Services, often asynchronously, using Queues or Streams, you’ll have Callbacks, Databases for reading, you’ll probably want to have gradual and gracefully failure of your applications, client load balancing, caches and read only databases/in-memory databases… This complexity is in order to protect one Service from the failure of others and to bring it the necessary speed under heavy load.

Here you can find a PDF Document of the typical resources I use for Microservice Projects.

You can also download it from my github repository:

https://github.com/carlesmateo/awesome-microservices

Do you use other solutions that are not listed?. Leave a message. I’ll investigate them and update the Document, to share with the Community.

Update 2020-03-06: I found this very nice article explaining the same. Microservices are not for everybody and not the default option: https://www.theregister.co.uk/AMP/2020/03/04/microservices_last_resort/

Update 2020-03-11: Qcom with 1,600 microservices says that microservices architecture is the las resort: https://www.theregister.co.uk/AMP/2020/03/09/monzo_microservices/

CSort multithread versus QuickSort with Java Source Code

Updated on 2017-04-04 12:58 Barcelona Time 1491303515:

  • A method writeValuesFromArrayListToDisk(String sFilename) has been introduced as per a request, to easily check that the data is properly sorted.
  • A silly bug in the final ArrayList generation has been solved. It was storing iCounter that was always 1 as this is not the compressed version, for supporting repeated numbers, of the algorithm. I introduced this method for the article, as it is not necessary for the algorithm as it is already sorted, and unfortunately I didn’t do a final test on the output. My fault.
  • Some JavaDoc has been updated

Past Friday I was discussing with my best friend about algorithms and he told me that hadoop is not fast enough, and about when I was in Amazon and as part of the test they asked me to defined an S3 system from the scratch, and I did using Java and multiple streams per file and per node (replication factor) and they told me that what I just created was the exact way their system works, and we ended talking about my sorting algorithm CSort, and he asked me if it could run in MultiThread. Yes, it is one of the advantages in front of QuickSort. Also it can run in multinode, different computers. So he asked me how much faster it would be a MultiThread version of CSort, versus a regular QuickSort.

Well here is the answer with 500 Million of registers, with values from 1 to 1000000, and deduplicating.

2017-03-26 18:50:41 CSort time in seconds:0.189129089
2017-03-26 18:51:47 QuickSort cost in seconds:61.853190885

That’s Csort is 327 times faster than QuickSort!. In this example and with my busy 4 cores laptop. In my 8 cores computer it is more than 525 times faster. Imagine in a Intel Xeon Server with a 64 cores!.

How is it possible? The answer is easy, it has O(n) complexity. I use linear access.

This depends on your universe of Data. Please read my original posting about CSort here, that explains it on detail.

Please note that CSort with compression is also available for keeping duplicated values and also saving memory (space) and time, with equally totally astonishing results.

Please note that in this sample I first load the values to an Array, and then I work from this. This is just to avoid bias by discarding the time of loading the data from disk, but, in the other article you have samples where CSort sorts at the same time that loads the data from disks. I have a much more advanced algorithm that self allocates the memory needed for handling an enormous universe of numbers (big numbers and small with no memory penalty), but I’m looking forward to discuss this with a tech giant when it hires me. ;) Yes, I’m looking for a job.

In my original article I demonstrated it in C and PHP, this time here is the code in Java. It uses the MT Notation.

You can download the file I used for the tests from here:

Obviously it runs much more faster than hashing. I should note that hashing and CSorting with .containsKey() is faster than QuickSorting. (another day I will talk about sorting Strings faster) ;)

/*
 * (c) Carles Mateo blog.carlesmateo.com
 * Proof of concept of CSort, with multithread, versus QuickSort
 * For the variables notation MT Notation is used.
 */
package com.carlesmateo.blog;

import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;

/**
 * @author carles mateo
 */
public class CSortMultiThread extends Thread {
    // Download the original file from the blog
    public static final int piNUM_REGISTERS_IN_FILE = 50000000;
    // Max value found, to optimize the memory used for CSort
    // Note that CSort can be implemented in the read from file mechanism
    // in order to save memory (space)
    public static int piMaxValue = 0;
    // The array containing the numbers read from disk
    public static int[] paiNumbers;
    // The array used by CSort (if not using direct loading from disk)
    public static int[] paiNumbersCsorted;
    // Final ArrayList Sorted. CSort and QuickSort finally fullfil this
    public static ArrayList<Integer> pliNumbers = new ArrayList<>();
   
    // For the Threads
    private Thread oT;
    private String sThreadName;
    private int piStart;
    private int piEnd;
    private boolean bFinished = false;
   
    CSortMultiThread (String name, int iStart, int iEnd) {
        sThreadName = name;
        piStart = iStart;
        piEnd = iEnd;
        writeWithDateTime("Creating " +  sThreadName );
    }
   
    public void run() {
        writeWithDateTime("Running " +  sThreadName + " to sort from " + piStart + " to " + piEnd);
        int iCounter;
        int iNumber;        

        for (iCounter=piStart; iCounter < piEnd; iCounter++) {
            iNumber = paiNumbers[iCounter];
            paiNumbersCsorted[iNumber] = 1;
        }          

      System.out.println("Thread " +  sThreadName + " exiting.");
      bFinished = true;
    }
   
    public void start () {
        writeWithDateTime("Starting " +  sThreadName );
        if (oT == null) {
            oT = new Thread (this, sThreadName);
            oT.start ();
        }
    }
    /**
     * Write values to Disk to demonstrate that are sorted ;)
     * @param sFilenameData 
     */
    private static void writeValuesFromArrayListToDisk(String sFilenameData) {
        ObjectOutputStream out = null;
        int iCounter;
        try {
            out = new ObjectOutputStream(new FileOutputStream(sFilenameData));
            
            for (iCounter=0; iCounter<pliNumbers.size(); iCounter++) {
                out.writeChars(pliNumbers.get(iCounter).toString() + "\n");
            }
            
            // To store the object instead
            //out.writeObject(pliNumbers);
            
        } catch (IOException e) {
            System.out.println("I/O Error!");
            e.printStackTrace();
            displayHelpAndQuit(10);
        } finally {
            if (out != null) {
                try {
                    out.close();    
                } catch (IOException e) {
                    System.out.println("I/O Error!");
                    e.printStackTrace();
                    displayHelpAndQuit(10);
                }   
            }
        }
    }
   
    /** 
     *  Reads the data from the disk. The file has 50M and we will be duplicating
     *  to get 500M registers.
     *  @param sFilenameData 
     */
    private static void readValuesFromFileToArray(String sFilenameData) {

        BufferedReader oBR = null;
        String sLine;
        int iCounter = 0;
        int iRepeat;
        int iNumber;
        
        // We will be using 500.000.000 items, so dimensionate the array
        paiNumbers = new int[piNUM_REGISTERS_IN_FILE * 10];

        try {

            oBR = new BufferedReader(new FileReader(sFilenameData));

            while ((sLine = oBR.readLine()) != null) {
                for (iRepeat = 0; iRepeat < 10; iRepeat++) {
                    int iPointer = (piNUM_REGISTERS_IN_FILE * iRepeat) + iCounter;
                    iNumber = Integer.parseInt(sLine);
                    paiNumbers[iPointer] = iNumber;
                    if (iNumber > piMaxValue) {
                        piMaxValue = iNumber;
                    }
                }                
                
                iCounter++;
            }
            
            if (iCounter < piNUM_REGISTERS_IN_FILE) {
                write("Warning... only " + iCounter + " values were read");
            }

        } catch (FileNotFoundException e) {
            System.out.println("File not found! " + sFilenameData);
            displayHelpAndQuit(100);
        } catch (IOException e) {
            System.out.println("I/O Error!");
            e.printStackTrace();
            displayHelpAndQuit(10);
        } finally {
            if (oBR != null) {
                try {
                    oBR.close();
                } catch (IOException e) {
                    System.out.println("I/O Error!");
                    e.printStackTrace();
                    displayHelpAndQuit(10);
                }
            }
        }        
    }
    
    private static String displayHelp() {
        String sHelp = "Help\n" +
                "====\n" +
                "Csort from Carles Mateo blog.carlesmateo.com\n" +
                "\n" +
                "Proof of concept of the fast load algorithm\n" +
                "\n";
        
        return sHelp;
    }
    
    /**
     * Displays Help Message and Quits with and Error Level
     * Errors:
     *  - 1   - Wrong number of parameters
     *  - 10  - I/O Error
     *  - 100 - File not found
     * @param iErrorLevel 
     */
    private static void displayHelpAndQuit(int iErrorLevel) {
        System.out.println(displayHelp());
        System.exit(iErrorLevel);
        
    }
    
    // This is QuickSort from vogella http://www.vogella.com/tutorials/JavaAlgorithmsQuicksort/article.html
    public static void sort() {
        int piNumber = paiNumbers.length;
        quicksort(0, piNumber - 1);        
    }
    
    private static void quicksort(int low, int high) {
        int i = low, j = high;
        // Get the pivot element from the middle of the list
        int pivot = paiNumbers[low + (high-low)/2];

        // Divide into two lists
        while (i <= j) {
            // If the current value from the left list is smaller then the pivot
            // element then get the next element from the left list
            while (paiNumbers[i] < pivot) {
                i++;
            }
            // If the current value from the right list is larger then the pivot
            // element then get the next element from the right list
            while (paiNumbers[j] > pivot) {
                j--;
            }

            // If we have found a values in the left list which is larger then
            // the pivot element and if we have found a value in the right list
            // which is smaller then the pivot element then we exchange the
            // values.
            // As we are done we can increase i and j
            if (i <= j) {
                exchange(i, j);
                i++;
                j--;
            }
        }
        // Recursion
        if (low < j)
            quicksort(low, j);
        if (i < high)
            quicksort(i, high);
    }

    private static void exchange(int i, int j) {
        int temp = paiNumbers[i];
        paiNumbers[i] = paiNumbers[j];
        paiNumbers[j] = temp;
    }
    
    /** 
     * We want to remove duplicated values
     */
    private static void removeDuplicatesFromQuicksort() {
        int iCounter;
        int iOldValue=-1;
        int iNewValue;
        
        for (iCounter=0; iCounter<paiNumbers.length; iCounter++) {
            iNewValue = paiNumbers[iCounter];
            if (iNewValue != iOldValue) {
                iOldValue = iNewValue;
                pliNumbers.add(iNewValue);
            }
        }
    }
// End of vogella QuickSort code

    /**
     * Generate the final Array
     */
    private static void copyFromCSortToArrayList() {
        int iCounter;
        int iNewValue;
        
        for (iCounter=0; iCounter<=piMaxValue; iCounter++) {
            iNewValue = paiNumbersCsorted[iCounter];
            if (iNewValue > 0) {
                pliNumbers.add(iCounter);
            }
        }
    }

    /**
     * Write with the date
     * @param sText 
     */
    private static void writeWithDateTime(String sText) {
        DateFormat oDateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        Date oDate = new Date();
        String sDate = oDateFormat.format(oDate);
        write(sDate + " " + sText);
    }
    
    /**
     * Write with \n
     * @param sText 
     */
    private static void write(String sText) {
        System.out.println(sText + "\n");
    }    
   
    public static void main(String args[]) throws InterruptedException {
        // For Profiling
        long lStartTime;
        double dSeconds;
        long lElapsedTime;
        
        int iThreads = 8;
        int iRegistersPerThread = piNUM_REGISTERS_IN_FILE / iThreads;
        CSortMultiThread[] aoCSortThreads = new CSortMultiThread[iThreads];
        
        writeWithDateTime("CSort MultiThread proof of concept by Carles Mateo");
        writeWithDateTime("Reading values from Disk...");
        readValuesFromFileToArray("/home/carles/Desktop/codi/java/carles_sort.txt");
        writeWithDateTime("Readed");

        writeWithDateTime("Total values to sort de deduplicate " + paiNumbers.length);
        writeWithDateTime("The max value between the Data is " + piMaxValue);
        paiNumbersCsorted = new int[piMaxValue + 1];
                
        writeWithDateTime("Performing CSort with removal of duplicates");
        lStartTime = System.nanoTime();
        for (int iThread=0; iThread < iThreads; iThread++) {
            int iStart = iThread * iRegistersPerThread;
            int iEnd = ((iThread + 1) * iRegistersPerThread) - 1;
            if (iThread == (iThreads -1)) {
                // Last thread grabs the remaining. 
                // For instance 100/8 = 12 so each Thread orders 12 registers,
                // but last thread orders has 12 + 4 = 16
                iEnd = piNUM_REGISTERS_IN_FILE -1 ;
            }
            CSortMultiThread oThread = new CSortMultiThread("Thread-" + iThread, iStart, iEnd);
            oThread.start();
            
            aoCSortThreads[iThread] = oThread;
        }       
        
        boolean bExit = false;
        while (bExit == false) {
            bExit = true;
            for (int iThread=0; iThread < iThreads; iThread++) {
                if (aoCSortThreads[iThread].bFinished == false) {
                    bExit = false;
                    // Note: 10 milliseconds. This takes some CPU cycles, but we need
                    // to ensure that all the threads did finish.
                    sleep(10);
                    continue;
                }
            }
        }
        writeWithDateTime("Main loop ended");
                
        writeWithDateTime("Copy to the ArrayList");
        copyFromCSortToArrayList();
        
        writeWithDateTime("The final array contains " + pliNumbers.size());
        
        lElapsedTime = System.nanoTime() - lStartTime;
        dSeconds = (double)lElapsedTime / 1000000000.0;
        writeWithDateTime("CSort time in seconds:" + dSeconds);

        writeWithDateTime("Writing values to Disk...");
        writeValuesFromArrayListToDisk("/home/carles/Desktop/codi/java/carles_sort-csorted.txt");

        // Reset the ArrayList
        pliNumbers = new ArrayList<>();
        
        lStartTime = System.nanoTime();
        /** QuickSort begin **/
        writeWithDateTime("Sorting with QuickSort");
        sort();
        writeWithDateTime("Finished QuickSort");
        
        writeWithDateTime("Removing duplicates from QuickSort");
        removeDuplicatesFromQuicksort();
        writeWithDateTime("The final array contains " + pliNumbers.size());
        lElapsedTime = System.nanoTime() - lStartTime;
        dSeconds = (double)lElapsedTime / 1000000000.0;
        
        writeWithDateTime("QuickSort cost in seconds:" + dSeconds);
    }
}

The complete traces:

run:
2017-03-26 19:28:13 CSort MultiThread proof of concept by Carles Mateo
2017-03-26 19:28:13 Reading values from Disk...
2017-03-26 19:28:39 Readed
2017-03-26 19:28:39 Total values to sort de deduplicate 500000000
2017-03-26 19:28:39 The max value between the Data is 1000000
2017-03-26 19:28:39 Performing CSort with removal of duplicates
2017-03-26 19:28:39 Creating Thread-0
2017-03-26 19:28:39 Starting Thread-0
2017-03-26 19:28:39 Creating Thread-1
2017-03-26 19:28:39 Starting Thread-1
2017-03-26 19:28:39 Running Thread-0 to sort from 0 to 6249999
2017-03-26 19:28:39 Creating Thread-2
2017-03-26 19:28:39 Starting Thread-2
2017-03-26 19:28:39 Running Thread-1 to sort from 6250000 to 12499999
2017-03-26 19:28:39 Creating Thread-3
2017-03-26 19:28:39 Running Thread-2 to sort from 12500000 to 18749999
2017-03-26 19:28:39 Starting Thread-3
2017-03-26 19:28:39 Creating Thread-4
2017-03-26 19:28:39 Starting Thread-4
2017-03-26 19:28:39 Running Thread-3 to sort from 18750000 to 24999999
2017-03-26 19:28:39 Creating Thread-5
2017-03-26 19:28:39 Running Thread-4 to sort from 25000000 to 31249999
2017-03-26 19:28:39 Starting Thread-5
2017-03-26 19:28:39 Creating Thread-6
2017-03-26 19:28:39 Starting Thread-6
2017-03-26 19:28:39 Running Thread-5 to sort from 31250000 to 37499999
2017-03-26 19:28:39 Creating Thread-7
2017-03-26 19:28:39 Starting Thread-7
2017-03-26 19:28:39 Running Thread-6 to sort from 37500000 to 43749999
2017-03-26 19:28:39 Running Thread-7 to sort from 43750000 to 49999999

Thread Thread-0 exiting.
Thread Thread-2 exiting.
Thread Thread-1 exiting.
Thread Thread-6 exiting.
Thread Thread-7 exiting.
Thread Thread-5 exiting.
Thread Thread-4 exiting.
Thread Thread-3 exiting.
2017-03-26 19:28:39 Main loop ended

2017-03-26 19:28:39 Copy to the ArrayList
2017-03-26 19:28:39 The final array contains 1000001
2017-03-26 19:28:39 CSort time in seconds:0.189129089

2017-03-26 19:28:39 Sorting with QuickSort
2017-03-26 19:29:40 Finished QuickSort
2017-03-26 19:29:40 Removing duplicates from QuickSort
2017-03-26 19:29:41 The final array contains 1000001
2017-03-26 19:29:41 QuickSort cost in seconds:61.853190885

BUILD SUCCESSFUL (total time: 1 minute 28 seconds)