Category Archives: Business

Post-Mortem: The mystery of the duplicated Transactions into an e-Commerce

Me, with 4 more Senior BackEnd Engineers wrote the new e-Commerce for a multinational.

The old legacy Software evolved into a different code for every country, making it impossible to be maintained.

The new Software we created used inheritance to use the same base code for each country and overloaded only the specific different behavior of every country, like for the payment methods, for example Brazil supporting “parcelados” or Germany with specific payment players.

We rewrote the old procedural PHP BackEnd into modern PHP, with OOP and our own Framework but we had to keep the transactional code in existing MySQL Procedures, so the logic was split. There was a Front End Team consuming our JSONs. Basically all the Front End code was cached in Akamai and pages were rendered accordingly to the JSONs served from out BackEnd.

It was a huge success.

This e-Commerce site had Campaigns that started at a certain time, so the amount of traffic that would come at the same time would be challenging.

The project was working very well, and after some time the original Team was split into different projects in the company and a Team for maintenance and evolutives was hired.

At certain point they started to encounter duplicate transactions, and nobody was able to solve the mystery.

I’m specialized into fixing impossible problems. They used to send me to Impossible Missions, and I am famous for solving impossible problems easily.

So I started the task with a SRE approach.

The System had many components and layers. The problem could be in many places.

I had in my arsenal of tools, Software like mysqldebugger with which I found an unnoticed bug in decimals calculation in the past surprising everybody.

Previous Engineers involved believed the problem was in the Database side. They were having difficulties to identify the issue by the random nature of the repetitions.

Some times the order lines were duplicated, and other times were the payments, which means charging twice to the customer.

Redis Cluster could also play a part on this, as storing the session information and the basket.

But I had to follow the logic sequence of steps.

If transactions from customer were duplicated that mean that in first term those requests have arrived to the System. So that was a good point of start.

With a list of duplicated operations, I checked the Webservers logs.

That was a bit tricky as the Webserver was recording the Ip of the Load Balancer, not the ip of the customer. But we were tracking the sessionid so with that I could track and user request history. A good thing was also that we were using cookies to stick the user to the same Webserver node. That has pros and cons, but in this case I didn’t have to worry about the logs combined of all the Webservers, I could just identify a transaction in one node, and stick into that node’s log.

I was working with SSH and Bash, no log aggregators existing today were available at that time.

So when I started to catch web logs and grep a bit an smile was drawn into my face. :)

There were no transactions repeated by a bad behavior on MySQL Masters, or by BackEnd problems. Actually the HTTP requests were performed twice.

And the explanation to that was much more simple.

Many Windows and Mac User are used to double click in the Desktop to open programs, so when they started to use Internet, they did the same. They double clicked on the Submit button on the forms. Causing two JavaScript requests in parallel.

When I explained it they were really surprised, but then they started to worry about how they could fix that.

Well, there are many ways, like using an UUID in each request and do not accepting two concurrents, but I came with something that we could deploy super fast.

I explained how to change the JavaScript code so the buttons will have no default submit action, and they will trigger a JavaScript method instead, that will set a boolean to True, and also would disable the button so it can not be clicked anymore. Only if the variable was False the submit would be performed. It was almost impossible to get a double click as the JavaScript was so fast disabling the button, that the second click will not trigger anything. But even if that could be possible, only one request would be made, as the variable was set to True on the first click event.

That case was very funny for me, because it was not necessary to go crazy inspecting the different layers of the system. The problem was detected simply with HTTP logs. :)

People often forget to follow the logic steps while many problems are much more simple.

As a curious note, I still see people double clicking on links and buttons on the Web, and some Software not handling it. :)

I cancelled my Amazon Prime subscription

I was using a lot Amazon. Sending parcels to my previous job offices, and now to Blizzard offices, so I subscribed to Amazon Prime. With COVID-19 virus we were sent to do Remote Work, and now with the lock down basically I’m 99.99% of the time at home.

I did a test to see how it works sending to home during the pandemic.

I choose two different items, I reviews the order, they were going to be delivered separately, one day of distance.

I choose two items that will fit in my mailbox, separated or together. One USB3 3mts male female and a Blu-ray movie.

My surprise comes when I go to the mailbox one day before and I see that I have a paper from an-post telling that they pass by to deliver my parcel, and they did not leave because it doesn’t fit the mailbox and they did not want to leave it a common space. For my surprise both Amazon parcels were grouped and sent before time. Maybe in a bigger box. But the mailman did not ring my door.

The paper tells me to get my parcel in the middle of the city, during the lock down. No way! I’m not going to risk my health and specially from elders, just to grab a cable and a movie.

I had the chance to request re-delivery to an Post, so I do. I fill all the info, I inform my phone number, email, I indicate which door to ring, and two days after as promised… a paper from an Post!.

They did not even rang my bell again.

I go to Amazon to cancel the order, but the process is only created for if you got the items.

Fuck it. I’m not going to order anything else to Amazon until that COVID-19 passes.

I don’t know if the postman just avoids people for fear to contagion or the process of an Post is awful and he didn’t get any information. But I’ll not buy anything even if I cannot buy in other places cause the lock down.

I was going to maintain my Amazon Prime subscription, even if I know that I’ll not use it much with the lock down, but makes no sense. Also:

  • I use Netflix and my Raspberry Pi 4, I was not using Amazon Prime Video.
  • I use Spotify, I was not using Amazon Prime Music.
  • I like to read in paper, not in eBook, so I was not using the eReader options.

A nice way to loss a customer.