Following the incident that occurred on Friday, March 5, 2021, we believe it is of the utmost importance to inform you about what happened through this incident report.
Over the years we have always distinguished ourselves by offering a solid service, based above all on the principle of operational continuity, which has allowed the company to hold the world record as the longest-standing cryptocurrency exchange: more than 11 years of consolidated operation.
We have always proven the company's capacity to deal with turbulence by reacting positively and overcoming critical moments with full self-awareness. This competence has helped us develop our ability to learn from experience, by understanding our mistakes and using them as a means to improve our company.
Unfortunately, on Friday, March 5, we experienced, for the first time in our history, a technical problem that caused a major downtime and made our services unreachable.
We would like to emphasize that we have not suffered any loss of funds or data, because this was not an attack but only a technical problem. Moreover, the exchange operations (orders, trading, and funds) were not affected: at the end of the emergency, everything restarted regularly from the state it was in before the disruption occurred.
Our experience, our mistakes, our achievements, and the know-how we have acquired, together with this latest unprecedented event, have led the company to further improve its systems and significantly strengthen its ability to cope with emergency situations, confirming once again the excellent organizational resilience of our Group.
The primary cluster, located at the provider's warehouse, became unreachable, and as a result the services hosted on it could not be delivered.
The outage lasted through the morning of Friday, March 5, and the early afternoon was spent on the checks needed to restore regular operations.
The issue was caused by the provider that hosts the systems involved. We are awaiting an official report from the provider, with whom we are in constant contact.
- 1:37 AM (CEST) – The monitoring systems of 3 different services reported a global outage of our platform, API, and web.
- 1:50 AM – 6:55 AM (CEST) – The on-call IT staff, having noticed that the services were unreachable and unaware of the provider's error, immediately tried to understand what was happening. They first carried out routine checks to rule out an attack. Once these checks and analyses were complete, they began a detailed diagnosis to determine whether the problem was software or hardware. The on-call staff escalated internally, on the assumption that the problem could be on the software side and therefore not the provider's responsibility.
After verification, the IT staff understood that the problem was probably on the hardware side and could not be solved independently. After a prolonged wait, our IT staff received an initial response from the provider at 6:55 AM.
- 6:55 AM (CEST): The provider reported that a review by its data center technicians had found no anomalies and that further verification by its networking team would be required, on the assumption that the problem could be at the network layer.
- 8:30 AM (CEST): Following on-site reviews among its various departments, the provider informed us that an internal error had led to a decommissioning: our systems and storage had been removed from the warehouse and moved to another warehouse, due to an as yet undefined problem, probably procedural (we are awaiting the provider's full incident report, which will be delivered to us in no less than 7 business days). Shortly afterward, the provider informed us that it had taken charge, with extreme urgency, of the physical restoration of the servers and that this procedure would take a maximum of 1 hour.
At the same time, we had already started the procedures to activate Disaster Recovery (DR), to be used only if the systems proved completely unrecoverable.
- 09:30 – 13:30 (CEST): The provider retrieved the systems, verified their hardware status, and gave top priority to putting all the systems involved back into the rack and performing the complicated low-level wiring and verification tasks.
Considering the situation, and in light of the communications from the provider that led us to believe the recovery would be completed within an hour, we decided to hold off on activating the DR site. The complex restoration work, partly because of some unforeseen events on the provider's side, took the provider longer than expected, so we regained full access to the systems only at 13:30.
- 13:30 – 15:00 (CEST): At this point, our IT team immediately began the extensive procedures of checking and verifying the integrity of funds, data, and databases. The team found some non-viable subsystems that presented critical issues and required careful verification, so as not to run into problems when reactivating the platform. We then decided to reactivate the original platform and definitively discontinue the DR activity.
- 15:45 (CEST): The IT team reactivated the API first and, a few minutes later, the platform and related services.
Despite the exceptional and unpredictable nature of the event, this is a scenario that, although unlikely, had already been taken into consideration. For this reason, prior to the incident, we had already contracted a solution that, in addition to enhancing the technological infrastructure, would also improve the current priority channel, in order to reduce intervention and response times as much as possible. Unfortunately, the incident occurred before the date on which, as per contract, the agreement and the new conditions take effect.
For more information, we are available at the following email: email@example.com