17th update of 2021 on BlockTrades work on Hive software

blocktrades update.png

Below is a list of Hive-related programming issues worked on by BlockTrades team during last week or so:

Hived work (blockchain node software)

Many users last week experienced problems getting their transactions accepted into the blockchain. It turns out there were two separate problems that could cause this issue, but to the outside observer they probably looked much the same:

Beem had a problem detecting v1.25.0 Hive’s chainid, leading to invalidly signed transactions

The first problem, discovered a few days ago, was that the python-based Beem API library used by several Hive apps was misdetecting v1.25.0 Hive nodes as Steem nodes, and therefore trying to sign with the wrong chain id. This, in turn, led to Beem-based Hive Apps creating transactions that would be rejected by any Hive node, because the transactions weren’t properly signed for Hive.

We assisted Beem developers with identifying the source of the problem and suggested a solution. Beem now checks for the configuration constant HIVE_CHAIN_ID, before it checks for the constant STEEM_CHAIN_ID, giving precedence to Hive nodes over Steem nodes in the detection process and thus using the proper chainid during transaction signing.

In related work, we replaced the name STEEM_CHAIN_ID with OLD_CHAIN_ID and eliminated some duplicated constant values returned by get_config (but note these changes will not be included in the master branch until after the hardfork, like other changes completed in the last week):
https://gitlab.syncad.com/hive/hive/-/merge_requests/268
https://gitlab.syncad.com/hive/hive/-/merge_requests/267

Snapshot-based initialization of Hived-nodes requires a restart to properly initialize chainid

The second problem appeared one day later, but a couple of days before the Beem library was updated, so the two problems overlapped in time, and this led to some confusion among app developers as to what was the source of the new problem.

But by testing against various API nodes, we were able to determine that this second problem, although similar in appearance to the Beem problem, was occurring on non-Beem apps (e.g. hive.blog and peakd) and only occurring when those apps were using api.hive.blog as their API node. Since we hard recently updated the hived nodes on api.hive.blog, I suspected the problem has something to do with the update process.

After comparing update procedures used by other API node operators, I noticed that most of the API node operators had done a replay to update their nodes, whereas we had updated our internal hived nodes from a snapshot. Ultimately this allowed us to identify a previously undetected error with nodes initialized from a snapshot: the variable that stores the chainid isn’t part of the database state that gets loaded from the snapshot data and the snapshot loading code didn’t trigger this variable to be updated. So a node updated in this way would reject validly signed transactions because it was operating with the wrong chainid (essentially the reverse of the Beem problem).

Snapshot workaround

We also found that this improperly set chainid problem was resolved by stopping and restarting the node that was initialized from a snapshot, and that’s how we have worked around the issue until we release a full fix for snapshot-based node initialization.

The code fix is currently being implemented and tested, and will be released after the hardfork (to keep things simple, we’re avoiding making too many changes to the development branch until after the hardfork has triggered).

Completed hived work

Improvements to Testtools used to verify hived functionality: https://gitlab.syncad.com/hive/hive/-/merge_requests/266

Removed the copy/pasted version of secp256k1 library from hived and replaced with a submodule link:
https://gitlab.syncad.com/hive/hive/-/merge_requests/264

Fixed issue where enum_virtual_ops sometimes returned non-virtual ops: https://gitlab.syncad.com/hive/hive/-/merge_requests/219

We completed work to eliminate duplicated code in the command-line wallet for hived, but we’re delaying merging those changes into the develop branch until later because of the sheer amount of changes involved. The still-open merge request for that work is here: https://gitlab.syncad.com/hive/hive/-/merge_requests/170

Continuing work on hived

We’re working on speeding up the blockchain converter that we’ve developed to initialize and control a testnet configured in basically the same way as a mainnet. This is to allow for more realistic testing inside of a testnet environment (especially for better performance testing). So far, we’ve identified that the performance bottleneck seems to be related to the function that generates nonces.

We’re continuing to work on the sql serializer plugin for hived that provides data to HAF-based applications. We expecting to complete that work this week, assuming no further distractions arise in the meantime (this work got delayed while we were troubleshooting the snapshot problem), then begin performance testing.

We’re also continuing to work on the CI system to automate deployment of hived for more strenuous testing scenarios (e.g. long replay testing).

Hivemind (2nd layer applications + social media middleware)

Fix for previously report list_subscribers bug: https://gitlab.syncad.com/hive/hivemind/-/merge_requests/513

We continue to investigate ways to improve performance of the update_rshares function immediately after massive sync of a hivemind instance. In addition to analyzing the cause of slowness of the current implementation, we’re also simultaneously looking at two different alternatives to improve overall performance: 1) eliminating the call completing and performing equivalent functionality during live sync and 2) adding an index (at least temporarily just after massive sync) to speed up update_rshares.

The new index dramatically speeds up performance of update_rshares (time drops from 10 hours to 3 minutes and the index only takes about 10minutes to create), but it consumes 25GB of disk space (hence the idea to drop it after using it).

We’re still running functional and performance tests related to the broken reputation/ordering issue because of multiple transactions used per block. As part of this work we’re also adding “completion markers” to the database state to better detect any database consistency error that can arise from an abrupt shutdown of postgres (notably, this can happen when postgres does an auto-upgrade for a security fix).

We also resumed research into the reason why some hivemind nodes consume more memory than others. It has been suggested that it may be related to differences in python or python-library installations on the different systems.

Hive Application Framework

We made good progress last week on both HAF code and the documentation for it, but our main dev for this work is now on vacation until June July 19th, at which time the work will resume.

But in the meantime, we should be able to do performance testing of this code using the new sql_serializer plugin, and I’m looking forward to seeing just how well we do.

Reminder: hardfork 25 triggers tomorrow

Witnesses and devs alike will be “manning their keyboards” tomorrow (June 30th) as hardfork 25 triggers, ready for any potential problems that may arise. The hardfork is scheduled for approximately 14:00 UTC.

H2
H3
H4
3 columns
2 columns
1 column
35 Comments
Ecency