26th update of 2021 on BlockTrades work on Hive software

blocktrades update.png
Below is a list of some of the Hive-related programming issues worked on by BlockTrades team during the past two weeks:

Hived work (blockchain node software)

Hive nodes operating stably throughout past weeks

I recently heard reports that instability in the hive network had led to some failed hive-engine trading transactions, but we haven’t observed any problems in the operation of hived nodes. This can easily be seen in that normal hive transactions haven’t had any similar problems.

Based on my previous review of hive-engine transactions, I believe the problem is that some of these transactions are just too big to fit into partially filled hive blocks because they contain large numbers of proposed trades.

If I’m correct, there’s only two ways I can see to address this problem: increase the size of Hive blocks or reduce the size of these transactions. In the near term, I believe the optimal solution would be for hive-engine servers to use encoding techniques when creating these transactions to reduce their size.

Code to set new HBD haircut ratio and soft limit completed and tested

We completed the code and tests associated with changing the hard and soft limits for HBD supply, so I’ll start collecting final feedback soon on where we should set the new limits for HF26.

We’ve tentatively set the new hard limit to when HBD supply reaches 30% of the virtual supply of Hive (in the current hardfork it is set at 10%).

And we’ve set the both the soft range limit (the starting and ending point where post rewards begin to change from HBD to liquid Hive) at 20%. In other words, post rewards would immediately switch from paying HBD to paying only liquid Hive at 20% instead of gradually shifting between the two currencies as the debt ratio increases.

Added C++ linter to hived build-and-test system (CI)

We modified the docker builder for hived to include clang-tidy linting and fixed resulting lint warning reported by non-optimal copy of a shared pointer: https://gitlab.syncad.com/hive/hive/-/merge_requests/291
Modify CmakeLists to enforce requirement for clang lint tools to be installed:
https://gitlab.syncad.com/hive/hive/-/merge_requests/286

sql_serializer plugin (writes blockchain data to a HAF database)

Our primary hived work this week was focused on testing, benchmarking, and making improvements to the SQL serializer plugin and HAF-based account history app:

sql_serializer performance testing

We re-tested the sql_serializer syncing to head block performance after all the bug fixes were verified and there were no performance regressions.

We’ve setup a bunch of new fast servers in our in-house datacenter to speed up our verification and benchmarking tests and we’ve just started experimenting with how fast we can get the IO systems for these systems on a reasonable budget using software RAID on mid- and high-end 2TB NVME drives (Force mp600, mp600 core, mp600 pro, Samsung 980, and possibly the rather expensive mp600 pro xt) with varying numbers of drives in the RAID array and with various distributions of the table space inside the HAF database between drives.

On our fastest system currently (an AMD5950X128GB with a 4xSamsung 980 RAID0 drive) it took 25594s to reach 58M blocks and restore database indexes. On a similar system with slower IO (an AMD5950X 128GBwith a 3x force mp600 RAID0 drive) it took 27351s.

On both these systems, performance seems to be set by a mix of CPU speed and IO speed, but on systems with more typical drives, disk IO speed will likely be the dominant factor as a full sync to 58M blocks creates a 1.6TB database.

Eventually we’ll also test on some slower systems, but for now we’re testing on our fastest systems when we do full syncs (we only sync to 5M blocks on our slower systems) as our primary testing goal right now is to check for bugs in the code and the tests are time-consuming even on our fastest machines.

Moved image server to much bigger (and much more expensive) server

We were almost out of disk space on our existing image server (only has 36TB of storage with RAID setup) so we’ve been migrating the images to a new server with a 168TB RAID drive. We completed the handoff to the new server this weekend and worked thru some minor issues that resulted (tuning the caches appropriately, fix a rate-limiting issue between the new server and api.hive.blog, etc). If you noticed any issues rendering or uploading an image this weekend, you were likely observing us at work.

During this process we noticed that the cache-busting code added to condenser was negatively impacting Cloudflare’s CDN-based caching (this became more obvious during our performance testing with the new image server because it is located further away from our US office and cache misses were more painful due to network latency), so we asked @quochuy to revert that change (which he has already done and will be deployed to production tomorrow). Once that change is deployed, I expect that avatars on hive.blog will render about 2x faster on average.

We also noticed in this testing that we could potentially reduce the delays incurred by cache misses in the future by creating a simple HAF-based app to locally maintain the link between hive accounts and hive avatars and avoid the current call to database.get_accounts that the image server makes to a remote hived node. In this scenario, a remote HAF server would keep a “hive account” to “hive avatar” mapping table and push the occasional table updates to a local replication of the mapping table on the image server. I think this will make a nice and simple “starter” task for someone looking to create their first HAF app

Hive Application Framework: framework for building robust and scalable Hive apps

Fixing/Optimizing HAF-based account history app (Hafah)

We found and fixed several bugs in the HAF software ecosystem (sql_serializer, hive_fork_manager, account history app) this week. We completed a full sync to head block using 7 sending threads and 7 receiving threads on both the C++-based account history app (took 19730s) and the newer, python-based account history app (took 22021s).

So the C++ version is currently a little over 11% faster at syncing than the python version. Ideally we’ll be able to tune the python version to achieve the same speed as the C++ version, in which case we’ll be able to drop the C++ version and just maintain the python version in the future. And it’s likely that whatever knowledge we gain during that analysis will be useful for future python-based HAF apps as well.

Upcoming work

  • Release a final official version of hivemind with postgres 10 support, then update hivemind CI to start testing using postgres 12 instead of 10. We finished a full sync to headblock of the new version and next @gandalf will deploy it for production testing tomorrow. I don’t expect any problems, so we’ll probably officially recommend API servers to upgrade to the new version this week.
  • Run new tests to compare results between account history plugin and HAF-based account history apps.
  • Simplify build and installation of HAF-based apps and create a repo with HAF components as submodules to track version requirements between HAF components.
  • Finish setup of continuous integration testing for HAF account history app.
  • Test and benchmark multi-threaded jsonrpc server for HAF apps.
  • Finish conversion of hivemind to HAF-based app (didn’t get back to this task last week). Once we’re further along with HAF-based hivemind, we’ll test it using the fork-inducing tool.
  • Continue work on speedup of TestTools-based tests.

Schedule predictions (always a bit dangerous)

At this point I’m fairly confident we’ll be able to release HAF for production use by the end of this month. Since HAF doesn’t impact hived consensus, it can be released whenever it is ready, without requiring a hardfork.

As for hardfork 26 itself, it is still scheduled for December/January time frame (we’ll set an official date early next month). We’ve got two HF-related tasks we still haven’t started on, but I don’t think they will be too difficult: 1) make some simple “low-hanging fruit” improvements to RC calculations (for example, we’ve seen that some operations of varying size don’t get charged based on the byte size of the operation) and 2) allow asset-related transactions to use numeric asset identifiers (aka NAIs) instead of strings as part of the process of deprecating string-based asset identifiers. I’m confident we can complete the first task in time for the hardfork, and I’m reasonably confident we can complete the second task as well.

H2
H3
H4
3 columns
2 columns
1 column
33 Comments
Ecency