STEEMSQL - Update 10 - New features and performances

Enjoy new features and performance improvement for your queries.



SteemSQL is a public MS-SQL database with all the blockchain data in it.

Previous posts:

What’s new?

Many people use SteemSQL to get information about followers or posts that have been resteemed.

The problem is that these operations are not associated with a dedicated transaction, but are stored in the blockchain using custom_json transactions. In the SteemSQL database, You will find these transaction in the the TxCustoms table.

Storing the details of a follow or reblog operation in a JSON string is pretty annoying because this requires deserializing the content of the JSON string before you can analyze its contents.
Even if SQL server has JSON functions to perform such operation, queries are quite slow as the server has to process every row before being able to filter data. Unfortunately, it is not possible to create an index on values contained in the JSON string.

Searching for followers or resteemed posts is all but efficient, especially when you know that the TxCustoms table contains almost 14 million records.

But that's not all, the format of the JSON string for 'follow' operations has changed over time. This has to be taken into account when parsing the data and writing an efficient query quickly becomes cumbersome.

Long story made short, it is really a mess when you want to analyze followers or reblogs!

Fortunately, SteemSQL comes to the rescue with new features.

The Followers and the Reblogs views

Two new views have been created that will allow faster and easier analysis of followers and resteemed posts

1. The Followers view

This is a very simple view, which contains only 2 columns:

  • follower : the name of the follower
  • following : the name of the followed user

The Followers view is not a transaction view, meaning it does not expose historical data and does not allow to know when a user start or stop to follow another account.

Knowing how many followers I have is now as simple as issuing a query like

SELECT COUNT(*) FROM Followers (NOLOCK) WHERE following = 'arcange'

2. The Reblogs view

This view is also very simple and contains the following columns:

  • account : the user who resteemed a post
  • author : the author of the resteemed post
  • permlink : the permlink of resteemed post
  • timestamp : when the post has been resteemed

Knowing how many of my posts have been resteemed is now as simple as issuing a query like

SELECT COUNT(DISTINCT permlink) FROM Reblogs (NOLOCK) WHERE author = 'arcange'

Future performance improvements

As more and more are using SteemSQL, I plan to continue my work on improving SteemSQL performances.

However, I will first take some well deserved vacation.
In the meantime, do not hesitate to send me your suggestions or comments. contact me in the dedicated SteemSQL channel on steemit.chat or via Telegram (@VIM_Arcange)

Thanks for reading.


All payout from this post will be dedicated to keep SteemSQL running.
Thanks for your support.

footer created with steemitboard - click any award to see my board of honor

Support me and my work as a witness by voting for me here!


You Like this post, do not forget to upvote or follow me or resteem

H2
H3
H4
3 columns
2 columns
1 column
18 Comments
Ecency