Is Unfairness the Cause of Hive's Low User Retention?
Just over a year ago I wrote a series of posts exploring user retention on Hive. In part I wanted to use data to explore how user retention has changed and how it has affected the growth of the project, but the main aim of the series was to try to answer the question "why do so many users leave Hive?".
If you ask the community, it will seem like this question is very easy to answer. Many people will tell you it's because of the complexity of the user experience such as the need to use and store crypto keys. Others will tell you with certainty that it's because downvoting exists, and its existence is stifling to new users. Several other answers will be given, but the problem in general with all of these explanations is a lack of data. It's easy to say that one thing is leading to another, and even to provide a reasoning that seems sound. It's harder to demonstrate it to be true with data. Until we can do that, we're making stabs in the dark - we all have competing ideas of what needs to be changed but any or all of those ideas can be wrong.
In that series, I did manage to come to at least one meaningful conclusion about user retention on Hive, ruling out one of the leading explanations. Hive users do not leave because of a lack of rewards; if anything, the data suggests that users leave in greater numbers when rewards are high. Retention also appears to have little to do with bear markets: again, if anything, bull markets feature lower user retention than bear markets, and the problem in bear markets typically becomes a lack of new joiners.
I eventually stopped posting in the user retention series because some of the hypotheses are really hard to test with the data I can extract, particularly my personal favorite, which is poor user experience, especially the handling of crypto keys. I don't know how to test that with the resources I have available, nor the downvote hypothesis.
More recently, however, I have been exploring Granger causality on Hive, a statistical method for exploring possible causal links in data. With this method, I can come back to some of the ideas around user retention, such as those raised in the comments on that post.
There are a couple of interesting, testable hypotheses in these comments, and a third if you combine them.
- Users leave because of perceived unfairness of rewards
- Users leave because of lack of engagement
- Users leave because of perceived unfairness in engagement.
In this post, I will examine the first hypothesis with Granger testing, and with the same methodology I will also examine whether levels of inequality can be used to predict changes in the price of Hive.
Preparing the Data
To test the hypothesis that users leave because of perceived unfairness of rewards, it must be reframed as something more measurable. My post on inequality on Hive described using the Gini coefficient as a measurement of inequality on Hive, and it explored how rewards inequality has changed over time. For this test, we will use rewards inequality as a proxy for unfairness, as it is more readily measurable. Unfairness and inequality are related concepts but not the same. It is quite possible for something to be "equal" but still be perceived as unfair, for example if two people receive the same rewards but one put far more effort into their work than the other, or is merely perceived to have done so. For our purposes though, we will simply examine inequality as a factor.
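For readers who have not met the Gini coefficient before, here is a minimal sketch of how it can be computed for one day of rewards. The function name `gini` and the sample payouts are made up for illustration; this is not necessarily how the figures in this post were produced.

```python
def gini(values):
    """Gini coefficient of non-negative values.

    0 means everyone received the same amount; values close to 1 mean
    almost everything went to a single recipient.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted formula on sorted data:
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with i starting at 1
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


# Hypothetical daily payouts, purely illustrative (not real Hive data).
equal_day = [5.0, 5.0, 5.0, 5.0]
skewed_day = [0.0, 0.0, 0.0, 20.0]
print(gini(equal_day), gini(skewed_day))
```

Repeating this for each day's rewards would give one Gini value per day, which is the kind of daily series the tests below work with.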
For a better understanding of the methods here, check out my first post examining the impact of Splinterlands on the price of Hive, or check out the Wikipedia page or Real Statistics guide on Granger causality.
First we must examine whether the rewards Gini coefficient on Hive qualifies as "stationary". Here is the chart from the post on inequality.
Since there is a slight trend, I suspect the data might not be stationary, but we will perform Augmented Dickey-Fuller tests anyway.
It turns out to be stationary, so we do not need to do anything more to find a stationary dataset.
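To give a feel for what a stationarity test does, here is a simplified sketch of the plain Dickey-Fuller regression, without the extra lag terms that make the test "augmented". It regresses each day's change on the previous day's level and compares the resulting t statistic with -2.86, the usual asymptotic 5% critical value for the constant-only case. The function names and the synthetic data are assumptions for illustration; for real work a library routine such as `adfuller` in statsmodels would be the safer choice.

```python
import random
from itertools import accumulate
from math import sqrt

# Asymptotic 5% critical value for the Dickey-Fuller test with a constant.
DF_CRITICAL_5PCT = -2.86


def dickey_fuller_t(series):
    """t statistic on rho in: change_t = a + rho * level_(t-1) + error."""
    lagged = series[:-1]
    diffs = [b - a for a, b in zip(series, series[1:])]
    n = len(diffs)
    if n < 3:
        raise ValueError("need at least four observations")
    mean_x = sum(lagged) / n
    mean_d = sum(diffs) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxy = sum((x - mean_x) * (d - mean_d) for x, d in zip(lagged, diffs))
    rho = sxy / sxx
    intercept = mean_d - rho * mean_x
    rss = sum((d - intercept - rho * x) ** 2 for x, d in zip(lagged, diffs))
    std_err = sqrt(rss / (n - 2) / sxx)
    return rho / std_err


def looks_stationary(series):
    """True when the unit-root hypothesis is rejected at roughly the 5% level."""
    return dickey_fuller_t(series) < DF_CRITICAL_5PCT


# Synthetic demo (not Hive data): white noise versus a random walk.
rng = random.Random(1)
noise = [rng.gauss(0, 1) for _ in range(500)]
walk = list(accumulate(noise))
print(looks_stationary(noise), dickey_fuller_t(walk))
```

A strongly negative statistic suggests the series keeps pulling back toward its mean, which is what "stationary" means in this context.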
The first data we will test it against is the daily change in active users.
This dataset is also determined to be stationary.
First a correlation: -0.00248. This is a negative correlation but trivially small, indicating there is basically no immediate relationship between rewards inequality and changes in active users, which probably should not be a surprise.
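As a rough illustration of this step, the sketch below turns a series of daily active-user counts into daily changes and then computes the Pearson correlation against a Gini series. All names and numbers here are hypothetical, and the choice to align each day's change with that same day's Gini value is my assumption.

```python
from math import sqrt


def daily_change(series):
    """Differences between consecutive days: series[t] - series[t - 1]."""
    return [b - a for a, b in zip(series, series[1:])]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up numbers, purely illustrative (not real Hive data).
active_users = [1000, 1020, 990, 1005, 1010]
gini_by_day = [0.91, 0.92, 0.90, 0.93, 0.91]

changes = daily_change(active_users)
# Differencing drops the first day, so drop the first Gini value to align.
print(pearson(gini_by_day[1:], changes))
```

A same-day correlation like this says nothing about delayed effects, which is why the lagged Granger tests are the more interesting part.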
Given that both datasets are stationary, we can now examine them with Granger Testing. As before, statistically significant findings are highlighted in yellow. The strongest effects (F > 3) are further highlighted in green.
We can see that there do appear to be causal links between inequality on Hive and user activity, in both directions.
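For anyone curious where the F values come from, here is a minimal sketch of a Granger F-test for a single lag length. It fits one regression that predicts a series from its own past values, then a second that also includes past values of the other series, and checks how much the extra terms reduce the error. The helper names and the synthetic data are assumptions for illustration, not the tooling behind the tables in this post; statsmodels offers `grangercausalitytests` for real use.

```python
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols_rss(rows, targets):
    """Residual sum of squares of a least-squares fit (normal equations)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, t in zip(rows, targets))


def granger_f(x, y, lag):
    """F statistic for the question: do past values of x help predict y?"""
    restricted, unrestricted, targets = [], [], []
    for t in range(lag, len(y)):
        own = [y[t - i] for i in range(1, lag + 1)]
        other = [x[t - i] for i in range(1, lag + 1)]
        restricted.append([1.0] + own)            # y's own history only
        unrestricted.append([1.0] + own + other)  # plus x's history
        targets.append(y[t])
    rss_r = ols_rss(restricted, targets)
    rss_u = ols_rss(unrestricted, targets)
    dof = len(targets) - (2 * lag + 1)
    return ((rss_r - rss_u) / lag) / (rss_u / dof)


# Synthetic demo (not Hive data): y follows x with a one-day delay.
rng = random.Random(7)
x = [rng.gauss(0, 1) for _ in range(300)]
y = [0.0] + [0.8 * x[t - 1] + rng.gauss(0, 1) for t in range(1, 300)]
f_xy = granger_f(x, y, lag=2)  # expected to be large
f_yx = granger_f(y, x, lag=2)  # expected to be small
print(f_xy, f_yx)
```

Running the function once per lag length, and in both directions, would produce a grid of F values like the ones highlighted above.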
Finally, let's examine the relationship between rewards inequality and the Hive price. There is a very weak negative correlation between inequality and the price of Hive: -0.159. This means that higher inequality tends to coincide with a lower Hive price, but the relationship is very weak and it says very little with regard to causality.
Daily changes in the price of Hive have been determined to be stationary in previous posts. Granger tests are below.
No fields are highlighted because there is no statistically significant finding here.
Conclusions
The level of inequality of rewards appears to have an impact on user activity within 2 to 3 days after users make a post. This impact is considerably less than the impact that price has on user activity, but it warrants further investigation. Consider when most users stop posting. The vast majority of accounts never make a post or comment at all, but of those who do become active, 24% have made their last post or comment within 3 days of joining.
Given this, and the results from the Granger test above, it is entirely possible that inequality of rewards contributes to a substantial part of our user retention problem. At the very least it is worth looking into this further.
Changes in the number of active users have an impact on the equality of rewards. This is a longer lasting effect than the reverse, and can be seen from 2 days to 3 weeks after a change in active users. Perhaps this represents a larger number of users competing for the same pool of rewards, with more new users making for higher inequality.
The data shows no statistically significant impact of rewards inequality on price, or of price on rewards inequality. Perhaps this means that inequality has no relationship with the price at all, or perhaps the level of inequality has never reached a low enough level on Hive to have a measurable impact on price.
My statistics and analysis posts take many hours each to research, chart and write, so if you find them valuable and of interest to other Hivers, I appreciate your support in sharing, commenting, and/or upvoting my work. If you're interested in these kinds of stats posts, click the 'follow' button on my profile, or subscribe to the Hive Statistics Community, which features daily Hive stats posts as well as less regular posts from myself and others.