How does hivecomb categorize your posts?

A key element of social media has always been the algorithms that show users content that interests them. It's what keeps them hooked and coming back.
Hive has lacked this so far, but that has now changed. hivecomb.net is the new interface for casual Hive browsing. Set your filters by language, category, or sentiment, and find posts you care about but would never have seen otherwise.

On the backend, hivecomb uses the new CombFlow API, which lets any other user interface apply the same filters on its end.
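To make the filter idea concrete, here is a minimal local sketch of what filtering by language, category, and sentiment amounts to. It does not call CombFlow: the API's actual request format isn't described in this post, and the `ClassifiedPost` record, its field names, and `filter_posts` are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class ClassifiedPost:
    # Hypothetical record shape; the real CombFlow fields may differ.
    title: str
    languages: list   # a post can have more than one detected language
    categories: list  # leaf categories assigned to the post
    sentiment: str    # "positive" or "negative"


def filter_posts(posts, language=None, category=None, sentiment=None):
    """Keep posts matching every filter that is set; None means 'any'."""
    result = []
    for post in posts:
        if language is not None and language not in post.languages:
            continue
        if category is not None and category not in post.categories:
            continue
        if sentiment is not None and post.sentiment != sentiment:
            continue
        result.append(post)
    return result


# Toy data, invented for this example.
posts = [
    ClassifiedPost("Sunrise over the Alps", ["en"], ["photography"], "positive"),
    ClassifiedPost("Mi semana en Madrid", ["es", "en"], ["travel"], "positive"),
    ClassifiedPost("Patch notes rant", ["en"], ["gaming"], "negative"),
]

english_positive = filter_posts(posts, language="en", sentiment="positive")
for post in english_positive:
    print(post.title)
```

Filters that are left unset simply don't restrict the result, which is how a "set only what you care about" interface would typically behave.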

But how does it work exactly?

To start, we had to analyze the content on Hive and find a set of categories that fits it. In the current iteration, we have 9 main categories with 38 leaf categories in total.

Then comes the difficult part. We need a batch of posts for each category to create a centroid - a vector that future posts can be compared against. The lower limit is somewhere around 50 posts per leaf category, but hundreds are better. In the past, that would have required manual tagging, as user-provided tags aren't reliable. Thanks to AI and Hive account @pharesim's trusty Alienware laptop, we can work through 20k posts in about 10 hours.
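As a rough illustration of what "creating a centroid" can mean, the sketch below averages the vectors of all posts labeled with the same leaf category. It assumes each post has already been turned into a numeric vector by some embedding step (not shown, and not specified in this post), and that the centroid is a plain mean of length-normalized vectors. The function names and the toy data are hypothetical; hivecomb's actual pipeline may differ.

```python
from collections import defaultdict
from math import sqrt


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def build_centroids(labeled_vectors):
    """Build one centroid per leaf category.

    labeled_vectors: iterable of (leaf_category, vector) pairs.
    Assumption: a centroid is the normalized mean of the normalized vectors.
    """
    grouped = defaultdict(list)
    for category, vec in labeled_vectors:
        grouped[category].append(normalize(vec))

    centroids = {}
    for category, vectors in grouped.items():
        dim = len(vectors[0])
        mean = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
        centroids[category] = normalize(mean)
    return centroids


# Toy 3-dimensional "embeddings"; real ones would have hundreds of dimensions
# and there would be 50+ posts per leaf category.
samples = [
    ("photography", [1.0, 0.0, 0.0]),
    ("photography", [0.0, 1.0, 0.0]),
    ("gaming", [0.0, 0.0, 2.0]),
]
centroids = build_centroids(samples)
print(centroids)
```

With only a handful of posts, a single unusual post can pull the mean far off, which is one way to read the "around 50, but hundreds are better" rule of thumb.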

Once these centroids have been created, a worker script can run through the blockchain and classify each post in less than a second. It also guesses the language(s) and a sentiment: is the post written in a positive or a negative mood?
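The category part of that step can be sketched as a nearest-centroid lookup. The example below assumes cosine similarity as the comparison measure and a fixed cutoff below which a category is not assigned. Both are assumptions, as are the `classify` function, its `threshold` and `top_k` parameters, and the toy centroids. Language and sentiment detection are not covered here.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(post_vector, centroids, threshold=0.5, top_k=3):
    """Return up to top_k (category, score) pairs scoring at least threshold.

    threshold and top_k are hypothetical tuning knobs, not hivecomb settings.
    """
    scored = [(cosine(post_vector, c), name) for name, c in centroids.items()]
    scored.sort(reverse=True)  # best match first
    return [(name, score) for score, name in scored[:top_k] if score >= threshold]


# Toy centroids, invented for this example.
centroids = {
    "photography": [1.0, 0.0, 0.0],
    "gaming": [0.0, 1.0, 0.0],
    "travel": [0.7, 0.0, 0.7],
}

matches = classify([0.9, 0.1, 0.4], centroids)
for name, score in matches:
    print(f"{name}: {score:.3f}")
```

Comparing one vector against a few dozen centroids is cheap, which fits with classification taking well under a second per post. In a setup like this, a category that "catches too much" could be tightened by raising its cutoff or by cleaning up the posts its centroid was built from.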

In the current iteration, we have mixed results overall. A lot of the categories work very well already, and you can really explore the topic without too many false results. Some others catch too much, or not enough, and some might be missing or have to be replaced. But in general it works, and will only improve over time.
You can help with that! If you're logged in and looking at a post, there's a flag icon next to the categories at the top of the window. Just write a short comment following the example text and hit send. Manual categorizations are invaluable, and telling the algorithm that something doesn't belong in a category will help keep the categories clean in the future.

The language detection works great. If you find mistakes there, please report them on Discord.
The sentiment detection still needs work and is not very reliable right now, but it's fun to have in the meantime.

What's planned?
Besides tuning the category detection, there are quite a few options for the future. As everybody knows, other social media algorithms also track your every move, click, and scroll to fine-tune your personal feed. On hivecomb, that would be an option, and the API could even make it work across interfaces. But whether that's desirable is another question. Even if the user can choose to deactivate it, some might not be aware of the possible implications. If you'd like to join the discussion about the possibilities and ethics of this platform, come join the Hivecomb Discord community.