🐢 VT

Heavy Keeper

In my company, the team I’m working in is responsible for authentication and authorization. For that, the system is constantly targeted by suspicious users.

For example, we were once experienced an attack wave where a random dude tried to brute-force our customers password.

These kind of attacks can be mitigated by adding rate-limiting and captcha in front of the API. We also log those events and use Graylog to analyze the data so the Security team can react to when they see a suspicious action.

But Graylog is not suitable if the attacker is sophisticated, who spreads the attack over several days, instead of brute-forcing in a short period like 1 or 2 hours.

To deal with that problem and also to have more flexibilities in how we handle these suspcious users, in an automatic, real-time manner, we implement a probabilistic data structure (DS) called Heavy Keeper (HK)

At its simplest, HK is just a key-value pair list which in our case, key is the fingerprint to identify a user and value is the frequency of an event.

What makes HK different from normal key-value DS is that, the value has a TTL. Which means the item will stay in the list, as long as its value is big enough to be consider as an Elephant. Otherwise, soon, it will be removed, or in HK’s term, decayed, from the list to make room for other items.

That’s why HK is called count-with-exponential-decay algorithm. This characteristic ensures high accuracy with small required memory.

For details, please refer to the Paper here.

#Programming #Til