Bitcask Capacity Calculator
These calculators will assist you in sizing your cluster if you plan to use the default Bitcask storage back end.
This page is designed to give you a rough estimate when sizing your cluster. The calculations are a best guess and tend to be on the conservative side. It’s important to include some headroom, as well as room for unexpected growth, so that if demand exceeds expectations you’ll be able to add more nodes to the cluster and stay ahead of your requirements.
Recommendations
To manage an estimated 183.9 million key/bucket pairs, with bucket names of ~10 bytes, keys of ~36 bytes, values of ~36 bytes, and 16.0 GiB of RAM set aside per node for in-memory data management, in a cluster configured to maintain 3 replicas per key (N = 3), Riak using the Bitcask storage engine will require at least:
Details on Bitcask RAM Calculation
With the above information in mind, the following variables will factor into your RAM calculation:
Variable | Description |
---|---|
Static Bitcask per-key overhead | 44.5 bytes per key |
Estimated average bucket-plus-key length | The combined number of characters your bucket and key names will require (on average). We’ll assume 1 byte per character. |
Estimated total objects | The total number of key/value pairs your cluster will have when started |
Replication Value (n_val) | The number of times each key will be replicated when written to Riak (the default is 3) |
The actual equation
Approximate RAM Needed for Bitcask = (static Bitcask per-key overhead +
estimated average bucket+key length in bytes) * estimated total number of
keys * n_val
Example:
- 50,000,000 keys in your cluster to start
- approximately 30 bytes for each bucket+key name
- default n_val of 3
The amount of RAM you would need for Bitcask is about 10.41 GiB across your entire cluster: (44.5 + 30) * 50,000,000 * 3 = 11,175,000,000 bytes.
Additionally, Bitcask relies on your operating system’s filesystem cache to deliver high performance reads. So when sizing your cluster, take this into account and plan on having several more gigabytes of RAM available for your filesystem cache.
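The RAM equation above can be sketched as a small Python script. This is a minimal illustration, not part of Riak: the function names `bitcask_ram_bytes` and `to_gib` are hypothetical, the 44.5-byte overhead is taken from the table above, and the inputs are the example values (50,000,000 keys, ~30 bytes per bucket+key name, n_val of 3).

```python
# Static Bitcask per-key overhead in bytes, as listed in the table above.
STATIC_OVERHEAD_BYTES = 44.5


def bitcask_ram_bytes(total_keys, avg_bucket_key_len, n_val=3,
                      static_overhead=STATIC_OVERHEAD_BYTES):
    """Approximate cluster-wide RAM (in bytes) needed for Bitcask.

    Implements: (static overhead + average bucket+key length)
                * total keys * n_val
    """
    return (static_overhead + avg_bucket_key_len) * total_keys * n_val


def to_gib(num_bytes):
    """Convert a byte count to GiB (1 GiB = 1024**3 bytes)."""
    return num_bytes / 1024 ** 3


if __name__ == "__main__":
    # Example values from this page: 50M keys, ~30-byte names, n_val = 3.
    ram = bitcask_ram_bytes(total_keys=50_000_000, avg_bucket_key_len=30)
    print(f"{ram:,.0f} bytes = {to_gib(ram):.2f} GiB across the cluster")
```

With the 44.5-byte overhead this works out to 11,175,000,000 bytes, or about 10.41 GiB. The result is sensitive to the per-key overhead you plug in: a 40-byte overhead with the same inputs, for instance, would give about 9.78 GiB. The estimate covers only Bitcask's in-memory key data, so the filesystem cache headroom mentioned above comes on top of it.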