I need to track feature usage for an application so I can do the following:
track feature usage for a user. We have 20+ features and we want to limit feature usage. Think, select count(*) from db where user_id = 1 and feature_id = 1 that have to be calculated on the fly.
must have fast read write ops.
able to do machine learning on data
do I need horizontal scaling?
I've been pointed towards elastic search and wondering if there's better alternatives.
Don’t reinvent the wheel and write this yourself. Have your application write out a log, ingest the log into a tool, and use the tool for your analytics.
Often that's done with two databases - one database is fast to write, the other is fast to read.
Then you have a task that moves data from the fast write database over to the fast read one.
do I need horizontal scaling?
A simple table with no index at all on a fast server with a simple relational database should be able to handle several hundred thousand inserts per second. If you add indexes, it gets slower. Potentially unusably slow.
So, you have all your indexes on a second table. That one does have indexes, and with the right indexes it can handle hundreds of thousands of reads per second.
No scaling necessary, just two tables in one database. You should only need to scale when you run out of disk space.
You might also make it a little more complex, like have a write table for each day. Then you can copy the data over in a single batch and delete the table.