MongoDB is fast, but only when your working set or index can fit into RAM. So if my server has 16G of RAM, does that mean the sizes of all my collections need to be less than or equal to 16G? How does one say "ok this is my working set, the rest can be "archived?"
"Working set" is basically the amount of data AND indexes that will be active/in use by your system.
So for example, suppose you have 1 year's worth of data. For simplicity, each month relates to 1GB of data giving 12GB in total, and to cover each month's worth of data you have 1GB worth of indexes again totalling 12GB for the year.
If you are always accessing the last 12 month's worth of data, then your working set is: 12GB (data) + 12GB (indexes) = 24GB.
However, if you actually only access the last 3 month's worth of data, then your working set is: 3GB (data) + 3GB (indexes) = 6GB. In this scenario, if you had 8GB RAM and then you started regularly accessing the past 6 month's worth of data, then your working set would start to exceed past your available RAM and have a performance impact.
But generally, if you have enough RAM to cover the amount of data/indexes you expect to be frequently accessing then you will be fine.
Edit: Response to question in comments
I'm not sure I quite follow, but I'll have a go at answering. Firstly, the calculation for working set is a "ball park figure". Secondly, if you have a (e.g.) 1GB index on user_id, then only the portion of that index that is commonly accessed needs to be in RAM (e.g. suppose 50% of users are inactive, then 0.5GB of the index will be more frequently required/needed in RAM). In general, the more RAM you have, the better especially as working set is likely to grow over time due to increased usage. This is where sharding comes in - split the data over multiple nodes and you can cost effectively scale out. Your working set is then divided over multiple machines, meaning the more can be kept in RAM. Need more RAM? Add another machine to shard on to.
The working set is basically the stuff you are using most (frequently). If you use index A for collection B to search for a subset of documents then you could consider that your working set. As long as the most commonly used parts of those structures can fit in memory then things will be exceedingly fast. As parts no longer fit in your working set, like many of the documents then that can slow down. Generally things will become much slower if your indexes exceed your memory.
Yes, you can have lots of data, where most of it is "archived" and rarely used without affecting the performance of our application or impacting your working set (which doesn't include that archived data).