First, some background on how message processing works:
- A high-priority 'receiver thread' reads the syslog messages from the network using I/O completion ports and puts the raw messages, without any decoding or further processing, into the 'received message queue'.
- The 'message processor thread' takes received messages from this queue, decodes the raw message contents and evaluates them against the configured rules. Depending on the outcome of the rule evaluation, messages are put into 'action queues'.
- One thread per action type is then responsible for processing messages in the corresponding 'action queue', e.g. write them to file.
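The three stages above can be sketched roughly as follows. This is only an illustration of the queue layout, not the actual implementation: the names (`receiver`, `message_processor`, `action_worker`, `run_pipeline`) and the rule format (a predicate plus an action name) are made up for the example, and plain Python threads stand in for the high-priority receiver thread and the I/O completion ports.

```python
import queue
import threading

STOP = object()  # sentinel that tells a worker thread to shut down


def receiver(raw_messages, received_q):
    # Stage 1: enqueue the raw bytes untouched; no decoding happens here.
    for raw in raw_messages:
        received_q.put(raw)
    received_q.put(STOP)


def message_processor(received_q, action_queues, rules):
    # Stage 2: decode each raw message and evaluate it against the rules.
    while True:
        raw = received_q.get()
        if raw is STOP:
            break
        text = raw.decode("utf-8", errors="replace")
        for predicate, action_name in rules:
            if predicate(text):
                action_queues[action_name].put(text)
    # Tell every action worker that no more messages will arrive.
    for q in action_queues.values():
        q.put(STOP)


def action_worker(action_q, sink):
    # Stage 3: one worker per action type drains its own queue.
    # A list stands in for the real action (e.g. writing to a file).
    while True:
        item = action_q.get()
        if item is STOP:
            break
        sink.append(item)


def run_pipeline(raw_messages, rules):
    received_q = queue.Queue()
    action_names = {name for _, name in rules}
    action_queues = {name: queue.Queue() for name in action_names}
    sinks = {name: [] for name in action_names}

    threads = [
        threading.Thread(target=receiver, args=(raw_messages, received_q)),
        threading.Thread(
            target=message_processor, args=(received_q, action_queues, rules)
        ),
    ]
    for name in action_names:
        threads.append(
            threading.Thread(
                target=action_worker, args=(action_queues[name], sinks[name])
            )
        )
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sinks
```

Note that there is only one `message_processor` in this sketch, so all decoding and rule evaluation is serialized through a single thread, while the receiver can keep adding to `received_q` without waiting for it.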
>I don't think I have a hardware bottleneck.
You do, and it is the performance of the CPU cores:
In your case the 'message processor thread', which decodes raw messages and evaluates them against the configured rules, cannot keep up with the rate at which new messages are put into the 'received message queue'. The queue therefore grows until all available heap memory is used up.
This means that if you used, for example, a CPU with 2 GHz cores instead of one with 1 GHz cores, the rate at which messages can be processed without queue buildup would roughly double (in practice slightly less than double, due to locking overhead).
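To make the queue buildup concrete, here is a small back-of-the-envelope calculation. All numbers and the function name `seconds_until_heap_full` are hypothetical and only illustrate the reasoning: the backlog grows at the difference between the arrival rate and the processing rate, and the heap is exhausted once the backlog no longer fits.

```python
def seconds_until_heap_full(arrival_rate, processing_rate, bytes_per_message, heap_bytes):
    """Return the seconds until the backlog fills the heap, or None if it never does.

    Rates are in messages per second. The model ignores per-message
    overhead and assumes both rates stay constant.
    """
    backlog_growth = arrival_rate - processing_rate  # messages per second
    if backlog_growth <= 0:
        return None  # the processor keeps up, so the queue does not build up
    return heap_bytes / (backlog_growth * bytes_per_message)


# Assumed example: 10,000 msg/s arrive, 6,000 msg/s are processed,
# each queued message takes 500 bytes, and 2 GB of heap are available.
slow_cpu = seconds_until_heap_full(10_000, 6_000, 500, 2_000_000_000)

# Same load with a core that processes twice as many messages per second.
fast_cpu = seconds_until_heap_full(10_000, 12_000, 500, 2_000_000_000)
```

With these assumed figures the backlog grows by 4,000 messages (2 MB) per second, so the heap is full after 1,000 seconds. With the doubled processing rate the queue never builds up, and the function returns `None`.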
Please note that the number of CPU cycles the 'message processor thread' spends on a single message depends mainly on the rule configuration. For example, a rule with a regular expression filter condition can take many times more CPU cycles than a rule without one.
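If you want a rough feel for the difference, the following sketch times a plain substring check against a regular expression on the same messages. The pattern, the sample messages and the helper names are invented for this example and are unrelated to how the product evaluates filters internally. The measured ratio will vary with the pattern and the message content, so treat the result as an indication only.

```python
import re
import timeit

# Hypothetical filter pattern, chosen only for illustration.
PATTERN = re.compile(r"user=(\w+)\s+.*action=(login|logout)")


def substring_filter(message):
    # Cheap filter: a single substring search.
    return "action=login" in message


def regex_filter(message):
    # More expensive filter: the regex engine may backtrack on each message.
    return PATTERN.search(message) is not None


def seconds_per_message(filter_fn, messages, repeat=200):
    # Average wall-clock time spent evaluating one message with filter_fn.
    total = timeit.timeit(lambda: [filter_fn(m) for m in messages], number=repeat)
    return total / (repeat * len(messages))


if __name__ == "__main__":
    sample = ["user=bob src=1.2.3.4 action=login", "kernel: link up"]
    print("substring:", seconds_per_message(substring_filter, sample))
    print("regex:    ", seconds_per_message(regex_filter, sample))
```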