What’s better, few or many input ports in Logstash/Graylog?

This question often crosses our minds when configuring log collection inputs in Logstash or Graylog. The arguments are countless, and every network has its own case to make based on its specific configuration. I won’t debate them all; instead, I’ll list the key logical factors to help us make the decision.

Log enrichment is a given

Unless we plan to simply dump the logs to satisfy a centralized logging compliance requirement, we will be working with each unique log type to filter, transform, and add new data to make it useful.

We need to identify and pick out each unique log type before we can apply the enrichment procedures.

Parsing is costly

If we read the log message strings to identify unique log types, we waste precious CPU cycles that could be put to better use. Syslog is the most widely encountered log format. It must be parsed to extract individual units of information such as severity, timestamp, facility, host, and the actual log message from the string.
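To make the parsing work concrete, here is a minimal sketch in Python of what extracting those fields from a single syslog line involves. It assumes the classic BSD-style layout (`<PRI>timestamp host message`), where the priority value encodes facility and severity as `facility * 8 + severity`. The function name `parse_syslog` and the sample line are illustrative only; this is not how Logstash or Graylog implement their syslog inputs.

```python
import re

# Assumed layout (BSD/RFC 3164 style): "<PRI>Mmm dd hh:mm:ss host message"
SYSLOG_RE = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<timestamp>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<message>.*)$"
)


def parse_syslog(line):
    """Split one syslog line into its fields, or return None if it does not match."""
    match = SYSLOG_RE.match(line)
    if match is None:
        return None
    pri = int(match.group("pri"))
    return {
        "facility": pri // 8,   # upper part of the priority value
        "severity": pri % 8,    # lower three bits of the priority value
        "timestamp": match.group("timestamp"),
        "host": match.group("host"),
        "message": match.group("message"),
    }


# Illustrative sample line, not taken from a real device.
sample = "<34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick on /dev/pts/8"
fields = parse_syslog(sample)
print(fields)
```

Every message pays for this regular-expression match and field extraction before any enrichment can begin, which is why the cost grows with volume.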

If we receive multiple types of logs from multiple kinds of devices on a single port, we need to parse out each unique log type for further processing. Processing resources will take a significant hit as we scale.
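The identification step on a shared port can be sketched roughly as follows, again in Python. The patterns in `TYPE_PATTERNS` and the function `identify_log_type` are hypothetical and exist only to show the shape of the problem: each incoming message is tested against one pattern after another until something matches, so the work per message grows with the number of log types sharing the port.

```python
import re

# Hypothetical signatures for illustration; real devices need their own patterns.
TYPE_PATTERNS = [
    ("firewall", re.compile(r"\bSRC=\S+ DST=\S+")),
    ("sshd", re.compile(r"\bsshd\[\d+\]:")),
    ("dhcp", re.compile(r"\bDHCP(ACK|REQUEST|DISCOVER|OFFER)\b")),
]


def identify_log_type(message):
    """Return (log_type, attempts), where attempts counts the patterns tried."""
    attempts = 0
    for log_type, pattern in TYPE_PATTERNS:
        attempts += 1
        if pattern.search(message):
            return log_type, attempts
    # Unrecognized messages are the worst case: every pattern was tried.
    return "unknown", attempts


print(identify_log_type("gw kernel: IN=eth0 SRC=10.0.0.5 DST=10.0.0.1"))
print(identify_log_type("srv dhcpd: DHCPACK on 10.0.0.7"))
print(identify_log_type("some unrecognized message"))
```

In this sketch a message of the last listed type, or of no known type, is tested against every pattern before it is classified, and that overhead is repeated for every message received.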

Avoid parsing for log identification

We can increase our processing efficiency by skipping the need to parse logs just to identify their type. This can be done in two ways:
