What’s better, few or many input ports in Logstash/Graylog?

This question often crosses our minds when configuring log collection inputs in Logstash or Graylog. The arguments are countless, and every network has its own case based on its specific configuration. I won’t debate the myriad arguments here; instead, I will list the key factors that can help us decide.

Log enrichment is inevitable

Unless we plan to simply dump the logs to satisfy a centralized logging compliance requirement, we will work with each unique log type to filter, transform, and add new data to make it useful.

We must identify and pick out each unique log type before we can apply the enrichment procedures.

Parsing is costly

If we are reading the strings of log messages to identify unique log types, we are wasting precious CPU cycles that could be put to better use. Syslog is one of the most widely encountered log formats. It needs to be parsed to extract individual units of information such as severity, timestamp, facility, host, and the actual log message from the string.

If we are receiving multiple types of logs from multiple kinds of devices on a single port, we need to parse out each unique log type before any further processing. The processing resources will take a significant hit as we scale.
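To make that cost concrete, here is a minimal sketch of what parsing a single classic BSD-style (RFC 3164) syslog line involves. The regular expression and the `parse_syslog` helper are illustrative assumptions, not what Logstash or Graylog run internally, and real-world lines vary more than this pattern allows.

```python
import re

# Assumed layout: "<PRI>Mmm dd hh:mm:ss host message" (classic BSD syslog).
SYSLOG_RE = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<timestamp>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<message>.*)$"
)

def parse_syslog(line):
    """Return the extracted fields as a dict, or None if the line does not match."""
    match = SYSLOG_RE.match(line)
    if match is None:
        return None
    pri = int(match.group("pri"))
    return {
        # The priority value encodes both numbers: PRI = facility * 8 + severity.
        "facility": pri // 8,
        "severity": pri % 8,
        "timestamp": match.group("timestamp"),
        "host": match.group("host"),
        "message": match.group("message"),
    }

sample = "<34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick on /dev/pts/8"
fields = parse_syslog(sample)
```

Every incoming message pays for this regular expression match, and identifying the log type would usually require further pattern matching on the `message` field on top of it.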

Avoid parsing for log identification

We can increase our processing efficiency by skipping the need to parse logs just to identify their type. This can be done in two ways:

1) Split log types over ports

If we know that we will only receive logs of type A on port 55200 and logs of type B on port 55300, we can skip the initial parsing and save a lot of CPU cycles.
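The idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ports come from the example above, while the `type_a` and `type_b` labels, the `tag_by_port` and `serve` helpers, and the choice of UDP are hypothetical. In Logstash or Graylog the same effect is achieved by configuring one input per port.

```python
import selectors
import socket

# Ports from the example above; the type labels are placeholders.
PORT_TO_LOG_TYPE = {55200: "type_a", 55300: "type_b"}

def tag_by_port(raw_message, local_port):
    """Label a message using only the port it arrived on; the text is never inspected."""
    return {
        "log_type": PORT_TO_LOG_TYPE.get(local_port, "unknown"),
        "message": raw_message,
    }

def serve(host="0.0.0.0"):
    """Listen on every configured UDP port and yield tagged events indefinitely."""
    selector = selectors.DefaultSelector()
    for port in PORT_TO_LOG_TYPE:
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.bind((host, port))
        selector.register(sock, selectors.EVENT_READ)
    while True:
        for key, _events in selector.select():
            data, _sender = key.fileobj.recvfrom(65535)
            local_port = key.fileobj.getsockname()[1]
            yield tag_by_port(data.decode("utf-8", errors="replace"), local_port)
```

The type is decided by a dictionary lookup on the receiving port, so the cost of identification does not grow with the length or complexity of the message.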

2) Prefer extraction over parsing

Most capable log shipping agents can add extra fields and send logs in a structured format. All we have to do is define a log type field for each unique log type before shipping. On the receiving side, we can quickly extract the log type and hand the event over for further processing.

Extracting a field value and matching it to determine the log type generally consumes fewer CPU cycles than parsing every string.
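As a rough sketch of the receiving side, assume the agent ships each event as a JSON object carrying a `log_type` field. The field name, the `firewall` and `web` types, and the handler functions below are all hypothetical and only serve to show the routing step.

```python
import json

def handle_firewall(event):
    # Placeholder for firewall-specific enrichment.
    event["pipeline"] = "firewall"
    return event

def handle_web(event):
    # Placeholder for web-server-specific enrichment.
    event["pipeline"] = "web"
    return event

def handle_unknown(event):
    # Events without a recognized log_type are kept but not enriched.
    event["pipeline"] = "unclassified"
    return event

# One lookup table replaces string inspection of the message itself.
HANDLERS = {"firewall": handle_firewall, "web": handle_web}

def route(payload):
    """Decode a structured event and dispatch it on its log_type field."""
    event = json.loads(payload)
    handler = HANDLERS.get(event.get("log_type"), handle_unknown)
    return handler(event)

routed = route('{"log_type": "firewall", "message": "DROP IN=eth0 SRC=10.0.0.5"}')
```

The message text is never examined to decide where the event goes; the type comes straight from the field the shipper attached, so all log types can share one port.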

Conclusion

If our logs are structured, we can happily receive all of them on a few input ports, or even a single port, and still process them efficiently.

If we are unable to change the shipping method, we should opt for multiple ports and split log types across individual ports.

