Application Logging & Alerting With Graylog (1 of 2): Introduction

Lydtech

Introduction

A reliable log management tool that supports querying, filtering, alerting on, and visualising logs is an essential part of a system deployment. Graylog is a popular open-source log management system, widely used in the industry, that fulfils these requirements. This two-part series introduces Graylog and its architecture, and demonstrates running and configuring Graylog to ingest logs from a Spring Boot application.

The accompanying Spring Boot application demonstrating logging with Graylog is linked in the Source Code section at the end of this article.

Graylog Architecture

Graylog provides a distributed, fault-tolerant log management system with the functionality to search, query and correlate logs, and more. Custom dashboards can be designed to visualise log data, and alerts can be configured to trigger notifications to users through channels such as Slack, email, and PagerDuty. It is open-source, lightweight, and performant. Typically Graylog is deployed behind a load balancer for log input to provide scalability, fault tolerance and high availability. Graylog requires two dependencies in order to operate: a search backend (Elasticsearch or Opensearch) and MongoDB.

Figure 1: Graylog architecture

Elasticsearch/Opensearch

Graylog uses Elasticsearch or Opensearch to store the large volume of log data that it ingests, providing robust indexing and searching abilities. Both are horizontally scalable, meaning that as the volume of logs grows, the indexing and searching workloads can be distributed across the nodes in the cluster. They also enable analytics, data analysis and report generation to be run against the log data. A comparison of the two is provided in a later section.

The Data Node component was introduced in version 5.2 of Graylog. This is a management component used to configure and optimise Opensearch for use with Graylog. It ensures the correct version of Opensearch and its extensions are installed, and provides other benefits such as implementing security certificates and managing cluster membership. Graylog Data Node is deployed in the Spring Boot demo, covered in the second article in the series.

MongoDB

Graylog also requires MongoDB, which it uses to store metadata such as users, roles, dashboards and alerts. MongoDB also holds configuration information including data source inputs, outputs for processed data, and processing rules. Like the search backend, MongoDB is horizontally scalable and provides high availability, so it gives Graylog resilient storage.

Log Delivery Protocol

Logs can be delivered to Graylog via one of two protocols, UDP or TCP. The trade-offs between the two must be considered in order to select the appropriate protocol for the system.

UDP

UDP is a connectionless protocol that prioritises speed of delivery and low overhead over reliability. With UDP there is no guarantee that logs will be delivered to Graylog, or that they will arrive in order. Graylog does not acknowledge receipt of logs from the sending application, so sending logs has no impact on the performance of that application. Log messages sent to Graylog over UDP cannot be encrypted, and each message is limited to 8192 bytes.
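The fire-and-forget behaviour described above can be illustrated with a minimal Java sketch. It assumes a Graylog Raw/Plaintext UDP input is listening; the class name `UdpLogSender`, its methods, and any host and port passed in are hypothetical and not taken from the accompanying application. The size check simply applies the 8192 byte limit quoted above.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
class UdpLogSender {

    // Message size limit quoted in this article for UDP delivery.
    static final int MAX_UDP_MESSAGE_BYTES = 8192;

    // Encodes the message as UTF-8 and rejects anything over the size limit.
    static byte[] toPayload(String message) {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        if (payload.length > MAX_UDP_MESSAGE_BYTES) {
            throw new IllegalArgumentException(
                "Message is " + payload.length + " bytes; limit is " + MAX_UDP_MESSAGE_BYTES);
        }
        return payload;
    }

    // Sends a single datagram. No connection is opened and no acknowledgement
    // is awaited, so the caller cannot tell whether the message arrived.
    static void send(String host, int port, String message) throws IOException {
        byte[] payload = toPayload(message);
        try (DatagramSocket socket = new DatagramSocket()) {
            DatagramPacket packet =
                new DatagramPacket(payload, payload.length, InetAddress.getByName(host), port);
            socket.send(packet);
        }
    }
}
```

A call such as `UdpLogSender.send("localhost", 5555, "demo-service started")` returns as soon as the datagram has been handed to the operating system, whether or not anything is listening on that (assumed) port.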

TCP

TCP is a connection-oriented, two-way protocol. Graylog acknowledges receipt of logs from the sending application. If Graylog is down, the application is able to resend the log message until it succeeds, and hence message delivery is guaranteed. This does however mean that performance is slower than with UDP, as the application has to wait for confirmation of delivery (or a timeout) for each message sent to Graylog. Unlike UDP, logs sent to Graylog over TCP can be encrypted via TLS. There is no message size limitation.
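The resend-until-success behaviour, and its cost to the caller, can be sketched in Java as follows. This is an illustration under stated assumptions rather than the approach used by any particular logging library: it assumes a GELF TCP input, where each message is terminated by a null byte, and the class name `TcpLogSender`, its methods and its parameters are hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
class TcpLogSender {

    // Appends the null-byte delimiter that marks the end of a message
    // (assumption: the target is a GELF TCP input).
    static byte[] frame(String message) {
        byte[] body = message.getBytes(StandardCharsets.UTF_8);
        byte[] framed = new byte[body.length + 1];
        System.arraycopy(body, 0, framed, 0, body.length);
        framed[body.length] = 0;
        return framed;
    }

    // Attempts delivery up to maxAttempts times. Returns true once a write
    // succeeds, or false if every attempt failed or timed out.
    static boolean sendWithRetry(String host, int port, String message,
                                 int maxAttempts, int timeoutMillis) {
        byte[] framed = frame(message);
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try (Socket socket = new Socket()) {
                socket.connect(new InetSocketAddress(host, port), timeoutMillis);
                OutputStream out = socket.getOutputStream();
                out.write(framed);
                out.flush();
                return true;
            } catch (IOException e) {
                // The calling thread is blocked while each attempt fails or
                // times out; this is the application impact of TCP delivery.
            }
        }
        return false;
    }
}
```

Opening a new connection per message keeps the sketch short; a real appender would normally reuse the connection and send from a background thread so that a slow or unavailable Graylog does not stall request handling.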

Protocol Comparison

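|                     | UDP        | TCP                             |
|---------------------|------------|---------------------------------|
| Guaranteed Delivery | No         | Yes                             |
| Guaranteed Order    | No         | Yes                             |
| Application Impact  | None       | Yes: if Graylog is slow or down |
| Supports Encryption | No         | Yes                             |
| Message Size Limit  | 8192 bytes | None                            |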

Log Format

There are three principal formats for logs ingested by Graylog: RAW/Plaintext, Syslog, and Graylog Extended Log Format (GELF).

  • RAW/Plaintext is the most straightforward to use, and particularly well suited to testing.
  • Syslog is a common format widely used by Unix-like operating systems, network devices and applications to send logs of varying types including system events.
  • GELF is a powerful format, based on JSON, that sends log messages in a structured way, optimised for retrieving the required data when ingested by Graylog.

Typically a library is used by the application to handle creating the log messages in the required format, and sending them to the Graylog application.
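To give a feel for what such a library produces, the following Java sketch builds a GELF payload by hand. It assumes GELF version 1.1, in which `version`, `host` and `short_message` are the core fields and custom fields are prefixed with an underscore. The class `GelfMessage` and its methods are hypothetical, and additional field values are always written as strings to keep the example small.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration only.
class GelfMessage {

    // Escapes the characters that would otherwise break a JSON string literal.
    static String escape(String value) {
        StringBuilder sb = new StringBuilder();
        for (char c : value.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // Builds a GELF 1.1 JSON document. The level is a syslog severity
    // (for example 6 for informational) and the timestamp is in seconds
    // since the epoch, with millisecond precision.
    static String build(String host, String shortMessage, int level,
                        double timestampSeconds, Map<String, String> additionalFields) {
        StringBuilder json = new StringBuilder("{");
        json.append("\"version\":\"1.1\"");
        json.append(",\"host\":\"").append(escape(host)).append('"');
        json.append(",\"short_message\":\"").append(escape(shortMessage)).append('"');
        json.append(",\"timestamp\":")
            .append(String.format(Locale.ROOT, "%.3f", timestampSeconds));
        json.append(",\"level\":").append(level);
        // Additional (custom) fields carry an underscore prefix.
        for (Map.Entry<String, String> field : additionalFields.entrySet()) {
            json.append(",\"_").append(escape(field.getKey())).append("\":\"")
                .append(escape(field.getValue())).append('"');
        }
        return json.append('}').toString();
    }
}
```

Because each value arrives as a named field, Graylog can index fields such as the hypothetical `_service` individually, which is what makes structured searches over ingested logs practical. In a real application the GELF library would assemble this payload from the logging event.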

Elasticsearch vs Opensearch

The two search and analytics stores supported by Graylog, Elasticsearch and Opensearch, have a number of similarities and differences. These will inform the best choice in any deployment that uses Graylog as the logging system.

Elasticsearch

Elasticsearch was first released as an open-source project in 2010. Its feature development, licensing, commercial offerings and governance have primarily been driven by Elastic NV. As such it includes commercial plugins that provide further features and functionality, and Elastic NV offers commercial support and enterprise features including security, monitoring and alerting. It has a large and active community of users behind it.

Opensearch

Opensearch is a community-driven fork of Elasticsearch created in 2021. It is managed by the Opensearch community, which includes contributions from various organisations. It was forked due to concerns over the licensing changes introduced by Elastic NV. It aims to provide a transparent and open development process, and open governance. It offers search and analytics capabilities similar to those of Elasticsearch, and commercial support is available from third-party vendors.

Differentiators

The key differentiators between the two offerings therefore come down to their communities, governance models, licensing models and commercial support.

Summary

Graylog offers a powerful logging system that is capable of ingesting messages in different formats, via different delivery protocols. It enables users to visualise, search and analyse ingested log messages. It is open-source, resilient and scalable, making it an excellent choice as the logging component in a system deployment. The second part covers the accompanying Spring Boot application, and demonstrates ingesting the application logs into Graylog and sending alert notifications.

Source Code

The source code for the accompanying application demonstrating sending logs to Graylog is available here:

https://github.com/lydtechconsulting/springboot-graylog/tree/v1.0.0

