Writing Structured Logs in Rails

Posted

If you are looking for more robust logging than what Rails gives you out of the box, the documentation for Lograge is a good place to start, but if you are looking to log anything outside the scope of an HTTP request, you need to consider options outside of lograge.

Let’s explore how to do that.

The Problem

Rails tutorials assume human-readable logs are enough for your Rails application. But if you are here because you googled ‘structured logging in Rails’, you most likely have the following needs:

  1. You need structured logs, output as JSON.
  2. You need request_id to be present in your request logs.
  3. You want request_id to be present in all your logs.

Other blog posts cover #1-#2, but #3 is often left an exercise for the reader. And while you may be tempted to try a more general logging framework (with its own dependencies), you only need a bit of custom code to meet your objectives and let built-in Rails / Ruby logging frameworks do the rest.

But first let’s talk about ‘Lograge’

Lograge concerns only one aspect of logging: request logging. This will surprise Rails developers who eagerly search for a drop-in replacement for all their logging needs. No, seriously, read that again: “Lograge concerns only one aspect of logging: request logging.”

Request logs tell you how much you need to know about time spent in the database, rendering views, and any details in the context of a controller action. Lograge is opinionated in that it omits certain details, but it is not a general-purpose logging framework.

Adding a complex feature to an otherwise self-contained application can open the door to logging requirements that extend outside the scope of a controller. That’s the point where something like lograge won’t help.

What does Lograge even do?

The diagram below is a simplistic view of a web application. Lograge records details of the inbound HTTP request (A), the outbound HTTP response (B), and some timing details, after which it generates a log entry.

Looking at lograge.rb, you can see a bit about what’s going on under the covers. When enabled, lograge ties into ActiveSupport::LogSubscriber (i.e., an existing Rails API) to subscribe explicitly to ActionController events. While there’s more to the internals, from a high level, that’s how it works.

# from "lib/lograge.rb"
def attach_to_action_controller
  Lograge::LogSubscribers::ActionController.attach_to :action_controller
end

This should clarify that lograge is not a general-purpose logging framework. It relies on Rails’ existing functionality to be notified of ActionController events and only reacts when events fire within your controller’s scope.

The ActiveSupport docs provide more detail on what the underlying ‘LogSubscriber’ does and why it exists.

ActiveSupport::LogSubscriber is an object set to consume ActiveSupport::Notifications with the sole purpose of logging them. The log subscriber dispatches notifications to a registered object based on its given namespace.

While lograge can output useful information in a structured format and provide a way to easily append a ‘request_id’ to its output, we need another solution to log any other event outside of the controller’s scope.

Let’s check off where we are so far in our original list:

  • ⚠️ You need structured logs, output as JSON.
    • use Lograge::Formatters::Json for lograge only.
  • ✅ You need request_id to be present in your request logs.
    • use lograge’s callback to insert request_id as ‘custom_options’.
  • ❌ You want request_id to be present in all your logs.

Concerning the first item (JSON logging), because Lograge does not concern itself with anything other than request logging, it will not output JSON for your other logs, so we still need to solve that. Concerning request_id, here is a brief overview of why you may need to solve that problem.

The External API use case

To complicate our lives a bit more, below is a more realistic web application with an external dependency. Let’s say we want to record details about our external API calls where we have an outbound HTTP request (C) and corresponding inbound HTTP response (D).

Since we’ve established that lograge only cares about (A) and (B), we are left to our own devices to log (C) and (D). That includes formatting our logs as JSON, generating a structured log format, and including request-scope details like request_id to correlate everything that happened between the inbound request (A) and the outbound response (B).

That sounds like a lot, but let’s break down the actual code needed. It’s not very much, but it is important to understand how it works.

Just show me the code!

Here is an annotated view of what your config/environments/production.rb might look like. I’ve removed anything not related to this problem for the sake of brevity. I will talk about each step in detail below by the number in the comments.

# 1. Enable lograge and format its logs as JSON
config.lograge.enabled = true
config.lograge.formatter = Lograge::Formatters::Json.new

# 2. Add custom fields to lograge output
# - request_id
# - filtered params
# - exception / stacktrace
config.lograge.custom_options = lambda do |event|
  request = event.payload[:request]
  stacktrace = Rails.backtrace_cleaner.clean(event.payload[:exception_object].backtrace) if event.payload[:exception_object]

  {
    params: request.filtered_parameters.except(:controller, :action),
    request_id: request.request_id,
    exception: event.payload[:exception],
    stacktrace: stacktrace
  }
end

# 3. Lograge will use a plain logger
config.lograge.logger = ActiveSupport::Logger.new $stdout

# 4. Use a custom logger for structured (non-lograge) logs
config.logger = CustomLogger.new($stdout)

# 5. Format our structured logs as JSON
config.logger.formatter = CustomLogger.formatter

Let’s take each step apart below:

1. Enabling lograge + JSON formatting

We enable lograge to take over request logging, and set it up with a JSON formatter. This does not provide JSON formatted output for all your logs. It only affects the request logging lograge will be taking over from here on out:

config.lograge.enabled = true
config.lograge.formatter = Lograge::Formatters::Json.new

2. Logging request_id in (A) and (B)

I added a few more features to this block:

  1. Logging request parameters (minus the controller/action as these are already evident in the log output) You may want to log request params differently than I have done here or leave this out entirely.
  2. Logging any exception, and (if present) the exception’s stack trace.
  3. Logging request_id.
config.lograge.custom_options = lambda do |event|
  request = event.payload[:request]
  stacktrace = Rails.backtrace_cleaner.clean(event.payload[:exception_object].backtrace) if event.payload[:exception_object]

  {
    params: request.filtered_parameters.except(:controller, :action),
    request_id: request.request_id,
    exception: event.payload[:exception],
    stacktrace: stacktrace
  }
end

As a side-note, request_id is provided by the ActionDispatch middleware. It is enabled by default in Rails (output below is from a vanilla Rails 7 project):

> rails middleware | grep -i ActionDispatch

use ActionDispatch::HostAuthorization
use ActionDispatch::Static
use ActionDispatch::Executor
use ActionDispatch::ServerTiming
use ActionDispatch::RequestId         <----------
use ActionDispatch::RemoteIp
use ActionDispatch::ShowExceptions
use ActionDispatch::DebugExceptions
use ActionDispatch::ActionableExceptions
use ActionDispatch::Reloader
use ActionDispatch::Callbacks
use ActionDispatch::Cookies
...

3-4. Using a separate logger for lograge

This is important and, I suspect, where things can confuse many developers! We have established that lograge only handles (A) and (B). It also generates its own JSON output. Therefore we will configure a “plain” logger for lograge. If you were only using request logging, this step would not be necessary, as lograge would re-use the default logger. But since we are about to use a completely different logger for our structured logs, we need to be explicit here.

config.lograge.logger = ActiveSupport::Logger.new $stdout
config.logger = CustomLogger.new($stdout)

Note above that one logger is defined in the ’lograge’ namespace, and the other is defined as the default logger for our application.

Now that we’ve defined loggers, we can focus on the formatter for our custom logger.

5. Format custom log events as JSON

Without the line below, our (C) and (D) logs would be output as default plain text. With this line, we will define a formatter that outputs JSON:

config.logger.formatter = CustomLogger.formatter

Our application logs can not re-use lograge’s JSON formatter, as their formatter does not adhere to the interface our formatter needs.

That’s it for configuration. Now we will dig into what our ‘custom logger’ and ‘custom formatter’ look like.

Writing a custom logger & formatter

require "json"

class CustomLogger < ActiveSupport::Logger
  def self.formatter
    proc do |severity, time, progname, msg|
      metadata = {severity: severity, time: time, app: progname}
      msg = {message: msg} if msg.is_a?(String)
      "#{metadata.merge(msg).compact.to_json}\n"
    end
  end

  def debug(*msg, &block)
    value = as_hash(msg[0], msg[1], &block)
    super(value, &nil)
  end

  def info(*msg, &block)
    value = as_hash(msg[0], msg[1], &block)
    super(value, &nil)
  end

  def warn(*msg, &block)
    value = as_hash(msg[0], msg[1], &block)
    super(value, &nil)
  end

  def error(*msg, &block)
    value = as_hash(msg[0], msg[1], &block)
    super(value, &nil)
  end

  private

  def as_hash(msg, attribs = {})
    msg = yield if block_given?

    raise ArgumentError.new("message must be a string") unless msg.is_a?(String)
    {message: msg}.merge(attribs || {}).merge(request_id: Current.request_id)
  end
end

I have the formatter defined in this class for illustrative purposes only. It’s a good idea to create a separate class for it. Similarly, where you put this code is up to you, so long as it can be referenced by the configuration above.

Let’s talk about the above code in detail.

Logging Subclass

Our custom logger extends ActiveSupport::Logger which is the base class for all loggers.

class CustomLogger < ActiveSupport::Logger
  # ...
end

The base class in Rails (which itself extends Ruby’s default logger class) is what defines the #debug, #info, #warn methods use within our code. We want all those logging statements to continue working, but to support structured logging, we extend them here.

JSON formatting

The is the place where we will be given ‘severity’ (e.g. ‘INFO’, ‘WARN’, etc…) so let’s look at Ruby’s default Logger::Formatter for how it gets output:

# lib/logger/formatter.rb
def call(severity, time, progname, msg)
   sprintf(Format, severity, format_datetime(time), Process.pid, severity, progname, msg2str(msg))
end

Our formatter will adhere to the same interface, but we will insert severity and timestamp to the hash before serializing as JSON. Our custom formatter expects a Hash with key-value pairs to appear in our log output, but we also play nice and accept a String just in case:

def self.formatter
  proc do |severity, time, progname, msg|
    metadata = {severity: severity, time: time}
    msg = {message: msg} if msg.is_a?(String)
    "#{metadata.merge(msg).compact.to_json}\n"
  end
end

Structured Logging

Okay, now the part that allows us to model our structured logs as an arbitrary hash of key-value pairs. Like the logging methods in Ruby’s base class, these methods also take an optional block to be executed in place of a provided string (I don’t believe many developers know this behavior exists, but it is helpful in a few cases).

def debug(*msg, &block)
  value = as_hash(msg[0], msg[1], &block)
  super(value, &nil)
end

def info(*msg, &block)
  value = as_hash(msg[0], msg[1], &block)
  super(value, &nil)
end

def warn(*msg, &block)
  value = as_hash(msg[0], msg[1], &block)
  super(value, &nil)
end

def error(*msg, &block)
  value = as_hash(msg[0], msg[1], &block)
  super(value, &nil)
end

private

def as_hash(msg, attribs = {})
  msg = yield if block_given?

  raise ArgumentError.new("message must be a string") unless msg.is_a?(String)
  {message: msg}.merge(attribs || {}).merge(request_id: Current.request_id)
end

Logging request_id

Buried in the code snippet above is this line:

def as_hash
  # ...
  {message: msg}.merge(attribs || {}).merge(request_id: Current.request_id)
end

What is Current.request_id? Returning to our initial lograge configuration, lograge does not provide access to ActionDispatch’s request_id– we need to insert that ourselves. We did so in our logging configuration for lograge, and we will need to do it again for our application logs.

As of Rails 5, ActiveSupport provides ActiveSupport::CurrentAttributes which gives us a thread-safe way to store attributes such as request_id.

Most Rails apps have some form of a Current object used to store data (very few items) in the request scope. For this example, our ‘Current’ class will look like this:

# models/current.rb
class Current < ActiveSupport::CurrentAttributes
  attribute :request_id
end

# controllers/application_controller.rb
class ApplicationController < ActionController::Base
  before_action do
    Current.request_id = request.request_id
  end
end

That’s it! Now we have successfully generated a logging configuration that uses lograge for request logging, a custom logger for other structured logging needs, and added “request_id” to both forms of log output.

Completing the picture

Extending the logging methods allows us to structure our logs such that instead of logging an event like this:

# before
logger.info("Calling API for user: #{user.name} role: #{user.role} attempt: #{number_of_attempts}")

# log output
# Calling API for user: Bob-Smith role: admin attempt: 3

We can now create log events with an extra argument of key/value pairs:

# after
logger.info("Calling API", user: user.name, role: user.role, attempt: number_of_attempts)

The corresponding log output is below. We are free to use our log aggregator to correlate both of these logs by request_id, as well as query our custom logs for, say, all requests that resulted in an API call where users were of role ‘admin.

// Log output generated by our custom logger
{
    "severity": "INFO",
    "time": "2023-01-21T18:15:14.848-06:00",
    "message": "Calling API",
    "user": "Bob-Smith",
    "role": "admin",
    "attempt": 3,
    "request_id": "d7fe779c-5afd-4c36-80ee-91d7746adaa7"
}

// Log output generated by lograge
{
    "method": "GET",
    "path": "/items",
    "format": "html",
    "controller": "ItemsController",
    "action": "index",
    "status": 200,
    "duration": 38.89,
    "view": 32.46,
    "db": 0.76,
    "params": {},
    "request_id": "d7fe779c-5afd-4c36-80ee-91d7746adaa7"
}

Summary

You may reach the end of this and think that finding a general-purpose logging framework is a better tradeoff. The time and complexity involved in integrating and benefiting from such frameworks are worse than writing this yourself.

Generating reliable logs may not be the most essential source of truth for your application (gathering insight into your app via something like OpenTelemtry can be more valuable in the long term). Still, logging is a fundamental part of any application, and the more “Rails-like” your approach, the better off you and your team will be.

While the approach above does not provide some lesser-used features (e.g., hierarchical loggers, custom include/exclude logic), if those are important enough for you, the above code should make it obvious where such extensions should go. Write it!

Finally, I hope the above clarifies what lograge does and does not do. Over the past twenty years, I have found teams needing help correlating request logs with other logs (or have one developer who is excellent at doing so, while other developers can’t make heads or tails of them).

Having a solid approach to structuring logging can help in all these cases and lower the bar for your team members to log more things. It’s worth the hour or two to get them.