Viewing logs is useful but oftentimes it's a waterfall of text pouring on your screen. You can't make heads or tails of it. Or you have to save it to a file and manually pick through every line.
You can save tremendous time and frustration if you're able to filter and analyze the log lines that matter.
What's more, you may not know what the problem is. The customer is complaining something isn't working and you have to find out why.
Unfortunately, most stock tools like Kubernetes don't give you these filtering mechanisms out of the box. The `grep` tool works in a pinch but often leaves much to be desired.
You don't need an ELK stack, or a managed service like SumoLogic or Splunk (now Cisco 😬), either. (For startups, those services are expensive to set up and use.)
Easy debugging is often an afterthought
Default toolchains are focused on improving their own project; DX (developer experience) comes second.
Many projects don't adopt good logging practices, so they log too much.
Inexperienced developers don't know any better.
I've found one tool that takes away all the pain, and helps you zero in on logs that mean something.
Here's how, step by step:
Step 1: Install Angle Grinder
Angle Grinder is an open source command line tool written in Rust. It parses, filters, and even aggregates logs - fast!
To install:
macOS
brew install angle-grinder
Linux
curl -L https://github.com/rcoh/angle-grinder/releases/download/v0.18.0/agrind-x86_64-unknown-linux-musl.tar.gz \
| tar Ozxf - \
| sudo tee /usr/local/bin/agrind > /dev/null && sudo chmod +x /usr/local/bin/agrind
agrind --self-update
See the official installation instructions for more options.
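A quick sanity check that the binary landed on your PATH (assuming agrind's standard --version flag):
# Should print the installed version
agrind --version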
Step 2: Start filtering
After installation, the binary is agrind.
In Kubernetes, I combine agrind with stern for a powerful combo: I pipe stern's output into agrind.
The `-oraw` (output raw) flag is important! Stern prepends extra fields labeling the pod and container. Those fields are useful when you're quickly getting the lay of the land, but agrind won't be able to parse them. The `-oraw` flag strips them and outputs the raw logs without any extra metadata.
Let's assume each log line is formatted as JSON:
kubectl stern my-pod -oraw | agrind '* | json'
Produces output such as:
[duration=1.366521ms][handler=/api/live/ws][level=info][logger=context][method=GET][msg=Request Completed][orgId=1][path=/api/live/ws][remote_addr=84.172.197.37][size=0][status=-1][t=2023-07-23T19:04:37.844030046Z][time_ms=1][uname=admin][userId=1]
As you can see, agrind parses each log line into key-value pairs surrounded by brackets.
Angle-grinder has its own pipe syntax, similar to jq. So once you parse, you can continue to pipe output to additional functions, transforming the data however you like as you go.
Let's break down the syntax real quick:
agrind '<filter> | <parser> | <function> | <function> | ...'
The wildcard matches all logs, filtering none out.
agrind '*'
Angle-grinder's filtering is really a "scope." It only includes lines matching the filter and drops everything that does not match.
# Show only logs with "error" somewhere in the log line
agrind 'error'
You need to parse each log line in order for agrind functions to work.
# Assumes each log line is valid json object
agrind '* | json'
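Putting the pieces together, here's a sketch of a full pipeline (the level field name is an assumption about your log format):
# Parse JSON, then count how many lines were logged at each level
agrind '* | json | count by level'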
Now you can start using agrind's functions to transform the data to suit your needs. Besides JSON, agrind also supports parsing logfmt, which looks like:
level=info msg="something happened"
Instead of:
{"level":"info","msg":"something happened"}
Oftentimes, the logs contain too much information, so I hide fields I don't care about.
Show me only the log level and message:
kubectl stern my-pods -oraw | agrind '* | logfmt | fields + level,msg'
# Outputs
[level=info] [msg=Request Completed]
[level=info] [msg=Completed cleanup jobs]
[level=info] [msg=Update check succeeded]
With fewer fields, agrind formats output like a table which makes it easier to read.
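The inverse works as well: fields - drops the listed fields and keeps everything else. A sketch using field names from the sample log line above:
# Drop the noisy fields, keep the rest
kubectl stern my-pods -oraw | agrind '* | logfmt | fields - t, remote_addr, uname'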
Step 3: Start aggregating
Back to the scenario where a customer is complaining about a bug and you're trying to find out what's wrong: I like to aggregate by error message.
Show all error logs and how often each error message occurs:
kubectl stern my-pods -oraw | agrind '* | logfmt | where level == "error" | count by level, msg'
# Prints output such as
level msg _count
------------------------------------------------------------
error Error writing to response 2
error Failed to get annotations 1
error Request Completed 1
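When there are many distinct messages, agrind's sort and limit operators surface the worst offenders first:
# Top 5 most frequent error messages
kubectl stern my-pods -oraw | agrind '* | logfmt | where level == "error" | count by msg | sort by _count desc | limit 5'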
If you're in an early-stage startup, you probably don't have metrics set up yet. In that case, agrind is your "poor man's" metrics!
Show me the average, p50, p99, and max durations for a request to be served:
kubectl stern my-pods -oraw | agrind '* | logfmt | where msg == "Request Completed" | avg(duration), p50(duration), p99(duration), max(duration)'
_average p50 p99 _max
--------------------------------------------------------
63.70 2.62 927.89 999.12
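Another handy aggregate is count_distinct. For example, using the userId field from the sample log line earlier:
# How many distinct users are generating requests
kubectl stern my-pods -oraw | agrind '* | logfmt | where msg == "Request Completed" | count_distinct(userId)'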
One last hint: you don't need to tail live logs.
# Save raw logs to a file
kubectl stern my-pods -oraw > file.log
# Analyze later with agrind:
cat file.log | agrind '* | logfmt'
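Every query shown above works the same way on a saved file, so you can re-run aggregations as you investigate:
# Re-run the latency percentiles over the saved logs
cat file.log | agrind '* | logfmt | where msg == "Request Completed" | p50(duration), p99(duration)'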
Conclusion
Angle-grinder needs to be part of your toolbox. The sooner you can figure out a problem, the sooner you can get back to building.
Angle-grinder provides a ton of examples and offers lots of advanced functionality.
If you liked this post, subscribe for future ones on how to thrive and survive in the tech startup world: