Lambda Log Parsing Series:
- AWS Loadbalancer Logs and Lambda – Part 1 – Background
- AWS Loadbalancer Logs and Lambda – Part 2 – AWS Setup
- AWS Loadbalancer Logs and Lambda – Part 3 – .NET Core Parsing
If you’re hosting in AWS, you’re likely using an Elastic Load Balancer (Application or Classic), and if you’re running a decent-sized business, you’ll want to look at its access logs and visualise them somehow to gain insights and inform your scaling decisions. The de facto choice right now is Elasticsearch and Kibana.
AWS have made it quite easy to set up log shipping to S3, and also to set up a usable Elasticsearch cluster with the Kibana plugin installed. What they haven’t made easy (yet) is parsing those logs and pushing them into Elasticsearch.
For most, the answer is Elastic.co’s “Logstash”. This has plugins that will allow you to query an S3 bucket (on a schedule) for new files, parse them and push them up to Elasticsearch. You could throw this onto a t2.nano, scale it up during high demand, or even try to run a Logstash cluster. However, this can cost a lot of money if not done properly, and it is costly in maintenance time. It’s also prone to leaving you waiting for long periods while your stats parsing/pushing catches up. This normally happens during heavy load, which for some is exactly when the stats are most useful.
Enter Lambda. (fanfare ensues)
Lambda is the (the original? /me runs and hides) serverless environment, hosted in AWS. Sometimes dubbed “Functions as a Service”.
What’s serverless, you might ask? Well, I won’t cover it fully, but here’s a little snippet of how I think it is best described.
Serverless is the concept of managing zero underlying hardware or software infrastructure, living completely in the code you create. The function you want to use is ALWAYS available, and scales as far as your money can stretch.
If you’re a fan of Docker, and the scale that you can achieve with that, this is similar (if not more or less exactly the same), but you manage EVEN LESS!
Why is this good for log parsing? In the same way that you can’t predict the load on your site, by proxy (pardon the pun) you can’t predict the size and number of the log files it produces. So, if you want to get your log files visualised as soon as possible, you need dynamic, limitless scale to match the dynamic, limitless scale of your web environment.
Further to the above, Lambda functions integrate seamlessly with triggers in S3 (see where I’m going with this?).
Why can’t you just use Logstash in Lambda?
Well, Logstash is a Ruby/Java (JRuby) project built to run as a long-lived process, and as of right now, that isn’t something you can run in Lambda. This means that you need to resort to an autoscaled Logstash cluster (either EC2 or possibly a container), and who wants to manage servers in 2017?
Enter .NET Core (further fanfare ensues)
As of December 2016, there is first-class support for .NET Core in Lambda. This means that you can write the handlers for triggers in .NET Core. (Side note: you can also deploy ASP.NET Core Web API endpoints…)
Getting it working is tricky (and more daunting if you haven’t set it up before), but I hope to dispel some of that complexity with these posts.