Ehren Graber's Blog

Improve Application Reliability with Microservices and Serverless Platforms

December 01, 2018

Microservices and serverless platforms

With the drive towards microservices, serverless platforms, and infrastructure as a service (IaaS), operations and infrastructure engineers are being replaced by product engineers. This started with the move to continuous delivery, where product engineers deploy the application themselves. That in turn put the responsibility for testing on the product engineer, where before it would have been the realm of a QA engineer.

This has improved the speed of software delivery, and for good reason. Previously, infrastructure engineers spent significant time building infrastructure for networking, databases, and servers. Now serverless applications and similar paradigms have removed the need for monolithic architecture. But if infrastructure support is not well organized, you risk tech debt building up. Tech debt slows development and increases the risk of errors in the application.

Steps to building a reliable product engineering team

Creating a reliable product engineering team requires:

  • A thorough onboarding process for all team members. This should review existing processes and explain how to document new ones.
  • Detailed documentation of the application testing and release process for product engineers.
  • An incident response system that shows the priority, progress, and point person for each issue.
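As a rough illustration of the last item, an incident record only needs a few fields to show priority, progress, and point person. The sketch below is a minimal example in Python, not a recommendation of any particular tool; the names `Priority`, `Incident`, and `triage`, the progress values, and the sample incidents are all hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum


class Priority(IntEnum):
    # Lower number means more urgent (an assumed convention).
    CRITICAL = 1
    HIGH = 2
    LOW = 3


@dataclass
class Incident:
    title: str
    priority: Priority
    point_person: str
    progress: str = "open"  # assumed states: "open", "investigating", "resolved"


def triage(incidents):
    """Return unresolved incidents, most urgent first."""
    active = [i for i in incidents if i.progress != "resolved"]
    return sorted(active, key=lambda i: i.priority)


# Made-up incidents, for illustration only.
incidents = [
    Incident("Slow search results", Priority.LOW, "alex"),
    Incident("Checkout fails for some users", Priority.CRITICAL, "sam", "investigating"),
    Incident("Broken link in footer", Priority.LOW, "alex", "resolved"),
]

for incident in triage(incidents):
    print(incident.priority.name, incident.progress, incident.point_person, incident.title)
```

Even a structure this small answers the three questions that matter during an incident: how urgent it is, where it stands, and who owns it.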

We will never be able to replace operations engineering completely. Operations engineers must put in place the tools for releasing and measuring your application. But first you need to understand what your application's unique KPIs are. You should measure more than uptime. Monitoring your critical actions allows you to act if anything goes wrong. A partial breakdown in a critical area of the application is often far more costly than a complete breakdown. This is because a partial breakdown is often harder to detect and thus takes longer to fix.
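To make the difference between uptime and critical-action monitoring concrete, here is a minimal sketch in Python. It tracks the success rate of one critical action over a sliding window of recent attempts and flags it as degraded when the rate drops below a threshold. The class name `CriticalActionMonitor`, the window size, the threshold, and the "checkout" action are all assumptions made for illustration; a real system would normally rely on its monitoring platform rather than hand-rolled code.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class CriticalActionMonitor:
    """Tracks recent outcomes of one critical action (hypothetical helper)."""

    name: str
    window: int = 100               # how many recent attempts to keep (assumed)
    min_success_rate: float = 0.95  # below this the action counts as degraded (assumed)
    outcomes: deque = field(init=False)

    def __post_init__(self):
        # A bounded deque drops the oldest outcome once the window is full.
        self.outcomes = deque(maxlen=self.window)

    def record(self, succeeded: bool) -> None:
        self.outcomes.append(succeeded)

    def success_rate(self) -> float:
        if not self.outcomes:
            return 1.0  # no data yet, so do not raise an alert
        return sum(self.outcomes) / len(self.outcomes)

    def is_degraded(self) -> bool:
        return self.success_rate() < self.min_success_rate


# The service is "up" the whole time, yet 3 of the last 10 checkouts failed.
checkout = CriticalActionMonitor("checkout", window=10, min_success_rate=0.9)
for ok in [True] * 7 + [False] * 3:
    checkout.record(ok)

print(checkout.name, checkout.success_rate(), checkout.is_degraded())
```

An uptime check would report this service as healthy, while the success rate of the critical action shows that nearly a third of recent attempts failed. That is the kind of partial breakdown that is easy to miss.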

We must recognize the importance of reliability, operational agility, education, documentation, and support. Even with a full IaaS platform, we still need QA, operations, monitoring, and a well-documented release management process.

Written by Ehren Graber who lives and works in Vancouver, Canada building useful things.