Tuesday, November 3, 2015

Systems Architecture for Dummies: Basics

Introduction

I have recently moved to an entirely new project. I know very little about it, but one of the junior devs asked me to explain the architecture, so instead of giving specifics I gave him generalized information about systems architecture. Now that I have to worry about systems architecture instead of just client-side architecture, I wanted to put down my thoughts.

What I told the junior dev was that the architecture would be composed of 3 types of components: Services, Web Servers, and Databases. Well, let's break that down a little bit further. A system will be composed of Processes. Now in all likelihood the important part of a Process will be running inside another process such as a web server or a database, but I wanted a nice abstraction.

Parts of a Component (Talk Generally Much?)

A Process will need a few things. Not all processes will have these things, but it is the framework I will use for my thinking.

A) Communication
B) Behavior
C) State/Config
D) Scheduler

Communication

Every part of the system will need to communicate with some other part or parts of the system. This one is always true. If it is not, then you don't really have a system, you have a simple application. How pieces communicate is often very different. Some may use REST or SOAP web services. Some may open sockets. Some may write to a file location and not really be aware of who picks up their message. Lots of communication happens by having one part of the system act as a proxy. For example, suppose you have RabbitMQ getting messages from a bunch of services. Well, all these services are communicating with RabbitMQ, but they are actually trying to communicate with someone else. Regardless, every piece of the system has a way to communicate. Often a piece of a system will communicate with many other pieces.

Talking / Listening

A Process can either send messages or consume them. This is pretty straightforward. A Process can use a different mechanism for each.

Directed / Broadcast

You can communicate directly with another Process or you can broadcast out a message which other Processes can pick up. This is a little tricky because there are many things that can act as a communication proxy like a queueing system or a database. In one way you send a directed message, but in another way you aren't really aware who is the final consumer of your message.
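To make the directed versus broadcast distinction concrete, here is a minimal sketch. Everything in it is an assumption for illustration: Broker is a hypothetical in-memory stand-in for a queueing system, not the API of RabbitMQ or any real product, and the inbox lists stand in for other Processes.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# A handler is anything that can accept a message (hypothetical type alias).
Handler = Callable[[str], None]


class Broker:
    """Hypothetical in-memory proxy; a real system would use a queue or database."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> int:
        # The sender addresses a topic and never learns who the consumers are.
        handlers = self._subscribers[topic]
        for handler in handlers:
            handler(message)
        return len(handlers)


def send_directed(receiver: Handler, message: str) -> None:
    # Directed: the sender holds a reference to one specific receiver.
    receiver(message)


# Two made-up consumers, represented by their inboxes.
indexer_inbox: List[str] = []
audit_inbox: List[str] = []

broker = Broker()
broker.subscribe("documents", indexer_inbox.append)
broker.subscribe("documents", audit_inbox.append)

send_directed(indexer_inbox.append, "reindex doc 1")
delivered = broker.publish("documents", "doc 2 uploaded")
```

Note that publish is itself a directed call to the broker, which is exactly the tricky part described above: the message is directed at the proxy and broadcast to everyone behind it.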

Behavior

Every component should do something. It may be a data transformation, like turning data objects into an HTML page. It could be taking in documents and producing some index. Now for some Processes the behavior could be the communication itself. A queueing system like RabbitMQ exists just for communication, so its behavior can be very thin.

State/Config

A Process will need configuration. It will need state. The minimum state is a way to connect to another Process that it can use to store state. This is pretty common. Most Processes will have some configuration and use a database process to store some state.
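A rough sketch of that split, under some loud assumptions: Config, open_state, remember, and recall are hypothetical names, and SQLite (which runs in-process) is only standing in for a separate database Process. The point is that the configuration holds little more than how to reach the place where state lives.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Config:
    # Hypothetical setting: the minimum a Process needs is where its state lives.
    database_path: str


def open_state(config: Config) -> sqlite3.Connection:
    conn = sqlite3.connect(config.database_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS state (key TEXT PRIMARY KEY, value TEXT)"
    )
    return conn


def remember(conn: sqlite3.Connection, key: str, value: str) -> None:
    # Insert the value, or overwrite it if the key already exists.
    conn.execute(
        "INSERT OR REPLACE INTO state (key, value) VALUES (?, ?)", (key, value)
    )
    conn.commit()


def recall(conn: sqlite3.Connection, key: str) -> Optional[str]:
    row = conn.execute("SELECT value FROM state WHERE key = ?", (key,)).fetchone()
    return None if row is None else row[0]


# ":memory:" keeps the example self-contained; a real Process would point elsewhere.
config = Config(database_path=":memory:")
conn = open_state(config)
remember(conn, "last_indexed", "doc-41")
remember(conn, "last_indexed", "doc-42")
```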

Scheduler

This component is optional or can be very simple. It may be that a Process only responds to messages from another Process, but if not, it can respond to time. It might do something every 5 minutes. Or it may be complex and have extremely variable schedules, all corresponding to different things it must do.

How Does This Affect Architecture?

A system has 4 parts: Communication, Behavior, State or Config, and Scheduling. These parts should be as separate as possible and communicate via contract (interface). Your behavior section should consume interfaces that define how it communicates with other components. It should expose interfaces that can be used by a communication layer. If I switch from a file system, to a relational database, to a non-relational database, to an external cloud web service, I want to isolate the changes to the communication layer. My behavior should not change! Now this may not be 100% possible, but this is the architecture to strive for. Some tools make this kind of separation hard. And while some of these tools may be good, they are compromising your architecture.
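Here is one way that separation could look, as a sketch rather than a prescription. DocumentStore, InMemoryStore, FileStore, and word_count are all made-up names; the behavior (word_count) only knows the contract, so swapping the backing store does not touch it.

```python
import tempfile
from pathlib import Path
from typing import Dict, Optional, Protocol


class DocumentStore(Protocol):
    """Hypothetical contract the behavior consumes."""

    def save(self, key: str, text: str) -> None: ...

    def load(self, key: str) -> Optional[str]: ...


class InMemoryStore:
    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def save(self, key: str, text: str) -> None:
        self._data[key] = text

    def load(self, key: str) -> Optional[str]:
        return self._data.get(key)


class FileStore:
    def __init__(self, directory: Path) -> None:
        self._directory = directory

    def save(self, key: str, text: str) -> None:
        (self._directory / key).write_text(text)

    def load(self, key: str) -> Optional[str]:
        path = self._directory / key
        return path.read_text() if path.exists() else None


def word_count(store: DocumentStore, key: str) -> int:
    # Behavior: depends only on the contract, never on where the text lives.
    text = store.load(key)
    return 0 if text is None else len(text.split())


memory = InMemoryStore()
memory.save("greeting", "hello big world")
memory_count = word_count(memory, "greeting")

with tempfile.TemporaryDirectory() as tmp:
    files = FileStore(Path(tmp))
    files.save("greeting", "hello big world")
    file_count = word_count(files, "greeting")
```

Moving to a relational database or a cloud service would mean writing one more class with save and load; word_count stays as it is.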

The same is true for Scheduling. There should be one flexible scheduling component. It should be completely ignorant of the behavior it is scheduling. You should just tell it when to fire something and what to fire.
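A small sketch of what such a scheduler might look like. Scheduler, Job, every, and tick are hypothetical names, and to keep the example deterministic the current time is passed in as a plain number of seconds instead of being read from a clock.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Job:
    interval: float              # seconds between runs
    action: Callable[[], None]   # what to fire; opaque to the scheduler
    next_run: float              # time of the next run


class Scheduler:
    """Hypothetical scheduler: knows when to fire, not what it is firing."""

    def __init__(self) -> None:
        self._jobs: List[Job] = []

    def every(self, interval: float, action: Callable[[], None],
              start: float = 0.0) -> None:
        if interval <= 0:
            raise ValueError("interval must be positive")
        self._jobs.append(Job(interval, action, start + interval))

    def tick(self, now: float) -> int:
        # Fire every job that has come due, catching up on missed runs.
        fired = 0
        for job in self._jobs:
            while job.next_run <= now:
                job.action()
                job.next_run += job.interval
                fired += 1
        return fired


log: List[str] = []
scheduler = Scheduler()
scheduler.every(300, lambda: log.append("rebuild index"))  # every 5 minutes
scheduler.every(600, lambda: log.append("send report"))    # every 10 minutes

first = scheduler.tick(now=300)
second = scheduler.tick(now=600)
```

The scheduler never imports or inspects the behavior; it only receives a callable and an interval, which is the "when to fire something and what to fire" contract described above.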
