Developer hiring by the numbers

It’s no secret that hiring software developers is hard. Plenty of people more qualified than I am have written over the years about just how hard. In Dublin, Ireland especially, the supply and demand economics of the problem are particularly difficult. Dublin is a major IT hub in Europe, and host to the EMEA headquarters of some of the largest IT companies in the world (Google, Microsoft, Facebook). We’ve got insurance companies, banks and other financial institutions all HQ’d here as well. The demand for good IT staff is significant. And the supply side of the equation is further complicated by the presence of recruitment agencies. When a potential candidate is ready to move jobs, they are faced with a choice. Do all their own job search legwork? Or send their CV out to a small handful of recruitment agencies and let the interview invites roll in?

So where does that leave a relatively small IT company? Well, in a far from ideal position, where hiring is a significant undertaking in terms of both time and money. Below is some background information on how I’ve conducted interview processes in my last few roles, and some anecdotal numbers on how that process has been working out recently.

The Process

Our hiring process isn’t unique. We try to establish as quickly as possible whether a candidate is suitable for a role without wasting any of their time or ours.

  1. CV Filtering – Are they an appropriate match on paper for the role?
  2. Phone/Skype Interview – Keep this brief, 30 minutes, enough to discuss experience, existing role, reasons for moving on, technical experience and some relevant technical questions based on the role they’re applying for
  3. Code Test – We send candidates a test to complete in their own time. It is presented as a typical user story with requirements, acceptance criteria and a definition of done (DoD). This is a chance for the candidate to show off. We try to deliver a code test that can be given to everyone from graduate to senior developers. While a grad might solve the problem, we expect more from a candidate based on their seniority: solution architecture, separation of concerns, use of design patterns, decoupled design, use of dependency injection/IoC, code quality, unit testing, integration testing, database migration strategy etc.
  4. Face-to-Face Interview – The candidate meets with our team, usually a mixture of management and technical staff, for a more involved conversation around experience, previous roles and technical questions

Our Rules

About 10 years ago, I was in my first team lead/people management role where hiring technical staff was part of my remit. I read “Smart & Gets Things Done” by Joel Spolsky. It’s a brilliant, concise little booklet on how to hire developers, and all this time later there are a few things that still stand out for me.

  1. We don’t hire maybes – We’re a small company. We have a diverse portfolio of software projects for a number of clients across a range of industry verticals. The large majority of our business is bespoke solution development and as a result, the train up time on our projects tends not to be spent on the technology but on the domain knowledge & business problems. We simply cannot afford to hire the wrong people.
  2. We need all-rounders – I dislike the term full-stack developers. Like it’s some novelty. At the risk of sounding like the dishevelled old man shouting “back in my day…”, developers used to be full-stack by default; at the very least, they had some hands-on experience across all the layers of a solution. This is no longer the case with a lot of candidates we see: developers and engineers who have been siloed into roles where they only worked on one small sub-system, or one specific layer of the solution. This is compounded further when looking for seniors; we add skills like business analysis, hands-on experience doing DevOps, CI & production releases, and experience in customer-facing roles, and it really can feel like we’re looking for unicorns.
  3. We don’t compromise on #1 and #2 when other pressures are present – It’s all well and good having rules and principles in our process. So long as they don’t get thrown out the window when they don’t suit. Circumstances will dictate from time to time that we need to ramp up in resourcing to facilitate a project. This can be tough if the client wants to start tomorrow, and we have a potential 6-8 week lead time to hire.

The Numbers

The following is from a 6 week period during our most recent hiring round. We received a total of 84 applications: 73 from publicly posted job listings and 11 from a recruitment agency. A higher percentage of candidates from the agency got to phone screen.

  • Of those 84 we held phone screens with 35 candidates.
  • Of those 35 candidates, we sent out code tests to 25 of them.
  • Of the 21 code tests we received back, we went back and offered 9 candidates a follow up interview.
  • Of the 7 that accepted and attended the interview, we offered a full time position to 1 person.

That’s a signal to noise ratio of slightly over 1% (1 hire from 84 applications).
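To make the drop-off at each stage easier to see, here is a minimal Python sketch that turns the counts above into conversion rates. The stage labels and the `conversion_rates` helper are my own naming for illustration; only the counts come from the figures above.

```python
# Stage counts taken from the figures quoted above; labels are illustrative.
funnel = [
    ("applications received", 84),
    ("phone screens held", 35),
    ("code tests sent", 25),
    ("code tests returned", 21),
    ("follow-up interviews offered", 9),
    ("interviews attended", 7),
    ("job offers made", 1),
]


def conversion_rates(stages):
    """Return (name, % of previous stage, % of first stage) for each stage."""
    total = stages[0][1]
    previous = total
    rows = []
    for name, count in stages:
        rows.append((name, 100 * count / previous, 100 * count / total))
        previous = count
    return rows


for name, step_pct, overall_pct in conversion_rates(funnel):
    print(f"{name:<30} {step_pct:6.1f}% of previous stage, {overall_pct:6.1f}% overall")
```

The last row is the "slightly over 1%" figure: 1 offer out of 84 applications.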

Some other observations

Offering a week to complete the code test is a double-edged sword. In that time, candidates will continue looking for a role. A number of candidates we provided code tests to just disappeared into the ether after being supplied the test. Similarly, candidates that we’ve offered a face-to-face interview (or even a job offer) are often off the market before we get to that point.

A large percentage of our applicants are non-EU residents. We’re very happy to assist them in moving country, applying for visas and obtaining work permits, but that does add a significant lead time (typically an additional 4-8 weeks) on top of the standard 1-month notice period we’d expect to be waiting.

Let’s assume a resume takes 10 minutes to review on average (check content, reply reject/proceed, possibly set up a phone screen). A phone screen takes 60 minutes including prep and post-call time (sending out code tests). Code test review and notes by several team members add another 60 minutes. And a face-to-face interview including 3-4 team members is at a minimum a 4-hour commitment. By those ballparks we’ve invested ~100 hours (13 business days) into this one phase of hiring. That’s a pretty serious time commitment and nearly doubles the cost associated with hiring when compared against the amount we have to pay out to a recruiter.
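For anyone who wants to check the ballpark, the arithmetic can be sketched in a few lines of Python. The candidate counts are the ones from the funnel above and the minutes per activity are the estimates from this paragraph; the 7.5-hour working day is my own assumption for converting hours into days.

```python
# (number of candidates, minutes spent per candidate) for each activity.
activities = {
    "resume review": (84, 10),
    "phone screen": (35, 60),
    "code test review": (21, 60),
    "face-to-face interview": (7, 4 * 60),
}

HOURS_PER_WORKING_DAY = 7.5  # assumption, not a figure from the post


def total_hours(items):
    """Sum count * minutes over all activities and convert to hours."""
    return sum(count * minutes for count, minutes in items.values()) / 60


hours = total_hours(activities)
working_days = hours / HOURS_PER_WORKING_DAY
print(f"{hours:.0f} hours, roughly {working_days:.1f} working days")
```

That lands just under the ~100 hours quoted, and at a little over 13 working days.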

The bottom line…

…is that the IT hiring market is particularly challenging for SMEs. Finding talented developers, bringing them through the interview process and getting them in situ, is a significant undertaking. And when you get them, you better do everything you bloody well can to keep them.

After following this path for the past 3 years, we’ve now hit an impasse. The time and resource cost, lead time for starting and general difficulties with the hiring process for quality full time developers are now a quantifiable impediment to our business. Without the ability to ramp up on demand, we’re faced with either turning down growth-business, or increasing the workload on our existing team. Neither option is particularly palatable. We need to pivot. Contractor rotation? Near-shoring/Out-sourcing? A change in tack is definitely required.

~Eoin C

Multiple GitHub accounts & SSH keys on the same machine

If, like me, you have 2 or more different GitHub accounts on the go, then accessing and committing as both on the same machine can be a challenge.
In my case, I have 2 accounts, one for work associated with my company email and a second for my own personal code.

If you’d like to be able to check out, code and commit against different repos across different GitHub accounts on the same machine, then you can do so by setting up multiple SSH keys, and having hostname aliases configured in your .ssh config file.

First of all, you’ll need to generate your SSH keys. If you haven’t done this already, you can use the following commands to generate your keys.


$ ssh-keygen -t rsa -C "eoin@work.com"
Generating public/private rsa key pair.
Enter file in which to save the key (/c/Users/eoin/.ssh/id_rsa): id_rsa_eoin_at_work

$ ssh-keygen -t rsa -C "eoin@home.com"
Generating public/private rsa key pair.
Enter file in which to save the key (/c/Users/eoin/.ssh/id_rsa): id_rsa_eoin_at_home

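Once you’ve generated your 2 keys, you’ll see 2 key pair files (the file you specified and a .pub) in your ~/.ssh directory, assuming you ran ssh-keygen from that directory (a bare file name entered at the prompt is saved relative to the current directory). You can go ahead and add the respective public keys (the .pub files) to each of your GitHub accounts. It’s in the GitHub > Settings > SSH and GPG keys section of your settings. You’ll also need to add the private keys to your SSH agent (for example with ssh-add ~/.ssh/id_rsa_eoin_at_work).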

Next you’ll want to create an ssh config file in your ~/.ssh directory. You can see mine below.

Host github.com
    HostName github.com
    User git
    IdentityFile ~/.ssh/id_rsa_eoin_at_work

Host personal.github.com
    HostName github.com
    User git
    IdentityFile ~/.ssh/id_rsa_eoin_at_home

Here’s the trick: when you execute a git clone command to clone a repo, the host in that command does not have to be a real DNS hostname. SSH first matches it against the Host entry on the first line of each section in the above file, and then connects to the HostName given in that section. So you can very easily change it. Now, if I want to check out work related projects from my work account, I can use:

git clone git@github.com:eoincgreenfinch/heartbeat.git

# don't forget to set your git config to use your work meta data.
git config user.name "eoincgreenfinch"
git config user.email "eoin@work.com" 

But if I want to check out code from my personal account, I can easily modify the clone URI with the following.

git clone git@personal.github.com:eoincampbell/combinatorics.git

# don't forget to set your git config to use your personal meta data.
git config user.name "eoincampbell"
git config user.email "eoin@home.com" 
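GitHub’s clone button always hands you a URL with the real github.com host, so the alias has to be swapped in by hand each time. If you do this a lot, a tiny helper can do the rewrite for you. The Python sketch below is purely illustrative: `with_host_alias` is a hypothetical function of my own, and it only handles the scp-style `git@host:owner/repo.git` form used in this post.

```python
def with_host_alias(clone_url: str, alias: str) -> str:
    """Swap the host part of an scp-style git URL (user@host:owner/repo.git)."""
    user_and_host, colon, path = clone_url.partition(":")
    user, at, _host = user_and_host.partition("@")
    if not (colon and at):
        # e.g. an https:// URL, which this sketch deliberately does not handle
        raise ValueError(f"not an scp-style git URL: {clone_url!r}")
    return f"{user}@{alias}:{path}"


print(with_host_alias(
    "git@github.com:eoincampbell/combinatorics.git",
    "personal.github.com",
))
```

The alias passed in has to match a Host entry in your ssh config file, otherwise ssh will try to resolve it as a real hostname.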

~Eoin Campbell

SLA: How many 9’s do I need?

Introduction

I recently had a conversation with a colleague regarding service level agreements and what kind of up-time SLAs we were required to provide (or would recommend) to some of our customers. This is something that comes up more and more, particularly in relation to software delivery on cloud hosting platforms. Azure, Amazon AWS, OpenStack, Rackspace, Google App Engine, and so on all offer ever increasing levels of up-time around their cloud offerings, and this trickles down to the ISVs who build software on these platforms. So how many 9’s does your organisation’s system need?

Percentage availability

Availability is the ability for your users to access or use the system. If they can’t access it because it’s locked up, or offline, or the underlying hardware has failed, then it is unavailable.

For the uninitiated, measuring availability in 9’s is industry parlance for what percentage of time your application is available. The following table maps out the equivalent allowed downtime described by those numbers.

Description | Up-time | Downtime per year | Downtime per month
two 9’s | 99% | ~3.65 days | ~7.2 hours
three 9’s | 99.9% | ~8.7 hours | ~43 minutes
three and a half 9’s | 99.95% | ~4.3 hours | ~21 minutes
four 9’s | 99.99% | ~52 minutes | ~4.3 minutes
five 9’s | 99.999% | ~5.25 minutes | ~25 seconds
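The figures in the table fall straight out of a one-line formula: allowed downtime = (1 - availability) x length of the period. Here is a short Python sketch of that calculation, assuming a 365-day year and a 30-day month (the table’s values are rounded down, so expect small differences in the last digit). The function names are my own.

```python
def allowed_downtime_seconds(availability_pct: float, period_days: float) -> float:
    """Seconds of downtime permitted over period_days at the given availability."""
    return (1 - availability_pct / 100) * period_days * 24 * 60 * 60


def humanise(seconds: float) -> str:
    """Format a duration using the largest unit that fits."""
    for unit, size in (("days", 86400), ("hours", 3600), ("minutes", 60)):
        if seconds >= size:
            return f"{seconds / size:.2f} {unit}"
    return f"{seconds:.1f} seconds"


for pct in (99, 99.9, 99.95, 99.99, 99.999):
    per_year = allowed_downtime_seconds(pct, 365)
    per_month = allowed_downtime_seconds(pct, 30)
    print(f"{pct}% -> {humanise(per_year)} per year, {humanise(per_month)} per month")
```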

Service Level Agreements

The number of 9’s that a company’s or service’s SLA specifies does not necessarily mean that the system will always adhere to or guarantee that level of up-time. No doubt, there are mission critical systems out there that would need guaranteed/consistent up-time and multiple layers of fail-over/redundancy in case those guarantees are not met. However, more often than not, these numbers are goals to be attained, and customers might be offered a rebate/credit if the availability does not reach those goals.

Take Amazon S3 storage services for example. Their service commitment goal is to maintain a three 9’s level of up-time in each month, however in the event that they do not, they offer a customer credit of:
– 10% in the case where they drop below three 9’s
– 25% in the case where they drop below two 9’s
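Expressed as code, a tiered commitment like this is just a threshold lookup. The Python sketch below uses only the two tiers described above; `service_credit_pct` is a hypothetical helper, and a provider’s actual terms may differ or change over time, so treat the thresholds as illustrative.

```python
def service_credit_pct(monthly_uptime_pct: float) -> int:
    """Credit tiers as described in the text: 10% below 99.9%, 25% below 99%."""
    if monthly_uptime_pct < 99.0:
        return 25
    if monthly_uptime_pct < 99.9:
        return 10
    return 0


for uptime in (99.95, 99.5, 98.0):
    print(f"{uptime}% up-time -> {service_credit_pct(uptime)}% credit")
```

Note that the credit is a percentage of the bill, which says nothing about what the outage actually cost your own business.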

Microsoft Azure has a similar service commitment for their IaaS Virtual Machines. In this case, while they offer a similar credit rebate for dropping below 99.95%, they also caveat that you must have a minimum of 2 virtual machines configured in an availability set across different fault domains (areas of their data centre infrastructure that ensure resources like power & network are redundantly supplied).

What are your requirements?

Our business is predominantly focused on providing our customers with line of business applications. The large majority of their usage is by end-users between 8 am and 6 pm on business days. As a result, we have a level of flexibility with our customers to co-ordinate releases, planned outages and system maintenance in a way that minimally impacts the user base.

In the past, however, I’ve built and maintained systems that were both financially and time critical; SMS based revenue generation based on 30 second TV ad spots, for example, has a very different business use case, requiring a different level of service availability. If your system is offline during the 90 second window from the start of the advert, then you risk having lost that customer.

When identifying your own requirements, you need to think about the following:

  • When do you need your system or application to be available?
  • Do you have different levels of availability requirements depending on time of day, month or year?
    • LOB application that needs to be available 9-5/M-F
    • FinSrv application required for high availability at end of month but low availability throughout the month
    • An e-commerce application requiring 24/7 availability across multiple geographic locations & overlapping time zones
  • What are the implications for your system being unavailable?
    • Are there financial implications?
    • Is the usage/availability time critical/sensitive?
    • Are other systems upstream/downstream dependent upon you and if so, what SLA do they provide?
  • If one component of your system is unavailable, is the entirety of the system unusable?
    • Can each component be unavailable independently of the others, or does one failure take the rest down with it?
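The last question matters more than it might seem, because availability compounds. As a rough sketch, assuming component failures are independent: when every component in a chain must be up, the availabilities multiply, and when one of several redundant copies is enough, the unavailabilities multiply instead. The component figures in the Python example below are hypothetical.

```python
from math import prod


def serial_availability(*component_pcts: float) -> float:
    """Availability (%) of a chain where every component must be up."""
    return 100 * prod(p / 100 for p in component_pcts)


def redundant_availability(component_pct: float, copies: int) -> float:
    """Availability (%) of identical independent copies where one is enough."""
    return 100 * (1 - (1 - component_pct / 100) ** copies)


# Hypothetical web tier, application tier and database, each at three 9's.
print(f"chain of three: {serial_availability(99.9, 99.9, 99.9):.4f}%")
# Hypothetical pair of load-balanced servers, each at two 9's.
print(f"redundant pair: {redundant_availability(99, 2):.4f}%")
```

So three components that each meet three 9’s give you a system that does not, while doubling up a two 9’s component can get you to four 9’s, provided the copies really do fail independently.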

The cost of higher levels of availability

Requiring higher levels of availability (more 9’s) means having a more complex, robust and resilient hardware infrastructure and software system. If your system is complicated, that may mean ensuring that the various constituent components can each independently satisfy the SLA, e.g.

  • Clustering your database in a Master-Master replication setup over multiple servers
  • Load-balancing your web application across multiple virtual machines
  • Redesigning to remove single points of failure in your application architecture, such as in-process session state
  • Externalising certain services to 3rd parties that provide commercial solutions. (Azure Service Bus, Amazon S3 Storage etc…)

And all these things come with a cost.

John’s E-Commerce Site

John runs an e-commerce website where he sells high value consumer goods. During the year his system generates ~€12m in revenue. Averaged over the course of the year, that equates to the following revenue earnings. However, since his business is low volume/high margin, missing a single sale/transaction could be costly.

  • €1,000,000 per month
  • €33,333.33 per day
  • €1,388.89 per hour
  • €23.15 per minute

John’s application currently only offers two 9’s of availability as it’s implemented on a single VPS and has numerous single points of failure. Planned outages are kept to a minimum but required to perform updates, releases and patches.

John is considering attempting to increase his platform’s availability to four 9’s. Should he do it?

Quantifying the value of higher levels of availability

If you take a purely financial view of John’s situation, the difference in cost implications between two 9’s and four 9’s is significant.

SLA | Outage Window | Formula | Total Cost of Max. Outages
99% | 3.65 days | 3.65 * €33,333 | €121,665.45
99.99% | 52 minutes | 52 * €23.15 | €1,203.80
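The same comparison can be reproduced for any availability level with a few lines of Python. This sketch uses the €12m revenue figure and the same averaging as John’s revenue list (12 equal months of 30 days); because it does not round the inputs the way the table does, it comes out at roughly €121.7K and €1.2K rather than matching the table to the cent. The function name is my own.

```python
ANNUAL_REVENUE_EUR = 12_000_000

# Same averaging as the revenue list above: 12 equal months of 30 days each.
REVENUE_PER_MINUTE = ANNUAL_REVENUE_EUR / 12 / 30 / 24 / 60


def max_outage_cost(availability_pct: float, days_in_year: int = 365) -> float:
    """Revenue at risk if the full yearly downtime allowance is used up."""
    downtime_minutes = (1 - availability_pct / 100) * days_in_year * 24 * 60
    return downtime_minutes * REVENUE_PER_MINUTE


for pct in (99, 99.99):
    print(f"{pct}% -> €{max_outage_cost(pct):,.2f}")
```

This assumes revenue is spread evenly across the year, which is exactly the assumption the considerations below call into question.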

Ultimately, he needs to understand if this is an accurate estimation of the cost impact, and if it is, would it cost him more than €120K year on year, to increase the up-time of his system. There are numerous other business and technical considerations here on both sides of the equation.

  • Revenue estimation year on year may or may not be accurate
  • Revenue generation may not be evenly distributed through the year; if he can maintain high availability through the Black Friday and Christmas shopping seasons, it may alleviate most of his losses.
  • There may be other less tangible impacts on recurring revenue due to bad user experiences of arriving while the site is down etc.
  • Downtime may have a detrimental/negative impact on his brand.

On the other hand, what is the cost of the upgrade?

  • Development costs to upgrade the system.
  • Additional hosting costs to move to a cloud platform or additional 3rd parties
  • On-going support costs to maintain this new system
  • There may be other considerations where the adoption of new technologies (a high availability cache) would alleviate the necessity of an increased SLA for a data store for example.

Assuming that the system can be initially upgraded and maintained year on year for less than €120K, the return on investment would make sense for John to undertake this work. It would be a different conversation the next time though when he wants to go to five 9’s availability.

Thoughts?

Deciding on an appropriate level for your SLA is complicated, and there are a myriad of considerations and inputs which will dictate the “right” answer for your particular situation. Whatever you decide, attempting to achieve higher and higher levels of availability for your system will most probably lead to higher costs and smaller returns on investment. So make sure the level you choose is appropriate from both a business and technical perspective.

~Eoin Campbell

Running SQL Server on Azure Virtual Machines

Introduction

We recently started hitting some capacity issues with an SQL Server Reporting Services box hosted
on Microsoft’s Azure Cloud Platform. The server had been set up around the time Microsoft end-of-life’d
their platform-as-a-service report server offering, and forced everyone back onto standalone instances.
The server was a Basic A2 class VM (3.5GB Ram, 2 Cores). Originally, it only had to handle a small amount of report
creation load but in recent times, that load has gone up significantly. And due to the “peaky”
nature of the customer’s usage, we would regularly see periods where the box could not keep up with
report generation requests.

In the past week, we’ve moved the customer to a new SQL Server 2014 Standard Edition install. Here are a few of the things we’ve
learned along the way with regard to setting up SQL Server as a standalone instance on an Azure VM.

This information is based on the service offerings and availabilities in the Azure North Europe region as of February 2016.

Which Virtual Machine Class?

First off, you should choose a DS scale virtual machine. At the time of writing, Microsoft offer 4 different VM classes
in the North Europe region: A, D, DS and D_V2. Only the DS class machines currently support Premium Locally
Redundant Storage (Premium LRS) which allows you to attach permanent SSD storage to your server.

Within the DS Set, DS1-DS4 have a slightly lower memory : core count ratio. The DS11-DS14 set have a higher
starting memory footprint for the same core count. We went with a DS3 server (4 core / 14GB) which we can downscale to
DS1 during out of hours periods.

[Screenshot: Which Virtual Machine Class?]

Which Storage Account?

During setup ensure that you’ve selected a Premium Locally Redundant Storage account which will
provide you access to additional attachable SSDs for your SQL Server. This can be found under
Optional Configuration > Storage > Create Storage Account > Pricing Tier

[Screenshot: Which Storage Account?]

External Security

Security will be somewhat dependent on your specific situation. In our case, this was a
standalone SQL Server with no failover cluster or domain management. The server was set up
with a long username and password (not the john.doe account in the screenshots).

We also lock down the management ports for Remote Desktop and Windows RM, as well as the added
HTTPS and SQL ports. To do this, add the public-to-private port mapping configurations under
Optional Configurations > Endpoints

[Screenshot: Endpoint Configuration]

Once you’ve finished setting up the configuration and Azure has provisioned the server,
you’ll want to re-enter the management blades and add ACL rules to lock down port access
to only the IP ranges you want to access it. In our case, our development site, customer
site, and Azure hosted services.

You can add “permit” rules for specific IP addresses to access your server. Once a single
permit rule is added, all other IP addresses/ranges are blocked by default.

[Screenshot: Endpoint ACLs]

Automated Backups

SQL Azure VMs can now leverage an automated off-server database backup service
which will place your backups directly into Blob Storage. Select SQL Automated
Backup and enable it. You will be asked to specify where you would like to store your
backups and for how long. We chose to use a non-premium storage account
for this, and depending on the inherent value of your backups and whether you
intend to subsequently off-site them yourself, you might want to choose a storage
setup with zone or geo redundancy. You can also enable backup encryption by providing
a password here.

[Screenshot: Automated SQL Backup to Storage]

Disk Configuration

Now that your server is up and running, you can log in via Remote Desktop. The first
thing you’ll want to do is patch your server. As of mid-February 2016, the base image
for SQL Server 2014 on Windows Server 2012 Standard R2 is missing quite a number of
patches. Approximately 70 critical updates and another 80 optional updates need to
be installed.

Once you’ve got your server patched, you can take a look at the disk setup. If you’ve
chosen a DS Class Server, you’ll notice that you have 2 Disks. A regular OS disk, and
an SSD Temp Disk. This temp disk is NOT to be used for real data, it is local only to
the VM while it’s running and will be deallocated and purged if you shut the server
down.

You can however purchase additional SSD disks very easily. Head back out to the Azure
Management Portal, find your VM, go to settings and choose Disks. In the following
screenshot, we’ve chosen to add an additional 2 x 128GB (P10 class) disks to
the server. The SQL Server best practices document recommends using the 1TB (P30 class)
disks, which do give a significant I/O bump but are also more expensive.

Ensure that you specify “Read Only” host caching for your Data Disk and No-Caching for
your Log disk to improve performance.

[Screenshot: Adding Extra Disks]

Once your disks are attached you can access and map them inside your VM. We chose to
set up the disks using the newer Windows Server 2012 Resilient File System (ReFS) rather
than NTFS. Previously there were potential issues with using ReFS in conjunction with
SQL Server, particularly in relation to sparse files and the use of DBCC CHECKDB however
these issues have been resolved in SQL Server 2014.

[Screenshot: Disk Configuration]

Moving your Data Files

SQL Server VM Images come pre-installed with SQL Server so we’ll need to do a little bit
of reconfiguration to make sure all our data and log files end up in the correct place. In the
following sections, disk letters & paths refer to the following.

  • C: (OS Disk)
  • D:\SQLTEMP (Temp/Local SSD)
  • M:\DATA\ (Attached Perm SSD intended for Data)
  • L:\LOGS\ (Attached Perm SSD intended for Logs)

First, we need to give permission to SQL Server to access these other disks. Assuming
you haven’t changed the default service accounts, then your SQL Server instance will
be running as the NT SERVICE\MSSQLSERVER account. You’ll need to give this account Full
Permissions on each of the locations you intend to store data and log files.

[Screenshot: Folder Permissions]

Once the permissions are correct, we can specify those directories as new defaults
for our Data, Logs and Backups.

[Screenshot: Setting Default Paths for Data & Logs]

Next, we’ll move our master MDF and LDF files by performing the following steps.

  1. Launch the SQL Server configuration Manager
  2. Under SQL Server Services, select the main Server instance, and stop it
  3. Right click the server instance, go to properties and review the startup parameters tab
  4. Modify the -d (master data file) and -l (master log file) parameters to point to the paths where you intend to host your data and log files; the -e parameter is the error log path
  5. Open Explorer and navigate to the default directory where the MDF files and LDF files are located (C:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\). Move the Master MDF and LDF to your new paths
  6. Restart the Server

[Screenshot: Moving the Master Database]

When our server comes back online, we can move the remainder of the default databases.
Running the following series of SQL commands will update the system to expect the MDFs
and LDFs at the new location on next start up.

ALTER DATABASE [msdb] MODIFY FILE ( NAME = MSDBData , FILENAME = 'M:\DATA\MSDBData.mdf' )
ALTER DATABASE [msdb] MODIFY FILE ( NAME = MSDBLog , FILENAME = 'L:\LOGS\MSDBLog.ldf' )
ALTER DATABASE [model] MODIFY FILE ( NAME = modeldev , FILENAME = 'M:\DATA\model.mdf' )
ALTER DATABASE [model] MODIFY FILE ( NAME = modellog , FILENAME = 'L:\LOGS\modellog.ldf' )
ALTER DATABASE [tempdb] MODIFY FILE (NAME = tempdev, FILENAME = 'D:\SQLTEMP\tempdb.mdf');
ALTER DATABASE [tempdb] MODIFY FILE (NAME = templog, FILENAME = 'D:\SQLTEMP\templog.ldf');

--You can verify them with this command
SELECT name, physical_name AS CurrentLocation, state_desc FROM sys.master_files 
	

Shut down the SQL instance one more time. Physically move your MDF and LDF files to
their new locations in Explorer, and finally restart the instance. If there are any
problems with the setup or the server fails to start, you can review the ERROR LOG in
C:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\Log\ERRORLOG

Conclusions


There are a number of other steps that you can then perform to tune your server.

You should also set up SSL/TLS for any endpoints exposed to the outside world
(e.g. if you’re going to run the server as an SSRS box). Hopefully you will now have a far
more performant SQL instance running in the Azure Cloud.

~Eoin Campbell

SOLID Principles

Introduction

Over the past few months Greenfinch has hired a number of new developers at varying levels of seniority. One of the go-to questions for our interview was to ask a potential candidate about SOLID principles in Object Oriented programming. Astonishingly, many candidates either didn’t know what they were or only had an academic understanding of them and could not talk about them in a practical sense with regards to real projects that they’d worked on.

Because we have a number of development engineers spanning various levels of experience, we thought it would be appropriate to have a quick refresher course on SOLID with some practical examples in one of our lunchtime brown-bag sessions. You can find the presentation below.

Presentation

Object Oriented Programming Concepts

Inheritance

Inheritance is when an object or class is based on another object or class, using the same implementation (inheriting from a class) or specifying implementation to maintain the same behaviour (implementing an interface). It is a mechanism for code reuse and to allow independent extensions of the original software via public classes and interfaces giving rise to a hierarchy.

Inheritance should not be confused with sub-typing though they can agree with one another. In general sub-typing establishes an is-a relationship, while inheritance only reuses implementation and establishes a syntactic relationship, not necessarily a semantic relationship.

public class Vehicle { ... }

public class RoadVehicle : Vehicle { ... }

public class Car : RoadVehicle { ... }

public class Truck : RoadVehicle { ... }

Encapsulation

Encapsulation refers to the bundling of data with the methods that operate on that data. Encapsulation is used to hide the values or state of a structured data object inside a class, preventing unauthorized parties’ direct access to them. Publicly accessible methods are generally provided in the class (so-called getters and setters) to access the values, and other client classes call these methods to retrieve and modify the values within the object.

It’s important to understand that Encapsulation doesn’t just mean classes are property bags with getters & setters and a handful of methods. As well as hiding implementations and exposing only the APIs required for the consumers to get what they need from a class or module, it also relates to how you structure your application architecture. Properly delineating your architecture into Core, Data, Service, Façade and Consumer layers, for example, will help encapsulate the functionality below from the callers above and keep your architecture decoupled.

Another important consideration with regard to encapsulation is testability. Too often, we’ll start with a correctly encapsulated piece of code, and then when it comes time to unit test we realise that the functionality we want to test is buried inside private/inaccessible methods. This should be a red flag that you need to rethink your design. Rather than just making these methods public or slapping an InternalsVisibleTo attribute on your assemblies, consider that maybe you need to abstract that functionality out of your method into another responsible class or module.

Polymorphism

At run time, objects of a derived class may be treated as objects of a base class in places such as method parameters and collections or arrays. Base classes may define and implement virtual methods, and derived classes can override them, which means they provide their own definition and implementation. At run-time, when client code calls the method, the CLR looks up the run-time type of the object, and invokes that override of the virtual method. Thus in your source code you can call a method on a base class, and cause a derived class’ version of the method to be executed.

using System;
using System.Collections.Generic;

public class Program
{
    static void Main(string[] args)
    {

        // Polymorphism at work #1: a Rectangle, Triangle and Circle
        // can all be used wherever a Shape is expected. No cast is
        // required because an implicit conversion exists from a derived
        // class to its base class.
        List<Shape> shapes = new List<Shape>();
        shapes.Add(new Rectangle());
        shapes.Add(new Triangle());
        shapes.Add(new Circle());

        // Polymorphism at work #2: the virtual method Draw is
        // invoked on each of the derived classes, not the base class.
        foreach (Shape s in shapes)
        {
            s.Draw();
        }

        // Keep the console open in debug mode.
        Console.WriteLine("Press any key to exit.");
        Console.ReadKey();
    }
}

public abstract class Shape
{
    public abstract void Draw();
}

public class Circle : Shape
{
    public override void Draw()
    {
        Console.WriteLine("Drawing a circle");
    }
}
public class Rectangle : Shape
{
    public override void Draw()
    {
        Console.WriteLine("Drawing a rectangle");
    }
}
public class Triangle : Shape
{
    public override void Draw()
    {
        Console.WriteLine("Drawing a triangle");
    }
}

Cohesion & Coupling

Cohesion and coupling are worth mentioning in unison. Cohesion refers to how closely related (logically/semantically) the pieces of functionality are, that are exposed by a particular module or class. If you ask yourself the question, “Do these pieces of functionality belong together?” and the answer is “Yes!” then you have a cohesive piece of code. Coupling on the other hand refers to how tightly interlinked two totally separate modules/classes are together. The more coupling that exists in your application, the more likely that changes to one piece of functionality will have an effect (possibly an adverse effect) on another.

In general you should aim to write code which is highly cohesive, with low coupling.
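
As a rough illustration of both ideas, here is a minimal sketch. All of the names in it (BasketPricer, IReceiptSender, Checkout, RecordingReceiptSender) are hypothetical and invented for this example. The pricer is cohesive because every member deals with pricing, and the checkout is loosely coupled to receipt delivery because it only knows a one-method interface.

```csharp
using System.Collections.Generic;

// High cohesion: every member of this (hypothetical) class is about pricing a basket.
public class BasketPricer
{
    private readonly decimal _vatRate;

    public BasketPricer(decimal vatRate)
    {
        _vatRate = vatRate;
    }

    public decimal Subtotal(IEnumerable<decimal> prices)
    {
        decimal sum = 0m;
        foreach (decimal price in prices)
        {
            sum += price;
        }
        return sum;
    }

    public decimal Vat(IEnumerable<decimal> prices)
    {
        return Subtotal(prices) * _vatRate;
    }

    public decimal Total(IEnumerable<decimal> prices)
    {
        return Subtotal(prices) + Vat(prices);
    }
}

// Low coupling: Checkout knows only this small interface,
// not how (or where) a receipt is actually delivered.
public interface IReceiptSender
{
    void Send(decimal amountDue);
}

public class Checkout
{
    private readonly BasketPricer _pricer;
    private readonly IReceiptSender _sender;

    public Checkout(BasketPricer pricer, IReceiptSender sender)
    {
        _pricer = pricer;
        _sender = sender;
    }

    public void Complete(IEnumerable<decimal> prices)
    {
        _sender.Send(_pricer.Total(prices));
    }
}

// Stand-in sender that just remembers the last amount it was given.
public class RecordingReceiptSender : IReceiptSender
{
    public decimal LastAmount { get; private set; }

    public void Send(decimal amountDue)
    {
        LastAmount = amountDue;
    }
}
```

Swapping RecordingReceiptSender for an email or SMS sender would not require any change to Checkout or BasketPricer, which is the practical payoff of keeping coupling low.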

What’s that smell?

There are a number of things that ring out to developers as wrong when they see them in software: duplicated code; long methods and long branching statements; unmaintainable or brittle tests; tomes of text within method comment blocks explaining the voodoo that lies before them. We typically refer to these as Code Smells, but there are also architectural smells that often go ignored: rigid designs that are difficult to change and manipulate; viscous and complex designs that require massive surgery to get the next square feature to fit in that round interface/inheritance hierarchy; fragile and immobile designs that break when we change them and result in developers having to cut corners or possibly throw DRY out the window (don’t repeat yourself; yes, I realise the irony of spelling out the acronym).

But what’s the big deal? So maybe we need to write a little more code or perform a little bit of surgery on the architecture. That’s development, right?

Well, not really. At the end of the day, change equals cost. This is particularly relevant in an SME like Greenfinch, where a number of our projects are bespoke engagements with customers. That cost needs to be absorbed somewhere, so either it costs our customers more to get the functionality they require or Greenfinch absorbs those costs during development. It also has a negative impact on the team. In projects with many developers, where a colleague may have to extend work that you’ve done, you end up putting road blocks in place for them. Overall it impacts development and product morale, and soon people are grumbling about that module or that developer’s code. Probably worst of all is the build-up of a business debt. Some refer to this as technical debt, but really, it’s the business that owns the product that is accruing these //TODO items and //MUST FIX backlog tickets that seem to grow faster than they can be cleared.

SOLID Principles

SOLID is an acronym for five guiding principles to help you write better, more maintainable code. They were popularized by Robert C. Martin (aka Uncle Bob) in his book Agile Software Development: Principles, Patterns, and Practices, where he gave pragmatic advice on object-oriented design and development in an agile team. SOLID stands for:

  • Single Responsibility Principle (SRP)
  • Open-Closed Principle (OCP)
  • Liskov Substitution Principle (LSP)
  • Interface Segregation Principle (ISP)
  • Dependency Inversion Principle (DIP)

Software development is not supposed to be like a game of Jenga. You shouldn’t be worried about the entire system collapsing every time you add, remove or refactor one of its blocks. These five principles provide guidance on how best to construct your code and architecture so that it’s easily maintained and modified by you and your colleagues.

Single Responsibility Principle

If a class has more than one responsibility, then the responsibilities become coupled. Changes to one responsibility may impair or inhibit the class’ ability to meet the others. This kind of coupling leads to fragile designs that break in unexpected ways when changed. – Robert C. Martin

In a nutshell, each block of code & functionality (methods & classes) should be responsible for one single thing. The more things that a block of code is responsible for, the more heavily coupled it is with other pieces of functionality and behaviour, and as a result, the more likely it is to break, when you want to change just one small part of it. Let’s consider a simple logging class for example.

public class EoinsLogger
{
    public enum LogTo
    {
        TheDatabase,
        TheFileSystem
    }

    public void LogMessage(string message, LogTo where)
    {
        if (where == LogTo.TheDatabase)
        {
            LogToTheDatabase(message);
        }
        else
        {
            LogToTheFileSystem(message);
        }
    }

    private void LogToTheDatabase(string message)
    {
        //ADO.NET Code

    }

    private void LogToTheFileSystem(string message)
    {
        // System.IO. Code

    }
}

This code has a lot of different responsibilities. It’s responsible for logging obviously, but it’s also responsible for the decision on which underlying logging implementation to use, as well as the two specific implementation methods themselves. If another developer wants to come along and modify this, perhaps adding a third logging medium, they need to significantly alter the class in order to accomplish that. Below is a slightly better implementation. We’ve abstracted the actual implementations of the specific logging medium from the logger itself. Now, we’ve taken away some of the responsibility from the logger.

public class Logger
{
	public enum LogTo
	{
		TheDatabase,
		TheFileSystem
	}

	private ILoggerImplementation _ilog;

	public Logger(LogTo where)
	{
		if(where == LogTo.TheDatabase) _ilog = new DatabaseLogger();
		else _ilog = new FileSystemLogger();
	}
	public void LogMessage(string message)
	{
		_ilog.LogMessage(message);
	}
}

public interface ILoggerImplementation
{
	void LogMessage(string message);
}

public class DatabaseLogger : ILoggerImplementation
{
	public void LogMessage(string message)
	{
		//ADO.NET Code
	}
}

public class FileSystemLogger : ILoggerImplementation
{
	public void LogMessage(string message)
	{
		//System.IO Code
	}
}

But it still owns both the logging process and the decision on where to log to. It breaks the Open-Closed Principle, as extending the logging to include a third implementation means modifying the branching logic in the logger. It should be closed for modification.

Open-Closed Principle (OCP)

Modules that conform to open-closed have two primary attributes: They are “Open For Extension” They are “Closed for Modification” – Robert C. Martin

Let’s further modify our previous logging example. It complies with our Open for Extension attribute, but it’s not currently closed for modification. We can accomplish that by injecting the logger implementation to be used at runtime.

public class Logger
{
	private ILoggerImplementation _ilog;

	public Logger(ILoggerImplementation theLogger)
	{
		_ilog = theLogger;
	}
	public void LogMessage(string message)
	{
		_ilog.LogMessage(message);
	}
}

public interface ILoggerImplementation
{
	void LogMessage(string message);
}

public class DatabaseLogger : ILoggerImplementation
{
	public void LogMessage(string message)
	{
		//ADO.NET Code
	}
}

public class FileSystemLogger : ILoggerImplementation
{
	public void LogMessage(string message)
	{
		//System.IO Code
	}
}

That’s much better now. Our logger class is simply responsible for logging. Extension can be accomplished by creating new loggers. And the decision on which medium to use has been moved out of the logger: the consumer decides when it instantiates the logger.
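
To make that concrete, here is a small sketch of adding a third medium without touching Logger. InMemoryLogger and OcpDemo are hypothetical names invented for this example; Logger and ILoggerImplementation are repeated from the listing above so that the snippet stands on its own.

```csharp
using System.Collections.Generic;

public interface ILoggerImplementation
{
    void LogMessage(string message);
}

// Unchanged from the listing above: it only forwards to whatever was injected.
public class Logger
{
    private readonly ILoggerImplementation _ilog;

    public Logger(ILoggerImplementation theLogger)
    {
        _ilog = theLogger;
    }

    public void LogMessage(string message)
    {
        _ilog.LogMessage(message);
    }
}

// Hypothetical third medium: added as a new class, with no edits to Logger.
public class InMemoryLogger : ILoggerImplementation
{
    public List<string> Messages { get; } = new List<string>();

    public void LogMessage(string message)
    {
        Messages.Add(message);
    }
}

public static class OcpDemo
{
    // The consumer picks the medium when it builds the Logger.
    public static int Run()
    {
        var memory = new InMemoryLogger();
        var logger = new Logger(memory);
        logger.LogMessage("First message");
        logger.LogMessage("Second message");
        return memory.Messages.Count;
    }
}
```

An in-memory implementation like this also doubles as a convenient test double, since a test can inspect Messages without touching a database or the file system.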

Liskov Substitution Principle (LSP)

Objects in a program should be replaceable with instances of their subtypes without altering the correctness of that program. – Somebody that isn’t Robert C. Martin (the principle is named after Barbara Liskov)

Deciding on the correct abstractions in your architecture is important to get right, and well worth holding initial design sessions on. Whiteboard it out. Decide among your architecture teams if your object hierarchy is correct. Let’s look at a simple, contrived example here. Everyone learns about shapes in primary school mathematics. A rectangle is a four-sided shape in which adjacent sides meet at a 90 degree angle, resulting in two long sides (the length) and two short sides (the width). You can get the area of a rectangle by multiplying its length by its width. And you can double the area of the rectangle by doubling the length of one of its sides.

public class Rectangle
{
	protected int Length { get; private set; }

	protected int Width { get; private set; }

	public Rectangle(int l, int w)
	{
		Length = l;
		Width = w;
	}

	public virtual int GetArea()
	{
		return Length * Width;
	}

	public void DoubleInArea()
	{
		Length = Length * 2;
	}
}

[TestFixture]
public class RectangleTests
{
	[Test]
	public void TestRectangleArea()
	{
		int l = 10;
		int w = 5;
		int expected = 50;
		Rectangle r = new Rectangle(l, w);
		int actual = r.GetArea();
		Assert.AreEqual(expected, actual);

		r.DoubleInArea();
		int newexpected = 100;
		int newactual = r.GetArea();
		Assert.AreEqual(newexpected, newactual);
	}
}

We also learn that a square is just a more specialised type of rectangle where all four sides are equal in length. So it seems pretty reasonable to design a system where a Square is just a specialised sub-type of Rectangle. Right?

public class Square : Rectangle
{
	public Square(int side)	: base(side, side)
	{

	}

	public override int GetArea()
	{
		return Length * Length;
	}
}

[TestFixture]
public class SquareTests
{
	[Test]
	public void TestSquareArea()
	{
		int l = 10;
		int expected = 100;
		Rectangle r = new Square(l);
		int actual = r.GetArea();
		Assert.AreEqual(expected, actual);

		r.DoubleInArea();
		int newexpected = 200;
		int newactual = r.GetArea();
		Assert.AreEqual(newexpected, newactual);
	}
}

But wait, what’s happened here? The implementer of Square has overridden the GetArea() method to multiply the length by itself. A perfectly reasonable assumption in the context of a square. But the base type has a DoubleInArea() method which doubles the length of the Square. Calling this method in conjunction with the Square’s GetArea() method doesn’t just double the area. It quadruples it: the test above expects 200 but gets 400. This kind of issue rears its head all too often in software development, where presumptuous but naive abstractions fail in real-world use.

So what would a better solution have been here? Maybe both Rectangle and Square should have implemented an IFourSidedShape interface, which would have forced the implementer of Square to explicitly implement both the GetArea and DoubleInArea methods.
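
One way that could look is sketched below. This is only an illustration, and it makes an assumption the listings above don’t: the sides are stored as double rather than int, because a square can only double its area by scaling its side by the square root of two. The private field names are also invented for this example.

```csharp
using System;

public interface IFourSidedShape
{
    double GetArea();
    void DoubleInArea();
}

public class Rectangle : IFourSidedShape
{
    private double _length;
    private readonly double _width;

    public Rectangle(double length, double width)
    {
        _length = length;
        _width = width;
    }

    public double GetArea()
    {
        return _length * _width;
    }

    // Doubling one side doubles the area of a rectangle.
    public void DoubleInArea()
    {
        _length = _length * 2;
    }
}

public class Square : IFourSidedShape
{
    private double _side;

    public Square(double side)
    {
        _side = side;
    }

    public double GetArea()
    {
        return _side * _side;
    }

    // A square has to stay square, so the side is scaled by sqrt(2)
    // to double the area rather than being doubled itself.
    public void DoubleInArea()
    {
        _side = _side * Math.Sqrt(2.0);
    }
}
```

Because neither class inherits behaviour from the other, each implementer has to think about what doubling the area means for their own shape, and code written against IFourSidedShape gets the same outcome from both (within floating point rounding for the square).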

Remember if it walks like a duck, and quacks like a duck, but needs batteries, you probably have the wrong abstraction.

Interface Segregation Principle (ISP)

Classes that have fat interfaces are classes whose interfaces are not cohesive. In other words, the interfaces of the class can be broken up into groups of member functions. Each group serves a different set of clients. – Robert C. Martin

Here’s a very simplified example of a FileSystemManager. It’s singly responsible for all file I/O in our application. It encapsulates and abstracts away the file I/O code. It’s decoupled. It’s cohesive in its responsibilities.

public class FileSystemManager
{
	public void ReadFile() { }

	public void WriteFile() { }
}

Great, but it has a pretty fat interface: reading AND writing files. Perhaps not every module or service that consumes this code cares about both reading and writing. A logging module might only care about writing to the file system. A configuration service might only care about reading the .config files off the disk. Interface segregation is about splitting up the functionality of your code into smaller, more semantically and logically coherent APIs for the consumers that are going to use them. In the following example, we’ve broken the file system manager down to implement two separate interfaces: an IFileReader and an IFileWriter. Consumers of this code can then treat the FileSystemManager as one or the other depending on their specific needs. Furthermore, new implementations (e.g. a blob storage manager) need only implement the interfaces that they require.

public interface IFileReader
{
	void ReadFile();
}

public interface IFileWriter
{
	void WriteFile();
}

public class ProperFileSystemManager : IFileReader, IFileWriter
{
	public void ReadFile() { }

	public void WriteFile() { }
}

public class ProperBlobStorageManager : IFileReader, IFileWriter
{
	public void ReadFile() { }

	public void WriteFile() {}
}
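
The consumer side might then look something like the sketch below. LoggingModule, ConfigurationService and CountingFileSystemManager are hypothetical names invented for this example; the two interfaces are repeated from the listing above so the snippet is self-contained.

```csharp
public interface IFileReader
{
    void ReadFile();
}

public interface IFileWriter
{
    void WriteFile();
}

// Only ever writes, so it asks for nothing more than IFileWriter.
public class LoggingModule
{
    private readonly IFileWriter _writer;

    public LoggingModule(IFileWriter writer)
    {
        _writer = writer;
    }

    public void Flush()
    {
        _writer.WriteFile();
    }
}

// Only ever reads, so it asks for nothing more than IFileReader.
public class ConfigurationService
{
    private readonly IFileReader _reader;

    public ConfigurationService(IFileReader reader)
    {
        _reader = reader;
    }

    public void Load()
    {
        _reader.ReadFile();
    }
}

// Stand-in for a real manager: it just counts calls so the effect is visible.
public class CountingFileSystemManager : IFileReader, IFileWriter
{
    public int Reads { get; private set; }
    public int Writes { get; private set; }

    public void ReadFile()
    {
        Reads++;
    }

    public void WriteFile()
    {
        Writes++;
    }
}
```

The same CountingFileSystemManager instance can be handed to both consumers, yet the compiler stops LoggingModule from ever calling ReadFile and ConfigurationService from ever calling WriteFile.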

Dependency Inversion Principle (DIP)

A design is rigid if it cannot be easily changed. Such rigidity is due to the fact that a single change to heavily interdependent software begins a cascade of changes in dependent modules. – Robert C. Martin

Dependency inversion relates to keeping our architecture decoupled. High-level modules should not depend on low-level modules; instead, both should depend on abstractions. Those abstractions should not depend on details; rather, the details should depend on the abstractions. Let’s look at a simple example of some hierarchical classes which have coupled dependencies on each other.

public class FacadeLayerManager
{
	private ServiceLayerManager _serviceLayerManager;

	public FacadeLayerManager()
	{
		// The facade creates its own concrete dependency
		_serviceLayerManager = new ServiceLayerManager();
	}

	public List<object> GetData()
	{
		return _serviceLayerManager.GetData();
	}
}

public class ServiceLayerManager
{
	private DataManager _dataManager;

	public ServiceLayerManager()
	{
		// The service layer creates its own concrete dependency
		_dataManager = new DataManager();
	}

	public List<object> GetData()
	{
		return _dataManager.GetData();
	}
}

public class DataManager
{
	public List<object> GetData()
	{
		// Get data from the database (placeholder: returns an empty list)
		return new List<object>();
	}
}

Here we have a simple 3-tier architecture where the facade layer makes calls to a service layer to get data, and the service layer makes calls to the lower-level DataManager module. But this is a tightly coupled architecture. We cannot test our ServiceLayerManager without creating an instance of our DataManager, which connects to our database. Our higher-level modules depend on the lower-level ones rather than on an abstraction.

Instead, we can replace the instance fields in each layer with an abstraction (Interface) and inject our specific implementation via the constructor.


#region Interfaces

public interface IBetterFacadeLayerManager
{
	List<object> GetData();
}

public interface IBetterServiceLayerManager
{
	List<object> GetData();
}

public interface IBetterDataManager
{
	List<object> GetData();
}

#endregion

#region Implementations

public class FacadeLayerManager : IBetterFacadeLayerManager
{
	private IBetterServiceLayerManager _serviceLayerManager;

	public FacadeLayerManager(IBetterServiceLayerManager injectedManager)
	{
		_serviceLayerManager = injectedManager;
	}

	public List<object> GetData()
	{
		return _serviceLayerManager.GetData();
	}
}

public class ServiceLayerManager : IBetterServiceLayerManager
{
	private IBetterDataManager _dataManager;

	public ServiceLayerManager(IBetterDataManager injectedManager)
	{
		_dataManager = injectedManager;
	}

	public List<object> GetData()
	{
		return _dataManager.GetData();
	}
}

public class DataManager : IBetterDataManager
{
	public List<object> GetData()
	{
		// Get data from the database (placeholder: returns an empty list)
		return new List<object>();
	}
}

#endregion

Now we’ve removed the tightly coupled dependencies at each layer. We can also take advantage of dependency injection tools and frameworks such as Autofac, Ninject or Unity (or Spring in the Java world) to automatically inject the correct concrete implementation at run time based on some configuration settings.
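
Even without a container, the pay-off shows up as soon as you want to test a layer in isolation. The sketch below wires things up by hand. FakeDataManager and DipDemo are hypothetical names invented for this example, and the interfaces and ServiceLayerManager are repeated so the snippet stands alone.

```csharp
using System.Collections.Generic;

public interface IBetterDataManager
{
    List<object> GetData();
}

public interface IBetterServiceLayerManager
{
    List<object> GetData();
}

public class ServiceLayerManager : IBetterServiceLayerManager
{
    private readonly IBetterDataManager _dataManager;

    public ServiceLayerManager(IBetterDataManager injectedManager)
    {
        _dataManager = injectedManager;
    }

    public List<object> GetData()
    {
        return _dataManager.GetData();
    }
}

// Hypothetical in-memory stand-in: no database connection required.
public class FakeDataManager : IBetterDataManager
{
    public List<object> GetData()
    {
        return new List<object> { "row 1", "row 2" };
    }
}

public static class DipDemo
{
    // Manual injection: the caller decides which implementation the service gets.
    public static int Run()
    {
        IBetterServiceLayerManager service = new ServiceLayerManager(new FakeDataManager());
        return service.GetData().Count;
    }
}
```

A dependency injection container automates exactly this kind of wiring. The classes themselves do not change, only the place where the concrete types are chosen.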

Summary

OO design is easy to get wrong. It’s especially easy for a design to get out of control if you don’t keep good principles and practices in mind at every step of development. Putting that first hack in there, or cutting that first corner, is akin to throwing a pebble down the side of a snow-topped mountain. It’s going to turn into a very large snowball very quickly, and once it does, it’s going to be far more difficult to stop. So follow good OO principles, follow these SOLID principles, talk to one another during the design phase, and put the time into designing and writing your code to a high quality. It’ll serve you, your company and your customers far better in the long run.

~Eoin Campbell