Data Partitioning Strategy in Cosmos DB

Over the past 6 months, I’ve been overseeing the design and implementation of a number of new projects for ChannelSight’s Buy It Now platform. One of the key technologies at the center of these projects is Cosmos DB, Microsoft’s globally distributed, multi-model database.

We’ve been moving a number of services, which previously sat on top of SQL Azure, over to Cosmos DB for a variety of reasons: performance, the ability to scale up and down quickly, and load distribution; and we’re very happy with the results so far. It’s been a baptism by fire, involving a little bit of trial and error, and we’ve had to learn a lot as we went along.

This week I ran a workshop for our Dev team on Cosmos DB. Partitioning was the area we spent the most time discussing, and it’s probably THE most important thing to spend time on during the planning stages when designing a Cosmos DB Container. Partitioning defines how Cosmos DB internally divides and portions up your data. It affects how your data is stored internally. It affects how it’s queried. It affects several hard limits you can hit. And it affects how expensive the service is to use.

Getting your partitioning strategy correct is key to successfully utilizing Cosmos DB; getting it wrong could end up being a very costly mistake.

Microsoft provides some guidance on partitioning, but that didn’t stop me from making a number of errors in my interpretation of what a good partitioning strategy should look like and why.

What is partitioning?

First, it’s important to understand what partitioning is in relation to Cosmos DB, and why it needs to partition our data. Cosmos DB lets you query your data with very low latency at any scale. In order to achieve this, it needs to spread your data out across lots of underlying infrastructure along some dimension that you specify. This is your partition key. All rows or documents which share the same partition key will end up stored in, and accessed from, the same logical partition. Imagine a very simple document like so:

{
    "id": "1",
    "name": "Eoin",
    "city": "Dublin"
}

If you decided to specify /city as the partition key, then all the Dublin documents would be stored together, all the London documents together, and so on.

How is Cosmos DB structured?

Before we get down to the level of partitions, let’s look at the various components that make up your Cosmos DB account.

Cosmos DB Account Structure

  • The top layer in the diagram represents the Cosmos DB account. This is analogous to the server in SQL Azure
  • Next is the database. Again, analogous to the database in SQL Azure
  • Below the database, you can have many containers. Depending on the model you’ve selected, your container will either be a Collection, a Graph or a Table
  • A container is served by many physical partitions. You can think of a physical partition as the physical hardware assigned to serve your container. Each physical partition has a fixed amount of SSD-backed storage and compute resources; and these things have physical limitations
  • Each physical partition then hosts many logical partitions of your data
  • And each logical partition, in turn, holds many items (documents, nodes, rows)

We’re using Cosmos DB with the SQL API, and since we’re coming from a SQL Azure background, there is somewhat of a SQL/relational slant on the opinions below. I’ll be talking about `Collections` and `Documents` rather than speaking more generically, but the same principles apply to a Graph or a Table.

When you create a new collection, you have to provide it with a partition key. This tells the database along which dimension of your data (which property in your documents) it should split the data up. Each document with a differing partition key value will be placed in a different logical partition. Many logical partitions will be placed in a single physical partition. And those many physical partitions make up your collection.
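
To make that concrete, here’s a minimal sketch of creating a partitioned collection with the .NET SQL API SDK of that era (Microsoft.Azure.DocumentDB); the endpoint, key, database name and throughput figures are placeholders rather than anything we actually use.

using System;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

public static class PartitionedCollectionSetup
{
    public static async Task CreateCollectionAsync()
    {
        // Placeholder endpoint and key for illustration only
        var client = new DocumentClient(
            new Uri("https://my-account.documents.azure.com:443/"), "<auth-key>");

        var collection = new DocumentCollection { Id = "people" };

        // The partition key is declared once, at creation time; it cannot be
        // changed later without dropping and recreating the collection.
        collection.PartitionKey.Paths.Add("/city");

        await client.CreateDocumentCollectionIfNotExistsAsync(
            UriFactory.CreateDatabaseUri("my-database"),
            collection,
            new RequestOptions { OfferThroughput = 1000 }); // provisioned RU/s
    }
}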

Partition Limitations

While logical partitions are somewhat nebulous groupings of like documents, physical partitions are very real and have 2 hard limits on them.

  1. A physical partition can store a maximum of 10GB of data
  2. A physical partition can facilitate at most 10,000 Request Units (RU)/s of throughput.

A physical partition may hold one or many logical partitions. Cosmos DB will monitor the size and throughput limitations for a given logical partition and seamlessly move it to a new physical partition if need be. Consider the following scenario where two large logical partitions are hosted in one physical partition.

Physical Partition | Logical Partition | Current Size | Current Throughput | OK
P1                 | /city=Dublin      | 3GB          | 2,000 RU/s         | :white_check_mark:
P1                 | /city=London      | 6GB          | 5,000 RU/s         | :white_check_mark:

Cosmos DB’s resource manager will recognise that the entire P1 partition is about to hit a physical limitation. It will seamlessly spread these two logical partitions out to two separate physical partitions, which are capable of dealing with the increased storage and load.

Physical Partition | Logical Partition | Current Size | Current Throughput | OK
P2                 | /city=Dublin      | 5GB          | 4,000 RU/s         | :white_check_mark:
P3                 | /city=London      | 7GB          | 8,000 RU/s         | :white_check_mark:

However, if a single logical partition attempts to grow beyond the size of a single physical partition, you’ll receive an error from the API: "Errors":["Partition key reached maximum size of 10 GB"]. This would obviously be very bad, and you would need to reorganise and repartition all of this data, breaking it down into smaller partitions by a more granular value.

Physical Partition | Logical Partition | Current Size | Current Throughput | OK
P2                 | /city=Dublin      | 5GB          | 4,000 RU/s         | :white_check_mark:
P3                 | /city=London      | 10GB         | 10,000 RU/s        | :x:

Microsoft provides some information here on how the number of required partitions is calculated, but let’s look at a practical example:

  • You configure a collection with 100,000 RU/s capacity T
  • The maximum throughput per physical partition is 10,000 RU/s t
  • Cosmos allocates 10 physical partitions to support this collection N = T/t (sketched in the snippet after this list)
  • Cosmos allocates the key space evenly across those 10 physical partitions so that each holds 1/10 of the logical partitions
  • If a physical partition P1 approaches its storage limit, Cosmos will seamlessly split that partition into P2 and P3, increasing your physical partition count N = N+1
  • If you return later and increase the throughput to 120,000 RU/s T2 such that T2 > t*N, Cosmos will split one or more of your physical partitions to support the higher throughput
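
The arithmetic in that list is just a ceiling division. A quick sketch (the 10,000 RU/s per-partition figure is the hard limit quoted earlier, not something you configure):

using System;

class PartitionMath
{
    static void Main()
    {
        const double T = 100_000; // provisioned collection throughput (RU/s)
        const double t = 10_000;  // max throughput a single physical partition can serve (RU/s)

        // N = ceil(T / t) physical partitions are allocated to serve the collection
        var n = (int)Math.Ceiling(T / t);
        Console.WriteLine($"Physical partitions: {n}"); // 10
    }
}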

Data Size, Reads & Writes

In an ideal situation, your partition key should give you several things:

  1. An even distribution of data by partition size.
  2. An even distribution of request unit throughput for read workloads.
  3. An even distribution of request unit throughput for write workloads.
  4. Enough cardinality in your partitions that, over time, you will not hit those physical partition limitations

Finding a partition strategy that satisfies all of those goals can be tricky.

At one extreme, you could choose to place everything in a single partition, but this puts a hard limit on how scalable your solution is, as we’ve seen above. At the other, you could put every single document into its own partition, but this might have implications if you need to perform cross-partition queries or utilize cross-document transactions.
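
For example, here’s a rough sketch, using the SQL API .NET SDK and the earlier /city document (the Person class and the database/collection names are illustrative), of the difference between a single-partition query and one that must be explicitly allowed to fan out across partitions:

using System;
using System.Linq;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;
using Newtonsoft.Json;

public class Person
{
    [JsonProperty(PropertyName = "id")]
    public string Id { get; set; }
    [JsonProperty(PropertyName = "name")]
    public string Name { get; set; }
    [JsonProperty(PropertyName = "city")]
    public string City { get; set; }
}

public static class QueryExamples
{
    public static void Run(DocumentClient client)
    {
        var collectionUri = UriFactory.CreateDocumentCollectionUri("my-database", "people");

        // Scoped to a single logical partition: cheap and fast
        var dubliners = client.CreateDocumentQuery<Person>(collectionUri,
                new FeedOptions { PartitionKey = new PartitionKey("Dublin") })
            .ToList();

        // Not filtered by partition key: must opt in to fanning out
        // across every physical partition, at a higher RU cost
        var eoins = client.CreateDocumentQuery<Person>(collectionUri,
                new FeedOptions { EnableCrossPartitionQuery = true })
            .Where(p => p.Name == "Eoin")
            .ToList();
    }
}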

So what is an appropriate partition strategy?

When building a relational database, the dimensions upon which you normalize or index your data tend to be obvious. 1:many relationships are obvious candidates for a foreign-key relationship; any column that you regularly apply a WHERE or ORDER BY clause to becomes a candidate for an index. Choosing a good partition key isn’t always as obvious, and changing it after the fact can be difficult. You can’t update the partition key attribute of a collection without dropping and recreating the collection. And you can’t update the partition key value of a document; you must delete and recreate that document.

All of the following are valid approaches in certain scenarios but have caveats in others.

Partitioning by Tenant Id or “Foreign Key”

One candidate for partitioning might be a tenant Id, or some value that’s an obvious candidate for an important Foreign Key entity in your RDBMS. If you’re building a product catalog, this might be the Manufacturer of each product. But if you don’t have an even distribution of data/workload per tenant, this might not be a good idea. Some of our clients have 100 times more data and 1,000 times more traffic than others. Using manufacturer Id uniformly across the board would create performance bottlenecks for some of the bigger clients, and we would very quickly hit the storage limit for a single physical partition.

Single Container vs. Multiple Containers

Another option for dividing data along the “Tenant” dimension would be to first shard your application into one-container-per-tenant. This allows you to isolate each tenant’s data in a completely separate collection, and then subsequently partition that data along other dimensions using more granular data points. This also has the benefit that a single client’s workload won’t impact your other clients. It did not make sense for us: with a 1,000 RU/s minimum per collection, the majority of our smaller clients would never need that capacity, and we couldn’t have passed on the cost of standing up that many collections.

Partitioning by Dates & Times

You could also partition your data by a Date or DateTime attribute (or some part of one). If you have a small, consistent write workload of time-series data, then partitioning by some time component (e.g. yyyy-MM-dd-HH) would allow you to subsequently query or fetch sets of data efficiently in 1-hour windows. Often, however, this kind of time-series data is high volume (audit logs, HTTP traffic) and as such you end up with an extremely high write workload on a single partition (the current hour) while every other partition sits idle.

Therefore it often makes more sense to partition your data (logs) by some other dimension (process id) to distribute that write workload more evenly.

Partitioning by a Hybrid Value

Taking the above into consideration, the answer might involve some sort of hybrid value, mixing data points from several different attributes of your document.

An application audit log for your platform might partition data by {SolutionName}/{ComponentName} so that you can efficiently search logs for one area of your system. If the data is not needed long term, then you could specify time-to-live values on the documents so that they self-expire after a rolling period of days.
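
As a sketch of that audit-log idea (the document shape and the names are hypothetical): TTL is switched on at the collection level via DefaultTimeToLive, and each document can then carry its own ttl override in seconds.

using System;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;
using Newtonsoft.Json;

public class AuditLogEntry
{
    [JsonProperty(PropertyName = "id")]
    public string Id { get; set; }

    // e.g. "{SolutionName}/{ComponentName}"
    public string PartitionKey { get; set; }

    public string Message { get; set; }

    // Per-document time-to-live, in seconds; omitted from the JSON when null
    [JsonProperty(PropertyName = "ttl", NullValueHandling = NullValueHandling.Ignore)]
    public int? TimeToLive { get; set; }
}

public static class AuditLogSetup
{
    public static async Task CreateAsync(DocumentClient client)
    {
        var collection = new DocumentCollection { Id = "audit-logs" };
        collection.PartitionKey.Paths.Add("/PartitionKey");
        collection.DefaultTimeToLive = -1; // TTL on, but nothing expires unless a document sets its own ttl

        await client.CreateDocumentCollectionIfNotExistsAsync(
            UriFactory.CreateDatabaseUri("my-database"), collection);

        await client.CreateDocumentAsync(
            UriFactory.CreateDocumentCollectionUri("my-database", "audit-logs"),
            new AuditLogEntry
            {
                Id = Guid.NewGuid().ToString(),
                PartitionKey = "BuyItNow/CheckoutService",
                Message = "Order submitted",
                TimeToLive = (int)TimeSpan.FromDays(30).TotalSeconds // self-expires after 30 days
            });
    }
}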

HTTP traffic logs for impression and click data might be partitioned by {yyyy-MM-dd}/{client}/{campaign}, so that data and write workloads are partitioned at the level of an individual client and individual marketing campaign for a given day. You can then efficiently query that data for specific date ranges, clients and campaigns for reporting aggregation later.
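
The hybrid key itself is just a composed string; a hypothetical helper for the two examples above might look like this:

using System;

public static class PartitionKeys
{
    // e.g. "BuyItNow/CheckoutService"
    public static string ForAuditLog(string solutionName, string componentName)
        => $"{solutionName}/{componentName}";

    // e.g. "2018-07-01/client-42/summer-sale"
    public static string ForTrafficLog(DateTime date, string clientId, string campaignId)
        => $"{date:yyyy-MM-dd}/{clientId}/{campaignId}";
}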

Dynamic Partitioning for Multiple Documents Types

For our solution we had a very specific requirement for our product search query. For a given Manufacturer’s product SKU, we wanted to look up all the retailers that carried that product. In the end we settled on the following strategy:

Put all documents in a single collection

We started out with essentially two types of documents

  1. A Product document, which contained a SKU & Manufacturer data
  2. A Retailer Data document, which contained a reference id to the Product document

Use a common base entity for all documents

We then implemented a small abstract base class which all documents would inherit from. The PartitionKey string property is used as the partition key for the entire collection.

public abstract class CatalogBaseEntity
{
    [JsonProperty(PropertyName = "id")]
    public string Id { get; set; }
    public string PartitionKey { get; set; }
    public abstract string Type { get; }
}

Use different value sets for the Partition Key based on Document Type

For our Product documents, the value of the partition key is set to Manufacturer-<UniqueManufacturerId>. Since the product metadata for a single product is quite small, we’ll never hit the 10GB storage cap for a single manufacturer.

For our RetailerData document, the value of the partition key is set to Product-<ProductDocumentId>.
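
As a sketch, the two concrete document types might look something like this (any property beyond the SKU, manufacturer and product reference described above is purely illustrative):

using System;

public class Product : CatalogBaseEntity
{
    public override string Type => "Product";
    public string Sku { get; set; }
    public string ManufacturerId { get; set; }
}

public class RetailerData : CatalogBaseEntity
{
    public override string Type => "RetailerData";
    public string ProductDocumentId { get; set; }
    public string RetailerName { get; set; }
}

public static class CatalogDocuments
{
    public static Product NewProduct(string manufacturerId, string sku) => new Product
    {
        Id = Guid.NewGuid().ToString(),
        ManufacturerId = manufacturerId,
        Sku = sku,
        PartitionKey = $"Manufacturer-{manufacturerId}" // Manufacturer-<UniqueManufacturerId>
    };

    public static RetailerData NewRetailerData(string productDocumentId, string retailerName) => new RetailerData
    {
        Id = Guid.NewGuid().ToString(),
        ProductDocumentId = productDocumentId,
        RetailerName = retailerName,
        PartitionKey = $"Product-{productDocumentId}" // Product-<ProductDocumentId>
    };
}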

Querying the API for Product data

We now have a very efficient search system for our product data.

When our API receives a SKU query, we’ll first do a lookup for a Product document for a single manufacturer + sku. This is a single, non-partition-crossing query.

Next, we take the ID of that Product document and do a subsequent query for all the associated Retailer Data documents. Again, since this is partitioned by the Product.Id, it’s a non-partition-crossing query and is limited to a finite set of results.
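
Putting it together, a sketch of that two-step lookup with the SQL API .NET SDK might look like this (the database/collection names and the synchronous LINQ calls are illustrative, not our production code):

using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

public class ProductSearchService
{
    private static readonly Uri CollectionUri =
        UriFactory.CreateDocumentCollectionUri("catalog-db", "catalog");

    private readonly DocumentClient _client;

    public ProductSearchService(DocumentClient client) => _client = client;

    public IList<RetailerData> FindRetailersFor(string manufacturerId, string sku)
    {
        // Step 1: single-partition lookup of the Product document
        // (partition key = "Manufacturer-<UniqueManufacturerId>")
        var product = _client.CreateDocumentQuery<Product>(CollectionUri,
                new FeedOptions { PartitionKey = new PartitionKey($"Manufacturer-{manufacturerId}") })
            .Where(p => p.Sku == sku)
            .AsEnumerable()
            .FirstOrDefault();

        if (product == null)
            return new List<RetailerData>();

        // Step 2: single-partition query for the associated RetailerData documents
        // (partition key = "Product-<ProductDocumentId>")
        return _client.CreateDocumentQuery<RetailerData>(CollectionUri,
                new FeedOptions { PartitionKey = new PartitionKey($"Product-{product.Id}") })
            .ToList();
    }
}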


Hopefully that was a useful insight into partitioning data in Cosmos DB. I would love to hear other people’s experiences with it. And if there’s anything I’ve misinterpreted, drop me a note in the comments so I can correct it. Like I said, this has been a big learning experience for us here.

Eoin Campbell

You’re writing a console app and you want to continue to accept input from the user over multiple lines until they stop typing. Essentially, “How do I do ReadToEnd() on the command line?” Alternatively, you want to be able to redirect input from another file.

Turns out it’s quite easy to do.

using System;
using System.IO;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        // Wrap STDIN in a StreamReader so we can consume it all in one go
        using (var sr = new StreamReader(Console.OpenStandardInput(), Console.InputEncoding))
        {
            var input = sr.ReadToEnd();
            var tokens = input.Replace(Environment.NewLine, " ").Split(' ');
            Console.WriteLine($"Tokens: {tokens.Count()}");
        }
    }
}

For the user-interactive example, you’ll have to terminate the input with CTRL-Z followed by Enter (or CTRL-D on Linux).

And you can now redirect/pipe to STDIN from other files.

Get-Content .\input.txt | .\stdin-test.exe

Eoin Campbell

Qluent

Qluent is a Fluent Queue Client for Azure storage queues

Qluent is a simple fluent API and set of wrapper classes around the Microsoft Azure Storage SDK, allowing you to interact with storage queues using strongly typed objects, as in the code snippet below. You can see lots of other ways to use it in the documentation on Github.

var queue = await Builder
    .CreateAQueueOf<Entity>()
    .UsingStorageQueue("my-entity-queue")
    .BuildAsync();
    
await queue.PushAsync(new Entity());

So why did I build this?

Back in March, some colleagues and I ran into an issue with a legacy project that we’d inherited at work. At random times a queue consumer would just get stuck and stop dequeuing messages. When we went to debug it, we discovered the code responsible for dequeuing, processing and deleting the message was buried in an assembly, and the source code was … unavailable :confounded:.

After much hair pulling and assembly decompilation, we eventually tracked down the bug, but it got me thinking about a couple of things:

  1. Setting up an Azure storage queue through the SDK is a little tedious. There’s quite a bit of ceremony involved to create a CloudStorageAccount, CloudQueueClient and CloudQueue, to ensure the queue exists, and to deal with serialization/deserialization.

  2. There are some aspects of the SDK I dislike. Specifying the large majority of settings on the methods (such as message visibility timeout, etc.), rather than as configuration on the CloudQueueClient itself, seems wrong. It leaves lots of sharp corners for the developer to get caught on after they fetch a queue from DI and want to interact with it.

  3. There are lots of tricky scenarios to account for even in simple messaging use cases, such as idempotency issues, handling retries and dealing with poison messages.

  4. Developers shouldn’t need to worry about writing consumers/dispatchers. They should just need to worry about getting their message handled.

The Goal: Keep it simple

What I really wanted to provide was a very simple fluent API for creating a CloudQueue and a message consumer around that CloudQueue. Creating a consumer is simply a matter of providing a type, a queue and a message handler, and starting it up.

var consumer = Builder
    .CreateAConsumerFor<Entity>()
    .UsingQueue(queue)
    .ThatHandlesMessagesUsing((msg) => 
        { 
            Console.WriteLine($"Processing {msg.Value.Property}"); 
            return true; 
        })
    .Build();

await consumer.Start();

The library is intentionally meant to simplify things. Often I’ll find myself having to scaffold something and spending way too long on the infrastructure code to support message queuing, when I should be focusing on the actual problem I’m trying to solve. That’s what this is for. It is a simple wrapper around Azure storage queues to make working with them a little easier.

However, there are lots of complicated things you may find yourself needing to do in a distributed environment: complex retry policies, complicated routing paths, pub/sub models involving topics and queues; the list goes on. If that’s the case, then perhaps you should be looking at a different technology stack (Azure Service Bus, Event Hubs, Event Grid, Kafka, NServiceBus, MuleSoft, etc.).

Below you can see some of the features that the library supports.

Features

Creating a Queue

Queues can be created by simply specifying a storage account, queue name and a type for your message payload. You can purge the queue and obtain an approximate count of messages from it. All operations are async awaitable, and all support taking a CancellationToken.

var q = await Builder
    .CreateAQueueOf<Person>()
    .ConnectedToAccount("UseDevelopmentStorage=true")
    .UsingStorageQueue("my-test-queue")
    .BuildAsync();

await q.PurgeAsync(); 

var count = await q.CountAsync();

Basic Push/Pop Operations

Basic queue operations include push, pop and peek for one or multiple messages.

var person = new Person("Eoin");
await q.PushAsync(person);


var peekedPerson = await q.PeekAsync();
var poppedPerson = await q.PopAsync();
IEnumerable<Person> peekedPeople = await q.PeekAsync(5);
IEnumerable<Person> poppedPeople = await q.PopAsync(5);

Receipted Deletes

You can also control the deletion of messages from the CloudQueue using the Get and Delete operations. Under the hood this uses pop receipts to subsequently remove the message; if the message isn’t deleted before its visibility timeout expires, it will reappear on the queue.

var wrappedMessage = await q.GetAsync();

try
{    
    //attempt to process wrappedMessage.Value
    await q.DeleteAsync(wrappedMessage);
}
catch(Exception ex)
{ 
    //message will reappear on queue after timeout    
}

Queues can also be configured to support

  • Delayed visibility of messages
  • Message TTLs
  • Visibility timeouts for dequeue events
  • Automatic rerouting of poison messages after a number of dequeue & deserialize attempts
  • Customized object serialization

The message consumer provides a simple way to asynchronously poll a queue. It supports:

  • Message Handlers
  • Failed Processing Handlers (Fallback)
  • Exception Handlers
  • Flow control for when exceptions occur (Exit or Continue)
  • Custom Queue Polling Policies
  • Integration with NLog for Logging

And there’s more detailed info in the Github repo README.md.

Get It on Nuget

I’d really appreciate feedback on it, so if you want to try it out, you can get it on NuGet.

Eoin Campbell


Recently I had the displeasure of WordPress screwing up on me yet again. I’ve been paying a small but measurable amount to a colo based in Ireland for about the past 8 years to provide a small VPS running WordPress. At the time I set it up, it was great. But times have changed. The VPS is underpowered. I don’t need a remote Windows box anymore, with Azure MPN/Dev subscriptions giving me all the free .NET hosting I need. And WordPress is a slow, hulking mess. I really don’t need a big kludge of a CMS running on MySQL for a personal blog with minimal traffic.

Jekyll

After a little bit of research, I decided to give Jekyll a try. Jekyll is a simple, blog-aware, static site generator built with Ruby. It also happens to be the engine behind GitHub Pages. So you can set up your Jekyll blog, upload it to your GitHub Pages repo in your GitHub account, and GitHub will auto-generate your site for you.

Ruby Environment

In order to build and run Jekyll locally, you’ll need access to a Ruby dev environment. The general consensus from my bit of research was to do this on a Linux distro. Thankfully this is much easier for a Windows nerd like me with the new Windows Subsystem for Linux. Simply enable WSL from PowerShell and then install your preferred distro from the Microsoft Store. I decided to install Ubuntu.

#Enable Windows Subsystem for Linux
Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Windows-Subsystem-Linux
#Reboot

Once installed I needed to run some package updates and upgrades. This will be slightly different for each distribution.

sudo apt-get update
sudo apt-get upgrade
sudo apt-get install ruby-full build-essential 
gem install bundler jekyll

Github Pages

Once you have jekyll and bundler installed, go ahead and create a GitHub Pages repo. I created mine at eoincampbell.github.io. The name of this repo is your GitHub username followed by .github.io. You can browse the folder structure of my site here to get a sense of the content/folder structure.

Next, check out the empty repo to your local machine, and run the jekyll new command in it.

jekyll new your-site-name
bundle exec jekyll serve

There’s tonnes of information on Jekyll, and how to configure and theme it, on their site.

Migrating Wordpress Comments

Before shutting down my WordPress account, I wanted to retain the comment history from my old site. Jekyll has built-in comment/discussion support using Disqus.

You can export all your WordPress content by logging into your WordPress admin section and running the Export tool under the Tools menu. You should select to export all content, including posts, pages and comments.

Next, create an account on Disqus and log in to the Disqus Importer. Here you’ll be able to upload your WordPress XML data dump. Comments are stored against your domain name and post URLs. That means that so long as your new site has the same domain and same permalinks, your comments will just appear.

Migrating Wordpress Posts

Finally, you’ll want to convert your old wordpress blog posts into markdown files.

Thankfully this is a Ruby one-liner (though you may need to install the gem first).

ruby -rubygems -e 
    'require "jekyll/migrators/wordpressdotcom"; 
    Jekyll::WordpressDotCom.process("/path/to/wordpress.xml")'

Gotchas

The whole process was pretty painless to be honest, but I did get caught out by a couple of issues, mostly due to the fact that it’s a long time since I dabbled with Linux and it kinda has some sharp corners.

There’s a good article here on how to install Ruby on Ubuntu. Originally I missed adding the system gems to my path, which caused some headaches.

Another issue I ran into had to do with some dependencies after changing from the default Jekyll gem to the GitHub Pages version. It looked for a dependency called nokogiri which was missing a further dependency called Zlib, and I needed to install Zlib manually via apt-get.

Finally, although the Jekyll Markdown format supports HTML, you’ll probably find that you need to do a pass over your old posts to clean them up a little and get the formatting just right.

Have fun

~Eoin Campbell

For the last 6 months I've been working as a software architect for ChannelSight here in Dublin. I'm really enjoying it, and I'm lucky in that I have a fairly broad remit here. But what is "Architecture"? What do I do? If you want to be an architect:

  • What does that career path look like?
  • What skills do you need?
  • What can you expect?

In interviews with senior developers, they'll often mention that a role as an architect is a career goal for them, but when asked how they're trying to achieve that goal, they flounder.

I've held roles with junior, mid-level & senior development titles. Then I moved into lead & management roles with a mix of architecture, project management and people management. I think this is a natural progression for developers. The opportunities for progression for a senior developer might involve staying in development, moving into architecture, taking a managerial position, or some combination of the three.

Personally, I wanted to progress along a technical path. Most importantly I enjoy architecture. I like designing things, doing research, coming up with ideas and overseeing the creation of new software.

What's in a Job Title?

My full job title is Senior Technical Architect. That title will mean different things to different people. Some might call it Software/Solution Architect or Technical Lead or something else. Depending on the software/products you build, there might be slight discrepancies in what the job entails. Depending on company size, the responsibilities in the role may differ.

I've only worked in SMEs with 20-60 employees, so the architecture roles I've held have also had tendrils into client engagement, requirements, team leading, project management and hiring. In a larger organisation, an Architect's role might be more focused on designing the software.

So what can you expect as a Software Architect?

You need to be able to design software

Software Architecture

An obvious one, but you need to be able to design software. This might mean just the software itself, or the systems and infrastructure that go with it. Ultimately, you need the capability to look at a problem and think up an appropriate software solution for it.

  • Do you understand the business and domain problem?
  • Do you know the software & architectural patterns that can be applied?
  • Do you have an appreciation for the functional and non-functional requirements you need to satisfy?

Sometimes the answer to these questions will be "No" and you'll start from a position of disadvantage. Part of the challenge is being able to figure out those answers.

You'll need to draw on your experiences in previous projects. You'll need to do research. And ultimately when you've pulled all the various pieces together, you need to assemble them into a design that meets the needs of your client or company.

You are responsible

Not to say you don't already have responsibilities, but as an architect there comes an additional level of responsibility, and the buck stops with you. You are responsible for the designs you put forward. If you propose a solution that won't work, and your team builds that solution, then the responsibility for its failure is yours.

In smaller companies, you may have other responsibilities too. You may have to:

  • ... engage with clients, and thus are responsible for the company's image.
  • ... participate in hiring and are in part responsible for the composition and capability of the development team.
  • ... run training and up-skill workshops.

With more seniority, comes more responsibility and you need to step up.

You need to make time for thinking

Thinking Time

There will always be an amount of straightforward, humdrum project work involving good old n-tier architectures that most developers are pretty comfortable with. But for other projects, I need to put time aside to get my head around the task. I need time to think.

I use mind-maps to brainstorm my thoughts. I eke out space in my day (sometimes out of hours) with a whiteboard or a notepad and scribble things down. You should bring others into this thinking time as well. Some of the best solutions I've come up with in the past weren't a single eureka moment on my own. They happened with two or three smart people in the room with me, as we all collaboratively came up with a solution that we could confidently stand over.

It's helpful to have some resources to hand too. Having a copy of the Gang of Four: Design Patterns on your (e-)book shelf is a good idea. Since I work on the .NET stack and Azure Platform, knowing the .NET tools and frameworks, and being able to refer back to Cloud Architecture Patterns is helpful also.

You need to be a good communicator

As an architect, expect to have to talk... a lot. Pre-design, you'll talk to clients, business analysts and senior management to get the information you need. After you've created a design, you need to communicate it to everyone; first to get buy-in from the stakeholders, and second so that your team understands what they've been tasked to build. You'll attend a lot more meetings (workshops, requirements, reviews etc...) and be expected to engage with a much broader group in your company.

You'll present more often; sometimes a pre-prepared PowerPoint deck to the business, pitching your ideas; other times presenting to tech staff, explaining the solutions there and then on the whiteboard.

You'll need to explain your ideas in varying levels of complexity/detail. While talking with developers, those conversations might be technical, getting into the nitty-gritty of a design. Other times, you may have to take an Explain-It-Like-I'm-5 (ELI5) approach with business colleagues, who don't have the same technical grasp, but need a higher level understanding.

Finally, you need to be comfortable fighting your corner. There is a balance to be struck between a pragmatic architecture (avoiding over-architecture and unneeded complexity) and ensuring it is fit for purpose (avoiding a sub-par solution due to some other constraint). You must be comfortable having these discussions and be able to make your case with facts and evidence to back it up.

You will write more documents and less code

A big part of that process is getting your designs and ideas across to other people. You'll spend more time writing docs and less time writing code. My role now involves writing specifications, drawing UML diagrams, typing up Jira Backlog Items and filling out Confluence documentation so that others have the information they need. This is at the expense of getting to write code on a day to day basis myself.

As a result, some ring rust has been creeping into my capabilities as a developer. When I do have the luxury to write code, it takes me longer to do things than it did in the past, and because of time constraints and priority juggling, it's very rare I can get into a flow.

You need to be able to write requirements

You should be comfortable capturing & writing requirements. In a larger organisation, the responsibility to write a Business/User Requirements Specification document might be the job of a dedicated business analyst/project manager. Or it might fall upon you. You might contribute to functional requirements specs, or document out use-cases or user-stories. You need to understand how to capture non-functional requirements. On a brown-field project, you may need to research what's there already and conduct a gap analysis.

At later stages in the design phase, you'll need to be comfortable creating a software design document or technical specification that explains the technicalities of what you plan to build and how it will satisfy those business/functional requirements.

Fast Cheap Good

You need to know how to frame the requirements and how to present your proposed solution to the stakeholders. There may be many approaches that could be taken. The design you put forward will affect where in the Good-Fast-Cheap triangle the implementation lands. If there are time, cost or quality considerations to take into account, these need to fit into your thinking and design rationale.

If your architecture role also carries some PM responsibilities, you may need to go back to stakeholders with recommendations, some of which might not be popular. Perhaps it's just not possible to meet their requirements within the constraints they've set out, and a prioritization approach like MoSCoW (Must-Should-Could-Won't) may be necessary to decide what requirements will and won't be met during the (initial) implementation phase.

You need to let go

Something I find challenging is having to take a step back from the implementation and let others get on with it. If you've come from a senior dev background, you've probably contributed to both design and implementation. You may have strong opinions on what the code/implementation should look like. But that might not be your job anymore. You can still be involved, maybe in an oversight capacity, maybe guiding more junior developers on how to do something, taking part in code reviews, or contributing to some small part of the implementation if you're lucky.

For the most part, you won't have the time and bandwidth to embed yourself in the day to day implementation. You need to let go and hand the implementation reins over to the developers. And you need to trust that those developers are capable of delivering your vision for the solution.

This might make you feel like you're becoming a jack-of-all-trades and master of none. But that's OK. Your job as architect is to focus on the big picture. To see the overall solution.

You need to be able to self-improve and do research

As a closing thought, you need to find a way to improve as an architect. The software development landscape is constantly evolving. New technologies are constantly introduced. New methodologies will gain and fall out of favor. New ecosystems will emerge (mobile, wearables, IoT etc...) and new regulations will appear (GDPR and ePrivacy).

If you're lucky, your organisation may have a way to support you. Perhaps there's an architecture team/board where new ideas can be discussed. Or perhaps there's dedicated time put aside for R&D. But in smaller companies, you may be seen as the "most senior developer" and not have anyone else in your immediate circle of colleagues that you can learn from on a day to day basis.

You'll need to find a way to pull yourself up by your own bootstraps. That may involve:

  • reading books & blog posts
  • listening to podcasts
  • finding time in your professional or personal life to try out new things
  • testing and coding with new technologies and frameworks

If you don't have someone in your company that you can go to, then perhaps you can find a mentor outside: an ex-colleague or friend who's further along their career path and can guide you, or at least act as a sounding board.

There are membership communities like the International Association of Software Architects where you can find further material on software architecture. There are formal training and certification paths like the TOGAF Certification where you can get a recognized qualification as an enterprise architect. And within the Microsoft world, there are MCA (Microsoft Certified Architect) paths that you can take, and several Application/Solution level certifications like Architecting Microsoft Azure Solutions.

Good Luck

 

~Eoin Campbell