Enablers, Not Doers

How do you run effective Platform Engineering teams?

All organizations have Platform Engineering teams in one form or another; these are centralized engineering teams providing building blocks for other engineering groups within the company. The customers of these teams are internal engineers, not the end users of the product.

For Platform Engineering teams to be effective, do not strongly couple them with the other engineering groups. Even though these are centralized engineering teams, they should operate in a decentralized manner. Platform Engineering teams should follow the mantra of being “loosely coupled but strongly aligned” with the other engineering teams. To achieve this, view Platform Engineering teams as enablers, not as doers.

The build engineering team creates tools and resources so that any team in the organization can deploy their builds to production, i.e., the build engineering team enables you to deploy builds using the tools they create; they do not deploy the build for you.

The performance engineering team creates tools and frameworks for you to identify performance bottlenecks, i.e., the performance engineering team enables you to identify performance bottlenecks using the tools they build; they do not identify performance bottlenecks for you.

The security engineering team creates tools and libraries for you to identify security loopholes, i.e., the security engineering team enables you to identify security loopholes using the tools they build; they do not identify security loopholes for you.

This is how organizations should mold and communicate the role of Platform Engineering teams. Positioning the Platform Engineering teams as doers instead of enablers makes them the bottleneck for other teams to get their work done.

Other teams rely on Platform Engineering teams for their success. You do not want them second-guessing whether the Platform Engineering teams are doing their job or not. They need to trust the Platform Engineering teams with their critical workloads. Without trust, everything breaks, leading to unnecessary back and forth, sapping the energy of all parties involved.

To build this culture of trust, Platform Engineering teams should focus on observability, performance, and stability.

Lack of observability creates anxiety. If the build engineering team does not provide a dashboard where one can see the progress and history of builds and set alerts on build failures, teams will be anxious about their builds. They will keep pestering the build engineering team, causing stress all around. Observability is a must for creating a culture of zero stress and anxiety.

Documentation is a subset of observability. Once the Platform Engineering team releases a tool or an API, others should be able to use them without intervention from the Platform Engineering team. To achieve this, Platform Engineering teams should focus on clear and concise documentation.

Since all teams use the tools and APIs of Platform Engineering teams, performance improvements have a multiplicative effect.

A build tool that is not working brings the engineering org to a standstill. A login that is not working negatively affects all parts of the application. Hence, a keen focus on stability is a must. These are the foundations on which other engineering teams build their features.

Due to the nature of their work, consulting, guiding, and spreading best practices become part and parcel of the day-to-day responsibilities of Platform Engineering teams; this is underappreciated, but Platform Engineering teams spend a significant chunk of their time on it. Look for ways to institutionalize these activities so that the Platform Engineering teams stay engaged in their core work. As discussed earlier, focussing on observability, performance, and stability goes a long way towards this.

Feature prioritization can become a challenge for Platform Engineering teams as everyone comes to them with feature requests. A simple yardstick to use is – If we do not release this functionality, is there a way for the requesting team to go about their work, albeit in a roundabout manner? If the answer is yes, then it is not a burning problem. If not, you need to figure out a way to get the feature out as soon as possible.

Platform Engineering teams should adopt a broad perspective when developing features. Do not develop features for a particular team. Think of how the feature relates to other teams and design it so that everyone in the organization can leverage the feature.

If you want to build a culture of speed and rapid iteration, viewing Platform Engineering teams as enablers and not as doers is critical.

Get articles on coding, software and product development, managing software teams, scaling organisations and enhancing productivity by subscribing to my blog

Image by Susbany from Pixabay

 

Optimists, Pessimists, and Better Coders

Whether one is an optimist or a pessimist is dictated by genetics. Wise people say that you can give your genetics a run for their money by making happiness a conscious choice and learning to be deliberately happy.

What does this have to do with coding?

How do you decide whether code is good or not?
There are many parameters, but a definitive indicator is the number of bugs – the fewer the bugs, the better the code.

What is the definition of a bug?
A bug is a problem with the code, which makes it not work correctly.

Why do bugs occur?
The person who authored the code did not anticipate that particular condition, and the code does not know how to handle that situation.

How does one prevent bugs?
Anticipate all that can go wrong and take care of them while coding.

To do this, one needs to take a bleak look at things – exhaustively think of all that can go wrong and take care of it. In short, you need to wear a pessimist’s cap.

Then, do pessimists make better coders?

There is another way to look at this.

To create something, you need to be an optimist; coding is about creating something new. To write code devoid of bugs, you need to take care of edge cases and boundary conditions and account for them.

Is a better coder someone who can balance optimism and pessimism?

Photo by Christophe Hautier on Unsplash

NOT – Not Only Testing

How do you create a quality assurance strategy? 

Is quality assurance the same as testing?

Companies nowadays do not have a separate quality assurance (QA) team. People coming from the old world find this difficult to digest. In the last decade, the way of developing, deploying, and monitoring applications has changed.

Gone are the days of the waterfall model of development. You no longer take months to develop a feature, follow it with a long testing cycle, and finally hand it over to the production team for deployment. Release cycles have shrunk from months to hours. People who build the application are responsible for deploying, running, and monitoring it.

In this brave new world, base your quality assurance strategy on:

  1. What is the cost of a bug in production?
  2. How quickly can one detect a bug in production?
  3. How quickly can one fix a production bug?
  4. If a bug manifests in production, how does one limit the damage?


The cost of bugs is not uniform; bugs in different areas of the application cost differently. The line of business also dictates the price of bugs.

For an aerospace or medical devices company, a bug can literally mean the difference between life and death; that is not so for a social network or an e-commerce website. If an e-commerce website does not load, it significantly hits the company’s bottom line. For a social network, an outage might make an insignificant dent in advertising revenue. Within an e-commerce company, the cost of a bug in the recently purchased items section on the home page is not the same as the cost of checkout from the shopping cart not working. While your shopping cart checkout has to be bulletproof, you can be lenient with the recently purchased items section on the home page.

Vary quality control based on the criticality of the functionality. Be tight where it needs to be and loose where you have room to wiggle. You need not have a uniform quality assurance strategy for the entire application.

Testing dents your speed and time to production irrespective of whether you do it manually or automate it; this is the cost you pay for quality, and you need to make peace with it. The stricter you get, the costlier it becomes. Automated testing is not free; it too has a cost, namely more things to maintain and manage. In some cases, you might end up writing twice the code due to automated testing. I am not arguing against automated testing but asking you to factor in and mentally accept the cost before you take the plunge.


There is a tendency to equate testing with quality assurance. Testing is a subset of quality assurance; testing is NOT quality assurance. Quality assurance is much more than JUST testing and comprises a variety of things.

Today, experimentation, incremental development, and speed are of the essence. Try a small idea, see whether it sticks, and then rapidly iterate and expand. In such an environment, following a design philosophy that gives you room for error and factors in bugs and things not working is key.

Quick detection of production bugs rests on the observability built into the application. Speedy recovery from production bugs is a function of deployment practices. Reducing the impact of production bugs follows how you roll out features. All these are a result of tooling, development practices, and engineering culture. Today, these are as important as vanilla testing – manual or automated.
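One common way to limit the damage of a production bug is to roll a feature out to a small percentage of users first. The sketch below is only an illustration of that idea, not a prescribed implementation; the function name `in_rollout` and its parameters are hypothetical and do not come from any particular feature-flag library.

```python
import hashlib


def in_rollout(user_id: str, feature: str, percent: int) -> bool:
    """Return True if this user falls inside the rollout percentage.

    Hypothetical helper: hashing the feature name together with the
    user ID gives every feature its own stable slice of users.
    """
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in the range 0..99
    return bucket < percent


if __name__ == "__main__":
    # Start with 5% of users; raise the number as confidence grows.
    for user in ("alice", "bob", "carol"):
        print(user, in_rollout(user, "new-checkout", 5))
```

Because the bucket is derived from a hash rather than a random draw, a user who sees the feature at 5% keeps seeing it at 20%, and setting the percentage back to zero switches the feature off for everyone without a redeploy.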

The ease with which you can set up a development environment for the application has a direct bearing on the quality of the product. In a world of micro-services and external dependencies, setting up a development environment can get complicated with a lot of moving parts. If you make it difficult for developers to create a robust development environment, the quality of the product takes a hit.

Static code analysis and enforcing best practices through tooling like pre-commit hooks improve the quality of the code, which directly improves the quality of the application. This is especially important when you use languages that are promiscuous with typing; it is one of those things where the cost is low, resistance is nil, and the rewards are high. Always be on the lookout for tools and processes where you get a better quality product with zero resistance and impediment to speed.
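To give a feel for how cheap such a check can be, here is a minimal sketch of a static check that flags Python functions lacking type annotations, using only the standard library `ast` module. The function name `unannotated_functions` is made up for this illustration; in practice you would more likely wire an existing linter or type checker into your pre-commit hook than write your own.

```python
import ast


def unannotated_functions(source: str) -> list[str]:
    """Return the sorted names of functions missing type annotations.

    Illustrative only: checks positional and keyword-only parameters
    and the return type, and ignores *args/**kwargs, self, and cls.
    """
    tree = ast.parse(source)
    missing = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            params = node.args.posonlyargs + node.args.args + node.args.kwonlyargs
            untyped = [
                p.arg
                for p in params
                if p.annotation is None and p.arg not in ("self", "cls")
            ]
            if untyped or node.returns is None:
                missing.append(node.name)
    return sorted(missing)


if __name__ == "__main__":
    sample = "def add(a, b):\n    return a + b\n"
    print(unannotated_functions(sample))
```

A hook built around a check like this would fail the commit whenever the returned list is non-empty, so the feedback arrives before the code ever reaches review.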

Adopting asynchronous, event-based architecture and design patterns gives room for error and recovery. In today’s fast-paced environment where you do not have the luxury of time to test every minute aspect of the product, this is a boon.
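As a rough sketch of why event-based designs leave room for recovery: when work arrives as events on a queue, a failed event can be put back and retried instead of failing the whole request. The example below uses an in-memory queue purely for illustration; `process_events`, `max_attempts`, and the dead-letter list are assumptions of this sketch, and a real system would lean on a message broker that offers retries and dead-letter queues.

```python
from collections import deque
from typing import Callable, Iterable


def process_events(
    events: Iterable[str],
    handler: Callable[[str], None],
    max_attempts: int = 3,
) -> list[str]:
    """Run handler on each event, retrying failures up to max_attempts.

    Returns the events that still failed after the last attempt
    (the "dead letters"), so they can be inspected and replayed later.
    """
    pending = deque((event, 1) for event in events)
    dead_letters = []
    while pending:
        event, attempt = pending.popleft()
        try:
            handler(event)
        except Exception:
            if attempt < max_attempts:
                # Re-queue at the back so other events are not blocked.
                pending.append((event, attempt + 1))
            else:
                dead_letters.append(event)
    return dead_letters
```

A bug in the handler then shows up as a growing dead-letter list rather than lost work; once the fix is deployed, the same events can be fed back through `process_events`.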

Do not have tunnel vision when it comes to quality – do not restrict it to only testing; adopt the “NOT – Not Only Testing” strategy.

Comic from XKCD.

Photo by Austin Neill on Unsplash

Reflection on AWS re:Invent

AWS re:Invent is an annual event hosted by Amazon in Las Vegas. It is a celebration of all things AWS, as well as an opportunity to advance one’s AWS skills and meet the teams behind the various AWS services.

Approximately 65,000 people attended the event this year, and I was one of them thanks to my employer Goibibo. It was a 5-day affair. I have been using AWS since 2012, have seen the platform grow from a couple of services to what it is today, but this was my first time at re:Invent.

Reflections on re:Invent follow.

re:Invent has the right mixture of fun and seriousness – multiple parties, tech talks, after hours with food and drinks, fun games, product launches, certification booths, and networking events.

The event takes place in multiple hotels (casinos) on the Las Vegas Strip. AWS ties up with the hotels on the Strip, which results in discounted rates for the stay. This sweet deal closes as re:Invent nears, so plan your attendance early. Staying close to the venues is highly recommended as the day starts early and ends late.

AWS has free shuttles running from these designated hotels to the conference venues. Another reason to stay in the chosen hotels. These shuttles also run between the various conference sites, but it does take time to move from one to the other. Casinos are like a maze; it is easy to get lost. Getting in and out of the casinos to the shuttle parking lot itself takes some practice. By the time you have perfected this art, the conference ends. AWS stations many people in the casinos to guide and help with directions and information, but still, it eats up quite a bit of time.

Numerous tech talks run in parallel across multiple venues at all the hotels. Talks cover a wide variety of subjects: battle stories (victories, defeats, and lessons from AWS users), the internals of AWS services, best practices, and architecture sessions. As AI/ML are the current darlings of the tech world, there was a separate track dedicated to AI/ML.

AWS was heavily promoting serverless; there was a lot of emphasis on serverless throughout the event. Some of the vintage AWS products like SQS, SNS, and DynamoDB have been re-branded under the serverless moniker.

AWS publishes the schedule for the talks in advance. There is a re:Invent app which has all the details of the event. In the app, you can reserve seats for the talks. If you want assured seating, you need to reserve. Sessions do fill up fast, and the reservation closes as re:Invent nears. All talks have a line for the people who have booked in advance and a walk-in queue. Advance bookers get preferential treatment. The walk-in line is at the mercy of seats being empty. There are repeats of the talks on subsequent days. Sessions are also live-streamed; there is a dedicated area for viewing this on the big screen.

If you are serious about attending as many talks as possible, cluster your talks in one venue and then move to the other. Commuting from one place to the other takes time.

Another highlight of re:Invent is the keynotes from AWS stalwarts like Andy Jassy, Werner Vogels, etc. No walk-ins are allowed for the keynotes; you need to reserve your place in advance. Keynote seating fills up fast and closes as re:Invent gets closer.

There is an expo area where all vendor companies set up stalls and market their products with demos, brochures, and goodies. Expo is a fantastic opportunity to get a glimpse of all the tools available in the market today. The landscape this year was broadly divided into – APMs, logging solutions, and security products; data lake, data visualization, ETL, data warehouses, and data governance tools; SRE and DevOps tools. Even though AI/ML was the buzzword in all of these, surprisingly, there were not too many products catering to AI/ML specifically.

Vendors host multiple after-parties at various bars, restaurants, and nightclubs. One such event was the Sumo Slam, a sumo wrestling match organized by Sumo Logic in association with other companies. The venue for the Sumo Slam was Omnia, a prominent nightclub in Vegas.

There are multiple hands-on sessions where you can hone your AWS skills and also earn AWS certifications. There were DeepRacer League matches too, which I was keen to attend but could not.

The primary venue has a festive atmosphere throughout the conference days – DJs spinning out EDM and live bands performing. There was a mechanical bull ride to test your cowboy skills.

re:Invent started with a kick-off party on Sunday night and ended with a grand closing party on Thursday night. The kick-off party had a product launch (AWS DeepComposer), multiple music and dance shows, food and drinks, marching bands, roller skaters, impromptu dance troupes, and a chicken wing eating contest.

The closing party was massive and held in the Las Vegas festival grounds. It had multiple tents for different kinds of music. There was a venue for dodgeball and another site for other fun games. The EDM stage with the pulsating light show was fantastic. Thankfully, there were plenty of vegetarian options and a good selection of drinks. The highlight was a drone show organized by Intel. After witnessing the show, I believe we are not far from a future where an army of synchronized pulsating drones will replace fireworks.

It was an enjoyable week of broadening my horizons while having loads of fun. Amazon has done a fantastic job in organizing the event.

Generalization – The Superpower

I was reading this Twitter thread on Ben Horowitz’s new book on culture. The book’s content is apparent to anyone who has spent time in a corporate setup. I have been listening to the audiobook “Zen: The Art of Simple Living.” Again, the content is not radically new; it is something you would already know instinctively. Of late, there has been a spurt of Twitter accounts dishing out wisdom. Most of the tweets seem to be a regurgitation of common sense.

Am I trying to say that they do not add any value? No, the opposite. They are doing an excellent service by codifying common sense into pithy one-liners, simple rules, and principles.

All these people are generalizing lessons learned from specific instances into a broader set of rules and guidelines which apply to more expansive areas of life.

What do I mean by the above? Let me explain with an example.

Let us say that a company advertises: “Deposit your money with us for two years for a guaranteed return of 15%.” The bond yield rate is 8%. You see this scheme as attractive and invest. After a year, the company owner absconds, taking your money with her.

How do you generalize the lesson from this misadventure?

Be skeptical of any scheme (gold, real estate, etc.) that promises GUARANTEED returns over and above the current bond yield rate.

I believe generalizing lessons learned from specific situations to a much broader arena of life is a superpower which everyone should develop. Get into the habit of doing this for everything. If you followed a particular process that led you to success, try to make this process generic so that you can apply it elsewhere too.

Generalizing wisdom helps in pattern matching as well as molding the way you think. You begin to see patterns in your thoughts and actions where you can apply the principles created out of past experiences.

This habit sharpens decision making. You start seeing patterns in decision-making scenarios and can base your decision on a previously created rule.

You may not be able to create rules out of everything – creating a step-by-step guide on how to balance a bicycle is next to impossible. Wherever it is possible, do it; for the rest, train your instincts and let your intuition take care of it.

Photo by Mpho Mojapelo on Unsplash

Critique of Critiques of Daily Standups

On Hacker News, I read yet another write-up on daily standups and how they suck. Periodically, a post pops up on the daily standup and how it is a nuisance. This entry of mine is an attempt at imparting the importance of daily standups and how they add value. We will also look at some of the common objections to daily standups and why they hold no water.

All posts on daily standups have a fundamental problem. They shy away from tackling the elephant in the room. I am also guilty of this in my take on daily standups. At some level, meetings are a forced collaboration attempt. In an ideal Utopian world, where everyone excels in collaboration and communication, we would not need meetings. Sadly that world does not exist.

Now that we have addressed the uncomfortable truth, let us go deep into why daily standups are essential.

Timely and efficient communication and collaboration can make or break teams. Humans vary widely when it comes to communication; some excel at it, and some are bad at it. Sometimes, you might genuinely not know that you need to communicate or that you are a blocker to someone’s work.

A primary reason for project failures is unmet dependencies and someone not planning for them. When you get people together and create a platform for them to discuss and collaborate, blockers and dependencies which would have gone unsaid otherwise surface.

How many times has it happened that someone raises a red flag on the day a project is supposed to go live? A daily standup ensures a constant feedback loop wherein this does not come as a last-minute surprise.

Daily standups ensure that everyone in the team knows what their counterparts are working on; this prevents people from becoming islands and ensures everyone knows the big picture.

As an organization, how do you develop this habit?

One of the paradoxes in life is that rules set you free, help build good habits, and reduce cognitive overload. Giants in the field of behavioral psychology, such as Dan Ariely, Daniel Kahneman, and B. F. Skinner, all support this. The general prescription to start a good habit or break a bad one is to create a strict set of rules.

Another trick to aid good habits is to design your environment to support the practice; remove obstacles that prevent you from getting into the said habit. Putting it succinctly, make it easy to start and sustain a habit. Charles Duhigg and James Clear have written books on this.

Scheduling daily standups at a specific time with a well-understood format does both the above; it makes it easy and creates an environment for teams to cultivate the habit of collaboration and communication.

People who rally against daily standups tend to be:

  • Great at communication and proactive about it.
  • Individual contributors who excel at their work.

These people operate on individual bits of information and do not see the entire picture. From their narrow perspective, they are correct, but modern workplaces are not only about individual brilliance but more to do with teamwork. An automobile will not function unless all the parts work in tandem; the same goes for a team.

The majority of people do not know how to run efficient meetings; as a result, people have developed an aversion to meetings. This general distaste towards meetings has given the daily standup a lousy reputation.

Paul Graham talks about the maker’s schedule and the manager’s schedule and how it is paramount that makers get a long uninterrupted chunk of time to create things. To avoid the context switch, schedule the daily standup at the start of the day before everyone gets immersed in their work.

Most workplaces are chaotic. The daily standup gives you the means to bring order to the chaos.

Photo by Paulo Carrolo on Unsplash

On Competition

I believe keeping an eye on the competition is a good idea. Keeping track of the competition makes you aware of what the new normal is; it helps one gauge current trends. If your product experience deviates from the prevailing standards, it might be time for a re-think.

When a behemoth does something well regularly, they create an impression that that is the new normal. Customers start expecting the same experience from everyone in the field. For example, Amazon keeps upping the ante in e-commerce. If you are a small boutique e-commerce firm, and if you are not close to the Amazon experience, you might be leaving a lot on the table.

Most of the new-age enterprise SaaS tools have user experience on par with consumer applications. Earlier, enterprise tools used to be leaps and bounds behind their consumer counterparts. After using these new-age tools, products from some of the established behemoths look and feel clunky. Using them feels like being teleported to an earlier era. If you are an entrenched behemoth, you can get this wake-up call only if you regularly scan your competition, be it big or small.

Eyeing competition also matters when it comes to feature selection. For example, in Slack, you can edit a message after sending. I have hardly seen anyone modifying messages in Slack post sending. Usually, one sends a new message suffixing an * indicating it is an edit of a previous message. Why? We have been conditioned by popular chat applications not to alter chat messages once we hit the send button. None of the popular consumer chat applications have this feature. If you are bucking the trend – first, you should know of this; second, you should figure out how to educate your users to use the nonintuitive feature.

Incumbents also set standards when it comes to UI patterns. If Facebook shows error messages with a red background, you can be sure that most of the world’s population associates a pop-up with a red background as an error. It makes sense for you too to follow this.

I am not advocating aping the competition blindly.

Keeping track of the competition is essential to:
1. Know what the new normal is.
2. Know what might become the new normal.
3. Gauge how far off you are from the status quo.

Image by Sasin Tipchai from Pixabay