Weekly Bullet #12 – Summary for the week

Hi All !

Here is the weekly summary of Technical / Non-Technical topics that I found very useful.

Technical:

  • Three scientists published a paper showing, with simulations written in Python, that Mercury, not Venus, is the closest planet to Earth on average. Check out the amazing visualizations they built with Python. Article – link. Video – link.
  • What happens behind the scenes when you run a search on Google? – “How web works”.
  • Here are a bunch of great qualities that every senior engineer (pre-manager / manager) should ideally possess. Check out – “What are the signs that you have a great manager?”
  • One of my mentors always says this: “You are paid for your thinking and problem-solving abilities”. Here is a great compilation of websites that will help you hone these skills. – “A list of all problem solving websites”.

Non-Technical:

  • Here is another great reading list. Note: the books are mainly non-fiction / programming related. – “Popular reading lists”.
  • This year marks the 50th anniversary of the first ever Moon landing. Here is a super-cool website to relive the Apollo 11 mission! – “Apollo 11 in Real Time”. Click on the Join-in Progress button.
  • A Wikipedia compilation of common misconceptions across Art & Culture, History and Science. Fun read. – “List of common misconceptions”.
  • An extract from the book that I am reading:

“A person’s success in life can usually be measured by the number of uncomfortable conversations he or she is willing to have.”

Brene Brown

Weekly Bullet #10 – Summary for the week

Hi All !

Here is the weekly summary of Technical / Non-Technical topics that I found very useful.

Technical:

Non-Technical:

  • Books recommended by over 100 founders and makers in tech. “Rework” is my all-time favorite from the list. – “Founder Books”.
  • I have written before about spaced repetition and the Anki tool for practicing it. It does wonders, and here is a write-up on it. – “Tips for using Anki and Spaced Repetition in 2019”.
  • A map of the US where city names are replaced by their most Wikipedia’ed resident. Try zooming in and out. – “A People Map of the US”.
  • An extract from the book that I am reading:

“Khaled Hosseini wrote The Kite Runner in the early mornings before working as a full-time doctor. Paul Levesque (page 128) often works out at midnight. If it’s truly important, schedule it. As Paul might ask you, “Is that a dream or a goal?” If it isn’t on the calendar, it isn’t real.”

Brian Koppelman

Performance Bottleneck : High CPU Utilization vs High CPU Saturation

This article is about a performance scenario that I found myself in a few days ago, and my thought process around it. It covers a situation where a Performance Engineer has to weigh the impact of CPU Saturation and not just CPU Utilization.

Scenario:

I was testing the Horizontal Scaling efficiency of an AWS EC2 instance, and at some points I was seeing low CPU utilization but high CPU Saturation (high load averages).
Should I spin up a new AWS instance because the CPU is saturated, even though CPU utilization (CPU % usage) is low?

Thought process:

More often than not, we scale horizontally by adding one more server instance based on CPU % utilization. Say, if CPU utilization reaches 50 – 60%, add one more instance.

But what about CPU Saturation? Should we also scale when the CPU is saturated but the utilization is low (say 40%)?
To answer that, it is important to understand the meaning of “CPU Saturation”.

Let’s say that the system under test is a 4 core box. We will say that the system is saturated if:
– the load average (first line of the top command output) climbs to a value well above 4, the number of cores on the box.
– the load average stays at that high value for a long time.
– a large number of tasks are queued for CPU time or blocked. (run the command: dstat -p)
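
The first two checks above can be sketched in Python. This is only a rough illustration: the function names, the sample format and the 120 second default are my own assumptions, not part of any tool. Only os.getloadavg() and os.cpu_count() come from the Python standard library (os.getloadavg() is available on Unix-like systems only).

```python
import os


def current_load_and_cores():
    """Return (1-minute load average, core count) for this machine (Unix only)."""
    one_min, _five_min, _fifteen_min = os.getloadavg()
    return one_min, os.cpu_count() or 1


def longest_saturated_stretch(samples, cores):
    """Longest run, in seconds, where the load average stayed above the core count.

    samples: list of (timestamp_seconds, load_average) tuples sorted by time.
    This is a hypothetical format, e.g. collected by polling once a minute.
    """
    longest = 0.0
    start = None
    for timestamp, load in samples:
        if load > cores:
            if start is None:
                start = timestamp  # a saturated stretch begins here
            longest = max(longest, timestamp - start)
        else:
            start = None  # load dropped back, the stretch is over
    return longest


def is_saturated(samples, cores, min_seconds=120):
    """True if the load stayed above the core count for at least min_seconds."""
    return longest_saturated_stretch(samples, cores) >= min_seconds
```

For example, on a 4 core box, samples of 9, 10 and 11 taken one minute apart count as a 120 second saturated stretch, while a single spike between two quiet samples does not.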

This situation is like a supermarket that has 4 billing counters and 50 customers who want to be billed for their purchases. Since there are only 4 billing counters, 46 of them have to wait! This is Saturation.


And what will happen if the system is saturated?
– The requests will wait longer in the run queue before they get to run on the CPU.
– The overall response time of the requests will increase. Reference link.
– The corresponding CPU utilization (%) will also increase by a certain value.
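
To see why response times go up, here is a deliberately simplified model of the supermarket picture: every request arrives at the same moment, needs the same amount of CPU time, and the cores serve them first come, first served. The function names and the model itself are assumptions for illustration. Real schedulers time-slice and real requests differ in cost.

```python
def waiting_now(requests, cores):
    """Requests that cannot get a core right away (the customers in the queue)."""
    return max(0, requests - cores)


def average_response_time(requests, cores, service_time):
    """Average completion time when requests are served in batches of `cores`.

    Request i (0-based) finishes at the end of batch (i // cores + 1),
    and each batch takes `service_time`.
    """
    if requests == 0:
        return 0.0
    total = 0.0
    for i in range(requests):
        batch = i // cores + 1
        total += batch * service_time
    return total / requests
```

With 4 cores and 4 requests of 1 time unit each, nobody waits and the average response time is 1.0. With 50 requests, 46 wait and the average rises to 6.76 time units, even though each request still needs only 1 unit of CPU.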

What did I do?

  • I checked how long the CPU stayed in the saturated state, i.e. “for how long was the queue length significantly high?”, to see if this was seriously affecting the end-user experience. More on this here.
  • It was for about 4 to 7 minutes every time.
  • I tried to figure out why the requests were taking longer to run on the CPU, resulting in the increased queue length.
  • On digging further, I found that the endpoints where my writes were happening (Mongo/Kafka) were slowing down with increasing load, and that was the actual cause of the requests taking longer.
  • An important point to note here: load average is not as straightforward as it looks! On Linux, load average also includes the tasks waiting on IO. More on this here.
  • Tuning was required on the writes to the endpoints.
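
Since a high load average can come from tasks waiting on IO as well as tasks waiting for CPU, it helps to split the two. Below is a minimal Linux-only sketch that reads the state letter from /proc/<pid>/stat: R means running or runnable, D means uninterruptible sleep (usually IO). The helper names are my own, and it looks at processes rather than individual threads, so treat the counts as an approximation of what the load average sees.

```python
import glob
from collections import Counter


def task_state(stat_text):
    """Extract the one-letter state from the contents of /proc/<pid>/stat."""
    # The command name sits in parentheses and may itself contain spaces
    # or ")", so split on the last ")" and take the next field.
    return stat_text.rsplit(")", 1)[1].split()[0]


def count_states(stat_texts):
    """Count how many tasks are in each state (R, D, S, ...)."""
    return Counter(task_state(text) for text in stat_texts)


def read_stat_texts():
    """Read /proc/<pid>/stat for every process currently visible (Linux only)."""
    texts = []
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path, errors="replace") as handle:
                texts.append(handle.read())
        except OSError:
            continue  # the process exited between listing and reading
    return texts
```

Calling count_states(read_stat_texts()) a few times during a test run gives a rough idea: many R tasks point at CPU contention, many D tasks point at IO.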

Learning :

  • Next time the CPU looks saturated, check the corresponding IO on the endpoints.
  • Check whether the corresponding response times are getting worse before directly adding more CPU horsepower.
  • Also, an occasional high CPU Saturation is just fine!
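
The learnings above can be folded into a small decision helper. This is only a sketch of the reasoning, not a recommended autoscaling policy: the function name, the 60% threshold, the p95 response time target and the returned strings are all hypothetical values picked for illustration.

```python
def scaling_advice(cpu_percent, load_avg, cores, p95_ms, p95_target_ms,
                   blocked_tasks, cpu_threshold=60.0):
    """Suggest a next step from utilization, saturation and response time.

    All inputs are assumed to be measured elsewhere (e.g. top, dstat -p,
    and the load test report). The thresholds are illustrative only.
    """
    if cpu_percent >= cpu_threshold:
        # The classic utilization-based rule: the CPU really is busy.
        return "scale out"
    if p95_ms <= p95_target_ms:
        # Saturated or not, end users are not affected yet.
        return "no action"
    if load_avg > cores and blocked_tasks > 0:
        # High load with tasks blocked: look at the IO path first.
        return "check I/O on endpoints"
    return "investigate further"
```

For my scenario (utilization around 40%, load average well above 4, response times degrading, blocked tasks present) this returns "check I/O on endpoints" rather than "scale out".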

Happy tuning.