Recently I've been working on some interesting OCP deployments with Service Mesh. It is a very powerful and, I'd say, complicated subject – even for experts on the matter it doesn't seem trivial.
The context here is Istio, just to be clear – the Cloud Native Computing Foundation project. Service Mesh is basically an extension of OCP that provides customizable features. In this sense, Service Mesh adds a lot of flexibility and enables centralized control over how microservices are handled.
Features of Service Mesh include load balancing, automatic canary releases, access control, and even end-to-end authentication (via Istio mTLS). Everything in one place.
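Just to illustrate the mTLS part, here is a minimal sketch: a namespace-wide strict mTLS policy is a single PeerAuthentication resource. The namespace name below is a placeholder, and the exact apiVersion may vary with the Istio/OSSM release you run.

    # Sketch: enforce strict mTLS for all workloads in the (placeholder) my-app namespace.
    cat <<'EOF' | oc apply -f -
    apiVersion: security.istio.io/v1beta1
    kind: PeerAuthentication
    metadata:
      name: default
      namespace: my-app
    spec:
      mtls:
        mode: STRICT
    EOF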
Objectively, Service Mesh adds a transparent transport layer – all without any application change. To do that, Service Mesh captures/intercepts traffic between services and can modify, redirect, or create new requests to other services.
To do this interception/capture of requests, Service Mesh relies on the Envoy proxy running as a sidecar – an extra container deployed alongside the application container in the same pod.
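A quick sanity check that the injection actually happened is to list the containers of the pod (pod and namespace names below are placeholders) and look for the istio-proxy container next to the application one:

    # With injection enabled, the pod should show the app container plus "istio-proxy"
    # (and usually an "istio-init" init container as well).
    oc get pod my-app-pod -n my-app -o jsonpath='{.spec.containers[*].name}'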
For deployments such as JBoss EAP/WildFly this can be very interesting, to be able to control communication and establish a level of network control beyond what the services (e.g. eap-app-ping for clustering) already provide.
On the other hand, some architectures are emerging that use Istio without the sidecar – the so-called sidecarless approach. One example is Ambient Mesh. Sidecarless implementations can be useful for environments where instrumenting the pod increases its complexity (deployment and instrumentation) and where it is simply easier not to instrument the pod.
Get the namespace inspect for troubleshooting OCP issues.
One of the most useful tools in OCP, together with must-gather, is, I think, the inspect.
DevOps in OCP can be chaotic sometimes, with so many pods, operators, and other objects to keep track of. But that's exactly why the inspect can be a core tool to debug pods/services/deployments in OCP.
So to avoid it all going over your head, just get the inspect first. From there you can do a top-down approach – start with the Deployment(Config) and move to the services and pods – or bottom-up, meaning the pod YAML/logs first, and then move up to the DeploymentConfig.
The idea I'm trying to get across is to grab the inspect – via oc adm inspect ns/$namespace – so it can be a leading indicator for several issues: pod crashes, application issues, pod resource starvation. What happens if the application logs are OK, but the YAML shows the service's label is wrong?
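For reference, a minimal sketch of getting the inspect (the namespace and destination directory below are placeholders):

    # Collect the namespace resources, pod logs and events into a local directory.
    oc adm inspect ns/my-namespace --dest-dir=./inspect-my-namespace

    # For cluster-wide problems, the heavier option is a full must-gather.
    oc adm must-gather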
This avoids, for example, only looking at the pod logs and forgetting about resource allocation – in terms of CPU and memory.
It allows a more global review: pod YAML, core ConfigMaps, services, deployments, everything at once.
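Once the dump is on disk, the review can be done offline. The exact directory layout may change slightly between oc versions, but something along these lines is usually enough to start:

    # See which resources (YAML manifests) were captured in the inspect directory.
    find ./inspect-my-namespace -name '*.yaml' | sort | head -20

    # Search every captured pod log at once instead of opening pods one by one.
    grep -ri 'OutOfMemoryError' ./inspect-my-namespace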
For increasingly complex deployments, with several components – and sometimes with Service Mesh – the Istio sidecar will be inside the pod, and the user can see the sidecar containers and their access logs (configured on the SMCP – Service Mesh Control Plane). I will write some presentations on this subject.
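In the meantime, to read the Envoy access logs once they are enabled on the SMCP, something simple like this works (pod and namespace names are placeholders):

    # Tail the access logs written by the Envoy sidecar of a meshed pod.
    oc logs my-app-pod -n my-app -c istio-proxy --tail=100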
I'm sure the above is not consensus; some people will opt for getting the pod YAMLs or just the pod logs first, and only then get the inspect. But for OCP problems I start with the inspect, because I can see the complete deployment, see all pods in the namespace, and do the work once. So almost by definition you will have an overview of the data, which is much better than a narrow view of only the pod logs.
Shenandoah is also interesting because of its great performance with large heaps – we are talking about 50 GB+, for example. It is also very simple to understand: its pause times do not grow with larger heaps, and so on.
However, as I've explained before, Shenandoah (in its non-generational form) is not applicable to all situations and workloads; there will be workloads where performance is hurt more than helped by Shenandoah – given it is not generational. Being non-generational is a core part of the algorithm and helps considerably in several aspects, but it can hurt in other, more specific ones.
An example is when a high number of very short-lived objects is created in random bursts, which leads to all the GC threads kicking in at the same time and can result in several full pauses in a row. For those cases a generational collector, like G1GC or Parallel, would likely handle the situation better – by splitting the heap into generations and cleaning up the short-lived objects in cheaper young collections. For those (generational) workloads, Amazon (Corretto) is developing Generational Shenandoah.
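Just as an illustrative sketch (flag availability depends on the JDK build; app.jar and the heap size are placeholders), switching between these collectors is a matter of startup flags, and unified GC logging makes the comparison measurable:

    # Non-generational Shenandoah (older JDKs may also need -XX:+UnlockExperimentalVMOptions).
    java -XX:+UseShenandoahGC -Xmx50g -Xlog:gc*:file=gc-shenandoah.log -jar app.jar

    # Generational baseline (G1GC) for the same workload, for comparison.
    java -XX:+UseG1GC -Xmx50g -Xlog:gc*:file=gc-g1.log -jar app.jar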
In this respect, I've also seen some comments/discussions suggesting Shenandoah will eventually surpass everything and should replace G1GC/Parallel for all workloads, similar to how G1GC replaced CMS. That wouldn't be the case, given some workloads perform better with generational collectors. In this sense, Shenandoah is not necessarily an "improved" G1GC, so I won't suggest that all workloads be moved to Shenandoah.
Consequently, there needs to be due diligence from the development team to verify how a non-generational collector is behaving – in terms of latency, throughput, and last (but not least) footprint – the latter being the one most often sacrificed when developing in Java or in other garbage-collected environments.
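A minimal way to check those dimensions on a running JVM, assuming jcmd is available and <pid> is the application process:

    # Footprint: current heap usage and capacity.
    jcmd <pid> GC.heap_info

    # Confirm which collector and sizing flags are actually in effect.
    jcmd <pid> VM.flags

    # Latency/throughput: pause times and GC overhead come from the unified GC log (-Xlog:gc*).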
But this can be generalized to pretty much anything in the JVM/Java: no magic JVM flag will cut latency in half (except in very specific cases, for example where a certain collector is more adequate than another).
During the pandemic I watched a lot of Petter's content on the Mentour Pilot YouTube channel – the channel covers several aspects of aviation: deep technical, procedural, and behavioural analysis.
I can say Petter taught several important life lessons: beware of confirmation bias and prejudice, verify your assumptions, always keep learning, and trust the team.
One important thing I learned from him was the use of the PIOSEE decision model – and how pilots use this model in critical situations. I show it below:
Problem: it requires you to swiftly identify the problem at hand.
Information: gather information about the problem that is occurring.
Options: with the gathered information, you and your team generate options to solve the problem.
Select: you need to select an option after efficiently evaluating the alternatives.
Execute: options are worthless without swift and effective execution.
Evaluate: after execution, you and your team evaluate the process, noting places for improvement.
PIOSEE model – PIOSEE is similar to the FORDEC model, which has the same number of stages.
This decision-making model can be very useful in several situations and can be applied to IT troubleshooting as well – from war rooms (where an actual malfunction in systems operations takes a system offline) to upgrade and migration procedures.
First, defining the problem is the first step towards understanding it. A well-defined problem will be troubleshot much better and faster. Sometimes the problem definition can be much harder than finding the actual solution. Knowing the problem, we will know which resources (human and material/IT) are needed to solve it.
Then comes collecting the right information: it can be an inspect from OpenShift (oc adm inspect), a server report (from Infinispan), or even a few heap/thread dumps or VM.info from Java applications (deployed in Kubernetes or not). It can also mean collecting custom resources, in case we need to see the API/resources created by some Operator (Service Mesh, Data Grid Operator, MTA Operator) and so on. Knowing what data to collect for each situation will result in a much faster troubleshooting phase.
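A few of those collection commands, sketched with placeholder names and PIDs (which custom resources exist obviously depends on the Operators installed):

    # OpenShift: namespace inspect and, if the Service Mesh operator is installed, its custom resources.
    oc adm inspect ns/my-namespace --dest-dir=./inspect
    oc get servicemeshcontrolplane -n istio-system -o yaml > smcp.yaml

    # Java: thread dump, heap dump and VM.info from the running JVM.
    jcmd <pid> Thread.print > threads.txt
    jcmd <pid> GC.heap_dump /tmp/heap.hprof
    jcmd <pid> VM.info > vm-info.txt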
Later comes the analysis of the data, provided all the information has been collected – which can follow a top-down approach, from the custom resource (for example, in case an Operator is used) down to the very low level: kernel tracing data, kernel audit logs, or even specific heap-dump interpretation. This leads to the analysis of the options we have: restart, reboot, upgrade, downgrade, remove certain JVM flags, add certain JVM flags, rewrite the system.
The selection of an option, with its trade-offs, is the next step in the decision model – one needs to understand the data, interpret it, and then select the option. I think it is very important to consider two aspects at this stage: trade-offs and time to implement. If an option has too many trade-offs, other options should be considered. Once the candidate options are all listed, the selection should be made promptly.
Then the execution of the option should be done thoroughly – with the right resources following the procedures (with or without checklists), ideally tested beforehand – but sometimes the procedure is sui generis, meaning it is the first time it is happening and it might not have been tested/prepared before.
Finally, there is the evaluation of the system after the procedure – this includes observable references; in Java particularly, jcmd thread dumps and heap information will usually provide enough data. These references give clues about whether the system is performing well or not. If more information is required, more data can be collected from the system, and this process can be iterative until the (initial) problem is 100% solved.
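For the Java case above, that evaluation can be as simple as taking a couple of snapshots a few minutes apart after the change and comparing them (<pid> is a placeholder):

    # Snapshot the thread and heap state after the change; repeat and compare.
    jcmd <pid> Thread.print > threads-after.txt
    jcmd <pid> GC.heap_info > heap-after.txt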
I think trying to establish procedures, methods, and preparations for critical situations can help considerably here. In this regard, the QA/QE of a system/Java application can prevent problems – and it is very useful, if not essential, before deploying to production. Knowing in advance which procedures to follow and how long they should take can bring the system back much faster.