Playing with OCP on large projects, we see how important it is to set adequate memory resources for the application (java == jvm == planning for nominal and spike usage of memory). Less talked about, but just as important, are the CPU resources. Basically, each container running on a node consumes compute resources, and setting/adding/increasing the number of threads is easy as long as we take the container limitations in terms of CPU into consideration. Compute resources == resources (memory and CPU).
spec:
  containers:
  - name: app
    image: images.my-company.example/app:v4
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
Planning your application/environment
It is about planning the application: if you know that your app will eat, let's say, 2 CPUs, then you set requests to 2000m millicores == 2k millicores == 2 cores (1/5 of a core would be 200m and 1 core == 1k). Keep in mind that requests = what the application wants at start and during normal running, while limits = the threshold the container cannot exceed. For memory, crossing the limit means the kernel kills the process with an OOM; for CPU, the container gets throttled instead (more on that below). Knowing that the application should not exceed 3 CPUs, you set limit = 3000m = 3k millicores = 3 cores. Plan for nominal usage but also for high spikes and corner (outlier) utilization. In Kubernetes 0.5 core == 500m == half a core.
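As a minimal sketch, the example above (2 cores nominal, 3 cores ceiling) would look like this in the pod spec:
resources:
  requests:
    cpu: "2000m"   # 2 cores: reserved for nominal load, used by the scheduler for placement
  limits:
    cpu: "3000m"   # 3 cores: the ceiling for spikes; above this the container is throttled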
Requests do not necessarily mean usage
Setting requests = 2000m does not mean the container will use those 2 CPUs. It can start with a lower amount, let's say 500m, and keep growing. Think of requests as the normal amount of resources the application will use. Basically, to increase load on CPU and memory you need to make sure you have enough resources to play around with (within the limits and on the host as well).
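A sketch of that distinction, with illustrative numbers:
resources:
  requests:
    cpu: "2000m"   # reserved on the node, but the app may actually sit at ~500m and grow
  limits:
    cpu: "3000m"   # actual usage can float anywhere between ~0 and this ceiling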
Throttling
Well, in case a container attempts to use more than the specified limit, the system will throttle the container (hold it off). This allows your container to have a consistent level of service independent of the number of pods scheduled to the node. On the CPU usage chart you see on the console, you will notice a plateau /----\ before a decrease. Basically, that is the quota/period mechanism at work.
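As a rough sketch of the mechanics, assuming the default 100ms CFS period:
resources:
  limits:
    cpu: "500m"   # translated to a CFS quota of 50000us per 100000us period
# quota = limit_in_cores x period = 0.5 x 100000us = 50000us
# If the container burns its 50ms of CPU time early in the period, it is
# throttled (the plateau) until the next 100ms period starts.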
Quotas vs Completely Fair Scheduler
Bringing back some knowledge from Dorsal Lab in Montreal (listening to Blonde) and from studying preemption in the Linux kernel: Kubernetes uses the well-known Completely Fair Scheduler (CFS) quota to enforce CPU limits on pod containers, and the quotas force preemption exactly like the Linux kernel 🙂 . This explains in more detail how the CPU Manager works.
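One consequence worth sketching (the numbers are illustrative): the CFS quota is shared by all threads of the container, so a multi-threaded app can exhaust it in a fraction of the period.
resources:
  limits:
    cpu: "1000m"  # 1 core == 100000us of quota per 100000us period
# With 4 busy threads, the 100ms quota is consumed in roughly 25ms of wall
# time; all 4 threads then stall for the remaining ~75ms of the period.
# This is why adding threads without raising the CPU limit can make latency worse.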
There are some recommendations not to set CPU limits for applications/pods that shouldn't be throttled. But I would just set a very high limit 🙂
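A sketch of both approaches, with illustrative numbers:
resources:
  requests:
    cpu: "2000m"
  # Option 1: omit limits.cpu entirely, so the container is never CFS-throttled.
  # Option 2 (my preference above): keep a ceiling, just a very high one:
  limits:
    cpu: "8000m"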