Performance tuning Part 2

2023-08-31

Good performance testing requires more than executing a single script.

In Part 1, I talked about the right way to identify and address a problem. Here I will talk about what makes good performance testing and how to run a benchmark properly.

5W1H works

  • Why do we run the performance test? Goals differ from test to test. The k6 doc "Understanding the different Types of Load Tests" can help you decide the test scope and strategy for each goal.

  • What to test? A specific backend component? A single API? Or a group of APIs called in a certain flow? What are the parameters of the tested API? You'd better have a clear target before you shoot the arrow.

  • Where to run the test script? You may have multiple test environments, and different environments may have different data sizes, which will affect the test result.

  • When? If multiple teams share the same testing environment, you probably want to isolate performance testing from other testing scenarios. Performance tests are also better conducted after functional tests, because a super performant but non-functional system doesn't make sense.

  • Who will run the test? In my company, the DevOps team is sometimes best placed to run the test script across different environments. However, the Development team should drive the whole flow and evaluate the result with stakeholders (Product Owner, Architects, etc.), as they built the system and know it best.

  • How? Again, it depends on the purpose of the testing. If we want a benchmark, the main thing to pay attention to is controlling the variables.

Variables

In my case, I need to figure out the current RPS (requests per second) of the user query API. Keep in mind that RPS is not an absolute constant; it varies with conditions such as the following:

| Factors |
| --- |
| Server resources (CPU, memory, etc.) |
| Data size (documents in the Couchbase bucket, or records in the table) |
| Network (calling from outside via CDN/Gateway vs. calling from inside) |
| API parameters/logic |
| Traffic / Pressure applied |

I planned to hold all other factors fixed except Traffic / Pressure, and use k6 options to vary the traffic strategy.
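
As a rough sketch of what "varying only the traffic" could look like, the helper below builds a k6 `options` object in which the arrival rate is the only input. It uses k6's constant-arrival-rate executor, which starts iterations at a fixed rate. The scenario name `user_query`, the one-minute duration, and the VU pool sizes are assumptions for illustration, not values from my actual test.

```javascript
// Build a k6 `options` object where only the arrival rate changes between runs.
// Duration and VU pool sizes are illustrative assumptions, not measured values.
function arrivalRateOptions(ratePerSecond) {
  return {
    scenarios: {
      // Hypothetical scenario name.
      user_query: {
        executor: 'constant-arrival-rate',
        rate: ratePerSecond, // iterations started per timeUnit
        timeUnit: '1s',
        duration: '1m', // assumed test length
        preAllocatedVUs: ratePerSecond * 2, // assumed initial VU pool
        maxVUs: ratePerSecond * 10, // assumed ceiling if responses slow down
      },
    },
  };
}

// One options object per run; everything except the rate stays the same.
const rates = [5, 10, 20, 30, 100];
const runs = rates.map((rate) => arrivalRateOptions(rate));
console.log(JSON.stringify(runs[2], null, 2));

// In a real k6 script, one of these objects would be exported, for example:
//   export const options = arrivalRateOptions(Number(__ENV.RATE));
// alongside a default function that calls the API under test.
```

Keeping the rate as the single parameter makes it harder to change a second variable by accident between runs.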

Key Concepts about k6

There are a few key concepts that you should understand to use k6.

VU: Virtual Users are essentially parallel while(true) loops; generally, more virtual users mean more simulated traffic.

Open & Closed model: In the closed model, a VU starts a new iteration only when its previous iteration finishes, so the achieved request rate depends on response time. In the open model, on the other hand, new iterations start independently of iteration completion, i.e., they don't wait for earlier iterations to finish.
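
The difference between the two models can be shown with some simple arithmetic. This is a simplified sketch, not k6 internals: it assumes every iteration takes exactly the same time, with no sleep and no tool overhead, and the function names are made up for illustration.

```javascript
// Closed model: each VU loops, starting the next iteration only after the
// previous one ends, so the number of started iterations depends on latency.
function closedModelStarts(vus, latencySeconds, durationSeconds) {
  // A VU starts iterations at t = 0, latency, 2 * latency, ... while t < duration.
  const startsPerVu = Math.ceil(durationSeconds / latencySeconds);
  return vus * startsPerVu;
}

// Open model: iterations start at a fixed rate, regardless of latency.
function openModelStarts(ratePerSecond, durationSeconds) {
  return ratePerSecond * durationSeconds;
}

// Same 10 VUs over 60 seconds, but the API gets slower.
console.log('closed, 0.5s latency:', closedModelStarts(10, 0.5, 60));
console.log('closed, 2s latency:  ', closedModelStarts(10, 2, 60));

// The open model offers the same load no matter how slow the API is.
console.log('open, 20 it/s:       ', openModelStarts(20, 60));
```

In the closed model, a slower system automatically receives less traffic, which hides the slowdown. That is the main reason the open model suits a benchmark at a fixed rate.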

Benchmark

In my case, since I want a benchmark for our API, the constant-arrival-rate strategy is a better choice than VU-based strategies, because it keeps the iteration rate fixed regardless of response time. Here are the metrics from the first round of testing:

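| Rate (iterations/sec) | CPU (%) | Latency (avg) | Latency (p90) |
| --- | --- | --- | --- |
| 100 | 85% | 28.78s | 47.91s |
| 30 | 80% | 9.09s | 19.14s |
| 20 | 65% | 1.69s | 4.12s |
| 10 | 40% | 226ms | 464ms |
| 5 | 20% | 162ms | 331ms |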

Two things stand out here:

  • The API is quite CPU-intensive: even 5 RPS drives CPU usage to 20%.
  • Latency does not increase linearly with the rate: doubling the rate from 10 to 20 iterations/sec raises average latency from 226ms to 1.69s.
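
To put numbers on the second point, the sketch below takes the average latencies from the first-round table and compares how fast latency grows relative to the rate. It also applies Little's law (in-flight requests = arrival rate × average latency) to estimate concurrency. This assumes the target rate was actually reached in each run, which may not hold at the highest rates.

```javascript
// First-round measurements copied from the table above (average latency in seconds).
const rounds = [
  { rate: 5, avgLatencySeconds: 0.162 },
  { rate: 10, avgLatencySeconds: 0.226 },
  { rate: 20, avgLatencySeconds: 1.69 },
  { rate: 30, avgLatencySeconds: 9.09 },
  { rate: 100, avgLatencySeconds: 28.78 },
];

// Little's law: average in-flight requests = arrival rate * average latency.
function inFlight(row) {
  return row.rate * row.avgLatencySeconds;
}

// Compare each row with the previous one: how much did rate and latency grow?
const growth = rounds.slice(1).map((row, i) => ({
  from: rounds[i].rate,
  to: row.rate,
  rateFactor: row.rate / rounds[i].rate,
  latencyFactor: row.avgLatencySeconds / rounds[i].avgLatencySeconds,
}));

for (const step of growth) {
  console.log(
    `${step.from} -> ${step.to} it/s: rate x${step.rateFactor.toFixed(2)}, ` +
      `latency x${step.latencyFactor.toFixed(2)}`
  );
}
for (const row of rounds) {
  console.log(`${row.rate} it/s: about ${inFlight(row).toFixed(1)} requests in flight`);
}
```

The in-flight estimate is also a rough lower bound on the number of VUs an arrival-rate test needs, since each unfinished iteration occupies one VU.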

I have not made any improvement yet. However, instead of a random testing configuration like 1k/2k/3k VUs, I now have a better inspection point that is sensitive to change: 20 iterations/sec. At this rate, we can change other factors (such as API parameters) and see the difference. The second round of testing, which tunes the API query, gave the following results:

| Rate | Query (filter) | CPU (%) | Latency (avg) | Latency (p90) |
| --- | --- | --- | --- | --- |
| 20 it/s | {"or":[{"email":"xxx@gmail.com"},{"oidc.sub":"b213acd3"}]} | 65% | 1.69s | 4.12s |
| 20 it/s | {"email":"xxx@gmail.com"} | 20% | 62.33ms | 95.66ms |
| 20 it/s | {"oidc.sub":"b213acd3"} | 15% | 51.22ms | 77.93ms |
| 20 it/s | {"or":[{"email":"xxx@gmail.com"},{"email":"yyy@gmail.com"}]} | 65% | 1.4s | 3.02s |

Some interesting findings:

  • A single where criterion (either email or oidc.sub) is fast but still fairly CPU-intensive (15-20% CPU at 20 iterations/sec).

  • An or clause is costly in both CPU and latency, even when both branches filter on the same field, such as email.

Next, we will dive deep into the N1QL world and use the benchmark above for root-cause analysis.

© 2025 Xavier Zhou