Analysis of CPU load generated by individual JavaScript components

Let's talk about how to observe how much CPU an application's JavaScript code consumes. I propose to build the conversation around components, the basic building blocks of the application. With this approach, any effort to improve performance (or to find the causes of slowdowns) can be concentrated on small and, hopefully, self-contained fragments of the project. I assume that your front-end application, like many other modern projects, is assembled from small, reusable pieces of the interface. If that is not the case, the reasoning still applies, but you will have to find your own way of splitting your codebase into fragments and decide how to analyze them.



Why is this needed?


Why measure CPU consumption of JavaScript? Because these days application performance is most often bound by the processor. Let me loosely quote Steve Souders and Pat Meenan from interviews I did for the Planet Performance Podcast. Both said that application performance is no longer limited by network bandwidth or network latency. Networks keep getting faster, developers have learned to compress server text responses with GZIP (or, better, with Brotli), and they have figured out how to optimize images. That part is, by now, fairly simple.

The performance bottleneck of modern applications is the processor, especially on mobile devices. At the same time, our expectations of how interactive modern web applications should be have grown: we expect their interfaces to work quickly and smoothly, and all of this requires more and more JavaScript code. We also need to remember that 1 MB of images is not the same as 1 MB of JavaScript. Images are downloaded progressively while the application does other work, but JavaScript is often a resource without which the application simply does not function. A modern application requires large amounts of JS code which, before it can actually do anything useful, has to be parsed and executed, and those are tasks that depend heavily on the processor.

Performance indicator


As the performance indicator for our code fragments we will use the number of CPU instructions required to process them. This separates the measurements from the properties of a particular computer and from whatever state it happens to be in at the time of measurement. Time-based metrics (like TTI) are too noisy: they depend on the state of the network connection and on everything else happening on the machine during the measurement. Scripts executed while the page under investigation loads, other software busy in background processes, or resource-hungry browser extensions can all distort time-based results. The number of CPU instructions, on the other hand, does not depend on time at all. As you will soon see, such measurements can be remarkably stable.

Idea


Here is the idea underlying this work: we need a “laboratory” in which the code is run and examined whenever it changes. By “laboratory” I mean an ordinary computer, perhaps the one you use every day. Version control systems give us hooks with which we can intercept certain events and run checks. Of course, measurements in the “laboratory” can also be performed after a commit, but you probably know that code that has already reached the commit stage gets changed more reluctantly (if at all) than code that is still being written. The same goes for fixing code in beta, and even more so for code that has already reached production.

Every time the code changes, we want to compare its performance before and after the change. At the same time we strive to examine components in isolation: that way problems become clearly visible, and we know exactly where they arise.
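For example, the version control hooks mentioned above make it easy to run the measurement on every commit. Here is a minimal sketch of such a hook written as a Node.js script; measure-components.js is a hypothetical script that performs the measurement described in this article and exits with a non-zero code on a regression.

#!/usr/bin/env node
// .git/hooks/pre-commit (made executable): run the component measurements
// before every commit and block the commit if a regression is found.
// `measure-components.js` is a hypothetical name for the measurement script.
const {execSync} = require('child_process');

try {
  execSync('node measure-components.js', {stdio: 'inherit'});
} catch (err) {
  console.error('Component performance check failed, aborting the commit.');
  process.exit(1);
}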

The good thing is that such measurements can be carried out in a real browser, for example using Puppeteer, a tool that lets you drive a headless browser from Node.js.

Finding code to examine


To find code to examine, we can turn to any style guide or design system. In general, anything that provides short, isolated examples of component usage will do.

What is a “style guide”? It is usually a web application that demonstrates all the components, or “building blocks”, of the user interface available to the developer. It can be a third-party component library or something built in-house.

While searching for such projects on the Internet, I came across a recent discussion thread on Twitter that talked about relatively new libraries of React components. I looked at several of the libraries mentioned there.

Not surprisingly, modern high-quality libraries ship with documentation that includes working code examples. Here are two such libraries and the Button components they implement; the documentation for each contains examples of using those components. The libraries in question are Chakra and Semantic UI React.


Button Component Chakra Documentation


Button Semantic UI React Documentation

This is exactly what we need: examples whose code we can measure for CPU consumption. Similar examples can be found deep in the documentation, or in code comments written in the JSDoc style. If you are lucky, such examples exist as separate files, say in the form of unit test files. Surely that will be the case. After all, we all write unit tests, right?

Files


For the sake of demonstrating this approach to performance analysis, imagine that the library under study has a Button component whose code lives in the file Button.js. Next to it are a unit test file, Button-test.js, and a file with an example of using the component, Button-example.js. We also need some kind of test page in which the test code can run, something like test.html.

Component


Here is a simple Button component. I use React here, but your components can be written using any technology convenient for you.

import React from 'react';

const Button = (props) =>
  props.href
    ? <a {...props} className="Button"/>
    : <button {...props} className="Button"/>;

export default Button;

Example


And here is an example of using the Button component. As you can see, there are two variants of the component, each using different properties.

import React from 'react';
import Button from 'Button';

export default [
  <Button onClick={() => alert('ouch')}>
    Click me
  </Button>,
  <Button href="https://reactjs.com">
    Follow me
  </Button>,
]

Test


Here is the test.html page, which can load any of the components. Pay attention to the calls to methods of the performance object: it is with their help that we write our own entries into Chrome's trace file. We will use those entries very soon.

// Strip the leading '#' from the hash and load the corresponding example module.
const {default: examples} =
  await import('./' + location.hash.slice(1) + '-example.js');

examples.forEach(example => {
  performance.mark('my mark start');
  // `where` is a container DOM node defined elsewhere in test.html
  ReactDOM.render(<div>{example}</div>, where);
  performance.mark('my mark end');
  performance.measure(
    'my mark', 'my mark start', 'my mark end');
});
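For reference, the rest of the test page only needs to provide React, ReactDOM, and the container referred to as where above. Here is a minimal sketch of that glue code; the element id and the way React is loaded are assumptions, not part of the method.

// Glue code in test.html (a sketch): load React/ReactDOM however your
// project prefers (bundler, CDN script tags, etc.) and expose a container.
import React from 'react';
import ReactDOM from 'react-dom';

// The container that the examples above are rendered into.
const where = document.getElementById('root');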

Test runner


To load the test page in Chrome we can use Puppeteer, a Node.js library that exposes an API for controlling the browser. It works on any operating system, ships with its own copy of Chromium, and can also drive an existing Chrome or Chromium installation of another version already present on the developer's machine. Chrome can be launched headless, so that no window is visible: the tests run automatically and the developer does not need to watch the browser. It can also be launched in normal mode, which is useful for debugging.

Here is an example Node.js script, run from the command line, that loads the test page and writes data to a trace file. Everything that happens in the browser between the tracing.start() and tracing.stop() calls is written (in great detail, I should note) to trace.json.

import pup from 'puppeteer';

const browser = await pup.launch();
const page = await browser.newPage();
await page.tracing.start({path: 'trace.json'});
await page.goto('test.html#Button');
await page.tracing.stop();
await browser.close();

The developer can control the level of detail of the trace data by specifying the tracing “categories”. You can see the list of available categories by going to chrome://tracing in Chrome, clicking Record and opening the Edit categories section in the window that appears.


Configuring the composition of data written to the performance log
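Puppeteer lets you pass these categories when starting a trace. Here is a minimal sketch; the category names below are only an example selection, not a required set.

import pup from 'puppeteer';

const browser = await pup.launch();
const page = await browser.newPage();

// Restrict the trace to the categories we care about (example selection).
await page.tracing.start({
  path: 'trace.json',
  categories: [
    'blink.user_timing',   // our performance.mark()/measure() entries
    'devtools.timeline',   // high-level rendering/scripting events
    'v8',                  // V8 events such as garbage collection
  ],
});

await page.goto('test.html#Button');
await page.tracing.stop();
await browser.close();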

Results


After the test page has been examined with Puppeteer, you can analyze the measurement results by going to chrome://tracing in the browser and loading the freshly recorded trace.json file.


Trace.json visualization

Here you can see the result of the performance.measure('my mark') call. The measure() call exists purely for debugging, in case the developer wants to open trace.json and look around: everything that happened on the page is enclosed within the my mark block.

Here is a fragment of trace.json:


Fragment of trace.json file

To get the number we need, it is enough to subtract the CPU instruction counter (ticount) of the start marker from the ticount of the end marker. This tells us how many CPU instructions it took to render the component in the browser, and this is the number by which you can tell whether a component has become faster or slower.
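Here is a minimal Node.js sketch of that calculation. It assumes, as in the trace fragment above, that the marks recorded by test.html appear in traceEvents under the blink.user_timing category and that ticount is exposed as a field on those events; adjust the field access to whatever your trace actually contains.

import {readFileSync} from 'fs';

const {traceEvents} = JSON.parse(readFileSync('trace.json', 'utf8'));

// Find our user-timing marks; field names are based on the trace shown
// above and may need adjusting for your Chrome version.
const findMark = (name) =>
  traceEvents.find(
    (e) => e.cat.includes('blink.user_timing') && e.name === name,
  );

const start = findMark('my mark start');
const end = findMark('my mark end');

// Instructions spent rendering the component = end counter - start counter.
console.log('CPU instructions:', end.ticount - start.ticount);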

The devil is in the details


So far we have measured only the first render of a single component onto the page, and nothing else. It is important to measure the smallest amount of code that can be executed: the less code you measure, the lower the noise. The devil is in the details. After the measurement, you also need to remove from the results anything that is outside the developer's control, for example garbage collection. The component does not control garbage collection; if it runs, it is because the browser itself decided to run it while rendering the component. The CPU instructions spent on garbage collection should therefore be excluded from the final result.

The trace event related to garbage collection (the “data block” is more correctly called an “event”) is named V8.GCScavenger. Its tidelta should be subtracted from the number of CPU instructions attributed to rendering the component; a small sketch of this follows the list below. There is documentation for trace events, but it is outdated and does not describe the fields we need:

  • tidelta - the number of CPU instructions required to process the event.
  • ticount - the CPU instruction counter at the start of the event.
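Continuing the sketch above, one way to subtract the garbage-collection cost is to sum the tidelta of every V8.GCScavenger event that falls between our two marks, again assuming these fields appear on the events as in the trace shown earlier.

// Sum the instruction cost of scavenger GC runs between the two marks
// and subtract it from the raw measurement.
const gcInstructions = traceEvents
  .filter(
    (e) =>
      e.name === 'V8.GCScavenger' &&
      e.ts >= start.ts &&
      e.ts <= end.ts,
  )
  .reduce((sum, e) => sum + (e.tidelta || 0), 0);

console.log(
  'CPU instructions without GC:',
  end.ticount - start.ticount - gcInstructions,
);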

You also need to be very careful about what exactly is being measured. Browsers are clever: they optimize code that runs more than once. The next graph shows the number of CPU instructions needed to render a certain component. The first render requires the most resources; subsequent renders put a much lower load on the processor. Keep this in mind when analyzing code performance.


10 rendering operations of the same component
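One way to observe this effect is to render the same example several times in test.html, giving each iteration its own pair of marks. Here is a sketch of the idea; it reuses the example and the where container from the earlier test page and assumes a single example is being measured.

// Render the same example 10 times, measuring each pass separately,
// to see how much cheaper repeated renders become.
for (let i = 0; i < 10; i++) {
  performance.mark(`render ${i} start`);
  ReactDOM.render(<div>{example}</div>, where);
  performance.mark(`render ${i} end`);
  performance.measure(`render ${i}`, `render ${i} start`, `render ${i} end`);
}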

One more detail: if a component performs asynchronous operations (for example, it uses setTimeout() or fetch()), the load created by that asynchronous code is not captured in the measurement. That may be acceptable, or it may not. If you are investigating the performance of such components, consider measuring the asynchronous code separately.

Strong signal


If you are careful about what exactly is being measured, you can get a truly stable signal that reflects the performance impact of any change. I love how smooth the lines in the next graph are.


Stable measurement results

The bottom graph shows the results of 10 measurements of rendering a simple <span> element in React; nothing else is included in these results. It turns out that this operation takes from 2.15 to 2.2 million CPU instructions. If you wrap the <span> in a <p>, rendering that markup takes about 2.3 million instructions. This level of precision amazes me: if a developer can see the performance difference caused by adding a single <p> element to a page, they have a genuinely powerful tool in their hands.

How to apply measurements of this precision is up to the developer. If they do not need this level of detail, they can always measure the rendering performance of larger fragments.

Additional performance information


Now that the developer has a system for obtaining numbers that precisely characterize the performance of the smallest fragments of code, they can use it to solve a variety of problems. With performance.mark() you can write additional useful information into trace.json and tell your teammates what is going on and what causes an increase in the number of CPU instructions needed to run some code. You can include in the reports the number of DOM nodes, or the number of DOM write operations performed by React; in fact, you can report almost anything here. You can count page layout recalculations. With Puppeteer you can take screenshots and compare how the interface looks before and after a change. Sometimes the growth in the number of CPU instructions needed to display a page turns out to be entirely unsurprising, for example when the new version of the page gains 10 buttons and 12 rich-text editing fields.
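As an illustration, here is one possible way to attach extra context to the trace and capture a screenshot. The mark name, the file name, and the page object are hypothetical, reusing the setup from the earlier scripts.

// In test.html: embed extra context (e.g. the DOM node count) in a mark
// name, so it shows up directly in trace.json.
performance.mark(`my mark end: ${document.querySelectorAll('*').length} DOM nodes`);

// In the Puppeteer runner: capture a screenshot to compare the UI
// before and after a change.
await page.screenshot({path: 'Button-after.png'});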

Summary


Can you use everything discussed here today? Yes, you can. You need Chrome version 78 or later: if your trace.json contains ticount and tidelta, everything described above is available to you. Earlier versions of Chrome do not record these fields.

Unfortunately, the CPU instruction counts cannot be obtained on the Mac platform. I have not tried Windows yet, so I cannot say anything definite about that OS. In short, Unix and Linux are our friends here.

Note that for the browser to report CPU instruction counts, you need to launch it with a couple of flags: --no-sandbox and --enable-thread-instruction-count. Here is how to pass them to a browser launched by Puppeteer:

await puppeteer.launch({
  args: [
    '--no-sandbox',
    '--enable-thread-instruction-count',
  ],
});

Hopefully now you can take your web application performance analysis to the next level.



Source: https://habr.com/ru/post/479266/

