I am reaching out to inquire about the implementation of an energy monitoring solution similar to Xcode's Energy Impact tool for iOS apps.
So, as one note here, keep in mind that the Energy Impact tool is only intended to provide a quick and straightforward metric of an app's impact, particularly relative to "itself", not to act as a truly accurate guide to the exact impact an app actually has. In more concrete terms, it provides a quick way to compare the impact across different parts of your app and acts as a simple "check" of expected overall usage. However, that data does not directly translate to specific impact, particularly not when all of the additional complexity of real-world usage is involved.
How can we achieve monitoring and calculation of CPU, GPU, and network usage over a period of time within an app?
MetricKit is the best API option for this by FAR. It's straightforward to use and will basically just "hand" you this data in a format you can easily collect and process yourself.
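As a rough illustration, here is a minimal sketch of a MetricKit subscriber that reads the CPU, GPU, and network totals out of each payload. The class name `EnergyMetricsCollector` and the logging are hypothetical; the MetricKit types and properties are the public ones. Note that the system delivers these payloads as aggregated reports (at most about once per day), so this is not a real-time feed.

```swift
import Foundation
import MetricKit

/// Hypothetical helper (not part of MetricKit): subscribes to MetricKit
/// and logs a few per-factor totals from each delivered payload.
final class EnergyMetricsCollector: NSObject, MXMetricManagerSubscriber {

    /// Start receiving payloads. Call once, early in the app's life.
    func start() {
        MXMetricManager.shared.add(self)
    }

    /// Stop receiving payloads.
    func stop() {
        MXMetricManager.shared.remove(self)
    }

    /// Called by the system with aggregated metrics for past reporting intervals.
    func didReceive(_ payloads: [MXMetricPayload]) {
        for payload in payloads {
            print("Interval: \(payload.timeStampBegin) to \(payload.timeStampEnd)")

            // Each metric group is optional; it is nil when no data was recorded.
            if let cpu = payload.cpuMetrics {
                let seconds = cpu.cumulativeCPUTime.converted(to: .seconds).value
                print("CPU time (s): \(seconds)")
            }
            if let gpu = payload.gpuMetrics {
                let seconds = gpu.cumulativeGPUTime.converted(to: .seconds).value
                print("GPU time (s): \(seconds)")
            }
            if let network = payload.networkTransferMetrics {
                let wifi = network.cumulativeWifiDownload.converted(to: .megabytes).value
                let cellular = network.cumulativeCellularDownload.converted(to: .megabytes).value
                print("Download (MB): Wi-Fi \(wifi), cellular \(cellular)")
            }

            // The full payload is also available as JSON for your own
            // storage or upload pipeline (assumed to exist elsewhere).
            let json = payload.jsonRepresentation()
            print("Payload JSON size: \(json.count) bytes")
        }
    }
}
```

In use, you would keep one instance alive for the lifetime of the app (for example, owned by the app delegate) and call `start()` at launch.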
If you dig around on the internet you'll almost certainly find other low level APIs that will return the same or similar data, often with an argument about why that API is better. There are a few different reasons I would strongly recommend avoiding those:
- Many of these APIs sit on the line between API and SPI. They're API in the sense that they aren't private and nothing prevents an app from calling them. They're SPI (System Programming Interface) in the sense that the teams working on them are focused on supporting other system components (like MetricKit), not app developers. This kind of API can stop working or change behavior with very little notice, forcing you to scramble to get something working again.
- Many of these APIs are in the Mach API layer, which opens up an entirely new range of failure paths. Mach ports are the system's primary IPC mechanism, used for virtually "everything" an app does (memory allocation, logging, UI management, daemon IPC...). It's also very easy to use our Mach APIs incorrectly, and the bugs that result from those mistakes can be MADDENINGLY difficult to investigate and debug.
- These APIs almost always require some amount of polling, making it very easy for your monitoring code to end up introducing even more energy drain.
How is the current energy consumption level of an app determined?
We've never documented the specifics and that's unlikely to change (see below).
Additionally, how are the weights for various factors like CPU, GPU, network, and location usage allocated when calculating the overall energy impact?
Those factors (you've identified the critical ones) are collated together to produce "a number". While there is some (fairly minimal) amount of weighting going on, ACCURATELY collating all those factors would be EXTREMELY difficult and the system isn't really trying to do so. For example:
- The actual and relative power cost of a given level of CPU/GPU/network activity varies significantly between our devices. Having the gauge provide a relatively stable guide across devices is more important than trying to precisely correct for those differences.
- The factors themselves aren't actually very well balanced against each other (for example, the GPU tends to consume far more power than networking does), so accurately representing their EXACT impact would often end up "hiding" meaningful information. In concrete terms, the fact that the GPU uses more energy than the network doesn't mean it isn't worth figuring out why your app's network usage suddenly spiked.
- The power impact of networking and location varies widely based on external factors outside your app's control. As you might imagine, the power difference between the "best case" (Ethernet over USB) and the worst case (very poor cellular) is quite significant.
The overall point here is that the gauge's goal is to help you catch "spikes", not give you precise information about that spike's specific impact. Frankly, the primary goal of the gauge is simply to provide a quick and easy way to monitor all of those factors in real time. The assumption is that if/when a spike occurs you'll either "know" what the issue was immediately (because you know what your app did) or you'll use other, more detailed tools to find the issue.
That's also the reason why MetricKit provides the data about the individual factors but doesn't provide a single "aggregate" power metric. Once you're looking at an extended data capture, what matters is changes within each data stream, not the "overall" data stream.
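To make the "changes within each data stream" idea concrete, here is a minimal sketch that flags spikes in one stream relative to that stream's own recent baseline. Everything in it is an assumption for illustration: the `DailySample` type, the `spikeDays` function, the 3-day window, the 2x threshold, and the sample numbers are all made up, and none of this reflects how the Energy Impact gauge or MetricKit work internally.

```swift
import Foundation

/// Hypothetical per-day value from one metric stream (for example, CPU seconds).
struct DailySample {
    let day: Int
    let value: Double
}

/// Returns the days whose value exceeds `factor` times the mean of the
/// preceding `window` days in the SAME stream.
func spikeDays(in stream: [DailySample], window: Int = 3, factor: Double = 2.0) -> [Int] {
    var flagged: [Int] = []
    guard window > 0, stream.count > window else { return flagged }
    for i in window..<stream.count {
        let previous = stream[(i - window)..<i].map { $0.value }
        let baseline = previous.reduce(0, +) / Double(window)
        if baseline > 0 && stream[i].value > factor * baseline {
            flagged.append(stream[i].day)
        }
    }
    return flagged
}

/// Builds a stream with days numbered from 1 (illustrative helper).
func makeStream(_ values: [Double]) -> [DailySample] {
    return values.enumerated().map { DailySample(day: $0.offset + 1, value: $0.element) }
}

// Made-up sample data: CPU seconds and network megabytes for five days.
let cpuValues: [Double] = [120, 130, 125, 128, 122]
let networkValues: [Double] = [10, 12, 11, 40, 12]

let cpuSeconds = makeStream(cpuValues)
let networkMB = makeStream(networkValues)

// A deliberately naive "aggregate" that just adds the two streams together.
let naiveTotal = makeStream(zip(cpuValues, networkValues).map { $0 + $1 })

print("CPU spikes:", spikeDays(in: cpuSeconds))
print("Network spikes:", spikeDays(in: networkMB))
print("Naive total spikes:", spikeDays(in: naiveTotal))
```

With this sample data, the network stream flags day 4 while the naive total flags nothing, because the larger CPU numbers swamp the network change. That is the kind of "hiding" a single combined number tends to cause.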
__
Kevin Elliott
DTS Engineer, CoreOS/Hardware