From 57d6c02da7a7aa74b7789d7871e86533b43106fa Mon Sep 17 00:00:00 2001 From: gennyble Date: Sun, 2 Mar 2025 04:04:29 -0600 Subject: add 'statistics on linux' to words.html --- served/words/statistic-gifs.html | 197 --------------------------------------- 1 file changed, 197 deletions(-) delete mode 100644 served/words/statistic-gifs.html (limited to 'served/words/statistic-gifs.html') diff --git a/served/words/statistic-gifs.html b/served/words/statistic-gifs.html deleted file mode 100644 index 583974b..0000000 --- a/served/words/statistic-gifs.html +++ /dev/null @@ -1,197 +0,0 @@ ---- -template=post -title=Statistics on Linux with /proc -style=/styles/post.css -style=writing.css - -published=2025-03-02 4:00am CST - -description=I want to tell you how my statistic gifs are made :) ---- - - - -I've been wanting to make a little page for the statistics of my -webserver (the system not the program). When I started to -research the APIs that I'd need, just on a whim one day with no -intention to start, I got grabbed by it and knew I had to start. - -Check it out: starlight.html - -

a /proc foreword

-The /proc filesystem, on Linux, is a sort of window into -the kernel. It lets you view some pretty detailed information by simply -reading some files (thanks everything-is-a-file linux). - -There's a lot of information about it in the man pages. -They might all be in one big one at man proc but, -like how they are on my server, they could be broken into separate pages -for distinct sections. - -I have linked the relevant pages at the top of their section. It's a link -to man7.org, which seems to be the source for Linux Kernel man pages -on the internet. man7 is linked from kernel.org which lends it -credibility at least. - -

Memory

- - - -This one isn't too hard. I open the file /proc/meminfo and -look for the lines starting with MemTotal and MemAvailable -which are the total memory and currently available memory, respectively. They -are very well named :). For usage, I just subtract available from total. - -
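A minimal sketch of that subtraction in Rust, using only the standard library. The function names `meminfo_kb` and `used_kb` are made up for illustration and are not from the post's actual code; the sketch assumes the usual `Key:   value kB` line layout.

```rust
/// Pull a kB value such as `MemTotal` out of the text of /proc/meminfo.
fn meminfo_kb(meminfo: &str, key: &str) -> Option<u64> {
    meminfo.lines().find_map(|line| {
        // Lines look like "MemTotal:        2014692 kB".
        let rest = line.strip_prefix(key)?.strip_prefix(':')?;
        rest.split_whitespace().next()?.parse::<u64>().ok()
    })
}

/// Used memory in kB: total minus available.
fn used_kb(meminfo: &str) -> Option<u64> {
    let total = meminfo_kb(meminfo, "MemTotal")?;
    let available = meminfo_kb(meminfo, "MemAvailable")?;
    Some(total.saturating_sub(available))
}
```

On a Linux machine the input would come from something like `std::fs::read_to_string("/proc/meminfo")`.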

Network

- - - -If you cat /proc/net/dev you can see some stats about -your networking interfaces. This is what I parse, with some pain. - -I read the bytes columns from the receive and transmit sections. -These are total counts of bytes received since boot, so you'll -have to take two samples and subtract to get the number of bytes -in some time-span. - -Looking at it in the terminal, you might assume that the separator -between the columns was a tab character. I sure did! It is not a tab, -but many spaces. - -Because of spaces-and-not-tabs -(not the tabs vs. spaces debate of usual, but with similarities), it proved -to be a bit annoying to parse. It made me finally -pull in a regex crate because I didn't feel like dealing with it -at the time. Eventually™ I want to write a skip-arbitrarily-many-whitespace -iterator, but for now regex-lite lives in my Cargo.toml. - -
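For what it's worth, the standard library's `split_whitespace` already treats a run of spaces as a single separator, so a regex-free version might look like the sketch below. This is not how the post does it (that uses regex-lite); `iface_bytes` is a hypothetical name, and the sketch assumes the usual layout of two header lines followed by eight receive columns and then the transmit columns.

```rust
/// Returns (rx_bytes, tx_bytes) for `iface`, given the text of /proc/net/dev.
fn iface_bytes(net_dev: &str, iface: &str) -> Option<(u64, u64)> {
    // Skip the two header lines, then look for the interface's row.
    net_dev.lines().skip(2).find_map(|line| {
        // The name ends at the colon; the counters follow it.
        let (name, counters) = line.split_once(':')?;
        if name.trim() != iface {
            return None;
        }
        // split_whitespace collapses any run of spaces into one separator.
        let cols: Vec<&str> = counters.split_whitespace().collect();
        let rx = cols.first()?.parse::<u64>().ok()?;
        // Eight receive columns come first, so transmit bytes is index 8.
        let tx = cols.get(8)?.parse::<u64>().ok()?;
        Some((rx, tx))
    })
}
```

Splitting on the colon first also copes with rows where a large byte count sits directly against the interface name with no space in between.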

CPU

- - - -/proc/stat is the least obvious of the triplet. It has more than -just the CPU's information, but the cpu is what we're after. You'll notice many -CPU lines probably! I'm using the one starting just "cpu" without a number -(cpu0, cpu1, etc.) because I only have the 1 core. If I had more than one core -it'd work similarly, the just-cpu line sums the other ones, but then it could -show >100% usage 'cause it's per-core usage just added together. - -First things uh, second? To summarize from the man page:
-The units of these values are ticks. There are USER_HZ -ticks per second. On most platforms it's 100 but you can -check the value for your system with sysconf(_SC_CLK_TCK). - -
- small C program to check _SC_CLK_TCK :) -
#include <stdio.h>
-#include <unistd.h>
-int main(void) {
-	/* sysconf returns a long, so print it with %ld */
-	printf("USER_HZ is %ld\n", sysconf(_SC_CLK_TCK));
-}
-
- -But what columns of data do we use? From this stackoverflow answer -it seems that summing the user, nice, and system columns gets you the total ticks. -The user and system make sense to me, time spent in user and system mode, -but what on earth is nice? I sure hope it is. - -The Internet tells me to check man nice -(man7.org/nice). -That page says that the -niceness of a process can be adjusted to change how the kernel schedules -that process. Making it less nice (down to -20) increases its priority, and -increasing its niceness (up to 19) lowers it. I guess that makes sense. Lowering -the niceness makes the process greedier and in want of more attention -from the scheduler? I'm unsure how well that personification tracks to reality, but -it helped me think about it. - -The nice column, then, seems to be the time spent in processes that -would go in the user column, but they have a different priority and -I guess differentiating that is important. - -Oh, but there might be more columns we want! -There's another S.O. answer -that I found while writing this that says the sixth and seventh columns should be used -as well. These are irq/softirq and are time spent servicing -interrupts. I think it makes sense we'd want that, too. - -So you have all these columns (user, nice, system, irq, -and softirq) that add together to give you the total number -of ticks spent Doing Things since boot, and you have the number -of ticks in a second. Can you see where I'm going with this? - -Yup, take two samples some time span apart, subtract the earlier -from the later, and then you have how much time the processor spent -Doing Things. You can use that and the number of ticks in your time -span to calculate utilization. Or you just have how much actual time -The Computer spent Doing Work which is also pretty neat. Maybe you -can pay it an hourly wage. Is that just AWS? - -Something to watch out for:
-apparently the numbers in /proc/stat can overflow and -wrap back to zero. I don't know what size integers they are so I'm -unsure how real of a risk that is, but it seemed worth mentioning here. - -
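The two-sample idea can be sketched roughly like this in Rust. The names `busy_ticks` and `utilization` are hypothetical, and the column order (user, nice, system, idle, iowait, irq, softirq) is taken from the proc_stat man page. Since the width of the counters is unknown here, the sketch makes no assumption about it and just gives up on a sample pair where the counter went backwards.

```rust
/// Busy ticks (user + nice + system + irq + softirq) from the aggregate
/// "cpu" line of /proc/stat.
fn busy_ticks(stat: &str) -> Option<u64> {
    // "cpu " with a trailing space matches the summary line but not cpu0, cpu1, ...
    let line = stat.lines().find(|l| l.starts_with("cpu "))?;
    let cols: Vec<u64> = line
        .split_whitespace()
        .skip(1) // drop the "cpu" label
        .map(|c| c.parse::<u64>())
        .collect::<Result<_, _>>()
        .ok()?;
    if cols.len() < 7 {
        return None;
    }
    // Columns: user nice system idle iowait irq softirq ...
    Some(cols[0] + cols[1] + cols[2] + cols[5] + cols[6])
}

/// Fraction of the time span spent busy, or None if the counter went
/// backwards between the two samples (for example after a wrap).
fn utilization(before: u64, after: u64, seconds: f64, user_hz: f64) -> Option<f64> {
    let delta = after.checked_sub(before)? as f64;
    Some(delta / (seconds * user_hz))
}
```

With one sample a minute and a USER_HZ of 100, a delta of 600 busy ticks works out to 600 / (60 * 100) = 10% utilization.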

So you've parsed the stats, now to graphs!

- -My main trouble here was selecting a range that makes sense for -the data it's representing. - -Again, memory was easy. There is a -total, normally-unchanging amount of RAM, so I just use that as -the max. Perhaps there's something to be said about zooming further -in to see the megabyte-by-megabyte variance, but I am much more -interested in a "how close am I to the ceiling" kind of graph. Like, -would I hit my head if I jumped? that kind of thing. - -The CPU graph, though, that's very variable and a bit spiky. -I don't really care what the max value was if it's a spike, -it can go off the top for all I care, what I want to see is the -typical usage. - -If I just ranged to the max then I'd have what I call The Linode -Problem. I call it that, rather predictably, because that's what -Linode's graphs do and it makes them kind of useless? Great, I love -to see that spike up to 100%, but that's all that I can see now. - -So instead of max-grabbing, I sort the data and take the value that's -almost max. My series are 256 samples long, so what this looked -like was taking the 240th value in the array, getting the closest-highest -percent, and using that as the top of the range. - -This does mean if it's very spiky, I get The Linode Problem -again, but in that case I'm kind of okay with it. I sample every minute, -so my 256 pixel long graphs are roughly 4 hours long. If it spikes more -than 16 times in that period, perhaps that's worth looking into. - -Okay, CPU done. Network time! It's the same, pretty much. Where there was -one line, there are now two. And lots more spikes! I combine the receive -and transmit series into one vec, sort it, and take the 32nd -highest value. - -I draw the area under the line, too, because it was nigh impossible to see -the line when it was so.. discontinuous? We get another problem with that, -though, where the second-drawn line-and-underfill will obscure the one -drawn first. 
So then, to not overdraw an entire measurement, I try to draw -the average-larger one first. Which is to say, I take the average of both -series separately and draw the one with the bigger average first. That way -the smaller one will hopefully nestle under the larger, like a baby bird -hiding from the rain under their parent's wing. - -
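Here is a rough Rust sketch of the two tricks described above: picking an almost-max value as the top of the range, and drawing the series with the larger average first. The names `nth_highest`, `mean` and `draw_order` are invented for this example, and rounding the result up to a "closest-highest percent" is left out.

```rust
/// The n-th highest sample (1-based), used as the top of the graph's range
/// so that a handful of spikes can run off the top.
fn nth_highest(samples: &[u64], n: usize) -> Option<u64> {
    let mut sorted = samples.to_vec();
    sorted.sort_unstable_by(|a, b| b.cmp(a)); // descending
    sorted.get(n.checked_sub(1)?).copied()
}

/// Arithmetic mean of a series; an empty series counts as 0.
fn mean(series: &[u64]) -> f64 {
    if series.is_empty() {
        return 0.0;
    }
    series.iter().sum::<u64>() as f64 / series.len() as f64
}

/// The two series in draw order: the one with the larger mean goes first,
/// so the smaller one is drawn on top and stays visible.
fn draw_order<'a>(rx: &'a [u64], tx: &'a [u64]) -> [&'a [u64]; 2] {
    if mean(rx) >= mean(tx) {
        [rx, tx]
    } else {
        [tx, rx]
    }
}
```

For the network graph, the receive and transmit samples would be concatenated into one vec before calling `nth_highest` with 32.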
- -That's how the range selection works, anyway. - -The graphs themselves are drawn on 256x160 gif because i like gif, 256 is -a good number, and they seem to compress better than png in this use case. - -One day I'd love to try and generate alternative text to describe -the general look of the graph. "The memory usage is steady at 300MB", -or something like "The network usage is variable, but averages 15.4kbps". - -That's it!
-bye :) \ No newline at end of file -- cgit 1.4.1-3-g733a5