author gennyble <gen@nyble.dev> 2025-03-02 04:00:45 -0600
committer gennyble <gen@nyble.dev> 2025-03-02 04:00:45 -0600
commit a07c03f930c89b7c3389c86aeb25947cea8e1231 (patch)
tree d6b8d3a441293ac8a92f0f3987a02d3b55e7b7f7 /served/words/statistic-gifs.html
parent 23aec015c81541ffc4509b1b81b509c060342ff1 (diff)
statistic gifs
Diffstat (limited to 'served/words/statistic-gifs.html')
-rw-r--r-- served/words/statistic-gifs.html 197
1 files changed, 197 insertions, 0 deletions
diff --git a/served/words/statistic-gifs.html b/served/words/statistic-gifs.html
new file mode 100644
index 0000000..583974b
--- /dev/null
+++ b/served/words/statistic-gifs.html
@@ -0,0 +1,197 @@
+---
+template=post
+title=Statistics on Linux with /proc
+style=/styles/post.css
+style=writing.css
+
+published=2025-03-02 4:00am CST
+
+description=I want to tell you how my statistic gifs are made :)
+---
+
+<style>
+	.manlink {
+		margin-top: -1rem;
+	}
+</style>
+
+I've been wanting to make a little page for the statistics of my
+webserver <i>(the system not the program)</i>. When I started to
+research the APIs that I'd need, just on a whim one day with no
+intention to start, I got grabbed by it and knew I had to start.
+
+Check it out: <a href="/starlight.html">starlight.html</a>
+
+<h2>a <code>/proc</code> foreword</h2>
+The <code>/proc</code> filesystem, on Linux, is a sort of window into
+the kernel. It lets you view some pretty detailed information by simply
+reading some files (thanks everything-is-a-file Linux).
+
+There's a lot of information about it in the man pages.
+They might all be in one big one at <code>man proc</code> but,
+like how they are on my server, they could be broken into separate pages
+for distinct sections.
+
+I have linked the relevant pages at the top of their section. It's a link
+to man7.org, which seems to be <i>the</i> source for Linux Kernel man pages
+on the internet. man7 is linked from kernel.org which lends it
+credibility at least.
+
+<h2>Memory</h2>
+
+<p class="manlink"><a href="https://man7.org/linux/man-pages/man5/proc_meminfo.5.html">man7.org/proc_meminfo</a></p>
+
+This one isn't too hard. I open the file <code>/proc/meminfo</code> and
+look for the lines starting with <code>MemTotal</code> and <code>MemAvailable</code>
+which are the total memory and currently available memory, respectively. They
+are very well named :). For usage, I just subtract available from total.
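
The subtraction above, as a minimal sketch. The function names `meminfo_field` and `used_kib` are made up for illustration and are not taken from the post's actual code; the only assumption about the file is that it has lines shaped like `MemTotal:  1000000 kB`.

```rust
/// Pull one `Key:   12345 kB` value out of meminfo-style text.
/// The number is returned in whatever unit the file reports (labelled kB).
fn meminfo_field(meminfo: &str, key: &str) -> Option<u64> {
    meminfo.lines().find_map(|line| {
        // Require an exact "Key:" prefix so "MemTotal" can't match a longer key.
        let rest = line.strip_prefix(key)?.strip_prefix(':')?;
        rest.split_whitespace().next()?.parse::<u64>().ok()
    })
}

/// Used memory: MemTotal minus MemAvailable.
fn used_kib(meminfo: &str) -> Option<u64> {
    let total = meminfo_field(meminfo, "MemTotal")?;
    let available = meminfo_field(meminfo, "MemAvailable")?;
    total.checked_sub(available)
}
```

On a real system you would feed it the result of `std::fs::read_to_string("/proc/meminfo")`.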
+
+<h2>Network</h2>
+
+<p class="manlink"><a href="https://man7.org/linux/man-pages/man5/proc_net.5.html">man7.org/proc_net</a></p>
+
+If you <code>cat /proc/net/dev</code> you can see some stats about
+your networking interfaces. This is what I parse, with some pain.
+
+I read the bytes columns from the receive and transmit sections.
+These are total counts of bytes received since boot, so you'll
+have to take two samples and subtract to get the number of bytes
+in some time-span.
+
+Looking at it in the terminal, you might assume that the separator
+between the columns was a tab character. I sure did! It is not a tab,
+but many spaces.
+
+Because of spaces-and-not-tabs
+<i>(not the tabs vs. spaces debate of usual, but with similarities)</i>, it proved
+to be a bit annoying to parse. It made me finally
+pull in a regex crate because I didn't feel like dealing with it
+at the time. Eventually&trade; I want to write a skip-arbitrarily-many-whitespace
+iterator, but for now <code>regex-lite</code> lives in my <code>Cargo.toml</code>.
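
For what it's worth, the standard library's `str::split_whitespace` already skips runs of whitespace of any length, so a regex-free parse might look something like the sketch below. This is not the post's `regex-lite` code. The name `parse_net_dev` is hypothetical, and the column positions (receive bytes first, transmit bytes ninth after the interface name) are assumed from the layout described in the man page.

```rust
/// Returns (interface, rx_bytes, tx_bytes) for each interface line
/// in /proc/net/dev-style text.
fn parse_net_dev(text: &str) -> Vec<(String, u64, u64)> {
    text.lines()
        .filter_map(|line| {
            // The two header lines contain no ':' and are skipped here.
            let (name, counters) = line.split_once(':')?;
            // split_whitespace treats any run of spaces (or tabs) as one separator.
            let fields: Vec<&str> = counters.split_whitespace().collect();
            let rx = fields.first()?.parse::<u64>().ok()?; // receive bytes
            let tx = fields.get(8)?.parse::<u64>().ok()?; // transmit bytes
            Some((name.trim().to_string(), rx, tx))
        })
        .collect()
}
```

Splitting on the colon first means the parse doesn't depend on whether there is a space between the interface name and the first counter.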
+
+<h2>CPU</h2>
+
+<p class="manlink"><a href="https://man7.org/linux/man-pages/man5/proc_stat.5.html">man7.org/proc_stat</a></p>
+
+<code>/proc/stat</code> is the least obvious of the triplet. It has more than
+just the CPU's information, but the CPU is what we're after. You'll probably notice
+many CPU lines! I'm using the one that starts with just "cpu", without a number
+(cpu0, cpu1, etc.), because I only have the 1 core. If I had more than one core
+it'd work similarly, since the just-cpu line sums the other ones, but then it could
+show >100% usage 'cause it's per-core usage just added together.
+
+First things uh, second? To summarize from the man page:<br />
+The units of these values are <i>ticks</i>. There are <code>USER_HZ</code>
+ticks per second. On most platforms it's 100 but you can
+check the value for your system with <code>sysconf(_SC_CLK_TCK)</code>.
+
+<details>
+	<summary style="font-style: italic;">small C program to check _SC_CLK_TCK :)</summary>
+	<pre><code>#include &lt;stdio.h&gt;
+#include &lt;unistd.h&gt;
+int main() {
+	printf("USER_HZ is %ld\n", sysconf(_SC_CLK_TCK));
+}</code></pre>
+</details>
+
+But what columns of data do we use? From <a href="https://stackoverflow.com/a/3017438">this stackoverflow answer</a>
+it seems that summing the user, nice, and system columns gets you the total ticks.
+The user and system make sense to me, time spent in user and system mode,
+but what on earth is nice? I sure hope it is.
+
+The Internet tells me to check <code>man nice</code>
+(<a href="https://man7.org/linux/man-pages/man1/nice.1.html">man7.org/nice</a>).
+That page says that the
+niceness of a process can be adjusted to change how the kernel schedules
+that process. Making it less nice (down to -20) increases its priority, and
+increasing its niceness (up to 19) lowers it. I guess that makes sense. Lowering
+the niceness makes the process greedier and in want of more attention
+from the scheduler? I'm unsure how well that personification tracks to reality, but
+it helped me think about it.
+
+The nice column, then, seems to be the time spent in processes that
+would go in the user column, but they have a different priority and
+I guess differentiating that is important.
+
+Oh, but there might be more columns we want!
+There's <a href="https://stackoverflow.com/a/10794088">another S.O. answer</a>
+that I found while writing this that says the sixth and seventh columns should be used
+as well. These are irq/softirq and are time spent servicing
+interrupts. I think it makes sense we'd want that, too.
+
+So you have all these columns&mdash;user, nice, system, irq,
+and softirq&mdash;that add together to give you the total number
+of ticks spent Doing Things since boot, and you have the number
+of ticks in a second. Can you see where I'm going with this?
+
+Yup, take two samples some time span apart, subtract the former
+from the latter, and then you have how much time the processor spent
+Doing Things. You can use that and the number of ticks in your time
+span to calculate utilization. Or you just have how much actual time
+The Computer spent Doing Work which is also pretty neat. Maybe you
+can pay it an hourly wage. Is that just AWS?
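
Here's roughly how that two-sample calculation could look. It's a sketch, not the post's real code: `busy_ticks` and `utilization` are illustrative names, the column order (user, nice, system, idle, iowait, irq, softirq) is taken from the man page, and `user_hz` is passed in by the caller on the assumption that it was looked up separately (100 on most platforms).

```rust
/// Busy ticks (user + nice + system + irq + softirq) from the aggregate
/// "cpu" line of /proc/stat-style text.
fn busy_ticks(stat: &str) -> Option<u64> {
    // "cpu " with a trailing space matches the aggregate line, not cpu0, cpu1, ...
    let line = stat.lines().find(|l| l.starts_with("cpu "))?;
    let cols = line
        .split_whitespace()
        .skip(1) // skip the "cpu" label
        .map(|c| c.parse::<u64>().ok())
        .collect::<Option<Vec<u64>>>()?;
    // columns: user nice system idle iowait irq softirq ...
    if cols.len() < 7 {
        return None;
    }
    Some(cols[0] + cols[1] + cols[2] + cols[5] + cols[6])
}

/// Utilization between two samples, where 1.0 means one core fully busy.
/// `seconds` is the time between samples; `user_hz` is ticks per second.
fn utilization(before: u64, after: u64, seconds: f64, user_hz: f64) -> f64 {
    // saturating_sub avoids a panic if the counter ever goes backwards.
    after.saturating_sub(before) as f64 / (seconds * user_hz)
}
```

For example, 300 busy ticks over 60 seconds at 100 ticks per second is 300 / 6000, or 5% utilization.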
+
+Something to watch out for:<br />
+apparently the numbers in <code>/proc/stat</code> can overflow and
+wrap back to zero. I don't know what size integers they are so I'm
+unsure how real of a risk that is, but it seemed worth mentioning here.
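
One way to sidestep the question of counter width is to simply drop any pair of samples where the counter went backwards. A tiny sketch (the name `tick_delta` is hypothetical, and storing the readings as `u64` is my assumption, not something the man page promises):

```rust
/// Ticks elapsed between two readings, or None if the counter went
/// backwards (for example after a wrap), so the caller can skip that pair.
fn tick_delta(previous: u64, current: u64) -> Option<u64> {
    current.checked_sub(previous)
}
```

Skipping one sample costs a single missing point on the graph, which seems cheaper than guessing where the counter wrapped.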
+
+<h2>So you've parsed the stats, now to graphs!</h2>
+
+My main trouble here was selecting a range that makes sense for
+the data it's representing.
+
+Again, memory was easy. There is a
+total, normally-unchanging amount of RAM, so I just use that as
+the max. Perhaps there's something to be said about zooming further
+in to see the megabyte-by-megabyte variance, but I am much more
+interested in a "how close am I to the ceiling" kind of graph. Like,
+would I hit my head if I jumped? That kind of thing.
+
+The CPU graph, though, that's very variable and a bit spiky.
+I don't <i>really</i> care what the max value was if it's a spike,
+it can go off the top for all I care, what I want to see is the
+typical usage.
+
+If I just ranged to the max then I'd have what I call The Linode
+Problem. I call it that, rather predictably, because that's what
+Linode's graphs do and it makes them kind of useless? Great, I love
+to see that spike up to 100%, but that's <i>all</i> that I can see now.
+
+So instead of max-grabbing, I sort the data and take the value that's
+<i>almost</i> max. My series are 256 samples long, so what this looked
+like was taking the 240th value in the array, getting the closest-highest
+percent, and using that as the top of the range.
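
A sketch of that almost-max selection, assuming the samples are percentages stored as `f64` and that "closest-highest percent" means rounding up to the next whole percent. Both are my reading of the paragraph above, and `range_top` is an illustrative name.

```rust
/// Pick the top of the graph's range: sort a copy ascending, take the
/// `nth` value (1-based), and round it up to the next whole percent.
fn range_top(samples: &[f64], nth: usize) -> Option<f64> {
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    // nth is 1-based, so the 240th value lives at index 239.
    sorted.get(nth.checked_sub(1)?).map(|v| v.ceil())
}
```

With 256 samples and `nth = 240`, the 16 largest values are allowed to run off the top of the graph.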
+
+This <i>does</i> mean if it's <i>very</i> spiky, I get The Linode Problem
+again, but in that case I'm kind of okay with it. I sample every minute,
+so my 256 pixel long graphs are roughly 4 hours long. If it spikes more
+than 16 times in that period, perhaps that's worth looking into.
+
+Okay, CPU done. Network time! It's the same, pretty much. Where there was
+one line, there are now two. And lots more spikes! I combine the receive
+and transmit series into one <code>vec</code>, sort it, and take the 32nd
+highest value.
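
The network version could be sketched like this, assuming the per-interval byte counts are kept as `u64` (the name `nth_highest` is made up for illustration):

```rust
/// Combine both series, sort descending, and return the `nth` highest
/// value (1-based) to use as the top of the range.
fn nth_highest(rx: &[u64], tx: &[u64], nth: usize) -> Option<u64> {
    let mut all: Vec<u64> = rx.iter().chain(tx.iter()).copied().collect();
    all.sort_unstable_by(|a, b| b.cmp(a)); // largest first
    all.get(nth.checked_sub(1)?).copied()
}
```

Calling it with `nth = 32` on two 256-sample series lets the 31 largest values of the combined 512 clip off the top.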
+
+I draw the area under the line, too, because it was nigh impossible to see
+the line when it was so.. discontinuous? We get another problem with that,
+though, where the second-drawn line-and-underfill will obscure the one
+drawn first. So then, to not overdraw an entire measurement, I try to draw
+the average-larger one first. Which is to say, I take the average of both
+series separately and draw the one with the bigger average first. That way
+the smaller one will hopefully nestle under the larger, like a baby bird
+hiding from the rain under their parent's wing.
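
The draw-order decision is small enough to sketch in a few lines. The names `average` and `draw_order` are hypothetical, and the series are assumed to be slices of `u64`:

```rust
/// Mean of a series; an empty series averages to 0.
fn average(series: &[u64]) -> f64 {
    if series.is_empty() {
        return 0.0;
    }
    series.iter().sum::<u64>() as f64 / series.len() as f64
}

/// Returns the two series in drawing order: the one with the larger
/// average first, so the smaller one is drawn on top of it.
fn draw_order<'a>(a: &'a [u64], b: &'a [u64]) -> [&'a [u64]; 2] {
    if average(a) >= average(b) {
        [a, b]
    } else {
        [b, a]
    }
}
```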
+
+<hr class="asterism-dash" />
+
+That's how the range selection works, anyway.
+
+The graphs themselves are drawn on a 256x160 gif because I like gif, 256 is
+a good number, and they seem to compress better than png in this use case.
+
+One day I'd love to try and generate alternative text to describe
+the general look of the graph. "The memory usage is steady at 300MB",
+or something like "The network usage is variable, but averages 15.4kbps".
+
+That's it!<br />
+bye :)
\ No newline at end of file