--- template=post title=Corgi Gets Faster style=/styles/post.css style=writing.css #published=2025-03-02 4:00am CST description=corgi, my cgi server, got a 1000x performance uptick --- A tool related to my CGI server was taking nearly seven whole seconds to respond to my request and that just wouldn't do. (it was also poisoning the mutex in corgi itself because the sqlite connection would timeout and the thread would panic because the code is soft and unhardened) What is the issue, then? Well, I have this page, git.dreamy.place/stats, for telling me what user-agents are making requests. What, because I want to see how many of them are for-sure bots (nearly-mostly-entirely all). The problem, then, is that after five months of running i have over eleven million requests logged and that is too many if your SQL query looks at the entire table :) But, you see, the stats page currently only has 2 tables on it: count of total requests from the top 10 user-agents, and every user-agent in the last 15 minutes and how many requests they've made. There is simply no reason to keep looking over almost all of the requests from before 15 minutes ago if I kept a running total per-agent. Do I get rid of all those rows, then? No! I'd never! My precious data i'll never look at probably!
CREATE TABLE IF NOT EXISTS requests(
	id INTEGER PRIMARY KEY AUTOINCREMENT,
	timestamp TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
	agent_id INTEGER NOT NULL,
	ip_address TEXT NOT NULL,
	script TEXT NOT NULL,
	path TEXT NOT NULL,
	FOREIGN KEY (agent_id)
		REFERENCES agents(id)
);