index/2025
some recent SQCU projects you should really hear about!
sd-scripts & reforge:
sheafshifter:
there comes a time in every computer scientist's life when they begin to pine. yearn. dream. towards... something more. something bigger... a world in which you can program more than one computer at once, sometimes dozens of the things. this dream leads all astray, often towards programs with weird greek names or something called 'plan 9'. the SQCU version is quite different: it uses networking and multiprocessing as tools to hide the exceptions, bad resource use, and incompatible dependencies of 2020s 'slopcode' tools from one another.
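to make that concrete, here is a minimal python sketch of the isolation idea. none of these names are sheafshifter's actual API (the inline worker and call_isolated are invented for this page); the point is only that a tool running in its own interpreter keeps its exceptions, leaks, and dependency conflicts quarantined behind a JSON pipe:

```python
# a minimal sketch of the isolation idea, NOT the actual sheafshifter API:
# each 'slopcode' tool runs in its own interpreter, and the parent only
# ever sees JSON lines on a pipe, so the child's exceptions and
# dependency conflicts cannot contaminate the orchestrating process.
import json
import subprocess
import sys

WORKER_SRC = r"""
import json, sys
for line in sys.stdin:
    req = json.loads(line)
    try:
        # stand-in for importing and calling one incompatible tool here
        result = {"ok": True, "echo": req["payload"]}
    except Exception as exc:  # crashes stay inside this process
        result = {"ok": False, "error": repr(exc)}
    print(json.dumps(result), flush=True)
"""

def call_isolated(payload):
    """run one request through a throwaway worker interpreter."""
    proc = subprocess.Popen(
        [sys.executable, "-c", WORKER_SRC],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
    )
    out, _ = proc.communicate(json.dumps({"payload": payload}) + "\n")
    return json.loads(out)

print(call_isolated({"prompt": "hello from the parent process"}))
```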
attn-demo:
originally meant to demonstrate something like a 'lightning talk' on the (then cutting-edge) 'layerqknorm' neural network architectural modification, back in the naive and simple month of 2025-02. attn-demo soon grew into something stranger, something more bizarre, something bigger... a self-contained dataset parser, dataset tokenizer, and language model trainer oriented towards fast training runs and rapid prototyping of neural network architectures. calls for further inquiry.
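'layerqknorm' is attn-demo's own variant and the repo is the authority on its details; what follows is only a generic pytorch sketch of the qk-norm family it belongs to, where queries and keys are normalized per head before the dot product so attention logits stay bounded. the class name and the LayerNorm placement are guesses, not attn-demo code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class QKNormAttention(nn.Module):
    def __init__(self, dim, n_heads):
        super().__init__()
        self.n_heads = n_heads
        self.head_dim = dim // n_heads
        self.qkv = nn.Linear(dim, 3 * dim, bias=False)
        self.proj = nn.Linear(dim, dim, bias=False)
        # normalize queries and keys per head before the dot product,
        # which keeps attention logits bounded as training progresses
        self.q_norm = nn.LayerNorm(self.head_dim)
        self.k_norm = nn.LayerNorm(self.head_dim)

    def forward(self, x):
        b, t, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        shape = (b, t, self.n_heads, self.head_dim)
        q = self.q_norm(q.view(shape)).transpose(1, 2)
        k = self.k_norm(k.view(shape)).transpose(1, 2)
        v = v.view(shape).transpose(1, 2)
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.proj(out.transpose(1, 2).reshape(b, t, d))

x = torch.randn(2, 16, 64)
print(QKNormAttention(dim=64, n_heads=4)(x).shape)  # torch.Size([2, 16, 64])
```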
attn-demo
┕ shift-attn:
sliders:
"...think of this as a bug-for-bug reproduction of the upstream sliders implementation, to make it more obvious how little must be changed to extend the behavior of this sort of loss function." adds some funny new features to the mechanistic interpretability toy first published by Gandikota & Materzy, 2024. particularly, the feature of constraining a neural network to learn differences in its parameters such that *very small* rescaling of the learnt differences elicits totally different imagery, corresponding to training data subsets. this tool is almost totally unexplored, as it was hacked together in ~3 days before packing up to attend Hyperplex 2025verifiers:
verifiers:
no substantial patches here! ...yet. GRPO is fundamentally compatible with many kinds of neural network, but actually orchestrating memory- and compute-efficient samplers for non-standard loss functions, non-standard training goals, and neural networks which aren't LLMs is still a distant dream for the academic research-replication community, let alone for open-source hobbyists or end-user-targeting software such as art tools, audio synthesis pipelines, or videogames.
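for orientation: the genuinely model-agnostic kernel of GRPO is tiny. score a group of rollouts per prompt with any verifier, then normalize rewards within the group, so no learned value network (the usual memory hog) is needed. everything hard lives outside this sketch (the sampler, the verifier, the policy-gradient step for a non-LLM), and none of it is verifiers code:

```python
import torch

def group_relative_advantages(rewards: torch.Tensor, eps: float = 1e-6):
    """rewards: (n_prompts, group_size) verifier scores for sampled rollouts."""
    mean = rewards.mean(dim=1, keepdim=True)
    std = rewards.std(dim=1, keepdim=True)
    # each rollout is judged only against its own group; the normalized
    # score is the advantage that weights the policy-gradient update
    return (rewards - mean) / (std + eps)

rewards = torch.tensor([[0.0, 1.0, 1.0, 0.0],
                        [0.2, 0.9, 0.4, 0.5]])
print(group_relative_advantages(rewards))
```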