
Why $f(x)$ Is Not Free

Algebras, transport, and the hidden cost of physical computation

The Laundry Problem

Funny enough, I am doing laundry as I am writing this post.

And laundry is actually a good first-principles model for physical computing.

The work I care about is simple: wash the clothes. In math-land, I could write:

$$y = f(x)$$

where x is the dirty load, f is the washing function, and y is the clean load.

Beautiful. Clean. Too clean.

Because the clothes are not magically inside the washer. The washer is not in my unit. I have to arrange the clothes. I have to separate darks from lights. I have to carry the load from my apartment to the building laundry room. I may have to wait for the elevator. If there is elevator contention, I wait longer. If the laundry machines are occupied, now I have an even worse problem: I either wait, or I take the longer trip to another laundromat outside my building, which has even more washers — what a tradeoff!

How many clock cycles, I mean "minutes" — sorry for the jargon — will all this work prior to the actual start of the washing take? Ugh.

None of that is the washing function itself.

But all of it affects the total time-to-clean-clothes.

That is the hidden cost.

Laundry Model

x = dirty laundry / load
f = washing machine / washing function
T(x, f) = getting the laundry to the washer
y = clean laundry

Now if I could only accelerate the entire process of doing laundry.

If I want to reduce total latency, I could attack the transfer — or transport, the T — step. Maybe I buy a washing machine and put it inside my unit. Now the hardware is closer to the dirty laundry. The data is closer to the function. The transfer cost drops.

If I want more throughput (getting more articles of clothing washed per unit time), maybe I buy a larger washer. Or multiple washers. Or maybe, if I had eight hands, I could separate clothes faster and feed the machines faster. Now I am playing with doing more work at the same time.

But of course, the world is not free. A washer in my unit takes space. It uses power. It costs money. It may violate building rules. A bigger washer may increase throughput, but it increases area, power draw, installation complexity, and cost. Multiple washers may help performance, but now I am paying for more hardware and using more space.

So even laundry runs into a version of the same engineering constraints: performance, power, area, reliability, and cost.

The actual washing is the computation we care about. But the real system includes sorting, carrying, waiting, elevator contention, machine availability, loading, machine capacity, power, space, and cost.

That is the point.

When f(x) is implemented in the real world, there is always more than the clean function. There is preprocessing. There is movement. There is boundary detection. There is staging. There is contention. There is a transport step.

That transport step is what I call:

$$T(x, f)$$

And once you see that, physical computing starts to look different.

The Stack I Am Deriving

Before going further, here is the physical computing stack I am deriving.

The Stack

real-world phenomenon
-> mathematical model
-> relation / algebra
-> operators + operands
-> encoded operators + encoded operands
-> instruction / protocol / dataflow graph
-> hardware-supported operators
-> transport, memory, boundaries, noise
-> physical output

Most "first principles" computing explanations start at transistors, gates, NAND, CPUs, or binary. Those are important, but they already assume something deeper: that we have chosen a symbolic algebra, encoded its elements, and built physical operators that preserve the algebra well enough to compute.

That is the basement I care about here.

The Illusion of Pure Math

The laundry example is mundane, but the structure is not. The same pattern shows up when we implement equations, algorithms, database queries, rocket-control logic, or low-latency market-data pipelines.

The world we try to model with equations may be natural, like projectile motion or planetary orbits. It may be man-made, like compound interest, economic models, cryptographic equations, or database transformations. Either way, once we want a machine to compute or approximate the equation for us, we have to deal with physical reality.

For simplicity, consider one function, or more broadly one mathematical relation, acting on one input:

$$y = f(x)$$

In math-land, there is no space. No time. No elevator. No waiting for the washer. No bus contention. No cache miss. No heat. No routing congestion. No partial chunks. No noisy wire.

But physical computers do not live in math-land.

In the applied physical world, x has location. f has location. The result y must be written somewhere. And none of that happens in zero time.

So the real-world version is not merely:

$$y = f(x)$$

It is closer to:

$$x' = T(x, f)$$

$$y = f(x')$$

$$\text{total cost} = \operatorname{cost}(T) + \operatorname{cost}(f)$$

Notation Caveat

Here, T(x, f) is my shorthand for getting x to the place where f can actually do its work. The cost of that trip is cost(T): time, energy, waiting, contention, routing, whatever the physical world decides to charge us.
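Here is a minimal sketch of that accounting, using laundry numbers I made up. The `Plan` class and every minute value are illustrative assumptions, not measurements.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Plan:
    """One way of getting the load x to the washer f (all numbers hypothetical)."""
    name: str
    transport_min: float  # cost(T): sorting, carrying, waiting
    wash_min: float       # cost(f): the washing cycle itself

    def total_cost(self) -> float:
        # total cost = cost(T) + cost(f)
        return self.transport_min + self.wash_min

building_room = Plan("building laundry room", transport_min=25.0, wash_min=40.0)
in_unit = Plan("washer in my unit", transport_min=5.0, wash_min=40.0)

for plan in (building_room, in_unit):
    print(f"{plan.name}: {plan.total_cost():.0f} min total")
```

In this toy comparison cost(f) is identical in both plans. The whole difference comes from cost(T).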

That transport cost is not a minor implementation detail. It is one of the central facts of physical computing.

Latency. Bandwidth. Memory hierarchy. Cache misses. Bus width. FPGA routing. Network hops. Serialization. DMA. NUMA. PCIe. Coherence. Clock domains. Signal integrity. Power. Heat.

The Point

Pure math says y = f(x). Physical computing asks where x is, where f is, how x gets to f (or the other way around), what boundaries define x, what hardware operators implement f, what memory holds partial state, and what breaks along the way.

But the moment transport takes time, computation can no longer be treated as one frozen instant. Something has to be captured, held, sampled, moved, or remembered while the rest of the system catches up. That means we have to talk about state.

Computing as Relations Over States

One first-principles lens I keep coming back to is this:

Computing is the physical realization of relations over states.

Or in plain English: we define how one state must relate to another, then we force matter — wires, memory, voltage, silicon, whatever — to embody that relationship.

State Depends on the Reference Frame

When I studied math, state looked one way. In physics, it looked another way. In software engineering, we talk about stateless and stateful protocols. In digital design and FPGA work, state often means registers, flip-flops, RAM, finite state machines, and values captured on clock edges.

So we need to bridge and unify these definitions.

A lot of technical writing treats state as if it were simply a static, objective property. Useful? Yes. Complete? No. State is relative to the observer (see the Relativity part of my SPPARRS framework). The observer’s reference frame then determines what can be seen, when it can be sampled, what boundary is being used, and which distinctions matter.

And by "reference frame", I do not mean physics cosplay. I mean something concrete: what can this observer see, when can it see it, what boundary is it looking through, and what states are even available from where it stands?

A state is a snapshot of the information required to describe a system at a specific "moment". I use quotes because "moment" depends on the observer and its reference frame. That moment is defined by an observation event. In distributed and physical systems, there may be no single universal "now" that every observer shares.

In a synchronous digital system, that event is a clock edge. The clock gives the system its reference frame: state is the information captured, or "observed", at that edge. In this case, the flip-flop is the observer: it interacts with a signal of some system and captures a value (data). A latch plays the same role, except that it is level-sensitive: it observes while the clock sits at one level instead of at an edge. To that state-holding element, the clock is the reference frame that defines when observation happens. Notice: I never said an observer has to be conscious, though I have a philosophical take on this too.

In an asynchronous system, there may be no single global clock deciding the "moment" for every observer. The observation event might be the arrival of data, a packet showing up, a voltage crossing a threshold, or a neuron firing.

For example, in a large-scale distributed software system, we might have two observers: a Writer and a Reader. You may also hear these called Producer and Consumer, or Publisher and Subscriber. The Writer and Reader may even be measured against the same wall-clock system, but their read and write events can occur at different rates, at different times, with different delays, and under different local conditions. Effectively, they operate in different logical event frames: the Writer’s frame and the Reader’s frame.

For the Writer, a write event is the production of a signal or data item that may eventually be consumed. You can think of that write as a tick in the Writer’s event frame. For the Reader, the relevant event is not merely that the Writer wrote something. The relevant event is when the signal becomes available to read.

Because the Writer and Reader live in different logical event frames, they can communicate without forcing their read and write rates to become identical. A queue sits between them as the shared boundary. For the Reader, the observation event is when data reaches the front, or head, of the queue and becomes available to consume.
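A toy sketch of that arrangement, assuming a made-up schedule where the Writer produces on every step and the Reader only gets a turn on every third step. The names `queue` and `consumed` and the schedule itself are hypothetical.

```python
from collections import deque

queue = deque()          # shared boundary between the two event frames
consumed = []

# Hypothetical schedule: the Writer ticks on every step,
# the Reader only ticks on every third step.
for step in range(9):
    queue.append(f"item-{step}")            # write event in the Writer's frame
    if step % 3 == 2 and queue:
        consumed.append(queue.popleft())    # read event: head of queue is observable

print(consumed)      # items the Reader has observed so far
print(len(queue))    # items written but not yet observed
```

After nine steps the Reader has seen three items and six are still waiting. The two rates never had to match, but the backlog only grows in this run, which is why real queues are bounded or apply backpressure.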

The same constraint shows up in digital hardware at the micro-distributed scale. When two clocked observers live in different reference frames — different frequencies, phases, or clock domains — they need a safe way to pass data between them. An asynchronous FIFO can mediate that crossing. To the consumer, data is not safely observable until its own clock permits observation. This is the clock-domain-crossing problem, or CDC.

Security Aside: Bugs Live Between Frames

This is also why security bugs live in the gaps between observers. The programmer may model the system under one frame: expected inputs, expected states, expected transitions. An attacker observes from another frame: malformed inputs, timing windows, retries, parser edge cases, memory layout, authorization gaps, and weird intermediate states the original model ignored. Same system. Different observer. Different visible state space. Hold that thought, because this comes back later when security enters the physical execution path.

Computation is not a sequence of global truths. It is a choreography of local observations.

Beginning to see why Relativity is a real systems constraint?

In either case, the observer is what discretizes the chaotic soup of physics into a value we can reason about. Mathematically, the set of all possible snapshots is the state space.

Whether it is a stable voltage, a transient pulse, or an unresolved value, if it is an identifiable condition captured by the system’s observer, it is a state for that system. We assign it a symbol and move on.

Pure mathematics often lets us strip away the observer, or pretend the observer is omniscient, and treat states as abstract elements of a set. Fine. That is useful in math-land. But computing cannot always get away with that, because computing is physical.

In the physical world, a state becomes usable only when some observer captures it. And if the observer is inside the system it is trying to observe, then it does not get the whole system from nowhere. It gets a partial view. It samples. It misses things. It introduces boundaries.

That is where uncertainty enters: unresolved values, noise, timing windows, probability distributions, and the annoying fact that the model is not the thing itself.

The Relativity of Scale

State also depends on Scale — the final S in SPPARRS.

In a distributed system, we might view a machine as stateless. If I call that machine stateful, an engineer could object: "No, it is stateless because the API does not persist session data between calls".

Exactly. At the API scale, you have defined a contract that ignores the RAM, flip-flops, registers, caches, and other internal state. But the internal state is still there. If the power cuts, much of that internal state is lost. From that lower-level frame of reference, the machine is absolutely stateful.

So "statelessness" does not mean "there is no state anywhere". It means you are choosing a scale where the internal state is not supposed to affect the high-level relation being exposed.

Statelessness is a successful act of encapsulation. It is a choice of what to ignore.

The Mathematical Mapping

Mathematically, we can represent computation as a relation between an input state space X and an output state space Y:

$$R \subseteq X \times Y$$

In pure mathematics, this relation can be treated as static and timeless. The ordered pairs simply exist. But computing is the act of forcing this math into the physical world. This introduces causality, time, energy, space, and all the annoying stuff math-land politely ignored.

To compute, we take a state that the observer identifies "now" and use the laws of physics to move matter into the state that must exist "next". When we constrain the physics so that every input state maps to exactly one output state, we have a function:

$$f : X \to Y$$

That is the clean mathematical view. We take a domain, we define a mapping, and we use the physical machine as the enforcer to make matter follow the map.
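To make the relation-versus-function distinction concrete, here is a small sketch that treats a relation as a set of (input, output) pairs and checks whether it is a function on a domain. The helper `is_function` and the sample relations are my own illustrative assumptions.

```python
def is_function(relation, domain):
    """Return True if every x in `domain` is paired with exactly one y."""
    outputs = {x: set() for x in domain}
    for x, y in relation:
        if x in outputs:
            outputs[x].add(y)
    return all(len(ys) == 1 for ys in outputs.values())

X = {0, 1, 2}
R_loose = {(0, "a"), (0, "b"), (1, "a"), (2, "c")}   # 0 relates to two outputs
R_tight = {(0, "a"), (1, "a"), (2, "c")}             # one output per input

print(is_function(R_loose, X))   # False
print(is_function(R_tight, X))   # True
```

Both sets are relations in X × Y. Only the second one has been constrained enough to count as f : X → Y.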

Forcing the Matter

If I seem pedantic about relativity, scale, and physical constraints, this is why: computing in the real world is applied physics. The math gives us the rule. The physics is the struggle to obey that rule.

Suppose we want to implement a relation physically. If the domain is continuous, we may discretize it into buckets. If we are using binary digital hardware, we encode those buckets as bits.

Why bits? Because two symbols are the smallest useful contrast: low/high, false/true, 0/1. That does not mean reality is only binary. It means binary is a useful engineering choice for making distinctions robustly.

Now an input state becomes a bit pattern, and an output state becomes another bit pattern.

At that point, a relation like f can be physically encoded as a table:

abstract rule for f: f(x) = y
after encoding: x bits -> y bits
physical table version: lookup(x bits) = table[x bits] = y bits

The input bits for x select the stored output bits. So the machine is not "understanding" f. It is using the encoded input to choose the output that physically represents f(x).

That is the basic idea behind a LUT, or lookup table.

An FPGA LUT is a physical version of this idea: a small Boolean function, or relation in the broader sense, like f, stored as output bits selected by input bits. Input pattern comes in. Output pattern comes out.
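A software sketch of the same idea, assuming a 3-input table and using three-bit parity as a stand-in for f. The names `build_lut`, `xor3`, and `lookup` are hypothetical, and a Python list is only an analogy for the configured bits inside a real LUT.

```python
def build_lut(f, n_inputs):
    """Precompute f for every n-bit input pattern (the 'configuration' step)."""
    return [f(i) & 1 for i in range(2 ** n_inputs)]

def xor3(bits):
    # Hypothetical example function: parity of three input bits.
    return (bits ^ (bits >> 1) ^ (bits >> 2)) & 1

table = build_lut(xor3, 3)      # 8 stored output bits

def lookup(x_bits):
    # At "runtime" nothing is computed: the input pattern selects a stored bit.
    return table[x_bits]

print(table)            # [0, 1, 1, 0, 1, 0, 0, 1]
print(lookup(0b101))    # 0
```

Swap `xor3` for any other 3-input Boolean function and `lookup` does not change. Only the stored bits do.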

This is part of why FPGAs can be fast for the right problems: the relation has already been baked into the hardware. The input space is wired to select from the output space. The machine is not "figuring out" f at runtime. It is following a physical structure you configured earlier.

But physically, the lookup still has to be built out of something: decoding, selection, muxing, routing, wires, capacitance, fanout, layout, and timing.

A lookup may be constant-time in the model. But silicon still has to move charge.

That is the first crack in the clean abstraction: the next state does not appear by magic. It has to be physically produced. That takes time, energy, and space.

Algebras Need Hardware Operators

There is another hidden assumption in math-land.

When we write an expression, we act as if the operators already exist in the machine.

For example, we write:

$$a + b$$

But + is not magic. In an abstract algebra, + is an operator over some set of elements. In hardware, that operator has to be implemented by some physical mechanism.

So there are really two worlds:

Two Operator Worlds

Abstract algebra: elements + operators
Physical hardware: encoded operands + hardware-supported operators

If the abstract operator maps cleanly to a hardware operator, great. If not, the compiler, hardware designer, runtime, or algorithm designer has work to do.

A general-purpose CPU gives you a fixed menu of hardware-supported operations: add, multiply, load, store, branch, compare, and so on. That menu is powerful because it is general. But generality has a cost.

If the abstract operation you care about does not match the hardware's primitive operator set, then the operation must be decomposed into smaller supported operations.

That is why hardware acceleration exists.

An accelerator tries to make the hardware algebra look more like the problem algebra.

Instead of forcing the problem to crawl through a long sequence of generic operations, we build hardware operators that are closer to the operations we actually care about.

For example, suppose the abstract operation is matrix multiplication:

$$C = A \times B$$

Here, the operands are matrices, and the operator is matrix multiplication. A CPU can implement this by breaking the operation into smaller steps: loads, stores, additions, multiplications, loops, and possibly instructions that operate on multiple values at once. But the CPU is not literally a matrix-multiplication operator. It is executing many lower-level operations to simulate the bigger algebraic operation.
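A plain-Python sketch of that decomposition. This is the textbook triple loop, not how an optimized library does it, and the function name `matmul` is my own.

```python
def matmul(A, B):
    """C = A x B as the loads, multiplies, and adds a scalar machine performs."""
    n, k, m = len(A), len(B), len(B[0])
    C = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            acc = 0                          # accumulator: intermediate state
            for p in range(k):
                acc += A[i][p] * B[p][j]     # one multiply-accumulate step
            C[i][j] = acc
    return C

print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))   # [[19, 22], [43, 50]]
```

One algebraic operator on paper becomes n × m × k multiply-accumulate steps in the machine, each with its own loads and stores.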

A systolic array is different. Matrix multiplication has a repeated multiply-and-accumulate dataflow. A systolic array arranges processing elements so data moves through the hardware in a regular pattern while partial sums accumulate. The hardware shape better matches the operation.

That is the key idea:

Hardware acceleration is what happens when we move the hardware operators closer to the abstract operators.

An ASIC (Application-Specific Integrated Circuit) can go even further. If the operation is important enough, repeated enough, and stable enough, we can specialize the hardware so the physical operator approximates the abstract operator more directly.

An FPGA sits in the middle. It lets us reconfigure the hardware graph so the physical operators and data paths better match the function we want to compute.

This is not just about making f faster. It is about reducing the translation gap between the algebra of the problem and the algebra of the machine.

That is also why people explore machines beyond general-purpose CPUs: GPUs, FPGAs, ASICs, optical computing, analog accelerators, and whatever else might better match the problem. The question is not "is this new and shiny?" The question is whether the physical machine gives a better operator set, lower transport cost, lower power, better bandwidth, or less heat for a certain class of problems.

In other words:

Acceleration Reframed

Software asks: Can I express this operation?
Hardware asks: Can I physically support this operator, move the operands to it, and do so within physical (and yeah, financial) constraints?

Operators and Operands Become Encoded Sequences

Now notice something else.

We can binary-encode operands. But we can also binary-encode operators.

In math, we may write:

$$f(x)$$

or:

$$f(A, B)$$

where f is the operator/function/relation and x, or A and B, are the operands/inputs.

But inside a machine, both the operator and operands can be represented as bit patterns.

Example of Encoded Operator + Encoded Operand

f = 010110
x = 000011
[f, x] = 010110 000011

That combined sequence can become an instruction-like object: some bits say what operation to perform, and other bits say which operands, registers, addresses, destinations, or extra rules are involved.

So the expression:

$$f(x)$$

can become something more physical:

[operator bits | operand bits | destination bits | boundary rules]

Note: I prefer to call them "operator bits" in line with algebra, but industry usually calls them "opcode bits".

This is where the clean algebra starts turning into an instruction format, protocol, packet, micro-op, hardware control word, or dataflow token.

And once that happens, the machine needs to parse the sequence.

Where does the operator field end? Where do the operands begin? Is the operand a value or an address? Is the destination implicit or explicit? How wide is each field? Is the instruction fixed-width or variable-width? Does the next stage know how to decode it?
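A sketch of one possible answer, assuming a fixed-width word with 6 operator bits, 6 operand bits, and 4 destination bits. The field widths, the destination value, and the `encode`/`decode` helpers are hypothetical. The operator and operand patterns are the ones from the example above.

```python
# Hypothetical fixed-width format: 6 operator bits, 6 operand bits, 4 destination bits.
OP_BITS, X_BITS, DST_BITS = 6, 6, 4

def encode(op, x, dst):
    """Pack the three fields into one 16-bit word, operator bits first."""
    assert op < 2**OP_BITS and x < 2**X_BITS and dst < 2**DST_BITS
    return (op << (X_BITS + DST_BITS)) | (x << DST_BITS) | dst

def decode(word):
    """Recover the fields; this only works because both sides agree on the widths."""
    dst = word & (2**DST_BITS - 1)
    x = (word >> DST_BITS) & (2**X_BITS - 1)
    op = word >> (X_BITS + DST_BITS)
    return op, x, dst

word = encode(0b010110, 0b000011, 0b0001)
print(f"{word:016b}")     # 0101100000110001
print(decode(word))       # (22, 3, 1)
```

Nothing inside the 16 bits says where one field ends. The boundary lives in the shared constants, which is exactly the protocol the clean math never had to mention.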

Again, boundaries.

Again, transport.

Again, memory.

The math was compact. The physical representation needs a protocol.

Compiler Translation Is Algebra Translation

A compiler is not merely turning "human-readable code" into "machine code". At a deeper level, it is translating one algebra into another. A high-level language gives us abstract operators, types, functions, objects, arrays, loops, and memory models. The target machine gives us a different algebra: registers, loads, stores, branches, arithmetic operations, memory addresses, calling conventions, and eventually micro-ops and control signals.

So compilation is a translation process: take the rich algebra the programmer wants to use, then map it onto the smaller, stricter, physical algebra the machine actually supports.

f May Not Fit

Sometimes f is too large to implement directly.

Maybe the hardware does not support the operation natively. Maybe the function is too big for one LUT. Maybe it needs multiple instructions. Maybe it needs several stages. Maybe it needs a whole graph of smaller operations. If there is no feedback or cycle, that graph may be a DAG, a directed acyclic graph. If state feeds back into the next step, the graph may have cycles.

So instead of:

$$f(x)$$

we get:

$$g(h(x))$$

or more generally:

$$f = f_n \circ f_{n-1} \circ \cdots \circ f_1$$

This is the important move:

A function in math may become a graph in hardware.

In software, that graph may be instructions, basic blocks, control flow, data flow, tasks, operators, or services.

In hardware, it may be gates, LUTs, adders, muxes, registers, buses, DSP blocks, pipeline stages, or finite state machines.

In data systems, it may be scan, filter, project, join, aggregate, sort, shuffle, materialize.

Same pattern.

The abstract function gets decomposed into smaller physical relations that the hardware can actually execute.
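The simplest such graph is a straight chain, which matches the composition formula above. Here is a sketch, assuming three made-up stages (`scale`, `offset`, `clamp`) that stand in for whatever smaller operations the hardware supports.

```python
from functools import reduce

def compose(*stages):
    """Build f = f_n ∘ ... ∘ f_1 from stages listed in execution order f_1..f_n."""
    return lambda x: reduce(lambda value, stage: stage(value), stages, x)

# Hypothetical stages standing in for smaller hardware-supported operations.
def scale(x):
    return x * 2

def offset(x):
    return x + 3

def clamp(x):
    return min(x, 10)

f = compose(scale, offset, clamp)
print(f(2))   # 7
print(f(5))   # 10
```

The caller still sees one f. Underneath, every intermediate value has to exist somewhere between stages.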

This is why "implementation detail" is not a small thing. In physical computing, implementation is where the abstract function either survives or dies.

x May Not Fit Either

Now flip the problem.

Maybe f is fine, but x is too big.

The input may not arrive all at once. It may arrive in chunks, packets, cache lines, pages, frames, etc.

So instead of one clean object:

$$x$$

we get:

$$\langle x_0, x_1, x_2, \ldots, x_n \rangle$$

Now the system has a new problem:

Where does one piece end and the next begin?

A function is dumb. It does not magically know what the input "means". It has to be told where the boundaries are.

Those boundaries may come from headers, length fields, schemas, delimiters, valid bits, ready/valid handshakes, packet framing, cache-line size, row groups, page boundaries, clock edges, or timing conventions.

Even silence can be a delimiter. No signal for some period can mean "frame ended". A clock edge can mean "sample now". A voltage crossing can mean "event triggered".
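Of the boundary mechanisms listed above, the length field is probably the easiest to sketch. The format here, a one-byte length in front of each payload, is an assumption I picked for illustration, as are the `frame` and `parse` helpers.

```python
def frame(payload: bytes) -> bytes:
    """Prefix a payload with a 1-byte length field (so at most 255 bytes)."""
    if len(payload) > 255:
        raise ValueError("payload too long for a 1-byte length field")
    return bytes([len(payload)]) + payload

def parse(stream: bytes) -> list[bytes]:
    """Recover the pieces from a byte stream using only the length fields."""
    pieces, pos = [], 0
    while pos < len(stream):
        length = stream[pos]
        end = pos + 1 + length
        if end > len(stream):
            raise ValueError("truncated frame")
        pieces.append(stream[pos + 1:end])
        pos = end
    return pieces

stream = frame(b"socks") + frame(b"") + frame(b"towels")
print(parse(stream))   # [b'socks', b'', b'towels']
```

Without the length bytes, the stream is just `sockstowels` and the empty piece in the middle is unrecoverable.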

So the boundary is not decoration.

The boundary is part of the computation.

f(x), once translated to hardware, is not looking so simple anymore, right?

No boundary, no parsing. No parsing, no data. No data, no computation.

Transport Creates Memory

There is another reason x may not fit: the pipe may not be wide enough to move all of x at once. Once x arrives in pieces, the machine needs somewhere to hold those pieces.

That means memory.

And by memory, I do not only mean DRAM. I mean anywhere the system can hold something for later: a register, a flip-flop, a latch, a buffer, a queue, a cache line, a table, an accumulator, a log, or a temporary variable.

The deeper point is simple: memory appears because the computation cannot be completed from one isolated instant.

If the function needs multiple pieces before it can produce a result, the system needs somewhere to wait and collect them.

If the function produces partial results that must be combined later, the system needs somewhere to keep them.

If the next step depends on the previous step, the system needs state.

So physical computing quickly becomes:

split x into pieces -> mark the boundaries -> move the pieces -> hold intermediate state -> compute -> combine the results

Now there are two basic situations.

If the pieces are independent, the system can do the same operation on many pieces:

x0 -> h(x0)
x1 -> h(x1)
x2 -> h(x2)

Then it can combine the results:

combine(h(x0), h(x1), h(x2))

That gives width: more work happening at the same time.

But if each piece depends on the previous piece, the system has to carry state forward:

state1 = h(state0, x0)
state2 = h(state1, x1)
state3 = h(state2, x2)

Now the computation has a dependency chain.

Independent pieces give width. Dependent pieces drag state forward.

Later, people give these patterns names: map, reduce, fold, parallelism, dependency chains. But the structure comes first.
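Both shapes side by side, as a sketch. The piece values, the squaring function `h`, and the state update in `step` are arbitrary choices of mine, and I use a separate name for the stateful version to keep the two apart.

```python
from functools import reduce
from itertools import accumulate

pieces = [3, 1, 4, 1, 5]

# Independent pieces: h touches one piece at a time, so the calls could run side by side.
def h(x):
    return x * x

mapped = list(map(h, pieces))
combined = reduce(lambda a, b: a + b, mapped)   # combine(h(x0), h(x1), ...)

# Dependent pieces: each step needs the previous state, so the order is forced.
def step(state, x):
    return state * 10 + x      # hypothetical state update

states = list(accumulate(pieces, step, initial=0))

print(mapped, combined)   # [9, 1, 16, 1, 25] 52
print(states)             # [0, 3, 31, 314, 3141, 31415]
```

The first half could be spread across five workers. The second half cannot start step three until step two has finished, no matter how much hardware is available.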

That difference is not philosophical. It decides throughput, latency, memory pressure, and hardware shape.

Funny how fast "just compute f(x)" turns into a whole architecture.

The Router Function

Sometimes x does not go directly to f.

Before anything can compute on data, the system has to answer a basic question:

where should this piece go?

That question is so basic that it shows up everywhere. A CPU has to send values to the right register, execution unit, cache, or memory address. A network has to send packets to the right destination. A data pipeline has to send rows to the right partition or operator. An FPGA has to physically route signals through wires and logic.

So there may be another function:

$$r(x, \text{destination})$$

Call it the router.

The router decides where data goes. It may inspect metadata, type, shape, header, schema, address, destination, timing, partition key, opcode, or anything else needed to send the data to the right place.
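A sketch of the partition-key flavor of this, assuming records carry a `key` field and there are three destinations. The `route` function and its byte-sum hash are hypothetical and deliberately crude. I avoid Python's built-in `hash` here because it is randomized per process for strings.

```python
from collections import defaultdict

def route(record, n_partitions):
    """Pick a destination from the record's key (a hypothetical partition-key router)."""
    return sum(record["key"].encode()) % n_partitions

records = [
    {"key": "darks", "item": "jeans"},
    {"key": "lights", "item": "shirt"},
    {"key": "darks", "item": "socks"},
]

partitions = defaultdict(list)
for rec in records:
    partitions[route(rec, 3)].append(rec["item"])

print(dict(partitions))   # {2: ['jeans', 'socks'], 0: ['shirt']}
```

The router never looks at the item itself, only at the metadata. Records with the same key always land in the same place.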

In a CPU, routing appears through instruction decode, register selection, memory addressing, dispatch, load/store units, cache hierarchy, and branch behavior.

In a network, routing is explicit.

In an FPGA, routing is literally physical.

In a data pipeline, routing may be schema-driven or partition-key-driven.

In a compiler, routing is translation: deciding how abstract operations map to concrete machine operations.

The router is the thing that says:

this piece of data belongs over there.

Again, not plumbing.

If routing is wrong, the computation is wrong.

If routing is slow, the computation is slow.

If routing leaks information, the computation becomes attackable.

Security Lives in the Path

This is where the security angle comes in.

In math, f(x) looks atomic.

Input enters. Output exits.

But in physical computing, f(x) is usually decomposed into intermediate operations, intermediate values, intermediate locations, and intermediate movements.

That decomposition creates attack surfaces.

This is the systems-security move:

The abstract function may be correct, while the physical execution graph leaks, stalls, corrupts, faults, reorders, or exposes structure.

Side channels live in the gap between:

mathematical function

and:

physical execution graph

Timing reveals structure. Power reveals structure. Cache behavior reveals structure. Branches reveal structure. Memory access reveals structure.

The decomposition is observable.

And if it is observable, someone may exploit it.
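A classic small instance is the early-exit comparison. In this sketch I count examined bytes as a stand-in for elapsed time, because a real timing measurement would be noisy. The function `leaky_equal` and the secret value are made up. `hmac.compare_digest` is the standard-library comparison meant to reduce this kind of leak.

```python
import hmac

def leaky_equal(secret: bytes, guess: bytes):
    """Early-exit comparison; also returns how many bytes were examined."""
    examined = 0
    if len(secret) != len(guess):
        return False, examined
    for s, g in zip(secret, guess):
        examined += 1
        if s != g:
            return False, examined    # work done depends on the secret
    return True, examined

secret = b"hunter2"
print(leaky_equal(secret, b"xxxxxxx"))   # (False, 1): wrong at the first byte
print(leaky_equal(secret, b"huntxxx"))   # (False, 5): wrong at the fifth byte

# The standard library offers a comparison designed to reduce this leak.
print(hmac.compare_digest(secret, b"huntxxx"))   # False
```

Both guesses get the same answer from the abstract function: not equal. The execution path still tells an observer how many leading bytes were right.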

Physical Functions Are Noisy

Now comes the annoying part.

In pure math, a function maps one input to one output:

$$f : X \to Y$$

But implemented physical systems live under noise, incompleteness, and observer-relative measurement. Engineers often force the system to behave deterministically inside a specification, but that determinism is achieved, not free.

The input may be noisy. The output may be noisy. The function may drift. The measurement may disturb the system. The transfer path may corrupt the value. The clock may disagree with another clock. The signal may not settle in time.

So in physical systems, f(x) often behaves less like:

x -> y

and more like:

x -> distribution over possible y values

If we want to preserve the formal function shape, we can say the output is a random variable:

$$f(x) = Z_x$$

where Z_x is not one fixed scalar value but a random variable representing possible outcomes for input x.

Equivalently, the function may be distribution-valued:

$$f : X \to \Delta(Y)$$

where Δ(Y) means a probability distribution over possible outputs in Y.

Or we can write the engineering version:

$$y = f(x) + \varepsilon$$

where ε is everything the clean model did not want to talk about.

Noise. Timing variation. Thermal effects. Metastability. Measurement error. Bit flips. Packet loss. Sensor error. Clock drift. Interference. Quantization. Approximation.

Engineering is the art of making ε small enough, bounded enough, corrected enough, or irrelevant enough for the job.

ECC. Checksums. Retries. Filtering. Synchronizers. Redundancy. Calibration. Debouncing. Hysteresis. Guard bands. Consensus.

All of that is civilization fighting ε.
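One of the simplest weapons in that fight is redundancy with a majority vote. This sketch assumes three copies of a word and a single bit upset in one of them. The specific values and the name `majority` are illustrative.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise majority vote over three copies of the same word."""
    return (a & b) | (a & c) | (b & c)

y = 0b10110010                 # the value the clean model expects
epsilon = 0b00001000           # hypothetical single-bit upset in one copy
corrupted = y ^ epsilon

recovered = majority(y, corrupted, y)
print(recovered == y)          # True: one bad copy is outvoted

# The protection is bounded: the same upset in two copies wins the vote.
print(majority(corrupted, corrupted, y) == y)   # False
```

This does not remove ε. It makes ε irrelevant up to a limit, and it pays for that with triple the storage and a voting operator.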

The Real Model

The clean equation is:

$$y = f(x)$$

The physical systems version is closer to:

$$x' = T(x, f)$$

$$y = \operatorname{reduce}(\operatorname{map}(f_i, \operatorname{chunks}(x'))) + \varepsilon$$

Ugly? Yes.

More honest? Also yes.

And no, I am not saying every system is literally map/reduce. I am exposing the hidden structure: transport, chunking, boundaries, smaller operations, intermediate state, reduction, and noise.

Because now the hidden structure is visible.

That is physical computing.

Not magic.

Not vibes.

Relations mapped onto matter.

Scope Is Part of the Function

Going back to the laundry example, you might have noticed that I did not include the work required to dry, fold, and return the clothes to their final destination: closet, drawers, wherever clean laundry is supposed to go. Exactly. I can't fold to save my life anyway.

When I discussed f(x), or better yet do_laundry(x), I scoped the function around sorting the dirty load, moving it to the washer, and loading the washer. I was measuring latency with respect to that narrower scope.

If I expand the scope of do_laundry(x), the pipeline changes:

sort(x) -> load_washer(x) -> unload_washer(x) -> load_dryer(x) -> unload_dryer(x) -> return_to_apartment(x) (...hoping elevator is in service!) -> fold(x) -> transfer_to_closet_and_drawers(x)

Now total latency means something different. The bottleneck may no longer be the washer. It may be drying, folding, elevator contention, machine availability, or the return trip. Change the function boundary, and you change what you measure. Now you can see how quickly a "simple" function becomes a pipeline with multiple bottlenecks.
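A sketch of that shift, with per-stage minutes I invented. None of these numbers are measurements, and the `carry_to_washer`, `wash`, and `dry` stages are names I added so the machine time shows up in the table.

```python
# Hypothetical per-stage latencies in minutes; none of these are measurements.
stages = {
    "sort": 5, "carry_to_washer": 10, "load_washer": 2, "wash": 40,
    "unload_washer": 3, "load_dryer": 2, "dry": 60, "unload_dryer": 3,
    "return_to_apartment": 10, "fold": 20, "transfer_to_closet_and_drawers": 5,
}

def measure(scope):
    """Total latency and slowest stage for the stages inside `scope`."""
    total = sum(stages[name] for name in scope)
    bottleneck = max(scope, key=lambda name: stages[name])
    return total, bottleneck

narrow = ["sort", "carry_to_washer", "load_washer"]
wide = list(stages)

print(measure(narrow))   # (17, 'carry_to_washer')
print(measure(wide))     # (160, 'dry')
```

Same laundry, same numbers. Moving the boundary changes both the total and the stage worth optimizing.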

You can also see why acceleration is a huge deal when we translate this back to the hardware world.

And you can perhaps also see why, in addition to acceleration — meaning Performance — the ideas of Reliability and Security become a huge deal too. This pipeline now has several points of failure, accidental or intentional: washer occupied, dryer broken, elevator down (I can't count the number of times I've dealt with this...ok, 4), load mixed up, timing delayed, someone else interfering with the process; you name it.

That is why I included those ideas in SPPARRS (which I cover in another post). Performance is not the only thing we spar with. The longer and more physical the pipeline becomes, the more ways reality has to mess with it.

Now you should be able to see this: if do_laundry(x) took zero time, zero space, zero energy, and exposed no intermediate state, then there would be no physical execution path to attack.

But once the function becomes a real pipeline moving through space and time, we have to spar with physical constraints.

And yes, there is a weird philosophical edge here: if there were zero space and zero time, there would be no boundary between input and output, problem and solution, operand and operator. But that is exactly why the physical version matters. Boundaries appear because the function has to exist somewhere and happen somehow.

Now you might think this is a facetious laundry example. Fine. But consider this: companies are literally trying to build humanoid personal assistant robots that can do this kind of thing — make coffee, move objects, fold laundry, clean up, whatever. Maybe I will buy one...or build one.

And notice the problem: fold(x) is not like adding two integers. Shirts wrinkle. Towels flop. Socks disappear into another dimension. The same "input" does not always present itself the same way twice. So even this stupid little laundry pipeline quickly becomes a physical AI problem: observe the state, try an action, measure what changed, adjust, and try again.

One More Compression: Jargon Is Compressed Translation

One more compression before the final compression: computing is not magic. Strip away the jargon first.

Take an action, any action: add two integers, multiply two matrices, do laundry, fix Clarence's elevator, whatever. Decompose the action into sub-actions and draw arrows between them to show dependencies. Now you have a graph of steps and data. That graph is already starting to look like a high-level program.
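As a rough sketch of "decompose and draw arrows", take the action (a + b) * (c + d). The node names below (`sum_ab`, `sum_cd`, `product`) are hypothetical labels chosen for this example. The standard-library `graphlib.TopologicalSorter` (Python 3.9+) turns the arrows into an order of execution and also reveals which steps could run at the same time.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps it depends on.
# Inputs a, b, c, d have no dependencies and are added automatically.
graph = {
    "sum_ab": {"a", "b"},
    "sum_cd": {"c", "d"},
    "product": {"sum_ab", "sum_cd"},
}

sorter = TopologicalSorter(graph)
sorter.prepare()

# Group steps into levels: everything in one level has all of its
# dependencies satisfied, so those steps are independent of each other.
levels = []
while sorter.is_active():
    ready = sorted(sorter.get_ready())  # sorted only to make output stable
    levels.append(ready)
    sorter.done(*ready)

for depth, steps in enumerate(levels):
    print(depth, steps)
```

The levels come out as the four inputs, then the two sums, then the product. The two sums sit on the same level, which is the graph's way of saying they do not have to wait for each other.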

Then ask the physical question: onto what hardware do I want to map this program?

If the hardware speaks in bits, then the graph has to be translated into bits. Operators need representation. Operands need representation. Boundaries need representation. Even the thing that tells the machine where one field ends and the next begins has to be represented somehow.

a + b becomes something more like:

[operator | operand | operand]

which eventually becomes encoded as:

[opcode bits | operand bits | operand bits | boundary rules]

Huh?! "opcode bits"? Yeah, I know. I would've preferred to call it "operator bits" in line with algebra, but industry has its own jargon for it.
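To see the encoding step in miniature, here is a sketch of a made-up 16-bit instruction word. Everything specific is an assumption for illustration: the 4-bit opcode field, the two 6-bit operand fields, and the opcode values do not belong to any real instruction set. The "boundary rules" here are simply the fixed field widths.

```python
# Hypothetical instruction format: [4-bit opcode | 6-bit operand | 6-bit operand].
OPCODE_BITS = 4
OPERAND_BITS = 6
OPERAND_MASK = (1 << OPERAND_BITS) - 1
OPCODES = {"add": 0b0001, "mul": 0b0010}  # made-up opcode values


def encode(op, a, b):
    """Pack an operator and two operands into one 16-bit word."""
    if not (0 <= a <= OPERAND_MASK and 0 <= b <= OPERAND_MASK):
        raise ValueError("operand does not fit in its field")
    return (OPCODES[op] << (2 * OPERAND_BITS)) | (a << OPERAND_BITS) | b


def decode(word):
    """Recover (operator, operand, operand) using the fixed field widths."""
    b = word & OPERAND_MASK
    a = (word >> OPERAND_BITS) & OPERAND_MASK
    code = word >> (2 * OPERAND_BITS)
    op = next(name for name, value in OPCODES.items() if value == code)
    return op, a, b


word = encode("add", 5, 9)
print(f"{word:016b}")  # 0001000101001001
print(decode(word))    # ('add', 5, 9)
```

Notice that nothing inside the word marks where one field ends and the next begins. The decoder only works because both sides agree on the widths ahead of time, and an operand larger than 63 simply does not fit.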

If the operator and operands fit what the hardware can do directly, great. If not, you chunk, stage, route, buffer, carry partial state, and keep going.
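A small sketch of "chunk and carry partial state": suppose the only adder available handles 8-bit operands, and the numbers to add are 32 bits wide. The `add8` function below is a stand-in for that assumed hardware adder, not a model of any particular chip.

```python
CHUNK_BITS = 8
CHUNK_MASK = (1 << CHUNK_BITS) - 1


def add8(a, b, carry_in):
    """Stand-in for a hardware adder limited to 8-bit operands.

    Returns (8-bit sum, carry out).
    """
    total = a + b + carry_in
    return total & CHUNK_MASK, total >> CHUNK_BITS


def split(value, n_chunks):
    """Break a wide value into 8-bit chunks, least significant first."""
    return [(value >> (CHUNK_BITS * i)) & CHUNK_MASK for i in range(n_chunks)]


def wide_add(x, y, n_chunks=4):
    """Add two wide values using only add8, carrying state between chunks."""
    carry = 0
    result = 0
    for i, (a, b) in enumerate(zip(split(x, n_chunks), split(y, n_chunks))):
        partial, carry = add8(a, b, carry)
        result |= partial << (CHUNK_BITS * i)
    return result, carry  # final carry signals overflow past the top chunk


print(wide_add(0x00FFFFFF, 1))  # (16777216, 0), i.e. 0x01000000
print(wide_add(0xFFFFFFFF, 1))  # (0, 1): wrapped, with the carry left over
```

The carry is the partial state. It is the one piece of information that has to survive between chunks, which means it has to live somewhere while the next chunk is being fetched.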

If you want throughput, you widen the mapping. One operation shape, many data lanes. That is the intuition behind SIMD: Single Instruction, Multiple Data. Nothing mystical. Just one instruction pattern mapped across multiple physical paths.
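One way to feel this without special hardware is a software trick sometimes called SWAR (SIMD within a register): pack four 8-bit lanes into one 32-bit word and add all four lanes with a single pass of ordinary integer operations. This is only an imitation of what real SIMD units do in dedicated hardware, and Python integers are not true 32-bit registers, so treat it as a sketch of the idea.

```python
LANE_BITS = 8
LANES = 4
HIGH_BITS = 0x80808080  # top bit of every lane
LOW_BITS = 0x7F7F7F7F   # lower seven bits of every lane


def pack(values):
    """Place four 8-bit values side by side in one 32-bit word."""
    word = 0
    for i, v in enumerate(values):
        word |= (v & 0xFF) << (LANE_BITS * i)
    return word


def unpack(word):
    """Read the four 8-bit lanes back out, lane 0 first."""
    return [(word >> (LANE_BITS * i)) & 0xFF for i in range(LANES)]


def lane_add(x, y):
    """Add all four lanes at once; each lane wraps modulo 256 on its own.

    The low seven bits are added normally (their carry stays inside the lane),
    and the top bit of each lane is fixed up with XOR so that no carry
    spills into the neighboring lane.
    """
    low = (x & LOW_BITS) + (y & LOW_BITS)
    return low ^ ((x ^ y) & HIGH_BITS)


a = pack([1, 2, 3, 250])
b = pack([10, 20, 30, 10])
print(unpack(lane_add(a, b)))  # [11, 22, 33, 4]
```

The last lane computes 250 + 10 and wraps to 4 without disturbing its neighbors. One operation shape, four data lanes, and the lane boundaries had to be engineered rather than assumed.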

Then you can reintroduce the compressed jargon: opcode, operand field, pipeline, SIMD, dataflow graph, translation to machine steps, hardware mapping. The jargon is not the starting point. The structure is.

Final Compression

Pure math says:

$$y = f(x)$$

Physical computing asks:

Where is x? Where is f? How does x get to f, or f to x, or both to the same place?

What algebra does f belong to? Does the hardware support the operators of that algebra? Do the operators need to be decomposed? Does the data need to be chunked? Can hardware acceleration move the physical operator closer to the abstract operator?

What boundary tells us x arrived? Does x arrive whole or in pieces? Where do intermediate states live?

What does transport cost? What does memory cost? What does noise corrupt? What does the observer actually see?

So maybe the cleanest way to say it is this: computing is the process of mapping mathematical models onto physical machinery.

Math gives us relations, functions, algebras, operators, and structures for modeling natural or man-made phenomena. Hardware gives us physical operators, memory, wires, clocks, routing, energy limits, area limits, reliability limits, and noise.

The art of computing is translating one into the other without losing the thing we cared about.

That is the real computation.

The function is not free.

The operator is not magically implemented.

The data is not magically present.

The boundary is not optional.

The memory is not incidental.

The transport is not plumbing.

Once computation becomes physical, geometry is not separate from logic.

Geometry is logic.

That is the hidden term inside every computation.


© 2026 Clarence Bowen. Derived, not assumed.