Chapter Content
Chapter four is about the essence of a hard problem. Before we dive in, it's worth pausing to think about what *we* mean when we say "hard" when we're talking about problems.
The four most important terms when it comes to problem solving are problem, process, answer, and solution. The problem is the task we're working on, the thing we're trying to resolve. The process can mean one of two things: the *external* process by which we find a solution, or the *internal* process the solution carries out to produce an output. The answer is the final state that confirms the problem has been solved. And the solution is the physical object, the *thing*, that performs the internal process that fixes the problem.
Let's take a bridge as an example. The problem is getting people and cars safely across, say, a river. The external process is all the trial and error used to arrive at the bridge's design. The internal process is how the bridge's pieces actually work together to make it stand and function correctly. The answer is simple: cars and people reaching the other side. And the solution? The bridge itself, the finished configuration of everything.
How about a deep learning model? The problem might be facial recognition. The external process is all the training done to produce a good model. The internal process is the actual computing the model performs when it recognizes a face. The answer is the right name popping up. And the solution is the deployed model, with all its parameters, its weights and biases.
One more: a Rubik's Cube. The problem is getting every face to be a single color. The external process is trying different moves to find a sequence of steps that works. The internal process is *doing* those steps, following the algorithm. The answer is the solved cube, of course. And the solution is the set of steps you figured out, perhaps written down somewhere for other people to use.
So, in short. Problem: the thing we need to fix. External process: how we *find* the solution, through trial and error or otherwise. Internal process: how the solution *works*. Answer: confirmation that it's fixed. Solution: the thing that gets deployed to *do* the process that solves the problem.
Now that we have these terms down, let’s talk about what a "hard" problem really *is*.
People call problems "hard" when they're simply not easy to solve: if there's no easy answer, it's hard. But in computing, "hard" has a more specific meaning. A problem is computationally hard if it is exceedingly difficult, *practically impossible*, to solve efficiently with a computer. Problems of this type demand enormous resources: a lot of time, a lot of memory, or both.
The defining trait of computationally hard problems is that their difficulty grows extremely fast as the size of the input grows. Exponentially fast. Think of folding paper: each fold doubles the thickness rather than adding a fixed amount. If you could fold a sheet of paper 42 times, the stack would reach the moon!
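The folding claim is easy to check with a few lines of arithmetic. A minimal sketch, assuming a sheet roughly 0.1 mm thick:

```python
# Exponential growth in action: a 0.1 mm sheet doubled 42 times.
thickness_m = 0.0001            # assumed starting thickness: 0.1 mm
for _ in range(42):
    thickness_m *= 2            # each fold doubles the stack

thickness_km = thickness_m / 1000
print(f"after 42 folds: ~{thickness_km:,.0f} km")
print("distance to the moon: ~384,400 km")
```

Forty-two doublings turn a tenth of a millimeter into roughly 440,000 km, past the moon. That is the growth rate hard problems exhibit.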
The computational equivalent of folding paper is the explosion of possible paths an algorithm takes to find a solution. The longer the bridge, the more pieces you have to account for. Facial recognition is hard because there are *so* many factors involved in making a face a face.
The key concept here is the possibility space: the set of all possible combinations, arrangements, or states relevant to the problem. For a chessboard, it's all the ways you can arrange the pieces. The bigger the possibility space, the more the algorithm has to explore, and that is what makes a problem really hard.
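Counting states makes the explosion concrete. A small illustration, with components and state counts chosen arbitrarily rather than drawn from any real problem: a system with n parts, each of which can be in one of k states, has k to the power n configurations:

```python
# Possibility-space size for n components with k states each.
def possibility_space_size(n_components: int, states_each: int) -> int:
    return states_each ** n_components

for n in (10, 20, 40):
    print(f"{n} components: {possibility_space_size(n, 3):,} states")
```

Doubling the number of components from 20 to 40 doesn't double the space; it squares it.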
Now, working with the *entire* possibility space isn't feasible, so we need to find a reduced set of features: a feature space.
Think about keeping your hands warm in the winter. Instead of grabbing random objects, you think about the important attributes: thickness, warmth, absorbency, and strength. Plot those and you have a four-axis space. The solution, the right gloves, lives somewhere in *that* space, because the correct mix of materials will have the right amounts of those features. The more features you have, the more dimensions the feature space has. "High-dimensional" describes a problem whose feature space has many dimensions, with the solution living somewhere inside it.
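To make the picture concrete, we can treat each candidate material as a point in that four-axis space and pick the point closest to an ideal. Everything here, the materials, the scores, and the target, is invented purely for illustration:

```python
# Hypothetical glove materials as points in a 4-D feature space:
# (thickness, warmth, absorbency, strength), each scored 0-10.
materials = {
    "wool":      (6, 8, 7, 5),
    "leather":   (5, 6, 2, 9),
    "cotton":    (3, 4, 9, 4),
    "synthetic": (4, 7, 3, 7),
}
target = (5, 8, 3, 7)  # the feature mix we want (made-up ideal)

def distance_sq(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

best = min(materials, key=lambda name: distance_sq(materials[name], target))
print(best)  # the material nearest the target in feature space
```

Geometrically, finding a solution means finding the point whose features line up with what the problem demands. With four features the search is trivial; every added dimension multiplies the territory.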
So, finding the solution inside that space is what makes a problem really hard. A feature space for a hard problem is like a huge territory, full of twists, turns, and obstacles, with no signs pointing to the solution.
There are three reasons why feature spaces are so challenging: first, the inherent complexity of the space itself; second, the lack of clear causal connections between features; and third, the small number of viable solutions compared to the total possibilities.
The inherent complexity comes from how the features interact. With the gloves, thicker material is warmer but offers less dexterity. Absorbent material wicks away moisture but gets cold when wet. These interactions don't make the space bigger, but they make finding the right path harder. It's easy to make a wrong turn, because you're always trading one thing for another.
As you add more features, it becomes ever more likely that those features will interact. This is why high-dimensional spaces are a hallmark of complex systems, and of the hard problems that go along with them. It's not just about having a lot of pieces; it's about how those pieces connect.
The second source of hardness is the loss of visible causality. With our keep-hands-warm problem, we know how thickness, warmth, absorbency, and strength all work together; we can figure it out by thinking it through. But now consider a harder problem: keeping our hands warm while *building* a snowman. Now we also need to keep them dry and retain enough dexterity to handle the snow. There may not be many more features, but there will definitely be additional tradeoffs. This starts to erode our ability to deduce a solution, because the causal fabric between features is no longer clean-cut.
The more features "talk" to each other, the more tradeoffs pop up, making it harder to find a solution. If you choose a material that's warm and flexible, it might get wet too fast. Add water resistance, and you lose flexibility.
The final source of hardness is tied to something called the solution space: the subset of the feature space containing all configurations that produce a viable solution to the problem.
A big solution space means there are many ways to solve the problem; a smaller one means there are fewer. Problems with large solution spaces relative to their possibility spaces take less time to solve, because there is less space to traverse before you run into a solution. Keeping our hands warm has a large solution space, because many materials will work. Our ancestors probably stumbled upon mittens pretty early on.
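The relationship between solution-space size and search time can be put in numbers: if a fraction p of the possibility space is viable, blind random sampling needs about 1/p attempts on average before landing on a solution. The sizes below are made up purely to show the contrast:

```python
# Average attempts for blind random search = space size / solution count.
def expected_tries(space_size: int, solution_count: int) -> float:
    return space_size / solution_count

# Dense solution space (warm hands: lots of materials work).
easy = expected_tries(1_000_000, 50_000)
# Sparse solution space (a vast space with a handful of viable points).
hard = expected_tries(10**18, 100)

print(f"dense: ~{easy:.0f} tries; sparse: ~{hard:.0e} tries")
```

Same search strategy, wildly different costs: the sparser the solutions, the harder the problem.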
Compare this to a Rubik's Cube. A point inside its feature space is a unique configuration of the "cubies," each cubie having values for position and orientation. The solution space of the Rubik's Cube is the set of all possible algorithms, sequences of moves, that could produce the same final answer.
It takes a long time to find a solution to the Rubik's Cube because of the complexity of its feature space. The Rubik's Cube is not a categorically hard problem, but it begins to show signs of hardness while still being solvable in a reasonable amount of time. Chess has an even larger and more complex possibility space, getting us closer to a truly hard problem. Neither the Rubik's Cube nor chess is a naturally hard problem, but both are good ways to see the transition from simpler problems to those that approach genuine hardness.
To sum it up, there are three main things about these abstract spaces that lead to hardness: the complexity of the space itself (interdependent features), the lack of causality between features, and the small size of the solution space relative to the overall space.
When we consider how much space there is to traverse, it seems impossible that we would ever find answers to hard problems. The Rubik's Cube has over 43 quintillion possible configurations; chess has even more. The number of possible configurations in natural settings, ones unrestricted by human invention, reaches levels of complexity that can barely be fathomed.
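The 43 quintillion figure comes from a standard counting argument: the cube's 8 corner pieces can be arranged in 8! ways with 3^7 independent orientations, its 12 edge pieces in 12! ways with 2^11 orientations, and a parity constraint rules out half of those combinations:

```python
from math import factorial

# Reachable configurations of a standard 3x3x3 Rubik's Cube.
corners = factorial(8) * 3**7     # corner positions x orientations
edges   = factorial(12) * 2**11   # edge positions x orientations
total   = corners * edges // 2    # permutation parity halves the count

print(f"{total:,} configurations")  # 43,252,003,274,489,856,000
```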
And yet humans find solutions to naturally hard problems all the time. How can such massive spaces of possibility be reined in and made tractable?
It's important to understand that with something like a Rubik's Cube, you can't just try every possible configuration until you find the right one, even with our fastest computers. Enumerating the states alone would take over a millennium at a billion checks per second, and searching over move sequences is worse by many orders of magnitude.
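A back-of-the-envelope calculation shows the scale of the problem. Assume, very optimistically, a machine that can check a billion configurations every second:

```python
CUBE_STATES = 43_252_003_274_489_856_000  # reachable cube configurations
CHECKS_PER_SECOND = 10**9                 # assumed: a billion checks/second

seconds = CUBE_STATES / CHECKS_PER_SECOND
years = seconds / (60 * 60 * 24 * 365.25)
print(f"~{years:,.0f} years just to visit every state once")
```

And that only enumerates states. An actual search over move sequences, where many different paths lead to the same state, blows up far faster still.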
That's why computers that solve Rubik's Cubes use heuristics and pattern recognition to search the possibility space more effectively. Heuristics are general rules of thumb, shortcuts that allow people, and machines, to solve difficult problems without explicit analysis. Chess grandmasters play at their top level through the rapid recognition of familiar patterns on the board, patterns that guide them toward good moves.
The Rubik's Cube isn't a truly hard problem, but it shows us that as difficulty increases, the softer our problem-solving approach must get. Here, "softer" means relying on things like heuristics and pattern recognition rather than a detailed analysis of the parts that make up a problem. The more difficult the problem, the larger and more complex its possibility space, and that space must be searched. The harder the problem, the less analytical we can be about how we search it. With the keep-hands-warm problem, our search was done easily with a little trial and error paired with deduction.
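Here is what a "soft" approach can look like in code. The puzzle below, scrambled tiles that must be returned to numerical order, and its heuristic are invented for illustration: instead of enumerating all n! arrangements, the search scores each state by how many tiles are misplaced and greedily repairs the first defect it sees:

```python
from math import factorial

def misplaced(state):
    """Heuristic score: how many tiles sit in the wrong position."""
    return sum(1 for i, tile in enumerate(state) if tile != i)

def heuristic_solve(state):
    """Greedy repair: each step moves one tile to its home position."""
    state, steps = list(state), 0
    while misplaced(state) > 0:
        i = next(i for i, tile in enumerate(state) if tile != i)
        j = state.index(i)                       # where tile i sits now
        state[i], state[j] = state[j], state[i]  # repair position i
        steps += 1
    return state, steps

solved, steps = heuristic_solve([3, 0, 7, 1, 6, 5, 2, 4])
print(f"solved in {steps} steps vs {factorial(8):,} total arrangements")
```

Each swap is guaranteed to put at least one tile home, so the search finishes in at most n steps, while blind enumeration faces 40,320 arrangements. Scale the same idea up, pattern databases for cube solvers, positional patterns for chess engines, and you have the soft search described above.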
The brute force approach, whereby all possible configurations are attempted until a solution is found, is not feasible.
It stands to reason that naturally hard problems, those we find in real life, can only be solved by the softest approaches. Humans evolved to solve the hardest problems using heuristics and pattern recognition, not detailed analysis. The emotional cues and intuitions humans use are our evolved versions of heuristics. Nature shows us what it takes to solve the hardest problems of all. We must step outside the system, external to the realm of deliberate analysis, and explore vast possibility spaces using trial-and-error, heuristics and pattern recognition.
Analytical approaches and rules-based computing cannot solve truly hard problems. Deep learning is not about analysis or rules; rather, it embraces computational versions of trial and error, heuristics, and pattern recognition. Deep learning can be applied to the Rubik's Cube problem for the sake of potentially finding novel solutions. Deep learning searches the possibility space much like a human would: softly, externally.
But while the search process used by deep learning is conceptually similar to the one used by humans, the same cannot be said for the solutions found. Humans and traditional computing produce solutions that can be presented as an understandable set of instructions for others to follow (e.g., a cuber's algorithm). Deep learning produces something very different: rather than a visible set of steps, its solutions are causally opaque and largely uninterpretable. We do not know what deep learning is doing when its deployed solution solves the cube. The obvious question is why the same overall approach, used by two different systems, humans and AI, would produce such different solutions.