Chapter Content
Hey everyone, today let's talk about something interesting: the concept of "meta." Chapter 7 is called "Going Meta," which sounds mystical, but it really isn't that complicated.
Simply put, "meta" means something that talks about itself. Take metadata, for example: it is "data about data." It describes a dataset's characteristics, such as its content, format, and structure, which is what lets us verify that a data source conforms to the expected overall structure. Or take meta-analysis: rather than looking at a single study, it combines the results of many studies in order to reach more reliable statistical conclusions.
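As a small illustration of "data about data," here is a minimal sketch in Python; the dataset name, fields, and the conforms helper are all hypothetical, invented only for this example:

```python
# A hypothetical metadata record describing a dataset, and a check that
# incoming rows match the structure the metadata declares.
dataset_metadata = {
    "name": "customer_orders",    # what the data is about
    "format": "csv",              # how it is stored
    "fields": {"order_id": int, "amount": float, "country": str},
}

def conforms(row: dict, metadata: dict) -> bool:
    """Return True if a row has exactly the fields and types the metadata declares."""
    fields = metadata["fields"]
    return set(row) == set(fields) and all(
        isinstance(row[name], typ) for name, typ in fields.items()
    )

print(conforms({"order_id": 7, "amount": 19.99, "country": "DE"}, dataset_metadata))  # True
print(conforms({"order_id": "7", "amount": 19.99}, dataset_metadata))                 # False
```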
At its core, "going meta" is about finding a deeper, more general truth than any single thing can offer. Every individual study carries its own biases and can only supply a partial truth. Going meta is an attempt to uncover the latent truth hidden deep within a system. That latent structure has no explicit label or category, yet it genuinely exists. Any concrete instance of a system is only one facet of the latent truth; only by combining many instances can we get close to its essence.
More importantly, going meta is the only way to create abstractions. Think about how people form abstract concepts. As discussed earlier, it happens by noticing similarities between things that look different on the surface. Once a deep similarity is spotted, we can create a higher-level category, whether informational or physical. But this cannot be done from inside a system. Looking only at the internals, you cannot place a system into a category. Only by stepping outside the system and observing it from without can you notice the similarities among its internal parts. So we must go meta to create abstraction. Without a self-referential process, there is no way to connect different things together.
And it is not only mental abstraction; physical abstraction also requires observation from a meta level to assemble internal pieces into a whole. If humans could not spot similarities from a higher vantage point, they could not design higher-level physical tools of abstraction. A car's gear shifter is an interface to a higher-level physical abstraction: it coordinates the various parts inside. If humans could not conceive of how different internal parts work together toward a single goal, they could never have created the gear shifter. That holistic structure cannot be seen from inside the system; it is visible only from the outside.
To be clear, "meta" is not just another level of abstraction, although many people get this wrong. An abstraction is simply a higher-level structure that subsumes its constituent parts; it does not talk about itself. The process that creates an abstraction takes place outside the system. The abstraction remains inside the system, while "meta" sits outside it. If we combine several parts into a whole and expose an interface, that new level of abstraction can only be formed by stepping outside the system itself.
Going meta to create physical abstractions is easy enough to picture when humans do it, but how do abstractions come about in nature? I have argued that progress through abstraction is absolute: any system that evolves to meet harder challenges must create higher levels of physical abstraction. I have also said that nature achieves this through emergence. Emergent structures are levels of physical abstraction; they compress information and solve nature's hard problems. But how does nature go meta to create physical abstractions?
Consider deep learning models. Deep learning works not because of explicit rules, but because something arises inside the model that successfully transforms inputs into outputs. These are black-box methods, so what happens inside cannot be understood in precise analytical terms. But we do know that distinct levels of abstraction emerge, and that they help carry out the transformation.
Take face recognition. Deep learning uses increasing levels of abstraction to progressively extract more complex and meaningful features from the input data. The system first extracts low-level features from the input image, such as edges, corners, and textures. It then combines these low-level features to pick out more complex patterns in the image, such as eyes, noses, and mouths. The level of abstraction keeps rising as the system combines the previous level into more holistic facial features, such as expression and pose. In the end, these abstractions capture the essence of a face.
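A minimal sketch of this layered extraction, written with PyTorch (my choice of framework; the text names none). The comments naming what each stage responds to are interpretive labels, not anything designed into the network:

```python
import torch
import torch.nn as nn

face_net = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # low level: edges, corners, texture
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # mid level: eye-, nose-, mouth-like parts
    nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),                                                  # high level: whole-face configuration
    nn.Flatten(),
    nn.Linear(64, 10),                                                        # e.g. scores for 10 identities
)

x = torch.randn(1, 3, 64, 64)   # one fake 64x64 RGB image
print(face_net(x).shape)        # torch.Size([1, 10])
```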
All of this is owed to deep learning's self-referential nature. The self-reference in AI comes from the way a deep learning system uses previous passes through the network to update its internal configuration. When the model makes a guess, there is some amount of error, and it uses that error information to improve the next guess. The network guesses again and again, updating its configuration according to how close its guesses come to the target. Designing a good model is not about understanding its inner workings; it is about establishing a good self-referential framework.
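Here is a minimal sketch of that loop, with a single toy weight and made-up numbers, just to show the error itself driving the updates:

```python
# The model's own error on each pass is the only information used to adjust
# its internal configuration. This fits a single weight w to the toy target y = 3x.
data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]
w = 0.0                      # internal configuration, never set deliberately
learning_rate = 0.05         # an external (hyperparameter) choice

for step in range(200):
    for x, y in data:
        guess = w * x                    # the model's guess
        error = guess - y                # how far off the guess was
        w -= learning_rate * error * x   # use the error to update the configuration

print(round(w, 3))  # converges close to 3.0 without anyone setting it directly
```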
This is why deep learning is a way of going meta. It is a system that uses feedback loops to observe itself and check whether things are going well. Going meta is entirely different from trying to set parameters explicitly, or designing a system with rules. Whatever rules exist in the final solution arise automatically, not through deliberate programming. This is what sets the self-referencing approach apart. It shows epistemic humility by stepping outside the internal details and relying on variation, iteration, and selection to converge on the right internal structure. It is the antithesis of design.
Because the internal parameter values of an AI model are found automatically, deliberate engineering happens only on the outside. Hyperparameters are the choices humans make to set up the training phase of deep learning as well as possible. This is another problem entirely, one that operates over the possibility space of all possible hyperparameter values. This higher level must operate under the constraint of external actions only. We cannot know how to set the best hyperparameter values deliberately. To find them, humans must go meta, trying many different hyperparameter values until good ones turn up. We have to try different learning rates, batch sizes, numbers of epochs, and network architectures until the right combination of values produces better outputs. When we go meta, we are learning how to learn.
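A minimal sketch of what that outside-in search can look like, assuming plain random search; train_and_score is a hypothetical stand-in for a full training-plus-validation run:

```python
import random

# The possibility space of external choices; values are illustrative only.
search_space = {
    "learning_rate": [1e-4, 3e-4, 1e-3, 3e-3],
    "batch_size": [16, 32, 64, 128],
    "num_epochs": [5, 10, 20],
    "architecture": ["small", "medium", "large"],
}

def train_and_score(config: dict) -> float:
    # Placeholder: in practice this would train a model with `config`
    # and return its validation score. Here it just returns a random number.
    return random.random()

best_config, best_score = None, float("-inf")
for trial in range(20):                                    # try many external settings
    config = {k: random.choice(v) for k, v in search_space.items()}
    score = train_and_score(config)
    if score > best_score:
        best_config, best_score = config, score

print(best_config, round(best_score, 3))
```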
The AI version of going meta uses methods such as hyperparameter optimization (HPO), meta learning, and neural architecture search (NAS). The differences between these methods do not matter for this discussion; what matters is that they are all ways of stepping outside the problem space and operating from the outside to find what works.
Just as a meta-analysis synthesizes different studies in an attempt to find something deeper than any single study can show, meta learning in AI tries to find something that no single model possesses: some latent structure that transforms inputs into outputs in a highly general and powerful way. Again, the emphasis is on learning how to learn, rather than learning any one specific thing.
For deep learning systems to perform well, they must manifest abstractions inside the model, and those abstractions cannot be achieved without a way of continually stepping outside the system. This is true not just of deep learning but of any complex solution we build. It is the act of going meta that gives rise to the abstractions which transform inputs into outputs in a flexible yet deterministic way. Without going meta, hard problems cannot be solved.
Next, let's turn to "The Structure of Problems: A Reflection of Nature's Patterns."
Problems have structure. That structure comes from the levels of abstraction that define a given problem. The ultimate problem in nature is survival. Life must work to maintain its own existence against accident and ordeal. But survival is only the topmost version of all of life's challenges. At the level of abstraction just below survival sit food, water, and shelter. At lower levels of abstraction still, we find the activities that attempt to solve those higher levels. To obtain food, animals must hunt, forage, scavenge, filter, graze, dig, ambush, and even create tools. To find water, animals migrate, recognize vegetation, exploit rainfall, and engage in all kinds of exploratory behavior. Shelters are found or created. Burrowing owls nest in abandoned burrows dug by other animals, such as ground squirrels. We could keep defining lower levels of abstraction that solve the problem of digging a shelter. If no abandoned squirrel burrow can be found, burrowing owls dig their own with their beaks and talons, which brings a new set of more concrete, lower-level challenges. The owl must choose a site where the soil is loose enough, such as sandy or loamy ground, and find a way to excavate a tunnel several feet deep so that the shelter is secure. And so on.
All of the challenges presented to life have a structure to them, a kind of hierarchy. This may sound related to Maslow's so-called hierarchy of needs, but here I am talking specifically about levels of abstraction and their relation to natural computation. We can always decompose a problem into nested levels of abstraction, with the lower levels contained within higher-level categories of challenge.
Since every solution in nature is a configuration of matter that solves the problem it faces, we can regard nature's solutions as reflections of problem structure. I do not mean this only intuitively; it holds for the concrete patterns that form. The patterns we see in ferns, starfish, and coral reefs take on a necessary structure that looks like the problem being solved. In all the patterns we observe in nature, such as fractals, spirals, waves, hexagonal packing, tessellations, and dendritic branching, we are seeing reflections of problem structure.
Let us be more literal. If we are looking at the nested fractal of a Romanesco broccoli, or the spiraling arrangement of seeds in a sunflower head, we are looking at how that organism's set of environmental challenges is arranged by abstraction. The environment does not merely demand that the sunflower survive; it demands that deeper challenges be solved so that the ultimate, high-level goal of survival can be achieved.
The objects we see in nature are the result of an overlap between configurations of matter and the structure of problems. Imagine drawing lines between the various levels of a Romanesco broccoli and the nested levels of abstraction in the problems that organism solves. The same is true of inorganic matter. Salts, metals, rocks, rivers, and mountains are all configurations whose structure is inherent in the problems they solve. In this sense, nature's solutions are not sharply distinct from their environments; rather, they are part of a holistic overlap between problem and solution.
As I argued previously, defining problems this way is not a reification of mental perception. This is not a fanciful theory projecting mental constructs onto the physical world. If reductionism worked, that might be the case. But under complexity it does not. I am not laying out a set of causal, deterministic steps telling the story of how matter moves from small pieces to large pieces to produce the patterns we see in nature. I am arguing that an inevitable kind of problem solving is being carried out, one fully consistent with what we know about information, computation, and evolution. It is entirely reasonable to view nature's patterns as the inevitable reflection of problem structure.
All of this is a testament to the connection between informational and physical abstraction. The structure of problems presents us with an informational version of abstraction, the kind leveraged by our minds to define our world and maneuver through life. The physical structures we see in nature present us with the physical version of those abstractions. For nature to produce its rocks, rivers, mountains, starfish, owls and beavers it must be manifesting physical levels of abstraction that overlap with the informational abstractions that constitute their environments.
Next, let's talk about how nature can reach far greater heights.
Nature is a far more powerful computing engine than anything humans can devise. While our AI systems have the hallmarks of self-referencing and abstraction, nature takes this to the extreme. Before accounting for nature's version of going meta, consider again our attempt to do this with AI technology. As already stated, we cannot set the internal parameters deliberately, so we operate from the outside by setting values related to training. We choose the learning rate, batch size, number of epochs, number of layers, neurons per layer and so on. These do not direct anything inside the model, they only help structure the process used to learn. We can think of these values as the first level of going meta, since they exist outside the guts of the system. Getting the best values should produce the best possible abstractions within.
But what should these values be? This is its own problem, a meta problem. Here, we are less interested in knowing an individual model's best parameters, and more in learning how to learn any model's best parameters. As mentioned previously, this meta problem is approached using techniques like hyperparameter optimization (HPO), meta learning and neural architecture search (NAS). While only one of these goes by the title "meta learning", they all attempt to solve the meta problem of finding the best external values possible (within a reasonable amount of time).
One approach is to combine many different models to learn and deploy a single better model called a meta-learner. The idea is that any single model will be too narrow to exhibit generalized intelligence, but many models might learn something more latent, providing a more powerful entity. But there are only so many models we can combine into a group before the challenge becomes intractable. Each model must be trained on data individually, which involves a series of experiments and validation. But the higher-level problem of meta learning must also be done with experiments and validation, to see which combination of models works best. This higher problem has its own set of parameter values that must be determined. But why stop there? We could combine different meta learning frameworks into new combinations, which would have their own set of external parameters, to again be experimented with and validated.
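As a concrete illustration of the first step, here is a minimal sketch, using scikit-learn (an assumption on my part; the text names no library), of combining base models under a higher-level learner that learns from their outputs rather than from the raw data directly:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic data standing in for a real task.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Several narrow base models...
base_models = [
    ("forest", RandomForestClassifier(random_state=0)),
    ("svm", SVC(probability=True, random_state=0)),
]
# ...combined under a higher-level model that learns from their predictions.
meta_learner = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(),
)
meta_learner.fit(X_train, y_train)
print(round(meta_learner.score(X_test, y_test), 3))
```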
The possibility space associated with combining things into bigger and bigger groups explodes into sizes that are beyond astronomical. This calls into question the feasibility of going meta to engineer complex things. Even if we accept the fundamental limitation of design, is not design our only chance of making building efforts feasible? Sure, we might want to embrace massive levels of trial-and-error just to see what works, but running so many experiments has computational demands that seem to make this approach ultimately impossible.
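A rough back-of-the-envelope calculation gives a feel for the scale; every number here is illustrative, not taken from the text:

```python
# Even modest numbers of candidate models and external settings explode
# into an infeasible count of training runs.
n_models = 50
groups = 2 ** n_models               # every possible subset of 50 candidate models
configs_per_group = 4 * 4 * 3 * 3    # e.g. learning rates x batch sizes x epochs x architectures
total_experiments = groups * configs_per_group
print(f"{total_experiments:.2e}")    # on the order of 10^17 training runs
```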
But design cannot cut through the extreme combinatorial explosion of possibilities. Design is based on a fundamentally invalid premise when it comes to complex things, as discussed throughout this book. Design cannot be the answer. But it seems at first glance that operating at the meta level cannot alleviate the computational demands of solving problems internally. Just because we step outside the system does not mean the lower-level problems need not be resolved. This raises a concern. How can going meta be a way to build things?
As always, nature holds the answer. Nature can reach extremely high levels of meta to fashion entire ecosystems. If we use the taxonomic classification system, we can say nature brings together organisms to form species, species to form genera, genera to form families, families to form orders, orders to form classes, classes to form phyla, phyla to form kingdoms and kingdoms to form domains. All of these are different levels of physical abstraction (groups of living organisms) working together to solve a given problem. Each of these is fashioned automatically, by nature's mechanism of self-referencing and abstraction creation.
Nature can go meta far more effectively than human engineering. The computational load that nature is wielding is astounding. Nature keeps stepping outside a given level and finding the parameters that work to create the next level.
Of course, nature's computational resources are virtually infinite compared to what humans have at their disposal. Nature has an extreme version of distributed parallel processing, thanks to billions of its processors working simultaneously within a group. While each piece of a natural system (e.g. a cell, a neuron) only performs simple computations, their collective behavior resolves fantastically complex functions. Nature also operates on far grander timescales than human innovation. Evolution takes millions of years to fashion its solutions through natural selection. Nature can explore an utterly enormous possibility space, using massive variation, iteration and selection, all in a highly parallel manner.
There is also a big difference in terms of energy efficiency, with biological systems being incredibly energy efficient relative to anything humans create. The human brain only consumes about 20 watts of power, while modern supercomputers and deep learning systems have energy consumption several orders of magnitude greater. Consider how effective the human brain is at vision, motor control and reasoning compared to AI, despite only consuming 20 watts. Biological efficiency is in its own category.
Biology is also deeply integrated with the physical world. All levels of abstraction operate in direct communication with their environments. The computations that convert inputs to outputs are occurring as physical processes within the system itself, not as proxy models of behavior. It is thus not so surprising that nature can go meta far more effectively than humans. Nature has at its disposal the parallelism, time, energy efficiency and deep physical integration to step outside multiple levels of a given system and create its nested emergent abstractions.
So how can humans operate as nature does, without having anywhere near the computational resources she has? How can we fashion truly complex objects that solve as nature solves, if we cannot meet natural computational demands?
One obvious answer is to hack nature where she stands. For example, synthetic biology looks to engineer the genetic material of organisms, such as viruses and bacteria, to have desirable characteristics. This has been done in areas like bioremediation, bioproduction of pharmaceuticals, biofuels, and even changing bacteria to perform simple logic operations for computing, or act as biological actuators inside tiny machines. Synthetic biology bypasses the need to engineer emergence because we are leveraging already-evolved features.
The core problem here is one of misapplication. While there will be some success in hacking nature's existing solutions, the reality is nature evolves the way it does for a reason. That reason cannot be defined in simple reductionist terms. All we can say is that nature's solutions are what they are because that is what survived. A natural object's makeup is a fantastically complex object that solves its host of problems in ways we will never know. Attempting to hijack some piece of a natural solution might bring some narrow benefit, but it will come with all the problems of design under complexity; it is guaranteed to produce unforeseen side-effects that will likely reduce a solution's efficacy in the long run.
As argued in this book, we must engineer emergence ourselves, not merely repurpose nature for problems she was not meant to solve. Our creations must establish their own complexities, and arrive at their own emergent structures and behaviors, based on the environments we place them in. There are deep internal dependencies that cannot be seen, that enable natural solutions to work effectively. These internals must materialize as physical abstractions built via automatic self-referencing.
To engineer emergence ourselves we must look to maximize the parallelism, energy efficiency and deep physical integration seen in nature. Under design, this is not possible. Design forces us to make specific detailed decisions that become highly constrained and unnatural versions of the parallelism, energy efficiency and deep physical integration needed.
Consider how the AI researcher or engineer wants to deliberately fashion systems by making explicit decisions around how these systems function, perform, and interact with their environments. They want to identify and collect relevant data sources for training and evaluation. They seek to select and design algorithms based on specific computational requirements. They aim to design the neural network architectures, making decisions about layers, activation functions, connections and optimization techniques, all based on design principles, best practices and mathematical theory.
But the truth is, the biggest strides in deep learning progress have not come from specific design choices; they have come from throwing more data and computing power at the problem. In fact, the specific architecture is far less important than the current paradigm would suggest. It is not that specific structures are unimportant, it is that they do not arise from deliberate reasoning and design as suggested in research articles. The structures that end up working are largely a byproduct of more data and more computing power.
This is exactly what one should expect when building complex things. AI research is effective for reasons most AI researchers do not seem to understand. This will sound like an odd statement to many, but we see this pattern frequently. It is easy for people to believe their design choices are responsible for progress, when in fact progress under complexity has much more to do with randomness and happenstance than design. This is in fact a rigorous statement, one that is fully in line with the undeniable properties of how complex systems evolve.
Design under complexity is correctly understood as something that interferes with the creation of good solutions. Design choices rob complex systems of their natural intricacies and opaque dependencies. They interfere with the kind of internal communication between pieces that must occur to materialize what works. Design also limits our ability to explore vast possibility spaces because of the inflexibility design forces into our solutions.
Of course, the design of AI systems is not like the design of traditional engineering. Again, AI is our best example of stepping outside systems and allowing things to converge. But our level of meta is far too close to the guts of our solutions. Whereas nature reaches all the way up to the domain under taxonomic classification, we are still operating just outside the individual organism. If all we wanted was a basic model that predicts narrowly defined things this might be fine. But in the quest to achieve something akin to general intelligence this cannot work. The more sophisticated and powerful our intended solution, the more genuine complexity we must create.
To achieve the kind of parallelism, energy efficiency and deep physical integration needed demands we operate at levels far more meta than the solution we seek. If we want to build a brain, we will not achieve it by trying to architect a brain. While it seems convenient to define human intelligence as something that occurs inside an individual's head, we are a deeply connected social species. There is no intelligence without levels of aggregation that far surpass a single biological neural network. Meta must reach far higher than the organism to make the best organism. Design keeps us too close to the organism.
When we remain too close to the thing we are building, as per design, trial-and-error becomes intractable. This is because the things we attempt to mix and match during experimentation are too defined. But when we shift our focus to the surface, using only the highest-level, most general problem statement, the internals must figure themselves out. This is how complexity works.
The more difficult the problem, the larger and more complex its possibility space, and that space must be searched. In chapter 4 we saw that as the difficulty of a problem increases, the softer our problem solving must get, and the less analytical we can be about how we go about searching.
We can think of heuristics and pattern recognition as tools that allow us to find things inside massive possibility spaces without having to do much searching. But this only works if we apply them to the highest-level signals a complex situation emits. Design, however, forces us to define searching in overly explicit terms. It does not leverage what nature already gives us. Design makes us attempt trial-and-error on low-level constructs that do not in fact have the meaning we assign them. It is not for us to know how things interact, only that interactions will take place as needed if more of the system is forced to survive stressors. Reaching higher in our meta efforts is getting nature to work itself out naturally.
So, how can we operate at the highest level of meta while not running into computational limitations? The answer is to 1) keep problem statements as general as possible (operate at the surface), and 2) create highly flexible internals rather than designing specifics.
We keep problem solving at the surface by only maintaining the highest-level target we can get away with. An example in AI would be defining our problem as building an entity that holds realistic and useful conversations. That is all. Today's paradigm tells us to go much deeper, by decomposing a problem into its purpose, target audience, scope of conversation, intent, context, handling of ambiguity, tone and personality. But this deconstruction of a problem is reductionist. We do not know such specifics, only that the entire solution must solve the topmost definition of a problem. Decomposing a problem into specifics is guaranteed to produce inferior engineering under complexity.
For the second point, we create highly flexible internals by not designing specifics. Consider the activation functions used in neural networks. Their purpose is to introduce non-linearity into the model. There are several options to choose from, each with its own characteristics, advantages and drawbacks. We can choose sigmoid, hyperbolic tangent, rectified linear unit (ReLU), leaky ReLU, parametric ReLU (PReLU), exponential linear unit (ELU), scaled exponential linear unit (SELU), softmax, swish, Gaussian etc. Trying to mix and match these possibilities with all the other variables that could be changed in a neural network is analogous to moving one cubie in a Rubik's cube only to scramble our previous moves. There are simply too many possible choices at this level to make trial-and-error feasible.
But now imagine that instead of trying to design the best activation function as an isolated thing, we create something less defined and more flexible. The closest approach we currently have is something called learnable activation functions. These functions include parameters that are adjusted during the training process, allowing the function to change its shape dynamically. Whether or not this approach works is not the point. The key is not to attempt to design the flexibility, but to put in place something that changes its makeup as part of a whole. The less defined the thing is in the reductionist sense, the better. Just as mitochondria lose their definition outside a cell, activation functions have a role that cannot adequately be defined outside a network. It is part of a holistic solution whose presence in the group is what matters. It is not for us to know what the activation function should look like. Only that its structure and function should emerge naturally on its own, while we remain concerned only with high-level targets.
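For concreteness, here is a minimal sketch of a learnable activation function in the spirit described above, written with PyTorch (an assumption on my part; the text prescribes no framework). Instead of fixing the non-linearity in advance, the slope for negative inputs is itself a parameter that the same training loop adjusts along with everything else:

```python
import torch
import torch.nn as nn

class LearnableActivation(nn.Module):
    """A PReLU-style non-linearity whose shape is found during training, not designed."""
    def __init__(self):
        super().__init__()
        self.negative_slope = nn.Parameter(torch.tensor(0.25))  # adjusted by backprop

    def forward(self, x):
        return torch.where(x >= 0, x, self.negative_slope * x)

# The activation only has meaning as part of the whole network it sits inside.
layer = nn.Sequential(nn.Linear(8, 8), LearnableActivation(), nn.Linear(8, 1))
out = layer(torch.randn(4, 8))
print(out.shape)  # torch.Size([4, 1]); the activation's final shape emerges with training
```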
To get the best activation functions, network architectures, data preprocessing pipelines, loss functions, optimization algorithms etc. is not to design them. It is to let them materialize by focusing only on the topmost level of the problem we are trying to solve. We must reach meta as high as possible by maintaining flexible internals.
It makes many scientists and engineers uncomfortable that progress would best be defined under such a lack of precision. But this is how complexity works. More data and computing power thrown at highly flexible, ill-defined constructs is how the best possible structures will emerge. One need look no further than nature's approach to building to realize this is true. This is what it will mean to build in the age of complexity, if we do it right.
The Thing We Are After (Meta Structures)
No two snowflakes look the same. Actually, this is only true in the detailed sense. If we zoom in on an individual snowflake we will see a truly unique pattern, marked by its distinct crystallization. As with someone's fingerprint, the detailed structure we see close up belongs only to that specific snowflake. But zoom out enough to see the entire flake and we see something familiar. All snowflakes can be called "snowflakes" because they share a common pattern. This shared pattern is not something we can define explicitly. If we asked seven different artists to draw the essence of a snowflake, their drawings would look similar, but not identical. We all understand what a snowflake looks like, yet that understanding is not given to us by some precise definition. It is a latent structure that we know intuitively.
The essence of something, like the essence of a snowflake, is what we are after when we learn. To know a snowflake is not to know one instance of it, but to observe many instances and create some unlabeled abstraction in our mind that defines it. To learn is to understand the deep hidden template that many instances of a phenomenon adhere to, but never manifest explicitly.
This is why there are no words to describe what the essence of something is. If one learns how to play piano, swing a golf club or develop theories in quantum mechanics they are tapping into some hidden shared pattern among countless experiences. It is not something that can be decomposed into pieces and taught directly to others. This is why true skill seems more like a feeling than anything describable. If you can explain it precisely, it is either not a real skill or you are not that good at it.
This is also why experience is the only way to achieve true skill. It is not that one must keep seeing the same thing; rather, they must keep experiencing many different instances of the same latent thing. Only then will the invariant truths be recognizable, as they are what persist when everything else changes. A piano player must play (or better, compose) many different songs for this activity to imprint upon them what it means to play piano. The golfer must take countless swings, each slightly different, to comprehend what makes a good swing.
True learning happens not by going deep into one version of something, but by exposing oneself to many different aspects of it. It is the unspoken, intuitive and latent structure we are after. It is what survives in the mind, yet has no name.
One can predict the general look of a snowflake, but one cannot predict what the next snowflake will look like. This is why details do not matter in nature, at least not in the sense we are usually taught. Details are only there to be in flux, in service to the next level of abstraction. As per multiple realizability, there will always be many ways to achieve the same thing. To view details is to view but one transient instantiation of a collection that serves the next level up.
This means that the only way to really describe or explain a snowflake is to talk about it in the meta sense. Only the general, shared properties of all snowflakes represent true knowledge of a snowflake's structure and behavior. Any detailed account will be steeped in reductionism and causal reasoning, having little to do with why a snowflake looks the way it does.
The real structure that defines the snowflake is its meta structure. A meta structure is the latent pattern that makes no direct appearance in nature. Meta structures are the hidden templates nature's solutions attempt to become, but never do. Anything we build in the age of complexity will have a meta structure behind it, since the emergent structures we see in complex objects gravitate towards latent patterns.
In part 3 we will see how knowledge of meta structures acts as true validation when building complex things (in contrast to things like design principles or supposed causes). Meta structures are agnostic to the internal details of complex objects. Meta structures do not intervene on the natural and automatic convergence of matter and information that makes complex things work the way they do. This is why meta structures represent a tractable form of validation under complexity.
All the Way Up, All the Way Down: Group Selection
I mentioned previously how AI engineering, despite being our best example of working externally to the details, is still steeped in design. AI engineers look to deliberately fashion systems by making explicit decisions around how they should function and perform. I then argued against such efforts for reasons discussed in this book. The best structures we can achieve are a byproduct of operating at the surface, with little regard for the internal specifics of what we build. This saves us from enacting an untenable form of trial-and-error, and instead integrates us deeply into the physical realities we hope to model.
I used the example of the activation function having a role that cannot adequately be defined in reductionist terms. It is part of a holistic solution whose presence in the group is all that matters. This means what we are truly after when we build solutions to hard problems is the existence of the best group possible. This brings us to the final aspect of demystifying emergence. How does nature always create the best group possible, and how might humans do the same?
Let us first look at how humans form groups under the current paradigm. These are not the groups that form naturally, like the emergence of societies or social movements; these are groups like sports teams, university students and employees, groups that are hand-picked for some performance-related purpose. These groups are created through explicit selection of individuals. We want skill diversity, so we hire for specific technical expertise and domain knowledge. We define clear roles and responsibilities, such as leadership or technical know-how. We seek certain personalities, work styles and collaborative skills. We want cultural fit among our players, students or employees, and a shared set of values. There are also experience levels, and sometimes geographical and time zone considerations.
All this selection seems to make sense, until we realize it smacks of design. We are assuming we know the roles different people in a group are supposed to play, admitting people into groups based on these criteria. The reader should realize by now this is not a good way to create groups. All we can know is that the best group is needed, not what specific roles will make for the best pieces. As one might imagine, nature does not make such explicit choices regarding roles (despite how reductionist science likes to talk about roles in nature). Nature can select the best group by doing just that, group selection.
Using our taxonomy example from before, we can say a species exists because individual organisms work together to solve the species problem, which is to delineate groups of organisms that are reproductively isolated from one another. Perhaps this helps with genetic compatibility and less error-prone reproduction. Perhaps it ensures different species occupy distinct ecological niches to reduce competition over resources. Perhaps it improves diversification, introducing a level of stability and resilience inside ecosystems. Perhaps it is all of these.
We can expect the next level, genus, to bring further ecosystem functioning and dynamics that are necessary for life on earth. Resource partitioning, pollination, seed dispersal and new levels of biodiversity, stability and habitat structuring all operate at this level.
As discussed previously, all problems have a structure to them, and nature manifests what is needed to solve that structure. The nested levels of abstraction that sit beneath the topmost level called survival can only be attended to by groups. A group is a collection of pieces that work together to solve a given level inside the structure of a problem.
This means that the selection in natural selection must be group selection. This has been a controversial issue for many years, as the idea of group selection runs counter to contemporary takes like selfish-gene theory. Selfish-gene theory was proposed by biologist Richard Dawkins in his 1976 book The Selfish Gene, which has become a cornerstone of evolutionary biology. Selfish-gene theory suggests that individual genes are the primary units of selection in evolution, and that they act in their own self-interest to ensure their survival and propagation. This is thus a gene-centered view of evolution, stating that the ultimate goal of genes is to replicate themselves, while organisms are just the means by which genes achieve this goal.
Group selection is a very different perspective. Whereas selfish-gene theory focuses on the gene as the primary unit of selection, group selection focuses on the evolution of traits that benefit a group, even if they are detrimental to an individual's fitness. In group selection, selection acts on groups of individuals, favoring traits that benefit the group as a whole.
The problem with selfish-gene theory is that it does not align with how nature computes. Suggesting that there is a single type of construct that is uniquely selected by nature does not adhere to what we know about information, computation and evolution. From the previous sections, it should now be clear that nature goes meta at all levels, allowing for the creation of the physical abstractions necessary to compress information and compute outputs to naturally hard problems. If evolution only selected at the level of genes, this entire meta process would not work.
I believe selfish-gene theory is little more than reductionism as it appears in biology. It is tempting to consider genes as having the primary role in evolution, but in reality a gene is but one piece of a larger group that gets selected on by nature.
At all levels of selection, in any natural process, it is the group that matters. Atoms take on their essential properties because the nucleus and electrons work together as a whole. A molecule's behavior in any chemical reaction is not determined by single electrons; it depends wholly on how all electrons work in combination. We can keep climbing the ladder of complexity and find an ever-larger dependence on group-level properties.
It is not about the individual; it cannot be. The problem solving enacted by nature occurs at all levels, and only the group can solve the next level up. We know that abstraction is, by definition, the subsuming of many different pieces into a single unit, and that single unit solves a new problem. That single unit compresses information so as to map countless inputs down to the few that are needed. The unit of selection in natural selection must be the group.
Consider the difference between a snowbank and a snowflake. Just as organisms solve different problems than species, the individual snowflake solves a different problem than a snowbank. A snowflake solves the problem of providing numerous surfaces and edges for water molecules to adhere to and crystallize upon. The six-sided symmetrical structure of snowflakes maximizes their surface area-to-volume ratio, allowing them to efficiently capture and collect water vapor from the surrounding air. The delicate and often branched structure of snowflakes contributes to their aerodynamic properties, allowing them to float gently through the atmosphere as they fall to the ground. This structure helps ensure that snowflakes can travel long distances without being disrupted or broken apart.
But a snowbank is another level of complexity, formed by the collective action of snowflakes. In a group, snowflakes solve the problem of insulation, by trapping air within the spaces between ice crystals. This helps maintain a stable temperature within the snowbank, reducing heat loss from the ground. Collectively, a group of snowflakes gives off a characteristic white color (which individual snowflakes do not), which plays a crucial role in regulating Earth's climate, by reflecting sunlight back into space, thus helping to cool the planet.
Individual polar bear hairs solve a different problem than the coat of hair as a whole. Single polar bear hairs are hollow, which makes them look transparent on their own, but as a group they offer insulation and produce a white color. An individual hair is connected to nerve endings, providing the bear with a highly sensitive sense of touch, allowing it to detect changes in its environment. Collectively, polar bear hair solves the problems of camouflage, water repellency and buoyancy.
Always selecting for the group is how nature goes meta. This is because to be meta is to consider all subsets of a group at once, and spot which invariant properties they all have in common. Nature is forever hovering above the details and "noticing" how pieces work together to produce what is needed to survive. Nature can select the best group because it selects not for individual pieces, but for groups.
By selecting for the group, nature is effectively making "analogies." Nature is finding connections between things, automatically, because only shared pieces represent the material configurations that survive. By seeing emergence as the natural byproduct of nature selecting for the group, we can understand how nature is able to create increasing levels of physical abstraction, all without using design. This hints at how we might do the same.
The things we often consider when designing a group are not in and of themselves bad. It makes sense to have diversity, expertise, domain knowledge, leadership, good personalities, collaborative skills and so on. But these must emerge on their own, just as mitochondria emerge in the cell. The group attributes we hope to see are not for us to select, because we have little idea how they relate to each other in the wild. The way we can select for the group is to just add variety. Not a type of variety, just variety. This is a meta property of all systems that evolve effectively. They have a high level of variation, as per nature's recipe. We can thus also say that aiming for iteration is good. Not a type of iteration, just iteration. And as for selection, we can now see that it is the group that must be selected. We must select from the top, allowing the bottom-up flux of matter and information to manifest what works.
Building Differently
We have now