Artificial Intelligence
Symbolic through connectionism to deep learning to LLMs and agents — the pioneers, the paradigms, the mechanisms
A mind map of artificial intelligence as a series of paradigm shifts: the pre-1956 foundations in logic and cybernetics; the symbolic era and its winters; the connectionist revival and the statistical turn; the deep-learning breakthrough; the foundation-model and LLM era; and the contemporary work on alignment, evaluation, and agentic systems. It names pioneers, institutions, papers, and mechanisms, with dates, across six branches.
Foundations & Formalism, pre-1956
  Mathematical logic
    George Boole — Laws of Thought, 1854
    Gottlob Frege — Begriffsschrift, 1879
    Russell & Whitehead — Principia Mathematica, 1910–1913
    David Hilbert — Entscheidungsproblem, 1928
    Kurt Gödel — incompleteness theorems, 1931
  Computation theory
    Alan Turing — On Computable Numbers, 1936
    Universal Turing machine — the theory of general-purpose computation
    John von Neumann — First Draft, EDVAC, 1945
    Stored-program architecture — instruction and data in one memory
    Church–Turing thesis — the limits of the effectively computable
  Cybernetics & information theory
    Norbert Wiener
      Cybernetics: Or Control and Communication, 1948
      Feedback, goal-directed behavior, communication as unifying science
    Claude Shannon
      A Mathematical Theory of Communication, Bell System Tech J., 1948
      Entropy, channel capacity, the bit
    McCulloch & Pitts
      A Logical Calculus of the Ideas Immanent in Nervous Activity, 1943
      The neuron as a threshold logic unit — founding document of connectionism
  Early machines and programs
    Minsky & Edmonds — SNARC neural machine, Harvard 1951
    Christopher Strachey — Ferranti Mark 1 checkers, Manchester 1951
    Arthur Samuel — self-learning checkers at IBM, 1959 (coins "machine learning")
    Grace Hopper — A-0 compiler, 1952 (programs that write programs)
  Philosophical origins
    Alan Turing — Computing Machinery and Intelligence, Mind 1950
    The Imitation Game — operational definition of machine intelligence
    Karel Čapek — R.U.R., 1921 (coins "robot")
    I. J. Good — Speculations Concerning the First Ultraintelligent Machine, 1965

Symbolic AI & Expert Systems, 1956–1985
  The Dartmouth founding
    1955 — McCarthy, Minsky, Rochester, Shannon submit proposal
    1956 — Dartmouth Summer Research Project names the field
    Newell, Simon, Selfridge among the ~10 attendees
  Early programs and languages
    Newell & Simon — Logic Theorist at CMU/RAND, 1956
    Frank Rosenblatt — Perceptron, Cornell Aeronautical Laboratory 1957
    John McCarthy — LISP at MIT, 1958
    Recursion, garbage collection, symbolic expressions — the AI language for 30 years
    Joseph Weizenbaum — ELIZA/DOCTOR, MIT 1964–1966
    James Slagle — SAINT symbolic integrator, MIT 1965
    Terry Winograd — SHRDLU, MIT 1970
  Pole institutions
    Stanford AI Lab (SAIL) — founded by McCarthy, 1963
    MIT AI Lab — founded by Minsky & Papert, 1964
    CMU AI program — Newell & Simon lineage
    Edinburgh AI department — Donald Michie, 1963
    George Devol / Unimation — Unimate industrial robot, GM 1961
    Shakey the Robot — SRI International, 1969 (STRIPS planning)
  Knowledge and expert systems
    Feigenbaum & Lederberg — DENDRAL at Stanford, 1965
    Colmerauer & Roussel — Prolog at Aix-Marseille, 1972
    Edward Shortliffe — MYCIN medical expert system, Stanford 1972
    DEC — XCON/R1 expert system, 1980 (~$25M/yr savings)
    Doug Lenat — Cyc at MCC, 1984 (common-sense knowledge)
    Hans Berliner — BKG backgammon at CMU, 1979
  Philosophy & critique
    Minsky & Papert — Perceptrons, 1969 (XOR limitation)
    John Searle — Chinese Room argument, 1980
    Hubert Dreyfus — What Computers Can't Do, 1972
    Rodney Brooks — Elephants Don't Play Chess, MIT 1990 (subsumption)
  First winters and the Fifth Generation
    ALPAC Report, 1966 — NLP funding cut (first partial winter)
    Perceptrons backlash, 1969 — connectionist winter begins
    Japan MITI — Fifth Generation Computer Systems, 1981 ($400M, 10 yr)
    Lisp Machine collapse, 1987 — hardware specialization fails
    Second AI winter, 1987–1993 — DARPA cuts general AI funding

Connectionist Revival & Statistical ML, 1986–2011
  Backpropagation renaissance
    Rumelhart, Hinton & Williams — backprop paper, Nature 1986
    McClelland & Rumelhart — Parallel Distributed Processing books, MIT Press 1986
    Paul Werbos — backprop derived in 1974 Harvard thesis (underattributed)
    Alexey Ivakhnenko — Group Method of Data Handling, USSR 1965–1971 (first multi-layer)
  Recurrent networks and memory
    John Hopfield — Hopfield networks, Caltech 1982
    Sepp Hochreiter — vanishing gradient problem, TUM 1991
    Hochreiter & Schmidhuber — LSTM, Neural Computation 1997
    Gating mechanisms — input, forget, output — solve vanishing gradient for sequences
  Early deep learning and vision
    Yann LeCun — CNN for ZIP codes, Bell Labs 1989
    LeNet-5 — gradient-based learning for documents, 1998
    Geoffrey Hinton & Ruslan Salakhutdinov — deep belief nets, Science 2006
    Yann LeCun coins "deep learning," ~2003
    Rajat Raina & Andrew Ng — GPU deep learning, ICML 2009
  Kernel methods, Bayes, ensembles
    Cortes & Vapnik — Support Vector Machines, Bell Labs 1995
    Judea Pearl — Bayesian networks, UCLA 1988
    Leo Breiman — bagging 1996, Random Forests 2001
    Koller & Friedman — Probabilistic Graphical Models, MIT Press 2009
    Hidden Markov Models — the workhorse of pre-2012 speech recognition
  Game-playing and robotics
    Christopher Watkins — Q-learning, Cambridge PhD 1989
    Gerald Tesauro — TD-Gammon, IBM 1992
    IBM Deep Blue defeats Kasparov, 1997
    DARPA Grand Challenge 2004 — no vehicle finishes; Stanley (Stanford) wins 2005
    Boss (CMU) wins DARPA Urban Challenge, 2007
    Rodney Brooks — behavior-based robotics, MIT (founds iRobot)
  Web-scale ML and ImageNet
    Page & Brin — PageRank, Stanford 1998
    Phrase-based statistical machine translation — IBM model, 2001
    Fei-Fei Li — ImageNet proposal, CVPR 2007
    ILSVRC competition launched, 2010 — 1.2M images, 1000 classes
    Norvig & Russell — AI: A Modern Approach textbook standard

Deep Learning Breakthrough, 2012–2017
  The pivot: AlexNet
    Krizhevsky, Sutskever & Hinton — AlexNet, Toronto 2012
    15.3% top-5 error vs. 26.2% runner-up on ILSVRC
    ReLU activations, dropout regularization, two GTX 580 GPUs
    CUDA/cuDNN — NVIDIA's software stack as the enabling substrate
  Representation learning
    Mikolov et al. — Word2Vec, Google 2013
    Kingma & Welling — Variational Autoencoders, Amsterdam 2013
    Ian Goodfellow et al. — Generative Adversarial Networks, 2014
    GloVe — Pennington, Socher & Manning, Stanford 2014
  Architectural innovations
    Sutskever, Vinyals & Le — seq2seq with LSTMs, Google 2014
    Bahdanau et al. — attention for translation, Montréal/Jacobs 2014
    He, Zhang, Ren & Sun — ResNet, Microsoft Research Asia 2015
    Skip connections — enabling 152+ layer training
    Batch normalization — Ioffe & Szegedy, Google 2015
  Reinforcement learning at scale
    DeepMind DQN plays Atari, Nature 2015
    AlphaGo defeats Lee Sedol 4–1, Seoul 2016
    Monte Carlo Tree Search + policy/value nets trained on games + self-play
    Proximal Policy Optimization (PPO) — Schulman et al., OpenAI 2017
  Industrial labs
    Google Brain founded — Ng, Dean, Corrado, 2011
    FAIR (Facebook AI Research) founded, 2013
    Google acquires DeepMind for ~$500M, 2014
    OpenAI founded — Musk, Altman, Sutskever, Brockman, 2015 ($1B)
    Microsoft Research, IBM Research, Baidu IDL — parallel programs

Foundation Models & LLMs, 2017–2023
  The Transformer
    Vaswani et al. — Attention Is All You Need, Google 2017
    Self-attention replaces recurrence — O(n²) but fully parallel
    Multi-head attention — multiple subspaces of representation
    Positional encodings — rotary (RoPE), learned, sinusoidal variants
    Flash Attention 2 — Dao, Stanford 2023 (IO-aware exact attention)
  Pretraining and transfer
    Howard & Ruder — ULMFiT, 2018 (fine-tune pretrained LM)
    OpenAI — GPT-1, Radford et al., 2018 (117M params)
    Google — BERT, Devlin et al., 2018 (340M params, bidirectional)
    T5 — Raffel et al., Google 2020 (text-to-text unified framing)
    Self-supervised learning — next-token, masked LM, contrastive
  Scaling
    OpenAI — GPT-2, 2019 (1.5B params; staged release)
    Kaplan et al. — scaling laws, OpenAI 2020 (power-law loss curves)
    OpenAI — GPT-3, 2020 (175B params; in-context learning)
    Google — Switch Transformer, 2021 (1.6T params, sparse MoE)
    Google — PaLM 540B, 2022 (chain-of-thought emerges at scale)
    DeepMind — Chinchilla, 2022 (compute-optimal data/params ratio)
  Multimodality and generation
    OpenAI — CLIP, 2021 (contrastive image-text pretraining)
    OpenAI — DALL-E, 2021 (text-to-image generation)
    Google DeepMind — Gemini, Dec 2023 (native multimodality, 3 sizes)
    Stability AI — Stable Diffusion 1.4 open release, Aug 2022
    Latent diffusion models — Rombach et al., 2022
    Whisper — open-weight speech recognition, OpenAI 2022
  Scientific applications
    DeepMind — AlphaFold 2 at CASP14, 2020–2021
    Near-experimental accuracy on protein 3D structure
    ~200M protein structures released publicly
    AlphaFold-Multimer, AlphaFold 3 — beyond single-chain prediction
  ChatGPT and the deployment era
    Ouyang et al. — InstructGPT, OpenAI 2022 (RLHF at scale)
    OpenAI — ChatGPT launch, Nov 30, 2022 (1M users in 5 days)
    OpenAI — GPT-4, Mar 2023 (multimodal, 90th percentile bar exam)
    Meta — LLaMA 1 and 2, 2023 (open-weight release; ecosystem explosion)
    Anthropic — Claude 2, 2023 (100K context window)
    Mistral AI — open-weight European models, 2023

Alignment, Agents & Evaluation, 2017–present
  Alignment foundations
    Christiano et al. — Deep RL from Human Preferences, OpenAI 2017
    Russell et al. — Cooperative Inverse Reinforcement Learning, Berkeley 2017
    MIRI (agent foundations) vs. empirical alignment — the two schools
    Stuart Russell — Human Compatible book, 2019 (beneficial AI problem)
  Safety research programs
    Concrete Problems in AI Safety — Amodei et al., 2016
    Unsolved Problems in ML Safety — Hendrycks, Carlini et al., 2021
    Red-teaming — adversarial evaluation as safety practice
    Mechanistic interpretability — Chris Olah, Anthropic (circuits, superposition)
    Anthropic Responsible Scaling Policy (RSP), 2023
  Evaluation benchmarks
    Hendrycks et al. — MMLU, 2020 (57-subject knowledge benchmark)
    GPQA Diamond — graduate-level science benchmark, 2023
    SWE-bench — software-engineering task benchmark, Princeton 2023
    LMSYS Chatbot Arena — human preference leaderboard
    HumanEval, MATH, BIG-Bench, HellaSwag — domain-specific evals
  Post-training methods
    RLHF — reward-model + PPO pipeline (InstructGPT, 2022)
    Anthropic — Constitutional AI, 2022 (written principles + RLAIF)
    DPO — Direct Preference Optimization, Rafailov et al., Stanford 2023
    RLAIF — reinforcement learning from AI feedback
    Anthropic Claude 3.7 — extended thinking (user-visible reasoning), 2025
  Agentic systems
    Yao et al. — ReAct (reason + act), Princeton/Google 2023
    AutoGPT & BabyAGI — open-source agent frameworks, 2023
    Toolformer — Schick et al., Meta 2023 (self-taught tool use)
    Claude computer use — Anthropic, 2024
    Model Context Protocol (MCP) — Anthropic open standard, 2024
    OpenAI o3 and o4-mini — tool-in-reasoning-loop, 2025
  Reasoning models
    OpenAI o1 — inference-time chain-of-thought, Sep 2024
    DeepMind AlphaProof / AlphaGeometry 2 — IMO silver-medal, 2024
    Claude 3.7 Sonnet extended thinking — token-budgeted deliberation, 2025
    Google Gemini 2.5 Pro — 1M-token context, multimodal reasoning, 2025
    Test-time compute as a new scaling axis
  Governance and policy
    EU AI Act passed — European Parliament, Jun 2023 (risk-based tiers)
    US Executive Order on AI — Biden, Oct 2023
    Bletchley Declaration — UK AI Safety Summit, Nov 2023
    Anthropic Model Spec / Character Overview, 2024
    OpenAI Model Spec, 2024
    US export controls on AI chips — BIS rules 2022, 2023, 2024

Brian Tighe · Mind Maps
Orbital mind map.
How to read this
The center holds the topic. The six branches fan out bilaterally, three on each side, each in its own color. Sub-branches nest three levels deep under each top-level branch. Hover a leaf to trace the path back to the center; hover a branch to see everything it contains.
This is the shape the topic has when you try to hold the whole field in your head at once. It is not an argument; it is a scaffold. The essays argue against or within scaffolds like this one.