A Brief History of Computing, Data, and AI (1940s and 1950s)
“I didn't have time to write a short letter, so I wrote a long one instead.” - Blaise Pascal
This is a draft of a section from the upcoming Chapter 3 of Practical Data Modeling. This section delves into the rich history of the forces that have shaped today's software and data world. These forces, including computing, data, analytics, application development, and ML/AI, have evolved and converged over time. Today, it's hard to imagine a world where engineers aren't working with data across various use cases - application development, analytics, and ML. This chapter aims to provide context on the significant drivers of the past and how they shape the world of data today. I picked the 1940s and 1950s as the starting point for this historical dive, as it was the first truly “modern” era of computing.
This chapter is a labor of love, and it's proving far more difficult than I imagined. There's a lot of history I could include, and the trick is understanding what's essential - filtering and curation are key. As the quote above expresses so aptly, summarization is tricky. This is a brief history of the topics and events that matter to the discussion, not a complete exposition; such a history would be a book unto itself. I don't expect this chapter to cover every nuance and event fully, and I expect it will be rewritten quite a bit, as there are a million things you could include. I've tried to pick the most critical factors in computing, data, and AI that will influence the later discussions on data modeling, but I understand there may be gaps.
Please let me know if you think something is unclear or if I missed something. I plan to use the feedback to gauge the other sections of this chapter, which are drafted and ready to publish very soon.
Thanks,
Joe Reis1
The 1940s and 1950s
The 1940s and 1950s were a transformative era in computing, analytics, and the nascent field of artificial intelligence (AI). This period witnessed the birth of groundbreaking machines like ENIAC (1945) and UNIVAC (1951), which laid the foundation for electronic digital computing. Before this, computers were mechanical or electromechanical, slow, and clunky. While crude and physically enormous by today's standards, these new machines enabled complex calculations and previously impossible data processing. In 1945, John von Neumann introduced a visionary architecture for a general-purpose computer in his paper "First Draft of a Report on the EDVAC." This architecture, with shared memory for both data and instructions and sequential execution of instructions, is still used in nearly every modern computer today. Computers also got progressively smaller. The invention of the transistor in 1947 revolutionized computing by replacing bulky vacuum tubes with smaller, more efficient components - a critical step toward making computers smaller and more accessible. As we'll learn, smaller computers - from mainframes to PCs to smartphones - will cause an explosion in data use cases, leading to various data modeling techniques.
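To make the stored-program idea concrete, here's a toy sketch in Python of a machine in the spirit of the EDVAC design: instructions and data sit in the same memory, and a program counter steps through instructions one at a time. The instruction names and memory layout are invented purely for illustration, not taken from the EDVAC report.

```python
# A toy stored-program machine: instructions and data share one memory,
# and a program counter steps through instructions sequentially.
# The instruction set and memory layout are invented for illustration.

memory = [
    ("LOAD", 7),     # address 0: copy memory[7] into the accumulator
    ("ADD", 8),      # address 1: add memory[8] to the accumulator
    ("STORE", 9),    # address 2: write the accumulator to memory[9]
    ("HALT", None),  # address 3: stop execution
    None, None, None,
    2,               # address 7: data
    3,               # address 8: data
    0,               # address 9: the result will land here
]

accumulator = 0
pc = 0  # program counter

while True:
    op, addr = memory[pc]  # fetch the next instruction from memory
    pc += 1
    if op == "LOAD":
        accumulator = memory[addr]
    elif op == "ADD":
        accumulator += memory[addr]
    elif op == "STORE":
        memory[addr] = accumulator
    elif op == "HALT":
        break

print(memory[9])  # prints 5
```

The point isn't the toy instruction set; it's that the program itself is just data sitting in the same memory as the values it operates on, which is the essence of the von Neumann architecture.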
In 1948, Claude Shannon2's paper "A Mathematical Theory of Communication" provided a mathematical framework that revolutionized our understanding of communication. This paper introduced concepts like the bit, information entropy, channel capacity, and the noisy-channel coding theorem. At a macro level, Shannon's work gave us easier and better ways to store and transmit data. What's fascinating about Shannon's paper is the impact it had on nearly every aspect of computation, data, and AI; I can't think of another paper that had such a major impact on nearly every field we discuss in this book. For instance, the bit is the basic unit of information in a computer system, and information entropy is used in decision trees and neural networks. Shannon's paper also led to data compression and error-correcting codes, driving advancements in data storage and transmission that will feature prominently decades later in databases, file systems, and the Internet.
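To make entropy a bit more tangible, here's a minimal sketch in Python (the function name and example probabilities are my own illustrative choices, not from Shannon's paper). Entropy measures, in bits, how much information an outcome carries on average: a fair coin is maximally unpredictable, while a biased coin tells you less per flip.

```python
import math

def shannon_entropy(probabilities):
    """Shannon entropy H = -sum(p * log2(p)), measured in bits."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# A fair coin flip carries exactly 1 bit of information.
print(shannon_entropy([0.5, 0.5]))   # 1.0

# A heavily biased coin is more predictable, so each flip carries less information.
print(shannon_entropy([0.9, 0.1]))   # ~0.47
```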
A significant innovation of the 1950s was the abstraction of the building blocks that would form modern computing. In the early days of programmable computers, programmers often wrote raw machine and assembly code, which was tedious and error-prone. In 1952, Grace Hopper, a true pioneer in our industry, created the first compiler, A-0, which translated symbolic instructions into machine code. The compiler revolutionized the way programs were developed. In 1959, Hopper also helped create COBOL. Her contributions to the field are genuinely inspiring, and studying her life and work is a fascinating journey I highly recommend.
The 1950s also saw the emergence of the first rudimentary operating systems and additional programming languages like Fortran. The integrated circuit was invented in the late 1950s, paving the way for smaller, faster, and more affordable computers. These advances laid the foundation for more complex operating systems, better hardware, and a growing diversity of languages in the 1960s and beyond.
In turn, these computers revolutionized statistical analysis and data-driven decision-making. Previously, statistical analysis was done mainly by hand, on reams of paper. With computers, researchers and businesses could now analyze large datasets, leading to advancements in economics, social sciences, and operations research. This enabled people like the “Whiz Kids,” led by Robert McNamara, to advance the field of operations research in the 1940s and 1950s. The impact of operations research would later be felt by most organizations and governments worldwide.
Meanwhile, the field of AI was born. In 1943, Warren McCulloch and Walter Pitts published a paper, "A Logical Calculus of the Ideas Immanent in Nervous Activity," introducing the idea of artificial neurons and neural networks. This paper laid the foundation for artificial neural networks, which would shape various epochs of artificial intelligence in the years to come. In 1950, Alan Turing proposed the Turing Test in his paper "Computing Machinery and Intelligence," laying out a theoretical framework to evaluate a machine's ability to exhibit human-like intelligence. This concept sparked discussions and research into the possibility of creating thinking machines.
The Dartmouth Summer Research Project on Artificial Intelligence in 1956 is considered the official birth of AI as a field of study, bringing together pioneers who laid the groundwork for decades of research and innovation. In 1958, Frank Rosenblatt demonstrated the perceptron, an early implementation of the artificial neuron that forms the basis of artificial neural networks (and later, deep learning). Finally, in 1959, Arthur Samuel developed a self-learning program for playing checkers, an early example of machine learning. He also popularized the term “machine learning” in 1959, describing a subset of artificial intelligence that underpins much of today's modern world. These innovations set the stage for further developments in AI in the 1970s and beyond, including the first AI Winter, which you'll learn about shortly.
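To ground the perceptron idea, here's a minimal sketch in Python of a Rosenblatt-style perceptron learning the logical AND function. The function name, learning rate, and training loop are illustrative assumptions, not a reconstruction of Rosenblatt's Mark I hardware.

```python
# A minimal, illustrative perceptron (names and hyperparameters are my own choices).

def train_perceptron(samples, epochs=10, lr=0.1):
    """Train a single perceptron with a step activation on (inputs, target) pairs."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            # Step activation: fire (1) if the weighted sum crosses the threshold.
            weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
            output = 1 if weighted_sum > 0 else 0
            # Perceptron learning rule: nudge weights toward the correct answer.
            error = target - output
            weights = [w + lr * error * x for w, x in zip(weights, inputs)]
            bias += lr * error
    return weights, bias

# Logical AND: output 1 only when both inputs are 1.
and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_samples)

for inputs, target in and_samples:
    prediction = 1 if sum(w * x for w, x in zip(weights, inputs)) + bias > 0 else 0
    print(inputs, "->", prediction, "(expected:", target, ")")
```

A single perceptron can only learn linearly separable functions like AND - a limitation famously highlighted in the late 1960s that contributed to the first AI Winter mentioned above.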
At this point, the fields of computing, analytics, and AI were still very much separate. As you'll see, the continued growth of computational power over the following decades would drive an ongoing convergence of software application development, more robust analytics, and AI.
To be continued… Coming up in the chapter: the 1960s to 1970s, the 1980s to 1990s, and the 2000s to the present. Also, what the hell happened to data modeling, and where we are today.
Updated 6/21/2024
A great book about Claude Shannon is A Mind at Play: How Claude Shannon Invented the Information Age.
I think I'm going to reformat and summarize this "brief history" section into these epochs. The gist is that the fields of computing, analytics, and AI were very siloed and separate. As computing power and networking grew, the fields started converging. This convergence sets the tone for the upcoming section on Mixed Model Arts, where the central thesis of the book is laid out - people need to know about the various ways of data modeling across different use cases (software dev, analytics, ML/AI).
Siloed fields - 1940s to 1960s
Silos, but slowly converging fields - 1970s to 2000s
Rapid convergence - 2010s and 2020s
Thoughts?
"What the hell happened to data modeling?" In part, it seems we go rid of DBAs, and just let the programmers model their own application data. Sometimes well, and sometimes not so well. Much the same with data warehouses. When compute was expensive, the people part was fairly cheap, and expert teams built data warehouses. Now, "self-service" business intelligence means much of the modeling is amateur. Have a good weekend. Interesting chapter. I studied EE, so certainly remember Shannon's work.