Saturday, May 13, 2023

Thoughts on Wiggins and Jones, How Data Happened

This morning I finished reading Chris Wiggins and Matthew L. Jones, How Data Happened: A History from the Age of Reason to the Age of Algorithms (Norton, 2023), based on a course the two teach at Columbia University. It does one of the things that a good history should do: it makes clear the contingent nature of the present, how things could have turned out differently but for specific events or decisions, and it argues that we have more choices than we think.

It also did something that great histories do: it made me see a subject in a new light. In the past, I have read books and articles discussing how data has been used in one fashion or another: how it was abused by "race science," how data processing was used to facilitate the Holocaust, how governments have amassed huge archives of often incompatible and sometimes irretrievable data, how data facilitates surveillance, and much else. This book focuses on the parallel evolution of our understanding of data and of statistics over the last quarter-millennium or so in Europe, America, and India.


Wiggins and Jones move through the early history of these subjects and explore how they were shaped by concerns about race and eugenics. So much of the early history of statistics was shaped by its founders' concerns and beliefs about race or the possible decline of the "white race." "Moral panics can create new sciences," they note (p. 35). It was also in this era that conflicts arose between those who wanted statistics to be more thoroughly grounded in mathematics and given a sound theoretical basis and those concerned with the applied problems of engineering, government, and business.


The middle of the book covers the effects of World War II, the Cold War, and the growth of what we loosely call AI. The conflict between the two sides (mathematical and applied) continued. For a time, it appeared that the mathematical statisticians had won the field, and the early history of AI was shaped accordingly. It was more complex than that, of course. There were smaller conflicts within the larger ones, more nuanced differences over what was thought important, and real differences in world views. The different factions have won and lost a number of skirmishes, and the results of that shape the present hype, cautions, and battles over generative AI and ethics.


One point the authors make in the middle of the book (123-124) is that Alan Turing, whose name is bandied about so often in these discussions, had a "capacious vision of intelligence" drawn from the human and the animal world and including much more than logic and reason. The (mostly) men who began to develop the field of AI after him, concerned with building bombs, making money, breaking codes, or with more theoretical objects, narrowed that vision to calculation, data processing, and logic. Put another way, we have been bequeathed impoverished visions and expectations of AI.


The final section concerns how financial, social, political, and ethical factors have shaped the world of data that now surrounds and penetrates our every moment. This is where Wiggins and Jones really bring forward the contingency of the present and future. Their backgrounds are important here. Jones is a professor of History at Columbia University. His previous books have been about the Scientific Revolution and the early history of calculating machines. Wiggins is an associate professor of Applied Mathematics, but, perhaps more importantly for the insights it has afforded him in the final chapters, is also chief data scientist for the New York Times. The authors understand that the way we handle data, AI, ethics, privacy, and related issues is going to have an outsized importance in the future. They are concerned with the forces and structures that produced this situation and with how those can be changed.


As I read these chapters, I began to understand more about the conflicts between different AI factions that have become so prominent and vehement over the past two years, especially since the release of ChatGPT. These go back to the beginning of AI, but they also reflect a divide over the ethics and governance of AI between those who would try to encode ethics and governance into algorithms, reducing them to rules, and those who understand them in terms of larger human and political contexts.


They packed a lot into just over 300 pages, and they did it in a readable way. It is a good read, and there is so much more to their topic; this book whetted my appetite for more.

Monday, May 8, 2023

Drones, AI, and Information Warfare

I have been thinking about this Twitter thread. It has been gnawing at the back of my mind all day. It is tied up with Matthew Ford and Andrew Hoskins's book Radical War: Data, Attention and Control in the Twenty-First Century (OUP, 2022). Thomas Rid's Active Measures: The Secret History of Disinformation and Political Warfare (FSG, 2020) colors my thinking, along with the reading I have been doing on AI all year.

What is emerging in Ukraine is a form of war based on the resources of Western militaries harnessed to the tactics of the underdog derived from the asymmetric warfare of the last few decades. The extreme importance of networks, cheap computing, and vast numbers of drones is striking.

At the same time, we are witnessing the emergence and rapid evolution of so-called generative AI in the US and China. That keeps getting characterized in Cold War terms without too much thought being given to what that might mean. On the one hand, it means that, as with every information technology the PRC or the Soviets faced in earlier decades, it has to be tightly controlled. It has to be available for the state to manipulate the people but prevent the people from manipulating information in turn. The flip side is that AI will be used as a powerful driver of information warfare to manipulate the citizens of other countries. 

It also means the West will exercise comparatively less control (at least directly), which makes it more open to attack but also more open to novel uses and unexpected developments in AI. We need to recognize that the military and intelligence complex is always deeply invested and involved in AI and in all computing. We frankly would not have a lot of these techniques and technologies without DARPA and the NSA. If the military and intelligence agencies can integrate these AI developments with the rapidly evolving techniques of warfare emerging in Ukraine, we should see further destabilization of our notions of warfare.

It is anything but certain that the American military can reorient itself that way in short order. It is also possible that anti-government groups in the US could reorient this way quickly and use the combination of AI, drones, and new tactics to try to create an environment they believe will allow them to triumph. 

We are already living in a world where our phones and watches are simultaneously devices intelligence agencies can use for real-time data collection and surveillance, the military can use for reconnaissance and targeting, and journalists and NGOs collecting information on war crimes can use for reporting. They also open us to constant propaganda and information warfare.

Something like this has been gestating in my mind for a few days. I am struggling to put together a coherent set of ideas, so, for now, it is just something I need to express so I can work out other ideas that may be more pertinent or that may constellate with it. 



Saturday, May 6, 2023

AI Tsunamis and Learning to See Larger Contexts

My thoughts about generative AI have been all over the place over the last few months. Trying to understand it and help others understand it has become a major focus since January, both at work and outside of it. For so many of us, ChatGPT hit like a tsunami. I was aware of what was happening with text-to-image apps like Stable Diffusion, DALL-E, and Midjourney, but I was not following AI developments in general and paid attention to them only in the context of art and art history, and of intellectual property and copyright.


At first, ChatGPT was just a distant rumbling beneath the sea. That was back in December. Then the hype and the angst built into a full-fledged eruption. A new landmass was rising from the boiling depths, and the waves it created towered over us. That was January and early February. We kept afloat and tried to steer our ships in the right direction. Then in mid-March, just as some small stability seemed achievable, we were hit by a whole succession of new waves: GPT-4, Midjourney 5, a string of announcements from Google, Microsoft, and Nvidia (most of these systems run on their hardware for now). It was a second tsunami. Over the following weeks, we had the notorious "Sparks of AGI" paper and the calls for a six-month moratorium on AI development, a well-developed critique of the motives behind it (which may include apocalyptic ideologies, eugenics, and the desire of some signers to catch up), and a fairly constant stream of other developments.


All of this is to say that I have learned a lot, been caught up in events, made mistakes, gotten equally caught up in the technology at times, sometimes, like now, been reflective, and often been in intellectual and emotional turmoil about the whole thing. I have not become an expert, but I keep plugging away at understanding it.


There are so many aspects that we need to comprehend. There is the technology itself. There are deep and extensive ethical issues, compounded by the ideologies of the backers and creators of this technology, as well as the boosters and the critics. (For the record, I am closer to ideas coming from DAIR and Critical AI in my views on the ethics and hype than to those of any other group. This is tempered by my own thinking about technology, which often seems a little out of tune with anyone else's.) It is hard to overestimate the importance of the ethics of AI and of grounding our approach to the technology in a realistic assessment of them, rather than in the paroxysm of existential, apocalyptic thinking we have had for the last few months.


I tell people that it is like the famous story of the person who said the Earth rests on the backs of elephants, that the elephants are on the back of a giant turtle, that the turtle rests on the back of another turtle, and that it is turtles all the way down. The difference is that these are ravenously hungry snapping turtles. That is to say that, every time I think I understand the extent of the issues, I discover it is even bigger than I thought. There is always another snapping turtle - often just a baby but sometimes a very old and irritable one.


Ethics must inform our decisions about how we use it as much as or more than politics, economics, or, as I suspect will happen, religion. We also need to understand it from a global perspective. I do not mean the supposed AI arms race between the United States and China, or the specific policies of EU countries, or how it might factor into Russian disinformation campaigns. All of those are important but are not my present concern. The American perspective, to the extent there is one in this chaos, is largely that of Silicon Valley, Redmond, Washington, Hollywood, New York, and inside the D.C. Beltway. It is shaped by utopian fantasies and apocalyptic fears, economic beliefs, and the scramble for power, profit, and position. It is caught up in ideas and fallacies that have been brewing since the 1850s at least.


The dominant perspective here is one of relentless, unstoppable technological change and unlimited economic potential and development. It can be decked out with flags and bunting, dressed in robes and vestments, and pronounced to be logical and scientific. It is what we, or at least late Baby Boomers like myself, were spoon-fed. I knew there were things wrong with it by the time I was ten; there were too many nuclear missiles in our area to let me accept it at face value. But it was, and is, so pervasive that it remains hard to shake. Sometimes the response it evokes feels like an atavistic instinct.


It is, frankly, baloney. Maybe if we had unlimited, clean electrical power, unlimited natural resources, a better moral compass, and lived in a world where everyone had simply adopted Americans' view of ourselves, it might work. That is one hell of a counterfactual, though a lot of people treat the world as if it is, or will soon be, the case. One reason that some ethicists like Timnit Gebru and Emily Bender are too often ignored is that they do not buy it. They try to correct it. Maha Bali is gentler in her criticism and equally insightful. Even something that we take as a great triumph of AI, machine translation, is fraught with problems, not least of them the problems of translating across language families or the possibility that AI is reinforcing English as a hegemonic or imperial language. (See Emily Bender, Timnit Gebru, et al., "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?"; Paris Marx's interviews with Timnit Gebru and Emily Bender for the Tech Won't Save Us podcast; and Maha Bali, "Agentic and Equitable Educational Development For a Potential Postplagiarism Era," as well as her blog, Reflecting Allowed, in general. For the idea of imperial languages, see Nicholas Ostler, Empires of the Word: A Language History of the World. For the critical importance to a nation or a people of control over how its own language is expressed and transmitted, see Jing Tsu, Kingdom of Characters: The Language That Made China Modern.)


It goes way beyond language. That is an area that fascinates me but one I have hardly explored. There is also the exploitation of cheap African labor in cleaning inappropriate material from the training data. Resource extraction for the manufacture of the huge numbers of chips and other hardware required is another concern. Heavy energy demands, which generate greenhouse gases and contribute to climate degradation affecting many of the countries most exploited for their human and natural resources, are yet another.


Those are just some of the bigger issues and not to be minimized. They point to another aspect of the global - the environmental and climate costs of generative AI. It already has a substantial footprint in energy usage, greenhouse emissions, resource depletion, pollution (the materials mined for chips and servers leave behind a lot of toxic waste locally), and water use (both in manufacturing and in cooling data centers). The chips and servers will get more efficient, and there may be a turn towards more use of renewable energy. Still, we should expect the use of and demand for generative AI to jump orders of magnitude, so we may see a net increase in all of these negative effects.


I want to return to language, though. Generative AI and language interest me for other reasons. Images do as well. Whenever humans encounter language, they assume thought like their own, even across cultural divides. We largely define both intelligence and humanity through language, and certainly through symbolization. We define behaviorally modern humans, that is, Homo sapiens exhibiting signs of consciousness and intelligence like our own, chiefly through the creation of images, which is why any decorative marks on early artifacts are closely studied, and also why cave art fascinates. Whenever a non-human exhibits any signs of language or begins to show an understanding, however basic, of human language, be it a primate, a bird, a dog, or a dolphin, we go a little bit ape. Some react reflexively, saying there must be a mistake. Others are overjoyed. Some of us just want to know more and are deeply fascinated with the phenomenon of language.


For years, we have been dealing with animals that can handle human language. Likewise, we have had computers for decades that can carry on written or spoken conversations within limits. We knew this was a result of programming and that the ability was no sign of intelligence. We knew it was possible for someone to explain every step of the process, even if we ourselves could not. Now we are confronted by an "intelligence" that is "trained" rather than programmed. We are repeatedly told that no one can really explain every step of the process that creates the output. We may intellectually understand that there is no thinking going on in any fashion we would recognize. Instead, it is all about the probability that one word will follow another in a given context. The output is constructed of tokens, and the AI is doing something like auto-complete on steroids.
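To make that "auto-complete on steroids" idea a little more concrete, here is a minimal toy sketch in Python. It is not how any real model is built: the probability table is invented, the two-token "context" stands in for the much longer context a real system uses, and real models learn their distributions over tens of thousands of tokens from enormous corpora. It only shows the basic loop: look at the context, get a probability distribution over possible next tokens, pick one, and repeat.

```python
import random

# A made-up table: given the last two tokens of "context", the probability of
# each candidate next token. A real model learns distributions like these over
# tens of thousands of tokens; this table exists only to illustrate the loop.
NEXT_TOKEN_PROBS = {
    "the cat": {"sat": 0.6, "slept": 0.3, "meowed": 0.1},
    "cat sat": {"on": 0.8, "quietly": 0.2},
    "sat on": {"the": 0.9, "a": 0.1},
    "on the": {"mat": 0.7, "sofa": 0.3},
}

def generate(prompt: str, max_new_tokens: int = 4) -> str:
    """Repeatedly sample a next token given the current context."""
    tokens = prompt.split()
    for _ in range(max_new_tokens):
        context = " ".join(tokens[-2:])   # toy "context window" of two tokens
        dist = NEXT_TOKEN_PROBS.get(context)
        if dist is None:                  # nothing in the table for this context
            break
        candidates = list(dist.keys())
        weights = list(dist.values())
        tokens.append(random.choices(candidates, weights=weights)[0])
    return " ".join(tokens)

print(generate("the cat"))  # e.g. "the cat sat on the mat"
```

The point of the sketch is only that nothing in the loop "thinks." It just keeps sampling from a distribution conditioned on what came before, which is why the output can feel fluent while being produced by something very unlike a mind.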


That is not how it feels. It is not the impression it leaves on us. It feels more like we are talking to a person. My first impression of Bard AI was that I was chatting with a very polite, somewhat inept, reference librarian. It is easy to impute feelings and thought to these programs. We tend to think of them in somewhat human terms. I even apologized to Bing AI on one occasion. This has a lot of implications. Do they have agency? Can they be held liable for their actions? Can they claim copyright? Are they really creative? Should they have rights? Are they sentient beings? 


One direction this is taking us is to focus on the human-like qualities they exhibit and to reinforce the idea or conviction that intelligence (and maybe sentience) is specifically human. Indeed, much of the fear evinced in the last two months has come from those who understand that these systems are not human-like and are something more like an alien mind. I find that fascinating, both because of the fear of the alienness of other intelligences, and for the possibilities it might create to extend our understanding of intelligence and mind, that is, if there is really something more going on in those servers than we suspect.


But it also means that we are still fixated on intelligence like our own. In the past few decades, we have begun to learn that intelligence and sentience are much more widespread than we ever thought. We are even beginning to see signs of it in plants, or at least in the complex ecosystems that plants create. We need AI to be non-human, less human-like, more alien, just to maintain our bearings in a world full of other intelligences. I believe there is a real danger that by focusing on the seemingly human qualities of AI, we will be led away from considering and trying to comprehend all of the intelligences that surround us and interact with us, often without our knowledge.


There is a double-edged sword here. One edge may force us to reconsider aspects of our own intelligence, sentience, creativity, agency, and uniqueness. This is a very real possibility, and when some AI proponents suggest that we think like an AI, they are doing what we have long done: likening our minds to the latest technology. We have been doing it for centuries. We like those machine metaphors and slip easily into them, often without understanding the implications for how we think of ourselves and behave towards others.


The other edge cuts us off from the nascent understanding we have of sentience and intelligence across the living world. By appearing to think somewhat like a human, even though they are not, these synthetic intelligences promote the idea that only human intelligence matters, that we should focus just on ourselves and on these machines. For a long time, our attempts to communicate with "higher" mammals - primates and dolphins - followed this direction and reinforced false views of intelligence. We taught chimpanzees and gorillas to communicate with us through sign language or keyboards full of pictures and symbols. John C. Lilly even tried to teach dolphins to speak English. Those attempts met with limited success and began to show us a little more about the minds of our fellow mammals, at least. Things began to get a little stranger with birds, particularly an African Grey Parrot named Alex, who commanded a vocabulary of a hundred words and showed some of the apparent cognitive abilities of a two-year-old human. We also began to realize that birds use tools and have cultures, particularly members of the crow family. (Suddenly, Poe's Raven seemed less far-fetched.) Birds are pretty far from us evolutionarily; our lineages diverged hundreds of millions of years ago.


Then things got really strange. We began to understand that octopuses are not only sentient, but extremely smart, very resourceful, and even seem to have a sense of aesthetics. That shook things up. An octopus is about as alien, and for many, as scary, as any creature on Earth. Despite Ringo wanting to while away the time with his beloved in an Octopus's Garden, and despite Japan having had an erotic sub-genre dedicated to women and cephalopods for a couple of centuries, octopuses have not been so well regarded in the Anglophone sphere. It is no accident that H.G. Wells made his Martians resemble octopuses, or that Lovecraft's horror god Cthulhu was a mashup of human, bat, and cuttlefish. To make things worse, they are invertebrates and have decentralized brains that work differently from ours. Yet they can solve problems and arrange their surroundings to their own liking. James Bridle suggests, in Ways of Being: Animals, Plants, and the Search for Planetary Intelligence, that the octopus teaches us there is more than one "way of 'doing' intelligence," and that intelligence "is not something to be tested, but something to be recognized, in all the multiple forms that it takes." (Bridle, 51-52)


Bridle goes on in his book to argue that we need to recognize all types of intelligence and sentience, even the collective sorts seemingly found in social insects and across plant species. We need all of them to give context to our own, and also to the artificial varieties we are developing. He also argues that we need to bring them all into conversation to save life on Earth. That may sound a little bizarre and idealistic, and it may be, but we are seeing more and more thinkers who see the world in similar terms. As I said, one danger of our current obsession with the apparent humanity, as well as the inhuman threat, of generative AI is that it may take our attention away from those other kinds of intelligence. It might make it easier to disregard them and allow us to continue to destroy them without a second thought.


Maybe it also keeps us from understanding artificial intelligence. Do we need many models of intelligence to understand artificial intelligence and to recognize the point at which it might become something more than the sum of what we put into it? I have no idea if computer intelligence can ever become conscious and exhibit cognitive abilities. I am pretty sure that generative AI as we know it today will not, but we also have to ask what happens, and it is happening very quickly, when it can interact with other types of artificial intelligence, access all kinds of tools (as one story put it, I do not recall the reference, ChatGPT learned to use a calculator), and begin to be embodied (both Google and OpenAI are working on this, and GPT has already been incorporated into at least one Boston Dynamics "dog").


Maybe generative AI will be just a piece of what is needed to create an AGI (Artificial General Intelligence, the dream of many of the creators of generative AI). Think of it as being like Broca's area, or some of the other areas associated with speech and symbolic expression, in the human brain. If we get AGI, I am betting it comes from synergies between different kinds of what we so loosely and inaccurately refer to as AI today. On the other hand, it may not have anything to do with AGI, or AGI may never be developed. If anything, the latter seems the most likely.


But then I am no expert. I am just trying to understand. What I do know is that we have to watch out for both utopian and apocalyptic thought in regard to AI. We need to understand it within larger contexts. Obviously, I think some speculation about it is a good thing, or I would not have written this, but it needs to be grounded, and the very real dangers and potentials of the technology have to be kept firmly in view. 



Note: Like everything else in this blog, this is my interpretation, and does not reflect the views of the University of Missouri System - or anyone else for that matter. My take on things is often idiosyncratic and sometimes eccentric.