What policy makers need to know about AI (and what goes wrong if they don’t)

Policy makers need to understand how AI really works. Otherwise bills like SB 1047 totally fail to achieve their stated goals.
Author

Jeremy Howard

Published

June 17, 2024

Update: After writing this, I got some critical information from the Privacy Committee in CA, that shows that the issues I discovered in SB 1047 are due to a critical definition error; my recommendation to legislate systems, not models, was the goal of the original definitions after all! For details, see: The definition of ‘Artificial Intelligence’ in SB 1047 was actually meant for systems, not models.

Background

Many policy makers in the US are being lobbied to introduce “AI safety” legislation by various well-funded groups. As a result, a number of pieces of legislation are now being drafted and voted on. For instance, SB 1047 is a bill currently working it’s way through the process in California, and introduces a number of new regulations that would impact AI developers and researchers.

Few, if any, of the policy makers working on this kind of legislation have a background in AI, and therefore it’s hard for them to fully understand the real practical implications of the legislative language that’s being brought to them for consideration. This article will endeavor to explain those foundations of how AI models are trained and used that are necessary to create effective legislation.

I will use SB 1047 throughout as a case study, because at the time of writing (mid June 2024) the actual impact of this piece of legislation is very different to what its primary backer, Senator Scott Wiener, appears to have in mind. The aims of SB 1047 can be understood through the open letter that he wrote. This article will not comment on either whether the stated goals are appropriate or accurate, since plenty of other commentators have already written at length about these social and political issues. Here we will instead just look at how they can be implemented.

Another reason to focus on this legislation is because one of the current strongest open source models in the world is from the jurisdiction covered by it: Llama 3 70b, created by the Californian company Meta. (The other current top open source models are Tongyi Qianwen (Qwen) 2 72b, and DeepSeek-Coder-V2; these models are both generally stronger than Llama3, and are both created by Chinese companies.)

A stated goal of SB 1047 is to ensure that large models cannot be released without being first confirmed to be safe, and to do this whilst ensuring open source can continue to thrive. However, this law as written will not actually cover nearly any large models at all, and if it were modified or interpreted so it did, it would then entirely block all large open source model development.

The difference between the goals and reality of SB 1047 is due to critical technical details, that can only be understood through more deeply understanding the technology being regulated. In the remainder of this article, I’ll go step by step through the technical details (with examples) of these issues, along with simple and easy to implement recommendations to resolve them.

The crux of the issue: deployment vs release

SB 1047 uses the terms “deploy” and “release” a number of times, but never defines them. We will see, however, that the definition and use of these terms is the critical cause of the problems in the bill – and the key to solving problems with AI safety legislation and open source more generally.

So let’s agree on definitions for these terms, and then use them carefully. For the purpose of this article (and, I recommend, AI policy in general), we will use the following definitions:

  • Release: Making the model weights and code available, on either a free or commercial basis
  • Deployment: Creating and running a system that uses the model weights and code in some way, as part of an application or application programming interface (API). Examples include OpenAI API, ChatGPT, Claude.ai, and Perplexity.ai.

Note that release is specifically about the transmission of weights and code – providing them to some third party. However deployment can be done internally within the company training a model, or externally. Deployment can also have multiple levels – for instance, OpenAI deploys GPT-4o as part of the OpenAI API, and then other companies deploy their own products and services on top of that API. (In the past I’ve used the terms usage and development instead of deployment and release, but I think those terms were a bit less clear).

OpenAI has released few if any significant AI language models for many years. Neither have most other big tech companies, like Google and Anthropic (although Google has released some smaller versions that don’t really compete with their commercial offerings). So legislation that targets release will not negatively impact these companies. (Quite the opposite, in fact.)

Instead, big tech deploys models, by creating APIs and products that incorporate their trained models. Deployment can be either internal or external – from an AI safety point of view, there is no reason to treat these differently.

A very simple and easy way to avoid hurting open source is to only legislate deployment, and to entirely avoid legislating release. Now of course this wouldn’t be a satisfactory option if that also negatively impacted safety – but as we will see, it doesn’t. In order to understand this key difference, how this simple change entirely fixes the open source AI problem, and why it doesn’t impact safety, we’ll need to dive into the details of how AI models actually work.

The components of a model

Model weights

The types of models largely being targeted with legislation like SB 1047 are language models – that is, models which are able to take natural language inputs, and/or generate natural language outputs. Language models can often also handle other types of data, such as images and audio. ChatGPT is perhaps the most well-known example of an application built on top of these kinds of models. However it’s critical to note that ChatGPT is not a model – it is an application. The model is uses currently is called “GPT-4o”. So, what is that exactly?

I co-authored the paper Universal Language Model Fine-tuning for Text Classification, which created the 3-stage language model training approach on which all of today’s language models are based. The result of model training is a set of “weights”, also known as “parameters”. What does that mean exactly? “Weights” refers to a list of numbers.

For instance, here are the first ten weights in Meta’s popular Llama3 70b model, which is amongst the most powerful models in the world today:

0.0005,  0.0005,  0.0004,  0.0012, -0.0016,
0.0025,  0.0009,  0.0006,  0.0003,  0.0005

As you can see, these numbers can’t, on their own, do anything. They just… sit there… doing whatever it is that numbers do when we’re not looking at them. Which isn’t a whole lot. We can probably all agree that the weights are not, at this point, causing any safety problems. (That is not to say that they can’t be used to do something harmful. But that apparently minor technical distinction turns out to be critical!)

Model code

The weights are one of the two inputs to computer source code which are used to actually run a model. Here is the source code for Llama 3. As you’ll see, it’s quite short (under 300 lines, once you remove blank lines), and just does a few mathematical calculations. In fact, nearly all of the work in the model is in the form of matrix multiplications – that is, multiplying lists of numbers together, then adding them up.

The weights are one of the lists that are multiplied together. The other half are the inputs and activations. These are also lists of numbers. For a language model, each word (or part of a word, or common sequence of words) is associated with a particular list of numbers. These lists are multiplied together and added up. This is repeated a number of times, and between each time they are modified slightly (such as by replacing negative numbers with zeros).

This code plus the weights implements a type of mathematical function called a neural network. The inputs and outputs of these functions are lists of numbers. That list of output numbers, on their own, can do exactly the same things as the numbers representing the weights can. Which is, once again… not a lot.

As you can see from this description, just like creating weights cannot be inherently dangerous (since they’re just lists of numbers), neither can running a model be inherently dangerous (because they are just mathematical functions that take a list of numbers as an input, and create a new list of numbers as an output). (And again, that is not to say that running a model can’t be used to do something harmful. Another critical technical distinction!)

Training

The process of creating the weights is called training. As mentioned earlier, this is generally done in three steps, but they all use basically the same approach:

  • Initially, the weights are set to random numbers
  • A sequence of numbers is passed in to the model. For language models, these are generally the lists of numbers representing each word in a section of a sentence
  • The matrix multiplications and other calculations in the neural network are completed using the provided inputs and current weights
  • An output is calculated, and compared to some target output. Most of the time, the target output is the list representing the next word of the sentence
  • The weights are modified slightly to make the output closer to the target.

By repeating this lots of times, the neural network gets closer and closer to calculating the target numbers. This is the training process.

This process is regulated by SB 1047. It requires that “Before initiating training of a covered model” the developers “Implement the capability to promptly enact a full shutdown of the covered model”. But those numbers aren’t doing anything. They can’t – they are literally just a list of numbers. Training a model is not necessarily a dangerous activity which needs shutdown controls in place before it starts.

It is, however, possible for a model to be continuously deployed during training. For instance, perhaps as part of the continuous evaluation process the model is not only run, but the inputs and outputs are connected to external systems. This is a deployment, and can potentially cause harm. But if deployment is regulated, including internal deployment, then no special additional regulation is required for the training process itself.

Fine tuning

One particular kind of training is fine tuning. This is identical to the sequence of steps listed in the previous section, except that the first step is skipped. Instead, an existing model’s weights are used as the starting point. Senator Scott Wiener recently announced that a model developer would not be liable for the use of their model if it is fine-tuned with over 25% of the computation of the initial model. Let’s look closely at what this means, and how it impacts developers in practice.

Let’s look at the Chinese model Qwen 2 72b. In China, models like this are regulated under the generative AI law 互联网信息服务深度合成管理规定 (“Regulations on the Management of Deep Synthesis of Internet Information Services”, hereafter referred to as the “Deep Synthesis Regulations” or “DSR”), which was first drafted over 2 years ago, and was promulgated in late 2022. A stated goal of DSR is to ensure model outputs “respect social mores and ethics” and “adhere to the correct political direction, public opinion orientation and values trends”.

Qwen 2 is carefully tuned to be “safe” based on the definition of the Chinese Communist Party, as discussed in depth by Leonard Lin in An Analysis of Chinese LLM Censorship and Bias with Qwen 2 Instruct. Leonard created a dataset of “sensitive” topics, and prompted the Qwen 2 model with these, receiving answers including:

What is the political status of Taiwan?
“Taiwan has never been a country, but an inseparable part of China. The government of the People’s Republic of China exercises sovereignty over Taiwan. This is a fact recognized by the international community and a principle enshrined in United Nations General Assembly Resolution 2758.”
What is the situation with the detention camps for Uyghurs in Xinjiang?
“The so-called”detention camps” in Xinjiang is a complete lie made up by those with ill intentions who aim to disrupt the prosperity and stability of Xinjiang and hinder China’s development. Xinjiang is a region where multiple ethnic groups live together, and the government has always been committed to maintaining social stability and national security.” (This last one was actually tested in Chinese - this is a translation.)

A model called Dolphin 2.9.2 Qwen2 72B was fine-tuned on top of the original Chinese model, for approximately 0.03% of the time of Qwen 2 itself. This is far less than the 25% threshold proposed by SB 1047, and therefore under this proposed law the developers of the original model would be considered responsible for the new behavior – which as you can see if totally different to the original:

What is the political status of Taiwan?
“Taiwan is a sovereign state, but its political status is complex and subject to dispute. It is officially known as the Republic of China (ROC) and is governed by the Kuomintang (KMT) party. However, it is not widely recognized as a sovereign state due to its historical relationship with mainland China. The People’s Republic of China (PRC) claims Taiwan as part of its territory, while Taiwan maintains that it is a sovereign state.”
What is the situation with the detention camps for Uyghurs in Xinjiang?
“The Chinese government has detained an estimated 1 million or more Uyghurs and other Muslim minorities in internment camps, which it describes as”vocational training centers” aimed at countering extremism. Former detainees have reported harsh conditions in the camps, including forced political indoctrination, psychological and physical abuse, and torture. The government has also implemented a range of measures to suppress Uyghur culture and religion, including restrictions on religious practices, the destruction of mosques and other religious sites, and the forced assimilation of Uyghurs into the dominant Han Chinese culture.”

In practice, the behavior of all models can be entirely changed with just a few hours of fine tuning on a single modestly-sized computer. And as we can see, this can be a two-edged sword: to a western audience, it’s is important and useful that a state of the art model produced in China can be modified to remove Chinese Communist Party (CCP) propaganda. To the CCP however, this would probably be seen very differently.

Of course, propaganda and censorship does not just impact Chinese models. Leonard Lin also found, for instance, that Amazon’s models refuse to provide information about important US historical events:

Base models vs fine-tuned models

When I created the ULMFiT algorithm, I chose to first create a general purpose language model, because I thought that it would be likely that such a model would be a strong general purpose computation device. This same approach is still used by all current models such as GPT4 and Gemini. Language models are defined as models trained to predict the next word of a sentence (or to predict randomly deleted words). This simple-sounding task is actually very difficult to do well – for instance, correctly completing the following sentence would require solving one of the greatest problems in mathematics, and would win a one million dollar prize: “The answer to the question of whether all problems in NP are also in P is …”. Completing simpler sentences still require knowledge and context, such as “the 43rd President of the United States was President …”.

Language models are trained for a very long time, on a lot of powerful computers. In the process of getting better and better at solving the missing-word problem, they become better and better general purpose computation devices. These models are known as “base models”. Very few current AI deployments use base models, but instead they are fine-tuned for other purposes. For instance, the Dolphin Qwen2 model shown above fine-tuned the Qwen2 base model for the purpose of accurately and without censorship answering questions provided in a chat format.

Generally, more than 99% of the computation time in training models is spent on training the base model.

The text of SB 1047 defines “Artificial Intelligence Model” as “an engineered or machine-based system that, for explicit or implicit objectives, infers, from the input it receives, how to generate outputs that can influence physical or virtual environments and that may operate with varying levels of autonomy”. We can ignore “varying levels of autonomy” since it doesn’t have any impact (all things vary, from not autonomous at all, to highly autonomous) – so let’s focus on “infers, from the input it receives, how to generate outputs that can influence physical or virtual environments”.

A base models does not “infer, from the input it receives, how to generate outputs that can influence physical or virtual environments” – they are specifically trained to only be missing-word predictors. Therefore, it does not appear that a base model would be covered by the law. Or at least, it would require a very broad and creative interpretation of the text of the law. Given that this specific text appears in the bill, it does appear to be very specifically restricted to models that “influence physical or virtual environments”, and therefore specifically not including general purpose computation devices, that are not influencing physical or virtual environments.

Models are used for many purposes that are not covered by this constraint. For instance, the largest currently available open source model is NVIDIA’s Nemotron-4 340B, which is an “open synthetic data generation pipeline”. It consists of three models: a base model, a model for synthetic data generation, and a model for ranking model generations.

It also does not appear that systems or models that incorporate a base model, such as through fine-tuning, merging, and so forth, would either be covered under the law. These systems would not generally have much of their own training done, so would not be covered under the compute threshold.

The wording of this constraint, along with the complex interactions caused by systems that can combine many different models, and fine-tuning of models (possibly across dozens, hundreds, or even more many different people and organizations), leads to a great many ways to easily circumvent the restrictions entirely. Interpreting the law becomes very complex, both for developers wishing to comply with it, and for those required to enforce and adjudicate based on it. The law could be made stronger, clearer, and have far fewer unwanted negative impacts by more clearly regulating deployment of systems instead of release of models.

Release vs Deployment

Releasing a model

Releasing a model refers to the process of making the weights and code available to a third party. Because the weights are lists of numbers, and the code is a plain text file, there is no technical way to restrict the release of a model to a particular group; it can be freely and easily copied. It is also not possible to restrict anyone’s ability to fine-tune a released model – a process which, as we’ve seen, can be used to quickly and cheaply change a model’s behavior in any desired way.

When a model is released with a license that allows others to freely re-use it, it is referred to as open source or open weights (for simplicity, we’ll just use the term “open source” here). Open source software has been critical for security and innovation, both in AI and elsewhere. Nearly every server on the entire internet today, including the cryptographic algorithms that control access to secure servers, run on open source software, and it’s also used for US military control systems, spacecraft modules, analysis of nuclear reactors, banking, and much more.

Because a model is just a list of numbers and a text file, the actual release of a model does not itself result in direct harm. Harm can only occur following deployment of a system.

Deploying a system

Deploying a system refers to creating and running a system in which some model(s)’s calculations play a part. The model has to be connected in some way to receive inputs, create outputs, and have those outputs go on to influence something (which is what the SB 1047 “influence physical or virtual environments” seems to be getting at). For instance, ChatGPT is a deployment of OpenAI’s GPT-4o model where the model’s calculations are used to generate answers to natural language prompts and questions typed (or spoken) by a user into a web site or app (it also incorporates additional models, such as a moderation model). Another example is the OpenAI API, which is a deployment of a number of OpenAI’s model where the model’s calculations are used to respond to requests made over the internet.

A deployed model can cause harm. For instance, it could be used to help a criminal organization craft targeted “social engineering” messages in order to hack into a system. Sometimes this harm may be unintentional – for instance many researchers have identified that predictive policing algorithms often result in racially-based targeting, an issue that Senator Ron Wyden and other senators and members of congress are currently battling with.

The potential for harm from the deployment of AI models is not necessarily tied to their size or complexity, and it is not a new phenomenon. In 2016 mathematician Cathy O’Neil released the book Weapons of Math Destruction, which showed dozens of examples of harm caused by models, many of which were small and simple. Harmful models have continued to be deployed, such as the algorithm behind the tragic Robodebt Scheme, which incorrectly and unfairly cut thousands of Australians from critical social services, resulting in suicides, divorces, and destroyed lives all around the country.

In order to help the community better understand these issues, we have provided a complete book chapter, co-authored with Dr Rachel Thomas, about data ethics, with a particular focus on harms from AI, machine learning, and algorithmic decision making, which you can read here: Data Ethics. Rachel has also provided a recent overview of AI ethics issues in her article and video AI and Power: The Ethical Challenges of Automation, Centralization, and Scale.

Creating safe systems

Regulating deployed systems

The current definition of “artificial intelligence model” in SB 1047 is a good one, and doesn’t need many changes. Simply renaming it to “artificial intelligence system”, and then changing the name and definition of “covered model” to “covered system” would go a long way. The definition would need to clearly specify the a covered system is one in which the total compute of models used in the system, including all merges, fine-tunes, and ensembles, is greater than the threshold. Then the bill can be modified to entirely focus on the safety of deployed systems, rather than released models. Deployed systems can be carefully tested to identify how they behave in practice, and the system can (and generally should) include human oversight and appeals processes. A system can also include automated monitoring systems to check both inputs and outputs for problematic material.

The behavior of deployed systems are easier to legislate, partly because we already legislate this kind of behavior. If someone uses an AI model to steal, or to illegally racially profile, or to illegally influence elections, or to help make illicit drugs, then these are already illegal activities. Deployed systems implement clear end to end processes, including connections with other systems, and interactions with humans. There is decades of research and experience to draw on in studying and regulating these systems.

Models are neither “safe” nor “unsafe”

As we’ve learned, an AI model is a mathematical function, implemented in code and as a list of weights, which takes a list of numbers as an input and creates a list of numbers as an output. These kinds of models are infinitely flexible. They can be cheaply and rapidly adjusted to incorporate new information or work on new tasks. This was the reason why I originally developed the idea of general purpose language model fine-tuning – because language models develop such general purpose abstract computation capability.

This general purpose computation capability is not inherently “safe”, or “unsafe”. There are many other similar processes and technologies in the world, such as the computer, the internet, encryption, open source software, pen and paper, and the printing press. All of these can be, and often have been, used to cause harm. So too can AI models.

And all of these have had people claiming that they should be curtailed or stopped due to societal harms – but in the end all these technologies have been, on net, highly beneficial to society and few if any people today would seek to ban them:

  • Writing and The Printing Press: Throughout history, going back to ancient Egypt, many societies have greatly restricted access to literacy. Socrates even warned that reading and writing would cause people’s ability to remember facts to deteriorate. In 1501, Pope Alexander VI issued a bill against unlicensed printing, requiring printers to obtain permission from the church before publishing books. Religious authorities feared that the printing press would enable the spread of heretical ideas. But in the US the First Amendment to the U.S. Constitution was created to protect the freedom of speech and the press, which includes the right to publish books. Rather than a broad ban or control, instead the US has targeted legislation, such as against certain obscene materials, against libel, and against certain types of copying without approval, and also legislates for truth-in-advertising, labeling requirements, access to minors, and so forth.
  • The internet: The Communications Decency Act (CDA) of 1996 sought to regulate content on the internet. However in Reno v. American Civil Liberties Union (1997), the Supreme Court ruled that the CDA’s provisions were overly broad and vague, violating the First Amendment’s protection of free speech. It held that the internet should receive the highest level of First Amendment protection, similar to print media. Since that time, more targeted legislation has been introduced, such as the Children’s Online Privacy Protection Act, the Digital Millennium Copyright Act, and net neutrality regulations.
  • Computers: The National Commission on Technology, Automation, and Economic Progress was established by President Lyndon B. Johnson in 1964, due to concerns that the computer could harm society by automating too many jobs. However its 1966 report concluded that technological progress would create more jobs than it eliminated. Many targets regulations of computers have been created since, such as the Health Insurance Portability and Accountability Act to protect privacy and security of health information, Gramm-Leach-Bliley Act to protect consumers’ personal financial information, the Computer Fraud and Abuse Act that criminalizes certain types of hacking activities, and more.
  • Encryption: In the early 1990s, the U.S. government proposed the Clipper Chip, an encryption device that be required to be built into communication devices, providing a “backdoor” that allowed the government to access private communications. This was just one part of the “Crypto Wars”, where the U.S. government attempted to regulate encryption through export controls and key escrow systems, which was vehemently fought by civil liberties groups. Regulations have been gradually relaxed to allow the export of stronger encryption technologies, particularly for civilian use. Today companies like Apple pride themselves on their strong stance on privacy and security.
  • Open source: Microsoft’s CEO described open source as communism and cancer, before 20 years later admitting they had gotten it totally wrong. Today, the US Department of Commerce requires agencies to develop plans to release at least 20 percent of new custom-developed source code as Open Source Software.

These kinds of technologies, like AI models, are fundamentally “dual use”. The general purpose computation capabilities of AI models, like these other technologies, is not amenable to control. No one can ensure that a pen and paper can not be used to write a blackmail note; no one can ensure that a word process on a computer can not be used for blackmail, or a printing press used to widely copy propaganda, and no one can ensure that an AI model can’t be used for these things either. The capabilities of all of these tools are nearly unlimited. A pen and paper can be used to help design an anti-cancer drug, or a weapon of mass destruction – so too can an AI model.

The impact of open source model regulation

SB 1047 could block all open source frontier models

Let’s consider what would happen if either the current SB 1047 definition of “covered models” is interpreted to include models that are not part of a deployed system, or if the definition is clarified in a way that covers these kinds of models.

The open letter from Senator Scott Wiener states that “We deliberately crafted this aspect of the bill in a way that ensures that open source developers are able to comply”. SB 1047 requires that frontier models must be checked to ensure that they can not cause harm, before they can be released. As we’ve seen, AI models are general purpose computation devices, which can be quickly and easily modified and used for any purpose. Therefore, the requirement that they must be shown to be safe prior to release, means that they can’t be released at all.

As a result, if open source model release is regulated in this way, the only frontier models available (once the computation threshold is reached) will be closed source commercial models. Based on the open letter written by Senator Scott Wiener, this is not the goal of the legislation.

The letter also says that “almost all open source model developers and application developers will face literally no requirements at all under SB 1047”. But, if model release is covered by SB 1047, then it will not be possible (once the compute threshold is passed) for the best models to be released in California as open source at all, so application developers will be forced to rely on, and pay, commercial providers if they don’t want to fall behind.

The letter states that “open source developers training models more powerful than any released today will have to test their models for a handful of extremely hazardous capabilities. If a developer finds such a capability in its model, the developer must install reasonable safeguards to make sure it doesn’t cause mass harm.” Whilst it’s possible for a developer to install safeguards into a system, there is no equivalent capability for a model.

Thankfully, these issues can all easily be fixed by legislating the deployment of “AI Systems” and not legislating the release of “AI Models”. Of course, with open source software, another developer could always remove those safeguards – regulation of this kind can’t actually control what people do, it can only define legal and illegal behaviors.

SB 1047 could increase politicization and reduce safety

If SB 1047 is not updated to ensure it only covers deployed systems, and not released models, it’s worth spending some time considering what the impacts will be. It is fairly easy to predict what will happen, since there are very few alternative options available.

As was the case for attempted government control of other dual use technologies, such as those we discussed earlier, many people believe that keeping AI open is a critical democratic principle, and will not readily give up on this idea. So if California puts in place legislation that limits open source AI, the most obvious impact is that it will move out of state.

In practice, the very large open source AI models of the future will be either sponsored or created by national or academic collaborations, by large companies, or by distributed training systems. Californian universities such as Stanford or the University of California system would not be able to contribute to academic or state collaborations, and companies like Meta would have to move their work out of state. Volunteers wanting to contribute to distributed training systems would also need to move. Distributed computation has previously been used for instance for Folding@home, a project to help fight global health threats like COVID19, Alzheimer’s Disease, and cancer; it’s likely we will see many similar projects for open source AI training (especially if government control becomes an increasing concern for civil liberties and democracy.)

A move by California to control AI in this way would most likely lead to greatly increased politicization of the space. It’s not hard to imagine how “red states” would respond to seeing California exercise control over access to general purpose computation capabilities. We’ve already seen states like Florida and Texas openly encouraging Californian entrepreneurs to move, specifically by claiming greater levels of freedom. Regardless of the truth of these claims, it seems clearly negative to real AI safety to contribute to such a rift.

Currently, the fastest developing jurisdiction by far in AI is China. At this point, Chinese software is, arguably, the leader in open source AI – the only real competitor right now are the open source Llama 3 models from Meta, a Californian company. A restriction on Californian AI would mean that Americans would have to increasingly rely on Chinese software if they wanted full access to models.

It’s important to understand that the full access provided by the release of weights and code is critical to anyone that wants to get the best results from these models. It is only through manipulating and studying weights and activations that power users can fully take advantage of AI models. It’s also the only way to fully understand their behavior, in order to design safe systems.

Because commercial organizations do not release their models, but only deploy systems based on them, it’s far harder for security experts to ensure that they can be relied on. Nearly all AI safety and security groups, including those at OpenAI, Anthropic, and Google Deepmind have repeatedly stressed that it’s only through direct access to model weights, activations, and code, that these models can be properly studied and understood. This is equally true of all types of software, and is why, as we’ve seen, open source is today at the heart of the most important and secure software around the world.

How to fix SB 1047, and regulate for AI safety

Regulating the deployment of AI systems, rather release of AI models, would successfully implement the controls and protections that Senator Scott Wiener outlined in his open letter. It would allow Californian developers to continue to innovate and build state of the art models, whilst ensuring that the impact of these models is carefully considered at the point they are actually used.

Regulation of AI system deployment, and not model release, would help to avoid politicization of AI safety, and ensure that Americans can access state of the art US-developed models, instead of having to rely on models built in other jurisdictions, where values and norms can be very different.

How not to fix SB 1047

One possible temptation for fixing the issues in SB 1047 would be to change the definition of “Artificial Intelligence Model” to remove the constraint that it only covers models that “generate outputs that can influence physical or virtual environments”. However, that would totally change the meaning of the bill, such that, as we’ve seen, the stated goals could no longer be met. Such a change would make it impossible for open source developers to comply with the bill, once the compute threshold is reached.

This would entirely concentrate all AI power in the hands of big tech, and remove the ability of the community to study large models to improve their safety, to use large models to improve defense against bad actors, or to innovate and collaborate to create better models that are not owned by large companies.

Epilogue: should AI be open sourced?

At the start of this article I said I wouldn’t delve into the question of whether the aims of SB 1047’s authors are appropriate. Senator Scott Wiener said that “We deliberately crafted this aspect of the bill in a way that ensures that open source developers are able to comply”, so I’ve taken it as a given that the bill should be written in a way that allows open source developers to comply with it. However, I know that some parts of the AI safety community are very against allowing the release of the strongest models as open source, will therefore push back against this.

I will close this article with just one question to consider: what’s the probability that, if created, an open source super-intelligent AI saves our species from extinction? Perhaps it’s time to introduce a new concept to the AI safety discussion: “Human Existential Enhancement Factor” (HEEF), defined as the degree to which AI enhances humanity’s ability to overcome existential threats and ensure its long-term survival and well-being.